
Before You Buy or Sell a Website, Check This: Search Visibility Does Not Equal AI Crawler Control

If you are buying, selling, or valuing a website, do not treat Google visibility and AI crawler access as the same asset. They are separate controls with different business implications.

Google’s current Search Central documentation says AI Overviews and AI Mode use the same core guidance as Search. There is no separate AI optimization track a site owner must enable to be considered. In practical terms, a site can remain eligible for Google search experiences while making different decisions about whether other companies’ AI crawlers can access content for training or retrieval.

That matters in diligence because sellers may point to rankings, impressions, or brand visibility as proof of long-term content value while the site has also been broadly accessible to non-Google AI bots. Those are not equivalent facts. One speaks to discoverability in Google. The other speaks to content-use exposure, defensibility, and whether future referral value may trail the amount of crawling already happening.

What Google AI search visibility actually depends on

Per Google Search Central, site owners do not need a separate AI-specific setup for AI Overviews or AI Mode beyond normal Search guidance. Google also documents that preview controls such as snippet-related settings can affect how content appears in Search features, but that is different from granting blanket access to third-party AI crawlers.

Separately, Google’s 2025 robots.txt refresher is a useful reminder that robots.txt is still a core crawler access mechanism. But buyers should read that as the start of the audit, not the end of it. A robots.txt file tells compliant crawlers what is allowed or disallowed. It does not mean every bot behaves the same way, and it does not prove what happens at the edge if a CDN or security layer is modifying responses.
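As a minimal sketch of how that separation can look in a robots.txt file (the AI crawler tokens below are examples of published user-agent names; verify current tokens against each operator's documentation before relying on them):

```
# Allow Google's search crawler
User-agent: Googlebot
Allow: /

# Disallow example AI crawlers (illustrative tokens; confirm current names)
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
```

Remember that these directives only bind crawlers that choose to honor them; they are a statement of policy, not an enforcement mechanism.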

This is where diligence often breaks down. A seller may show you a clean robots.txt file at the origin, but if the site runs through Cloudflare, the file bots actually receive can differ: Cloudflare's managed robots.txt behavior can modify or replace what is served, and Cloudflare also maintains AI crawler controls and bot classifications that let operators manage access at the edge. That means inherited settings may not be obvious from the CMS, hosting panel, or origin file system alone.

What buyers and operators need to audit

For a small business site, this is not an abstract policy debate. It affects valuation, traffic assumptions, licensing posture, and maintenance risk.

Start with the simple question: does the site get meaningful business value from the entities crawling it? If crawl exposure is high but attributable referral traffic, leads, assisted conversions, or branded demand are weak, then the content asset may be less defensible than the seller suggests.
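One rough way to frame that question quantitatively is a crawl-to-referral ratio. The numbers below are made up purely for illustration; the thresholds that matter will depend on the niche and business model:

```python
# Toy illustration with hypothetical 90-day numbers: compare AI crawl
# volume against the referral value it demonstrably sends back.
ai_crawl_requests = 48_000     # bot requests attributed to AI crawlers
ai_referral_sessions = 120     # referral sessions attributable to those sources

# Guard against division by zero when a site has crawl volume but no referrals.
exposure_ratio = ai_crawl_requests / max(ai_referral_sessions, 1)
print(f"{exposure_ratio:.0f} crawl requests per referral session")  # prints "400 crawl requests per referral session"
```

A high ratio does not settle the valuation question on its own, but it makes the seller's "content value" narrative something you can interrogate with data rather than accept on rankings alone.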

Next, inspect control surfaces in the right order:

  • robots.txt at the live edge: Check the public file that bots actually fetch, not just the file in the server account or WordPress plugin screen.
  • Cloudflare settings: Review managed robots.txt behavior, AI crawl controls, bot management, and any rules that alter crawler handling before requests reach origin.
  • Verified bot handling: Determine which crawlers are explicitly allowed, blocked, challenged, or exempted.
  • Logs and analytics: Ask for server logs, CDN logs, or bot reports showing actual crawl activity. Compare that with real referral sessions and conversion value.
  • Content-use posture: Confirm whether current settings reflect the owner’s intent or are leftovers from an agency, plugin, or prior security configuration.
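The live-edge check in the first bullet can be sketched with the Python standard library's robots.txt parser. This example parses a hypothetical robots.txt string; in a real audit you would fetch the file from the public URL that bots actually hit (e.g. with `RobotFileParser(url).read()`), since a CDN may serve something different from the origin copy:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration. In practice, fetch the
# LIVE file from https://<domain>/robots.txt rather than reading it from disk.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

def crawler_access(robots_txt: str, agents: list[str], url: str = "/") -> dict[str, bool]:
    """Return {user_agent: allowed} for each crawler under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}

report = crawler_access(SAMPLE_ROBOTS, ["Googlebot", "GPTBot", "CCBot"])
print(report)  # GPTBot disallowed; Googlebot and CCBot fall through to '*'
```

Running this against the live file and the origin file separately, and diffing the two reports, is a fast way to surface edge-layer rules the seller may not have mentioned.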

A common diligence mistake is assuming that blocking third-party AI bots would remove a site from Google Search or Google AI features. Google's documented position is narrower: its AI search features follow normal Search guidance. That is not the same as requiring a site to allow every other AI crawler.

What to do next

Before you buy, sell, or revalue a site this week, request five items:

  1. The current live robots.txt as fetched publicly.
  2. A screenshot or export of Cloudflare bot and AI crawl settings, if Cloudflare is in use.
  3. A list of any managed or injected robots rules from CDN, WAF, hosting, or security tools.
  4. Recent log evidence showing major bot activity by crawler type.
  5. A traffic and conversion view that separates crawl volume from actual referral and lead value.
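Item 4 can be approximated with a simple log tally. The sketch below uses made-up combined-format log lines and naive user-agent substring matching; a real audit should also use verified bot identification (published IP ranges, reverse DNS), because user-agent strings can be spoofed:

```python
from collections import Counter

# Illustrative crawler tokens; verify current user-agent names per operator.
AI_BOTS = ["GPTBot", "ClaudeBot", "CCBot", "Googlebot"]

# Fabricated sample log lines in Apache combined format, for demonstration only.
SAMPLE_LOG = """\
1.2.3.4 - - [01/Jan/2025:00:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"
5.6.7.8 - - [01/Jan/2025:00:00:01 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"
9.9.9.9 - - [01/Jan/2025:00:00:02 +0000] "GET /b HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"
"""

def bot_hit_counts(log_text: str, bots: list[str]) -> Counter:
    """Count log lines whose user-agent string mentions each crawler token."""
    counts = Counter()
    for line in log_text.splitlines():
        for bot in bots:
            if bot.lower() in line.lower():
                counts[bot] += 1
    return counts

print(bot_hit_counts(SAMPLE_LOG, AI_BOTS))  # e.g. Counter({'GPTBot': 2, 'Googlebot': 1})
```

Comparing these counts against the referral and conversion view in item 5 is what turns raw crawl activity into a diligence finding.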

If those records are missing, treat that as an operational-risk flag. You may be buying a site with strong Search visibility but weak control over how its content has been accessed, reused, or monetized by others. For due diligence, that distinction is now material.



This article is for informational purposes only and reflects general marketing, technology, website, and small-business guidance. Platform features, policies, search behavior, pricing, and security conditions can change. Verify current requirements with the relevant platform, provider, or professional advisor before acting. Nothing in this article should be treated as legal, tax, financial, cybersecurity, or other professional advice.