Check if your HTML is within Google's 2MB crawl limit. Measure raw and compressed sizes to ensure Googlebot can fully crawl your page.
Google's Googlebot will only fetch the first 2MB of an HTML page. After reaching that threshold, the fetch stops entirely. The downloaded portion — the first 2MB of bytes — is passed to Google's indexing systems and the Web Rendering Service (WRS) as if it were the complete file. Any bytes beyond the cutoff are never fetched, never rendered, and never indexed.
This 2MB limit applies to the raw (uncompressed) HTML. While Googlebot accepts gzip and brotli encoding for transfer, the limit is evaluated against the decompressed HTML size. Compression helps with transfer speed but does not change whether your page fits within the crawl limit.
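This distinction is easy to demonstrate. The sketch below (an illustration, not Googlebot's actual code) builds a highly repetitive ~3MB HTML document: it compresses to a tiny gzip payload, yet its raw size still exceeds the 2MB limit.

```python
import gzip

# The crawl limit is evaluated against the raw (decompressed) HTML,
# so a highly compressible 3MB page still exceeds it.
CRAWL_LIMIT = 2 * 1024 * 1024  # 2MB, in bytes

# Build a ~3MB HTML document out of repetitive markup.
row = "<li class='item'>Example product listing entry</li>\n"
html = "<html><body><ul>" + row * 60000 + "</ul></body></html>"
raw = html.encode("utf-8")

compressed = gzip.compress(raw)

print(f"raw:  {len(raw):>9,} bytes  (over limit: {len(raw) > CRAWL_LIMIT})")
print(f"gzip: {len(compressed):>9,} bytes  (over limit: {len(compressed) > CRAWL_LIMIT})")
```

The gzip output here is a few kilobytes, but that only speeds up transfer; the page would still be truncated at 2MB of raw HTML.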
Referenced resources — CSS, JavaScript, images, fonts — are fetched separately by WRS with their own independent per-URL byte counter. They do not count toward the size of the parent HTML page.
For most websites, 2MB of raw HTML is enormous; the vast majority of pages are well under this limit. However, certain types of pages can approach or exceed it: very large category or listing pages, pages carrying heavy inline CSS or JavaScript, and pages embedding large JSON-LD product catalogs.
If critical content — product listings, important text, internal links — falls beyond the 2MB cutoff, Google will never see it. This can silently hurt your indexing and rankings without any obvious error in Google Search Console.
Configure gzip or brotli compression on your web server. This is the single most impactful change — brotli typically achieves 15-25% better compression than gzip for HTML.
Move large blocks of inline CSS and JavaScript to external files. These are fetched separately and don't count toward the HTML page size limit.
Break very large category pages or long lists into multiple pages. This distributes content across URLs and keeps each page well within limits.
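One way to paginate is to pack rendered items greedily into pages capped at a byte budget. The sketch below is illustrative: `render_item` and the 512KB target are invented stand-ins, not values from any Google guideline.

```python
# Sketch: split a long product list across numbered pages so each
# page's HTML stays well under the 2MB raw limit.
CRAWL_LIMIT = 2 * 1024 * 1024
TARGET = CRAWL_LIMIT // 4  # aim well under the limit, e.g. ~512KB per page

def render_item(name: str) -> str:
    # Hypothetical item markup for illustration.
    return f"<li><a href='/products/{name}'>{name}</a></li>"

def paginate(items, target_bytes=TARGET):
    """Greedily pack rendered items into pages capped at target_bytes."""
    pages, current, size = [], [], 0
    for item in items:
        chunk = render_item(item)
        if current and size + len(chunk) > target_bytes:
            pages.append("".join(current))
            current, size = [], 0
        current.append(chunk)
        size += len(chunk)
    if current:
        pages.append("".join(current))
    return pages

pages = paginate(f"product-{i}" for i in range(50000))
print(f"{len(pages)} pages, largest {max(len(p) for p in pages):,} bytes")
```

A real implementation would also add `rel="next"`-style navigation and distinct, indexable URLs for each page; the point here is only the byte budgeting.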
Remove unnecessary whitespace, comments, and redundant attributes from your HTML. While the compression savings are modest, every byte counts for very large pages.
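A minimal minification pass might look like the following. This is only a sketch: production minifiers are far more careful (for example, they must preserve whitespace inside `<pre>` and the contents of `<script>` blocks).

```python
import re

def minify(html: str) -> str:
    """Naive minifier: drop HTML comments, collapse whitespace between tags."""
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)  # strip comments
    html = re.sub(r">\s+<", "><", html)  # collapse whitespace between tags
    return html.strip()

page = """
<html>
  <!-- navigation -->
  <body>
    <p>Hello</p>
  </body>
</html>
"""
small = minify(page)
print(len(page), "->", len(small))
```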
Large product catalog schemas embedded as JSON-LD can add significant size. Keep structured data concise and only include required properties.
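Two easy wins for JSON-LD size are compact serialization (no pretty-printing) and dropping optional properties. The product schema below is an invented example used only to show the technique.

```python
import json

# Hypothetical Product schema; "description" stands in for optional bloat.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget",
    "offers": {"@type": "Offer", "price": "19.99", "priceCurrency": "USD"},
    "description": "A long marketing description ... " * 50,
}

pretty = json.dumps(product, indent=2)  # pretty-printed, with optional field
compact = json.dumps(
    {k: v for k, v in product.items() if k != "description"},
    separators=(",", ":"),  # no spaces after separators
)
print(len(pretty), "->", len(compact))
```

Which properties are actually required depends on the schema type; check the relevant structured-data documentation before trimming.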
Load non-critical content via JavaScript after initial render. Note: Google does execute JavaScript, but the initial HTML payload must still contain your most important content for reliable indexing.
Google's Googlebot will only process the first 2MB of raw (uncompressed) HTML. Any content beyond this 2MB threshold is not rendered or indexed. This limit applies to the decompressed HTML document, not the compressed transfer size.
The 2MB limit applies to the raw (uncompressed) HTML size. While Googlebot uses gzip and brotli compression during transfer for efficiency, the crawl limit is evaluated against the decompressed document. A 3MB raw HTML file would exceed the limit regardless of how small it compresses.
No. CSS, JavaScript, and other referenced resources have their own separate per-URL byte counters. They do not count toward the 2MB limit of the parent HTML page. Each resource is fetched independently with its own size limit.
Any HTML beyond the 2MB raw size threshold is ignored by Google. It is not rendered or indexed. The first 2MB of the uncompressed HTML is passed to Google's indexing systems and Web Rendering Service as if it were the complete file.
Use this Crawl Size Checker tool to enter any URL and instantly see the raw HTML size, gzip compressed size, and brotli compressed size. The tool shows whether your page is within Google's 2MB crawl limit and what percentage of the budget you've used.
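The core of such a check can be sketched in a few lines of standard-library Python. This is not the tool's actual implementation; it covers raw and gzip sizes only (brotli would require the third-party `brotli` package), and the URL in the comment is a placeholder.

```python
import gzip
import urllib.request

CRAWL_LIMIT = 2 * 1024 * 1024  # Google's 2MB raw HTML limit

def crawl_budget(raw_html: bytes) -> dict:
    """Report raw/gzip sizes and how much of the 2MB budget is used."""
    return {
        "raw_bytes": len(raw_html),
        "gzip_bytes": len(gzip.compress(raw_html)),
        "within_limit": len(raw_html) <= CRAWL_LIMIT,
        "budget_used_pct": round(100 * len(raw_html) / CRAWL_LIMIT, 1),
    }

# To check a live page (network access assumed):
#   raw = urllib.request.urlopen("https://example.com/").read()
#   print(crawl_budget(raw))
report = crawl_budget(b"<html><body>hello</body></html>")
print(report)
```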