An orphaned URL is a page that no other page on the website links to. Because it sits outside the site's navigable structure, a crawler that discovers pages by following links cannot reach it.
Why is this important?
Guiding both users and search engine crawlers through your site depends largely on internal linking. It’s a strategic way to convey the hierarchy and significance of pages to search engines.
Because orphan URLs have no internal links pointing to them, crawlers cannot find these pages on their own. They are usually discovered through other sources, such as XML Sitemaps, manually compiled URL lists, or data from Google Analytics and Google Search Console.
Orphan URLs are not inherently a problem. For example, old URLs that still appear in Google Analytics history but no longer exist on the live site should return a 404 or 410 status to signal that they are unavailable.
Orphan URLs that return a successful 200 HTTP response require closer inspection for potential removal or necessary integration into the site's link structure.
What does the Optimization check?
The check is activated for any URL lacking 'Crawler' as its Crawl Source.
Examples that trigger this Optimization:
A URL may trigger this Optimization if it has any of these Crawl Sources but is missing 'Crawler':
Google Search Console
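The rule above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual implementation: the `find_orphans` helper, the dictionary layout, the example URLs, and the source labels other than 'Crawler' and 'Google Search Console' are assumptions made for the example.

```python
def find_orphans(crawl_sources):
    """Return the URLs whose Crawl Sources do not include 'Crawler'.

    crawl_sources maps each URL to the set of sources that reported it
    (hypothetical structure, chosen for this sketch).
    """
    return sorted(
        url
        for url, sources in crawl_sources.items()
        if sources and "Crawler" not in sources
    )


# Hypothetical data: only the first URL was reached by following links.
crawl_sources = {
    "https://example.com/": {"Crawler", "XML Sitemap"},
    "https://example.com/old-offer": {"Google Search Console"},
    "https://example.com/landing": {"XML Sitemap", "Google Analytics"},
}

orphans = find_orphans(crawl_sources)
for url in orphans:
    print(url)
```

Here the last two URLs are flagged, because each was reported by at least one source but never reached by the crawler.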
How do you resolve this issue?
Addressing orphaned URLs is case-specific, and no single solution applies. Assess what the URLs are, where they came from, and which HTTP responses they return.
If orphaned URLs in Google Analytics or Google Search Console all return a 404, it's actually beneficial that they're disconnected internally.
For orphan URLs with a 200 status, further investigation is warranted. Determine their necessity; if they're relics of discontinued products, a 404 or redirect may be in order. If they're new and unlinked, they require internal links to be integrated into the site structure.
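The decision process described above can be summarised as a small triage function. This is a minimal sketch under stated assumptions: the `triage` function, its `still_needed` flag, and the wording of the returned recommendations are hypothetical, and real cases may need more nuance (for example, choosing between a redirect and a 404).

```python
def triage(status, still_needed):
    """Suggest a next step for an orphan URL.

    status       -- HTTP status code the URL returns
    still_needed -- True if the page should remain part of the site
                    (a judgment call made by a person, not detected here)
    """
    if status in (404, 410):
        # Already signals that the page is gone; being unlinked is fine.
        return "leave unlinked"
    if status == 200:
        if still_needed:
            return "add internal links"
        return "return 404/410 or redirect"
    # Redirects, server errors, and anything else need a closer look.
    return "investigate manually"


print(triage(404, False))
print(triage(200, True))
print(triage(200, False))
```

Keeping `still_needed` as an explicit input reflects that deciding whether a page is still necessary is a manual assessment rather than something the status code can reveal.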