30 Minutes

Meta Robots Tag: The ultimate SEO guide for indexing & noindex

Stefanie Sommer

9.4.2025

Inhalt

What is a Meta Robots Tag?
Index vs. Noindex: Which pages should end up in the index?
Follow vs. Nofollow: Should crawlers follow the links?‍
Understanding important combinations of the Meta Robots Tag ‍
Typical Use Cases and Best Practices for Noindex
Common Mistakes and Misunderstandings with the Meta Robots Tag
Conclusion: Key Takeaways
Meta Robots Day FAQ section

Alle Beiträge

Imagine being able to tell search engines exactly which pages on your website they should index and which they should not. That's precisely what the Meta Robots Tag lets you do. For anyone working in SEO – whether you are just starting out or you are an experienced SEO manager – this little HTML tag is a powerful tool. Used correctly, it helps keep unnecessary or sensitive pages out of the search results and protects the ranking potential of your important pages. Get it wrong, though, and valuable content can disappear into the depths of Google, never to be seen.

In this practical guide, we'll explain clearly what the Meta Robots Tag is, how it works, and how you can use it strategically for your SEO. We'll look at typical use cases, common mistakes and misunderstandings, and provide best practices as well as solutions from an SEO perspective. Whether you are just getting started with SEO or you have already got some experience under your belt – you'll find valuable knowledge and tips here to optimally control the indexing of your website. For more information on the basics of indexing, you can also read out index management basics guide.

What is the Meta Tag for Robots and why is it important?

The Meta Robots Tag (also called "Robots Meta Tag") is an HTML command that's placed in the <head> section of a page. It tells search engine crawlers (like Googlebot, Bingbot, and the like) how they should handle that specific page. Unlike the robots.txt file, which regulates which areas of a website can be crawled at the domain level, the Meta Robots Tag works on an individual page basis. It determines whether a single page can be indexed and whether the links on it should be followed.

Here's what a Meta Robots Tag looks like in the source code: <meta name="robots" content="noindex, follow"/>. With instructions like these, you can specify, for example, that a page should not be included in the search index (noindex), but the crawler is still allowed to follow all the links on it (follow). This fine-grained control is incredibly important for SEOs to make sensible use of their crawl budget and to keep unwanted content out of the search results.

Why is this important?

Ideally, every page in Google's index should offer high-quality content that's relevant to users. Pages that have no business being there – such as internal search results, duplicates, or outdated content – can harm your website's overall image or tie up important crawler resources. With the Meta Robots Tag, you retain control: you decide which pages Google and others are allowed to show and where they should go next. This know-how is gold, especially for SEO managers and internal marketers, to strategically manage the visibility of their website.

In short: The Meta Robots Tag is your direct line to the search engine crawler. Use it wisely – it can decide whether a page ranks or disappears into obscurity.

Google Search Console graph, index vs noindex pages — How Google Search Console display which website pages are indexed or not indexed.

Index vs. noindex: Which pages should end up in the index?

One of the main functions of the Meta Robots tag is the decision subscript or Noindex. These directives determine whether a page is included in the search engine index (index) or just not (Noindex). Let's look at both options in detail.

Index — share pages with search engines

The index directive signals: "This page can appear in the search results." By default, search engines always assume that your page can be indexed if there is no Meta Robots Tag present. That means if you don't specify anything, the behaviour is as if index were there. In practice, you rarely need to explicitly write index – it's the default setting.

Example: Your newly created blog post with high-quality content should, of course, be found. You don't specify a Meta Robots Tag, so Google will index it (provided the page is crawlable and there are no other conflicting factors). Indexing means that this URL will be included in the search index and can appear for relevant search queries.

Noindex – Specifically excluding pages from indexing

With noindex, you tell the crawlers: "Please do not include this page in the index." As soon as Google discovers this instruction while crawling, the page will be removed from the search results during the next update, or it won't be included in the first place. Important: The search engine must be able to crawl the page to see the noindex tag. Only then can it follow the instruction.

What happens with Noindex? As long as the tag is present, the page will not appear in the search results. If it was already in the index, it will be removed after some time. (This process isn't immediate; more on that later.)

What do you use Noindex for? For all pages that don't offer any added value in search or should not be publicly discoverable for other reasons. We'll look at typical examples further down (think: internal pages, duplicate content, thank-you pages, etc.).

Careful: Noindex is a double-edged sword. If you use it on the wrong page – such as an important product or category page – you'll lose rankings and traffic because the page will disappear completely from Google and others. Especially on large websites, noindex should be used very selectively and sparingly.

Remember: Noindex removes a page firmly from the index, while a Canonical tag (another SEO control tool) is more of a recommendation to index a different page. Google treats Canonicals as a hint, whereas noindex is a direct instruction. If Canonical strategies fail or are not sufficient, noindex can serve as the next step – but always with caution.

‍

Follow vs. nofollow: Should crawlers follow the links?

The second important function of the Meta Robots Tag concerns the following of links. Here, follow and nofollow are the two options. With these, you control whether the links on a page should be further followed by search engine crawlers and included in their evaluation, or not.

Follow – Allowing link passing (Default)

The follow instruction means: "Follow all the links on this page." This is also the default behaviour for crawlers. So, if a page doesn't contain any other instruction, Google and other bots will follow all the links they find, crawl the page behind them (if allowed), and potentially include it in the index. follow therefore doesn't need to be explicitly written in the code – it's assumed automatically.

SEO Impact: follow allows link juice (ranking power through links) to be passed on. Internal links can thus have their positive effect, and the crawler would also visit external links (although for external links, the linking website decides via the rel="nofollow" attribute, more on that in a moment). In short: Follow keeps the crawler flow going. A page with index, follow is a normal part of the web, fully integrated into the index, and passes on its link power.

Nofollow – Not evaluating and not following links

With nofollow, you give the instruction: "Do not follow the links on this page." This means the crawler will look at the content of the page but will ignore all the links on it – as if they were dead ends. This option is used to specifically prevent link passing.

A few important points about nofollow:

Application: In practice, nofollow as a Meta Robots setting is rarely used globally for an entire page. It's more common to see rel="nofollow" at the link level – for example, to mark individual external links as untrustworthy or to devalue paid links. Setting all links on a page to nofollow globally via the Meta Tag only makes sense in exceptional cases (such as on archive pages with endless unimportant links, or a page that is solely a directory of links that should not be evaluated at all).
Effect on SEO: Nofollow means no link juice flows further. Internal linking structures would be interrupted. In the past, this was sometimes used for PageRank sculpting (consciously controlling the distribution of link power), but this is less relevant today. Search engines have also become smarter: Google now considers nofollow links as a hint and no longer as a strict instruction for indexing. Generally, nofollow links are not included in the ranking calculation, but Google can theoretically still decide to follow a nofollow link to discover new content. So, don't rely on nofollow always stopping a crawler 100% – it usually does, but it's not guaranteed.
Internal vs. External: Internal pages should ideally not be devalued with nofollow, otherwise you'll cut off valuable internal linking and lead the crawler in circles. For external links, you can use nofollow if you don't want to give an endorsement (e.g., for user-generated content, in comments, or for paid partnerships according to Google's guidelines).

Remember: If a hint about following is missing in the Meta Robots Tag, crawlers assume follow. nofollow should be used selectively and sparingly. Keep in mind that noindex, nofollow together creates a kind of black hole: the page disappears from the index, and the crawler finds no further paths from there. As a result, you waste crawl resources. Therefore, SEOs usually prefer noindex, follow – this keeps the page invisible in search, but links on it are still followed (for example, so that internal links to pages that are still indexed don't lead nowhere).

‍

Understanding important combinations of the Meta Robots tag

The Meta Robots instructions can be combined. In total, there are four possible combinations, with two being particularly relevant:

index, follow – Standard case: index the page, follow links.
noindex, follow – Do not index the page, but follow links on it. (Common setting for pages to be excluded)
index, nofollow – Index the page, but do not follow any links on it. (Rarely used)
noindex, nofollow – Do not index the page and do not follow links. (Edge case, to be avoided if possible)

Index, Follow (Default):

As mentioned, this is the behaviour without a specific tag. You only actively need to specify this combination if, for example, you want to ensure that a previously differently configured area is indexed again (for instance, if you accidentally set Noindex and want to correct it). Otherwise, you can leave it to the search engine – it indexes and follows by default.

Noindex, Follow (Recommended for pages to be excluded):

This setting is used when you want to keep a page out of the search results without disrupting the crawler's flow. Examples: You have an internal "Thank You" page after a form submission. The content on it is not relevant for other users, so it shouldn't rank (Noindex). At the same time, it might contain a link back to the homepage or other helpful links – these are welcome to be crawled (Follow). Noindex, Follow is best practice for most cases where you want to exclude pages. For example, A/B test landing pages for campaigns or internal search results pages stay out of the index but still pass on link power.

Index, Nofollow (Special Case):

This combination says: "The page can be in the index, but do not trust any links on it." When could this be useful? Hardly ever in normal SEO practice. If you trust a page so little that none of its links should count, you usually wouldn't want it indexed either. A conceivable case: a public page with a huge directory of links that you don't want to be evaluated (e.g., a user-generated link listing). Here, you could theoretically set index, nofollow – the page itself can be found (if it contains informative text, for example), but all the listed links within it are considered "not recommended". Nevertheless, such scenarios are rare. Most of the time, you'd use Nofollow directly on individual links, not site-wide.

Noindex, Nofollow (To be avoided):

This setting excludes the page from the index and stops link following. Effectively, you completely isolate the page. The crawler arrives, sees Noindex and Nofollow, takes nothing, and leaves. This only makes sense in a few edge cases – perhaps for a temporary test page where you are 100% sure that no links on it are important. In most cases, however, securing it with a password or Disallow in robots.txt would be better than letting crawlers on it at all. Conclusion: noindex, nofollow makes the crawler blind and deaf to this page and everything on it – only use this if you know exactly what you are doing.

‍

How the different combinations can appear in the HTML.

Typical use cases and best practices for Noindex

The targeted use of Noindex can improve your SEO performance by keeping irrelevant or problematic pages out of the search index. Here are some practical examples of when you should set pages to Noindex – and when you shouldn't:

Suitable Pages for Noindex: ("These pages I don't want Google to see")

Internal Search Results: Pages that display search results on your own website (e.g., ?s=keyword or filter search pages). They offer no added value for external users on Google and often lead to duplicate content. -> Solution: Use noindex, follow so that Google ignores these pages but still finds any product links within them.
Thank You/Confirmation Pages: After a contact request or order, users often land on "Thank you for your enquiry" pages. These don't have any independent content for search. -> Noindex keeps them out of the index.
Login and Account Areas: Login pages, shopping carts, user profiles, etc., should not be publicly discoverable. They are also often sensitive. -> Noindex (and additionally, possibly secure via robots.txt, see below).
Pages with Little or No Content: Placeholder pages, "Under Construction" pages, or very thin content. Before such thin pages appear in the index and lower the quality of your domain, it's better to set them to Noindex until they are finished.
Duplicates or Print Versions: Do you have multiple versions of the same content (e.g., a print version of a page without added value)? -> Index one version (the main page), and set the other to Noindex or, better, link them with a Canonical tag. (If technically possible, Canonical is often preferable here so that link signals remain unified.)
Campaign Landing Pages without SEO Value: Performance marketing teams often create numerous landing pages for ads that are not optimised for search engines and may only be relevant for a short time. -> You can mark these pages with Noindex, Follow so that they don't appear in the organic index, but internal links might still count.

Pages where Noindex should not be set: ("These pages should remain indexed")

Important Product and Category Pages: In e-commerce, for example, you generally want all relevant products and category overviews to be indexed. Product variations (e.g., different colour or size) should usually also not be excluded via Noindex, as they can be useful for users. Here, a Canonical solution would be more appropriate if they are to be grouped together from an SEO perspective.
High-Quality Content Pages: Every blog article, every guide page (like this one), and generally pages that visitors should find via Google must, of course, not be set to Noindex. Sounds obvious, but it can quickly happen that you accidentally copy a Noindex when copying page templates – fatal for visibility! (Tip: Check in the Google Search Console whether important pages are reported as "Excluded by noindex".)
Pages that point to others via Canonical: If Page A has a Canonical link to Page B (thus indicating B as the preferred index version), Page A should not also have Noindex. It's one or the other. A common mistake is to provide faceted search pages with both Canonical and Noindex. This is redundant and can lead to confusion – either set a Canonical (if you want the page to be crawled but not indexed in favour of another) or Noindex (if the page really shouldn't appear at all). Both together are unnecessary; in the worst case, Google might ignore the Canonical or waste crawl resources.

In summary: Noindex is excellent for keeping certain types of pages (internal, duplicated, irrelevant) out of the search results. But always consider whether there are better alternatives (Canonical, access protection, etc.) and whether you might accidentally hide useful content by excluding them. Every page you set to Noindex should be a conscious choice.

Common Mistakes and Misunderstandings with the Meta Robots Tag (Causes, Effects & Solutions)

Despite its simple syntax, mistakes often happen in practice when using the Meta Robots Tag. These errors can range from harmless to catastrophic. Here are the most common pitfalls – each with its cause, potential impact on SEO, and the corresponding solution from an SEO perspective:

1. Noindex + robots.txt-Disallow on the same page

Error/Cause: You block a page in the robots.txt (Disallow) but simultaneously set a Noindex on the page itself. Many do this assuming that double is better.
Impact: A paradox – the crawler is not allowed to access the page due to the robots.txt and therefore doesn't see the Noindex tag at all. The consequence: if the page was previously indexed, it may remain in the index (Google knows the URL but is not allowed to crawl it). It might then appear as "Page is in the index, however blocked by robots.txt" (a typical report in the Search Console). Thus, the unwanted result remains visible, often without content or with old content.
Solution: Don't combine them! Depending on your goal, decide either for noindex or for Disallow. Rule of thumb:
- If you want to remove a page from the index, allow Google access and use Noindex. Only block it afterwards via robots.txt, if necessary at all. (Practical tip: First set all relevant URLs to Noindex and wait until they disappear. Later, you can still block them via robots.txt to save crawl budget once they are safely de-indexed.)
If you don't want a page to be crawled in the first place (e.g., for security reasons or because it's irrelevant but should never be in the index), then use Disallow in the robots.txt without Noindex. For example, for admin areas, tracking parameter pages, or similar where you don't want to reveal any content at all.
‍Remember: Noindex only works if crawlers are allowed to see the page. A recent Google statement (2024) emphasises exactly this: don't use Disallow and Noindex at the same time, as a disallowed page can otherwise remain indexed despite the Noindex – albeit without content. Better: let Google crawl the page to pick up the Noindex, or block it completely without Noindex.

2. Noindex vs. Canonical Confused or Doubled Up

Error/Cause: Pages are marked with Noindex when a Canonical tag would actually be appropriate (or vice versa). An example: you have very similar pages (e.g., filter pages in a shop) and you actually only want one main page to rank. Instead of setting a Canonical, you mistakenly take all the alternatives out of the index with Noindex. Or you set both: the alternative pages have a Canonical to the main page and additionally Noindex.
Impact: In cases where Canonical is sufficient, a Noindex can be too strict. Noindex kicks the page out completely – even if it might still bring in traffic if the Canonical doesn't take effect. Furthermore, internal links from this page might be lost if you use Noindex+Nofollow. Conversely, a page with a Canonical to itself and Noindex (yes, this happens due to copy errors) would be nonsensical – it would indicate itself as the main page but still never be indexed.
Solution: Develop a clear strategy: Is it a duplicate content or a variation problem? -> Use Canonical tags so that Google understands the main version, but continue to crawl the alternatives (without Noindex) so that, for example, ratings or other signals can be transferred. Only use Noindex if a page really shouldn't appear at all. And then usually without a Canonical, except for the special case: a Noindex page can certainly have a Canonical to another page (but Google will probably ignore it since the page isn't indexed anyway). The important thing is: don't mix contradictory instructions. Either say "take A instead of B" (Canonical) or "don't take B at all" (Noindex).

3. Global Nofollow Usage (Yesterday's Link Sculpting)

Error/Cause: Older SEO strategies recommended setting less important links site-wide to Nofollow to "save link power". Some webmasters even devalued entire page sections or the entire menu via Meta Robots nofollow.
Impact: Crawler dead ends. Providing internal navigation with Nofollow prevents Google from logically understanding your website. Important pages might not be found or no longer contextually linked internally. This does more harm than good. You also give away ranking potential because internal links are also an indication of the importance of pages for Google. A global Nofollow isolates pages from each other.
Solution: Use Nofollow sparingly and selectively. Practically never for internal links, except in absolute exceptional cases. If you want to direct "link juice", do it via a clever internal linking structure (e.g., which pages you frequently link from the menu), but not via Nofollow tags. Google itself advises against it. Nowadays, Google considers Nofollow as a hint and might still crawl – but the links definitely don't contribute to ranking. In short: focus on using good internal linking instead of trying to artificially build up link juice.

4. Hreflang References to Noindex Pages

Error/Cause: On internationalised websites, it happens that hreflang links (which link language/country versions) point to pages that have Noindex. Or vice versa: a Noindex page contains hreflang information to other pages. This often happens if, for example, you want to exclude a specific language version from the index but forget to remove it from the hreflang sets.
Impact: Hreflang tags are supposed to help Google recognise equivalent pages in different languages and display them appropriately. However, if one of these linked pages is Noindex, it creates confusion. Google might not be able to resolve the hreflang group correctly because one member shouldn't be indexed at all. In the worst case, Google ignores the hreflang information completely because something isn't consistent. Besides, it makes little sense: why offer a search engine alternative language pages that it's not allowed to index?
Solution: Ensure consistency: only indexable pages should be included in hreflang clusters. If, for example, you don't want to serve certain regions, it's better to omit these hreflang entries altogether instead of setting the pages to Noindex and still referencing them. Check regularly (especially after website updates) whether any page with noindex accidentally still has hreflang references or is referenced by other hreflang tags. Clean up such cases to send clear signals to Google.
For more information on hreflang you can also access our guide on why hreflang is essential for global brands.

5. Noindex Not Removed in the Development Environment

Error/Cause: During a website relaunch or on a staging environment, a global Noindex is often set so that unfinished content doesn't appear in Google. This is initially correct. The mistake happens when the Noindex tag is forgotten to be removed when going live. Suddenly, the freshly relaunched page is invisible to Google – a nightmare!
Impact: As long as the Noindex is active on the live page, the page loses rankings every day. Existing users might still get there directly, but new customers will find nothing in the search. Traffic and sales can plummet dramatically. Every day counts here. This error is unfortunately one of the most costly SEO blunders that can happen.
Solution: Checks and safeguards: when setting up the development environment, it should already be password-protected or protected by IP blocking so that crawlers can't get there at all. The Noindex then only serves as an additional safety net if something does become public. When launching, "remove Noindex" belongs right at the top of the launch checklist! Ideally, run a crawl (e.g., with Screaming Frog) before and after going live and filter for pages with Noindex to make sure no productive page is accidentally excluded. And if it has happened: remove the tag as quickly as possible and request a recrawl by Google (e.g., via the Search Console "Test Live URL" and "Request Indexing").

6. Expectation: "Noindex Works Immediately"

Misunderstanding/Cause: Many think that as soon as they give a page noindex, it immediately disappears from the search results. This assumption is understandable, but that's not how it works.
Impact: You set Noindex, check Google an hour later – and the page is still there. Uncertainty spreads: Did it not work? Is Google ignoring me? In the worst case, people start frantically experimenting – perhaps also removing it via the Search Console, changing robots.txt, etc., which leads to a chaos of signals.
Solution: Patience and, if necessary, acceleration: Noindex takes effect as soon as Google has recrawled and processed the page. Depending on the crawl rate, this can take anywhere from a few hours to several weeks. In the Google Search Console under "Indexing > Pages", you can see which pages are marked as "Excluded by noindex tag" – this shows that Google has recognised the tag. If you need it to be quick, use the removal tool in the Search Console to temporarily remove the URL from the search results. This is an emergency lever that takes immediate effect, but in parallel, the page should of course continue to carry the Noindex tag so that it remains permanently excluded. Basically: don't panic, Noindex takes some time. This is especially true for large websites with many URLs – here, complete de-indexing can take months until Google catches them all. This is normal.

7. Noindex Pages in the XML Sitemap

Error/Cause: Your XML sitemap lists all pages, including those that are actually Noindex. This happens if the CMS/sitemap plugin doesn't differentiate between indexable and non-indexable pages.
Impact: Google finds a URL in the sitemap and thinks it should be indexed, but then sees the Noindex on the page. This isn't the end of the world – Google often marks this in the Search Console as a hint "URL in Sitemap, but excluded by noindex". It's a bit contradictory and could waste crawler resources, as Google keeps checking "is it perhaps indexable now?".
Solution: Clean up the sitemap: ideally, you should only list pages in your sitemap that are intended to be indexed. You can omit Noindex pages. This sends clear signals: what's in the sitemap should be included, the rest is either not that important or deliberately excluded. Most modern SEO plugins or CMS have options to exclude, for example, thank-you pages or category pages with Noindex from the sitemap.

These are just some common mistakes. The important thing is: every time you use the Meta Robots Tag, ask yourself why and check the effects. A small typo or a slip of the mind ("Oops, put the wrong page on Noindex!") can have significant consequences. If in doubt, consult an experienced SEO or test changes on a smaller scale first.

Internal Tip: Use tools like Screaming Frog, SISTRIX Optimizer, or Ahrefs Crawler to regularly check your website. They quickly show you where Meta Robots instructions are located and whether there are, for example, pages with Noindex that are still in the sitemap or important pages that have accidentally been set to Noindex/Nofollow. The Google Search Console is also your friend: in the indexing report, you can immediately see if URLs are "Excluded by the noindex tag" or if other indexing problems occur. This allows you to identify and fix errors early on.

Conclusion: Key takeaways on the Meta Robots Tag

The Meta Robots Tag is a powerful control instrument in SEO that allows you to determine for each individual page whether it will be indexed and how crawlers should handle it. Used correctly, you help search engines focus on the important pages and avoid irrelevant or sensitive content appearing in the search results. In conclusion, here are the most important insights and tips at a glance:

Meta Robots = Page-Specific Crawling & Indexing Control Centre: Via <meta name="robots" content="...">, you control indexing (index/noindex) and link following (follow/nofollow) per page. What is not written in the tag remains at the default setting (index, follow).
Use Noindex Consciously and Selectively: Only use noindex where you really want to remove pages from the search results (e.g., internal search, thank-you pages, duplicates). Remember that these pages will then no longer be able to deliver traffic via search. Never set important pages to Noindex!
Don't Contradict with robots.txt: Avoid combining Disallow (robots.txt) and noindex. Depending on your objective, decide whether to allow crawling so that Noindex can take effect, or to prohibit crawling via Disallow if indexing shouldn't be an issue in the first place (e.g., internal login pages). Double blocking only causes confusion.
Handle Follow/Nofollow Cleverly: Generally allow crawlers to follow links (follow) so that your website is fully recorded and link juice can flow. Global nofollow cuts off connections – only use it in exceptional cases. For external links, you can selectively use rel="nofollow", but usually not internally.
Check Regularly: Make it a habit to check your pages for unwanted Meta Robots settings. A small error (e.g., a forgotten Noindex from a relaunch) can cause enormous damage. Tools and the Search Console will help you with monitoring.
When in Doubt: Ask Experts: Technical SEO can be tricky. If you are unsure whether your indexing strategy is correct – for example, with complex combinations of Noindex, Canonical, Hreflang, and Co. – get advice from experienced SEOs. Often, such topics are better checked twice than implemented incorrectly once.

With these points in mind, you are well-equipped to use the Meta Robots Tag effectively and without nasty surprises. This way, you retain control over what Google gets to see of your website – an essential success factor in SEO.

Call to Action: Are you unsure whether your website is optimally indexed or do you suspect errors in the use of Meta Robots? Then let our SEO experts support you! We at panpan.digital will be happy to help you with a non-binding SEO consultation or a thorough SEO audit to correctly adjust technical levers such as the Meta Robots Tag. Contact us now and get professional support for your SEO strategy – so that no important content remains hidden!

Would you like more technical SEO tips? Take a look at our guide section! For example, our article "404 Errors and SEO: Your Comprehensive Guide" explains how to deal with not found pages and what impact they have on indexing. Happy reading and optimising!

‍

Meta Robots Tag FAQ

What exactly does the Meta Robots Tag do?

The Meta Robots Tag is an important HTML code on every HTML page that gives search engine bots (also called web crawlers) instructions at the page level. By using specific values in the so-called Content attribute, you can determine whether your website is indexed on the internet or whether the crawler is allowed to follow links. Without the Meta Robots Tag, search engines by default assume that pages should be indexed ("index") and links should be followed ("follow").

Where do I insert the Meta Robots Tag?

You place the Meta Robots Tag in the HTML <head> section of your page, usually along with other meta tags. For example: <meta name="robots" content="noindex, follow">. Many Content Management Systems (CMS) or SEO plugins offer settings for this, so you don't have to work manually in the code. It's important that name="robots" only appears once per page. You can alternatively set search engine-specific tags (e.g., <meta name="googlebot" content="...">), but generally, the general "robots" tag is sufficient for all common search engines.

What is the difference between Meta Robots and robots.txt?

The Meta Robots Tag explicitly controls indexing and link following on the respective HTML page. The robots.txt, on the other hand, regulates access to individual subpages or directories of your website for search engine bots in advance. Both methods work together in different ways to protect content from unwanted indexing or to use the crawler's resources more efficiently.

How can I check if a page is set to Noindex or Nofollow?

There are various methods: direct inspection of the HTML page via the page source code, SEO tools, or browser plugins. It's particularly easy with the Google Search Console. Here you will find all relevant information on whether a page has been correctly interpreted by the search engine bots, for example, whether a noindex tag has been recognised and whether it has any impact on your ranking.

How long does it take for a Noindex page to disappear from Google?

The removal of a Noindex page from the search results depends on the crawl frequency of the web crawlers and does not happen immediately. It usually takes between a few days and several weeks. For immediate removal, Google offers supplementary methods in the Search Console (e.g., the URL removal tool). Here, the URL is entered directly, which speeds up the removal.

Should I keep an unwanted page out of the search results with Noindex or via robots.txt?

When in doubt: Noindex. Because Noindex ensures that the page (once crawled) stays out of the index. A pure block in the robots.txt prevents crawling, but not necessarily indexing – Google could still index the URL if external links point to it, albeit without content (which then looks bad in the results). The best approach, as described above: first Noindex, later optionally Disallow, if really necessary. For pages with sensitive data or absolutely no relevance that you never want crawled (e.g., admin logins), you can directly set robots.txt Disallow and additionally better secure them with password/IP protection. Noindex is not necessary here, as ideally Google should know nothing about the page at all.

Are there other Meta Robots instructions besides Index/Noindex and Follow/Nofollow?

Yes, the Meta Robots Tag can contain other parameters to control the search result behaviour. Some examples:

noarchive: The search engine should not offer a cached copy (cache version) of the page.
nosnippet: No snippet (text excerpt) should be shown under the search result. (The page can still be indexed but will then appear without a description text.)
noimageindex: Images on the page should not be indexed in Google Image Search.
notranslate: Google should not offer a translation function for this page.
max-snippet:-1, max-image-preview, max-video-preview: With such directives, you can control the length of the snippet or the display of preview images.

These additional instructions are more optional and concern the display in the search or special cases. For basic indexing control, index/noindex and follow/nofollow are mainly crucial. Also worth mentioning is the X-Robots-Tag HTTP header variant: this allows you to send similar instructions in the HTTP header of a response (practical, for example, for PDF files or other non-HTML documents where you can't place a meta tag in the code). For normal websites, however, the Meta Robots Tag in the HTML is sufficient.

Does a Noindex page affect my SEO performance?

A page that is set to Noindex cannot achieve any ranking itself because it is not in the index. However, this does not negatively affect your other, indexed pages – so it's not the case that Google "punishes" you just because you have many Noindex pages. John Müller from Google has confirmed that Noindex pages are not harmful per se. However, they do consume a little crawl budget because Google still has to check them from time to time. If very many unimportant pages are crawled, this could in extreme cases mean that more important pages are visited less frequently. Therefore: completely remove unnecessary pages or reduce crawling (e.g., for typical unimportant pages) is perfectly normal and part of healthy SEO hygiene.

Can Google still index or display a NoIndex page?

Basically: no, if Google can read the Noindex tag, the page will not be displayed in the search results. However, there are special cases: if you disallow a page via robots.txt, Google won't see the Noindex and could still include the URL in the index directory (without content). Or if external pages link to your Noindex page and you later completely remove this page, Google may still display an entry for a while (usually as an empty result or an error). But a page that is actively marked with <meta name="robots" content="noindex"> and is accessible to Google remains invisible in the search. Just make sure the tag is written correctly (typos like noindes or incorrect quotation marks can make it ineffective).

Should I use nofollow for outgoing links?

For internal links, generally no – internally you want Google to crawl your entire site. For outgoing (external) links, it depends: if you are referring to a page you trust and that is helpful for the user, then leave it as follow (that's a "real" recommendation from you). However, if you have a link that is, for example, paid (advertising, affiliate) or leads to a source you want to distance yourself from, then set rel="nofollow" on this link. This way, no PageRank flows, and you adhere to Google's guidelines for paid/untrustworthy links. Note: there are also rel="ugc" (User Generated Content) and rel="sponsored" as special forms that Google has introduced – but they are treated similarly to Nofollow (hint for Google). You rarely need the Meta Robots nofollow as a page setting, as described above. For external links, focus on the individual link level and decide on a case-by-case basis.

Do you have more questions about indexing or other SEO topics? You'll find more helpful articles in our SEO Knowledge Base. And if your specific concern isn't answered here, don't hesitate to contact us directly – we'll be happy to help you so that your website plays right at the top of the search index!

All Articles

SEO