The Importance of a Robots.txt File for Your SEO
Published July 03, 2010
Example of a Robots.txt File:
Your Robots.txt file is what tells the search engines which pages on your website to access and index and which pages not to. For example, if you specify in your Robots.txt file that you don’t want the search engines to be able to access your thank you page, search engines that honor the file won’t crawl that page, so it’s unlikely to show up in the search results where web users could find it. Keeping the search engines away from certain pages on your site matters for both the privacy of your site and for your SEO. This article will explain why this is and show you how to set up a good Robots.txt file.
How Robots.txt Works
Search engines send out tiny programs called “spiders” or “robots” to search your site and bring information back to the search engines so that the pages of your site can be indexed in the search results and found by web users. Your Robots.txt file instructs these programs not to search pages on your site which you designate using a “disallow” command. For example, the following Robots.txt command:
User-agent: *
Disallow: /thankyou
…would block all search engine robots from visiting the following page on your website:
http://www.yoursite.com/thankyou
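If you’d like to see how a robot reads these two lines, here is a minimal sketch using Python’s standard urllib.robotparser module. The robot name “AnyBot” and the /about page are made up for illustration; only the rules themselves come from the example above.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule set from the example above.
rules = """\
User-agent: *
Disallow: /thankyou
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# "AnyBot" is a hypothetical robot name; the "*" group applies to it.
thankyou_ok = parser.can_fetch("AnyBot", "http://www.yoursite.com/thankyou")
about_ok = parser.can_fetch("AnyBot", "http://www.yoursite.com/about")

print(thankyou_ok)  # False: the path is disallowed
print(about_ok)     # True: no rule matches this path
```

Keep in mind that a disallow rule matches by prefix, so the same rule would also cover a path such as /thankyou.html.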
Notice that before the disallow command, you have the command:
User-agent: *
The “User-agent:” part specifies which robot you want to block and could also read as follows:
User-agent: Googlebot
This command would only block the Google robots, while other robots would still have access to the page:
http://www.yoursite.com/thankyou
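The difference between naming one robot and using “*” can be checked the same way. This is only a sketch with Python’s standard urllib.robotparser module, and “OtherBot” is a hypothetical robot name used for comparison.

```python
from urllib.robotparser import RobotFileParser

# A rule group that names only Googlebot.
rules = """\
User-agent: Googlebot
Disallow: /thankyou
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

url = "http://www.yoursite.com/thankyou"
googlebot_ok = parser.can_fetch("Googlebot", url)
otherbot_ok = parser.can_fetch("OtherBot", url)  # hypothetical robot name

print(googlebot_ok)  # False: the group names Googlebot
print(otherbot_ok)   # True: no group applies to this robot
```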
However, by using the “*” character, you’re specifying that the commands below it refer to all robots. Your robots.txt file would be located in the main directory of your site. For example:
http://www.yoursite.com/robots.txt
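Because the file always sits at the top level of the site, you can work out its address from any page URL. The helper below is a small sketch; the function name robots_url and the sample blog URL are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.yoursite.com/blog/post?id=3"))
# http://www.yoursite.com/robots.txt
```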
Why Some Pages Need to Be Blocked
There are three reasons why you might want to block a page using the Robots.txt file. First, if you have a page on your site which is a duplicate of another page, you don’t want the robots to index it, because that would result in duplicate content, which can hurt your SEO. The second reason is that you may have a page on your site which you don’t want users to be able to access unless they take a specific action. For example, if you have a thank you page where users get access to specific information because they gave you their email address, you probably don’t want people to be able to find that page by doing a Google search. The third reason is that you may want to keep robots out of private areas of your site, such as your cgi-bin, and keep your bandwidth from being used up by robots indexing your image files:
User-agent: *
Disallow: /images/
Disallow: /cgi-bin/
In all of these cases, you’ll need to include a command in your Robots.txt file that tells the search engine spiders not to access that page, not to index it in search results and not to send visitors to it. Let’s look at how you can create a Robots.txt file that will make this possible.
Creating Your Robots.txt File
By setting up a free Google Webmaster Tools account, you can easily create a Robots.txt file by selecting the “crawler access” option under the “site configuration” option on the menu bar. Once you’re there, you can select “generate robots.txt” and set up a simple Robots.txt file as in the example below:

As you can see, you select the “block” option under “action” and then specify the robots that you want to block under “User-agent.” Then, you simply type in the directories that you want to block under “directories and files.” As you do this, be sure that you leave the “http://www.yoursite.com” part of your URL off. For example, if you want to block the following pages:
http://www.yoursite.com/thankyou
http://www.yoursite.com/freestuff
http://www.yoursite.com/private
You would type the following into the “directories and files” field in Google Webmaster Tools:
/thankyou
/freestuff
/private
After adding these for all robots and clicking “add rule,” you would end up with a Robots.txt that looked like this:
User-agent: *
Disallow: /private
Disallow: /thankyou
Disallow: /freestuff
Allow: /
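If you’d rather build the same kind of file yourself, the steps above can be sketched in a few lines of Python. The function names to_path and build_robots_txt are made up for this example, and it follows the layout of the generated file shown above (one group, a list of disallow lines, then the default allow line). It also strips the “http://www.yoursite.com” part for you, so only the path ends up in the file.

```python
from urllib.parse import urlsplit


def to_path(url_or_path: str) -> str:
    """Drop the scheme and host, keeping only the path part."""
    path = urlsplit(url_or_path).path
    if not path.startswith("/"):
        raise ValueError(f"expected a path starting with '/': {url_or_path!r}")
    return path


def build_robots_txt(pages, user_agent="*"):
    """Build a one-group robots.txt that blocks the given pages."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {to_path(page)}" for page in pages]
    lines.append("Allow: /")  # default rule, as in the generated example
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt([
    "http://www.yoursite.com/private",  # full URL: host is stripped
    "/freestuff",                       # already a path: kept as-is
])
print(robots_txt)
```

Saving the returned text to a file named robots.txt gives you something you can upload in the next step.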
Notice here that you have a default “Allow” command, which is useful if you want to make an exception and allow one robot to access a page which you have blocked for all other robots using a command like:
User-agent: *
Disallow: /images/
By adding a separate group for that robot:
User-agent: Googlebot
Allow: /images/
…you’d be allowing ONLY the Googlebot to access the images directory of your site, because a robot follows the group that names it specifically rather than the “*” group. Once you’ve specified which pages and files you want to block, click the “download” option to download your Robots.txt file.
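Under the robots exclusion standard, an exception for a single robot is normally written as its own “User-agent” group, and a robot follows the group that names it rather than the “*” group. Here is a rough way to confirm that behavior with Python’s standard urllib.robotparser module; “OtherBot” and the logo.png file are hypothetical names for illustration.

```python
from urllib.robotparser import RobotFileParser

# Block /images/ for everyone, then give Googlebot its own group.
rules = """\
User-agent: *
Disallow: /images/

User-agent: Googlebot
Allow: /images/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

image = "http://www.yoursite.com/images/logo.png"  # hypothetical file
googlebot_ok = parser.can_fetch("Googlebot", image)
otherbot_ok = parser.can_fetch("OtherBot", image)  # hypothetical robot

print(googlebot_ok)  # True: Googlebot uses its own group
print(otherbot_ok)   # False: falls back to the "*" group
```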
Installing Your Robots.txt File
Once you have your Robots.txt file, you can upload it to the main (www) directory in the CNC area of your website. You can do this using an FTP program like Filezilla. The other option is to hire a web programmer to create and install your robots.txt file by letting them know which pages you want to have blocked. If you choose this option, a good web programmer can complete the job in less than one hour.
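If you’re comfortable with a little scripting, the upload can also be done with Python’s standard ftplib module instead of a desktop FTP program. This is only a sketch: the function name upload_robots is made up, and the host, login details and “/www” directory in the commented usage are placeholders that depend entirely on your hosting account.

```python
from ftplib import FTP  # used in the commented usage below


def upload_robots(ftp, local_path="robots.txt", remote_dir="/"):
    """Upload local_path as robots.txt into remote_dir over an open FTP session."""
    ftp.cwd(remote_dir)
    with open(local_path, "rb") as fh:
        # The remote file must be named exactly robots.txt.
        ftp.storbinary("STOR robots.txt", fh)


# Example usage (placeholder host, credentials and directory):
# with FTP("ftp.yoursite.com") as ftp:
#     ftp.login("username", "password")
#     upload_robots(ftp, "robots.txt", "/www")
```

After the upload, opening http://www.yoursite.com/robots.txt in a browser is a quick way to confirm that the file is where the robots will look for it.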
Conclusion
It’s important to update your Robots.txt file whenever you add pages, files or directories to your site that you don’t wish to be indexed by the search engines or found by web users through search. Keeping the file current helps protect the private areas of your site from being crawled and supports the best possible results with your search engine optimization.
