SEO: The Importance of a Tidy House

As the common saying goes, “it is important to get your own house in order before worrying about anyone else’s”, and this principle is especially true in the world of SEO.

In more basic terms, this means making sure your website is technically healthy before putting extensive effort into promoting it through link building, content marketing or any other marketing channel.

So, what does this include? Well, the first process any agency or SEO specialist will work through is a full and comprehensive technical audit to identify any problems which might be holding the website back, or which might be frowned upon by Google’s Big Brother-style eye.

 

What to look for…

Many articles have been written on what to look out for in an audit, and they typically cover issues such as Duplicate Content, Over-Optimisation of Title Tags and Meta Data, missing or incorrect Canonical tags, and issues with PageSpeed. Although these are all valid points, they are merely a fraction of the things that need looking at, and they are some of the most obvious.

With such a potentially wide range of issues which could be present on a website, it is important to audit the website in stages to ensure nothing is missed or overlooked. The stages I would typically work through are as follows: Basic, Advanced & CRO/Usability.

At any of these three stages you might find that an issue needs escalating to a more advanced stage; an example of this is explained below:

Basic audit issues to look for without the aid of SEO tools:

  • Check for non-www to www redirects
  • Check for base file site duplication (typically /index.html or /index.php)
  • Check for unwanted indexed pages (site:www.yourdomain.com) in Google
  • Check for ‘Restricted Resources’ in the robots.txt file
  • Check for Sitemap (.xml) & Sitemap (HTML version)
  • Check for Duplicate Content (copy and paste the first paragraph of text into the Google search bar)

If any of the above checks reveal potential issues, it is important to understand what each issue might mean for the website so that you can correctly rectify the problem. My explanations of these issues are:

 

  • Check for non-www to www redirect

Without this redirect in place, the website is available in its entirety at multiple addresses, which potentially causes duplicate content and the risk of incorrect or broken hyperlinks, both internally and externally, in future.
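
One common fix, assuming an Apache server where you have access to the .htaccess file (your developer or host can confirm this), is a permanent 301 redirect rule along these lines, with example.com standing in for your own domain:

    RewriteEngine On
    # Send any non-www request to the www version of the same URL (301 = permanent)
    RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
    RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]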

 

  • Check for base file site duplication (typically /index.html or /index.php)

Typically a problem on Magento-based websites, but one which can appear on any CMS-based website, this issue can cause site-wide duplication of all content and problems with canonical tags.

Tip: To diagnose this issue and test whether your website has this problem, simply add ‘index’ and the correct file extension (.php, .html, .aspx, .asp, .phtml etc.) after your domain name (for example: http://www.example.com/index.php) and hit ‘Enter’. If this page loads correctly without redirecting to the base domain (in our example: http://www.example.com), then you need to fix this issue. The normal solution is to add a rule to the .htaccess file of the website; this is typically a job for the developer.
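
As a rough sketch of what that rule might look like, again assuming an Apache server and a .php index file (adjust the extension to match your platform), the following 301-redirects any direct request for index.php back to the clean directory URL:

    RewriteEngine On
    # Catch direct requests for index.php in any directory...
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+(.*/)?index\.php[\s?] [NC]
    # ...and permanently redirect them to the directory root
    RewriteRule ^(.*/)?index\.php$ /$1 [R=301,L]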

 

  • Check for unwanted indexed pages (site:www.yourdomain.com) in Google

An issue which usually highlights thin or zero-content pages, which could of course land you in hot water with Google’s Panda algorithm. It occurs when pages such as ‘Tag’, ‘Author’ or Archive-style pages are indexed by the search engine (Tag, Author and Archive pages are typically found in WordPress, but they have similar substitutes depending on the CMS of your website). WordPress typically has settings which allow these to be configured correctly, or indeed turned off, to prevent this issue (other CMS systems may vary). For a lot of websites, ecommerce websites in particular, dynamic pages generated by search queries or filters can also cause this issue.
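
One common remedy, sketched here on the assumption that you can edit the relevant page templates (an SEO plugin will often do this for you), is a noindex robots meta tag on the page types you want kept out of the index:

    <!-- In the <head> of tag, author, archive or filtered-search templates -->
    <meta name="robots" content="noindex, follow">

The ‘follow’ part tells the search engine it may still follow the links on the page, so your internal linking is not wasted.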

 

  • Check for ‘Restricted Resources’ in the robots.txt file

Google recently announced that the search engine now reads all of the JavaScript and jQuery files which make a website function, as well as the HTML, CSS and images. This announcement was made as a back-up to the recent (April 2015) Mobile-Friendly update, and it explains a little of how Google determines whether the website renders correctly at each device size in order to give the website a quality score or ranking. Therefore, by ensuring that your website is not unnecessarily blocking any of the file directories (‘resources’) which make the website function, you are providing a better and more open view of the website to the search engine.
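
For illustration, assuming a WordPress-style directory layout (the paths will differ on other platforms), a robots.txt along these lines keeps private areas blocked without hiding the scripts and styles Google needs to render the page:

    User-agent: *
    # Keep the admin area out of the index...
    Disallow: /wp-admin/
    # ...but leave the directories holding CSS and JavaScript crawlable
    Allow: /wp-content/themes/
    Allow: /wp-content/plugins/
    Allow: /wp-includes/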

Tip: As well as manually checking the robots.txt file, Google also offers three important tools through the Search Console platform, called Blocked Resources, Fetch as Google and the robots.txt Tester, which can all aid you and provide a better insight into whether your website is operating properly.

 

  • Check for Sitemap (.xml) & Sitemap (HTML version)

An XML sitemap is an important part of the website, and it needs to be accurate so that the search engine can understand the structure of the website and so that all of the pages which need to be indexed are being indexed. It should be checked to ensure that only pages you want indexed are included and that the formatting of the file is correct; it can then be submitted to Google through the Search Console platform. An HTML version of your sitemap is a version for the user: it should be linked from your 404 page to aid the user’s journey through the website, and it serves as a navigational aid.
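
As a minimal sketch of the format (the URLs and date are placeholders), a valid XML sitemap looks like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.example.com/</loc>
        <lastmod>2015-06-01</lastmod>
      </url>
      <url>
        <loc>http://www.example.com/about/</loc>
      </url>
    </urlset>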

 

  • Check for Duplicate Content (copy and paste the first paragraph of text into the Google search bar)

Duplicate content is a big bugbear of search engines: of the billions of pages across the internet which they try to understand and assess for relevancy, a large proportion are duplicates, and the processing power it takes to read these pages wastes time and resources. Because of this, Google created Panda, an update to the search engine algorithm which penalises websites for duplicate content issues. Removing duplicate content and ensuring your copy is unique and of the highest quality will give your website a real boost in the relevancy and ranking of your keywords, which will in turn aid sales or lead generation and help the business to grow.
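
Where some duplication is unavoidable (the filtered and paginated pages mentioned earlier, for example), one standard mitigation, shown here with a placeholder URL, is a canonical tag telling the search engine which version of the page is the preferred one:

    <!-- In the <head> of each duplicate variant, point at the preferred URL -->
    <link rel="canonical" href="http://www.example.com/product-page/">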

As the few audit points above illustrate, understanding how Google works is key to the performance of your website, and therefore to the growth of the business through digital marketing. The sentiment of “keeping a tidy house” is therefore imperative.

For more tips and methodology advice on how to keep your website tidy, follow us on Twitter: @bancmedia, or check our website for regular updates and insights.