Whilst I would love to think I knew the full 200+ signals being used in Google’s ranking algorithm, I don’t think I would quite get to 200 on my own. In this post I am asking the SEO community to share their knowledge on how Google ranks websites.
This post is designed to list all of the possible data sources / signals Google could be using as part of their ranking algorithm. My feeling is that if they have access to data, then they will be using it in some way or another, whether this is to help boost websites or to flag websites as spam and demote them. We aren’t going to be discussing the actual weight Google is giving to each of the different signals (that can be for another post), just seeing if we can get all 200 (or more) signals they could be using.
I have broken down the main areas to help categorise the different signals. Anyone who can add to this list will get a link back to their blog (as long as you don’t have a spammy website!) since this is my way of saying thanks for contributing.
Anyone wishing to contribute can add a comment to this blog post and I will update the post. When commenting please state:
- What piece of information Google could be using
- How they would judge / rank this information (i.e. how does this show quality or not?)
- Where they would get this data from
- Any relevant official sources of information to support your theory
A few of the easy ones have already been put in. I want this post to be generated by the SEO community to see how many we can get between us all.
So let’s get started then!
On Site Signals
- Meta title
- H1 & heading tags
- Keywords within the text
- Amount of useful text when subtracting templated information and adverts
- Phone number, which signifies that the website is a genuine business
- Amount of unique images which cannot be found anywhere else
- Amount of unique high quality content
- Is there regular fresh content appearing on the website?
- Internal linking
- Site speed
- Image ALT attribute
- Amount of adverts on the page & how much useful content is left once the adverts are removed
- Video markup
- Microformats and rich snippets such as hreview or hrecipe
- IP C class and location of server
- Content above the fold VS content below the fold
- Top level domain
- How long the domain has been owned by the current owner, not just how long the domain has been registered to ‘someone’
- Keyword over usage or keyword stuffing (negative factor)
- Cloaking (negative factor)
- Hidden text (negative factor)
- Duplicated content from external websites
- 301 redirect flags such as redirect chains (301 –> 301 –> 301 –> Final destination), redirect loops or redirects ending on a 404 page.
- Keyword-rich URLs as opposed to page numbers or product IDs
- Number of crawl errors found when crawling the site
- Does the HTML conform to W3C standards?
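To make the redirect flags in the list above a bit more concrete, here is a minimal sketch of how a crawler might tell a redirect chain, a redirect loop and a redirect ending on a 404 apart. It is purely illustrative: the `classify_redirects` function, the `responses` dictionary and the verdict names are my own assumptions, not anything Google has published, and it works on pre-collected crawl data rather than making live requests.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical crawl data: URL -> (HTTP status, redirect target or None).
Responses = Dict[str, Tuple[int, Optional[str]]]


def classify_redirects(
    start: str, responses: Responses, max_hops: int = 10
) -> Tuple[str, List[str]]:
    """Follow redirects from `start` and return (verdict, path taken)."""
    path = [start]
    seen = {start}
    url = start
    while True:
        # URLs missing from the crawl data are treated as 404s (an assumption).
        status, target = responses.get(url, (404, None))
        if status in (301, 302) and target is not None:
            if target in seen:
                return "loop", path + [target]
            if len(path) > max_hops:
                return "too_many_hops", path
            path.append(target)
            seen.add(target)
            url = target
            continue
        hops = len(path) - 1  # number of redirects followed so far
        if status == 404:
            return ("redirect_to_404" if hops else "not_found"), path
        if hops > 1:
            return "chain", path
        return ("single_redirect" if hops == 1 else "ok"), path


if __name__ == "__main__":
    crawl = {
        "/a": (301, "/b"),
        "/b": (301, "/c"),
        "/c": (301, "/final"),
        "/final": (200, None),
        "/x": (301, "/y"),
        "/y": (301, "/x"),
        "/old": (301, "/gone"),
        "/gone": (404, None),
    }
    for page in ("/a", "/x", "/old", "/final"):
        print(page, classify_redirects(page, crawl))
```

Whether any of these verdicts would count against a site, and by how much, is exactly the kind of thing this list cannot answer.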
Off Site Signals
- Anchor text for inbound links
- Number of linking root domains
- PageRank from inbound links
- Followed / Nofollowed attributes on inbound links
- Number of mentions (i.e. ‘website.com’ without an actual link)
- Total number of external links in addition to linking root domains
- Surrounding text on external links
- IP C class of linking websites
- Existence of a Google Places profile
- Link profile of competitor websites
- Age of the back link
- Link growth rate and how fast certain pages have gained links
- Evidence of paid links
- Selling ‘followed’ links to other websites (negative factor)
- History of comment spamming on forums, blogs or other link spam (negative factor)
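For anyone unsure what “IP C class of linking websites” means in practice, the rough idea is that two IPv4 addresses sharing their first three octets sit in the same /24 block, so lots of links from one block may look less independent than links spread across many. The sketch below only shows how that grouping could be done; the function names and the sample addresses (taken from the reserved documentation ranges) are my own assumptions, and how Google actually treats this, if at all, is unknown.

```python
import ipaddress
from collections import Counter
from typing import Iterable


def c_class_block(ip: str) -> str:
    """Return the /24 network (the old 'C class' block) for an IPv4 address."""
    # strict=False lets us pass a host address and get its containing network.
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network)


def c_class_counts(linking_ips: Iterable[str]) -> Counter:
    """Count how many linking IPs fall into each /24 block."""
    return Counter(c_class_block(ip) for ip in linking_ips)


if __name__ == "__main__":
    # Hypothetical IPs of sites linking to us.
    ips = ["192.0.2.10", "192.0.2.77", "198.51.100.5", "203.0.113.9"]
    counts = c_class_counts(ips)
    print(f"{len(ips)} linking IPs across {len(counts)} C class blocks")
    for block, total in counts.most_common():
        print(block, total)
```

Here four linking IPs collapse into three distinct blocks, because two of them share the same first three octets.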
Brand Signals
- Does the website have a brand logo which appears throughout the website?
- Does the brand logo appear anywhere else on the web?
- Does the website send out branded emails to its customers? This data could be gathered by analysing Gmail accounts.
- How many Gmail accounts are receiving, opening and engaging with emails sent from this brand website?
- Does the website have a LinkedIn company page with real people working at the company?
- Search volume for branded keywords and brand name
- Presence of a physical address on the website
- Age of the domain
- Number of branded external links
Social Signals
- Does the website have a Twitter account?
- Does the website have a Google+ brand page?
- How many +1s does the website/web page have?
- How many Facebook fans does the brand page have?
- How many people are Tweeting about the website?
- How many Twitter followers the website has
- Surrounding text of links in tweets
- Number and rate of growth for page views and replies on social media pages
- Presence of user generated content (UGC) on the website
- Growth rate of social media mentions, such as a slowly growing number of mentions or a sudden outbreak
Other Signals
- Is the website advertising on AdWords?
- Is the website advertising on Google Display Network?
- Is the website listed on any official business websites such as Business Link or Yell.com?
- Does the website have a physical address?
- Is there an email address where customers / website visitors can contact the owner?
- Click through rate (CTR) from the search results
- Bounce rate from the search results
- Review rating of the website / business on third party websites
- Google seller reviews for eCommerce websites
- Quantity of pages in Google’s index
- History of past penalties for this domain (negative factor)
- History of past penalties from this owner of the domain (negative factor)
- History of past penalties for other domains from the same owner (negative factor)
- Amount of time spent on the page / site
- Number of page views per visit
Please add your comments so we can get this list full. I would be surprised if we couldn’t fill this list with all the knowledge out in the SEO industry.
Contributors
Michael Cropper – SEO and Internet Geek (1, 2, 3, 4, 5, 6, 7, 8, 41, 42, 43, 44, 45, 81, 82, 83, 84, 85, 121, 122, 123, 124, 125, 161, 162, 163, 164, 165)
Felix Lueneberger (9, 10, 11, 12, 13, 14, 15, 16, 17, 46, 47, 48, 49, 50, 86, 87, 88, 89, 126, 127, 128, 129, 130, 166, 167, 168, 169, 170)
John – Forest Software (18, 51, 52, 53)
Sean – http://01100111011001010110010101101011.co.uk (19, 54, 55, 20, 21, 22, 171, 172, 173, 23)
Jeremy Quinn (24)
Brahmadas from SEOZooms (25, 26, 174, 175)
Interesting approach ;-).
I’d like to add:
“on site”:
- internal linking (!)
- site speed
- TLD
- image alt tags
- amount of ads on the page
- video markup
- microformats such as hreview or hrecipe
- IP C class, location of server
- content above vs. below the fold
“off site”:
- total number of external links (in addition to LRDs already mentioned)
- surrounding text of external links
- IP C class of linking sites
- Google Places profile exists
- link profile of competitor sites
“brand signals”:
- search volume of brand name
- physical address on page
- age of domain / site
- number of branded external links
“social signals”:
- how many Twitter followers does the site have
- surrounding text of links in tweets
- what is the rate of pageviews / replies on social media
- is there UGC on the site?
- how fast is the growth of social mentions (slowly growing or outbreak?)
“other signals”:
- CTR from SERPs
- bounce rate from SERPs
- reviews rating on third party sites
- Google seller reviews (for e-commerce sites)
- number of pages in Google index
Wow, great additions to the list Felix! We will have the 200+ Google ranking signals in no time at this rate.
I would add:
On page:
- Age of domain (although I see that is in Brand signals), more specifically how long it has been owned by “you”, not just registered
External links:
- How old the link is
- How quickly these links have been obtained
- Any evidence that links have been paid for
Thanks for the extra ranking signals, John.
Is this a link bait attempt?
Of course it is, what else would this be? It’s also good to see what people within the SEO community think could be used as a ranking signal.
But the real question is… are you the real Matt Cutts? (It would be awesome if you are, knowing that you are reading my blog, but it could also be a fake.)
Update: The above comment is not from Matt Cutts; the real Matt Cutts has clarified.
What about crawl rate, crawl errors, validation, overall page visibility, user-side reputation, bounce rate, time on site and on pages, and number of page views? There are many things, I think, so it may not be 200+; it may be more than 300 or 400 for the time being. Google may identify and add many more new factors in the future, so the SEO game will become a much tougher platform… he he
Title is not a Meta
Technically correct. But it is commonly understood that when someone says “Meta Title” they are talking about the <title> tag and not the old skool tag which is never used: <meta name="title" content="something"/>
Think that is jumped ! one or serious forget to attend skool before attending college
Anyway, nice post Mick. Keep it up. I think it will take more time to fill the blank space of signals. Or is the Google God not getting the right signals now? Regards
I’ll give you an A for effort here, but you might want to check out the following project before you spend too much time trying to do the same work on your own. It seems someone has already started a community effort to determine exactly what you are after here.
http://www.theopenalgorithm.com/
Thanks for sharing, Robert. The website provides a nice scientific approach, whereas this list is a nice quick reference point based purely on people’s opinions and experience within the industry.