AuthorRank, AuthoRank – A probability distribution modeled on PageRank used to determine the quality of an article on the basis of the reputation of its associated author. AuthorRank was introduced into search algorithm analysis by Google in 2011 and increased the perceived value of personas.
Authorship – A method developed and implemented by Google whereby writers use a Google+ account to identify content that they have legitimately written and published. Such content is distinguished by the appearance of an authorized image of the content producer in search results next to appropriate listings. As of December 2012 the program is still in the pilot phase and is not necessarily available to everyone who implements Authorship markup.
Blog – A Web site or portion of a Web site devoted to “Web logging” or “Web journaling”. Blogs are typically used to create content and place links for search results management.
Blog Farm – A group of blogs operated by a single person or group that are populated by software, usually RSS-feed scraping scripts. Used for link building, blog farms are created by special software that installs popular blogging software on multiple domains and hosting accounts. Sometimes confused with link farms.
Blog Network – Any group of blogs (either hosted on large Websites or hosted on their own dedicated sites) that are used together for a common purpose. Some blog networks are enterprise assets (such as the AOL blog network including TechCrunch, Huffington Post, etc.). Some blog networks may be used to manipulate search engine results by Web spammers. A blog network differs from a blog farm in that writers upload their content directly to the network or to a central distribution platform.
Body Links – Hypertext links placed within the main content of a Web page.
Branded Search – According to Barry Adams, branded search is “traffic from organic search on keywords that contain the organisation’s brand name”. Branded Search may occur for individuals and Websites as well as organizations. Branded Search is a form of Conceptual Search.
Cloaking – The practice of serving different content to search engines than real human visitors are likely to see through normal Web browsing. The intent behind the cloaking may be deceptive or language-related or related to accessibility. Some forms of simple cloaking are accepted by the search engines.
Conceptual Search – A form of search focusing on attributes or qualifiers that are relevant to a specific topic or concept. For example, given the concept of “aerodynamics”, a conceptual query may look for “definition of aerodynamics” or “value of aerodynamics”. Search engines typically must infer the searcher’s intent from an incomplete query expression.
Content Marketing – The practice of publishing content with no immediately obvious return on investment for the sake of stimulating new consumer interest in a business or brand. Since late 2012 the term “content marketing” has been misappropriated by bloggers in the SEO industry as a euphemism for generating large volumes of pointless Web articles.
Conversion – A conversion is any desired action that is taken as a result of visiting a Web page. Conversions are used in many Web marketing metrics. Conversions fall into four categories: Informational Conversions, Search Conversions, Transformational Conversions, and Transactional Conversions.
Conversion Rate Optimization (CRO) – The practice of analyzing conversion data to determine the best practices for increasing or otherwise improving conversions on a Website.
Content-rich doorway – A doorway page dressed up with graphics, navigation, and linked to from a site map so that it looks like a normal part of a Web site. The copy is written to rank for a single keyword expression.
Crawl Page – A document consisting of links to other pages, provided for the sole purpose of giving crawlers (robots) links to follow. Spammers used to submit these puppies to the search engines en masse. Maybe they still do.
Doorway – A document with a small amount of text (usually coherent but sometimes gibberish) intended to rank well specifically for one targeted expression. In the old days, people created as many doorways as they had targeted keywords and search engines to work with.
Infographic – An oversized image in which many purported facts have been embedded in a stylized format. The intent of the infographic is to encapsulate useful, interesting information in an easy-to-read manner. Infographics cannot be indexed by search engines but they have given rise to Infographic Spam.
Informational Conversion – An informational conversion occurs when a visitor finds the precise information he is seeking on a Web page.
- The practice of evaluating the source of a query and its relevance to the content that received traffic from the query.
- The practice of estimating how much revenue a specific query drives to a given (group of) website(s).
- The practice of identifying queries that drive traffic to websites.
- The practice of estimating how much revenue should be expected from optimizing for specific keywords.
Landing Page – A content-rich doorway most often used to receive PPC traffic. Copy is written for the visitor, not the search engine, making a sales pitch (usually — I’ve seen a few that meandered pointlessly with fake testimonials). Some organic SEO makes use of landing pages for experimentation such as A/B testing.
Link Acquisition – The practice of acquiring links for a Website document through active and passive link building. Has begun to supersede link building in common SEO parlance.
Link Building – The process of acquiring links for a Web document through creation, request, reciprocation, lease/purchase, or distribution of copy through automated services. This expression has accrued some negative connotation and is gradually being replaced by other expressions such as Link Acquisition.
Link Farm – Any group of Web sites where every member site in the group links to every other member site in the group for the express purpose of mutually improving all member sites’ performance in search results.
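The structural half of the link farm definition (every member links to every other member) can be checked mechanically. The following is only a rough sketch: the function name `fully_interlinked` and the sample data are made up for illustration, and the code cannot judge the "express purpose" half of the definition, which is a matter of intent.

```python
def fully_interlinked(links):
    """Return True if every site in the group links to every other site.

    `links` maps each member site to the collection of sites it links to.
    This only tests the link pattern; it says nothing about intent.
    """
    sites = set(links)
    if len(sites) < 2:
        return False
    return all(sites - {site} <= set(targets) for site, targets in links.items())


# Hypothetical groups of sites, for illustration only.
farm = {
    "site-a.example": {"site-b.example", "site-c.example"},
    "site-b.example": {"site-a.example", "site-c.example"},
    "site-c.example": {"site-a.example", "site-b.example"},
}
not_a_farm = {
    "site-a.example": {"site-b.example"},
    "site-b.example": {"site-c.example"},
    "site-c.example": {"site-a.example"},
}

print(fully_interlinked(farm))
print(fully_interlinked(not_a_farm))
```

A group that passes this test is only a candidate; plenty of small, legitimate site families link to each other in exactly this way.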
Meta data – Secondary data stored in or collected about a Web document for the purpose of assisting in its classification, use, or interpretation.
Navigational links – The (usually uniformly used) internal links a Web site uses to provide visitors with clear pathways between pages.
Negative SEO – The practice of pointing links at a Website with the intention of harming its reputation, the reputation of its owner, or of tricking a search engine into penalizing the Website. Link bombing and Google bowling are two forms of Negative SEO that have been widely documented.
PageRank or Page Rank – Named for Larry Page, primary author of the algorithm created to compute the value
- A recursive link-based probability distribution used to assess the importance of Web documents by counting and valuing the links between documents according to the number and value of the links pointing to the documents. PageRank is based on citation analysis, a controversial but long-standing practice of assessing the quality and value of scientific papers based on the number of other scientific papers referring to them.
- A derivative value computed from the first PageRank and defined across an integer scale ranging from 0 to 10; often identified as “Toolbar PageRank” or denoted as TBPR.
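The "recursive link-based probability distribution" in the first sense above can be illustrated with a small power-iteration sketch. This follows the textbook formulation of PageRank, not whatever Google actually runs; the damping factor of 0.85, the iteration count, and the toy link graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to.
    Returns a dict of page -> rank; the ranks sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with a baseline "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Toy four-page Web, for illustration only.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph page "c" ends up with the highest value because three pages link to it, and page "d" ends up with the lowest because nothing links to it. The 0-to-10 toolbar value in the second sense is derived from a computation of this kind, but the mapping has not been published, so none is sketched here.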
- The act of searching for Websites or information in a search engine by entering 1 or more keywords or search terms.
- The directives and/or search terms used in a query. Classification: There are several types of queries, including Informational Queries, Navigational Queries, and Transactional Queries.
Search Conversion – A search conversion occurs when a user clicks through a link in search results. Some metrics require that the user remain on the destination for a minimum length of time in order for the click-through to count as a search conversion.
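To make the minimum-time rule concrete, here is a small sketch that counts search conversions from a list of click-throughs. The 30-second threshold and the dwell times are invented for illustration; each metric sets its own minimum, if it uses one at all.

```python
def count_search_conversions(dwell_times, min_seconds=0):
    """Count click-throughs whose time on the destination meets the minimum.

    `dwell_times` holds seconds spent on the destination page, one entry
    per click-through. With min_seconds=0 every click-through counts.
    """
    return sum(1 for seconds in dwell_times if seconds >= min_seconds)


# Hypothetical dwell times (in seconds) for four click-throughs.
dwell_times = [2, 45, 10, 120]

print(count_search_conversions(dwell_times))
print(count_search_conversions(dwell_times, min_seconds=30))
```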
- The practice of promoting one or more Websites through the services offered by search engines.
- The use of paid search listings, often referred to as PPC (pay-per-click) advertising.
- The process of promoting products and/or services through Web search.
Search Engine Optimization – The practice of analyzing search engine protocols, actions, resources, and guidelines for the purpose of improving Website compliance and performance in search results.
Search Result – The search listings provided by a search engine in response to a query.
SEO – Search engine optimization.
SM – Search marketing.
Shallow Content – According to Ted Ulle, “Words (any number) that don’t actually communicate much” are shallow content.
Site Map – Also spelled “sitemap”. An on-site directory of important (or all) pages. Sitemaps have been divided into XML Sitemaps and TXT Sitemaps which are used by search engines, and HTML Sitemaps which are used by visitors for quick navigation to deep content. Some specialized sitemaps may only list certain types of content.
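For readers who have not seen the search engine formats, the sketch below builds a minimal XML Sitemap and a TXT Sitemap from the same list of pages. It uses only the required `urlset`, `url`, and `loc` elements from the sitemaps.org protocol; the helper names and the example.com URLs are made up for illustration, and a production sitemap would normally also carry an XML declaration and optional fields.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_xml_sitemap(urls):
    """Return a minimal XML Sitemap listing each URL in a <loc> element."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def build_txt_sitemap(urls):
    """Return a TXT Sitemap: one URL per line and nothing else."""
    return "\n".join(urls) + "\n"


# Hypothetical pages, for illustration only.
pages = [
    "https://example.com/",
    "https://example.com/glossary/",
]

xml_sitemap = build_xml_sitemap(pages)
txt_sitemap = build_txt_sitemap(pages)
print(xml_sitemap)
print(txt_sitemap)
```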
Spam – In general use, any form of high-repetition marketing practice that annoys people, uses resources in unethical fashion, or otherwise violates the guidelines and/or laws of various Web services or governments. Cf. Web spam, Spam bomb, SpamAd Page, Spamversibot, Crawl Spam, et al.
Thin Content – According to Ted Ulle, “(content) reproducing a feed that is commonly used around the web, and adding no extra value” is thin content.
Transactional Conversion – A transactional conversion occurs when a visitor exchanges something of value (such as money) for a product, service, or other form of valued commodity. Purchases, fee payments, bill payments, and paid membership registrations are all examples of transactional conversions.
Advanced SEO Glossary
Crawl – The process by which search engines retrieve content from a Website, including the criteria used to determine rates and priorities of crawl. From the perspective of an SEO, crawl can be influenced or managed through internal and external resources.
Crawl-to-Cache-Time – The amount of time that elapses from when a search engine fetches a page from a Web site until the page’s contents appear in the search engine’s cache report for the page. Abbreviated as CCT.
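CCT is simple subtraction once you have both timestamps: the fetch time (from your server log, say) and the time shown in the search engine's cache report. A minimal sketch, with a made-up function name and made-up dates:

```python
from datetime import datetime


def crawl_to_cache_time(fetched_at, cached_at):
    """Return the elapsed time between a crawler fetch and the cache update."""
    if cached_at < fetched_at:
        raise ValueError("cache timestamp is earlier than the fetch timestamp")
    return cached_at - fetched_at


# Hypothetical timestamps, for illustration only.
fetched = datetime(2013, 1, 5, 8, 30)   # crawler request seen in the server log
cached = datetime(2013, 1, 7, 14, 0)    # date shown in the cache report

cct = crawl_to_cache_time(fetched, cached)
print(cct)
```

Both timestamps need to be in the same time zone before you subtract them, otherwise the result will be off by the difference.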
Hummingbird, Hummingbird Algorithm – The Hummingbird update was the first major update to Google’s search algorithm since the 2010 “Caffeine Update”, although Caffeine was limited primarily to improving the indexing of information rather than the sorting of information. Google search chief Amit Singhal stated that Hummingbird is the first major update of its type since 2001. Conversational search leverages natural language, semantic search, and more to improve the way search queries are parsed. Unlike previous search algorithms, which would focus on each individual word in the search query, Hummingbird considers each word and also how the words together make up the entirety of the query: the whole sentence, conversation, or meaning is taken into account, rather than particular words. The goal is that pages matching the meaning do better than pages matching just a few words.
Index – The database(s) against which queries are resolved. All of the major search engines maintain multiple indexes. Each is a separate, distinct database, either physically (kept in separate files) or virtually (logically segmented portions of a master database). The expression database is probably inappropriate for describing what the search engines maintain. When you see me refer to Main Index, think of that as the “static Web page index”. Other indexes may include Image Indexes, News Indexes, and Blog Indexes. I have some ideas on how these various indexes are built, but I don’t expect to share them on this blog.
Index – The process of adding information about Web content to a search engine’s database about the Web. The indexing process may entail considerable effort depending upon the complexity and applicability of the document.
Localization – The practice whereby a search engine delivers search results that are deemed relevant to the user’s location or a specific locale. Localization may be activated algorithmically but some search engines give users the option of resetting their search location after it has been automatically determined (probably by IP address).
- The quality of a search result that has been modified either by the user or (more often) the search engine to show only results that are relevant or specific to a certain location or locale.
- As in no. 1, except with respect to a topic or Website as opposed to a geographic location. See also Localization.
Panda, Panda algorithm – A document classifier developed by a Google engineer (Navneet? Panda) in 2010 and used beginning in March 2011 to downgrade documents matching low-quality criteria such that they no longer appear prominently in Google’s search results, and may no longer be able to pass value through their links. The algorithm’s mechanism has not been published but disclosures from Google indicate it is a learning algorithm that scans large data sets to identify “signals” (statistical trends in the data) which may be used to approximate human judgment distinguishing between “high quality sites” and “low quality sites” as demonstrated through a large sample set of Websites evaluated by human quality raters. The algorithm is run against Google’s index “offline” and the results released into the general search results every 4-6 weeks.
Penguin, Penguin algorithm – A document classifier developed by Google in 2011 and 2012 to identify Websites that violate Google’s quality guidelines either through keyword-stuffing or use of manipulative links from other Websites. So named to avoid associating the algorithm with any specific Google engineer. The Penguin algorithm has been compared to the Panda algorithm and some people believe that Penguin may have been adapted from Panda primarily by using a different type of learning set.
Penguinated – The state or status of a Website having been deindexed or downgraded by the Penguin algorithm in Google’s search index.
Quality Content – A nonsense expression with no real value or purpose other than to act as a catchall for Web documents people think are better than “that other content”. Search engine marketers typically speak of quality content when describing their own work without providing any basis for comparison or measurement of the quality of the content being referred to. Any discussion about “quality content” is essentially meaningless babble that provides no insight into search marketing practices or the results to be expected from following those practices.
Quality Links – A nonsense expression with no real value or purpose other than to act as a catchall for the types of links people think are better than “those other links”. Googlers use “quality links” as a subtle way of telling people to stop getting cheap spammy links. Many SEO forum moderators and admins use “quality links” in a somewhat broader but similar fashion, if only because they don’t know exactly what criteria make links good for any particular search engine but they recognize that people who are asking about linkage have a problem. Nearly everyone else seems to use the expression to refer to their (usually non-performing) backlinks. I wrote about high quality links at SEOmoz (in a post designed to rank for “high quality links” on the basis of content but the lesson passed over everyone’s head, except for Aaron Pratt who saw what I was doing right away).
SERP – Acronym for Search Engine Results Page. Everyone seems to know this acronym by now. I have always hated it even though I now reluctantly use it. SRP (search results page) would be better, since it’s all inclusive. You can have a DRP (Directory Results Page) which some people might argue should be called a DSRP (Directory Search Results Page). I still get click throughs from Yahoo! and DMOZ directory page listings (or a DLP, Directory Listings Page).
Sitelinks – Google invented this term, which is better than my classic “little clustered links under the main listing”. Sitelinks are those “little clustered links under the main listing” that deep link into the site by category or topic. Many people wonder how these Sitelinks appear. Googlers always say, “That’s algorithmically determined and we have no control over them” — meaning, “We wrote special commands into our software to create those things and we’re not going to tell you what criteria are used to decide which sites get them.” My best guess is that sites that have more than 1,000 pages of content, clear content categorization in their non-breadcrumb internal links, and lots of deep links from other domains are good candidates for Sitelinks. Other criteria are probably taken into consideration. Sitelinks are only shown for the top listing in a popular query result.
- Any link that a search engine such as Bing or Google determines was created for the purpose of influencing search results rather than as part of the “natural Web” experience
- Any link created as a result of a marketing campaign, the purpose of which is to create or inspire the creation of links that would not have been placed by Website owners of their own volition without the intervention of the destination owner or marketing representative.
Update – From the SEO side, an update is any noticeable change to the way a search engine behaves. From the search engines’ side, an update is any intended change in a search engine’s makeup or data. Matt Cutts offers an incomplete explanation of a Google update in his December 2006 Explaining Algorithm Updates and Data Refreshes post. He wrote a similar post in September 2005 with What’s An Update?. I don’t expect Matt to confirm every algorithmic change. That would pretty much defeat the purpose of many of them. Yahoo! and Windows Live occasionally issue “weather reports”. Matt has informally issued some on Google’s behalf.