Official Google Webmaster Central Blog: search results

How we fought Search spam on Google - Webspam Report 2019

Tuesday, June 09, 2020

Every search matters. That is why, whenever you come to Google Search to find relevant and useful information, our ongoing commitment is to make sure you receive the highest quality results possible.

Unfortunately, on the web there are some disruptive behaviors and content that we call "webspam" that can degrade the experience for people coming to find helpful information. We have a number of teams who work to prevent webspam from appearing in your search results, and it’s a constant challenge to stay ahead of the spammers. At the same time, we continue to engage with webmasters to ensure they’re following best practices and can find success on Search, making great content available on the open web.

Looking back at last year, here’s a snapshot of how we fought spam on Search in 2019, and how we supported the webmaster community.

Fighting Spam at Scale

With hundreds of billions of webpages in our index serving billions of queries every day, perhaps it’s not too surprising that there continue to be bad actors who try to manipulate search ranking. In fact, we observed that more than 25 billion pages we discover each day are spammy. That’s a lot of spam, and it goes to show the scale, persistence, and lengths that spammers are willing to go to. We’re very serious about making sure that your chance of encountering spammy pages in Search is as small as possible. Our efforts have helped ensure that more than 99% of visits from our results lead to spam-free experiences.

Updates from last year

In 2018, we reported that we had reduced user-generated spam by 80%, and we’re happy to confirm that this type of abuse did not grow in 2019. Link spam continued to be a popular form of spam, but our team was successful in containing its impact in 2019. More than 90% of link spam was caught by our systems, and techniques such as paid links or link exchange have been made less effective.

Hacked spam, while still a commonly observed challenge, has been more stable compared to previous years. We continued to work on solutions to better detect and notify affected webmasters and platforms and help them recover from hacked websites.

Spam Trends

One of our top priorities in 2019 was improving our spam fighting capabilities through machine learning systems. Our machine learning solutions, combined with our proven and time-tested manual enforcement capability, have been instrumental in identifying and preventing spammy results from being served to users.

In the last few years, we’ve observed an increase in spammy sites with auto-generated and scraped content with behaviors that annoy or harm searchers, such as fake buttons, overwhelming ads, suspicious redirects and malware. These websites are often deceptive and offer no real value to people. In 2019, we were able to reduce the impact on Search users from this type of spam by more than 60% compared to 2018.

As we improve our capability and efficiency in catching spam, we’re continuously investing in reducing broader types of harm, like scams and fraud. These sites trick people into thinking they’re visiting an official or authoritative site and in many cases, people can end up disclosing sensitive personal information, losing money, or infecting their devices with malware. We have been paying close attention to queries that are prone to scam and fraud and we’ve worked to stay ahead of spam tactics in those spaces to protect users.

Working with webmasters and developers for a better web

Much of the work we do to fight against spam is using automated systems to detect spammy behavior, but those systems aren’t perfect and can’t catch everything. As someone who uses Search, you can also help us fight spam and other issues by reporting spam on search, phishing or malware. We received nearly 230,000 reports of search spam in 2019, and we were able to take action on 82% of those reports we processed. We appreciate all the reports you sent to us and your help in keeping search results clean!

So what do we do when we get those reports or identify that something isn’t quite right? An important part of what we do is notifying webmasters when we detect something wrong with their website. In 2019, we generated more than 90 million messages to website owners to let them know about issues, problems that may affect their site’s appearance on Search results and potential improvements that can be implemented. Of all messages, about 4.3 million were related to manual actions, resulting from violations of our Webmaster Guidelines.

And we’re always looking for ways to better help site owners. There were many initiatives in 2019 aimed at improving communications, such as the new Search Console messages, Site Kit for WordPress sites or the Auto-DNS verification in the new Search Console. We hope that these initiatives have equipped webmasters with more convenient ways to get their sites verified and will continue to be helpful. We also hope this provides quicker access to news and that webmasters will be able to fix webspam issues or hack issues more effectively and efficiently.

While we focused deeply on cleaning up spam, we also didn’t forget to keep up with the evolution of the web, and we rethought how we wanted to treat “nofollow” links. Originally introduced as a means to help fight comment spam and annotate sponsored links, the “nofollow” attribute has come a long way. But we’re not stopping there. We believe it’s time for it to evolve even more, just as our spam fighting capability has evolved. We introduced two new link attributes, rel="sponsored" and rel="ugc", that provide webmasters with additional ways to identify to Google Search the nature of particular links. Along with rel="nofollow", we began treating these as hints for us to incorporate for ranking purposes. We are very excited to see that these new rel attributes were well received and adopted by webmasters around the world!
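
As a rough illustration of the link attributes described above, the markup below shows how each one might be applied. The URLs and link text are placeholders, and the combined example at the end is a sketch that relies on rel accepting a space-separated list of values.

```html
<!-- Paid or sponsored placement (placeholder URL) -->
<a href="https://example.com/partner-offer" rel="sponsored">Partner offer</a>

<!-- Link inside user-generated content, such as a comment or forum post -->
<a href="https://example.com/commenter-site" rel="ugc">Commenter's site</a>

<!-- Link the site owner does not want to vouch for -->
<a href="https://example.com/unvetted" rel="nofollow">Unvetted page</a>

<!-- rel takes a space-separated list, so values can be combined -->
<a href="https://example.com/forum-ad" rel="ugc nofollow">Forum link</a>
```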

Engaging with the community

As always, we’re grateful for all the opportunities we had last year to connect with webmasters around the world, helping them improve their presence in Search and hearing feedback. We delivered more than 150 online office hours, online events and offline events in many cities across the globe to a wide range of audiences, including SEOs, developers, online marketers and business owners. Among those events, we have been delighted by the momentum behind our Webmaster Conferences in 35 locations across 15 countries and 12 languages around the world, including the first Product Summit version in Mountain View. While we’re not currently able to host in-person events, we look forward to more of these events and virtual touchpoints in the future.

Webmasters continued to find solutions and tips on our Webmasters Help Community with more than 30,000 threads in 2019 in more than a dozen languages. On YouTube, we launched #AskGoogleWebmasters as well as series such as SEO mythbusting to ensure that your questions get answered and your uncertainties get clarified.

We know that our journey to a better web with you is ongoing, and we would love to continue it with you in the year to come! So do keep in touch on Twitter, YouTube, the blog, or the Help Community, or see us in person at one of our conferences near you!

Posted by Cherry Prommawin, Search Relations, and Duy Nguyen, Search Quality Analyst

Video Series for New Webmasters: Search for Beginners!

Tuesday, October 29, 2019

We are excited to introduce our newest video series: “Search For Beginners”! The series was created primarily to help new webmasters. It is also for anyone with an interest in Search or anyone who is still learning about the Web and how to manage their online presence.

We love to see the webmaster community grow! Every day, there are countless new webmasters who are taking the first steps in learning how Search works, and how to make their websites perform well and be discoverable on Search. We understand that it can sometimes be challenging or even overwhelming to start with our existing content without some prior knowledge or basic understanding of the Web. We find that the basic videos on our YouTube channels are the ones with the most views. At the same time, advanced webmasters also see the need for content that can be sent to clients or stakeholders to help explain important concepts in managing an online presence.

We want to help all webmasters succeed, regardless of whether you have been managing websites for many years or you’ve just started out yesterday. We want to do more to help the new webmasters and this video series will hopefully help us achieve that.

Introduction to the series:

Episode 1:

The “Search For Beginners” video series covers basic online presence topics ranging from ‘Do you need a website?’, ‘What are the goals for your website?’ to more organic search-related topics such as ‘How does Google Search work?’, ‘How to change description line’, or ‘How to change wrong address information on Google’. Actually, we get asked these questions frequently in forums, social channels and at events around the world! The videos are fully animated. The videos are in English with subtitles available in Spanish, Portuguese, Korean, Chinese, Indonesian, Italian, Japanese, and English. We are working on more, so please stay tuned!

And if you consider yourself a more experienced user, please feel free to use these videos to support your pitches or to explain things to your clients. If you want to share any ideas or learnings, please leave them in the comment section of each video so that others can benefit from your knowledge and experience.

Follow us on Twitter and subscribe on YouTube for the upcoming videos! We will be adding new videos in this series to this playlist about every two weeks!

Posted by Cherry Prommawin, Search Quality Analyst

Google Search News: coming soon to a screen near you

Friday, September 27, 2019

The world of search is constantly evolving. New tools, opportunities, and features are regularly arriving, sometimes existing things change, and sometimes we say goodbye to some things to make way for the new. To help you stay on top of things, we've started a new YouTube series called Google Search News.

With Google Search News, we want to give you a regular & short summary of what's been happening around Google Search, specifically for SEOs, publishers, developers, and webmasters. The first episode is out now, so check it out.

(The first episode, now on your screen)

In this first episode, we cover:

Recent updates to Search Console
Changes in the webmaster office-hours setup
Advances for rel=nofollow & related attributes
Changes in review rich results
New meta-tags & attributes for your pages' search previews
Some recent Webmaster Conferences

We plan to make these updates regularly, and adjust the format over time as needed. Let us know what you think in the video comments!

Posted by John Mueller, Webmaster Trends Analyst, Google Switzerland

More options to help websites preview their content on Google Search

Tuesday, September 24, 2019

Google uses content previews, including text snippets and other media, to help people decide whether a result is relevant to their query. The type of preview shown depends on many factors, including the type of content a person is looking for and the kind of device they're viewing it on.

For instance, if you look for recipe results on Google, you may see thumbnail images and user ratings, things that may be more helpful than text snippets when it comes to deciding what you want to eat. Alternatively, perhaps you're looking for a concert nearby and are able to check out the events directly in the search results. These are made possible by publishers marking up their pages with structured data.

Google automatically generates previews in a way intended to help a user understand why the results shown are relevant to their search and why the user would want to visit the linked pages. However, we recognize that site owners may wish to independently adjust the extent of their preview content in search results. To make it easier for individual websites to define how much or which text should be available for snippeting and the extent to which other media should be included in their previews, we're now introducing several new settings for webmasters.

Letting Google know about your snippet and content preview preferences

Previously, it was only possible to allow a textual snippet or to not allow one. We're now introducing a set of methods that allow more fine-grained configuration of the preview content shown for your pages. This is done through two types of new settings: a set of robots meta tags and an HTML attribute.

Using robots meta tags

The robots meta tag is added to an HTML page's <head>, or specified via the x-robots-tag HTTP header. The robots meta tags addressing the preview content for a page are:

"nosnippet"
This is an existing option to specify that you don't want any textual snippet shown for this page.
"max-snippet:[number]"
New! Specify a maximum text-length, in characters, of a snippet for your page.
"max-video-preview:[number]"
New! Specify a maximum duration in seconds of an animated video preview.
"max-image-preview:[setting]"
New! Specify a maximum size of image preview to be shown for images on this page, using either "none", "standard", or "large".

They can be combined, for example:

<meta name="robots" content="max-snippet:50, max-image-preview:large">

Preview settings from these meta tags will become effective in mid-to-late October 2019 and may take about a week for the global rollout to complete.

Using the new data-nosnippet HTML attribute

A new way to help limit which part of a page is eligible to be shown as a snippet is the "data-nosnippet" HTML attribute on span, div, and section elements. With this, you can prevent that part of an HTML page from being shown within the textual snippet for the page.

For example:

<p><span data-nosnippet>Harry Houdini</span> is undoubtedly the most famous magician ever to live.</p>

The data-nosnippet HTML attribute will start affecting presentation on Google products later this year. Learn more in our developer documentation for the robots meta tag, x-robots-tag, and data-nosnippet.
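
The example above uses a span. As a minimal sketch of the same attribute on the other two supported elements, div and section, the markup might look like the following. The content is invented purely for illustration.

```html
<!-- Everything inside this div is excluded from the textual snippet -->
<div data-nosnippet>
  <p>Spoiler: the trick relies on a hidden key.</p>
</div>

<!-- The same attribute applied to a whole section -->
<section data-nosnippet>
  <h2>Subscriber notes</h2>
  <p>Text the publisher prefers not to surface in search previews.</p>
</section>

<!-- Content outside the marked elements remains eligible for snippets -->
<p>Harry Houdini performed around the world for decades.</p>
```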

A note about rich results and featured snippets

Content in structured data is eligible for display as rich results in search. These kinds of results do not conform to limits declared in the above meta robots settings, but rather, can be addressed with much greater specificity by limiting or modifying the content provided in the structured data itself. For example, if a recipe is included in structured data, the contents of that structured data may be presented in a recipe carousel in the search results. Similarly, if an event is marked up with structured data, it may be presented as such in the search results. To limit that presentation, a publisher can limit the amount and type of content in the structured data.
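
To illustrate limiting presentation through the structured data itself, here is a hedged sketch that assumes JSON-LD markup with the schema.org Recipe type. The recipe name, image URL, and description are placeholders, and the properties a particular rich result needs are not specified here, so treat the selection of fields as an assumption.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Recipe",
  "name": "Simple Banana Bread",
  "image": "https://example.com/photos/banana-bread.jpg",
  "description": "A short summary the publisher is comfortable showing in search results."
}
</script>
```

The idea is that a publisher includes only the content they are happy to see presented in a rich result, since the meta robots limits above do not apply to it.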

Some special features on Search depend on the availability of preview content, so limiting your previews may prevent your content from appearing in these areas. Featured snippets, for example, require a certain minimum number of characters to be displayed. This can vary by language, which is why there is no exact max-snippet length we can provide to ensure appearing in this feature. Those who do not wish to have content appear as featured snippets can experiment with lower max-snippet lengths. Those who want a guaranteed way to opt out of featured snippets should use nosnippet.
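
The two approaches to featured snippets described above could be sketched as follows. These are alternatives, so a page would use one or the other, and the value 40 is an arbitrary number chosen for illustration, not a recommended threshold.

```html
<!-- Option 1: experiment with a shorter snippet limit (40 is an arbitrary example value) -->
<meta name="robots" content="max-snippet:40">

<!-- Option 2: disallow textual snippets entirely, the guaranteed way to opt out of featured snippets -->
<meta name="robots" content="nosnippet">
```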

The AMP Format

The AMP format comes with certain benefits, including eligibility for more prominent presentation of thumbnail images in search results and in the Google Discover feed. These characteristics have been shown to drive more traffic to publishers’ articles. However, publishers who do not want Google to use larger thumbnail images when their AMP pages are presented in search and Discover can use the above meta robots settings to specify max-image-preview of “standard” or “none.”
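
For an AMP page, a minimal sketch of this setting would be a robots meta tag in the page's head, using the same syntax as the earlier examples. Swapping "standard" for "none" would request no image preview at all.

```html
<!-- Placed in the <head> of the AMP page: ask for standard-size thumbnails instead of large ones -->
<meta name="robots" content="max-image-preview:standard">
```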

These new options are available to content owners worldwide and will operate the same for results we display globally. We hope they make it easier for you to optimize the value you get from Search and achieve your business goals. For more information, check out our developer documentation on meta tags. Should you have any questions, feel free to reach out to us, or drop by our webmaster help forums.

Posted by John Mueller, Webmaster Trends Analyst, Google Switzerland

When indexing goes wrong: how Google Search recovered from indexing issues & lessons learned since.

Monday, August 12, 2019

Most of the time, our search engine runs properly. Our teams work hard to prevent technical issues that could affect our users who are searching the web, or webmasters whose sites we index and serve to users. Similarly, the underlying systems that we use to power the search engine also run as intended most of the time. When small disruptions happen, they are largely not visible to anyone except our teams who ensure that our products are up and running. However, like all complex systems, sometimes larger outages can occur, which may lead to disruptions for both users and website creators.

In the last few months, such a situation occurred with our indexing systems, which had a ripple effect on some other parts of our infrastructure. While we worked as quickly as possible to remedy the situation, we apologize for the disruption, as our goal is to continuously provide high-quality products to our users and to the web ecosystem.

Since then, we took a closer, careful look into the situation. In the process, we learned a few lessons that we'd like to share with you today. In this blog post, we will go into more details about what happened, clarify how we plan to communicate better if such things happen in the future, and remind website owners of the channels they can use to communicate with us.

So, what happened a few months ago?

In April, we had several issues related to our index. The Search index is the database that holds the hundreds of billions of web pages that we crawled on the web and that we think could answer some of our users’ queries. When a user enters a query in the Google search engine, our ranking algorithms sort through those pages in our Search index to find the most relevant, useful results in a fraction of a second. Here is more information on what happened.

1. The indexing issue

To start it off, we temporarily lost part of the Search index.
Wait... What? What do you mean “lost part of the index?” Is that even possible?

Basically, when serving search results to users, to accelerate the speed of the service, the query of the user only “travels” as far as the closest of our data centers supporting the Google Search product, from which the Search Engine Results Page (SERP) is generated. So when there are modifications to the composition of the index (some pages added and removed, documents are merged, or other types of data modification), those modifications need to be reflected in all of those data centers. The consequence is that users all over the world are consistently served pages from the most recent version of the index.

Google owns and operates data centers (like the one pictured above) around the world, to keep our products running 24 hours a day, 7 days a week - source

Keeping the index unified across all those data centers is a non-trivial task. For large user-facing services, we may deploy updates by starting in one data center and expand until all relevant data centers are updated. For sensitive pieces of infrastructure, we may extend a rollout over several days, interleaving them across instances in different geographic regions. source

So, as we pushed some planned changes to the Search index, on April 5th parts of the deployment system broke, on a Friday no less! More specifically: as we were updating the index over some of our data centers, a small number of documents ended up being dropped from the index accidentally. Hence: “we lost part of the index.”

Luckily, our on-call engineers caught the issue pretty quickly, at the same time as we started picking up chatter on social media (thanks to everyone who notified us over that weekend!). As a result, we were able to start reverting the Search index to its previous stable state in all data centers only a few hours after the issue was uncovered (we keep back-ups of our indexes just in case such events happen).

We communicated on Sunday, April 7th that we were aware of the issue, and that things were starting to get back to normal. As data centers were progressively reverting back to a stable index, we continued updating on Twitter (on April 8th, on April 9th), until we were confident that all data centers were fully back to a complete version of the index on April 11th.

2. The Search Console issue

Search Console is the set of tools and reports any webmaster can use to access data about their website’s performance in Search. For example, it shows how many impressions and clicks a website gets in the organic search results every day, or information on what pages of a website are included and excluded from the Search index.

As a consequence of the Search index having the issues we described above, Search Console started to also show inconsistencies. This is because some of the data that surfaces in Search Console originates from the Search index itself:

the Index Coverage report depends on the Search index being consistent across data centers.
when we store a page in the Search index, we can annotate the entry with key signals about the page, like the fact that the page contains rich results markup for example. Therefore, an issue with the Search index can have an impact on the Rich Results reports in Search Console.

Basically, many individual Search Console reports read data from a dedicated database. That database is partially built by using information that comes from the Search index. As we had to revert to a previous version of the Search index, we also had to pause the updating of the Search Console database. This resulted in plateauing data for some reports (and flakiness in others, like the URL inspection tool).

Index coverage report for indexed pages, which shows an example of the data freshness issues in Search Console in April 2019, with a longer time between 2 updates than what is usually observed.

Because the whole Search index issue took several days to roll back (see explanation above), we were delayed in focusing on fixing the Search Console database until a few days later, only after the indexing issues were fixed. We communicated on April 15th (tweet) that Search Console was having trouble and that we were working on fixing it, and we completed our fixes on April 28th (the day on which the reports started gathering fresh data again, see graph above). We communicated on Twitter on April 30th that the issue was resolved (tweet).

3. Other issues unrelated to the main indexing bug

Google Search relies on a number of systems that work together. While some of those systems can be tightly linked to one another, in some cases different parts of the system experience unrelated problems around the same time.

In the present case for example, around the same time as the main indexing bug explained above, we also had brief problems gathering fresh Google News content. Additionally, while rendering pages, certain URLs started to redirect Googlebot to other unrelated pages. These issues were entirely unrelated to the indexing bug, and were quickly resolved (tweet 1 & tweet 2).

Our communication and how we intend on doing better

In addition to communicating on social media (as highlighted above) during those few weeks, we also gave webmasters more details in 2 other channels: Search Console, as well as the Search Console Help Center.

In the Search Console Help Center

We updated our “Data anomalies in Search Console” help page after the issue was fully identified. This page is used to communicate information about data disruptions to our Search Console service when the impact affects a large number of website owners.

In Search Console

Because we know that not all our users read social media or the external Help Center page, we also added annotations on Search Console reports, to notify users that the data might not be accurate (see image below). We added this information after the resolution of the bugs. Clicking on “see here for more details” sends users to the “Data Anomalies” page in the Help Center.

Index coverage report for indexed pages, which shows an example of the data annotations that we can include to notify users of specific issues.

Communications going forward

When things break at Google, we have a strong “postmortem” culture: creating a document to debrief on the breakage, and trying to avoid it happening next time. The whole process is described in more detail at the Google Site Reliability Engineering website.

In the wake of the April indexing issues, we included in the postmortem how to better communicate with webmasters in case of large system failures. Our key decisions were:

Explore ways to more quickly share information within Search Console itself about widespread bugs, and have that information serve as the main point of reference for webmasters to check, in case they are suspecting outages.
More promptly post to the Search Console data anomalies page, when relevant (if the disturbance is going to be seen over the long term in Search Console data).
Continue tweeting as quickly as we can about such issues to quickly reassure webmasters we’re aware and that the issue is on our end.

Those commitments should make potential future similar situations more transparent for webmasters as a whole.

Putting our resolutions into action: the “new URLs not indexed” case study

On May 22nd, we tested our new communications strategy, as we experienced another issue. Here’s what happened: while processing certain URLs, our duplicate management system ran out of memory after a planned infrastructure upgrade, which caused all incoming URLs to stop processing.

Here is a timeline of how we thought about communications, following the 3 points highlighted just above:

We noticed the issue (around 5.30am California time, May 22nd)
We tweeted about the ongoing issue (around 6.40am California time, May 22nd)
We tweeted about the resolution (around 10pm California time, May 22nd)
We evaluated updating the “Data Anomalies” page in the Help Center, but decided against it since we did not expect any long-term impact on the majority of webmasters' Search Console data.
The confusion that this issue created for many confirmed our earlier conclusions that we need a way to signal more clearly in the Search Console itself that there might be a disruption to one of our systems which could impact webmasters. Such a solution might take longer to implement. We will communicate on this topic in the future, as we have more news.

Last week, we also had another indexing issue. As with May 22, we tweeted to let people know there was an issue, that we were working to fix it and when the issue was resolved.

How to debug and communicate with us

We hope that this post will bring more clarity to how our systems are complex and can sometimes break, and will also help you understand how we communicate about these matters. But while this post focuses on a widespread breakage of our systems, it’s important to keep in mind that most website indexing issues are caused by an individual website’s configuration, which can create difficulties for Google Search to index that website properly. For those cases, all webmasters can debug issues using Search Console and our Help Center. After doing so, if you still think that an issue is not coming from your site or don’t know how to resolve it, come talk to us and our community. We always want to take feedback from our users. Here is how to signal an issue to us:

Check our Webmaster Community; sometimes other webmasters have highlighted an issue that also impacts your site.
In person! We love contact, come and talk to us at events. Calendar.
Within our products! The Search Console feedback tool is very useful to our teams.
Twitter and YouTube!

Posted by Vincent Courson, Google Search Outreach

Helping publishers and users get more out of visual searches on Google Images with AMP

Thursday, July 25, 2019

Google Images has made a series of changes to help people explore, learn and do more through visual search. An important element of visual search is the ability for users to scan many ideas before coming to a decision, whether it’s purchasing a product, learning more about a stylish room, or finding instructions for a DIY project. Often this involves loading many web pages, which can slow down a search considerably and prevent users from completing a task.

As previewed at Google I/O, we’re launching a new AMP-powered feature in Google Images on the mobile web, Swipe to Visit, which makes it faster and easier for users to browse and visit web pages. After a Google Images user selects an image to view on a mobile device, they will get a preview of the website header, which can be easily swiped up to load the web page instantly.

Swipe to Visit uses AMP's prerender capability to show a preview of the page displayed at the bottom of the screen. When a user swipes up on the preview, the web page is displayed instantly and the publisher receives a pageview. The speed and ease of this experience make it more likely for users to visit a publisher's site, while still allowing users to continue their browsing session.

Publishers who support AMP don’t need to take any additional action for their sites to appear in Swipe to Visit on Google Images. Publishers who don’t support AMP can learn more about getting started with AMP here. In the coming weeks, publishers can also view their traffic data from AMP in Google Images in Search Console’s performance report for Google Images, in a new search appearance area named “AMP on Image result”.

We look forward to continuing to support the Google Images ecosystem with features that help users and publishers alike.

Posted by Assaf Broitman, Google Images PM

Instant-loading AMP pages from your own domain

Tuesday, April 16, 2019

Today we are rolling out support in Google Search’s AMP web results (also known as “blue links”) to link to signed exchanges, an emerging new feature of the web enabled by the IETF web packaging specification. Signed exchanges enable displaying the publisher’s domain when content is instantly loaded via Google Search. This is available in browsers that support the necessary web platform feature—as of the time of writing, Google Chrome—and availability will expand to include other browsers as they gain support (e.g. the upcoming version of Microsoft Edge).

Background on AMP’s instant loading

One of AMP's biggest user benefits has been the unique ability to instantly load AMP web pages that users click on in Google Search. Near-instant loading works by requesting content ahead of time, balancing the likelihood of a user clicking on a result with device and network constraints–and doing it in a privacy-sensitive way.

We believe that privacy-preserving instant loading of web content is a transformative user experience, but in order to accomplish this, we had to make trade-offs; namely, the URLs displayed in browser address bars begin with google.com/amp, as a consequence of being shown in the Google AMP Viewer, rather than displaying the domain of the publisher. We heard both user and publisher feedback over this, and last year we identified a web platform innovation that provides a solution that shows the content’s original URL while still retaining AMP's instant loading.

Introducing signed exchanges

A signed exchange is a file format, defined in the web packaging specification, that allows the browser to trust a document as if it belongs to your origin. This allows you to use first-party cookies and storage to customize content and simplify analytics integration. Your page appears under your URL instead of the google.com/amp URL.

Google Search links to signed exchanges when the publisher, browser, and the Search experience context all support it. As a publisher, you will need to publish the signed exchange version of the content in addition to the non-signed exchange version. Learn more about how Google Search supports signed exchange.
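As a rough illustration of publishing both versions, the sketch below picks which variant to serve for a request. It assumes, based on our reading of the amp.dev guide, that a signed exchange should only be returned when the request carries an AMP-Cache-Transform header and its Accept header lists application/signed-exchange. The file names page.sxg and page.html are hypothetical.

```python
SXG_MIME = "application/signed-exchange"


def wants_signed_exchange(headers):
    """Return True when the request appears able to accept a signed exchange.

    Assumption: the caller sends both an AMP-Cache-Transform header and an
    Accept header that lists the signed exchange MIME type.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    accept = normalized.get("accept", "")
    accepts_sxg = any(
        part.strip().startswith(SXG_MIME) for part in accept.split(",")
    )
    return "amp-cache-transform" in normalized and accepts_sxg


def choose_variant(headers):
    """Pick the (hypothetical) file to serve for this request."""
    return "page.sxg" if wants_signed_exchange(headers) else "page.html"
```

Every other request, including ordinary browser navigations, falls through to the regular non-signed version.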

Getting started with signed exchanges

Many publishers have already begun to publish signed exchanges since the developer preview opened up last fall. To implement signed exchanges in your own serving infrastructure, follow the guide “Serve AMP using Signed Exchanges” available at amp.dev.

If you use a CDN provider, ask them if they can provide AMP signed exchanges. Cloudflare has recently announced that it is offering signed exchanges to all of its customers free of charge.

Check out our resources like the webmaster community or get in touch with members of the AMP Project with any questions. You can also provide feedback on the signed exchange specification.

Posted by Devin Mullins and Greg Rogers

Search Console reporting for your site's Discover performance data

Wednesday, April 10, 2019

Discover is a popular way for users to stay up-to-date on all their favorite topics, even when they’re not searching. To provide publishers and sites visibility into their Discover traffic, we're adding a new report in Google Search Console to share relevant statistics and help answer questions such as:

How often is my site shown in users' Discover? How large is my traffic?
Which pieces of content perform well in Discover?
How does my content perform differently in Discover compared to traditional search results?

A quick reminder: What is Discover?

Discover is a feature within Google Search that helps users stay up-to-date on all their favorite topics, without needing a query. Users get to their Discover experience in the Google app, on the Google.com mobile homepage, and by swiping right from the homescreen on Pixel phones. It has grown significantly since launching in 2017 and now helps more than 800M monthly active users get inspired and explore new information by surfacing articles, videos, and other content on topics they care most about. Users have the ability to follow topics directly or let Google know if they’d like to see more or less of a specific topic. In addition, Discover isn’t limited to what’s new. It surfaces the best of the web regardless of publication date, from recipes and human interest stories, to fashion videos and more. Here is our guide on how you can optimize your site for Discover.

Discover in Search Console

The new Discover report is shown to websites that have accumulated meaningful visibility in Discover, with the data shown back to March 2019. We hope this report is helpful in thinking about how you might optimize your content strategy to help users discover engaging information, both new and evergreen.

For questions or comments on the report, feel free to drop by our webmaster help forums, or contact us through our other channels.

Posted by Michael Huzman, Ariel Kroszynski

How to discover & suggest Google-selected canonical URLs for your pages

Tuesday, March 26, 2019

Sometimes a web page can be reached by using more than one URL. In such cases, Google tries to determine the best URL to display in search and to use in other ways. We call this the “canonical URL.” There are ways site owners can help us better determine what should be the canonical URLs for their content.

If you suspect we’ve not selected the best canonical URL for your content, you can check by entering your page’s address into the URL Inspection tool within Search Console. It will show you the Google-selected canonical. If you believe there’s a better canonical that should be used, follow the steps on our duplicate URLs help page on how to suggest a preferred choice for consideration.

Please be aware that if you search using the site: or inurl: commands, you will be shown the domain you specified in those commands, even if it isn’t the Google-selected canonical. This happens because we’re fulfilling the exact request entered. Behind the scenes, we still use the Google-selected canonical, including when people see pages without using the site: or inurl: commands.

We’ve also changed the URL Inspection tool so that it will display any Google-selected canonical for a URL, not just those for properties you manage in Search Console. With this change, we’re also retiring the info: command. This was an alternative way of discovering canonicals. It was relatively underused, and the URL Inspection tool provides a more comprehensive solution to help publishers with URLs.

Posted by John Mueller, Google Switzerland

Ways to succeed in Google News

Thursday, January 17, 2019

With the New Year now underway, we'd like to offer some best practices and advice we hope will lead publishers to more success within Google News in 2019.

General advice

There is a lot of helpful information to consider within the Google News Publisher Help Center. Be sure to have read the material in this area, in particular the content and technical guidelines.

Headlines and dates

Present clear headlines: Google News looks at a variety of signals to determine the headline of an article, including your HTML title tag and the most prominent text on the page. Review our headline tips.
Provide accurate times and dates: Google News tries to determine the time and date to display for an article in a variety of ways. You can help ensure we get it right by using the following methods:

Show one clear date and time: As per our date guidelines, show a clear, visible date and time between the headline and the article text. Prevent other dates from appearing on the page whenever possible, such as for related stories.
Use structured data: Use the datePublished and dateModified schema and use the correct time zone designator for AMP or non-AMP pages.

Avoid artificially freshening stories: If an article has been substantially changed, it can make sense to give it a fresh date and time. However, don't artificially freshen a story without adding significant information or some other compelling reason for the freshening. Also, do not create a very slightly updated story from one previously published, then delete the old story and redirect to the new one. That's against our article URLs guidelines.
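The structured data tip above can be sketched as follows. This is a minimal example, assuming a NewsArticle type and a hypothetical helper name; the point is that both dates are serialized in ISO 8601 form with an explicit time zone offset.

```python
from datetime import datetime, timedelta, timezone


def article_dates(published, modified):
    """Build date markup; both datetimes must carry a time zone."""
    for value in (published, modified):
        if value.tzinfo is None:
            raise ValueError("datetime must be timezone-aware")
    return {
        "@context": "https://schema.org",
        "@type": "NewsArticle",  # assumed type for this illustration
        # isoformat() on an aware datetime appends the UTC offset.
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }


pacific = timezone(timedelta(hours=-8))
markup = article_dates(
    datetime(2019, 1, 17, 9, 0, tzinfo=pacific),
    datetime(2019, 1, 17, 11, 30, tzinfo=pacific),
)
```

Rejecting naive datetimes up front avoids publishing a date that a reader of the markup would have to guess the time zone for.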

Duplicate content

Google News seeks to reward independent, original journalistic content by giving credit to the originating publisher, as both users and publishers would prefer. This means we try not to allow duplicate content—which includes scraped, rewritten, or republished material—to perform better than the original content. In line with this, these are guidelines publishers should follow:

Block scraped content: Scraping commonly refers to taking material from another site, often on an automated basis. Sites that scrape content must block scraped content from Google News.
Block rewritten content: Rewriting refers to taking material from another site, then rewriting that material so that it is not identical. Sites that rewrite content in a way that provides no substantial or clear added value must block that rewritten content from Google News. This includes, but is not limited to, rewrites that make only very slight changes or those that make many word replacements but still keep the original article's overall meaning.
Block or consider canonical for republished content: Republishing refers to when a publisher has permission from another publisher or author to republish an original work, such as material from wire services or in partnership with other publications.
Publishers that allow others to republish content can help ensure that their original versions perform better in Google News by asking those republishing to block or make use of canonical.
Google News also encourages those that republish material to consider proactively blocking such content or making use of the canonical, so that we can better identify the original content and credit it appropriately.
Avoid duplicate content: If you operate a network of news sites that share content, the advice above about republishing is applicable to your network. Select what you consider to be the original article and consider blocking duplicates or making use of the canonical to point to the original.
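For the canonical and blocking options above, a republishing site's templates might emit tags like the ones in this sketch. The helper names are hypothetical, and the Googlebot-News noindex meta tag is shown as one commonly documented way to block a page from Google News; check the current guidelines before relying on it.

```python
from html import escape


def canonical_tag(original_url):
    """Point a republished copy at the original article."""
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'


def block_from_google_news_tag():
    """Ask Google News not to index this copy (assumed mechanism)."""
    return '<meta name="Googlebot-News" content="noindex">'


print(canonical_tag("https://example.com/original-story"))
print(block_from_google_news_tag())
```

A page would normally use one approach or the other: the canonical credits the original, while the block keeps the copy out of Google News entirely.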

Transparency

Be transparent: Visitors to your site want to trust and understand who publishes it and information about those who have written articles. That's why our content guidelines stress that content should have posts with clear bylines, information about authors, and contact information for the publication.
Don't be deceptive: Our content policies do not allow sites or accounts that impersonate any person or organization, or that misrepresent or conceal their ownership or primary purpose. We do not allow sites or accounts that engage in coordinated activity to mislead users. This includes, but isn't limited to, sites or accounts that misrepresent or conceal their country of origin or that direct content at users in another country under false premises.

More tips

Avoid taking part in link schemes: Don't participate in link schemes, which can include large-scale article marketing programs or selling links that pass PageRank. Review our page on link schemes for more information.
Use structured data for rich presentation: Both those using AMP and non-AMP pages can make use of structured data to optimize your content for rich results or carousel-like presentations.
Protect your users and their data: Consider securing every page of your website with HTTPS to protect the integrity and confidentiality of the data users exchange on your site. You can find more useful tips in our best practices on how to implement HTTPS.

Here's to a great 2019!

We hope these tips help publishers succeed in Google News over the coming year. For those who have more questions about Google News, we are unable to do one-to-one support. However, we do monitor our Google News Publisher Forum—which has been newly-revamped—and try to provide guidance on questions that might help a number of publishers all at once. The forum is also a great resource where publishers share tips and advice with each other.
Posted by Danny Sullivan, Public Liaison for Search

Introducing the Indexing API and structured data for livestreams

Wednesday, December 05, 2018

Over the past few years, it's become easier than ever to stream live videos online, from celebrity updates to special events. But it's not always easy for people to determine which videos are live and know when to tune in.
Today, we're introducing new tools to help more people discover your livestreams in Search and Assistant. With livestream structured data and the Indexing API, you can let Google know when your video is live, so it will be eligible to appear with a red "live" badge:

Add livestream structured data to your page

If your website streams live videos, use the livestream developer documentation to flag your video as a live broadcast and mark the start and end times. In addition, VideoObject structured data is required to tell Google that there's a video on your page.
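As an illustration, livestream markup could be generated along these lines. This is a sketch that assumes the VideoObject carries a BroadcastEvent in its publication property, as we understand the livestream documentation; the function name and sample values are hypothetical.

```python
import json


def livestream_markup(name, description, thumbnail_url, upload_date, start, end):
    """Build VideoObject markup flagged as a live broadcast."""
    return {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "description": description,
        "thumbnailUrl": thumbnail_url,
        "uploadDate": upload_date,
        "publication": {
            "@type": "BroadcastEvent",
            "isLiveBroadcast": True,
            "startDate": start,  # ISO 8601 with time zone
            "endDate": end,
        },
    }


markup = livestream_markup(
    "Launch event",
    "Live coverage of the launch.",
    "https://example.com/thumb.jpg",
    "2018-12-05T08:00:00+00:00",
    "2018-12-05T09:00:00+00:00",
    "2018-12-05T10:00:00+00:00",
)
print(json.dumps(markup, indent=2))
```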

Update Google quickly with the Indexing API

The Indexing API now supports pages with livestream structured data. We encourage you to call the Indexing API to request that your site is crawled in time for the livestream. We recommend calling the Indexing API when your livestream begins and ends, and if the structured data changes.
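The three moments mentioned above (stream start, stream end, and markup changes) can be captured in a small decision helper. This is only a sketch: the state dictionaries and their keys are hypothetical, and the actual API call is left out.

```python
def should_notify(previous, current):
    """Decide whether a page state change warrants an Indexing API call.

    Both arguments are hypothetical snapshots of the page:
    {"is_live": bool, "markup": dict}.
    """
    started = not previous["is_live"] and current["is_live"]
    ended = previous["is_live"] and not current["is_live"]
    markup_changed = previous["markup"] != current["markup"]
    return started or ended or markup_changed


before = {"is_live": False, "markup": {"startDate": "2018-12-05T09:00:00+00:00"}}
after = {"is_live": True, "markup": {"startDate": "2018-12-05T09:00:00+00:00"}}
print(should_notify(before, after))
```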
For more information, visit our developer documentation. If you have any questions, ask us in the Webmaster Help Forum. We look forward to seeing your live videos on Google!
Posted by Danielle Marshak, Product Manager

Rich Results expands for Question & Answer pages

Monday, December 03, 2018

People come to Google seeking information about all kinds of questions.
Frequently, the information they're looking for is on sites where users ask and answer each other's questions. Popular social news sites, expert forums, and help and support message boards are all examples of this pattern.

A screenshot of an example search result for a page titled “How do I remove a cable that is stuck in a USB port” with a list of the top answers from the page.

In order to help users better identify which search results may give the best information about their question, we have developed a new rich result type for question and answer sites. Search results for eligible Q&A pages display a preview of the top answers. This new presentation helps site owners reach the right users for their content and helps users get the relevant information about their questions faster.

A screenshot of an example search result for a page titled “Why do touchscreens sometimes register a touch when ...” with a preview of the top answers from the page.

To be eligible for this feature, add Q&A structured data to your pages with Q&A content. Be sure to use the Structured Data Testing Tool to see if your page is eligible and to preview the appearance in search results. You can also check out Search Console to see aggregate stats and markup error examples. The Performance report also tells you which queries show your Q&A Rich Result in Search results, and how these change over time.
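For orientation, Q&A markup for a single question page might be assembled as in the sketch below. It assumes the schema.org QAPage, Question, and Answer types with an accepted answer plus suggested answers; the helper name and sample text are hypothetical, and required properties should be confirmed against the developer documentation.

```python
import json


def qa_page_markup(question, accepted_answer, other_answers):
    """Build QAPage markup for one question and its answers."""
    def answer(text):
        return {"@type": "Answer", "text": text}

    return {
        "@context": "https://schema.org",
        "@type": "QAPage",
        "mainEntity": {
            "@type": "Question",
            "name": question,
            "answerCount": 1 + len(other_answers),
            "acceptedAnswer": answer(accepted_answer),
            "suggestedAnswer": [answer(text) for text in other_answers],
        },
    }


markup = qa_page_markup(
    "How do I remove a cable that is stuck in a USB port?",
    "Wiggle it gently while pulling straight out.",
    ["Use a plastic pry tool.", "Take it to a repair shop."],
)
print(json.dumps(markup, indent=2))
```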
If you have any questions, ask us in the Webmaster Help Forum or reach out on Twitter!
Posted by Kayla Hanson, Software Engineer

Introducing the Indexing API for job posting URLs

Tuesday, June 26, 2018

Last June we launched a job search experience that has since connected tens of millions of job seekers around the world with relevant job opportunities from third party providers across the web. Timely indexing of new job content is critical because many jobs are filled relatively quickly. Removal of expired postings is important because nothing's worse than finding a great job only to discover it's no longer accepting applications.

Today we're releasing the Indexing API to address this problem. This API allows any site owner to directly notify Google when job posting pages are added or removed. This allows Google to schedule job postings for a fresh crawl, which can lead to higher quality user traffic and job applicant satisfaction. Currently, the Indexing API can only be used for job posting pages that include job posting structured data.

For websites with many short-lived pages like job postings, the Indexing API keeps job postings fresh in Search results because it allows updates to be pushed individually. This API can be integrated into your job posting flow, allowing high quality job postings to be searchable quickly after publication. In addition, you can check the last time Google received each kind of notification for a given URL.
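A job posting flow could prepare its notifications roughly as follows. The sketch only builds the request body and the status lookup URL; it assumes the v3 urlNotifications endpoints and the URL_UPDATED and URL_DELETED notification types, and it omits the OAuth authorization and HTTP call that a real integration needs. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumed endpoints; confirm against the Quickstart guide.
PUBLISH_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"
METADATA_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications/metadata"


def notification_body(url, removed=False):
    """JSON body to POST to PUBLISH_ENDPOINT for an added or removed posting."""
    kind = "URL_DELETED" if removed else "URL_UPDATED"
    return json.dumps({"url": url, "type": kind})


def metadata_url(url):
    """URL to GET when checking the last notifications received for a page."""
    return f"{METADATA_ENDPOINT}?{urlencode({'url': url})}"


print(notification_body("https://example.com/jobs/42"))
print(notification_body("https://example.com/jobs/17", removed=True))
print(metadata_url("https://example.com/jobs/42"))
```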

Follow the Quickstart guide to see how the Indexing API works. If you have any questions, ask us in the Webmaster Help Forum. We look forward to hearing from you!

Posted by Zach Clifford, Software Engineer

We updated our job posting guidelines

Friday, April 27, 2018

Last year, we launched job search on Google to connect more people with jobs. When you provide Job Posting structured data, it helps drive more relevant traffic to your page by connecting job seekers with your content. To ensure that job seekers are getting the best possible experience, it's important to follow our Job Posting guidelines.

We've recently made some changes to our Job Posting guidelines to help improve the job seeker experience.

Remove expired jobs
Place structured data on the job's detail page
Make sure all job details are present in the job description

Remove expired jobs

When job seekers put in effort to find a job and apply, it can be very discouraging to discover that the job that they wanted is no longer available. Sometimes, job seekers only discover that the job posting is expired after deciding to apply for the job. Removing expired jobs from your site may drive more traffic because job seekers are more confident when jobs that they visit on your site are still open for application. For more information on how to remove a job posting, see Remove a job posting.

Place structured data on the job's detail page

Job seekers find it confusing when they land on a list of jobs instead of the specific job's detail page. To fix this, put structured data on the most detailed leaf page possible. Don't add structured data to pages intended to present a list of jobs (for example, search result pages); add it only to the most specific page describing a single job with its relevant details.

Make sure all job details are present in the job description

We've also noticed that some sites include information in the JobPosting structured data that is not present anywhere in the job posting. Job seekers are confused when the job details they see in Google Search don't match the job description page. Make sure that the information in the JobPosting structured data always matches what's on the job posting page. Here are some examples:

If you add salary information to the structured data, then also add it to the job posting. Both salary figures should match.
The location in the structured data should match the location in the job posting.
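A simple pre-publish check along these lines could catch such mismatches. It is a deliberately naive sketch: it assumes the structured data values have already been flattened into a hypothetical field-to-text dictionary and only tests whether each value appears verbatim in the visible posting text.

```python
def missing_from_page(structured, page_text):
    """Return structured-data fields whose values do not appear on the page."""
    text = page_text.lower()
    return [
        field
        for field, value in structured.items()
        if str(value).lower() not in text
    ]


structured = {"salary": "50,000", "location": "Zurich"}
page_text = "Senior editor in Zurich. Salary: negotiable."
print(missing_from_page(structured, page_text))
```

An empty list means every checked value was found; anything else is worth reviewing before the posting goes live.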

Providing structured data content that is consistent with the content of the job posting pages not only helps job seekers find the exact job that they were looking for, but may also drive more relevant traffic to your job postings and therefore increase the chances of finding the right candidates for your jobs.

If your site violates the Job Posting guidelines (including the guidelines in this blog post), we may take manual action against your site and it may not be eligible for display in the jobs experience on Google Search. You can submit a reconsideration request to let us know that you have fixed the problem(s) identified in the manual action notification. If your request is approved, the manual action will be removed from your site or page.

For more information, visit our Job Posting developer documentation and our JobPosting FAQ.

Posted by Anouar Bendahou, Trust & Safety Search Team

Using page speed in mobile search ranking

Wednesday, January 17, 2018

Update July 9, 2018: The Speed Update is now rolling out for all users.

People want to be able to find answers to their questions as fast as possible — studies show that people really care about the speed of a page. Although speed has been used in ranking for some time, that signal was focused on desktop searches. Today we’re announcing that starting in July 2018, page speed will be a ranking factor for mobile searches.

The “Speed Update,” as we’re calling it, will only affect pages that deliver the slowest experience to users and will only affect a small percentage of queries. It applies the same standard to all pages, regardless of the technology used to build the page. The intent of the search query is still a very strong signal, so a slow page may still rank highly if it has great, relevant content.

We encourage developers to think broadly about how performance affects a user’s experience of their page and to consider a variety of user experience metrics. Although there is no tool that directly indicates whether a page is affected by this new ranking factor, here are some resources that can be used to evaluate a page’s performance.

Chrome User Experience Report, a public dataset of key user experience metrics for popular destinations on the web, as experienced by Chrome users under real-world conditions
Lighthouse, an automated tool and a part of Chrome Developer Tools for auditing the quality (performance, accessibility, and more) of web pages
PageSpeed Insights, a tool that indicates how well a page performs on the Chrome UX Report and suggests performance optimizations

As always, if you have any questions or feedback, please visit our webmaster forums.

Posted by Zhiheng Wang and Doantam Phan

Rendering AJAX-crawling pages

Monday, December 04, 2017

The AJAX crawling scheme was introduced as a way of making JavaScript-based webpages accessible to Googlebot, and we've previously announced our plans to turn it down. Over time, Google engineers have significantly improved rendering of JavaScript for Googlebot. Given these advances, in the second quarter of 2018, we'll be switching to rendering these pages on Google's side, rather than requiring that sites do this themselves. In short, we'll no longer be using the AJAX crawling scheme.

As a reminder, the AJAX crawling scheme accepts pages with either a "#!" in the URL or a "fragment meta tag" on them, and then crawls them with an "?_escaped_fragment_=" in the URL. That escaped version needs to be a fully-rendered and/or equivalent version of the page, created by the website itself.
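To make the mapping concrete, the sketch below converts a "#!" URL into its "?_escaped_fragment_=" counterpart, which is handy when comparing the two versions of a page. The function name is hypothetical and the escaping is a simplification of the original scheme: characters such as "&", "#", "%", "+", and spaces are percent-encoded.

```python
from urllib.parse import quote

# Characters left unescaped in the fragment (a simplifying assumption).
SAFE_CHARS = "!$'()*,/:;=?@"


def to_escaped_fragment_url(url):
    """Map a '#!' URL to the URL a crawler would request under the scheme."""
    base, marker, fragment = url.partition("#!")
    if not marker:
        return url  # not an AJAX-crawling URL
    joiner = "&" if "?" in base else "?"
    return f"{base}{joiner}_escaped_fragment_={quote(fragment, safe=SAFE_CHARS)}"


print(to_escaped_fragment_url("https://example.com/page#!key=value"))
print(to_escaped_fragment_url("https://example.com/page?a=1#!key=value"))
```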

With this change, Googlebot will render the #! URL directly, making it unnecessary for the website owner to provide a rendered version of the page. We'll continue to support these URLs in our search results.

We expect that most AJAX-crawling websites won't see significant changes with this update. Webmasters can double-check their pages as detailed below, and we'll be sending notifications to any sites with potential issues.

If your site is currently using either #! URLs or the fragment meta tag, we recommend:

Verify ownership of the website in Google Search Console to gain access to the tools there, and to allow Google to notify you of any issues that might be found.
Test with Search Console's Fetch & Render. Compare the results of the #! URL and the escaped URL to see any differences. Do this for any significantly different part of the website. Check our developer documentation for more information on supported APIs, and see our debugging guide when needed.
Use Chrome's Inspect Element to confirm that links use "a" HTML elements and include a rel=nofollow where appropriate (for example, in user-generated content)
Use Chrome's Inspect Element to check the page's title and description meta tag, any robots meta tag, and other meta data. Also check that any structured data is available on the rendered page.
Content in Flash, Silverlight, or other plugin-based technologies needs to be converted to either JavaScript or "normal" HTML, if their content should be indexed in search.
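The manual Inspect Element checks above can also be approximated offline. The sketch below assumes you have saved the rendered HTML of a page (for example, copied from the browser's developer tools) and uses a small parser class, hypothetical in name, to list the title, named meta tags, and "a" links with their nofollow status.

```python
from html.parser import HTMLParser


class RenderedPageAudit(HTMLParser):
    """Collect the title, named meta tags, and links from rendered HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.links = []  # (href, has_nofollow) pairs
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            rel_values = (attrs.get("rel") or "").lower().split()
            self.links.append((attrs["href"], "nofollow" in rel_values))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


rendered_html = (
    "<html><head><title>Demo</title>"
    '<meta name="description" content="A demo page">'
    '<meta name="robots" content="index,follow"></head>'
    '<body><a href="/about">About</a>'
    '<a rel="nofollow ugc" href="https://example.org">User link</a>'
    "</body></html>"
)
audit = RenderedPageAudit()
audit.feed(rendered_html)
print(audit.title, audit.meta, audit.links)
```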

We hope that this change makes it a bit easier for your website, and reduces the need to render pages on your end. Should you have any questions or comments, feel free to drop by our webmaster help forums, or to join our JavaScript sites working group.

Posted by John Mueller, Google Switzerland

A reminder about “event” markup

Monday, November 27, 2017

Lately we’ve been receiving feedback from users seeing non-events like coupons or vouchers showing up in search results where “events” snippets appear. This is really confusing for users and also against our guidelines, where we have added additional clarification.

So, what’s the problem?

We’ve seen a number of publishers in the coupons/vouchers space use the “event” markup to describe their offers. And as much as using a discount voucher can be a very special thing, that doesn’t make coupons or vouchers events or “saleEvents”. Using Event markup to describe something that is not an event creates a bad user experience, by triggering a rich result for something that will happen at a particular time, despite no actual event being present.

Here are some examples to illustrate the issue:

Since this creates a misleading user experience, we may take manual action on such cases. In case your website is affected by such a manual action, you will find a notification in your Search Console account. If a manual action is taken, it can result in structured data markup for the whole site not being used for search results.

While we’re specifically highlighting coupons and vouchers in this blogpost, this applies to all other non-event items being annotated with “event” markup as well, and more generally to any case where a type of markup is applied to something other than the type of thing it is meant to describe.

For more information, please visit our developer documentation or stop by our Webmaster Forum in case you have additional questions!

Posted by Sven Naumann, Trust & Safety Search Team

Engaging users through high quality AMP pages

Thursday, November 16, 2017

To improve our users' experience with AMP results, we are making changes to how we enforce our policy on content parity with AMP. Starting Feb 1, 2018, the policy requires that the AMP page content be comparable to the (original) canonical page content. AMP is not a ranking signal and there is no change in terms of the ranking policy with respect to AMP.

The open source accelerated mobile pages project (AMP) launched in 2015 and has seen tremendous growth with over 25M domains having implemented the AMP format. This rapid progress comes with a sense of responsibility of ensuring that our users continue to have a great content consumption experience that ultimately leads to more engagement with publisher content.

In some cases, webmasters publish two versions of their content: a canonical page that is not based on AMP and an AMP page. In the ideal scenario, both these pages have equivalent content leading the user to get the same content but with a faster and smoother experience via AMP. However, in some cases the content on the AMP page does not match the content on its original (canonical) page.

In a small number of cases, AMP pages are used as teaser pages, which create a particularly bad user experience since they only contain minimal content. In these instances, users have to click twice to get to the real content. Below is an example of what this may look like: a brief text of the main article, followed by a prompt asking the user to click through to another page to finish reading the article.

AMP was introduced to dramatically improve the performance of the web and deliver a fast, consistent content consumption experience. In keeping with this goal, we'll be enforcing the requirement of close parity between AMP and canonical page, for pages that wish to be shown in Google Search as AMPs.

Where we find that an AMP page doesn't contain the same critical content as its non-AMP equivalent, we will direct our users to the non-AMP page. This does not affect Search ranking. However, these pages will not be considered for Search features that require AMP, such as the Top Stories carousel with AMP. Additionally, we will notify the webmaster via Search Console as a manual action message and give the publisher the opportunity to fix the issue before its AMP page can be served again. The AMP open source website has several helpful guides to help produce fast, beautiful and high-performing AMP pages.

We hope this change encourages webmasters to maintain content parity between the canonical and AMP equivalent. This will lead to better experience on your site and ultimately happier users.

Posted by Ashish Mehta, Product Manager

Make your site's complete jobs information accessible to job seekers

Wednesday, November 15, 2017

In June, we announced a new experience that put the convenience of Search into the hands of job seekers. Today, we are taking the next step in improving the job search experience on Google by adding a feature that shows estimated salary information from the web alongside job postings, as well as adding new UI features for users.

Salary information has been one of the most requested additions from job seekers. This helps people evaluate whether a job is a good fit, and is an opportunity for sites with estimated salary information to:

Increase brand awareness: Estimated salary information shows a representative logo from the estimated salary provider.

Get more referral traffic: Users can click through directly to salary estimate pages when salary information surfaces in job search results.

If your site provides salary estimates, you can take advantage of these changes in the following ways:

Specify actual salary information

Actual salary refers to the base salary information that is provided by the employer. If your site publishes job listings, you can add JobPosting structured data and populate the baseSalary property to be eligible for inclusion in job search results.

This salary information will be made available in both the list and the detail views.
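As an illustration, the baseSalary property might be populated like this. The sketch assumes the schema.org MonetaryAmount and QuantitativeValue types; the helper name and the sample figures are hypothetical, and a complete JobPosting needs more properties than shown here.

```python
import json


def job_posting_with_salary(title, currency, amount, unit):
    """Build partial JobPosting markup carrying the employer's base salary."""
    return {
        "@context": "https://schema.org",
        "@type": "JobPosting",
        "title": title,
        "baseSalary": {
            "@type": "MonetaryAmount",
            "currency": currency,
            "value": {
                "@type": "QuantitativeValue",
                "value": amount,
                "unitText": unit,  # for example HOUR, MONTH, or YEAR
            },
        },
    }


markup = job_posting_with_salary("Software Engineer", "USD", 40.0, "HOUR")
print(json.dumps(markup, indent=2))
```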

Provide estimated salary information

In cases where employers don’t provide actual salary, job seekers may see estimated salaries sourced from multiple partners for the same or similar occupation. If your site provides salary estimate information, you can add Occupation structured data to be eligible for inclusion in job search results.
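Salary estimate markup could look roughly like the sketch below. It assumes the schema.org Occupation type with an estimatedSalary given as a MonetaryAmountDistribution; the helper name, the choice of percentiles, and the figures are hypothetical, so check the Occupation documentation for the properties that are actually required.

```python
import json


def occupation_markup(name, city, currency, p10, median, p90):
    """Build Occupation markup with a yearly salary distribution."""
    return {
        "@context": "https://schema.org",
        "@type": "Occupation",
        "name": name,
        "occupationLocation": [{"@type": "City", "name": city}],
        "estimatedSalary": [
            {
                "@type": "MonetaryAmountDistribution",
                "name": "base",
                "currency": currency,
                "duration": "P1Y",  # ISO 8601 duration: per year
                "percentile10": p10,
                "median": median,
                "percentile90": p90,
            }
        ],
    }


markup = occupation_markup("Software Developer", "Mountain View", "USD", 100000, 128950, 155000)
print(json.dumps(markup, indent=2))
```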

Include exact location information

We've heard from users that having accurate, street-level location information helps them to focus on opportunities that work best for them. Sites that publish job listings can do this by using the jobLocation property in JobPosting structured data.
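A street-level jobLocation value might be built as in this sketch, which assumes the schema.org Place and PostalAddress types. The helper name and the address are hypothetical.

```python
def job_location(street, locality, region, postal_code, country):
    """Build a jobLocation value with a full postal address."""
    return {
        "@type": "Place",
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": locality,
            "addressRegion": region,
            "postalCode": postal_code,
            "addressCountry": country,
        },
    }


location = job_location("555 Example Street", "Detroit", "MI", "48201", "US")
print(location)
```

The returned dictionary is meant to be placed under the jobLocation key of a JobPosting object.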

Validate your structured data

To double-check the structured data on your pages, we'll be updating the Structured Data Testing Tool and the Search Console reports in the near future. In the meantime, you can monitor the performance of your job postings in Search Analytics. Stay tuned!

Since launching this summer, we’ve seen over 60% growth in the number of companies with jobs showing on Google and connected tens of millions of people to new job opportunities. We are excited to help users find jobs with salaries that meet their needs, and to route them to your site for more information. We invite sites that provide salary estimates to mark up their salary pages using the Occupation structured data. Should you have any questions regarding the use of structured data on your site, feel free to drop by our webmaster help forums.

Posted by Nick Zakrasek, Product Manager

Enabling more high quality content for users

Sunday, October 01, 2017

In Google’s mission to organize the world's information, we want to guide Google users to the highest quality content, the principle exemplified in our quality rater guidelines. Professional publishers provide the lion’s share of quality content that benefits users and we want to encourage their success.

The ecosystem is sustained via two main sources of revenue: ads and subscriptions, with the latter requiring a delicate balance to be effective in Search. Typically subscription content is hidden behind paywalls, so that users who don’t have a subscription don’t have access. Our evaluations have shown that users who are not familiar with the high quality content behind a paywall often turn to other sites offering free content. It is difficult to justify a subscription if one doesn't already know how valuable the content is, and in fact, our experiments have shown that a portion of users shy away from subscription sites. Therefore, it is essential that sites provide some amount of free sampling of their content so that users can learn how valuable their content is.

The First Click Free (FCF) policy for both Google web search and News was designed to address this issue. It offers promotion and discovery opportunities for publishers with subscription content, while giving Google users an opportunity to discover that content. Over the past year, we have worked with publishers to investigate the effects of FCF on user satisfaction and on the sustainability of the publishing ecosystem. We found that while FCF is a reasonable sampling model, publishers are in a better position to determine what specific sampling strategy works best for them. Therefore, we are removing FCF as a requirement for Search, and we encourage publishers to experiment with different free sampling schemes, as long as they stay within the updated webmaster guidelines. We call this Flexible Sampling.

One of the original motivations for FCF was to address the issues surrounding cloaking, where the content served to Googlebot is different from the content served to users. Spammers often seek to game search engines by showing interesting content to the search engine, say healthy food recipes, but then showing users an offer for diet pills. This “bait and switch” scheme creates a bad user experience since users do not get the content they expected. Sites with paywalls are strongly encouraged to apply the new structured data to their pages, because without it, the paywall may be interpreted as a form of cloaking, and the pages would then be removed from search results.
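The post does not spell out the markup itself, so the following is only a sketch of what paywall structured data might look like, built as JSON-LD from Python. It assumes the schema.org properties isAccessibleForFree, hasPart, and cssSelector; the function name, the NewsArticle type, and the ".paywall" class are illustrative choices, and the current structured data documentation should be checked for the exact requirements.

```python
import json


def paywalled_article_jsonld(headline, paywall_css_selector):
    """Build JSON-LD marking part of an article as paywalled.

    Assumption: the page wraps its paywalled section in an element that
    matches `paywall_css_selector` (for example ".paywall").
    """
    data = {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        # The article as a whole is not freely accessible.
        "isAccessibleForFree": False,
        # Point at the element that holds the paywalled content.
        "hasPart": {
            "@type": "WebPageElement",
            "isAccessibleForFree": False,
            "cssSelector": paywall_css_selector,
        },
    }
    return json.dumps(data, indent=2)


if __name__ == "__main__":
    # The output would be embedded in a <script type="application/ld+json"> tag.
    print(paywalled_article_jsonld("Example headline", ".paywall"))
```

Marking only the paywalled element, rather than the whole page, is what lets a crawler tell a declared paywall apart from content that is silently different for users.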

Based on our investigations, we have created detailed best practices for implementing flexible sampling. There are two types of sampling we advise: metering, which provides users with a quota of free articles to consume, after which paywalls will start appearing; and lead-in, which offers a portion of an article’s content without it being shown in full.

For metering, we think that monthly (rather than daily) metering provides more flexibility and a safer environment for testing. The user impact of changing from one integer value to the next is less significant at, say, 10 monthly samples than at 3 daily samples: moving from 10 to 11 is a 10% change, while moving from 3 to 4 is a 33% change. All publishers and their audiences are different, so there is no single value for optimal free sampling across publishers. However, we recommend that publishers start by providing 10 free clicks per month to Google search users in order to preserve a good user experience for new potential subscribers. Publishers should then experiment to optimize the tradeoff between discovery and conversion that works best for their businesses.
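As a rough illustration of monthly metering, here is a minimal in-memory sketch. The class and method names are hypothetical, the quota of 10 is the starting value suggested above, and letting a user re-read an article they have already opened without using another free click is an assumption of this sketch, not part of the policy. A real site would also need persistent storage and a way to identify returning users.

```python
from collections import defaultdict
from datetime import date

FREE_ARTICLES_PER_MONTH = 10  # starting value recommended in the post


class MonthlyMeter:
    """Tracks which articles each user has opened in a calendar month."""

    def __init__(self, quota=FREE_ARTICLES_PER_MONTH):
        self.quota = quota
        # (user_id, year, month) -> set of article ids already counted
        self._seen = defaultdict(set)

    def allow(self, user_id, article_id, today):
        """Return True if the article may be shown in full, False for a paywall."""
        seen = self._seen[(user_id, today.year, today.month)]
        if article_id in seen:
            return True  # assumption: re-reads do not use another free click
        if len(seen) >= self.quota:
            return False  # quota used up for this month
        seen.add(article_id)
        return True


if __name__ == "__main__":
    meter = MonthlyMeter()
    day = date(2017, 10, 1)
    results = [meter.allow("reader-1", f"article-{i}", day) for i in range(12)]
    print(results.count(True), "free,", results.count(False), "paywalled")
```

Keying the counter on the calendar month means the quota resets on its own when the month changes, which keeps experiments with different quota values easy to reason about.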

Lead-in is generally implemented as truncated content, such as the first few sentences or 50-100 words of the article. Lead-in allows users a taste of how valuable the content may be. Compared to a page with completely blocked content, lead-in clearly provides more utility and added value to users.
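A lead-in of this kind could be produced by truncating the article at a word limit. The sketch below is one simple way to do that; the function name, the default of 100 words (the upper end of the range mentioned above), and the trailing "..." marker are illustrative assumptions.

```python
def lead_in(article_text, max_words=100):
    """Return roughly the first `max_words` words of an article as a teaser.

    Short articles are returned unchanged. Longer ones are cut at a word
    boundary, with runs of whitespace collapsed to single spaces.
    """
    words = article_text.split()
    if len(words) <= max_words:
        return article_text
    return " ".join(words[:max_words]) + " ..."


if __name__ == "__main__":
    sample = "word " * 250
    teaser = lead_in(sample)
    print(len(teaser.split()) - 1, "words shown before the paywall")
```

Cutting on word boundaries avoids ending the teaser mid-word; a publisher might prefer to cut at the end of a sentence or paragraph instead.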

We are excited by this change as it allows the growth of the premium content ecosystem, which ultimately benefits users. We look forward to the prospect of serving users more high quality content!

Posted by Cody Kwok, Principal Engineer
