(cross-posted on mobile-l)

Okay, looks like the index of zero.wikipedia.org pages in Google has shrunk by some 20 million entries. Nonetheless, a number of really old pages (e.g., going back to 6-May-2013) are still in the Google index with article text. I'll set a reminder to check on the Google index again in 30 days, and hopefully we can finally put the no-index rules in place at that time.
The good news is that many of the pages are now correctly suppressed in natural search as non-canonical pages. In other words, a user would need to go through omitted results or do a site:<domain> search to see them.

-Adam

On Tue, Jun 18, 2013 at 3:35 PM, Adam Baso <ab...@wikimedia.org> wrote:

> Update:
>
> We've added an enhancement to Wikipedia Zero so that if a user who isn't
> on a participating carrier network navigates to a Wikipedia Zero page on
> <language>.zero.wikipedia.org, such as
> http://en.zero.wikipedia.org/wiki/Muse_%28band%29 , the user will be
> presented an option to visit the canonical URL of the article. If clicked,
> the canonical URL should get the user to the mobile or desktop version of
> the page, based on device type.
>
> We're hoping that by next week the Google index will be refreshed so as to
> correctly mark the <language>.zero.wikipedia.org pages as duplicate pages
> in the omitted section. Upon confirmation of as much, the current plan is
> to introduce https://gerrit.wikimedia.org/r/#/c/69420/ to prevent
> indexing of <language>.zero.wikipedia.org altogether.
>
>
> On Tue, May 28, 2013 at 6:26 PM, Adam Baso <ab...@wikimedia.org> wrote:
>
>> All,
>>
>> My mistake. The pages in Google's index that I used for sampling - the
>> ones that have "Sorry, ..." in their description in Google search results -
>> are cached pages. I assumed incorrectly that those pages were based on
>> recent indexing (e.g., in the past few days).
>>
>> I think we can actually stick to the original plan of Google re-indexing
>> and the search results de-emphasizing the <language>.zero.wikipedia.org
>> links within the next 30 days.
>>
>> I still find it strange that there are <language>.zero.wikipedia.org links
>> that turned up higher in the search engine rankings than their
>> better-established <language>.wikipedia.org counterparts.
>> But I suppose
>> with fewer competing page elements, especially on long-tail articles with
>> fewer or no direct links to the desktop page, this is maybe not totally
>> unexpected.
>>
>> -Adam
>>
>>
>> On Tue, May 28, 2013 at 1:49 PM, Adam Baso <ab...@wikimedia.org> wrote:
>>
>>> Hello All,
>>>
>>> We had shelved my patch, patch 64629 <https://gerrit.wikimedia.org/r/64629>,
>>> in hopes that an earlier patch, patch
>>> 61809 <https://gerrit.wikimedia.org/r/61809> (bug
>>> 35233 <https://bugzilla.wikimedia.org/show_bug.cgi?id=35233>), would
>>> resolve the issue naturally as Google re-indexed. But it appears Google has
>>> re-indexed and yet the .zero.wikipedia.org URLs are still present in
>>> Google's index, instead of the <language>.wikipedia.org URLs.
>>>
>>> I have thus resubmitted patch 64629 <https://gerrit.wikimedia.org/r/64629>
>>> for re-review. We will need to further discuss whether it is appropriate to
>>> have Google completely remove .zero.wikipedia.org links from their
>>> cache, or if perhaps we need to open a support thread with Google about
>>> canonical URLs.
>>>
>>>
>>> On Tue, May 28, 2013 at 1:13 PM, Kul Wadhwa <kwad...@wikimedia.org> wrote:
>>>
>>>> Adam Baso (copied on this email) is working on it and a fix is ready.
>>>> He'll do some testing to make sure it's resolved.
>>>>
>>>> On Tue, May 28, 2013 at 10:22 AM, Tomasz Finc <tf...@wikimedia.org> wrote:
>>>>
>>>>> Looping Dan Foy in who's managing the Zero backlog.
>>>>>
>>>>> On Mon, May 27, 2013 at 8:01 AM, MZMcBride <z...@mzmcbride.com> wrote:
>>>>> > K. Peachey wrote:
>>>>> >>Can you please file this in bugzilla <https://bugzilla.wikimedia.org>?
>>>>> >
>>>>> > https://bugzilla.wikimedia.org/show_bug.cgi?id=48856
>>>>> >
>>>>> > MZMcBride
>>>>> >
>>>>> > _______________________________________________
>>>>> > Wikimedia-l mailing list
>>>>> > Wikimedia-l@lists.wikimedia.org
>>>>> > Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l
>>>>>
>>>>
>>>> --
>>>> Kul Wadhwa
>>>> Head of Mobile
>>>> Wikimedia Foundation
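
For anyone following the thread, here is a rough sketch of the two ideas discussed above: mapping a <language>.zero.wikipedia.org URL to its <language>.wikipedia.org counterpart, and marking Zero pages as not indexable. It is only an illustration and is not the contents of any of the Gerrit changes mentioned. The function names are made up, the sketch always maps to the desktop host (the real feature picks mobile or desktop by device type), and using an `X-Robots-Tag: noindex` response header is an assumption about one way a no-index rule could be expressed.

```python
from urllib.parse import urlsplit, urlunsplit


def is_zero_host(host: str) -> bool:
    """True for hosts shaped like <language>.zero.wikipedia.org."""
    labels = host.lower().split(".")
    return len(labels) == 4 and labels[1:] == ["zero", "wikipedia", "org"]


def canonical_url(url: str) -> str:
    """Drop the 'zero' label from the host; leave other URLs untouched.

    Assumption: the canonical page lives on <language>.wikipedia.org.
    Any port in the original URL is dropped, which is fine for a sketch.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not is_zero_host(host):
        return url
    language = host.split(".")[0]
    canonical_host = f"{language}.wikipedia.org"
    return urlunsplit(
        (parts.scheme, canonical_host, parts.path, parts.query, parts.fragment)
    )


def noindex_header(url: str) -> dict:
    """Hypothetical no-index rule: ask crawlers not to index Zero pages."""
    host = urlsplit(url).hostname or ""
    if is_zero_host(host):
        return {"X-Robots-Tag": "noindex"}
    return {}


if __name__ == "__main__":
    zero_page = "http://en.zero.wikipedia.org/wiki/Muse_%28band%29"
    print(canonical_url(zero_page))
    print(noindex_header(zero_page))
```

The two helpers are kept separate on purpose, since the thread treats them as separate steps: first wait for the canonical signal to take effect in the index, and only then turn on the no-index rule.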