[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

Aaron Schulz jschulz_4...@msn.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #16 from Aaron Schulz jschulz_4...@msn.com 2011-02-10 03:04:32 
UTC ---
Sync check (per comment #15) added in r81874.

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are on the CC list for the bug.

___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

wikipe...@deeprocks.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
                 CC|                            |wikipe...@deeprocks.de
         Resolution|WONTFIX                     |

--- Comment #4 from wikipe...@deeprocks.de 2011-02-06 16:51:26 UTC ---
I support the bug Mathias has described above, and therefore the removal of
the meta robots tag for logged in users. I do not care at all about 50 bytes
more or less; that is not the point. The point is that the meta robots tag
carries risks, as the current debate about Google.de not listing German
Wikipedia articles under some circumstances shows.

Basically, there are two possible scenarios, and I will describe both below.
When I say Google Bot, I mean any search engine crawler as well - I use the
Google crawler only because it is topical right now.

1st: Google Bot crawls pages as an anonymous user (not sending cookie headers).
This is the scenario we assume right now. We do not know of any search engine
bots which crawl while being logged in. Therefore, sending robots information
to logged in users is pointless, as they are generally not crawlers and so do
not read robots directives. 1st scenario means: The meta information is
obsolete.

2nd: Google Bot crawls pages as a logged in user. In this case, sending robots
information makes sense for logged in users too. Then, however, it could be
the reason (or one reason among others) for the Google - Wikipedia problem
existing right now. If the 2nd scenario could apply, the robots information
should be removed temporarily, just to make sure it is NOT responsible for the
problems. 2nd scenario means: It is likely that the robots information and the
Google problem are related to each other. To fix the problem as fast as
possible, the robots information should be disabled (for a while, at least).

As you can see, both possible cases lead me to urge removing the robots
information - at least for a couple of weeks. As soon as Google lists all the
Wikipedia articles again and both MediaWiki techs and Google techs have found
the cause of the problem, they should consider whether these meta tags are
reasonable and should be added back.

However, according to statements from Wikimedia, neither Google Bots nor any
other search engine crawlers log in. If this is true, there is no need for
those meta tags as they are NEVER read by crawlers and are therefore no more
than source code waste.



[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

LordAndrew reachouttothetr...@hotmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |reachouttothetruth@hotmail.
                   |                            |com

--- Comment #5 from LordAndrew reachouttothetr...@hotmail.com 2011-02-06 
17:37:02 UTC ---
The Googlebot indexing problem (bug 27155) is a problem on Google's end. I
don't see why anything has to be done here. Removing the robot indexing policy
would result in a bunch of useless pages being indexed, but if the search
engine isn't going to display it in the search results then all we've done is
waste resources.

Search engine indexing bots ''shouldn't'' be indexing while logged in. But if
someone does write a search engine bot that logs in for some reason, it should
follow the same indexing policies as all other search engine bots. If the
robots policy is removed for logged in users, then such a bot would be getting
different indexing instructions than those that don't. Why would we grant an
exception to the robot indexing policies simply because the bot logs in?



[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

--- Comment #6 from wikipe...@deeprocks.de 2011-02-06 17:41:37 UTC ---
(In reply to comment #5)
> The Googlebot indexing problem (bug 27155) is a problem on Google's end.

This is not proven yet - not at all.



[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

--- Comment #7 from PDD david.dohe...@gmail.com 2011-02-06 17:51:32 UTC ---
(In reply to comment #5)
> But if
> someone does write a search engine bot that logs in for some reason, it should
> follow the same indexing policies as all other search engine bots.

Exactly, and the site-wide robots indexing policy *is not* and *should not be*
set via META tags. The META tags were introduced as a new feature of
FlaggedRevs to prevent unflagged (!) revisions of pages from being indexed. So,
if anything, only unflagged revisions should have the META tag with
NOINDEX,NOFOLLOW, but for logged-in users *all* pages (flagged and unflagged)
have this META tag. This is a bug. Bugs should be fixed.



[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

Derk-Jan Hartman hart...@videolan.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hart...@videolan.org,
                   |                            |innocentkil...@gmail.com,
                   |                            |jschulz_4...@msn.com
          Component|General/Unknown             |FlaggedRevs
            Version|unspecified                 |any
         AssignedTo|wikibugs-l@lists.wikimedia. |ro...@wikimedia.org
                   |org                         |
            Product|Wikimedia                   |MediaWiki extensions

--- Comment #8 from Derk-Jan Hartman hart...@videolan.org 2011-02-06 18:17:18 
UTC ---
PDD is right, this is a bug in FlaggedRevs. Changing component.



[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

--- Comment #9 from Platonides platoni...@gmail.com 2011-02-06 18:28:52 UTC 
---
So, the real reason for this bug is that Google is miscrawling Wikipedia and
someone thought that this line was at fault. It is not.
If that was the reason, all wikipedias would be affected, not only dewiki. No
wikipedia article would be listed. And if Googlebot were logged in, cached
pages would contain the Google user name at the top.



[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

--- Comment #10 from PDD david.dohe...@gmail.com 2011-02-06 18:35:01 UTC ---
(In reply to comment #9)
> If that was the reason, all wikipedias would be affected, not only dewiki.

Erm, are you commenting here without actually having looked into the matter?
The META tag bug affects dewiki, huwiki and plwiki *only*, so of course it
can't have any effect on all wikipedias, no matter what that effect might
be...



[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

--- Comment #11 from DerHexer wikipedia_emails-nachfr...@yahoo.de 2011-02-06 
18:37:57 UTC ---
These are two separate bugs: one about that Google issue and one about the
useless 'noindex,nofollow' in flagged (and not only unflagged) revisions. The
latter is [[bugzilla:27173]] (this one), the former [[bugzilla:27155]].



[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

--- Comment #12 from Chad H. innocentkil...@gmail.com 2011-02-06 19:06:27 UTC 
---
FWIW, the indexing problem is an issue on Google's end, not ours (bug 27155
tracks that)

We've actually been serving noindex,nofollow to logged in users in FlaggedRevs
for quite some time now (the code in this regard hasn't changed in about a
year). I think Googlebot's problems just raised people's awareness, and this
looked like a plausible first guess as to the cause.

Whether or not we should serve noindex,nofollow to logged in users is
debatable, and I guess this bug serves that purpose.



[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

--- Comment #13 from wikipe...@deeprocks.de 2011-02-06 19:13:53 UTC ---
(In reply to comment #12)
> Whether or not we should serve noindex,nofollow to logged in users is
> debatable, and I guess this bug serves that purpose.

Chad got the point. Mentioning Google was just an example and nothing else.
Regardless of whether crawlers do their job logged in or not, the meta tags
are pointless and have to be removed - or can anyone explain to me why a
crawler must not index an article version which has been checked (flagged)?!



[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

--- Comment #14 from Derk-Jan Hartman hart...@videolan.org 2011-02-06 
19:30:26 UTC ---
This happens because, in FlaggedArticleView.php, setRobotPolicy() has the
following check:

<pre>
if ( !$this->pageOverride() && $this->article->isStableShownByDefault() ) {
    // set noindex
}
</pre>

In this check, $this->pageOverride() returns false for stable versions for
logged in users, yet true for stable versions for non-logged in users.

pageOverride() returns false for logged in users, due to the following check:

<pre>
$config = $this->article->getVisibilitySettings();
# Does the stable version override the current one?
if ( $config['override'] ) {
    if ( $this->showDraftByDefault() ) {
        return ( $wgRequest->getIntOrNull( 'stable' ) === 1 );
    }
    # Viewer sees stable by default
    return !( $wgRequest->getIntOrNull( 'stable' ) === 0 );
</pre>

ergo, pageOverride() does not account for usergroup settings in viewing stable
pages, it only takes into account usersettings, page settings and url
overrides.
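
As a rough illustration of the excerpt above, the branch logic can be modelled
in isolation. This is a sketch, not FlaggedRevs code: pageOverrideModel() and
its three parameters are hypothetical stand-ins for pageOverride(),
getVisibilitySettings(), showDraftByDefault() and
$wgRequest->getIntOrNull( 'stable' ), and the final return covers a branch the
excerpt does not show, so it is an assumption.

```php
<?php
// Hypothetical stand-alone model of the quoted pageOverride() excerpt.
// $config             ~ $this->article->getVisibilitySettings()
// $showDraftByDefault ~ $this->showDraftByDefault()
// $stableParam        ~ $wgRequest->getIntOrNull( 'stable' )
function pageOverrideModel( array $config, bool $showDraftByDefault, ?int $stableParam ): bool {
    if ( $config['override'] ) {
        if ( $showDraftByDefault ) {
            // Draft is this viewer's default: stable only on explicit ?stable=1
            return $stableParam === 1;
        }
        // Stable is this viewer's default unless ?stable=0 is given
        return !( $stableParam === 0 );
    }
    // Assumption: the excerpt does not show this branch; here the stable
    // version is used only when it is requested explicitly.
    return $stableParam === 1;
}

$config = [ 'override' => true ];
var_dump( pageOverrideModel( $config, false, null ) ); // stable is the default, no URL parameter
var_dump( pageOverrideModel( $config, true, null ) );  // draft is the default, no URL parameter
var_dump( pageOverrideModel( $config, true, 1 ) );     // draft is the default, ?stable=1
```

In this model the first call yields true, the second false and the third true,
which matches the behaviour described for non-logged in and logged in users.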



[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

--- Comment #15 from Aaron Schulz jschulz_4...@msn.com 2011-02-06 20:06:39 
UTC ---
(In reply to comment #14)
> ergo, pageOverride() does not account for usergroup settings in viewing stable
> pages, it only takes into account usersettings, page settings and url
> overrides.

Yes, it does check that. That's what showDraftByDefault() does.

The real cause is that logged-in users see the current version by default, even
if it is synced with the stable version. Try logging in and adding ?stable=1 to
the page URL (noindex goes away). The two versions are almost the same, except
the stable has filetimestamp=X added to thumbnail links. In rare cases, the
current version might use newer versions of Commons files too (feature of bug
15748).

One way to index these would be to have setRobotPolicy() check for this
scenario (viewing the draft when the stable version is synced with it).

I've been doing refactoring yesterday to make the code easier to read. I'll
deal with this after finishing that.
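
The check suggested above could look roughly like the following. It is only a
sketch under assumed names (shouldSetNoindex() and its boolean parameters are
hypothetical), not the change committed for this bug.

```php
<?php
// Hypothetical model of the decision made in setRobotPolicy().
// $pageOverride          ~ result of $this->pageOverride()
// $stableShownByDefault  ~ result of $this->article->isStableShownByDefault()
// $draftSyncedWithStable ~ assumed flag: the draft matches the stable version
function shouldSetNoindex( bool $pageOverride, bool $stableShownByDefault, bool $draftSyncedWithStable ): bool {
    // Existing condition: the viewer is not shown the stable version
    // although the stable version is the page's default.
    $viewingDraft = !$pageOverride && $stableShownByDefault;
    // Proposed addition: a draft that is in sync with the stable version
    // does not need to be hidden from indexing.
    return $viewingDraft && !$draftSyncedWithStable;
}

var_dump( shouldSetNoindex( false, true, true ) );  // logged in, draft synced with stable
var_dump( shouldSetNoindex( false, true, false ) ); // logged in, draft has pending changes
```

Under these assumptions the first call yields false (no noindex) and the
second yields true.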



[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

Mark A. Hershberger m...@everybody.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |m...@everybody.org

--- Comment #1 from Mark A. Hershberger m...@everybody.org 2011-02-05 
22:04:50 UTC ---
I don't understand: what advantage would users see if this were removed?



[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

--- Comment #2 from Mathias Schindler mathias.schind...@gmail.com 2011-02-05 
22:06:03 UTC ---
A 50 byte size reduction per page (before compression).
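
For reference, the figure matches the length of a robots meta tag plus a
trailing line break. The exact markup below is an assumption, since the thread
never quotes the tag verbatim.

```php
<?php
// Assumption: the tag as it might appear in the page source; the thread
// does not quote it.
$tag = '<meta name="robots" content="noindex,nofollow" />';
// One extra byte for the line break that follows the tag.
$bytes = strlen( $tag ) + strlen( "\n" );
echo $bytes, "\n"; // prints 50
```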



[Bug 27173] Remove noindex meta tag from HTML of logged in users

2011-02-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27173

Platonides platoni...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |platoni...@gmail.com
         Resolution|                            |WONTFIX

--- Comment #3 from Platonides platoni...@gmail.com 2011-02-05 22:12:30 UTC 
---
Those pages are indeed not suitable for robot indexing. The 50 byte size
reduction is not significant.
We could remove more bytes by removing the script variables, comments or
performing html5 minimization techniques. But you would need to explain why
those few bytes make a difference.
