On 20 February 2010 23:00, Ævar Arnfjörð Bjarmason <ava...@gmail.com> wrote:
> On Thu, Feb 4, 2010 at 14:37, Aryeh Gregor
> <simetrical+wikil...@gmail.com> wrote:
>> On Wed, Feb 3, 2010 at 5:11 PM, Trevor Parscal <tpars...@wikimedia.org> 
>> wrote:
>>> Are the stats setup to differentiate between real ie6 users and bing
>>> autosurfing?
>>
>> I'd be pretty surprised if Bing is generating enough traffic to
>> noticeably affect the percentage, even if it does get counted as IE6.
>
> Bing can hit you pretty hard:
> http://blogs.perl.org/users/cpan_testers/2010/01/msnbot-must-die.html
>

Well, it's not a crawler. It looks more like a
cracracracracracracracracrawler, given the way it repeats the same
request N times. It doesn't act like a single crawler, but like a set
of separate crawlers with no intercommunication, all coming from the
same IP range. A simple optimization would be for all these crawlers
to share the robots.txt file (isn't this obvious?). Since that
response is not shared, you see every instance making its own request.
There's also no synchronization, so all the crawlers can hit your site
at the same time: say, 15 of them asking for robots.txt at once, or
spread over 2 hours. It's just luck.
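
To make the "share the robots.txt" idea concrete, here is a rough
sketch of what I mean. It is not how msnbot works (I have no idea
what it does internally), and the names SharedRobotsCache, fetch and
ttl_seconds are made up for the illustration. Every worker asks one
shared cache, so a host's robots.txt is downloaded once per freshness
window instead of once per crawler instance.

```python
import threading
import time
from urllib.robotparser import RobotFileParser


class SharedRobotsCache:
    """One parsed robots.txt per host, shared by every crawler worker."""

    def __init__(self, fetch, ttl_seconds=3600.0):
        # fetch is a hypothetical callable: host -> robots.txt text.
        self._fetch = fetch
        # Assumed freshness window; a real crawler would tune this.
        self._ttl = ttl_seconds
        self._lock = threading.Lock()
        self._entries = {}  # host -> (parser, time fetched)

    def can_fetch(self, user_agent, host, path):
        # The lock is held during the fetch, so workers that arrive at
        # the same moment wait for the first download instead of
        # issuing their own.
        with self._lock:
            entry = self._entries.get(host)
            if entry is None or time.monotonic() - entry[1] > self._ttl:
                parser = RobotFileParser()
                parser.parse(self._fetch(host).splitlines())
                entry = (parser, time.monotonic())
                self._entries[host] = entry
        return entry[0].can_fetch(user_agent, path)


def demo():
    """Run 15 workers against one host and count robots.txt fetches."""
    calls = []

    def fake_fetch(host):
        # Stand-in for the real HTTP request, so the sketch runs offline.
        calls.append(host)
        return "User-agent: *\nDisallow: /private/\n"

    cache = SharedRobotsCache(fake_fetch)
    results = []

    def worker():
        results.append(
            cache.can_fetch("examplebot", "example.org", "/private/page")
        )

    threads = [threading.Thread(target=worker) for _ in range(15)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(calls), results
```

Calling demo() runs 15 workers that all want the same host, and only
one robots.txt fetch is made. Holding a single lock across the fetch
keeps the sketch short, but it also serializes lookups for unrelated
hosts; a per-host lock would be the obvious next step.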

It seems a rather simplistic, brute-force approach to internet
indexing. :-/ It seems Microsoft is throwing money at the problem, but
not brains.



-- 
--
ℱin del ℳensaje.

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
