Re: Please stop massive crawling against http://openean.kaufkauf.net/id/

2010-06-09 Thread Jürgen Umbrich
Hi all, > The volunteer who is hosting http://openean.kaufkauf.net/id/, a huge set of > GoodRelations product model data, is experiencing a problematic amount of > traffic from unidentified crawlers located in Ireland (DERI?), the > Netherlands (VUA?), and the USA. > Another crawler used fr

Re: Please stop massive crawling against http://openean.kaufkauf.net/id/

2010-06-08 Thread Robert Fuller
Kingsley Idehen wrote: The LOD Cloud Cache at DERI is a live Virtuoso instance with 15 Billion+ Triples loaded. It covers as much of the LOD Cloud as we've been able to get our hands on plus 6.4 Billion Triples from the Data.Gov effort. I'll drop a more detailed note about this instance (via bl

Re: Please stop massive crawling against http://openean.kaufkauf.net/id/

2010-06-08 Thread Kingsley Idehen
Robert Fuller wrote: Kingsley Idehen wrote: The LOD Cloud Cache at DERI is a live Virtuoso instance with 15 Billion+ Triples loaded. It covers as much of the LOD Cloud as we've been able to get our hands on plus 6.4 Billion Triples from the Data.Gov effort. I'll drop a more detailed note abou

Re: Please stop massive crawling against http://openean.kaufkauf.net/id/

2010-06-08 Thread Kingsley Idehen
Robert Fuller wrote: Hi, Sindice clearly identifies itself in the user agent http header. Currently we use these user agents: 1. "Mozilla/5.0 (compatible; sindice-fetcher/0.1.0 +http://sindice.com/developers/bot)" 2. "SindiceFetcher/Ping Manager (http://sindice.com/developers/bot" 3. "si

Re: Please stop massive crawling against http://openean.kaufkauf.net/id/

2010-06-08 Thread Robert Fuller
Hi, Sindice clearly identifies itself in the user agent http header. Currently we use these user agents: 1. "Mozilla/5.0 (compatible; sindice-fetcher/0.1.0 +http://sindice.com/developers/bot)" 2. "SindiceFetcher/Ping Manager (http://sindice.com/developers/bot" 3. "sindice.net ontology fet
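
For a host trying to tell identified crawlers from unidentified ones, the user agent strings quoted in this message can be checked on the server side. The following is only a minimal sketch: the function name identify_crawler, the token list, and the case-insensitive substring rule are assumptions made for illustration, not anything Sindice or the openean host is known to use.

```python
# Tokens derived from the two user agent strings quoted in the message above.
# The token list and the matching rule are illustrative assumptions.
KNOWN_CRAWLER_TOKENS = ("sindice-fetcher", "sindicefetcher")


def identify_crawler(user_agent):
    """Return the first known crawler token found in the User-Agent
    header value, or None if the agent is missing or unidentified."""
    if not user_agent:
        return None
    lowered = user_agent.lower()
    for token in KNOWN_CRAWLER_TOKENS:
        if token in lowered:
            return token
    return None
```

A request whose header yields None would count as unidentified traffic. A User-Agent header is self-reported, so a check like this only helps with crawlers that choose to identify themselves.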

Re: Please stop massive crawling against http://openean.kaufkauf.net/id/

2010-06-08 Thread Christophe Guéret
Dear Martin, I guess the VUA crawler was ours. The deficient process has been stopped now and won't be restarted before being checked for bugs. Sorry about all the problems caused. Best regards, Christophe On 06/08/2010 10:03 AM, Martin Hepp (UniBW) wrote: Dear all: The volunteer who is h

Re: Please stop massive crawling against http://openean.kaufkauf.net/id/

2010-06-08 Thread Dan Brickley
On Tue, Jun 8, 2010 at 10:03 AM, Martin Hepp (UniBW) wrote: > Dear all: > > The volunteer who is hosting http://openean.kaufkauf.net/id/, a huge set of > GoodRelations product model data, is experiencing a problematic amount of > traffic from unidentified crawlers located in Ireland (DERI?), the >

Re: Please stop massive crawling against http://openean.kaufkauf.net/id/

2010-06-08 Thread Story Henry
One could put the data behind foaf+ssl, and so identify agents :-) Henry On 8 Jun 2010, at 10:03, Martin Hepp (UniBW) wrote: > Dear all: > > The volunteer who is hosting http://openean.kaufkauf.net/id/, a huge set of > GoodRelations product model data, is experiencing a problematic amount of

Please stop massive crawling against http://openean.kaufkauf.net/id/

2010-06-08 Thread Martin Hepp (UniBW)
Dear all: The volunteer who is hosting http://openean.kaufkauf.net/id/, a huge set of GoodRelations product model data, is experiencing a problematic amount of traffic from unidentified crawlers located in Ireland (DERI?), the Netherlands (VUA?), and the USA. The crawling has been so intense