Hello,
the second line in the log had the referrer:
http://toolserver.org/~kolossos/openlayers/embed.html
This script is included in geohack, so it's no wonder that a lot of
requests come with this referrer from different IPs. But if a lot of
requests come from a single IP, then perhaps a kid wants to create
the next big search engine after Google, so it's correct to throttle
them. Perhaps we should also check our robots.txt.
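
To tell the two cases apart, one could count the throttled requests per
client IP. A minimal Python sketch, assuming one nginx "limiting requests"
entry per line in the format of the log lines quoted below; the function
name and the min_hits threshold are my own illustration, not anything
configured on the Toolserver:

```python
import re
from collections import Counter

# Extracts the client address from an nginx "limiting requests" entry.
# Assumes IPv4 clients, as in the quoted log lines.
CLIENT_RE = re.compile(r"limiting requests.*?client: (\d{1,3}(?:\.\d{1,3}){3})")


def throttled_clients(lines, min_hits=2):
    """Return {ip: count} for clients throttled at least min_hits times.

    min_hits is a hypothetical threshold; pick it to match the traffic.
    """
    hits = Counter()
    for line in lines:
        match = CLIENT_RE.search(line)
        if match:  # skip lines that are not rate-limit entries
            hits[match.group(1)] += 1
    return {ip: count for ip, count in hits.items() if count >= min_hits}
```

Many different IPs with a few hits each would fit the geohack embed; a
single IP with a very high count would fit a scraper.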

Gpsies.com is a special case: as they have many users, they can have a
positive effect for OSM. They still use cached hillshading generated by
us, but no updates will be necessary there in the near future.
The tiles from the hikebike styles can change every minute, and it depends
on the user which delay is acceptable. For a mapper the hikebike map is the
only chance to see their own modifications to hiking routes, so a delay of
10 minutes is the maximum. For a normal user a delay of a week or a month
should be no problem.
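
Expressed as Cache-Control values, these delays could look roughly like
the Python sketch below. The audience labels and the function name are
hypothetical, I picked one week for the normal user, and how to tell a
mapper from a normal user is left open:

```python
# Maximum acceptable tile age in seconds, from the delays named above.
MAX_AGE_SECONDS = {
    "mapper": 10 * 60,           # 10 minutes
    "normal": 7 * 24 * 60 * 60,  # one week (a month would also be fine)
}


def cache_control_header(audience):
    """Build a Cache-Control header value for the given (hypothetical) audience."""
    max_age = MAX_AGE_SECONDS[audience]
    return f"public, max-age={max_age}"
```

For a mapper this gives "public, max-age=600"; for a normal user,
"public, max-age=604800".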

Greetings Tim alias Kolossos


On 04.10.2013 23:39, Marlen Caemmerer wrote:
> Hello,
> 
> Kai, thank you for your prompt and informative response.
> 
> On Thu, 3 Oct 2013, Kai Krueger wrote:
> 
>>
>> do you have any more details of which tile layers are getting hit? Is it
>> low or high zoom tiles? What referrers / user-agents do they come from? Is
>> it the tiles that get served through mod_tile, the hillshading tiles or
>> the tiles for the wiki mini atlas?
> 
> I can send you some example log lines of the throttled IPs:
> 
> 2013/10/04 08:11:30 [error] 28822#0: *53658093 limiting requests,
> excess: 55.240 by zone "hikebike", client: 213.73.96.44, server:
> toolserver.org, request: "GET /tiles/hikebike/15/17169/11177.png
> HTTP/1.1", host: "toolserver.org", referrer:
> "http://www.gpsies.com/map.do?fileId=gbcojbhrdfqlglbc";
> 
> 2013/10/04 08:11:02 [error] 28822#0: *53650597 limiting requests,
> excess: 55.120 by zone "hikebike", client: 85.0.37.63, server:
> toolserver.org, request: "GET /tiles/osm/3/4/0.png HTTP/1.1", host:
> "c.www.toolserver.org", referrer:
> "http://toolserver.org/~kolossos/openlayers/embed.html?layer=mapnik&bbox=39.68865498340449,43.55243395214844,39.778011016595514,43.614232047851566&marker=43.583333,39.733333
> 
> If you want to get more logs I'd send them to you in private.
> 
> It seems to relate especially to the hikebike/cmarq tools.
> 
>>
>> Too high load from individual clients has been an issue on many other
>> tileservers as well. Mostly it comes from various mobile apps, that
>> offer their users to download large areas (e.g. Germany) for offline
>> use. These areas then cover potentially millions of tiles, that the
>> clients then try and download as fast as the connection allows.
> 
> Sounds plausible.
> 
>>
>> For that reason, the tileservers on osm.org have a significant list of
>> user-agents that they block completely and in addition they also have an
>> automatic rate limiting per IP. There is also a specific tile usage
>> policy ( https://wiki.openstreetmap.org/wiki/Tile_usage_policy ) that
>> governs how you are allowed to technically access the tile servers
>> (once you have it downloaded, the use is freely governed by the
>> CC-BY-SA licence)
>>
>> Other tileservers like the opencyclemap, equally have restrictions and
>> mod_tile (the apache module used to deliver tiles) has a number of
>> features available to limit traffic. mod_tile also has a complex system
>> to try and ensure maximum cachability of tiles while still ensuring
>> up-to-dateness. This system can furthermore be tuned either towards
>> freshness or cacheability as needed.
>>
>> My impression was so far this has never been an issue with the
>> toolserver and I wasn't aware of any explicit policies of how the
>> toolserver tiles are allowed to be accessed, so I never activated any of
>> the limiting features. But if it is becoming an issue we can see how
>> best to combat the issue.
> 
> gpsies.com stated that they use the cache-control header, which, as far
> as I could see, is sometimes not set reasonably - I had a look at these
> hikebike URLs delivered from cmarq.
> They will look at the problem more closely on their side, so I expect
> some more details in the next few days.
> 
> I could set cache control headers in the nginx which acts as load
> balancer for TS for tiles where it makes sense. Do you have any advice
> on this? Which tiles don't change, and for roughly how long?
> 
> 
>>
>> At least on the munin graphs for ptolemy, I don't see much increased
>> load. But if it is the hillshading tiles, or the WMA tiles, those don't
>> get served through ptolemy as far as I am aware.
> 
> It seems these tiles are delivered via ortelius/wolfsbane. Unfortunately,
> the high load times lead to munin not graphing anymore, so I don't have
> any actual data except the error logs/loads via console.
> 
>>>
>>> I don't want to have this option configured forever - I rather hope we
>>> can do something about caching or give the pictures they need to the
>>> projects themselves (I doubt we have to deliver hill shading pictures
>>> for everyone - this is Toolserver)
>>>
> 
> 
> 
> Cheers
>     nosy
> 
> _______________________________________________
> Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
> https://lists.wikimedia.org/mailman/listinfo/toolserver-l
> Posting guidelines for this list:
> https://wiki.toolserver.org/view/Mailing_list_etiquette


