Hello,

the second line in the log had the referrer:
http://toolserver.org/~kolossos/openlayers/embed.html
This script is included inside geohack, so it's no wonder that a lot of requests arrive with this referrer from different IPs. But if a lot of requests come from only one IP, then perhaps a kid wants to create the next big search engine after Google, so it's correct to throttle them. Perhaps we should also check our robots.txt.
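To tell the two cases apart (one referrer shared by many IPs versus one IP producing most of the hits), something like the following could be run over the nginx error log. It is only a rough sketch: the regular expression is written against the two log lines quoted further down in this thread, and the names `LINE_RE`, `clients_per_referrer` and `SAMPLE` are made up for illustration.

```python
import re
from collections import Counter, defaultdict

# Matches nginx limit_req error lines of the form quoted in this thread.
# The referrer is cut at the first '?' so that all embed.html calls group together.
LINE_RE = re.compile(
    r'limiting requests, excess: [\d.]+ by zone "[^"]+", '
    r'client: (?P<client>[\d.]+), .*?referrer: "(?P<referrer>[^"?]+)'
)

def clients_per_referrer(lines):
    """Return {referrer: Counter({client_ip: throttled_hits})}."""
    stats = defaultdict(Counter)
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not a limit_req line
        stats[m.group("referrer")][m.group("client")] += 1
    return stats

# The two example lines from the thread, re-joined onto one line each.
SAMPLE = [
    ('2013/10/04 08:11:30 [error] 28822#0: *53658093 limiting requests, '
     'excess: 55.240 by zone "hikebike", client: 213.73.96.44, '
     'server: toolserver.org, request: "GET /tiles/hikebike/15/17169/11177.png '
     'HTTP/1.1", host: "toolserver.org", '
     'referrer: "http://www.gpsies.com/map.do?fileId=gbcojbhrdfqlglbc"'),
    ('2013/10/04 08:11:02 [error] 28822#0: *53650597 limiting requests, '
     'excess: 55.120 by zone "hikebike", client: 85.0.37.63, '
     'server: toolserver.org, request: "GET /tiles/osm/3/4/0.png HTTP/1.1", '
     'host: "c.www.toolserver.org", '
     'referrer: "http://toolserver.org/~kolossos/openlayers/embed.html?layer=mapnik'),
]

if __name__ == "__main__":
    for referrer, clients in clients_per_referrer(SAMPLE).items():
        ip, hits = clients.most_common(1)[0]
        print(referrer, "distinct IPs:", len(clients), "top IP:", ip, "hits:", hits)
```

A referrer with many distinct IPs and a flat distribution would point to normal geohack use; a referrer where one IP holds nearly all hits would point to a crawler.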
Gpsies.com is a special topic: as they have many users, they can have a positive effect for OSM. They still use cached hillshading generated by us, but no updates will be necessary there in the near future. The tiles of the hikebike styles can change every minute, and which delay is acceptable depends on the user. For a mapper, the hikebike map is the only chance to see their own modifications to hiking routes, so a delay of 10 minutes is the maximum. For a normal user, a delay of a week or a month should be no problem.

Greetings
Tim alias Kolossos

On 04.10.2013 23:39, Marlen Caemmerer wrote:
> Hello,
>
> Kai, thank you for your prompt and informative response.
>
> On Thu, 3 Oct 2013, Kai Krueger wrote:
>
>> do you have any more details of which tile layers are getting hit? Is it
>> low or high zoom tiles? What referrers / user-agents do they come from? Is
>> it the tiles that get served through mod_tile, the hillshading tiles or
>> the tiles for the wiki mini atlas?
>
> I can send you some example log lines of the throttled IPs:
>
> 2013/10/04 08:11:30 [error] 28822#0: *53658093 limiting requests, excess: 55.240 by zone "hikebike", client: 213.73.96.44, server: toolserver.org, request: "GET /tiles/hikebike/15/17169/11177.png HTTP/1.1", host: "toolserver.org", referrer: "http://www.gpsies.com/map.do?fileId=gbcojbhrdfqlglbc"
>
> 2013/10/04 08:11:02 [error] 28822#0: *53650597 limiting requests, excess: 55.120 by zone "hikebike", client: 85.0.37.63, server: toolserver.org, request: "GET /tiles/osm/3/4/0.png HTTP/1.1", host: "c.www.toolserver.org", referrer: "http://toolserver.org/~kolossos/openlayers/embed.html?layer=mapnik&bbox=39.68865498340449,43.55243395214844,39.778011016595514,43.614232047851566&marker=43.583333,39.733333
>
> If you want more logs, I can send them to you in private.
>
> It seems to relate especially to the hikebike/cmarq tools.
>
>> Too high load from individual clients has been an issue on many other
>> tileservers as well.
>> Mostly it comes from various mobile apps that
>> offer their users to download large areas (e.g. Germany) for offline
>> use. These areas then cover potentially millions of tiles, which the
>> clients then try to download as fast as the connection allows.
>
> Sounds plausible.
>
>> For that reason, the tileservers on osm.org have a significant list of
>> user-agents that they block completely, and in addition they also have
>> automatic rate limiting per IP. There is also a specific tile usage
>> policy ( https://wiki.openstreetmap.org/wiki/Tile_usage_policy ) that
>> governs how you are technically allowed to access the tile servers
>> (once you have downloaded a tile, its use is governed by the
>> CC-BY-SA licence).
>>
>> Other tileservers, like the opencyclemap, equally have restrictions, and
>> mod_tile (the apache module used to deliver tiles) has a number of
>> features available to limit traffic. mod_tile also has a complex system
>> to try and ensure maximum cacheability of tiles while still ensuring
>> up-to-dateness. This system can furthermore be tuned either towards
>> freshness or cacheability as needed.
>>
>> My impression so far was that this has never been an issue with the
>> toolserver, and I wasn't aware of any explicit policies on how the
>> toolserver tiles are allowed to be accessed, so I never activated any of
>> the limiting features. But if it is becoming an issue, we can see how
>> best to combat it.
>
> gpsies.com stated they use the Cache-Control header, which, as far as I
> could see, is sometimes not set to a reasonable value. I had a look at
> these hikebike URLs delivered from cmarq.
> They will look at the problem more closely on their side, so I expect some
> more details in the next days.
>
> I could set Cache-Control headers in the nginx which acts as load
> balancer for TS, for tiles where it makes sense. Do you have any advice
> on this? Which tiles don't change, and for roughly how long?
>
>> At least on the munin graphs for ptolemy, I don't see much increased
>> load. But if it is the hillshading tiles, or the WMA tiles, those don't
>> get served through ptolemy as far as I am aware.
>
> It seems these tiles are delivered via ortelius/wolfsbane. Unfortunately,
> during the high-load times munin stopped graphing, so I don't have
> any actual data apart from the error logs and the loads seen via console.
>
>>> I don't want to have this option configured forever - I rather hope we
>>> can do something about caching or give the pictures they need to the
>>> projects themselves (I doubt we have to deliver hill shading pictures
>>> for everyone - this is Toolserver)
>
> Cheers
> nosy
>
> _______________________________________________
> Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
> https://lists.wikimedia.org/mailman/listinfo/toolserver-l
> Posting guidelines for this list:
> https://wiki.toolserver.org/view/Mailing_list_etiquette