Hi,

On Mon, Jul 8, 2013 at 6:25 PM, eberhard speer jr. <[email protected]> wrote:
> ...- - I'm sure, particularly Adobe and The Weather Channel, as well as
> others, must see bazillions of user-agents in their web-logs every
> day. Would it be possible to ask your respective web-ops people to
> make available a weekly or monthly list of just the user-agent strings ?...

I can try for Adobe, but see my comments below; they might also apply
in this case.

> ...- - Bertrand : the ASF itself must also collect massive amounts of
> user-agent strings in their logs. Is there someone, maybe in
> Infrastructure, we can contact with our request ?...

I asked a while ago. It was just an informal conversation, but two
obstacles were mentioned:

a) In some countries (Germany IIRC), User-Agent is considered private
information, so publishing it without the owner's consent would be
problematic.

b) For some machine interactions (svn clients IIRC), the User-Agent
contains data that discloses more information than desired if
published openly.

So, blindly grabbing all user-agent values from apache.org websites is
probably not possible unless we can come up with a process that allows
us to filter for "public" user-agents and ignore others.

I imagine this project's PMC members could be trusted with the full
apache.org logs, if we have a reliable way to filter them.

-Bertrand
