Way back when, in the harvester guidelines, we suggested using both the User-Agent header (along the lines proposed, but without the URI suggestion) and the From header (for email contact):

http://www.openarchives.org/OAI/2.0/guidelines-harvester.htm#AgentInfo

So, a long-winded +1 to a URI being a useful thing to put in the User-Agent string.
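
As a rough illustration of setting the two headers together, here is a minimal Python sketch using only the standard library's urllib.request. The harvester name, project URL, contact address, and repository base URL are all made-up placeholders, not anything taken from the guidelines:

```python
from urllib.request import Request

# Hypothetical values: substitute your own project name, URL, and address.
PROJECT = "ExampleHarvester/1.0"
PROJECT_URL = "http://harvester.example.org/about"
CONTACT = "harvester-admin@example.org"


def build_request(base_url, verb="Identify"):
    """Build (but do not send) an OAI-PMH request that identifies the harvester."""
    return Request(
        f"{base_url}?verb={verb}",
        headers={
            # Product token followed by a "+URL" comment, as the crawlers in
            # the list below do.
            "User-Agent": f"{PROJECT} (+{PROJECT_URL})",
            # Email contact for whoever runs the harvester.
            "From": CONTACT,
        },
    )


# Placeholder repository base URL, for illustration only.
req = build_request("http://repository.example.org/oai")
print(req.get_header("User-agent"))
print(req.get_header("From"))
```

Passing req to urllib.request.urlopen would send the request with both headers attached; the sketch stops short of that so it can be run without a live repository.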

Cheers,
Simeon


On 11/17/14, 3:50 AM, Stuart A. Yeates wrote:
I've been looking at the logs for our OAI server and I'd like to appeal to
those harvesting over OAI to put URLs into the user agent string. Putting
the name of your project into the user agent string seems like a great way
to build profile. It also avoids the situation where the easiest way to
contact you is via the contacts associated with your DNS block.

For reference, these are some of the user agent strings I'm seeing
(standard browser strings removed):

"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
"Net::OAI::Harvester"
"Googlebot/2.1 (+http://www.google.com/bot.html)"
"OAIHarvester/2.0"
"Jakarta Commons-HttpClient/3.1"
"Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
"Celestial/3.02"
"WorldCat Digital Collection Gateway from OCLC.org"
"Apache-HttpClient/4.0.1 (java 1.5)"
"lwp-trivial/1.41"
"OAIGet-1.12"
"DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)"
"Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)"
"yacybot (freeworld/global; amd64 Linux 3.2.0-36-generic; java 1.6.0_27;
"OAIHarvesterObj 31 University of Illinois Library"
"PKPHarvester/2.x"
"OAI Harvester/1.0; FS Consulting, Inc."
"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
"Typhoeus - https://github.com/typhoeus/typhoeus"
....
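
For anyone sorting through their own logs, the following is a small Python sketch of how one might separate the user agent strings that carry a contact URL from those that do not. The regular expression and the contact_url helper are illustrative assumptions rather than part of any OAI tool, and the sample strings are copied from the list above:

```python
import re

# Matches an http(s) URL up to whitespace, a double quote, or a closing
# parenthesis. This is a deliberately simple pattern, not a full URL parser.
URL_RE = re.compile(r'https?://[^\s")]+')


def contact_url(user_agent):
    """Return the first URL found in a user agent string, or None."""
    match = URL_RE.search(user_agent)
    return match.group(0) if match else None


samples = [
    "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    "Typhoeus - https://github.com/typhoeus/typhoeus",
    "OAIHarvester/2.0",
    "lwp-trivial/1.41",
]

for ua in samples:
    print(ua, "->", contact_url(ua) or "no URL")
```

Strings for which the helper returns None are the ones where the only remaining route to the operator is the contact for their DNS block.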

cheers
stuart
