Way back when, in the harvester guidelines, we suggested using both the
User-Agent header (along the lines proposed, but without the URI suggestion)
and the From header (for email contact):
http://www.openarchives.org/OAI/2.0/guidelines-harvester.htm#AgentInfo
So, a long-winded +1 to a URI being a useful thing to put in the
User-Agent string.
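For anyone who wants a concrete starting point, here is a minimal sketch in
Python of a request that carries both headers. The project name, URLs, and
email address are all made up for illustration, and the helper name
build_harvest_request is hypothetical; substitute your own details.

```python
from urllib.request import Request

# All values below are placeholders (assumptions), not a real harvester.
USER_AGENT = "ExampleHarvester/1.0 (+https://harvester.example.org/about)"
CONTACT_EMAIL = "harvester-admin@example.org"


def build_harvest_request(base_url, verb="Identify"):
    """Build an OAI-PMH request that identifies the harvester.

    User-Agent names the project and includes a URL describing it;
    From gives an email address for contact.
    """
    headers = {
        "User-Agent": USER_AGENT,
        "From": CONTACT_EMAIL,
    }
    return Request(f"{base_url}?verb={verb}", headers=headers)


# Hypothetical repository base URL; pass the result to urlopen() to send it.
req = build_harvest_request("https://repository.example.org/oai")
```

The "+URL" form inside parentheses just follows the convention the search
engine crawlers use in the strings quoted below; any unambiguous URL in the
string would serve the same purpose.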
Cheers,
Simeon
On 11/17/14, 3:50 AM, Stuart A. Yeates wrote:
I've been looking at the logs for our OAI server and I'd like to appeal to
those harvesting over OAI to put URLs into the user agent string. Putting
the name of your project into the user agent string seems like a great way
to build your profile. It also avoids the situation where the easiest way to
contact you is via the contacts associated with your DNS block.
For reference, these are some of the user agent strings I'm seeing
(standard browser strings removed):
"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
"Net::OAI::Harvester"
"Googlebot/2.1 (+http://www.google.com/bot.html)"
"OAIHarvester/2.0"
"Jakarta Commons-HttpClient/3.1"
"Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
"Celestial/3.02"
"WorldCat Digital Collection Gateway from OCLC.org"
"Apache-HttpClient/4.0.1 (java 1.5)"
"lwp-trivial/1.41"
"OAIGet-1.12"
"DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)"
"Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)"
"yacybot (freeworld/global; amd64 Linux 3.2.0-36-generic; java 1.6.0_27;
"OAIHarvesterObj 31 University of Illinois Library"
"PKPHarvester/2.x"
"OAI Harvester/1.0; FS Consulting, Inc."
"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
"Typhoeus - https://github.com/typhoeus/typhoeus"
....
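For server operators who want to do the same check on their own logs, the
following is a rough sketch in Python of picking out which user agent strings
carry a URL. The regular expression and the function name contact_url are
assumptions for illustration, not part of any OAI tool, and the sample strings
are taken from the list above.

```python
import re

# Simple heuristic: an http(s) URL running up to whitespace or a closing
# delimiter. This is an assumption, not a full URL grammar.
URL_PATTERN = re.compile(r"https?://[^\s;)\"]+")


def contact_url(user_agent):
    """Return the first URL found in a user agent string, or None."""
    match = URL_PATTERN.search(user_agent)
    return match.group(0) if match else None


samples = [
    "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    "Net::OAI::Harvester",
    "Typhoeus - https://github.com/typhoeus/typhoeus",
    "Jakarta Commons-HttpClient/3.1",
]

for ua in samples:
    print(ua, "->", contact_url(ua))
```

Strings for which this returns None are the ones where the operator has no
direct way to find out who is harvesting.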
cheers
stuart