User-Agent string violates RFC
------------------------------
Key: HTTPCLIENT-655
URL: https://issues.apache.org/jira/browse/HTTPCLIENT-655
Project: HttpComponents HttpClient
Issue Type: Bug
Components: HttpClient
Affects Versions: 3.1 RC1
Reporter: Ortwin Glück
Priority: Minor
Our User-Agent header says "Jakarta Commons-HttpClient/3.1-rc1". According to
RFC 2616, section 14.43, however, whitespace separates individual *products*
and comments, so "Jakarta" is read as a product of its own, which it is not.
At the same time we may want to drop the Jakarta name altogether.
We should change this to something more standard like:
"Apache-HttpClient/3.1-rc1 ("+ System.getProperty("os.name") +" "+
System.getProperty("os.version") +";"+
System.getProperty("os.arch") +") "+
"Java/"+ System.getProperty("java.version") +" ("+
System.getProperty("java.vm.vendor") +")"
which renders:
"Apache-HttpClient/3.1-rc1 (Windows XP 5.1;x86) Java/1.5.0_08 (Sun Microsystems
Inc.)"
Sun's internal HTTP client sends something like "Java/1.5.0_08".
I am completely ignoring the fact that real-world user agents use almost
arbitrary strings.
Some fine examples of misbehaviour from my private logs:
"Jakmpqes dihurxf wfyiupsc" -- apparently somebody has to hide something...
"Missigua Locator 1.9"
"Poodle predictor 1.0"
"shelob v1.0"
"ISC Systems iRc Search 2.1"
"ping.blogug.ch aggregator 1.0"
"http://www.uni-koblenz.de/~flocke/robot-info.txt" -- ...sigh
I am very tempted to write a User-Agent string validator that prevents misuse
of this field in HttpClient.
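
A minimal sketch of what such a validator could look like, assuming it only
checks the RFC 2616 grammar (a sequence of product tokens, each optionally
followed by "/" and a version token, and parenthesised comments). The class
name UserAgentSyntaxCheck and the method isValid are hypothetical and not part
of HttpClient.

```java
// Hypothetical helper, not an HttpClient API: syntax-only check of a
// User-Agent value against the RFC 2616 product/comment grammar.
final class UserAgentSyntaxCheck {

    // Separators from RFC 2616, section 2.2; none of them may appear in a token.
    private static final String SEPARATORS = "()<>@,;:\\\"/[]?={} \t";

    private UserAgentSyntaxCheck() {
    }

    private static boolean isTokenChar(char c) {
        // Printable US-ASCII that is not a separator.
        return c > 31 && c < 127 && SEPARATORS.indexOf(c) < 0;
    }

    // Returns the index just past the token starting at pos (pos itself if none).
    private static int skipToken(String s, int pos) {
        int i = pos;
        while (i < s.length() && isTokenChar(s.charAt(i))) {
            i++;
        }
        return i;
    }

    // s.charAt(pos) must be '('. Returns the index just past the matching ')',
    // or -1 if the comment is unbalanced or contains control characters.
    private static int skipComment(String s, int pos) {
        int depth = 0;
        int i = pos;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\') {
                // quoted-pair: a backslash escapes the following character
                if (i + 1 >= s.length()) {
                    return -1;
                }
                i += 2;
                continue;
            }
            if (c == '(') {
                depth++;          // comments may nest
            } else if (c == ')') {
                depth--;
                if (depth == 0) {
                    return i + 1;
                }
            } else if ((c < 32 && c != '\t') || c == 127) {
                return -1;        // control characters are not allowed
            }
            i++;
        }
        return -1;                // ran out of input before the closing ')'
    }

    static boolean isValid(String value) {
        if (value == null) {
            return false;
        }
        int n = value.length();
        int i = 0;
        int elements = 0;
        while (true) {
            // Skip the whitespace between products and comments.
            while (i < n && (value.charAt(i) == ' ' || value.charAt(i) == '\t')) {
                i++;
            }
            if (i >= n) {
                break;
            }
            if (value.charAt(i) == '(') {
                i = skipComment(value, i);
                if (i < 0) {
                    return false;
                }
            } else {
                int end = skipToken(value, i);
                if (end == i) {
                    return false; // a separator where a product should start
                }
                i = end;
                if (i < n && value.charAt(i) == '/') {
                    end = skipToken(value, i + 1);
                    if (end == i + 1) {
                        return false; // "/" without a version token
                    }
                    i = end;
                }
            }
            elements++;
        }
        return elements > 0;      // at least one product or comment is required
    }
}
```

With this sketch, the proposed "Apache-HttpClient/3.1-rc1 (...) Java/1.5.0_08
(...)" string passes and the bare URL from the list above is rejected because
":" is a separator. Note that a grammar check alone cannot catch everything:
"Jakarta Commons-HttpClient/3.1-rc1" and "Missigua Locator 1.9" still parse as
sequences of version-less products, so they are accepted here.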