Hi Oleg,
On Dec 3, 2009, at 2:40am, Oleg Kalnichevski wrote:
On Wed, 2009-12-02 at 19:15 -0800, Ken Krugler wrote:
Below is an email from August 7th, which I'm reviving due to this
becoming a bigger issue over in Bixo-land.
I've continued to run into this issue with my crawls, but so far I'm
not doing anything with cookies, so it hasn't been a priority to track
down.
However, another Bixo user also ran into it, and he noticed that by
switching back to HttpClient 4.0-beta3, the warnings went away.
I believe he just opened HTTPCLIENT-896 as a clone of HTTPCLIENT-773,
which seemed to be this exact same bug (fixed by Oleg around
17/May/08).
I'm wondering if the bug crept back into the code sometime between
then and the final release.
Thanks,
-- Ken
Hi Ken
The cookie in question violates the format of the 'expires' attribute
expected by the Netscape policy. One can configure the policy to be more
lenient about the format of the 'expires' attribute by using a special
HTTP parameter. For details see HTTPCLIENT-896.
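To illustrate what "more lenient about the format of 'expires'" amounts
to, here is a minimal sketch in plain Java that tries a list of date
patterns in order. It does not call HttpClient at all. The class name,
the method name, and both pattern lists are assumptions made for this
illustration. In HttpClient 4.0 the corresponding parameter is, as far
as I know, CookieSpecPNames.DATE_PATTERNS, but the authoritative example
is the one in HTTPCLIENT-896.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Date;
import java.util.List;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of HttpClient.
class LenientExpiresParser {

    // Assumed strict form: hyphen-separated date, as in the Netscape draft.
    static final List<String> STRICT =
            Arrays.asList("EEE, dd-MMM-yyyy HH:mm:ss z");

    // Assumed lenient set: the strict form plus a space-separated variant.
    static final List<String> LENIENT = Arrays.asList(
            "EEE, dd-MMM-yyyy HH:mm:ss z",
            "EEE, dd MMM yyyy HH:mm:ss z");

    // Returns the parsed date, or null if no pattern matches the value.
    static Date parseExpires(String value, List<String> patterns) {
        for (String pattern : patterns) {
            SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.US);
            format.setTimeZone(TimeZone.getTimeZone("GMT"));
            format.setLenient(false);
            try {
                return format.parse(value);
            } catch (ParseException e) {
                // This pattern did not match; try the next one.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String spaced = "Thu, 03 Dec 2009 10:40:00 GMT";
        System.out.println("strict:  " + parseExpires(spaced, STRICT));
        System.out.println("lenient: " + parseExpires(spaced, LENIENT));
    }
}
```

With the strict list the space-separated value is rejected (null), which
corresponds to the warning seen during crawls; with the lenient list the
same value parses.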
It is not really a regression. I think the Netscape cookie policy was
made stricter at some point after 4.0-beta1.
Hope this clarifies the situation.
Thanks for the clarification, and the example code you added in a
comment to HTTPCLIENT-896.
Given the number of invalid cookies w/this issue that I see during a
crawl, would it make sense for the "best match" policy to select a
more lenient Netscape format?
Or maybe add a "best match-lenient" policy that does this?
I haven't had to do much in the way of cookie processing in the past,
so I'll confess up front that I'm ignorant about the potential issues
that could arise from using a more lenient policy.
Thanks again,
-- Ken
--------------------------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c w e b m i n i n g