I started to look at using HttpParser for the Cookie header, but there are some 
differences in the way it works compared to the existing parser in Cookies, so 
I wanted to check direction before getting too far in.

The area I’m concerned about is the need to copy the bytes in order to parse 
the header. The Cookies parser relies heavily on MessageBytes and avoids 
copying to a String as far as possible. HttpParser, however, operates on a 
StringReader which requires converting to a String before parsing.

After digging into the usage of Cookies I think there are only two places that 
read them:
1) Request#getCookies(), which needs to copy to Strings anyway in order to 
create the Cookie instances it returns
2) CoyoteAdapter#parseSessionCookiesId(), which parses the header and compares 
names as MessageBytes, only allocating a String for the value if the session 
cookie is found
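
To make the second case concrete, here is a rough sketch of the kind of scan 
I mean. It is not the actual Cookies or CoyoteAdapter code: the class and 
method names are made up for illustration, and it ignores quoted values and 
other details. It only shows names being compared as raw bytes, with a String 
allocated only when the requested cookie is found.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not Tomcat's Cookies/MessageBytes API.
class SessionCookieScan {

    // Scans a raw Cookie header for the named cookie. Names are compared byte
    // by byte; a String is created only for the value of a matching cookie.
    static String findCookieValue(byte[] header, byte[] name) {
        int pos = 0;
        while (pos < header.length) {
            // Skip separators and spaces between cookie pairs.
            while (pos < header.length && (header[pos] == ' ' || header[pos] == ';')) {
                pos++;
            }
            int nameStart = pos;
            while (pos < header.length && header[pos] != '=' && header[pos] != ';') {
                pos++;
            }
            int nameEnd = pos;
            int valueStart = pos;
            int valueEnd = pos;
            if (pos < header.length && header[pos] == '=') {
                valueStart = ++pos;
                while (pos < header.length && header[pos] != ';') {
                    pos++;
                }
                valueEnd = pos;
            }
            if (regionEquals(header, nameStart, nameEnd, name)) {
                // The only allocation: the value of the cookie we were asked for.
                return new String(header, valueStart, valueEnd - valueStart,
                        StandardCharsets.ISO_8859_1);
            }
        }
        return null;
    }

    private static boolean regionEquals(byte[] data, int start, int end, byte[] expected) {
        if (end - start != expected.length) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            if (data[start + i] != expected[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Calling findCookieValue() with the bytes of "a=1; JSESSIONID=abc123; b=2" and 
the bytes of "JSESSIONID" creates a single String for the matching value and 
nothing for the other two cookies.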

It’s this second one that has me concerned about switching to HttpParser as 
this gets called for every request. If we switch then there is going to be 
allocation and copying of the header that we currently don’t do. 

Having said that, the current parser relies heavily on the assumption that the 
header is US-ASCII encoded and that it is only dealing with 7-bit characters 
(it freely casts bytes to chars). The cookie change proposal has us supporting 
UTF-8 as specified by HTML5, which means a more robust decoder will be needed 
and the copy may not be avoidable.
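
As a small illustration of why the cast is only safe for 7-bit input, the 
sketch below compares a byte-per-char conversion with a real UTF-8 decode. The 
class and method names are hypothetical and this is not the existing parser 
code, just a demonstration of the difference.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the existing Cookies parser.
class CastVersusDecode {

    // Treats every byte as exactly one char. Correct for 7-bit US-ASCII, but a
    // multi-byte UTF-8 sequence comes out as several unrelated chars. (A bare
    // (char) cast without the mask would also sign-extend bytes >= 0x80.)
    static String castBytes(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    // Decodes the bytes as UTF-8, which requires producing a new String.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

For a pure ASCII header both methods give the same result. For the UTF-8 
bytes of "name=caf\u00e9" the cast yields 10 chars (one per byte) while the 
decode yields the original 9.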

My plan here is to KISS and implement a parser similar to the others in 
HttpParser assuming the header has already been decoded so it can just deal 
with the chars. Then if we notice any performance degradation we can focus on 
improving HttpParser which will have the benefit of working for the other 
header parsers as well. I’ll implement this alongside the existing code 
(actually, in the parser package) to make it easier to do an A-B comparison.
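
To give an idea of the shape I have in mind, here is a deliberately naive 
sketch of a char-based parse over a StringReader. The class and method names 
are placeholders rather than anything that exists in HttpParser, and it does 
no validation and no quoted-value handling; it only shows the parser working 
on already-decoded chars.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not an existing HttpParser method.
class CookieHeaderSketch {

    // Parses "name=value; name2=value2" from an already-decoded header.
    static Map<String, String> parse(String header) {
        Map<String, String> cookies = new LinkedHashMap<>();
        StringReader in = new StringReader(header);
        StringBuilder name = new StringBuilder();
        StringBuilder value = new StringBuilder();
        boolean inValue = false;
        try {
            int c;
            while ((c = in.read()) != -1) {
                if (c == ';') {
                    // End of one cookie pair: record it and reset the buffers.
                    store(cookies, name, value);
                    name.setLength(0);
                    value.setLength(0);
                    inValue = false;
                } else if (c == '=' && !inValue) {
                    // The first '=' separates name from value; later ones
                    // are kept as part of the value.
                    inValue = true;
                } else if (inValue) {
                    value.append((char) c);
                } else {
                    name.append((char) c);
                }
            }
        } catch (IOException e) {
            // StringReader only throws if it has been closed.
            throw new UncheckedIOException(e);
        }
        store(cookies, name, value);
        return cookies;
    }

    private static void store(Map<String, String> cookies, StringBuilder name,
            StringBuilder value) {
        String n = name.toString().trim();
        if (!n.isEmpty()) {
            cookies.put(n, value.toString().trim());
        }
    }
}
```

Every name and value here ends up as a new String, which is exactly the 
allocation the current MessageBytes approach avoids, so this is the part an 
A-B comparison would need to measure.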

There would likely be some follow-on changes from this. Cookies and 
ServerCookie are recyclable objects associated with the request. By moving away 
from MessageBytes, these could be replaced by basic String values and may not be 
needed at all. For example, Request already caches the array of Cookie values 
returned from getCookies(), and that could now be populated directly from the 
parse. These classes may end up going away.

Thanks
Jeremy
