I started to look at using HttpParser for the Cookie header, but there are some differences in the way it works compared to the existing parser in Cookies, so I wanted to check direction before getting too far in.
The area I’m concerned about is the need to copy the bytes in order to parse the header. The Cookies parser relies heavily on MessageBytes and avoids copying to a String as far as possible. HttpParser, however, operates on a StringReader, which requires converting the header to a String before parsing.

After digging into the usage of Cookies, I think there are only two places that read them:

1) Request#getCookies(), which needs to copy to Strings anyway in order to create the Cookie instances it returns
2) CoyoteAdapter#parseSessionCookiesId(), which parses the header and compares names as MessageBytes, only allocating a String for the value if the session cookie is found

It’s the second one that has me concerned about switching to HttpParser, as it gets called for every request. If we switch, there is going to be allocation and copying of the header that we currently don’t do.

Having said that, the current parser relies heavily on the assumption that the header is US-ASCII encoded and that it is only dealing with 7-bit characters (it freely casts bytes to chars). The cookie change proposal has us supporting UTF-8 as specified by HTML5, which means a more robust decoder will be needed and the copy may not be avoidable.

My plan here is to KISS and implement a parser similar to the others in HttpParser, assuming the header has already been decoded so it can just deal with the chars. Then, if we notice any performance degradation, we can focus on improving HttpParser, which will have the benefit of working for the other header parsers as well. I’ll implement this alongside the existing code (actually, in the parser package) to make it easier to do an A-B comparison.

There would likely be some follow-on changes. Cookies and ServerCookie are recyclable objects associated with the request. By moving away from MessageBytes, these could be replaced by basic String values and may not be needed at all. For example, Request already caches the array of Cookie values returned from getCookies(), and that could now be populated directly from the parse. These classes may end up going away.

Thanks
Jeremy
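
To illustrate the direction, here is a minimal sketch of a parse that works on chars over an already-decoded header. It is not Tomcat code: the class and method names (CookieHeaderSketch, parse, sessionId) are hypothetical, it assumes the header has been decoded to a String beforehand, and it skips the validation a real parser would need (for example, quoted values containing ';' are not handled, and the last duplicate name wins).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of Tomcat.
class CookieHeaderSketch {

    /** Parses an already-decoded Cookie header value into name/value pairs. */
    static Map<String, String> parse(String header) {
        Map<String, String> cookies = new LinkedHashMap<>();
        int pos = 0;
        int len = header.length();
        while (pos < len) {
            // Each cookie pair ends at the next ';' or at the end of the header.
            int end = header.indexOf(';', pos);
            if (end < 0) {
                end = len;
            }
            int eq = header.indexOf('=', pos);
            if (eq >= 0 && eq < end) {
                String name = header.substring(pos, eq).trim();
                String value = header.substring(eq + 1, end).trim();
                // Strip one pair of surrounding double quotes, if present.
                if (value.length() >= 2 && value.charAt(0) == '"'
                        && value.charAt(value.length() - 1) == '"') {
                    value = value.substring(1, value.length() - 1);
                }
                if (!name.isEmpty()) {
                    cookies.put(name, value);
                }
            }
            // Pairs without '=' are simply skipped in this sketch.
            pos = end + 1;
        }
        return cookies;
    }

    /** Returns the value of the named cookie, or null if it is absent. */
    static String sessionId(String header, String sessionCookieName) {
        return parse(header).get(sessionCookieName);
    }
}
```

Note that sessionId() here allocates a String for every cookie in the header, which is exactly the per-request cost discussed above; the existing MessageBytes-based code only allocates when the session cookie is found.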