Based on Jason's question...

On Tue, May 17, 2016 at 1:31 PM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
> On Tue, May 17, 2016 at 1:00 PM, Julian Reschke <julian.resc...@gmx.de>
> wrote:
>
>> On 2016-05-17 19:01, Graham Leggett wrote:
>>
>>> On 17 May 2016, at 6:43 PM, William A Rowe Jr <wr...@rowe-clan.net>
>>> wrote:
>>>
>>>> Wondering what other contributors are thinking on this topic.
>>>>
>>>> We have a number of changes in the ABNF grammar between
>>>> RFC2616 and RFC7230..7235. Do we want trunk 2.6/3.0 to be
>>>> an entirely RFC723x generation server, and drop all support for
>>>> RFC2616?
>>>>
>>>> Do we want to backport these changes to 2.4.x? If so, what
>>>> mechanism do we want to toggle the behavior of the server
>>>> between 2616 and 7230..7235?
>>>>
>>>> We can presume a small performance hit in any conditional
>>>> operation, especially when those decisions apply to a tight parsing
>>>> loop. Toggling between two different parser implementations would
>>>> probably be a bit more efficient than conditionals within a parser
>>>> itself.
>>>>
>>>
>>> Can you give some examples to get a sense of the extent of this?
>>>
>> +1 to the question; I'd like to see examples as well...
>>
>> I believe we only changed the ABNF when we came to the conclusion that
>> the old one was incorrect, or did not reflect what implementations do in
>> practice.
>>
>
> One of the more significant is the change to token,
> https://tools.ietf.org/html/rfc2616#section-2.2
>
>     token      = 1*<any CHAR except CTLs or separators>
>
>     separators = "(" | ")" | "<" | ">" | "@"
>                | "," | ";" | ":" | "\" | <">
>                | "/" | "[" | "]" | "?" | "="
>                | "{" | "}" | SP | HT

(Note that HT is a CTL, right, so it appears to be doubly excluded, no?)
CHAR is US-ASCII 0-127.

> vs https://tools.ietf.org/html/rfc7230#section-3.2.6
>
>     token = 1*tchar
>
>     tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*"
>           / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
>           / DIGIT / ALPHA
>           ; any VCHAR, except delimiters
>
> "Delimiters are chosen from the set of US-ASCII visual characters not
> allowed in a token (DQUOTE and "(),/:;<=>?@[\]{}")."

The characters missing above from tchar are

    '"', '(', ')', ',', '/', ':', ';', '<', '=', '>', '?', '@', '[', '\', ']', '{', '}'

which corresponds to this delimiter list, and to the RFC2616 separator
list minus SP and HT. VCHAR is US-ASCII 21-7E (RFC 5234, Appendix B.1),
which excludes both SP and HTAB, so neither can appear in a token under
either grammar.

So my concerns may have been unfounded, but reviewing the new spec against
the implementation still seems prudent. As I come across specifics we can
discuss those; sorry for my confusion.
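
For anyone who wants to check the comparison above mechanically, here is a minimal sketch in Python (not httpd code; all variable names are made up for illustration). It builds the RFC 2616 token set from CHAR, CTL and separators, builds the RFC 7230 tchar set from the ABNF quoted above, and compares the two. The VCHAR range is taken from RFC 5234.

```python
import string

# RFC 2616 section 2.2: token = 1*<any CHAR except CTLs or separators>
CHAR = set(range(0, 128))                 # US-ASCII octets 0-127
CTL = set(range(0, 32)) | {127}           # control characters; includes HT (9)
SEPARATORS = set(map(ord, '()<>@,;:\\"/[]?={} \t'))   # 17 visible chars + SP + HT
TOKEN_2616 = CHAR - CTL - SEPARATORS

# RFC 7230 section 3.2.6: token = 1*tchar
TCHAR = set(map(ord, "!#$%&'*+-.^_`|~" + string.digits + string.ascii_letters))

# RFC 5234: VCHAR = %x21-7E; RFC 7230 delimiters = DQUOTE and "(),/:;<=>?@[\]{}"
VCHAR = set(range(0x21, 0x7F))
DELIMITERS = set(map(ord, '"(),/:;<=>?@[\\]{}'))

# Octets allowed by one grammar but not the other (empty if they agree).
only_2616 = sorted(chr(c) for c in TOKEN_2616 - TCHAR)
only_7230 = sorted(chr(c) for c in TCHAR - TOKEN_2616)

if __name__ == "__main__":
    print("only in RFC 2616 token:", only_2616)
    print("only in RFC 7230 tchar:", only_7230)
    print("tchar == VCHAR minus delimiters:", TCHAR == VCHAR - DELIMITERS)
```

Both difference lists come out empty, so for token at least the two grammars accept the same 77 octets. This says nothing about the other ABNF changes between the two RFC generations.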