Ander Juaristi <[email protected]> writes:
> On 06/03/17 16:47, Dale R. Worley wrote:
>> Orange Tsai <[email protected]> writes:
>>> # This will work
>>> $ wget 'http://127.0.0.1%0d%0aCookie%3a hi%0a/'
>>
>> Not even considering the effect on headers, it's surprising that wget
>> doesn't produce an immediate error, since
>> "127.0.0.1%0d%0aCookie%3a hi%0a" is syntactically invalid as a host
>> part.  Why doesn't wget's URL parser detect that?
>
> Simply because it first splits the URL into several parts according to
> the delimiters, and then decodes the percent-encoding.
>
> Additionally for the host part it also checks whether it's an IP address
> and the IDNA stuff, but yeah you raise a good point. Other than that the
> host part is treated similarly to the other parts.

Ah, I looked into RFC 3986, and the generic syntax *does* allow the host
part to contain %-escapes.  But in any case,
"127.0.0.1<CR><LF>Cookie:<SPACE>hi<LF>" is not parsable as an IPv4
address.  (Always beware of parsing functions that stop when they see
the first invalid character!)

(Also, shouldn't the above example have ended "hi%0d%0a/"?)

Dale
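
To illustrate the point about parsers that stop at the first invalid character, here is a minimal sketch in Python (not wget's C, and not wget's actual code). The names `lenient_is_ipv4` and `strict_ipv4_host` are hypothetical, and the regex merely stands in for any parser that accepts a valid prefix and ignores what follows. The strict version decodes the percent-escapes first and then requires the whole string to be an IPv4 address.

```python
import ipaddress
import re
from urllib.parse import unquote

# Hypothetical stand-in for a parser that stops at the first invalid
# character: it only checks that the string STARTS with a dotted quad.
_PREFIX_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def lenient_is_ipv4(decoded_host):
    """Return True if the string merely begins with something IPv4-like."""
    return _PREFIX_RE.match(decoded_host) is not None


def strict_ipv4_host(raw_host):
    """Percent-decode raw_host and accept it only if ALL of it is an IPv4 address.

    Raises ValueError for anything else, including embedded CR/LF.
    """
    decoded = unquote(raw_host)
    # Control characters never belong in a host; reject them explicitly
    # so the error message names the real problem.
    if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in decoded):
        raise ValueError("control character in host: %r" % decoded)
    # IPv4Address validates the entire string, not just a prefix.
    return ipaddress.IPv4Address(decoded)


if __name__ == "__main__":
    bad = "127.0.0.1%0d%0aCookie%3a hi%0a"
    print(lenient_is_ipv4(unquote(bad)))  # the prefix check is fooled
    try:
        strict_ipv4_host(bad)
    except ValueError as exc:
        print("rejected:", exc)
```

The design choice here is to validate after decoding and against the full string, so that trailing bytes such as CR/LF cannot ride along behind a valid-looking address.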
