Oops. Dont you hate it when you send an email while you are in the middle of typing it?
I guess I better continue on.. I'll repeat the last little bit I was saying... Which is where we get to the first major bug. evhttp_decode_uri assumes that you are decoding an entire query string which includes a '?', but we can see here that it is being used to decode only a portion of the query string, just a value part. if (c == '?') { > in_query = 1; > } else if (c == '+' && in_query) { > c = ' '; > In this code, if there is a '+' in the value, it will not be decoded to a space like it is supposed to be. Then there is another bug on the next line: > } else if (c == '%' && isxdigit((unsigned char)uri[i+1]) && > isxdigit((unsigned char)uri[i+2])) { > char tmp[] = { uri[i+1], uri[i+2], '\0' }; > Where we are not checking to make sure that 'i' is not already at the end of the supplied string. If using valgrind or other bounds checking tools, this would generate an exception if someone was to send a query string that happened to end with a '%'. This is rather minor, as it would be a little hard to exploit this. So now we get to the third bug... and the one that actually made me stop and try to figure out what was going on. I didn't find the previous mentioned bugs until I started really examining the code, but this one stopped my development in its tracks. After evhttp_parse_query has seperated the key and value (and decoded the value), it then tries to add them to a headers structure... which was designed to handle the http headers, and not really the supplied query parameters. A peice of code in evhttp_add_header actually does this: if (strchr(value, '\r') != NULL || strchr(value, '\n') != NULL || strchr(key, '\r') != NULL || strchr(key, '\n') != NULL) { /* drop illegal headers */ event_debug(("%s: dropping illegal header\n", __func__)); return (-1); } Which actually causes a multi-line parameter (that was encoded by the browser, but decoded before going into this function) to be dropped, and ignored. This means that this code would not be able to process any multi-line <textarea> elements in forms that are sent by the browser, or encoded multi-line entries sent via ajax, which is what I was actually trying to do. I know that work is being done on a 2.0 branch, so maybe this is different there, dont know... but just wanted to let people know that there are some decoding issues in the current version. In summary, we shouldn't be decoding entire query strings before parsing out the key/value pairs. We should be decoding both key and value, and we should not be excluding values that have CR or LF characters in them. Apologies for the rather long email(s). On Wed, Mar 11, 2009 at 10:57 PM, Clint Webb <webb.cl...@gmail.com> wrote: > Hello all, > > memcached uses libevent, and I've been working recently on a web-based > debugging console as part of it. Since memcached already used libevent for > socket notifications, I figured that evhttp would make it easy to add a > web-based interface to it. Which it did. > > In doing this though, I found a few serious bugs in evhttp_parse_query() > that make me wonder if anyone is actually using libevent/http for anything > serious. > > Of course, I may be completely miss-understanding how these functions are > supposed to be used, but a lack of documentation found anywhere has caused > me to look at the code to figure out how to use it. > > The first bit that piqued my interest was in the comment before the > evhttp_parse_query code. > > /* >> * Helper function to parse out arguments in a query. >> * The arguments are separated by key and value. >> * URI should already be decoded. >> */ >> > > This seems to be markedly wrong. You should not decode the entire URI > query string in one go. You need to parse out the key/value pairs and then > decode the key and value bits. > > For example.... if you have a query with parameters like this: > ?username=frank&password=secret > > The existing code works alright, but what if the persons password was > se&cret? > > Anyway, assuming the developer did or didn't already decode the query > string, we get to the evhttp_parse_query function, which will first find the > '?', and then basically split the string using '&' as a delimeter. Each > key/value pair is split on the '=' so that you get seperate key and value. > > So if we supply a query of '?username=frank&password=secret', you get the > the following pairs: > >> username=frank >> password=secret >> > > But what if frank used 'se&cret' as a password? The browser would actually > send '?username=frank&password=se#38cret', but since we are supposed to > decode the entire query string before we parse out the params, then we end > up actually processing '?username=frank&password=se&cret'. Since we > seperate the pairs by splitting on string on '&', we end up with: > username=frank > password=se > cret > > Which is entirely not what we wanted. But anyway, that is not really a bug > in the event code, but an incorrect comment. The only real example of code > on how to use this function followed the instructions though, and did decode > the entire query string before parsing it. > > Inside the function, we see that before the key and value is added to the > 'headers' structure, we see that the value is then decoded, which is correct > (although I would say the key should be decoded then too). > >> value = evhttp_decode_uri(value); >> > > Which is where we get to the first major bug. > evhttp_decode_uri assumes that you are decoding an entire query string > which includes a '?', but we can see here that it is being used to decode > only a portion of the query string, just a value part. > > if (c == '?') { > in_query = 1; > } else if (c == '+' && in_query) { > c = ' '; > > > > -- > "Be excellent to each other" > -- "Be excellent to each other"
_______________________________________________ Libevent-users mailing list Libevent-users@monkey.org http://monkeymail.org/mailman/listinfo/libevent-users