> On Nov. 20, 2014, 9:11 p.m., Ben Mahler wrote: > > 3rdparty/libprocess/include/process/http.hpp, lines 415-416 > > <https://reviews.apache.org/r/28253/diff/1/?file=770208#file770208line415> > > > > Here's what we'll have if we added join: > > > > parse: takes a __decoded__ query string, construct a key,value map of > > the parameters > > > > join: takes a key,value map of parameters, creates an __encoded__ query > > string. > > > > That's where I find this to be non-intuitive: > > > > ``` > > join(parse(query)) == encode(query) > > ``` > > > > It seems like only request encoding should deal with constructing an > > encoded query string, is that not the case? Would like to avoid exposing > > this if possible, and if not possible, would like to make this more > > intuitive. > > Cody Maloney wrote: > So that behavior of parse is going to produce incorrect results I > believe. Suppose one of my parameters in my query string is a small blob of > JSON (Contains '=' and '&' potentially). If we encode/decode on the outside > of query string parsing/joining, we can't represent this (The & will look > like the start of a new parameter). > > ``` > std::map<> query = {{"a%20","b&c=d!@#foobar"}} > > EXPECT_EQ(query, parse(join(query))); > ``` > > Based on my reading of rfc3986 2.1 (snippet inline): > ``` > A percent-encoding mechanism is used to represent a data octet in a > component when that octet's corresponding character is outside the > allowed set or is being used as a delimiter of, or within, the > component. > ``` > > This should work, and is the reason why there is percent encoding. From > what I can tell grepping throuhg the source code nothing currently does the > decoding before we parse the query string, so making the decode live inside > of the parse function should work. Currently callers just end up with percent > encoded chunks in their query string, which causes badness (And could explain > an issue which a customer was having with files/browse.json and things not > showing up) > > Ben Mahler wrote: > Hm.. so I did the grep and as expected the callers parse decoded strings. > > The original use of parse is in Decoder: > > ``` > static int on_message_complete(http_parser* p) > { > DataDecoder* decoder = (DataDecoder*) p->data; > > // Parse the query key/values. > Try<std::string> decoded = http::decode(decoder->query); > if (decoded.isError()) { > return 1; > } > decoder->request->query = http::query::parse(decoded.get()); > ... > ``` > > Note that here we parse a decoded string, as expected. The other use > appears to be within the http code in the master that parses request > __bodies__: > > ``` > Future<Response> Master::Http::shutdown(const Request& request) > { > if (request.method != "POST") { > return BadRequest("Expecting POST"); > } > > // Parse the query string in the request body (since this is a POST) > // in order to determine the framework ID to shutdown. > hashmap<string, string> values = > process::http::query::parse(request.body); > ``` > > Bodies are not expected to be percent encoded, IIUC. > > Anything I missed...?
So we do, not sure how I missed that... So I can fix up the decoding to be inside the query string. Simple extra patch, then add some round trip tests. Sound good? The body one is interesting in that it looks like we're doing it incorrectly: http://stackoverflow.com/questions/4007969/application-x-www-form-urlencoded-or-multipart-form-data/4073451#4073451 We're assuming application/x-www-form-urlencoded at the moment which we should decode. None of the requests happen to be from non-libprocess things or contain illegal characters so it happens to work. *random side thoughts*: We aren't a very good HTTP/webserver / parsing into request/response objects, processing Content-Type, doing simple clean routing. The more I dig into this code the more I think it would be nice if there was just an nginx, apache, proxygen / etc. which are good at doing the HTTP parts and we just take their parsing + methods, getting us out of the game of writing an HTTP server, then just have the bits to bridge / convert their request objects to the appropriate libprocess dispatches. - Cody ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28253/#review62422 ----------------------------------------------------------- On Nov. 20, 2014, 12:50 a.m., Cody Maloney wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/28253/ > ----------------------------------------------------------- > > (Updated Nov. 20, 2014, 12:50 a.m.) > > > Review request for mesos. > > > Bugs: MESOS-2132 > https://issues.apache.org/jira/browse/MESOS-2132 > > > Repository: mesos-git > > > Description > ------- > > This is needed to turn Request objects into http request which we send across > the wire. The existing HTTP request sending code expects the query string to > be already constructed. > > Ideally people can either just forward the query string they get to send it > back across the wire or they can manipulate it using the native C++ datatype > (A hashmap), and then have it automatically re-encoded. > > The parsing logic to go from a string to a map is already exists in this > header, this just adds the other direction. > > > Diffs > ----- > > 3rdparty/libprocess/include/process/http.hpp > 9cf05acbb724ab9af8010d1788621d37a0e48e86 > 3rdparty/libprocess/src/http.cpp b00f33339366f5c06b6f20e38c5ae0c23b8a9358 > 3rdparty/libprocess/src/tests/http_tests.cpp > a90e65f77904da0a45e1cc0cc9889ae69354a1a5 > > Diff: https://reviews.apache.org/r/28253/diff/ > > > Testing > ------- > > make distcheck ubuntu 14.04 > > > Thanks, > > Cody Maloney > >