Re: dealing with empty field names in query
On Feb 13, 2009, at 5:11 PM, Joe Schaefer wrote:

> > We had to stop using libapreq2 for cookies, because we found out that wordpress (being a shoddy piece of software) was generating invalid cookies at times. when apreq encountered it, it segfaulted.
>
> What version of apreq was this? And did you report it to the apreq-dev@ mailing list?

2.07

Reported to apreq-dev in 2006: http://marc.info/?l=apreq-dev&m=113996436206606&w=2

It's an edge case to trigger: you have to somehow write a bad cookie, which most libraries correct for automatically. WordPress did that often back then, though.
Re: dealing with empty field names in query
----- Original Message -----
> From: Jonathan Vanasco
> To: modperl
> Sent: Friday, February 13, 2009 3:30:20 PM
> Subject: Re: dealing with empty field names in query
>
> On Feb 6, 2009, at 4:58 PM, Phil Carmody wrote:
>
> > In those name/value pairs, according to HTML 4 at least, the names must begin with a letter [A-Za-z]. The empty string does not do so. Garbage in, garbage out.
>
> Part of me agrees with that philosophy.
>
> Another part of me is more practical.
>
> We had to stop using libapreq2 for cookies, because we found out that wordpress (being a shoddy piece of software) was generating invalid cookies at times. when apreq encountered it, it segfaulted.

What version of apreq was this? And did you report it to the apreq-dev@ mailing list?

> so while the engineering part of me is okay with garbage in / garbage out, the management side of me says sometimes you have to expect bad data and try to make the best of it - otherwise you lose customers and revenue.
Re: dealing with empty field names in query
On Fri, Feb 13, 2009 at 3:03 PM, Jonathan Vanasco wrote:

> A simple typo could render your application broken.

Or a hostile competitor.
Re: dealing with empty field names in query
On Feb 13, 2009, at 3:38 PM, André Warnier wrote:

> The management part of me says that if you sell shoddy merchandise to people, they are going to come back and hit you with it.
> Presumably, if you get such kind of posted data from a form, it is because you sent a shoddy form to the browser, which can submit such shoddy data. Or because you have some shoddy javascript in the form, which sends shoddy data to your server.
> So we're still at the garbage level, but the other way around : garbage out, garbage in. ;-)

That's assuming that you're responsible. Today many people use misc javascript libraries; and there are js DMZ servers that serve off cached versions so people don't have to reload.

A simple typo could render your application broken.
Re: dealing with empty field names in query
Jonathan Vanasco wrote:

> On Feb 6, 2009, at 4:58 PM, Phil Carmody wrote:
>
> > In those name/value pairs, according to HTML 4 at least, the names must begin with a letter [A-Za-z]. The empty string does not do so. Garbage in, garbage out.
>
> Part of me agrees with that philosophy.
>
> Another part of me is more practical.
>
> We had to stop using libapreq2 for cookies, because we found out that wordpress (being a shoddy piece of software) was generating invalid cookies at times. when apreq encountered it, it segfaulted.
>
> so while the engineering part of me is okay with garbage in / garbage out, the management side of me says sometimes you have to expect bad data and try to make the best of it - otherwise you lose customers and revenue.

The management part of me says that if you sell shoddy merchandise to people, they are going to come back and hit you with it.

Presumably, if you get this kind of posted data from a form, it is because you sent a shoddy form to the browser, which can submit such shoddy data. Or because you have some shoddy javascript in the form, which sends shoddy data to your server.

So we're still at the garbage level, but the other way around : garbage out, garbage in. ;-)
Re: dealing with empty field names in query
On Feb 6, 2009, at 4:58 PM, Phil Carmody wrote:

> In those name/value pairs, according to HTML 4 at least, the names must begin with a letter [A-Za-z]. The empty string does not do so. Garbage in, garbage out.

Part of me agrees with that philosophy.

Another part of me is more practical.

We had to stop using libapreq2 for cookies, because we found out that wordpress (being a shoddy piece of software) was generating invalid cookies at times. when apreq encountered it, it segfaulted.

so while the engineering part of me is okay with garbage in / garbage out, the management side of me says sometimes you have to expect bad data and try to make the best of it - otherwise you lose customers and revenue.
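
One way to picture the "expect bad data and make the best of it" approach for form fields: keep the well-formed pairs and set the empty-name ones aside for logging, instead of failing the whole request. This is only a sketch in Python using the standard library, not libapreq2 or Apache2::Request; the function name `split_valid_and_rejected` is hypothetical.

```python
from urllib.parse import parse_qsl


def split_valid_and_rejected(body):
    """Separate usable (name, value) pairs from pairs whose name is empty.

    Hypothetical helper for illustration; it does not reflect how
    libapreq2 handles such input.
    """
    valid, rejected = [], []
    # keep_blank_values=True so that "i3=" and "=" are both reported
    # rather than silently dropped by the parser.
    for name, value in parse_qsl(body, keep_blank_values=True):
        if name:
            valid.append((name, value))
        else:
            rejected.append((name, value))
    return valid, rejected


valid, rejected = split_valid_and_rejected("i3=&=&=&action=insert")
```

The caller can then process `valid` as usual and log `len(rejected)` so that the broken form or script gets noticed and fixed.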
Re: dealing with empty field names in query
André Warnier wrote:

> "application/x-www-form-urlencoded" is the default, and it means that you are passing the form data appended at the end of the URL, preceded by a "?" sign, as one long string of the form "name1=value1&name2=value2..." etc.. usually known as "the query string".
> That is easy to do, but has the inconvenience that the server does not really know in which character set these things are. This can play havoc with internationally-minded applications. It can also have the result that the request may be truncated after a certain maximum length, by some intervening actor.

Sorry, I have to throw a little exception there: the char set for URL-encoded get/post data is defined in http://www.w3.org/TR/html401/interact/forms.html#h-17.3 as the value of the form element's accept-charset attribute, which defaults to "UNKNOWN", and "User agents may interpret this value as the character encoding that was used to transmit the document containing this FORM element.".

We rely on this in a system used on hundreds of sites in various countries for several years, and we have found no exceptions.

Other than that, I hope our less experienced readers take note of your excellent advice.

cheers
John
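
To see why the character set matters for URL-encoded data, here is a small Python sketch (standard library only; the sample word is an arbitrary choice, not taken from the thread). The same text percent-encodes to different bytes depending on the charset, so the server has to decode with the charset the browser actually used.

```python
from urllib.parse import quote, unquote

text = "caf\u00e9"  # "café", an arbitrary example value

# The same character becomes one escaped byte in ISO-8859-1 and two in UTF-8.
as_latin1 = quote(text, encoding="iso-8859-1")
as_utf8 = quote(text, encoding="utf-8")

# Decoding with the matching charset recovers the text ...
round_trip = unquote(as_latin1, encoding="iso-8859-1")
# ... while decoding with the wrong one yields a replacement character.
mismatch = unquote(as_latin1, encoding="utf-8")
```

If the form is served and submitted in one known encoding, as described above, the server can decode consistently; the mismatch case shows what goes wrong otherwise.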
Re: dealing with empty field names in query
Clinton Gormley wrote:

> Are you using a different version? Or is it the fact that you're POSTing it?

Sorry for the lecture, but I see this so often that it seems it deserves repeating :

To send the content of a <form> to a webserver, you can use either a POST or a GET method.

You should use a GET, if the result of sending this to the server is not going to modify anything on the server, and if re-sending the same request several times would always give the same result. In technical jargon, that is called "idempotent".

You should use POST if it is not the case, in other words if what you are sending is going to modify something, and multiple identical requests would be "not idempotent".

Neither of the above says how you are passing the data to the server however. This is something else entirely.

Separately from the above, and usable with either one, is the question of how you are passing the data of your request to the server. This you can also do in two different ways :
- encoded as "application/x-www-form-urlencoded"
- or encoded as "multipart/form-data"

"application/x-www-form-urlencoded" is the default, and it means that you are passing the form data appended at the end of the URL, preceded by a "?" sign, as one long string of the form "name1=value1&name2=value2..." etc.. usually known as "the query string".
That is easy to do, but has the inconvenience that the server does not really know in which character set these things are. This can play havoc with internationally-minded applications. It can also have the result that the request may be truncated after a certain maximum length, by some intervening actor.

"multipart/form-data" is more complicated and harder to do, and is described here :
http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4
but it has the advantage that each of the "name=value" pairs can be as long as you want, and that the type of data and encoding of each is clear.

In neither of the above though, is it allowed in the specs to send a "name=value" pair where there is no name. And if there is a name, the specs do define what is allowed in it, and "" is not among these.

Now which combination of the above some clever javascript function may decide to use when sending the form content to the server, is another matter.

But as Phil rightly said, garbage in, garbage out.

Whether the server software can deal or not with some forms of invalid data, is rather outside of the question. It is certainly not obliged to. And the request data of which it is originally the question here is certainly, without a doubt, invalid.

In my opinion thus, the OP should first take whatever measure is appropriate to ensure that his application sends only valid data, and then come back if there is still a problem.
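
As a rough illustration of method versus encoding, the Python sketch below encodes the same fields once and places the result either in a GET query string or in a POST body. The host `example.invalid` and the shortened basket key are made up for the example. John's reply elsewhere in the thread ("regular form data using the post method") suggests the POST case is the one that applies to the original report.

```python
from urllib.parse import urlencode

# Hypothetical fields; the basket key is shortened for readability.
fields = [("i1", "drnk4"), ("basket:_new_abc:id", "abc"), ("action", "insert")]

# One encoding step produces the name=value&name=value string.
encoded = urlencode(fields)

# GET: the encoded string travels in the URL after "?".
get_url = "http://example.invalid/cart?" + encoded

# POST: the same string travels in the request body instead,
# labelled with the urlencoded content type.
post_headers = {"Content-Type": "application/x-www-form-urlencoded"}
post_body = encoded.encode("ascii")
```

Either way the server sees the same name/value pairs; only where they sit in the request differs.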
Re: dealing with empty field names in query
> With the following request body:
>
> i1=drnk4&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aquantity=1&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aid=de9a792da0f5127d72d7c6a5f6b2d4c5&i2=clth12&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aquantity=1&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aid=7acf9602cd6ab0ee86f77efeaaffefff&i3=&=&=&i4=&=&=&i5=&=&=&i6=&=&=&action=insert&x=46&y=17

When I pass the above as a query string to my site, Apache2::Request (from libapreq2-2.08) parses it as follows:

--
$APR_Request_Param_Table1 = bless( {
    "=" => '',
    "=" => '',
    "=" => '',
    "=" => '',
    "=" => '',
    "=" => '',
    "=" => '',
    "=" => '',
    action => 'insert',
    "basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:id" => '7acf9602cd6ab0ee86f77efeaaffefff',
    "basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:quantity" => 1,
    "basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:id" => 'de9a792da0f5127d72d7c6a5f6b2d4c5',
    "basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:quantity" => 1,
    i1 => 'drnk4',
    i2 => 'clth12',
    i3 => '',
    i4 => '',
    i5 => '',
    i6 => '',
    x => 46,
    y => 17
}, 'APR::Request::Param::Table' );
--

Are you using a different version? Or is it the fact that you're POSTing it?

Clint
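
For comparison, a lenient parser in another language behaves much like the dump above: it keeps going past the empty-name pairs and reports them alongside the rest. The sketch uses Python's `urllib.parse.parse_qsl` on a shortened, made-up version of the body; it says nothing about libapreq2 internals.

```python
from urllib.parse import parse_qsl

# Shortened sample in the style of the posted body (keys abbreviated).
SAMPLE = "i1=drnk4&basket%3A_new_abc%3Aquantity=1&i3=&=&=&i4=&action=insert"

# keep_blank_values=True retains pairs with empty values, including "=".
pairs = parse_qsl(SAMPLE, keep_blank_values=True)

# Pairs whose name decoded to the empty string.
empty_names = [pair for pair in pairs if pair[0] == ""]
```

Here parsing continues through `i4` and `action` even though two pairs have no name, and `%3A` is decoded to `:` in the basket key.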
Re: dealing with empty field names in query
--- On Sat, 2/7/09, John ORourke wrote:

> Phil, can you point me to the part of the spec which specifies that a field name must begin with an ASCII letter?

http://www.w3.org/TR/html401/types.html#type-cdata

Phil
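
The rule on that page says that ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by letters, digits, hyphens, underscores, colons, and periods. Written as a pattern, it looks like the Python sketch below. Whether this rule binds form control names is the very question raised in the thread; the sketch only encodes the rule as stated, and the function name is hypothetical.

```python
import re

# A letter first, then any number of letters, digits, "-", "_", ":" or ".".
NAME_TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9_:.\-]*")


def is_name_token(name):
    """Return True if the whole string matches the NAME token rule."""
    return NAME_TOKEN.fullmatch(name) is not None


checks = {
    "i1": is_name_token("i1"),
    "basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:id": is_name_token(
        "basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:id"
    ),
    "": is_name_token(""),
}
```

Under this rule the basket keys from the original post pass, while the empty name does not.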
Re: dealing with empty field names in query
André Warnier wrote:

> > In those name/value pairs, according to HTML 4 at least, the names must begin with a letter [A-Za-z]. The empty string does not do so. Garbage in, garbage out.
>
> +1
>
> + : Above the OP is talking about a request "body". Are we sure that this is really a request body, and not a query-string ? What does the <form> tag really look like ? (enctype)
> Just thinking that if this is a query-string, is it not just being cut off after a certain size ?
>
> It would not be possible to submit this data as multipart/form-data, for a similar reason to what Phil says.

It's regular form data using the post method, so no length issue and normal URI parameter rules apply.

I suspect Phil is correct but I can't find any mention of this in the HTML 4 spec, which is what prompted my question.

Phil, can you point me to the part of the spec which specifies that a field name must begin with an ASCII letter?

cheers
John
Re: dealing with empty field names in query
Phil Carmody wrote:

> --- On Fri, 2/6/09, John ORourke wrote:
> > We're using more and more javascript to do clever things with forms,
>
> Lots of people have said that. Probably a majority were wrong.
>
> > and I think we broke the Apache2::Request parser, but wanted to check before reporting it as a bug. (and tell me if this should go to the apreq list)
> >
> > With the following request body:
> >
> > i1=drnk4&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aquantity=1&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aid=de9a792da0f5127d72d7c6a5f6b2d4c5&i2=clth12&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aquantity=1&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aid=7acf9602cd6ab0ee86f77efeaaffefff&i3=&=&=&i4=&=&=&i5=&=&=&i6=&=&=&action=insert&x=46&y=17
> >
> > When I create a new Apache2::Request object and loop through the parameters, I get this: (output from Data::Dumper of a hash of the params)
> >
> > 'basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:id' => '7acf9602cd6ab0ee86f77efeaaffefff',
> > 'basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:quantity' => '1',
> > 'basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:id' => 'de9a792da0f5127d72d7c6a5f6b2d4c5',
> > 'basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:quantity' => '1',
> > 'i1' => 'drnk4',
> > 'i2' => 'clth12',
> > 'i3' => ''
> >
> > So it stops parsing when it gets an '=' straight after an ampersand.
> >
> > I looked up the spec and it doesn't seem to explicitly say, so I don't think we should just stop parsing.
> >
> > Spec:
> > http://www.w3.org/MarkUp/html-spec/html-spec_8.html#SEC8.2.1
>
> In those name/value pairs, according to HTML 4 at least, the names must begin with a letter [A-Za-z]. The empty string does not do so. Garbage in, garbage out.

+1

+ : Above the OP is talking about a request "body". Are we sure that this is really a request body, and not a query-string ? What does the <form> tag really look like ? (enctype)
Just thinking that if this is a query-string, is it not just being cut off after a certain size ?

It would not be possible to submit this data as multipart/form-data, for a similar reason to what Phil says.
Re: dealing with empty field names in query
--- On Fri, 2/6/09, John ORourke wrote:

> We're using more and more javascript to do clever things with forms,

Lots of people have said that. Probably a majority were wrong.

> and I think we broke the Apache2::Request parser, but wanted to check before reporting it as a bug. (and tell me if this should go to the apreq list)
>
> With the following request body:
>
> i1=drnk4&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aquantity=1&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aid=de9a792da0f5127d72d7c6a5f6b2d4c5&i2=clth12&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aquantity=1&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aid=7acf9602cd6ab0ee86f77efeaaffefff&i3=&=&=&i4=&=&=&i5=&=&=&i6=&=&=&action=insert&x=46&y=17
>
> When I create a new Apache2::Request object and loop through the parameters, I get this: (output from Data::Dumper of a hash of the params)
>
> 'basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:id' => '7acf9602cd6ab0ee86f77efeaaffefff',
> 'basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:quantity' => '1',
> 'basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:id' => 'de9a792da0f5127d72d7c6a5f6b2d4c5',
> 'basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:quantity' => '1',
> 'i1' => 'drnk4',
> 'i2' => 'clth12',
> 'i3' => ''
>
> So it stops parsing when it gets an '=' straight after an ampersand.
>
> I looked up the spec and it doesn't seem to explicitly say, so I don't think we should just stop parsing.
>
> Spec:
> http://www.w3.org/MarkUp/html-spec/html-spec_8.html#SEC8.2.1

In those name/value pairs, according to HTML 4 at least, the names must begin with a letter [A-Za-z]. The empty string does not do so. Garbage in, garbage out.

Phil
Invalid.
dealing with empty field names in query
Hi mod_perl list,

We're using more and more javascript to do clever things with forms, and I think we broke the Apache2::Request parser, but wanted to check before reporting it as a bug. (and tell me if this should go to the apreq list)

With the following request body:

i1=drnk4&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aquantity=1&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aid=de9a792da0f5127d72d7c6a5f6b2d4c5&i2=clth12&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aquantity=1&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aid=7acf9602cd6ab0ee86f77efeaaffefff&i3=&=&=&i4=&=&=&i5=&=&=&i6=&=&=&action=insert&x=46&y=17

When I create a new Apache2::Request object and loop through the parameters, I get this: (output from Data::Dumper of a hash of the params)

'basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:id' => '7acf9602cd6ab0ee86f77efeaaffefff',
'basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:quantity' => '1',
'basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:id' => 'de9a792da0f5127d72d7c6a5f6b2d4c5',
'basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:quantity' => '1',
'i1' => 'drnk4',
'i2' => 'clth12',
'i3' => ''

So it stops parsing when it gets an '=' straight after an ampersand.

I looked up the spec and it doesn't seem to explicitly say, so I don't think we should just stop parsing.

Spec:
http://www.w3.org/MarkUp/html-spec/html-spec_8.html#SEC8.2.1

thanks
John
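
To pin down where such a body first goes wrong, it can help to split it on "&" and look for the first pair with nothing before the "=". The Python sketch below does that on a shortened stand-in for the posted body; `first_empty_name` is a hypothetical helper, not anything in Apache2::Request.

```python
# Shortened stand-in for the posted body (same shape, fewer fields).
SAMPLE = "i1=drnk4&i2=clth12&i3=&=&=&i4=&action=insert"


def first_empty_name(body):
    """Return (index, pair) for the first pair with an empty name, else None."""
    for index, pair in enumerate(body.split("&")):
        # partition splits on the first "=" only; the name is what precedes it.
        name, _sep, _value = pair.partition("=")
        if name == "":
            return index, pair
    return None


hit = first_empty_name(SAMPLE)
# Names that appear before the first empty-name pair.
names_before = [p.partition("=")[0] for p in SAMPLE.split("&")[: hit[0]]]
```

In this sample the names before the first bare "=" are `i1`, `i2` and `i3`, which lines up with the dump above ending at `i3`.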