Re: dealing with empty field names in query

2009-02-23 Thread Jonathan Vanasco


On Feb 13, 2009, at 5:11 PM, Joe Schaefer wrote:
We had to stop using libapreq2 for cookies, because we found out that
wordpress (being a shoddy piece of software) was generating invalid
cookies at times.  when apreq encountered it, it segfaulted.


What version of apreq was this?  And did you report it to the
apreq-dev@ mailing list?


2.07

reported to apreq-dev in 2006:
http://marc.info/?l=apreq-dev&m=113996436206606&w=2


it's an edge case to trigger -- you have to somehow write a bad
cookie, which most libraries correct for automatically.  wordpress did
that often back then, though.
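
The defensive stance described here (expect bad cookies and carry on) can be sketched outside of apreq. The following is a minimal Python illustration, not the libapreq2 code: `parse_cookie_header` is a hypothetical helper that skips malformed pairs rather than failing on them.

```python
from typing import Dict


def parse_cookie_header(header: str) -> Dict[str, str]:
    """Parse a Cookie header leniently (hypothetical helper, not apreq's API)."""
    cookies: Dict[str, str] = {}
    for chunk in header.split(";"):
        name, sep, value = chunk.strip().partition("=")
        # Skip anything that is not a usable name=value pair instead of failing.
        if not sep or not name.strip():
            continue
        cookies[name.strip()] = value.strip().strip('"')
    return cookies


if __name__ == "__main__":
    # Two valid cookies surrounded by an empty chunk, a nameless pair, and a bare token.
    print(parse_cookie_header('session=abc123; ; =orphan; junk; theme="dark"'))
```

Whether silently dropping the bad pairs is acceptable depends on the application; logging them would be an equally reasonable choice.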


Re: dealing with empty field names in query

2009-02-13 Thread Joe Schaefer
----- Original Message -----

> From: Jonathan Vanasco 
> To: modperl 
> Sent: Friday, February 13, 2009 3:30:20 PM
> Subject: Re: dealing with empty field names in query
> 
> 
> On Feb 6, 2009, at 4:58 PM, Phil Carmody wrote:
> 
> > In those name/value pairs, according to HTML 4 at least, the names must
> > begin with a letter [A-Za-z]. The empty string does not do so. Garbage in,
> > garbage out.
> 
> 
> 
> Part of me agrees with that philosophy.
> 
> Another part of me is more practical.
> 
> We had to stop using libapreq2 for cookies, because we found out that
> wordpress (being a shoddy piece of software) was generating invalid cookies
> at times.  when apreq encountered it, it segfaulted.

What version of apreq was this?  And did you report it to the apreq-dev@ 
mailing list?

> so while the engineering part of me is okay with garbage in / garbage out,
> the management side of me says sometimes you have to expect bad data and try
> to make the best of it - otherwise you lose customers and revenue.



  


Re: dealing with empty field names in query

2009-02-13 Thread David Nicol
On Fri, Feb 13, 2009 at 3:03 PM, Jonathan Vanasco  wrote:
> A simple typo could render your application broken.

Or a hostile competitor.


Re: dealing with empty field names in query

2009-02-13 Thread Jonathan Vanasco

On Feb 13, 2009, at 3:38 PM, André Warnier wrote:


The management part of me says that if you sell shoddy merchandise to
people, they are going to come back and hit you with it.
Presumably, if you get such kind of posted data from a form, it is
because you sent a shoddy form to the browser, which can submit such
shoddy data.  Or because you have some shoddy javascript in the form,
which sends shoddy data to your server.
So we're still at the garbage level, but the other way around : garbage
out, garbage in.
;-)


That's assuming that you're responsible.

Today many people use misc javascript libraries; and there are js DMZ  
servers that serve off cached versions so people don't have to  
reload.  A simple typo could render your application broken.

Re: dealing with empty field names in query

2009-02-13 Thread André Warnier

Jonathan Vanasco wrote:


On Feb 6, 2009, at 4:58 PM, Phil Carmody wrote:

In those name/value pairs, according to HTML 4 at least, the names 
must begin with a letter [A-Za-z]. The empty string does not do so. 
Garbage in, garbage out.




Part of me agrees with that philosophy.

Another part of me is more practical.

We had to stop using libapreq2 for cookies, because we found out that 
wordpress (being a shoddy piece of software) was generating invalid 
cookies at times.  when apreq encountered it, it segfaulted.


so while the engineering part of me is okay with garbage in / garbage 
out, the management side of me says sometimes you have to expect bad 
data and try to make the best of it - otherwise you lose customers and 
revenue.



The management part of me says that if you sell shoddy merchandise to
people, they are going to come back and hit you with it.
Presumably, if you get such kind of posted data from a form, it is
because you sent a shoddy form to the browser, which can submit such
shoddy data.  Or because you have some shoddy javascript in the form,
which sends shoddy data to your server.
So we're still at the garbage level, but the other way around : garbage
out, garbage in.
;-)




Re: dealing with empty field names in query

2009-02-13 Thread Jonathan Vanasco


On Feb 6, 2009, at 4:58 PM, Phil Carmody wrote:

In those name/value pairs, according to HTML 4 at least, the names  
must begin with a letter [A-Za-z]. The empty string does not do so.  
Garbage in, garbage out.




Part of me agrees with that philosophy.

Another part of me is more practical.

We had to stop using libapreq2 for cookies, because we found out that  
wordpress (being a shoddy piece of software) was generating invalid  
cookies at times.  when apreq encountered it, it segfaulted.


so while the engineering part of me is okay with garbage in / garbage  
out, the management side of me says sometimes you have to expect bad  
data and try to make the best of it - otherwise you lose customers and  
revenue.


Re: dealing with empty field names in query

2009-02-07 Thread John ORourke

André Warnier wrote:
"application/x-www-form-urlencoded" is the default, and it means that 
you are passing the form data appended at the end of the URL, preceded 
by a "?" sign, as one long string of the form 
"name1=value1&name2=value2..." etc..

usually known as "the query string".
That is easy to do, but has the drawback that the server does not 
really know in which character set these things are.  This can play 
havoc with internationally-minded applications.
It can also have the result that the request may be truncated after a 
certain maximum length, by some intervening actor.


Sorry, I have to throw a little exception there: the char set for 
URL-encoded GET/POST data is defined in 
http://www.w3.org/TR/html401/interact/forms.html#h-17.3 as the value of 
the form element's accept-charset attribute, which defaults to 
"UNKNOWN", and "User agents may interpret this value as the character 
encoding that was used to transmit the document containing this FORM 
element."  We have relied on this in a system used on hundreds of sites 
in various countries for several years, and we have found no exceptions.
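
To make the charset point concrete: percent-encoding only transports bytes, and the receiver still has to pick an encoding to turn them back into characters. A small Python sketch follows; the sample character and the helper name `decode_field` are made up for illustration.

```python
from urllib.parse import quote, unquote

E_ACUTE = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# The same character percent-encodes differently depending on the page charset.
utf8_form = quote(E_ACUTE, encoding="utf-8")      # two bytes on the wire
latin1_form = quote(E_ACUTE, encoding="latin-1")  # one byte on the wire


def decode_field(raw: str, charset: str) -> str:
    """Decode one percent-encoded field, assuming the form was sent in `charset`."""
    return unquote(raw, encoding=charset, errors="replace")


if __name__ == "__main__":
    print(utf8_form, latin1_form)
    print(ascii(decode_field(utf8_form, "utf-8")))    # charset guessed correctly
    print(ascii(decode_field(utf8_form, "latin-1")))  # wrong guess: two characters
```

If the server decodes with the charset of the page that contained the form, as the quoted passage permits user agents to assume, the round trip is lossless.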


Other than that, I hope our less experienced readers take note of your 
excellent advice.


cheers
John




Re: dealing with empty field names in query

2009-02-07 Thread André Warnier

Clinton Gormley wrote:


Are you using a different version?  Or is it the fact that you're
POSTing it?

Sorry for the lecture, but I see this so often that it seems it deserves 
repeating :


To send the content of a <form> to a webserver, you can use either a 
POST or a GET method.
You should use GET if the result of sending this to the server is 
not going to modify anything on the server, and if re-sending the same 
request several times would always give the same result.

In technical jargon, that is called "idempotent".

You should use POST if that is not the case, in other words if what you 
are sending is going to modify something, and multiple identical 
requests would be "not idempotent".


Neither of the above says how you are passing the data to the server 
however. This is something else entirely.


Separately from the above, and usable with either one, is the question 
of how you are passing the data of your request to the server.

This you can also do in two different ways :
- encoded as "application/x-www-form-urlencoded"
- or encoded as "multipart/form-data"

"application/x-www-form-urlencoded" is the default, and it means that 
you are passing the form data appended at the end of the URL, preceded 
by a "?" sign, as one long string of the form 
"name1=value1&name2=value2..." etc..

usually known as "the query string".
That is easy to do, but has the drawback that the server does not 
really know in which character set these things are.  This can play 
havoc with internationally-minded applications.
It can also have the result that the request may be truncated after a 
certain maximum length, by some intervening actor.
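
The HTTP method and the place where the urlencoded string travels can be looked at separately. In the following Python sketch (the URL is a placeholder and no request is actually sent), the same "name1=value1&name2=value2" string is either appended to the URL after a "?" for a GET, or carried in the request body for a POST.

```python
from urllib.parse import urlencode
from urllib.request import Request

FORM_URL = "http://example.com/basket"  # placeholder URL, an assumption for this sketch

fields = [("i1", "drnk4"), ("action", "insert")]
encoded = urlencode(fields)  # name=value pairs joined with "&"

# GET: the encoded pairs become the query string of the URL.
get_request = Request(FORM_URL + "?" + encoded)

# POST: the same encoded pairs are sent as the request body instead.
post_request = Request(
    FORM_URL,
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

if __name__ == "__main__":
    # Only the request objects are built and inspected; nothing goes over the network.
    print(get_request.get_method(), get_request.full_url)
    print(post_request.get_method(), post_request.data)
```

The length limit mentioned above concerns the URL form; a body sent with POST is not subject to URL length limits.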


"multipart/form-data" is more complicated and harder to do, and is 
described here :

http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4
but it has the advantage that each of the "name=value" pairs can be as 
long as you want, and that the type of data and encoding of each is clear.


In neither of the above, though, do the specs allow sending a 
"name=value" pair where there is no name. And if there is a name, the 
specs do define what is allowed in it, and "" is not among these.


Now which combination of the above some clever javascript function may 
decide to use when sending the form content to the server, is another 
matter.

But as Phil rightly said, garbage in, garbage out.
Whether the server software can deal or not with some forms of invalid 
data, is rather outside of the question. It is certainly not obliged to.


And the request data originally in question here is 
certainly, without a doubt, invalid.


In my opinion thus, the OP should first take whatever measure is 
appropriate to ensure that his application sends only valid data, and 
then come back if there is still a problem.


Re: dealing with empty field names in query

2009-02-07 Thread Clinton Gormley

> With the following request body:
> 
> i1=drnk4&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aquan
> tity=1&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aid=de9a792da0f5127d72d7c6a5f
> 6b2d4c5&i2=clth12&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aquantity=1&basket
> %3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aid=7acf9602cd6ab0ee86f77efeaaffefff&i3=&=
> &=&i4=&=&=&i5=&=&=&i6=&=&=&action=insert&x=46&y=17

When I pass the above as a query string to my site, Apache2::Request
(from libapreq2-2.08) parses it as follows:

--
$APR_Request_Param_Table1 = bless( {
  "="  => '',
  "="  => '',
  "="  => '',
  "="  => '',
  "="  => '',
  "="  => '',
  "="  => '',
  "="  => '',
  action   => 'insert',
  "basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:id"
   => '7acf9602cd6ab0ee86f77efeaaffefff',
  "basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:quantity"
   => 1,
  "basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:id"
   => 'de9a792da0f5127d72d7c6a5f6b2d4c5',
  "basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:quantity"
   => 1,
  i1   => 'drnk4',
  i2   => 'clth12',
  i3   => '',
  i4   => '',
  i5   => '',
  i6   => '',
  x => 46,
  y => 17
}, 'APR::Request::Param::Table' );
--
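
For comparison with a parser outside the apreq family, here is a minimal Python sketch that feeds a shortened version of the same string to the standard library's `urllib.parse.parse_qsl`. It says nothing about how libapreq2 works internally; it only shows that a lenient parser can keep the `&=` pairs as entries with an empty name.

```python
from urllib.parse import parse_qsl

# Shortened form of the request body from the original post.
body = "i1=drnk4&i2=clth12&i3=&=&=&i4=&=&=&action=insert&x=46&y=17"

# keep_blank_values=True keeps pairs whose value is empty, including a bare "=".
pairs = parse_qsl(body, keep_blank_values=True)

# Collect the values of the pairs that arrived without a field name.
unnamed = [value for name, value in pairs if name == ""]

if __name__ == "__main__":
    for name, value in pairs:
        print(repr(name), "=>", repr(value))
    print(len(unnamed), "pairs had an empty name")
```

Parsing continues past the empty names, so `action`, `x` and `y` are still present at the end of the list.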


Are you using a different version?  Or is it the fact that you're
POSTing it?

Clint



Re: dealing with empty field names in query

2009-02-07 Thread Phil Carmody

--- On Sat, 2/7/09, John ORourke  wrote:
> Phil, can you point me to the part of the spec which
> specifies that a field name must begin with an ASCII letter?

http://www.w3.org/TR/html401/types.html#type-cdata
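
The section linked above states the rule for ID and NAME tokens: they must begin with a letter ([A-Za-z]) and may be followed by letters, digits, hyphens, underscores, colons, and periods. Assuming that rule is the one applied to field names (which is the point under discussion in this thread), a check could be sketched in Python as follows; `is_name_token` is a hypothetical helper.

```python
import re

# First character a letter, then any number of letters, digits, "-", "_", ":" or ".".
NAME_TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9_:.\-]*")


def is_name_token(name: str) -> bool:
    """Return True if `name` matches the ID/NAME token rule quoted above."""
    return NAME_TOKEN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("i1", "basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:id", ""):
        print(repr(candidate), is_name_token(candidate))
```

Under this rule the basket field names pass and the empty string does not.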

Phil


  


Re: dealing with empty field names in query

2009-02-06 Thread John ORourke

André Warnier wrote:


In those name/value pairs, according to HTML 4 at least, the names 
must begin with a letter [A-Za-z]. The empty string does not do so. 
Garbage in, garbage out.



+1
+ :
Above the OP is talking about a request "body".  Are we sure that this 
is really a request body, and not a query-string ?

What does the <form> tag really look like ? (enctype)
Just thinking that if this is a query-string, is it not just being cut 
off after a certain size ?
It would not be possible to submit this data as multipart/form-data, 
for a similar reason to what Phil says.




It's regular form data using the post method, so no length issue and 
normal URI parameter rules apply.  I suspect Phil is correct but I can't 
find any mention of this in the HTML 4 spec, which is what prompted my 
question.  Phil, can you point me to the part of the spec which 
specifies that a field name must begin with an ASCII letter?


cheers
John




Re: dealing with empty field names in query

2009-02-06 Thread André Warnier

Phil Carmody wrote:

--- On Fri, 2/6/09, John ORourke  wrote:

We're using more and more javascript to do clever
things with forms,


Lots of people have said that. Probably a majority were wrong.


 and I think we broke the Apache2::Request
parser, but wanted to check before reporting it as a bug. 
(and tell me if this should go to the apreq list)


With the following request body:

i1=drnk4&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aquan
tity=1&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aid=de9a792da0f5127d72d7c6a5f
6b2d4c5&i2=clth12&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aquantity=1&basket
%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aid=7acf9602cd6ab0ee86f77efeaaffefff&i3=&=
&=&i4=&=&=&i5=&=&=&i6=&=&=&action=insert&x=46&y=17

When I create a new Apache2::Request object and loop
through the parameters, I get this: (output from
Data::Dumper of a hash of the params)

 'basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:id'
=> '7acf9602cd6ab0ee86f77efeaaff
efff',

'basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:quantity'
=> '1',
 'basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:id'
=> 'de9a792da0f5127d72d7c6a5f6b2
d4c5',

'basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:quantity'
=> '1',
 'i1' => 'drnk4',
 'i2' => 'clth12',
 'i3' => ''

So it stops parsing when it gets an '=' straight
after an ampersand.

I looked up the spec and it doesn't seem to explicitly
say, so I don't think we should just stop parsing.

Spec:

http://www.w3.org/MarkUp/html-spec/html-spec_8.html#SEC8.2.1


In those name/value pairs, according to HTML 4 at least, the names must begin 
with a letter [A-Za-z]. The empty string does not do so. Garbage in, garbage 
out.


+1
+ :
Above the OP is talking about a request "body".  Are we sure that this 
is really a request body, and not a query-string ?

What does the <form> tag really look like ? (enctype)
Just thinking that if this is a query-string, is it not just being cut 
off after a certain size ?
It would not be possible to submit this data as multipart/form-data, for 
a similar reason to what Phil says.





Re: dealing with empty field names in query

2009-02-06 Thread Phil Carmody

--- On Fri, 2/6/09, John ORourke  wrote:
> We're using more and more javascript to do clever
> things with forms,

Lots of people have said that. Probably a majority were wrong.

>  and I think we broke the Apache2::Request
> parser, but wanted to check before reporting it as a bug. 
> (and tell me if this should go to the apreq list)
> 
> With the following request body:
> 
> i1=drnk4&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aquan
> tity=1&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aid=de9a792da0f5127d72d7c6a5f
> 6b2d4c5&i2=clth12&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aquantity=1&basket
> %3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aid=7acf9602cd6ab0ee86f77efeaaffefff&i3=&=
> &=&i4=&=&=&i5=&=&=&i6=&=&=&action=insert&x=46&y=17
> 
> When I create a new Apache2::Request object and loop
> through the parameters, I get this: (output from
> Data::Dumper of a hash of the params)
> 
>  'basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:id'
> => '7acf9602cd6ab0ee86f77efeaaff
> efff',
> 
> 'basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:quantity'
> => '1',
>  'basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:id'
> => 'de9a792da0f5127d72d7c6a5f6b2
> d4c5',
> 
> 'basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:quantity'
> => '1',
>  'i1' => 'drnk4',
>  'i2' => 'clth12',
>  'i3' => ''
> 
> So it stops parsing when it gets an '=' straight
> after an ampersand.
> 
> I looked up the spec and it doesn't seem to explicitly
> say, so I don't think we should just stop parsing.
> 
> Spec:
> 
> http://www.w3.org/MarkUp/html-spec/html-spec_8.html#SEC8.2.1

In those name/value pairs, according to HTML 4 at least, the names must begin 
with a letter [A-Za-z]. The empty string does not do so. Garbage in, garbage 
out.

Phil
Invalid.


  


dealing with empty field names in query

2009-02-06 Thread John ORourke

Hi mod_perl list,

We're using more and more javascript to do clever things with forms, and 
I think we broke the Apache2::Request parser, but wanted to check before 
reporting it as a bug.  (and tell me if this should go to the apreq list)


With the following request body:

i1=drnk4&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aquantity=1&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aid=de9a792da0f5127d72d7c6a5f6b2d4c5&i2=clth12&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aquantity=1&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aid=7acf9602cd6ab0ee86f77efeaaffefff&i3=&=&=&i4=&=&=&i5=&=&=&i6=&=&=&action=insert&x=46&y=17

When I create a new Apache2::Request object and loop through the 
parameters, I get this: (output from Data::Dumper of a hash of the params)


 'basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:id' => '7acf9602cd6ab0ee86f77efeaaffefff',
 'basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:quantity' => '1',
 'basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:id' => 'de9a792da0f5127d72d7c6a5f6b2d4c5',
 'basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:quantity' => '1',
 'i1' => 'drnk4',
 'i2' => 'clth12',
 'i3' => ''

So it stops parsing when it gets an '=' straight after an ampersand.

I looked up the spec and it doesn't seem to explicitly say, so I don't 
think we should just stop parsing.
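
The three possible reactions to an empty field name (stop, skip, or keep it) can be compared with a small hand-written parser. This is a Python sketch for discussion only; `parse_pairs` and its `on_empty_name` parameter are invented here and do not describe what Apache2::Request does.

```python
from typing import List, Tuple
from urllib.parse import unquote_plus


def parse_pairs(body: str, on_empty_name: str = "keep") -> List[Tuple[str, str]]:
    """Split an urlencoded body into (name, value) pairs.

    on_empty_name: "stop" ends parsing at the first empty name,
    "skip" drops such pairs, "keep" returns them with name "".
    """
    if on_empty_name not in ("stop", "skip", "keep"):
        raise ValueError("unknown policy: " + on_empty_name)
    result: List[Tuple[str, str]] = []
    for chunk in body.split("&"):
        if not chunk:
            continue  # ignore empty segments such as a doubled "&&"
        name, _, value = chunk.partition("=")
        if name == "":
            if on_empty_name == "stop":
                break
            if on_empty_name == "skip":
                continue
        result.append((unquote_plus(name), unquote_plus(value)))
    return result


if __name__ == "__main__":
    sample = "i2=clth12&i3=&=&=&i4=&action=insert"
    for policy in ("stop", "skip", "keep"):
        print(policy, parse_pairs(sample, policy))
```

With the "stop" policy the result ends at `i3`, which resembles the truncated output reported above; "skip" and "keep" both still reach `action`.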


Spec:

http://www.w3.org/MarkUp/html-spec/html-spec_8.html#SEC8.2.1


thanks
John
--