Re: dealing with empty field names in query

2009-02-07 Thread John ORourke

André Warnier wrote:
> "application/x-www-form-urlencoded" is the default, and it means that
> you are passing the form data appended at the end of the URL, preceded
> by a "?" sign, as one long string of the form
> "name1=value1&name2=value2..." etc.,
> usually known as "the query string".
> That is easy to do, but has the drawback that the server does not
> really know in which character set these things are.  This can play
> havoc with internationally-minded applications.
> It can also have the result that the request may be truncated after a
> certain maximum length, by some intervening actor.


Sorry, I have to throw a little exception there: the char set for 
URL-encoded get/post data is defined in 
http://www.w3.org/TR/html401/interact/forms.html#h-17.3 as the value of 
the form element's accept-charset attribute, which defaults to 
"UNKNOWN", and "User agents may interpret this value as the character 
encoding that was used to transmit the document containing this FORM 
element.".  We rely on this in a system used on hundreds of sites in 
various countries for several years, and we have found no exceptions.
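
To illustrate why the charset matters, here is a minimal sketch in Python (not the code used on those sites). It assumes the browser encodes the form data in the charset of the page containing the form, as described above; the sample text is made up for the example.

```python
from urllib.parse import quote, unquote

text = "André"

# What a browser would send if the page holding the form was UTF-8 ...
as_utf8 = quote(text, encoding="utf-8")
# ... and what it would send if the page was ISO-8859-1.
as_latin1 = quote(text, encoding="iso-8859-1")

print(as_utf8)    # Andr%C3%A9
print(as_latin1)  # Andr%E9

# Decoding with the charset the page was served in gives the text back.
assert unquote(as_utf8, encoding="utf-8") == text
assert unquote(as_latin1, encoding="iso-8859-1") == text

# Decoding with the wrong charset gives garbage rather than an error.
wrong = unquote(as_utf8, encoding="iso-8859-1")
print(wrong)  # AndrÃ©
```

So as long as the server knows which charset it served the page in, it can decode what comes back.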


Other than that, I hope our less experienced readers take note of your 
excellent advice.


cheers
John




Re: dealing with empty field names in query

2009-02-07 Thread André Warnier

Clinton Gormley wrote:


> Are you using a different version?  Or is it the fact that you're
> POSTing it?

Sorry for the lecture, but I see this so often that it seems it deserves 
repeating :


To send the content of a <form> to a webserver, you can use either a 
POST or a GET method.
You should use a GET if sending this to the server is not going to 
modify anything on the server, and if re-sending the same request 
several times would always give the same result.

In technical jargon, that is called "idempotent".

You should use POST if it is not the case, in other words if what you 
are sending is going to modify something, and multiple identical 
requests would be "not idempotent".
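
As a rough illustration of the difference, here is a small Python sketch. The in-memory `stock` and `orders` and both handler names are hypothetical, made up for this example only; they stand in for whatever the server really does.

```python
# Hypothetical server-side state, for illustration only.
stock = {"drnk4": 10}
orders = []

def handle_get(item):
    """Read-only lookup: repeating it changes nothing on the server."""
    return stock.get(item)

def handle_post(item, quantity):
    """Places an order: every repeat adds one more order."""
    orders.append((item, quantity))
    stock[item] -= quantity
    return len(orders)

# Sending the same GET twice gives the same answer both times.
assert handle_get("drnk4") == handle_get("drnk4") == 10

# Sending the same POST twice is not the same as sending it once.
first = handle_post("drnk4", 1)
second = handle_post("drnk4", 1)
print(first, second, stock["drnk4"])  # 1 2 8
```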


Neither of the above says how you are passing the data of your request 
to the server, however. That is a separate question, and either way of 
passing it is usable with either method.

This you can also do in two different ways :
- encoded as "application/x-www-form-urlencoded"
- or encoded as "multipart/form-data"

"application/x-www-form-urlencoded" is the default, and it means that 
you are passing the form data appended at the end of the URL, preceded 
by a "?" sign, as one long string of the form 
"name1=value1&name2=value2..." etc.,
usually known as "the query string".
That is easy to do, but has the drawback that the server does not 
really know in which character set these things are.  This can play 
havoc with internationally-minded applications.
It can also have the result that the request may be truncated after a 
certain maximum length, by some intervening actor.
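
For readers who have not seen one up close, this is roughly what such a string looks like when built with Python's standard library. The field names and values, and the example.com URL, are made up for the illustration.

```python
from urllib.parse import urlencode, parse_qsl

# Made-up form fields; the second value contains a space and a "&".
fields = [("name1", "value1"), ("name2", "a b&c")]

encoded = urlencode(fields)
print(encoded)  # name1=value1&name2=a+b%26c

# Appended to a (hypothetical) URL after a "?" sign.
url = "http://example.com/search?" + encoded
print(url)

# The server side reverses the encoding to get the pairs back.
decoded = parse_qsl(encoded)
print(decoded)  # [('name1', 'value1'), ('name2', 'a b&c')]
```

Note that nothing in the string itself says which character set the bytes are in.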


"multipart/form-data" is more complicated and harder to do, and is 
described here :

http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4
but it has the advantage that each of the "name=value" pairs can be as 
long as you want, and that the type of data and encoding of each is clear.
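
To give an idea of the format, here is a hand-rolled Python sketch that builds such a body for plain text fields. The function name `encode_multipart` is hypothetical, and the sketch is deliberately simplified: it does no file uploads and does not escape quotes in field names.

```python
import uuid

def encode_multipart(fields, boundary=None):
    """Build a multipart/form-data body from (name, text value) pairs.

    Simplified sketch: text fields only, names are not escaped.
    """
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields:
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        # Each part can state its own type and encoding.
        lines.append("Content-Type: text/plain; charset=utf-8")
        lines.append("")
        lines.append(value)
    lines.append(f"--{boundary}--")
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    content_type = f"multipart/form-data; boundary={boundary}"
    return content_type, body

content_type, body = encode_multipart([("i1", "drnk4")], boundary="XYZ")
print(content_type)
print(body.decode("utf-8"))
```

The boundary is given explicitly here only to make the output predictable; normally it is a random string that must not appear in any of the values.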


Neither of the above, though, allows sending a "name=value" pair where 
there is no name. And if a name there is, the specs do define what is 
allowed in it, and "" is not among these.
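
A check for this on the receiving side could look something like the Python sketch below. The function name `reject_empty_names` and the sample string are made up for the example; it only looks for pairs with no name and does not check what characters a name contains.

```python
from urllib.parse import parse_qsl

def reject_empty_names(body):
    """Parse urlencoded data, refusing any pair whose name is empty."""
    # keep_blank_values=True so that "i3=" and "=" are not silently dropped.
    pairs = parse_qsl(body, keep_blank_values=True)
    bad = [pair for pair in pairs if pair[0] == ""]
    if bad:
        raise ValueError(f"{len(bad)} pair(s) with an empty field name")
    return pairs

print(reject_empty_names("i3=&action=insert"))

try:
    reject_empty_names("i3=&=&=&action=insert")
    error = None
except ValueError as exc:
    error = str(exc)
print(error)  # 2 pair(s) with an empty field name
```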


Now which combination of the above some clever javascript function may 
decide to use when sending the form content to the server, is another 
matter.

But as Phil rightly said, garbage in, garbage out.
Whether or not the server software can deal with some forms of invalid 
data is rather beside the point. It is certainly not obliged to.


And the request data originally in question here is, without a doubt, 
invalid.


In my opinion thus, the OP should first take whatever measure is 
appropriate to ensure that his application sends only valid data, and 
then come back if there is still a problem.


Re: dealing with empty field names in query

2009-02-07 Thread Clinton Gormley

> With the following request body:
> 
> i1=drnk4&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aquan
> tity=1&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aid=de9a792da0f5127d72d7c6a5f
> 6b2d4c5&i2=clth12&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aquantity=1&basket
> %3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aid=7acf9602cd6ab0ee86f77efeaaffefff&i3=&=
> &=&i4=&=&=&i5=&=&=&i6=&=&=&action=insert&x=46&y=17

When I pass the above as a query string to my site, Apache2::Request
(from libapreq2-2.08) parses it as follows:

--
$APR_Request_Param_Table1 = bless( {
  "="  => '',
  "="  => '',
  "="  => '',
  "="  => '',
  "="  => '',
  "="  => '',
  "="  => '',
  "="  => '',
  action   => 'insert',
  "basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:id"
   => '7acf9602cd6ab0ee86f77efeaaffefff',
  "basket:_new_7acf9602cd6ab0ee86f77efeaaffefff:quantity"
   => 1,
  "basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:id"
   => 'de9a792da0f5127d72d7c6a5f6b2d4c5',
  "basket:_new_de9a792da0f5127d72d7c6a5f6b2d4c5:quantity"
   => 1,
  i1   => 'drnk4',
  i2   => 'clth12',
  i3   => '',
  i4   => '',
  i5   => '',
  i6   => '',
  x    => 46,
  y    => 17
}, 'APR::Request::Param::Table' );
--
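
For comparison only, here is the same request body run through Python's standard-library parser. This is not libapreq2 and says nothing about how it behaves. The sketch assumes the line breaks in the quoted body are just mail wrapping and joins the pieces back together.

```python
from collections import Counter
from urllib.parse import parse_qsl

# The quoted request body, re-joined (assuming the breaks were mail wrapping).
body = (
    "i1=drnk4"
    "&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aquantity=1"
    "&basket%3A_new_de9a792da0f5127d72d7c6a5f6b2d4c5%3Aid=de9a792da0f5127d72d7c6a5f6b2d4c5"
    "&i2=clth12"
    "&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aquantity=1"
    "&basket%3A_new_7acf9602cd6ab0ee86f77efeaaffefff%3Aid=7acf9602cd6ab0ee86f77efeaaffefff"
    "&i3=&=&=&i4=&=&=&i5=&=&=&i6=&=&="
    "&action=insert&x=46&y=17"
)

# keep_blank_values=True keeps pairs such as "i3=" and "=".
pairs = parse_qsl(body, keep_blank_values=True)
counts = Counter(name for name, _ in pairs)

print(len(pairs))   # 21
print(counts[""])   # 8 pairs with an empty name
print(dict(pairs)["action"])  # insert
```

The count of empty-name pairs matches the eight entries in the dump above.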


Are you using a different version?  Or is it the fact that you're
POSTing it?

Clint



Re: dealing with empty field names in query

2009-02-07 Thread Phil Carmody

--- On Sat, 2/7/09, John ORourke wrote:
> Phil, can you point me to the part of the spec which
> specifies that a field name must begin with an ASCII letter?

http://www.w3.org/TR/html401/types.html#type-cdata

Phil