Re: form parameters

André Warnier Fri, 20 Mar 2009 01:20:10 -0700

Joseph Millet wrote:

Maybe I'm missing something, but from the little knowledge I have, I'd
think an HTML form is posted encoded in the charset of the enclosing HTML
document, as specified in the headers sent by the server. So if you
serve a page encoded in iso-8859-2, you wouldn't expect a form
present in that page to post Unicode data, would you?

Maybe we need to restate the issue a bit differently.
Imagine a website on which there is a starting page with 3 links:
- formA.html
- formB.html
- formC.html
Each of these is an HTML page containing a tag '<form method="POST">'.

Now 3 users, each at his workstation, obtain this starting page from the server. Then userA clicks on the link to formA.html and obtains the corresponding page.

Similarly, userB clicks on the second link, and so on.

The users fill in their respective forms and submit them to the server (in any order).

The process on the server which handles the first submission (whether it is a servlet in Tomcat or a cgi-bin under httpd doesn't matter) has no idea where the submitted data comes from, right? (It could even come from a page obtained from another server.) So the process in question has to evaluate this data based only on what it gets in this specific POST.

What we are discussing here is how, based only on the data coming in from the browser POST, the server process determines the correct character encoding of what it receives. And the answer so far is that it basically cannot be sure, because the browser does not send enough information with the POST to allow the server process to determine this unambiguously.
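
To make the ambiguity concrete, here is a small sketch in Python (chosen only for brevity; the same applies to a servlet or a cgi-bin). The field name `name` and the body `name=%B1` are made up for illustration. The same POST body decodes to different text depending on which charset the server guesses, and nothing in the body says which guess is right:

```python
from urllib.parse import parse_qs

# Hypothetical urlencoded POST body: one field whose value is the single byte 0xB1.
body = "name=%B1"

# Nothing in the body itself says which charset the browser used,
# so every guess below decodes without raising an error.
as_latin1 = parse_qs(body, encoding="iso-8859-1")["name"][0]
as_latin2 = parse_qs(body, encoding="iso-8859-2")["name"][0]
as_utf8 = parse_qs(body, encoding="utf-8", errors="replace")["name"][0]

print(repr(as_latin1))  # plus-minus sign, U+00B1
print(repr(as_latin2))  # a with ogonek, U+0105
print(repr(as_utf8))    # U+FFFD, since 0xB1 alone is not valid UTF-8
```

All three results are "valid" from the server's point of view; only knowledge from outside the POST can tell which one the user actually typed.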

Of course, if the server process is sure that the form originally came from itself, and that all the forms composing this application are defined such that the browser *should* always encode the data in a specific way, then the process could reasonably assume a charset and encoding. But if one of the users uses a non-compliant browser that does not care a jot about what the HTML is telling it to do, then ...
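
A minimal sketch of such a "reasonable assumption", again in Python for brevity. The function name `decode_form` and the parameter `assumed` are hypothetical, and the default of iso-8859-2 is only an example of what an application might have agreed on for its own forms. An explicit charset in the Content-Type header is trusted if one happens to be present; otherwise the assumption is applied:

```python
from email.message import Message
from urllib.parse import parse_qs

def decode_form(body: bytes, content_type: str, assumed: str = "iso-8859-2") -> dict:
    """Decode a urlencoded POST body, using the charset declared in the
    Content-Type header if there is one, else the application's assumption."""
    header = Message()
    header["Content-Type"] = content_type
    charset = header.get_content_charset() or assumed
    # A compliant urlencoded body is pure ASCII; the real bytes hide behind %XX.
    return parse_qs(body.decode("ascii"), encoding=charset)

# No charset declared: the assumption (iso-8859-2) is used.
assumed_case = decode_form(b"name=%B1", "application/x-www-form-urlencoded")

# Charset declared explicitly: the declaration wins over the assumption.
declared_case = decode_form(
    b"name=%C4%85", "application/x-www-form-urlencoded; charset=UTF-8"
)
```

Both calls yield the same letter (a with ogonek). But if a browser sends the UTF-8 bytes `%C4%85` without declaring anything, the assumption silently produces the wrong characters, which is exactly the weak spot described above.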

A separate but connected question is that current browsers do not seem to follow the HTML specifications entirely: even for multipart/form-data submissions, they do not send the charset/encoding headers that would enable the server to know for sure, although they should.
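
For illustration, this is roughly what the server could do if each part did carry such a header. The sketch uses Python's generic MIME parser; the helper name `part_charsets`, the boundary `XYZ` and the field names `a` and `b` are invented for the example. Field `a` declares its charset, field `b` does not:

```python
from email.parser import BytesParser

def part_charsets(content_type: str, body: bytes) -> dict:
    """Map each form field name to (raw bytes, declared charset or None)."""
    # Prepend the request's Content-Type so the MIME parser can find the boundary.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = BytesParser().parsebytes(raw)
    fields = {}
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        fields[name] = (part.get_payload(decode=True), part.get_content_charset())
    return fields

# Hypothetical multipart/form-data body: both fields carry the same byte 0xB1.
body = (
    b'--XYZ\r\n'
    b'Content-Disposition: form-data; name="a"\r\n'
    b'Content-Type: text/plain; charset=iso-8859-2\r\n'
    b'\r\n'
    b'\xb1\r\n'
    b'--XYZ\r\n'
    b'Content-Disposition: form-data; name="b"\r\n'
    b'\r\n'
    b'\xb1\r\n'
    b'--XYZ--\r\n'
)
fields = part_charsets("multipart/form-data; boundary=XYZ", body)
```

For field `a` the server gets a declared charset and can decode the bytes with certainty. For field `b` the charset comes back as `None`, and the server is guessing again.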


To go back to your note above:

It is true that the browser, in the absence of other information, SHOULD submit the data in the encoding of the page containing the <form>. This /can/ be changed by using the "accept-charset" attribute of the <form> tag. However, even if that is true, and if the browser follows the specifications in that respect and does encode the data properly, it does not change what I mention above: the server is still really in the dark about what it gets.



