On Tue, Jan 14, 2003 at 10:26:40AM -0000, Ivor Williams wrote:
> I've been reviewing what I have learned from the LWP stuff I have been doing 
> for Grubstreet. This involved form filling and POST method.
> 
> I was struck with the feeling that boiling down the form data is something that 
> has probably been done many times over - but a search didn't find anything 
> obvious.

Yes, I'm sure it has been. That doesn't mean that anyone ever released it to
CPAN, though. :)

Between us, london.pm must have written $LARGE_NUMBER of templating systems,
but only a couple actually made it out. Or how many reasonably generic 
CGI-as-dispatch systems were written back in the heady days of dotComNonsense?
 
> From the existing code that I have written, is a sub formdata, which takes the 
> HTML page and form name as parameters. The form name is optional; if no form 
> name is specified, the routine picks up the first form on the page.
>
> The sub returns a list of key/value pairs. Thinking about it, I realised that 
> if the calling code turns it into a hash, this could lose any duplicate keys.

Except for checkboxen (and possibly even then), duplicate keys on forms are a 
very bad idea. One way I've seen them go horribly wrong is with a couple of 
URL-compression / cookie-caching proxies as used on The Portal Which Shall
Not Be Named. The idea of these doohickeys is to improve access for phones,
and other brain-dead clients, by shortening the (RFC-breaking) huge GET URLs
that portals always seem to end up with, and by keeping cookies on the server
side.   

By this stage, I'm sure people have guessed how good the hashing and storage
of these little sweethearts is. As soon as you start pointing them at forms
with duplicate keys, they start dropping data on the floor. 
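
To make the duplicate-key problem concrete, here is a rough sketch. It is in
Python rather than Perl, purely for illustration, and the field names and the
group_pairs helper are made up for the example. It shows how collapsing a list
of key/value pairs into a hash silently keeps only the last value, while
encoding the pair list directly keeps everything.

```python
from urllib.parse import parse_qsl, urlencode

# Hypothetical form data, as an ordered list of key/value pairs.
pairs = [("name", "ivor"), ("topping", "cheese"), ("topping", "ham")]

# Collapsing to a dict keeps only the last value seen for each key,
# so one of the two "topping" values is lost.
as_dict = dict(pairs)

# Encoding the pair list itself preserves every value, in order.
body = urlencode(pairs)


def group_pairs(pairs):
    """Group values by key without losing duplicates (hypothetical helper)."""
    grouped = {}
    for key, value in pairs:
        grouped.setdefault(key, []).append(value)
    return grouped
```

If the caller really wants a hash, something like group_pairs (a list of
values per key) is one way to avoid dropping data, though it still loses the
relative ordering of different keys.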

The other thing I always try and keep in mind is nested forms. They are a very
good way to see how sturdy your application or web-snarfer really is. They're
invalid, nasty, rather more common than people seem to realise and have a 
horrible habit of making otherwise well-behaved web-tech spazz out.
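
For what it's worth, here is a minimal sketch of one way a form scraper might
cope with nested forms, again in Python for illustration only. The FormScanner
class is hypothetical, and formdata here merely borrows the name from your
message; it is not your code. It assumes only <input> fields matter (no
select or textarea handling), and its policy of counting a nested <form> and
carrying on collecting into the outer form is just one possible choice.

```python
from html.parser import HTMLParser


class FormScanner(HTMLParser):
    """Collect (name, value) pairs per form and count nested <form> tags."""

    def __init__(self):
        super().__init__()
        self.forms = []     # list of (form_name, [(key, value), ...])
        self.nested = 0     # number of <form> tags seen inside an open form
        self._open = False  # are we currently inside a form?

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            if self._open:
                # Invalid HTML: note it, keep collecting into the outer form.
                self.nested += 1
                return
            self._open = True
            self.forms.append((attrs.get("name"), []))
        elif tag == "input" and self._open and attrs.get("name"):
            # Keep pairs as a list so duplicate keys survive.
            self.forms[-1][1].append((attrs["name"], attrs.get("value") or ""))

    def handle_endtag(self, tag):
        if tag == "form":
            # The first </form> closes whatever form is open.
            self._open = False


def formdata(html, form_name=None):
    """Return (pairs, nested_count) for the named form, or the first form."""
    scanner = FormScanner()
    scanner.feed(html)
    scanner.close()
    for name, pairs in scanner.forms:
        if form_name is None or name == form_name:
            return pairs, scanner.nested
    return [], scanner.nested
```

The point is less the particular policy than that the scraper notices the
nesting and reports it, rather than falling over or quietly misassigning
fields.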

I've even seen a (Java) example where if you parse a document containing nested
forms as XML, all appears to be well, and then if you cast it back to an HTML-type 
object, the JVM crashes. 

Luverly. 
 
> At this point, the light of recognition came on in my mind. This was a very 
> familiar concept, that of a CGI object.
> I could make formdata return a CGI object or something inheriting from CGI, 
> giving access to all the input fields via $form->param. Besides being capable 
> of being submitted via a normal POST of encoding type 
> application/x-www-form-urlencoded, I would also like the code to be able to 
> handle file uploads and encoding type multipart/form-data.

Please don't make it return a CGI object. CGI.pm is a nasty, bloated piece of
crap which should have been retired years ago. It uses what is now very
non-standard technology, it leaks memory under mod_perl and in many ways is
an example of how not to build a module.

Ben 
