Re: Form.pm (next installment of code review)

Ovid Fri, 05 Jul 2002 07:54:35 -0700

Hi David,

While I raise quite a few issues here, I have to say that I've rarely seen
anyone present a CGI.pm alternative that supports uploads.  I'm quite
impressed, so keep that in mind while I pick it apart :)


I'll ignore anything I may have covered in my discussion about the previous 
subroutine.  Further, there are quite a few thing that I would have done 
differently, but I'll try to focus specifically on those issues that I 
consider bugs (and I'll skip some issues that I *think* are problems but 
I'd prefer to research further before shooting my mouth off :).  I also won't 
bother on commenting on issues that I've seen you mention (DOS attacks,
for instance).

    sub parse_mform_input($) {
        my $input = shift;
        my (%form_vars, @name_file, @list, %var_count);
        if( $ENV{'CONTENT_TYPE'} =~ /multipart\/form-data; boundary=(.+)$/ ) {

            # snip

        } else { # if it is not mime
            return(parse_form_input($input)); # call the non-mime parser
        }
    }

Either you are parsing multipart/formdata (MFD) or you are not.  Your sub name
implies that this is for MFD, but then goes on to handle non-MFD.  Don't throw
curve balls at programmers :)

    if( $ENV{'CONTENT_TYPE'} =~ /multipart\/form-data; boundary=(.+)$/ ) {
        $boundary = '--'.$1;  # Using MIME to split out the form elements.

It turns out that IE 3.01 on Macintosh leaves off the initial '--' on the 
boundary.  Mac users with this version of IE won't be able to upload files.  
This may not seem like much of an issue, but it's important to realize that 
there can be numerous OS/server/browser issues and for a CGI form handling 
routine to be truly robust, it should attempt to address as many of these 
issues as it can.  If you want your code to be accepted for this, it has to 
be robust.  You'll get mysterious sporadic failures that can be very hard to
debug if you don't make sure to handle all of the special cases.

Further, if someone using your code notices odd, intermittant failures that
they don't get with CGI.pm, they'll probably *have* to go with CGI.pm.  Fast 
code that doesn't work almost always loses to slow code that does work.

    # split out each form variable into @list
    @list = split(/$boundary/, $input);

I see you're just going ahead and processing the data without any sanity
checks.  What happens if someone uploads a large file and hits "Esc" about
halfway through?  Does every Web server check this?  If your's doesn't,
$ENV{CONTENT_LENGTH} is not going to match the length of the data.  You need to
check this to ensure that you're not processing "partial" data and thereby
risking some data corruption.

    foreach $listitem (@list){

Again, you appear to not be using strict.

    # if the data has a filename in it then
    if ($listitem =~ / name=\"(.*)\"; filename=\"(.*?)\"[\r\n]{2}/){

I see some a bad regex here.  Items:

1.  I understand that some user agents do not wrap the values in quote marks.
Also, some versions of Netscape don't escape quotes in filenames.

2.  The dot star in your regex is bad.  Read "Death to Dot Star!" for more
information.  http://www.perlmonks.org/index.pl?node_id=24640.  This can
cause slow regexes, or worse, buggy ones.  

3.  You need to define the CRLF sequence.   What is the CRLF sequence that the
system that *this* version of your program is using?  Not every system is going
to use the same one.  Read through CGI.pm and see how it's handled there.  You
can search on $CRLF.

    # if the file name is not empty now
    if ($form_file{'file_name'} ne "") {
        # if the file is larger than 0 bytes
        if (length($form_file{'file_content'}) > 0){
            # creating a garanteed unique file name

Rather than going through all of you code to guarantee a unique name (which
will be thrown away anyway), why not just do some variation of this:

  use Digest::MD5;
  my $temp_file;
  do {
      $temp_file = Digest::MD5::md5_hex( $ENV{REMOTE_ADDR}, rand() );
  } until ( ! -e $upload_dir.$temp_file );

Much cleaner and easier to understand, though I confess that I was commenting
on style and not a bug, per se.

Also, I think you are overcommenting.  Your last snippet testing that the
filename isn't empty and the content length is greater than zero are good
examples.

Much of the rest of the code has issues similar to what I've outlined above.  I
would also strongly recommend breaking your subroutines up into much smaller
chunks.  They're easier to understand that way.

Cheers,
Curtis "Ovid" Poe

=====
"Ovid" on http://www.perlmonks.org/
Someone asked me how to count to 10 in Perl:
push@A,$_ for reverse q.e...q.n.;for(@A){$_=unpack(q|c|,$_);@a=split//;
shift@a;shift@a if $a[$[]eq$[;$_=join q||,@a};print $_,$/for reverse @A

__________________________________________________
Do You Yahoo!?
Sign up for SBC Yahoo! Dial - First Month Free
http://sbc.yahoo.com

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Form.pm (next installment of code review)

Reply via email to