Re: Subroutine returning a hash (?)

Curtis Poe Tue, 03 Jul 2001 07:06:36 -0700
--- Martijn van Exel <[EMAIL PROTECTED]> wrote:
> I was in a modular mood and thought it might just be a good idea to 
> do the parsing of the querystring in a subroutine. The subroutine 
> goes into a separate file that I then 'require' in all my future 
> CGI-scripts (is this a good way?).

In theory, yes, this is a good way.  In practice, from the code snippet below, I see 
that there
are a variety of issues that you may wish to be familiar with.
 
  sub parseit
  {
      foreach $name ($q->(param))
      {
          $input{$name} = $q->param($name);
      }
      return \%input;
  }

1.  You do not appear to be using 'strict'.

You will hear this many, many times as you work with Perl.  For smaller programs, it 
may not seem
like an issue, but as your programs get larger, finding out that it silently fails to 
work because
you misspelled "&getfieldnames" as "&getfieldsnames" is going to be increasingly 
frustrating. 
Personally, I type rather fast when I'm programming.  That means that I am prone to 
many typos. 
'strict' will catch most of those.  To make this snippet 'strict' safe, change it to:

  sub parseit
  {
      my %input;
      foreach my $name ($q->param)
      {
          $input{$name} = $q->param($name);
      }
      return \%input;
  }

2.  I'm surprised that no one has mentioned this problem:  the names in a query string 
can have
multiple values.  For example, have you ever joined a Web site or signed up for a 
mailing list and
seen the multiple checkboxes that allow you to select which "vendor" you want to 
receive spam
from?  Often, in the HTML, those checkboxes have the same name, but different values.  
Consider
the following tags:

  <input type="checkbox" name="vendor" value="Acme">
  <input type="checkbox" name="vendor" value="Generic">

If *both* of the above checkboxes are checked, the query string sent with them will be 
something
like:

  vendor=Acme&vendor=Generic

Your code, however, will only return the first instance of each.  That's because 
CGI::param, if
called in scalar context (i.e., the left side of the '=' has a scalar value to which 
you are
assigning the result) will only return the first value of the appropriate param.  If 
it's called
in array context, it will return all values.  Personally, I like to use an anonymous 
array in the
hash to handle this.  The following code should fix that:

  sub parseit
  {
      my %input;
      foreach my $name ($q->param)
      {
          my @values = $q->param( $name );
          if ( scalar @values == 1 )
          {
              # only one value, accessed the old way
              $input{$name} = $q->param($name);
          }
          else
          {
              # multiple values
              $input{$name} = [$q->param($name)];
          }
      }
      return \%input;
  }
  
Then, you can access the value of "vendor" as follows (in the parseit() routine):

  $input{ 'vendor' }->[0];  # This is 'Acme'
  $input{ 'vendor' }->[1];  # This is 'Generic'

Other values are accessed with the original methods:

  my $color = $input{'color'};

3.  Your routine uses a global '$q' as the reference to the CGI object.  Unless that 
$q was
declared in your package as a global variable or a lexically defined variable in the 
same block as
your &parseit routine, the routine will not recognize it.  A cleaner way is to pass 
this variable
to the routine.  Here's what I would do to finish cleaning it up (with formatting that 
I am more
comfortable with):


  sub parseit {
      my $q = shift;
      my %input;
      foreach my $name ($q->param) {
          my @values = $q->param( $name );
          if ( scalar @values == 1 ) {
              # only one value, accessed the old way
              $input{$name} = $q->param($name);
          } else {
              # multiple values
              $input{$name} = [$q->param($name)];
          }
      }
      return \%input;
  }

However, if this routine is exported into the calling code's namespace, then $q may 
very well be
visible to this code and the above change (my $q = shift) wouldn't be necessary.  
Here's the code
that would get placed at the top of your module to allow that:

  use strict;
  use Exporter;

  use vars qw/ @ISA @EXPORT /;
  @ISA = qw/Exporter/;
  @EXPORT = qw( parseit );

By placing those lines at the top, you can have the calling code see parseit directly.

This, however, is often considered bad practice.  Skipping over the detail and 
switching to the
more pressing problem:  this may lead you to think that you can just add the following 
line at the
top of your programs:

  my $q = CGI->new;

Then, you shouldn't have to call the &parseit routine with the CGI object as the 
argument.  But
what happens if you ever need to change the name of the CGI object?  You either will 
have to write
up some duplicate code to &parseit, with a different variable name, or you will have 
to change
&parseit.  The first idea is bad because you want to *reuse* code as much as possible. 
Duplicating it means that more than one piece of code must be kept in synch.  The 
second solution
has the problem that if you change the name of the $q variable, all other code that 
uses this
library is broken.  By passing in the CGI object as an argument and returning a 
reference to a
hash, no code outside of he routine is forced to adopt particular variable names.  
This makes your
code much more flexible.

Again, for small programs, much of this advice isn't overly important.  However, for 
large ones,
this advice is invaluable.

Cheers,
Curtis

=====
Senior Programmer
Onsite! Technology (http://www.onsitetech.com/)
"Ovid" on http://www.perlmonks.org/

__________________________________________________
Do You Yahoo!?
Get personalized email addresses from Yahoo! Mail
http://personal.mail.yahoo.com/
Re: Subroutine returning a hash (?)

Reply via email to