Re: Perl best practices

2007-09-17 Thread Ben Scott
On 9/13/07, Paul Lussier <[EMAIL PROTECTED]> wrote:
> For all those just tuning in, Ben and I are in violent and vocal
> agreement with each other, and at this point are merely quibbling over
> semantics :)

  Hey, now, I came here for an argument, and I demand to get one!  And
it better be more than just simple contradiction, too.  I expect a
connected series of statements intended to establish a proposition!
;-)

  (And they say Perl and Python have nothing in common.  ;-)  )

> To avoid LTS and backslashitis in a regexp, I tend to do something like:
>    s|/foo/bar|/bar/baz|g;

  The pipe (|) is another often-used regexp syntax character, though.
If I was going to use a single alternate character for m//, I'd prolly
use a bang (!).  But I like balanced pairs -- {} or [] or <> or ()
-- because they nicely identify the start and end of the thing, rather
than simply delimiting it.  And if it's a substitution, they let you
separate the pattern from the replacement with whitespace.  I
frequently use that to line things up for visual structure (e.g., that
condense_type() function I posted).
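
  A minimal sketch of the delimiter styles being compared here.  The
sample path and the variable names are made up for illustration; only
the three delimiter styles come from the discussion.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the original script.
my $slashes = "/foo/bar/index.html";
my $bang    = $slashes;
my $braces  = $slashes;

# Default delimiters: every slash in the pattern needs a backslash (LTS).
$slashes =~ s/\/foo\/bar/\/bar\/baz/;

# Single alternate delimiter: no escaping needed.
$bang =~ s!/foo/bar!/bar/baz!;

# Balanced pairs: pattern and replacement are each bracketed, and
# whitespace is allowed between the two parts.
$braces =~ s{/foo/bar} {/bar/baz};

print "$slashes\n$bang\n$braces\n";
```

  All three substitutions should produce the same string; the only
difference is how much the reader has to untangle.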

> Agreed, though, we ... do things like this slightly differently,
> completely avoiding the $_ dilemma:

  That actually wouldn't work for the code I posted, which is
performing multiple transformations on the same string repeatedly.
The fact that you missed that concerns me.  *That's* a sign of
inadequate clarity.  Looking at the code, I don't see a way to make
that clearer in the code itself, so I guess it needs a comment to that
effect.  Phooey.  (In my thinking, ideal code needs *no* comments,
because it's so obvious what's going on.  Obviously, that's a
theoretical ideal, and not something that happens in practice, but I
think it makes a good goal.)

  Unrelated to the above, I think there is an observation to be made
here about "sophisticated" solutions also adding complexity.  KISS, as
the saying goes.  By introducing variables and loops and control
structures, you're making things more complex, to no gain that I can
see.  A simple list of s/// operations does the same thing, and does
not require any additional cognition on the part of the reader.  (It
might even be more efficient at runtime, although that would depend on
how smart Perl's optimizer is.)
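
  To make the comparison concrete, here is a rough sketch of both
shapes, using an abbreviated subset of the condense_type() rules.  The
sub names condense_with_list() and condense_with_table() are
hypothetical, and the table version uses an ordered array (not a hash)
so the rules still run in sequence on the same string.

```perl
use strict;
use warnings;

# Plain list of substitutions, each applied to the same string in turn.
sub condense_with_list {
    local $_ = $_[0];
    s{^text/html$}     {html};
    s{^text/}          {};
    s{^application/}   {};
    s{^octet-stream$}  {binary};
    s{^x-javascript$}  {jscript};
    return $_;
}

# The same rules as data.  Order matters, so this is an array of pairs.
my @rules = (
    [ qr{^text/html$},    'html'    ],
    [ qr{^text/},         ''        ],
    [ qr{^application/},  ''        ],
    [ qr{^octet-stream$}, 'binary'  ],
    [ qr{^x-javascript$}, 'jscript' ],
);

sub condense_with_table {
    my ($type) = @_;
    for my $rule (@rules) {
        my ($pattern, $replacement) = @$rule;
        $type =~ s/$pattern/$replacement/;
    }
    return $type;
}

for my $type ('text/html', 'text/richtext',
              'application/octet-stream', 'application/x-javascript') {
    printf "%-26s %-8s %s\n",
        $type, condense_with_list($type), condense_with_table($type);
}
```

  Note how "application/octet-stream" only becomes "binary" because the
^application/ rule has already run; a first-match lookup would stop too
early.  Both versions should give the same answers, but the table needs
a loop, a dereference, and two extra variables to get there.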

> ... by we, I mean the company which currently puts food on my table ...

  I didn't know Stop and Shop used Perl...  ;-)

> Once readability has been achieved, the next
> priority ought to be future maintenance and extensibility, IMO.

  Yup.

> Alas, assertNumArgs() is a part of a huge
> home-grown library of routines we have.

  Everybody has these.  They're extremely useful.

>>> The compelling argument is this: It should be blatantly obvious to
>>> whomever is going to be maintaining your code in 6 months what you
>>> were thinking
>>
>>   I do not think I could agree with you more here.  The thing you seem
>> to be ignoring in my argument is that "clarity" is subjective and
>> often depends on context.  :)
>
> I'm not ignoring it.  I'm saying that where you have stopped because
> you think it is sufficiently clear can in fact be made cleaner and
> clearer for the sakes of both clarity and future maintenance.

  No, I'm saying that clarity is subjective and depends on context,
and what you deem "clearer" I deem "less clear".

> There are only a finite number of options for any given command
> [such as tar].  The same is not true for writing perl code.

  True, but I think my point still stands.  Verbosity does not equal
clarity, and indeed, verbosity sometimes interferes with readability.

  (My choice of tar was, perhaps, a bad example, because I do agree
with your argument WRT your AMANDA example.)

-- Ben
___
gnhlug-discuss mailing list
gnhlug-discuss@mail.gnhlug.org
http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/


Re: Perl best practices

2007-09-14 Thread Paul Lussier
Lloyd Kvam <[EMAIL PROTECTED]> writes:

> On Thu, 2007-09-13 at 23:58 -0400, Paul Lussier wrote:
>> For all those just tuning in, Ben and I are in violent and vocal
>> agreement with each other, and at this point are merely quibbling over
>> semantics :) 
>
> As an old Python guy who knows just enough Perl to get it wrong, this
> has been educational (and even fun).

I'm glad you enjoyed it!  That's why I wrote it.  I feel that perl has
a bad reputation for being "ugly".  I believe it's a myth.  One can
write ugly code in any language.  Python went so far as to put white
space restrictions in place to "enforce" clean code.  While that may work
to some extent, I've seen ugly python code.

I, by no means, claim my way is the best or only way.  I don't even
claim what I posted will work completely correctly.  I was merely
trying to demonstrate that there are better ways to write any code
such that it is both elegant and pleasing to look at as well as
maintain and extend.

I would love to see others' interpretations of Ben's code, in any language.

How would you accomplish the same in awk, python, ruby, java, C, etc.?

-- 
Seeya,
Paul


Re: Perl best practices

2007-09-14 Thread Paul Lussier
[EMAIL PROTECTED] (Kevin D. Clark) writes:

> Bill Ricker writes:
>
>> I highly recommend Damian Conway's book of same title, "Perl Best
>> Practices", which recommends a much tamer, consistent readable style
>> within a workgroup than he uses in his own code (depending on context)
>> -- he suggests one style but encourages each group to decide for
>> themselves and take his list of 255 practices as a template for their
>> own local standard.
>
> I will second this.  Great book by a great author.

Ahm, fellas, we've got a fence-post error here :)

This response to Ben is what started it all:

  Paul Lussier <[EMAIL PROTECTED]> writes:

  > "Ben Scott" <[EMAIL PROTECTED]> writes:
  >
  >>   You don't need to put parentheses around arguments to split, and you
  >> don't need to explicitly specify the default pattern match target
  >> ($_).
  >
  > Unfortunately, you both "don't *need* to" and "*can* do" anything in
  > perl.  Often at the same time!  This is what leads to very difficult
  > to read, and more difficult to maintain perl code.
  [...]
  > I highly recommend Damian Conway's book "Perl Best Practices", which
  > outlines these and many other useful pieces of advice.

So, Kevin is really thirding ;)
-- 
Seeya,
Paul


Re: Perl best practices (was: question ... Split operator in Perl)

2007-09-14 Thread Kevin D. Clark

Bill Ricker writes:

> I highly recommend Damian Conway's book of same title, "Perl Best
> Practices", which recommends a much tamer, consistent readable style
> within a workgroup than he uses in his own code (depending on context)
> -- he suggests one style but encourages each group to decide for
> themselves and take his list of 255 practices as a template for their
> own local standard.

I will second this.  Great book by a great author.

Regards,

--kevin
-- 
GnuPG ID: B280F24E  Error messages
alumni.unh.edu!kdc  strewn across my terminal.
A vein starts to throb.
   -- Coy.pm


Re: Perl best practices (was: question ... Split operator in Perl)

2007-09-14 Thread Bill Ricker
I highly recommend Damian Conway's book of same title, "Perl Best
Practices", which recommends a much tamer, consistent readable style
within a workgroup than he uses in his own code (depending on context)
-- he suggests one style but encourages each group to decide for
themselves and take his list of 255 practices as a template for their
own local standard.

http://www.oreilly.com/catalog/perlbp/

-- 
Bill
[EMAIL PROTECTED] [EMAIL PROTECTED]


Re: Perl best practices

2007-09-14 Thread Lloyd Kvam
On Thu, 2007-09-13 at 23:58 -0400, Paul Lussier wrote:
> For all those just tuning in, Ben and I are in violent and vocal
> agreement with each other, and at this point are merely quibbling over
> semantics :) 

As an old Python guy who knows just enough Perl to get it wrong, this
has been educational (and even fun).

Thanks.

-- 
Lloyd Kvam
Venix Corp



Re: Perl best practices

2007-09-13 Thread Paul Lussier

For all those just tuning in, Ben and I are in violent and vocal
agreement with each other, and at this point are merely quibbling over
semantics :)

"Ben Scott" <[EMAIL PROTECTED]> writes:

>   Er, yes.  "blah" in this case was meta-syntactic, and I was still
> thinking of the first example in this discussion, which had LTS
> (Leaning Toothpick Syndrome).  I will use // if the regexp doesn't
> suffer from LTS.  I use m{} or s{}{} when the regexp otherwise
> contains slashes.

Something about the use of {} and () in regexps really bothers me.  I
think it's because in general, perl overloads too many things to begin
with.  To use {} for regexp delimiting is confusing and completely
non-intuitive to me. They are meant to denote either a hash element or
a code block.  Trying to make my mind use them for regexps hurts :)

To avoid LTS and backslashitis in a regexp, I tend to do something like:

   s|/foo/bar|/bar/baz|g;

The | is close enough to / that it's instantly clear to me.

> "A foolish consistency is the hobgoblin of little minds."  -- Ralph
> Waldo Emerson

Yeah, what the old, dead guy said :)

> As a completely non-contrived example, here is an illustration of
> when I think implicit use of $_ is very appropriate.
[...]
> sub condense_type($) {
>     # condense a MIME content type to something shorter, for people
>     $_ = $_[0];
>     s{^text/plain$}           {text};
>     s{^text/html$}            {html};
>     s{^text/css$}             {css};
>     s{^text/javascript$}      {jscript};
>     s{^text/xml$}             {xml};
>     s{^text/}                 {};
>     s{^image/.*}              {image};
>     s{^video/.*}              {video};
>     s{^audio/.*}              {audio};
>     s{^multipart/byteranges}  {bytes};
>     s{^application/}          {};
>     s{^octet-stream$}         {binary};
>     s{^x-javascript$}         {jscript};
>     s{^x-shockwave-flash$}    {flash};
>     s{\*/\*}                  {stars};    # some content gets marked */*
>     return $_;
> }
>
> I could have used a regular named variable (say, $type) and
> repeated "$type =~" over and over again for 14 lines.  I believe
> that would actually harm the readability of the code.

Agreed, though, we (by we, I mean the company which currently puts
food on my table :) do things like this slightly differently,
completely avoiding the $_ dilemma:

  sub condense_type {
    my $match = shift;
    my %mimeTypes =
      ('^text/plain$'          => "text",
       '^text/html$'           => "html",
       '^text/css$'            => "css",
       '^text/javascript$'     => "jscript",
       '^text/xml$'            => "xml",
       '^text/'                => "",
       '^image/.*'             => "image",
       '^video/.*'             => "video",
       '^audio/.*'             => "audio",
       '^multipart/byteranges' => "bytes",
       '^application/'         => "",
       '^octet-stream$'        => "binary",
       '^x-javascript$'        => "jscript",
       '^x-shockwave-flash$'   => "flash",
       '\*/\*'                 => "stars",   # some content gets marked */*
      );

    # return the replacement for the first pattern that matches
    foreach my $mtype (keys %mimeTypes) {
      if ($match =~ /$mtype/) {
        return $mimeTypes{$mtype};
      }
    }
  }

Also, the foreach could be written as:

  map { ( $match =~ /$_/) && return $mimeTypes{$_}} keys %mimeTypes

Though I find this completely readable, it suffers from the problem
that it's not easily extensible.  If you decide you need to do more
processing within the loop, the foreach is much easier to extend.  You
just plonk another line in there and operate on the already existing
variables.  With the map() style loop, this becomes more difficult.

So, though I love map(), I would have to argue this is not the best
place to use it.  Once readability has been achieved, the next
priority ought to be future maintenance and extensibility, IMO.
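
A rough sketch of that trade-off, under some assumptions of my own: the
rule list is an abbreviated, ordered array of pairs rather than the hash
above, the sub names are hypothetical, and the compact version uses
first() from the core List::Util module instead of map().  Both are
first-match lookups.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Ordered [pattern, label] pairs; an abbreviated, hypothetical subset.
my @mime_types = (
    [ qr{^text/plain$}, 'text'  ],
    [ qr{^text/html$},  'html'  ],
    [ qr{^image/},      'image' ],
);

# foreach version: easy to add another statement inside the loop later.
sub lookup_foreach {
    my ($match) = @_;
    foreach my $pair (@mime_types) {
        my ($pattern, $label) = @$pair;
        return $label if $match =~ $pattern;
    }
    return undef;    # no rule matched
}

# first() version: compact, but there is no obvious place to add a step.
sub lookup_first {
    my ($match) = @_;
    my $pair = first { $match =~ $_->[0] } @mime_types;
    return $pair ? $pair->[1] : undef;
}

print lookup_foreach('image/png'), " ", lookup_first('text/html'), "\n";
```

If the loop later needs logging or a counter, the foreach body takes one
more line; the first() block would have to be restructured.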

>   As a counter-example from the same script, here's something using
> explicit names and grouping which isn't strictly needed, because I
> find it clearer:
>
> sub condense_size($) {
>     # condense a byte-count into K/M/G
>     my $size = $_[0];
>     if    ($size > $gigabyte) { $size = ($size / $gigabyte) . "G"; }
>     elsif ($size > $megabyte) { $size = ($size / $megabyte) . "M"; }
>     elsif ($size > $kilobyte) { $size = ($size / $kilobyte) . "K"; }
>     return $size;
> }

I tend to like this style too, though I'd use a slightly different
syntax.  It's otherwise exactly the same.

  my $size = shift;
  ($size > $gigabyte) && return (($size/$gigabyte) . "G");
  ($size > $megabyte) && return (($size/$megabyte) . "M");
  ($size > $kilobyte) && return (($size/$kilobyte) . "K");

Or, perhaps, if you wanted to be a little more cleverer:

  my %units = ($gigabyte => sub { int($_[0]/$gigabyte) . 'G'},
               $megabyte => sub { int($_[0]/$megabyte) . 'M'},
               $kilobyte => sub { int($_[0]/$kilobyte) . 'K'},
              );

  foreach my $base (sort {$b <=> $a } keys %units) {
    if ($size > $base) {
      print ($units{$base}->($size),"\n");
      last;
    }
  }

This last approach is both too clever by 1, but also, slightl

Re: Perl best practices

2007-09-13 Thread Ben Scott
On 13 Sep 2007 12:10:58 -0400, Kevin D. Clark <[EMAIL PROTECTED]> wrote:
> If I write anything else, it would just be a combination of me
> nit-picking for no purpose and hot air.

  Welcome to the Internet!   ;-)

-- Ben


Re: Perl best practices

2007-09-13 Thread Ben Scott
On 9/13/07, Paul Lussier <[EMAIL PROTECTED]> wrote:
> When writing code which will be used, looked at, modified, and
> maintained by a group of people, it is best to agree upon and strictly
> adhere to a common set of coding standards.

  Yes.  But when the designated group of people is "all of humanity",
the problem of agreeing on said standards becomes difficult.  :)

> I find both of these bothersome :)  *I* prefer:
>
> @foo = split (/blah/);

  Er, yes.  "blah" in this case was meta-syntactic, and I was still
thinking of the first example in this discussion, which had LTS
(Leaning Toothpick Syndrome).  I will use // if the regexp doesn't
suffer from LTS.  I use m{} or s{}{} when the regexp otherwise
contains slashes.

> @foo = split (/blah/);
> @foo = split (/blah/, $actualVariableName);
> The former implies $_, which need not be explicitly stated in this case.

  Right.  That's exactly what I was saying.  ;-)

> map { grep { ... $_ } $_ } @foo;
>
> Which $_ is which, and where is each getting its data from?

  Like I said, I use things like parentheses, braces, named variables,
etc., liberally when I find the meaning/intent is not obvious in
context.  But it's a case-by-case call, not an absolute, inviolable
rule.

  "A foolish consistency is the hobgoblin of little minds."  -- Ralph
Waldo Emerson

  As a completely non-contrived example, here is an illustration of
when I think implicit use of $_ is very appropriate.  It's from a
Squid log analysis tool I wrote, where I wanted to condense the MIME
content type into something smaller and more appropriate for a log
report.  Here's the code (as usual, view in a monospace font to get
this to line up properly):

sub condense_type($) {
    # condense a MIME content type to something shorter, for people
    $_ = $_[0];
    s{^text/plain$}           {text};
    s{^text/html$}            {html};
    s{^text/css$}             {css};
    s{^text/javascript$}      {jscript};
    s{^text/xml$}             {xml};
    s{^text/}                 {};
    s{^image/.*}              {image};
    s{^video/.*}              {video};
    s{^audio/.*}              {audio};
    s{^multipart/byteranges}  {bytes};
    s{^application/}          {};
    s{^octet-stream$}         {binary};
    s{^x-javascript$}         {jscript};
    s{^x-shockwave-flash$}    {flash};
    s{\*/\*}                  {stars};    # some content gets marked */*
    return $_;
}

  I could have used a regular named variable (say, $type) and repeated
"$type =~" over and over again for 14 lines.  I believe that would
actually harm the readability of the code.  I find it clearer with use
of implicit $_, because it puts the focus on the fact that I'm doing a
bunch of transformations on the same thing, over and over again.
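
  For what it's worth, one way to get the same "no repeated $type =~"
look while still using a named variable is a single-element for loop,
which aliases $_ to the variable inside the block only.  This is a
sketch with an abbreviated rule list and a hypothetical sub name, not
the code from the script.

```perl
use strict;
use warnings;

sub condense_type_named {
    my ($type) = @_;
    # "for" aliases $_ to $type for the duration of the block only.
    for ($type) {
        s{^text/plain$}    {text};
        s{^text/}          {};
        s{^image/.*}       {image};
        s{^application/}   {};
        s{^octet-stream$}  {binary};
    }
    return $type;
}

for my $example ('text/plain', 'image/png', 'application/octet-stream') {
    print condense_type_named($example), "\n";
}
```

  The substitutions read the same as before, and the caller's $_ is
left alone.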

  As a counter-example from the same script, here's something using
explicit names and grouping which isn't strictly needed, because I
find it clearer:

sub condense_size($) {
    # condense a byte-count into K/M/G
    my $size = $_[0];
    if    ($size > $gigabyte) { $size = ($size / $gigabyte) . "G"; }
    elsif ($size > $megabyte) { $size = ($size / $megabyte) . "M"; }
    elsif ($size > $kilobyte) { $size = ($size / $kilobyte) . "K"; }
    return $size;
}
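
  A self-contained sketch of the same idea, for anyone who wants to run
it.  The values of $kilobyte, $megabyte and $gigabyte are assumptions
(powers of 1024), since their definitions are not shown here, and the
sprintf() rounding is an addition to keep the output short.

```perl
use strict;
use warnings;

# Assumed values; the definitions are not shown in the post.
my $kilobyte = 1024;
my $megabyte = 1024 * $kilobyte;
my $gigabyte = 1024 * $megabyte;

sub condense_size {
    # condense a byte-count into K/M/G, rounded to one decimal place
    my ($size) = @_;
    if    ($size > $gigabyte) { return sprintf("%.1fG", $size / $gigabyte); }
    elsif ($size > $megabyte) { return sprintf("%.1fM", $size / $megabyte); }
    elsif ($size > $kilobyte) { return sprintf("%.1fK", $size / $kilobyte); }
    return $size;
}

for my $bytes (512, 1536, 5 * $megabyte) {
    print condense_size($bytes), "\n";
}
```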

>> Explicitly specifying $_ over and over again just clutters up the
>> code with pointless syntax.  It's one more thing my brain has to
>> recognize and process.
>
> Right, which is why you shouldn't depend upon $_ in these contexts and
> explicitly state a variable name ...

  A named variable would be *two* more things.  ;-)

>   s/(^\s*|\s*$)//g; # trim leading/trailing whitespace

  Er, yah, that would be even better.  Not sure why I didn't just use
s/// with /g when I wrote that the first time around.

  (The actual script in my ~/bin/ has several lines of comments
explaining certain design decisions, but that's not one of them.)

>   my $file = shift;

  You're using an implicit argument to shift there.  ;-)

> The compelling argument is this: It should be blatantly obvious to
> whomever is going to be maintaining your code in 6 months what you
> were thinking

  I do not think I could agree with you more here.  The thing you seem
to be ignoring in my argument is that "clarity" is subjective and
often depends on context.  :)

>> I write Perl programs with the assumption that the reader
>> understands Perl ...
>
> ... even those who *claim* to know the language, often times are
> just fooling themselves.

  I'm not going to penalize the competent because there are others who
are incompetent.

>> Many say similar things about Unix.  Or Emacs.  :-) I don't argue
>> that one approach is right and the other wrong, but I do think that
>> both approaches have their merits.
>
> Which approaches are you talking about?  Approaches to learning, or to
> writing?

  Yes.  :)

  Let me restate: A pattern which is powerful and easy-to-use is
sometimes unavoidably non-obvious.

  Or perhaps an example of a similar principle in a different context:
When invoking tar from a shell script, which of the following do you
prefer?

tar --create --gzip --verbose -

Re: Perl best practices

2007-09-13 Thread Kevin D. Clark

Paul Lussier writes:

>  (/me waiting for Kevin to pipe in here in 4...3...2...1... ;)

Ben and Paul are competent Perl programmers.  They write good code.
Code should be written to be clear.  While it is nice if code written
in a given language is understandable by people who don't know the
language, this property isn't guaranteed.  Cryptic one-liners can be
hard to follow, but they can also be beautiful and useful.

If I write anything else, it would just be a combination of me
nit-picking for no purpose and hot air.

Kind regards,

--kevin
-- 
GnuPG ID: B280F24E It is best to forget the great sky
alumni.unh.edu!kdc And to retire from every wind
 -- Mumon


Re: Perl best practices

2007-09-13 Thread Paul Lussier
"Ben Scott" <[EMAIL PROTECTED]> writes:

>   Personally, in the proper context, I find this:

When writing code which will be used, looked at, modified, and
maintained by no one else, doing whatever makes you happy makes
sense, and is more efficient and expedient.

When writing code which will be used, looked at, modified, and
maintained by a group of people, it is best to agree upon and strictly
adhere to a common set of coding standards.  This makes the entire
group more efficient.  The personal likes and/or dislikes of any one
person may or may not bother anyone else in the group.


>   @foo = split m{blah};
>
> to be easier to read and comprehend at a glance than this:
>
>   @foo = split (m{blah}, $_);


I find both of these bothersome :)  *I* prefer:

@foo = split (/blah/);
or: @foo = split (/blah/, $actualVariableName);

The former implies $_, which need not be explicitly stated in this case.
The latter clearly denotes where you're getting your data from.
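
A small sketch of the two forms side by side.  The input lines and the
":" separator are made up for illustration.

```perl
use strict;
use warnings;

my @lines = ("a:b:c", "d:e");   # hypothetical input
my @implicit;
my @explicit;

for (@lines) {
    push @implicit, [ split(/:/) ];         # no second argument: splits $_
}
for my $line (@lines) {
    push @explicit, [ split(/:/, $line) ];  # splits the named variable
}

print scalar(@{ $implicit[0] }), " ", scalar(@{ $explicit[0] }), "\n";
```

Both loops should collect the same fields; the second one just says
where they came from.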

$_ has a *lot* of magical properties which can really screw things up,
especially in cases like:

   map { grep { ... $_ } $_ } @foo;

Which $_ is which, and where is each getting its data from?  This is
where I find named variables to be better than just depending upon
built-ins like $_.
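
To illustrate, here is a sketch with made-up data that keeps the words
longer than four characters from a list of lines, first with nested $_
and then with named variables.

```perl
use strict;
use warnings;

my @lines = ("alpha beta", "gamma", "delta epsilon zeta");  # hypothetical data

# Nested form: the $_ passed to split is a line (set by map), while
# the $_ inside the grep block is a word.
my @nested = map { grep { length($_) > 4 } split(' ', $_) } @lines;

# Named form: same result, but each level has its own name.
my @named;
foreach my $line (@lines) {
    foreach my $word (split(' ', $line)) {
        push @named, $word if length($word) > 4;
    }
}

print "@nested\n@named\n";
```

The two should print the same list.  The nested form is shorter, but
the reader has to work out which $_ belongs to which level.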

> Explicitly specifying $_ over and over again just clutters up the
> code with pointless syntax.  It's one more thing my brain has to
> recognize and process.

Right, which is why you shouldn't depend upon $_ in these contexts and
explicitly state a variable name (which should also be my'ed into the
proper scope :)

>   I don't arbitrarily assign to $_ and use it at random, the way some
> people do.  And I do make use of parentheses, braces, and such, even
> when they are not needed, when I find it makes the code clearer.  But
> I also leave them out when I find it makes the code clearer.

No arguments with that.  In general, IMO, clarity is of the utmost
importance.  There are many "best practices" which can help aid
clarity though.  I find one such practice is to always use func(args)
because it makes it blatantly obvious you're calling a function
(perhaps the one exception is with print, but even then, I find myself
very often using it there too).  To me:

print (join(" ", "Some text", func(args), "more text",),
       "\n"
      );

is far more readable than
print join " ", "Some text", func(args), "more text", "\n";

In the former, if I need to add "stuff" to the join, it's blatantly
obvious where it goes.  In the latter, it is not.
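
A runnable sketch of the two styles.  func() here is a hypothetical
stand-in, since the real one isn't shown, and the strings are built
first so the two results can be compared.

```perl
use strict;
use warnings;

# Hypothetical stand-in for func(args).
sub func { return "[" . join(",", @_) . "]" }

# Parenthesized: the argument list of join() is visibly bounded.
my $with_parens = join(" ", "Some text", func(1, 2), "more text");

# Unparenthesized: join takes everything to the end of the statement.
my $without_parens = join " ", "Some text", func(1, 2), "more text";

print($with_parens, "\n");
print $without_parens, "\n";
```

Both lines should come out identical.  The difference only shows up
when someone edits the statement later and has to decide which function
a new argument belongs to.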

>   For a slightly less contrived example, take a script which trims
> leading and trailing whitespace from each line in an input file.  I
> already have one implementation, and I just wrote up another one.
[...]
>   Assuming the reader is familiar with the language, which do you
> think will be easier/quicker to comprehend?

Both of these hurt my eyes! :)

This one is short and sweet:

> #!/usr/bin/perl -wp
> s/^[\x20\t]*//; # trim leading space
> s/[\x20\t]*$//; # trim trailing space

but I'd rewrite it as:

  #!/usr/bin/perl -p

  s/(^\s*|\s*$)//g; # trim leading/trailing whitespace


For a script which optionally took stdin, I'd write it as:

  #!/usr/bin/perl -w

  use English;

  my $file = shift;
  my $FH;

  if (!$file) {
    *FH = *STDIN;
  } else {
    open(FH, "$file") || die("Could not open $file: $ERRNO\n");
  }

  while (my $line = <FH>) {
    $line =~ s/(^\s*|\s*$)//g; # trim leading/trailing whitespace
    print("$line\n");
  }
  close(FH);


> It may be true that someone who *isn't* familiar with Perl would
> find it easier to puzzle out the meaning of the longer version.

I'm fairly comfortable with perl.  I could puzzle out the meaning
fairly easily.  And I'll even concede that as far as most perl, it's
pretty good.  But, as is true I'm sure even with my own code, there's
always room for improvement :)

 (/me waiting for Kevin to pipe in here in 4...3...2...1... ;)

> But I don't find that a particularly compelling argument.

The compelling argument is this: It should be blatantly obvious to
whomever is going to be maintaining your code in 6 months what you
were thinking :) The easier you make it up front to read your code and
discern your mindset, the less time it takes the maintainer in 6
months.  Many times, that future maintainer is *you* :)

> I write Perl programs with the assumption that the reader
> understands Perl, the same way I am assuming readers of this message
> understand English.  :)

Ahh, yes.  But as the superintendent of the Lawrence, MA, School
system has recently shown, even those who *claim* to know the
language, often times are just fooling themselves.  Just "axe" him :)

> This may mean Perl, as practiced, is harder to learn than a language
> which is more rigid and always verbose.

I think perl is incredibly easy to learn if you learn from a good
source.  The documentation is one such source.  Other people's code is
mo

Re: Perl best practices (was: question ... Split operator in Perl)

2007-09-13 Thread Ben Scott
On 9/13/07, John Abreau <[EMAIL PROTECTED]> wrote:
>> s/^[\x20\t]*//; # trim leading space
>> s/[\x20\t]*$//; # trim trailing space
>
> Any particular reason to use [\x20\t] instead of \s ?

  \s would also eat newlines and similar.  At a minimum, it would have
to explicitly print with "\n" and use the -n switch instead of the -p
switch.  Which would be fine.  But if the file contains non-native
line endings, it can result in those getting mangled, or so I've
found.  I've got a lot of such files hanging around on my system.
Just eating space and tab worked better for me.

  OTOH, \s should eat other kinds of in-line whitespace that might be
encountered, including anything Unicode dishes up.  So that might be
better for some situations.
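
  A quick sketch of the difference, using a made-up input line.  The
variable names are only for illustration.

```perl
use strict;
use warnings;

my $line = "  hello world \t\n";   # hypothetical input line

# Space-and-tab version: the newline survives.
(my $kept = $line) =~ s/^[\x20\t]*//;
$kept =~ s/[\x20\t]*$//;

# \s version: the trailing newline counts as whitespace, so it goes too.
(my $eaten = $line) =~ s/(^\s*|\s*$)//g;

printf "space/tab keeps newline: %s\n", ($kept  =~ /\n\z/ ? "yes" : "no");
printf "\\s keeps newline:        %s\n", ($eaten =~ /\n\z/ ? "yes" : "no");
```

  With the \s version, the script has to put the line ending back
itself, which is where the choice of "\n" can bite on files with other
line endings.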

  YMMV.  Or, since this is Perl we're talking about: TIMTOWTDI.  ;-)

-- Ben


Re: Perl best practices (was: question ... Split operator in Perl)

2007-09-12 Thread John Abreau

On Wed, September 12, 2007 9:37 pm, Ben Scott said:

> s/^[\x20\t]*//; # trim leading space
> s/[\x20\t]*$//; # trim trailing space
>

Any particular reason to use [\x20\t] instead of \s ?


-- 
John Abreau / Executive Director, Boston Linux & Unix
IM: [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED]
Email [EMAIL PROTECTED] / WWW http://www.abreau.net / PGP-Key-ID 0xD5C7B5D9
PGP-Key-Fingerprint 72 FB 39 4F 3C 3B D6 5B E0 C8 5A 6E F1 2C BE 99




Perl best practices (was: question ... Split operator in Perl)

2007-09-12 Thread Ben Scott
On 9/12/07, Paul Lussier <[EMAIL PROTECTED]> wrote:
>>   You don't need to put parentheses around arguments to split, and you
>> don't need to explicitly specify the default pattern match target
>> ($_).
>
> Unfortunately, you both "don't *need* to" and "*can* do" anything in
> perl.  Often at the same time!  This is what leads to very difficult
> to read, and more difficult to maintain perl code.

  Well, this is something of a religious issue.  :)  But here's my opinion:

  Personally, in the proper context, I find this:

@foo = split m{blah};

to be easier to read and comprehend at a glance than this:

@foo = split (m{blah}, $_);

  The "proper context" being a loop or other pattern where I'm running
a series of operations on a series of inputs (i.e., lines).  In those
contexts, where the whole point is to perform a series of operations
on each input, assembly-line fashion, using the implicit argument
lets me focus on what the code is actually *doing*.  Explicitly
specifying $_ over and over again just clutters up the code with
pointless syntax.  It's one more thing my brain has to recognize and
process.

  I don't arbitrarily assign to $_ and use it at random, the way some
people do.  And I do make use of parentheses, braces, and such, even
when they are not needed, when I find it makes the code clearer.  But
I also leave them out when I find it makes the code clearer.

  For a slightly less contrived example, take a script which trims
leading and trailing whitespace from each line in an input file.  I
already have one implementation, and I just wrote up another one.
Here's one of the possible implementations:

#!/usr/bin/perl
$filename = $ARGV[0];
if ($filename eq "") {
    open(INPUTFILE, "<&STDIN");
}
else {
    open(INPUTFILE, "< $filename") or die("could not open input file!");
}
while(not(eof(INPUTFILE))) {
    $line = <INPUTFILE>;
    $line =~ s/^[\x20\t]*//; # trim leading space
    $line =~ s/[\x20\t]*$//; # trim trailing space
    print($line);
}

  And here is another possibility:

#!/usr/bin/perl -p
s/^[\x20\t]*//; # trim leading space
s/[\x20\t]*$//; # trim trailing space

  Assuming the reader is familiar with the language, which do you
think will be easier/quicker to comprehend?

  It may be true that someone who *isn't* familiar with Perl would
find it easier to puzzle out the meaning of the longer version.  But I
don't find that a particularly compelling argument.  I write Perl
programs with the assumption that the reader understands Perl, the
same way I am assuming readers of this message understand English.  :)
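
  For readers who don't know the -p switch: perlrun describes it as
wrapping the script in a loop that reads each line into $_ and prints
$_ afterwards.  Here is a rough sketch of that loop written out by
hand.  The trim_stream() name and the in-memory filehandles are my own
scaffolding so the example can run without an input file.

```perl
use strict;
use warnings;

# Hypothetical helper: what the -p version does, spelled out.
sub trim_stream {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        $line =~ s/^[\x20\t]*//;    # trim leading space
        $line =~ s/[\x20\t]*$//;    # trim trailing space
        print {$out} $line;
    }
    return;
}

# In-memory filehandles stand in for the input file and STDOUT.
my $input  = "  one \n\ttwo\t\n";
my $output = '';
open(my $in,  '<', \$input)  or die "cannot read from string: $!";
open(my $out, '>', \$output) or die "cannot write to string: $!";
trim_stream($in, $out);
close($out);

print $output;
```

  Everything except the two s/// lines is plumbing, which is the part
the -p switch lets you leave out.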

  This may mean Perl, as practiced, is harder to learn than a language
which is more rigid and always verbose.  Many say similar things about
Unix.  Or Emacs.  :-)  I don't argue that one approach is right and
the other wrong, but I do think that both approaches have their
merits.

  YMMV.

-- Ben