Re: Type distinguishing

2002-05-23 Thread Piers Cawley

Ovid <[EMAIL PROTECTED]> writes:

> --- Barry Jones <[EMAIL PROTECTED]> wrote:
>> If I have a hash full of values, and some of those values point to
>> arrays of more values...in a loop, how could I distinguish which ones
>> pointed to an array and which were just string values?
>
> Barry,
>
> Use the 'ref' function for this:
>
> perldoc -f ref
>
> One caveat, though: most uses of 'ref' seem wrong.  Generally
> speaking, whenever I am tempted to use ref on vanilla code, I
> discover that I could simplify what I meant to do.  In your case,
> have you considered making every value an array reference?  If you
> only have a scalar, you have a one-element array ref.  Generally,
> when I rework the code to eliminate the need for 'ref', I discover
> that my code is shorter and easier to maintain.
>
> I'm not saying that you've done anything wrong, of course.  I'm just
> suggesting that you might want to give the code a second look.

Depending on what the hash is being used for and whether the structure
is arbitrary, this may be a good time to pull out the old OO toolkit,
specifically the 'Composite' and 'Visitor' patterns.

The idea here is that you have a treelike data structure made up of
containers and 'terminals'. You could then set up some classes like so:


package Container;

sub accept {
    my $self = shift;
    my($visitor) = @_;

    $visitor->visit_container($self);
    foreach my $element ($self->contents) {
        $element->accept($visitor);
    }
    $visitor->leave_container($self);
}

sub contents {
    my $self = shift;
    wantarray ? @{$self->{contents}} : $self->{contents};
}

sub name {
    my $self = shift;
    $self->{name};
}


package Terminal;

sub accept {
    my $self = shift;
    my($visitor) = @_;

    $visitor->visit_terminal($self);
}

sub name {
    my $self = shift;
    $self->{name};
}

So, suppose we use this data structure to represent a filesystem and
we want to list all the files in it. We write ourselves a visitor:

package FileLister;

sub visit_container {
    my $self = shift;
    my($directory) = @_;

    $self->push_path($directory->name);
}

sub visit_terminal {
    my $self = shift;
    my($file) = @_;

    # One file per line.
    print join('/', $self->path, $file->name), "\n";
}

sub leave_container {
    my $self = shift;
    my($directory) = @_;

    ($directory->name eq $self->pop_path) or
        die "Something very strange happened";
}

NB: This is very skeletal, there's no denying. Implementing
constructors, other accessor methods (including FileLister's path,
push_path and pop_path) and all the other paraphernalia of a fully
functional class is left as an exercise for the interested reader.

This is a very powerful approach; you'll come across it in all manner
of places. There are all sorts of wrinkles you can add. For instance,
you might extend the interface to allow the various visit_* methods to
prune the tree walk by cunning use of return values or exceptions.

There's no denying that it carries a bunch of overhead with it though,
and it's not suited to everything; but it's a useful tool to know
about.
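
For anyone who wants to try the exercise, here is one way the skeleton
might be fleshed out. It is only a sketch: the constructors, the
'path'/'push_path'/'pop_path' helpers and the 'found' accessor are my
assumptions, not part of the classes above, and this lister collects
the paths in an array instead of printing them from inside the visitor.

```perl
use strict;
use warnings;

package Container;

# Hypothetical constructor: a name plus any number of child nodes.
sub new {
    my ($class, $name, @contents) = @_;
    return bless { name => $name, contents => [@contents] }, $class;
}

sub name     { $_[0]{name} }
sub contents { @{ $_[0]{contents} } }

sub accept {
    my ($self, $visitor) = @_;
    $visitor->visit_container($self);
    $_->accept($visitor) for $self->contents;
    $visitor->leave_container($self);
}

package Terminal;

sub new {
    my ($class, $name) = @_;
    return bless { name => $name }, $class;
}

sub name { $_[0]{name} }

sub accept {
    my ($self, $visitor) = @_;
    $visitor->visit_terminal($self);
}

package FileLister;

# Hypothetical state: the current directory stack and the paths seen so far.
sub new { return bless { path => [], found => [] }, shift }

sub path      { @{ $_[0]{path} } }
sub found     { @{ $_[0]{found} } }
sub push_path { my $self = shift; push @{ $self->{path} }, @_ }
sub pop_path  { pop @{ $_[0]{path} } }

sub visit_container {
    my ($self, $directory) = @_;
    $self->push_path($directory->name);
}

sub visit_terminal {
    my ($self, $file) = @_;
    push @{ $self->{found} }, join '/', $self->path, $file->name;
}

sub leave_container {
    my ($self, $directory) = @_;
    $directory->name eq $self->pop_path
        or die "Something very strange happened";
}

package main;

# A made-up two-level tree for illustration.
my $tree = Container->new('home',
    Terminal->new('notes.txt'),
    Container->new('src', Terminal->new('main.pl')),
);

my $lister = FileLister->new;
$tree->accept($lister);
print "$_\n" for $lister->found;
```

Note that the data structure never has to be asked what kind of node
it is; each node knows how to introduce itself to the visitor, which
is exactly what removes the need for 'ref'.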


-- 
Piers

   "It is a truth universally acknowledged that a language in
possession of a rich syntax must be in need of a rewrite."
 -- Jane Austen?


-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




Re: regexp

2001-11-15 Thread Piers Cawley

birgit kellner <[EMAIL PROTECTED]> writes:

> Many thanks to "Wagner-David" for the code. I've slightly changed it,
> as below, but have still further questions.

[...]

> ... checking on an initial digit is not specific enough, because in
> between a heading "$counter" and the next subheading "$counter.1",
> there might be a line beginning with "$counter" = a list entry. I
> really need to do a loop moving up a counter, *and* I need the
> search to be "position-aware".
> 
> I guess my question boils down to: How to code "when there's a match
> of "1\.\s" in the beginning of a line, check whether there's a
> "1\.1\.\s" after the end of that line - if so, assign everything in
> between the end of that line and "1\.1\.\s." as a hash value, and
> move up the counter to 2; if not, move up the counter to 2 and
> search for a line beginning with "2\.\s" *after* "1\.1\.\s". "

Let's do some exploration. I'm going to ignore the requirement for
grabbing the text in a section for now; let's deal with the hard bit.
First we'll need a '@counter' array to hold the current values of our
section numbers. Since 0 isn't a valid section number we'll
initialize it like so:

my @counter = (0);

This is our state variable.

Now, let's say we have a 'candidate' string that might be a valid
section number. We found it by matching:

my($candidate) = /((?:(?:^|\.)\d+)+)/;

In general we have several possible legal strings that could match.
Let's say the last section number we saw was 1.1; then our possibles
are 1.2, 1.1.1, or 2. So let's build our possible patterns:

my $section_incremented_pattern = 
   '^' . join("\\.",
  @counter[0..($#counter - 1)], $counter[-1] + 1) . '$';

my $new_section_pattern = 
   '^' . join("\\.", @counter, 1) . '$';

These two are easy, all we have to do with these is:

if ($candidate =~ $section_incremented_pattern) {
    $counter[-1]++; # Increment the last counter in @counter;
}
elsif ($candidate =~ $new_section_pattern) {
    push @counter, 1; # Add a new section counter.
}
else {
    ...
}

So, the trick here is working out what to replace the ... with. Which
is left as an exercise to the interested reader. 

Yeah, right.

Okay, this time we match $candidate against the current counter.
First we'll split $candidate.

my @possible_counter = split /\./, $candidate;

Okay, now, if $candidate is legal then @possible_counter will be
shorter than @counter (the same-length and one-level-deeper cases
were handled by the two patterns above):

return unless @possible_counter < @counter;

And every element of @possible_counter will be the same as the
corresponding element of @counter, except for its last element, which
will be one more than the equivalent element of @counter. So we
decrement the last element and compare:

$possible_counter[-1]--;

for my $i (0..$#possible_counter) {
    $possible_counter[$i] == $counter[$i] or return;
}

If we reach this point we have a legal string, so we restore the last
element of @possible_counter and replace @counter with it.

$possible_counter[-1]++;
@counter = @possible_counter;

return "Success!";

Now all you have to do is put all that together. And my train's
arriving at the station, so I'll leave that up to you. Hopefully the
code snippets above should point you in the right direction.
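
For what it's worth, here is one way the pieces might be put together.
It is a sketch, not the only possible assembly: the function name
'next_section' and the sample lines are made up, and instead of
building two patterns plus a fallback it folds all three cases
(incremented, new subsection, back up a level) into a single check.
It uses the '//' operator, so it assumes perl 5.10 or later.

```perl
use strict;
use warnings;

my @counter = (0);   # state: the last section number we accepted

# Hypothetical helper: returns true and updates @counter if $candidate
# is a legal successor of the current section number, false otherwise.
sub next_section {
    my ($candidate) = @_;
    my @possible = split /\./, $candidate;

    # A legal number is at most one level deeper than the current one.
    return unless @possible && @possible <= @counter + 1;

    # Everything before the last element must match the current counter.
    for my $i (0 .. $#possible - 1) {
        return unless $possible[$i] == $counter[$i];
    }

    # The last element must be one more than the counter at that depth
    # (a depth we have not reached yet counts as 0).
    my $previous = $counter[$#possible] // 0;
    return unless $possible[-1] == $previous + 1;

    @counter = @possible;
    return 1;
}

my @accepted;
for my $line ('1. Intro', '1.1. Scope', '1. a list entry',
              '1.2. Terms', '2. Method') {
    my ($candidate) = $line =~ /((?:(?:^|\.)\d+)+)/ or next;
    push @accepted, $candidate if next_section($candidate);
}

print "$_\n" for @accepted;
```

The list entry beginning with "1." is rejected because, once we are at
1.1, the only legal top-level number is 2.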

-- 
Piers

   "It is a truth universally acknowledged that a language in
possession of a rich syntax must be in need of a rewrite."
 -- Jane Austen?





Re: regex help

2001-11-15 Thread Piers Cawley

"Brett W. McCoy" <[EMAIL PROTECTED]> writes:

> On Tue, 13 Nov 2001, A. Rivera wrote:
> 
> 
>> I need help find the most effecient way to do this..
>>
>> I have a variable...
>> $data="this is a test";
>>
>> What is the quickest way to get $data to equal just the first two words of
>> the original variable
> 
> This splits the string up, extracts the first two words, and joins them
> again and re-assigns to $data:
> 
> $data = join(" ", (split(/\s/, $data))[0..1]);

You may find that

  $data = join('', (split /(\s+)/, $data)[0..2]);

is a little more tolerant of multiple white space than Brett's answer.
Note the trick we use with split to capture the 'actual' whitespace
used to separate the words by putting the split pattern in brackets.

You could also make the change by doing:

  $data =~ s/((?:(?:^|\s+)\S+){2}).*/$1/;

Which has the advantage that you can easily change the number of words
you match simply by changing the value in the braces. And if you want
to catch at most 2 words, you'd have {0,2} in there...

I'm not sure which is the fastest; I've not benchmarked it, but it's
generally more important to worry about which is the *clearest*.
Programmer time is far more valuable than processor time.

So if you are sure that your data will never contain more than one
space between words, go with Brett's solution. If it might have more
than one space between words and you don't mind replacing them with a
single space, go with Brett's solution but replace \s with \s+ in the
split pattern.
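
To make the differences concrete, here is a small sketch that runs the
variants discussed so far on the same input. The sample string (with a
deliberate double space) and the variable names are made up for
illustration.

```perl
use strict;
use warnings;

my $data = "this  is a test";    # note the two spaces after "this"

# Brett's version: a single \s splits the double space into an empty field.
my $single = join ' ', (split /\s/, $data)[0..1];

# Brett's version with \s+: runs of whitespace collapse to one space.
my $collapsed = join ' ', (split /\s+/, $data)[0..1];

# Capturing split: the separators come back too, so spacing is preserved.
my $preserved = join '', (split /(\s+)/, $data)[0..2];

# Substitution: same result as the capturing split.
(my $substituted = $data) =~ s/((?:(?:^|\s+)\S+){2}).*/$1/;

print "[$_]\n" for $single, $collapsed, $preserved, $substituted;
```

The first line of output shows "this" followed by a stray space and no
second word, which is the failure the other variants avoid.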

If you want to be flexible about data, then go with my solution, but
wrap it in a function like so:

use Carp;   # provides croak

sub truncate_to_n_words {
    my($string, @bounds) = @_;
    croak "Too many bounds" unless scalar @bounds <= 2;
    croak "Not enough bounds" unless scalar @bounds;
    local $" = ','; # makes "@bounds" separate terms with a comma
    $string =~ s{((?:              # replace
                    (?:^|\s+)      # line start or a run of whitespace
                    \S+            # followed by some non-white chars
                  )                # Match this group
                  {@bounds}        # between min and max times
                 )                 # And remember it.
                 .*                # Catch the remaining chars.
                }{$1}x;            # and throw them away.

    $_[0] = $string;  # Modify the original string in place.
}

sub truncate_to_2_words {
    truncate_to_n_words($_[0], 2);
}

The idea being that, yes, the regular expression is ugly, but that
ugliness is hidden away behind a well named function. The code where
you need the behaviour will then look like:

truncate_to_2_words($string);

Which is substantially clearer than any of the one line solutions.

Of course, it's slower to run and took longer to write, but every time
you revisit code that makes use of it you'll not have to work out
what's going on.
   
-- 
Piers

   "It is a truth universally acknowledged that a language in
possession of a rich syntax must be in need of a rewrite."
 -- Jane Austen?





Re: problem with 'use strict'

2001-10-29 Thread Piers Cawley

"Brett W. McCoy" <[EMAIL PROTECTED]> writes:

> On Sun, 28 Oct 2001, Piers Cawley wrote:
> 
>> > strict is very picky... but it's a good thing to use because it
>> > enforces good, clean programming practices. In Perl6, strict will be
>> > on by default, so it has been written.
>>
>> It has? Where? And by whom?
> 
> I believe it was in one of the various epistles from Larry Wall
> regarding Perl 6 development, from www.perl.com

Okay, so I went back and reread the relevant section of Apocalypse 1
http://dev.perl.org/perl6/apocalypse/1 (Where Larry discusses 'RFC
16') and what do I see:

   ... whether Perl 6 main programs should default to strict or not (I
   think not)

So it seems that Larry is minded to keep the current situation where
strict doesn't get turned on unless you want it. And I don't remember
him saying anything different in any of the later Apocalypses, which
I've been following rather closely.

-- 
Piers

   "It is a truth universally acknowledged that a language in
possession of a rich syntax must be in need of a rewrite."
 -- Jane Austen?





Re: problem with 'use strict'

2001-10-29 Thread Piers Cawley

"Brett W. McCoy" <[EMAIL PROTECTED]> writes:

> On Fri, 26 Oct 2001, David Gilden wrote:
> 
>> > Sorry, I meant that to say "And it runs without 'use strict'?
>>
>> Yes the code works fine, untill I try to use strict
> 
> strict is very picky... but it's a good thing to use because it
> enforces good, clean programming practices. In Perl6, strict will be
> on by default, so it has been written.

It has? Where? And by whom?

-- 
Piers

   "It is a truth universally acknowledged that a language in
possession of a rich syntax must be in need of a rewrite."
 -- Jane Austen?





Re: how many items in a hash?

2001-08-15 Thread Piers Cawley

Bob Showalter <[EMAIL PROTECTED]> writes:
> Or, from perldoc perldata:
> 
>   If you evaluate a hash in scalar context, it returns false if the hash
>   is empty.  If there are any key/value pairs, it returns true; more
>   precisely, the value returned is a string consisting of the number of
>   used buckets and the number of allocated buckets, separated by a
>   slash.
> 
> print "Empty!\n" unless %myhash;

Note that the number of used buckets isn't (necessarily) the same as
the number of items in the hash.
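
If what you actually want is the number of items, 'keys' in scalar
context gives it directly. A minimal sketch (the hash contents are
made up for illustration):

```perl
use strict;
use warnings;

my %myhash = (apple => 1, banana => 2, cherry => 3);

# keys in scalar context returns the number of key/value pairs.
my $count = keys %myhash;
print "The hash holds $count items\n";

# The plain truth test is still fine for "is it empty?".
print "Empty!\n" unless %myhash;
```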

-- 
Piers Cawley
www.iterative-software.com






Re: HTTP::Date???

2001-06-01 Thread Piers Cawley

Please learn to quote properly...

"Dan Egli" <[EMAIL PROTECTED]> writes:

> It only seems to happen on certain modules. I'm suspecting it's more a
> damaged perl install. this is a RH 7.1 box and I downloaded/installed perl
> 5.6.1 on it to replace the 5.6.0 that it came with. I'm re-running the make
> install from the perl build dir (Thankfully I saved it) and when it's done
> will try again.
> 
> It's odd tho. I type:
> 
> perl -MCPAN -e 'install GD' and it works fine. Only with thinks like
> HTTP::Date and Data::Dumper does it work. I'm not sure whats up.

Try doing 

perl -MCPAN -e 'install "HTTP::Date"'

I think it's a bareword thing.

-- 
Piers Cawley
www.iterative-software.com




Re: MAIL::POP3

2001-06-01 Thread Piers Cawley

Paul Johnson <[EMAIL PROTECTED]> writes:

> On Thu, May 31, 2001 at 08:58:40PM -0400, KeN ClarK wrote:
> > I haven't done much of anything yet. But of course figured out how
> > fetchmail can do this and send it to my user locally. So that is working.
> > BUT, if a perl script that is cron'd will use less resources, I'd prefer
> > that.
> 
> I would be extremely surprised if fetchmail wasn't far cheaper to run
> than a Perl solution.

Mail::Audit (a procmail replacement by Simon Cozens) comes with a script 
which you can run to turn it into a fetchmail replacement as well. The
idea is that you can just leave it running in daemon mode and it'll
deliver mail etc without having to fork off extra processes.

Assuming you've got your filtering scripts sorted out properly, this
will probably be the most lightweight solution going.

And it's Perl. And it's available from CPAN. Result.

-- 
Piers Cawley
www.iterative-software.com




Re: SPLIT QUESTION

2001-05-31 Thread Piers Cawley

Jeff Pinyan <[EMAIL PROTECTED]> writes:

> On May 31, Pedro A Reche Gallardo said:
> 
> >How can I split  a  string of caracters -any but blank spaces-   into
> >the individual caracters?
> 
> So you want to split "what's up, doc?" into
> 
>   @chars = qw( w h a t ' s u p , d o c ? );
> 
> That is, every character except spaces?
> 
> First, remove spaces from the string:
> 
>   $string =~ tr/\n\r\f\t //d;  # translate whitespace to nothing
> 
> Then, split the string into characters:
> 
>   @chars = split //, $string;

Or, you could do it in one step:

@chars = split /\s*/, $string;

Which splits on any number (including zero) of whitespace characters,
throwing away the characters that match.
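
A quick sketch of the one-step version, using Jeff's example string:

```perl
use strict;
use warnings;

my $string = "what's up, doc?";

# Split between every pair of characters, swallowing any whitespace
# that happens to sit at the split point.
my @chars = split /\s*/, $string;

print "@chars\n";   # prints the 13 non-space characters, space separated
```

One thing to watch for: if the string starts with whitespace, the
pattern matches with positive width at the very beginning and split
returns an empty first element, so you may want to trim the string
first.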

-- 
Piers Cawley
www.iterative-software.com




Re: test for real number

2001-05-31 Thread Piers Cawley

Paul <[EMAIL PROTECTED]> writes:
> BTW, this is another use of the same sort of trick as 
> 
>   { local $/ = undef;
> $file = ; # slurp the whole file into $file
>   }
> 
> which is more efficient than
> 
>while() { $file .= $_ }
> 
> or 
> 
>$file = join '', ;
> 
> Still, I have a coworker who swears by the latter as more work, yes,
> but MUCH more readable in his opinion.

You know, I think I agree with your cow orker. The intent of the last
version is obvious. It'd be interesting to see profiling data from a
typical app that slurped in a whole file to see if the method of
slurping made a significant difference in the running time.
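
In that spirit, here is a rough sketch of how one might measure it
with the standard Benchmark module. The temporary file, its size and
the sub names are all made up for the test; the numbers will vary with
file size, perl version and platform, so treat any result as a hint
rather than a verdict.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempfile);

# Build a throwaway test file (size and contents are arbitrary).
my ($out, $filename) = tempfile(UNLINK => 1);
print {$out} "line $_ of some sample text\n" for 1 .. 20_000;
close $out or die "close: $!";

# Undefine the input record separator and read everything at once.
sub slurp_undef {
    open my $fh, '<', $filename or die "open: $!";
    local $/ = undef;
    return scalar <$fh>;
}

# Read line by line and append.
sub slurp_while {
    open my $fh, '<', $filename or die "open: $!";
    my $file = '';
    while (my $line = <$fh>) { $file .= $line }
    return $file;
}

# Read all lines in list context and join them.
sub slurp_join {
    open my $fh, '<', $filename or die "open: $!";
    return join '', <$fh>;
}

# Run each variant for at least one CPU second and print a comparison.
cmpthese(-1, {
    undef_rs => \&slurp_undef,
    while    => \&slurp_while,
    join     => \&slurp_join,
});
```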

-- 
Piers Cawley
www.iterative-software.com