Re: [BPQ] help!! any idea whats wrong with this??

2001-04-27 Thread Timothy Kimball


: Ah, a Heisenbug.  There's a problem with your benchmarking:

Yep, you're right. map is slightly slower when it actually has
something to do.  I stand corrected... again.

So the moral of the story is: If you want your code
to run really fast, make it do nothing. ;)

That's what I love about Perl: been programming with
it for 8 years, & still learning.

-- tdk



Re: [BPQ] help!! any idea whats wrong with this??

2001-04-27 Thread Michael Lamertz

Timothy Kimball ([EMAIL PROTECTED]) wrote:
> 
> : Because someone (and with apologies to all, I don't recall off the top
> : of my head who) correctly pointed out to me earlier in this thread that
> : using map() here was inefficient. map() builds and returns an array, so
> : there's no point in using it in this void context. Aside from that,
> : both do the same thing. The postfix for is cleaner. =o)
> 
> I agree that the postfix is cleaner, but when I benchmark these, map
> looks faster- though several months ago, map was slower (IIRC). Maybe
> something changed in 5.6.0 to make map faster in a null context...?
> 
> Here's the script & output (Perl 5.6.0 on an Ultra 10):

Whoa!  It was me who noted that map should be avoided in void
context, but trying your benchmark, I got the same results as you.
However, I did some benchmarks of my own *before* stating that map
should not be used, and those gave completely different results.

I was suspicious, since your results are far too fast - a U10 is not
that big a machine, is it?  ;-)

...hack, hack, hack...

Ah, a Heisenbug.  There's a problem with your benchmarking:

-- snip --
my @lines = qw(
...
timethese(500_000,{
"1. map" => 'map { s/a// } @lines',
-- snip --

Inside 'timethese', @lines is unknown and thus empty.  Looks as if the
Benchmark module ignores '$@' after the eval, but I haven't checked for
that.  But fact is, you're running 500_000 loops on an empty list, and
map doesn't need to create any new list at all - well, at least it looks
as if map is *pretty* fast on empty lists.
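
A minimal sketch of the scoping issue described above. It is not
Benchmark's actual internals: run_string is a hypothetical stand-in for
any code that compiles a string in a place where the caller's "my"
variables are not in scope, and the variable names are made up for the
illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a module that evals a code string.
# It is defined before the lexical below exists, so the eval cannot see it.
sub run_string {
    my ($code) = @_;
    no strict 'vars';    # unknown names fall back to package variables
    my @result = eval "package main; $code";
    die $@ if $@;
    return scalar @result;
}

my  @lexical_lines = qw(one two three);   # lexical: invisible to run_string
our @global_lines  = qw(one two three);   # package variable: visible everywhere

print "lexical: ", run_string('map { $_ } @lexical_lines'), "\n";  # prints "lexical: 0"
print "global:  ", run_string('map { $_ } @global_lines'),  "\n";  # prints "global:  3"
```

The string that names the lexical quietly operates on an empty package
array instead, which is why the loop has nothing to do.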

There are 2 modifications required in your benchmark:

a. make the @lines array global, so it's visible inside 'timethese'

b. make sure the s/ doesn't truncate the string, so there's still
   some work to do after 10 test loops. (I did that by replacing s/
   with a switching tr...)

(
c. as a not really necessary addon I decided to create more random
   test data (yes, I *am* using nested maps there >:-> ).
)

Here's my version of your benchmark:

-- snip --
kanku-dai:~$ cat check.pl
#!/usr/bin/perl -w

use strict;
use Benchmark;
use vars qw{@lines};

my @chars=('A'..'Z', 'a'..'z', 0 .. 9, ' ');
@lines=map { join('', @chars[map { rand @chars } (0 .. 63)]) } (1 .. 10);

timethese(500_000,{
"1. map" => 'map { tr/abAB/baBA/ } @lines',
"2. foreach" => 'foreach ( @lines ) { tr/abAB/baBA/ }',
"3. for" => 'tr/abAB/baBA/ for @lines',
});
-- snip --
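
As an alternative to making @lines global, Benchmark also accepts code
references, and a closure can see lexical variables. The sketch below is
an assumption about how the same test could be written that way, not the
version that produced the numbers in this thread. The iteration count of
20_000 is arbitrary, and every variant pays the same extra cost of a sub
call.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Ten lines of 64 random characters, built much like the test data above.
my @chars = ('A' .. 'Z', 'a' .. 'z', 0 .. 9, ' ');
my @lines = map { join '', map { $chars[rand @chars] } 1 .. 64 } 1 .. 10;

# Each closure captures the lexical @lines, so nothing needs to be global.
my $results = timethese(20_000, {
    '1. map'     => sub { map { tr/abAB/baBA/ } @lines },
    '2. foreach' => sub { foreach (@lines) { tr/abAB/baBA/ } },
    '3. for'     => sub { tr/abAB/baBA/ for @lines },
});
```

timethese returns a reference to a hash of Benchmark objects keyed by
test name, in case the timings need further processing.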

Here are the new results:

-- new --
kanku-dai:~$ perl check.pl
Benchmark: timing 500000 iterations of 1. map, 2. foreach, 3. for...
1. map:  7 wallclock secs ( 6.95 usr +  0.01 sys =  6.96 CPU) @ 71839.08/s (n=500000)
2. foreach:  9 wallclock secs ( 8.38 usr +  0.01 sys =  8.39 CPU) @ 59594.76/s (n=500000)
3. for:  9 wallclock secs ( 8.13 usr +  0.01 sys =  8.14 CPU) @ 61425.06/s (n=500000)
kanku-dai:~$ 
-- new --

So, it looks as if you're right, map *IS* a bit faster on small data
sets, but not by the margin that your benchmark suggested.
Increasing the amount of data makes that difference go away, however.
Here's the data for 500 benchmark loops over 10_000 lines of data:

-- mod_new --
kanku-dai:~$ perl check.pl
Benchmark: timing 500 iterations of 1. map, 2. foreach, 3. for...
1. map: 10 wallclock secs ( 9.71 usr +  0.04 sys =  9.75 CPU) @ 51.28/s (n=500)
2. foreach:  9 wallclock secs ( 9.58 usr +  0.01 sys =  9.59 CPU) @ 52.14/s (n=500)
3. for:  9 wallclock secs ( 9.44 usr +  0.02 sys =  9.46 CPU) @ 52.85/s (n=500)
kanku-dai:~$ 
-- mod_new --

Conclusion:  The perlfaq6 information seems outdated, so the only
argument against map is the question of style, readability and personal
taste - naturally, I stick with my style ;-)

Mike

-- 
 If we fail, we will lose the war.

Michael Lamertz  | [EMAIL PROTECTED] / [EMAIL PROTECTED]
Nordstr. 49  | http://www.lamertz.net
50733 Cologne| Work: +49 221 3091-121
Germany  | Priv: +49 221 445420 / +49 171 6900 310



Re: [BPQ] help!! any idea whats wrong with this??

2001-04-27 Thread Timothy Kimball


: Because someone (and with apologies to all, I don't recall off the top
: of my head who) correctly pointed out to me earlier in this thread that
: using map() here was inefficient. map() builds and returns an array, so
: there's no point in using it in this void context. Aside from that,
: both do the same thing. The postfix for is cleaner. =o)

I agree that the postfix is cleaner, but when I benchmark these, map
looks faster- though several months ago, map was slower (IIRC). Maybe
something changed in 5.6.0 to make map faster in a null context...?

Here's the script & output (Perl 5.6.0 on an Ultra 10):

archdev 10:58% more ./z
#!/archive/data1/bin/perl

use strict;
use Benchmark;

my @lines = qw(
10th 1st 2nd 3rd 4th 5th 6th 7th 8th 9th
a AAA AAAS Aarhus Aaron AAU ABA Ababa aback abacus
);

timethese(500_000,{
"1. map" => 'map { s/a// } @lines',
"2. foreach" => 'foreach ( @lines ) { s/a// }',
"3. for" => 's/a// for @lines',
});

archdev 10:58am 147% ./z
Benchmark: timing 500000 iterations of 1. map, 2. foreach, 3. for...
1. map:  0 wallclock secs ( 0.93 usr +  0.00 sys =  0.93 CPU) @ 537634.41/s (n=500000)
2. foreach:  2 wallclock secs ( 2.29 usr +  0.00 sys =  2.29 CPU) @ 218340.61/s (n=500000)
3. for:  1 wallclock secs ( 2.32 usr +  0.00 sys =  2.32 CPU) @ 215517.24/s (n=500000)

-- tdk



Re: [BPQ] help!! any idea whats wrong with this??

2001-04-27 Thread Paul


--- Bill Lawry <[EMAIL PROTECTED]> wrote:
> Neato & thanks.
> 
> I don't understand why one solution uses map & count and the other
> just uses count. Is map implied in the second solution?

Because someone (and with apologies to all, I don't recall off the top
of my head who) correctly pointed out to me earlier in this thread that
using map() here was inefficient. map() builds and returns an array, so
there's no point in using it in this void context. Aside from that,
both do the same thing. The postfix for is cleaner. =o)

> > >   map { $count{$_}++ } $data =~ /(\w+)/sog;
> > $count{$_}++ for $data =~ /([\w-]+)/sog;

btw, "foreach" might have been more readable here, but "foreach" is
pretty much an alias to "for", which is fewer characters to type
~wink~
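
A small sketch of the point above, using a made-up sample string: map in
void context, postfix for, and postfix foreach all fill the hash with
the same counts. The only difference is that map's return value goes
unused.

```perl
use strict;
use warnings;

my $data = "the cat and the hat and the bat";   # sample input, made up

my (%by_map, %by_for, %by_foreach);

map { $by_map{$_}++ } $data =~ /(\w+)/g;        # map's result is never used
$by_for{$_}++     for     $data =~ /(\w+)/g;    # postfix for
$by_foreach{$_}++ foreach $data =~ /(\w+)/g;    # foreach is the same keyword

print "$_: $by_map{$_} $by_for{$_} $by_foreach{$_}\n" for sort keys %by_map;
```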

> - Original Message -
> From: "Paul" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>; "Bill Lawry" <[EMAIL PROTECTED]>
> Cc: <[EMAIL PROTECTED]>
> Sent: Thursday, April 26, 2001 6:59 PM
> Subject: Re: [BPQ] help!! any idea whats wrong with this??
> 
> 
> >
> > --- Paul <[EMAIL PROTECTED]> wrote:
> > >
> > > --- Bill Lawry <[EMAIL PROTECTED]> wrote:
> > > > Pretty cool but when used on a file it breaks hyphenated words
> into
> > > > their components and counts them separately:
> > > >
> > > > 17 occurrences of 'Acct'
> > > > 3 occurrences of 'Authentic'
> > > > etc
> > > >
> > > > instead of:
> > > >
> > > > 3 occurrences of Acct-Authentic
> > > > 3 occurrences of Acct-Delay-Time
> > > > 1 occurrences of Acct-Input-Octets
> > > > 1 occurrences of Acct-Input-Packets
> > > > 1 occurrences of Acct-Output-Octets
> > > > 1 occurrences of Acct-Output-Packets
> > > > 3 occurrences of Acct-Session-Id
> > > > 1 occurrences of Acct-Session-Time
> > > > 3 occurrences of Acct-Status-Type
> > >
> > > Not what you want? Then let's edit the pattern. =o)
> > >
> > > Instead of
> > >   map { $count{$_}++ } $data =~ /(\w+)/sog;
> > >
> > > try
> > >$count{$_}++ } for $data =~ /([\w-]+)/sog;
> >
> > OOOOps........^Make that:
> >
> > $count{$_}++ for $data =~ /([\w-]+)/sog;
> >
> >
> > > > - Original Message -
> > > > From: "Michael Lamertz" <[EMAIL PROTECTED]>
> > > > To: <[EMAIL PROTECTED]>
> > > > Cc: "Chris Brown" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
> > > > Sent: Tuesday, April 24, 2001 1:03 PM
> > > > Subject: Re: [BPQ] help!! any idea whats wrong with this??
> > > >
> > > >
> > > > > Paul ([EMAIL PROTECTED]) wrote:
> > > > > >
> > > > > >  #!/usr/local/bin/perl -w
> > > > > >
> > > > > >  use strict
> > > > > >  open (FILE,$0) or die $!; # this reads itself
> > > > > >  my($data,%count);
> > > > > >  { local $/ = undef;   # erases the record seperator
> for
> > > this
> > > > block
> > > > > > >$data = <FILE>; # slurps in the whole file to
> $data
> > > > > >  }
> > > > > >  close(FILE);  # good habit
> > > > > >  map { $count{$_}++ } $data =~ /(\w+)/sog; # watch the
> context!
> > > > > >  print map { "$count{$_} occurances of '$_'\n" } sort keys
> > > > %count;
> > > > > >
> > > > > > Perl is a wonderfully concise language.
> > > > > > The above is strictly given as an example of a few
> performance
> > > > tricks
> > > > > > that are worth researching. =o)
> > > > >
> > > > > I agree printing the map output, but I disagree using map to
> > > > calculate
> > > > > the sums.  map always generates a new array that immediately
> gets
> > > > dumped
> > > > > since it's not assigned.  A foreach would be nicer to system
> > > > resources
> > > > > and better to read.  To make it short, use it postfix:
> > > > >
> > > > > $count{$_}++ foreach ($data=~ /.../);
> > > > >
> > > > > Check 'perldoc perlfaq6' for reference.
> > > > >
> > > > > --
> > > > >  If we fail, we will lose the war.
> > > > >
> > > > > Michael Lamertz  | [EMAIL PROTECTED] /
> > > > [EMAIL PROTECTED]
> > > > > Nordstr. 49  | http://www.lamertz.net
> > > > > 50733 Cologne| Work: +49 221 3091-121
> > > > > Germany  | Priv: +49 221 445420 / +49 171
> 6900
> > > 310
> > > > >
> > > >
> > >
> > >
> >
> >
> >
> 


__
Do You Yahoo!?
Yahoo! Auctions - buy the things you want at great prices
http://auctions.yahoo.com/



Re: [BPQ] help!! any idea whats wrong with this??

2001-04-27 Thread Bill Lawry

Neato & thanks.

I don't understand why one solution uses map & count and the other just uses
count. Is map implied in the second solution?

- Original Message -
From: "Paul" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>; "Bill Lawry" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Thursday, April 26, 2001 6:59 PM
Subject: Re: [BPQ] help!! any idea whats wrong with this??


>
> --- Paul <[EMAIL PROTECTED]> wrote:
> >
> > --- Bill Lawry <[EMAIL PROTECTED]> wrote:
> > > Pretty cool but when used on a file it breaks hyphenated words into
> > > their components and counts them separately:
> > >
> > > 17 occurrences of 'Acct'
> > > 3 occurrences of 'Authentic'
> > > etc
> > >
> > > instead of:
> > >
> > > 3 occurrences of Acct-Authentic
> > > 3 occurrences of Acct-Delay-Time
> > > 1 occurrences of Acct-Input-Octets
> > > 1 occurrences of Acct-Input-Packets
> > > 1 occurrences of Acct-Output-Octets
> > > 1 occurrences of Acct-Output-Packets
> > > 3 occurrences of Acct-Session-Id
> > > 1 occurrences of Acct-Session-Time
> > > 3 occurrences of Acct-Status-Type
> >
> > Not what you want? Then let's edit the pattern. =o)
> >
> > Instead of
> >   map { $count{$_}++ } $data =~ /(\w+)/sog;
> >
> > try
> >$count{$_}++ } for $data =~ /([\w-]+)/sog;
>
> OOOOps........^Make that:
>
>     $count{$_}++ for $data =~ /([\w-]+)/sog;
>
>
> > > - Original Message -
> > > From: "Michael Lamertz" <[EMAIL PROTECTED]>
> > > To: <[EMAIL PROTECTED]>
> > > Cc: "Chris Brown" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
> > > Sent: Tuesday, April 24, 2001 1:03 PM
> > > Subject: Re: [BPQ] help!! any idea whats wrong with this??
> > >
> > >
> > > > Paul ([EMAIL PROTECTED]) wrote:
> > > > >
> > > > >  #!/usr/local/bin/perl -w
> > > > >
> > > > >  use strict
> > > > >  open (FILE,$0) or die $!; # this reads itself
> > > > >  my($data,%count);
> > > > >  { local $/ = undef;   # erases the record seperator for
> > this
> > > block
> > > > > >$data = <FILE>; # slurps in the whole file to $data
> > > > >  }
> > > > >  close(FILE);  # good habit
> > > > >  map { $count{$_}++ } $data =~ /(\w+)/sog; # watch the context!
> > > > >  print map { "$count{$_} occurances of '$_'\n" } sort keys
> > > %count;
> > > > >
> > > > > Perl is a wonderfully concise language.
> > > > > The above is strictly given as an example of a few performance
> > > tricks
> > > > > that are worth researching. =o)
> > > >
> > > > I agree printing the map output, but I disagree using map to
> > > calculate
> > > > the sums.  map always generates a new array that immediately gets
> > > dumped
> > > > since it's not assigned.  A foreach would be nicer to system
> > > resources
> > > > and better to read.  To make it short, use it postfix:
> > > >
> > > > $count{$_}++ foreach ($data=~ /.../);
> > > >
> > > > Check 'perldoc perlfaq6' for reference.
> > > >
> > > > --
> > > >  If we fail, we will lose the war.
> > > >
> > > > Michael Lamertz  | [EMAIL PROTECTED] /
> > > [EMAIL PROTECTED]
> > > > Nordstr. 49  | http://www.lamertz.net
> > > > 50733 Cologne| Work: +49 221 3091-121
> > > > Germany  | Priv: +49 221 445420 / +49 171 6900
> > 310
> > > >
> > >
> >
> >
>
>
>




Re: [BPQ] help!! any idea whats wrong with this??

2001-04-26 Thread Paul


--- Paul <[EMAIL PROTECTED]> wrote:
> 
> --- Bill Lawry <[EMAIL PROTECTED]> wrote:
> > Pretty cool but when used on a file it breaks hyphenated words into
> > their components and counts them separately:
> > 
> > 17 occurrences of 'Acct'
> > 3 occurrences of 'Authentic'
> > etc
> > 
> > instead of:
> > 
> > 3 occurrences of Acct-Authentic
> > 3 occurrences of Acct-Delay-Time
> > 1 occurrences of Acct-Input-Octets
> > 1 occurrences of Acct-Input-Packets
> > 1 occurrences of Acct-Output-Octets
> > 1 occurrences of Acct-Output-Packets
> > 3 occurrences of Acct-Session-Id
> > 1 occurrences of Acct-Session-Time
> > 3 occurrences of Acct-Status-Type
> 
> Not what you want? Then let's edit the pattern. =o)
> 
> Instead of
>   map { $count{$_}++ } $data =~ /(\w+)/sog;
> 
> try 
>$count{$_}++ } for $data =~ /([\w-]+)/sog;

OOOOps........^Make that:

$count{$_}++ for $data =~ /([\w-]+)/sog;
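
A short sketch of what the pattern change does, using sample input made
up from the attribute names quoted above. The /s and /o flags are left
out here on the assumption that they make no difference to these two
patterns, since neither contains a "." or an interpolated variable.

```perl
use strict;
use warnings;

my $data = "Acct-Session-Id Acct-Status-Type Acct-Session-Id";   # sample input

my (%plain, %hyphenated);
$plain{$_}++      for $data =~ /(\w+)/g;       # \w stops at every hyphen
$hyphenated{$_}++ for $data =~ /([\w-]+)/g;    # the class also accepts "-"

print "$plain{$_} occurrences of '$_'\n"      for sort keys %plain;
print "$hyphenated{$_} occurrences of '$_'\n" for sort keys %hyphenated;
```

With the first pattern 'Acct' is counted three times on its own; with
the second, 'Acct-Session-Id' is counted twice as a whole word.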


> > - Original Message -
> > From: "Michael Lamertz" <[EMAIL PROTECTED]>
> > To: <[EMAIL PROTECTED]>
> > Cc: "Chris Brown" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
> > Sent: Tuesday, April 24, 2001 1:03 PM
> > Subject: Re: [BPQ] help!! any idea whats wrong with this??
> > 
> > 
> > > Paul ([EMAIL PROTECTED]) wrote:
> > > >
> > > >  #!/usr/local/bin/perl -w
> > > >
> > > >  use strict
> > > >  open (FILE,$0) or die $!; # this reads itself
> > > >  my($data,%count);
> > > >  { local $/ = undef;   # erases the record seperator for
> this
> > block
> > > >$data = <FILE>; # slurps in the whole file to $data
> > > >  }
> > > >  close(FILE);  # good habit
> > > >  map { $count{$_}++ } $data =~ /(\w+)/sog; # watch the context!
> > > >  print map { "$count{$_} occurances of '$_'\n" } sort keys
> > %count;
> > > >
> > > > Perl is a wonderfully concise language.
> > > > The above is strictly given as an example of a few performance
> > tricks
> > > > that are worth researching. =o)
> > >
> > > I agree printing the map output, but I disagree using map to
> > calculate
> > > the sums.  map always generates a new array that immediately gets
> > dumped
> > > since it's not assigned.  A foreach would be nicer to system
> > resources
> > > and better to read.  To make it short, use it postfix:
> > >
> > > $count{$_}++ foreach ($data=~ /.../);
> > >
> > > Check 'perldoc perlfaq6' for reference.
> > >
> > > --
> > >  If we fail, we will lose the war.
> > >
> > > Michael Lamertz  | [EMAIL PROTECTED] /
> > [EMAIL PROTECTED]
> > > Nordstr. 49  | http://www.lamertz.net
> > > 50733 Cologne| Work: +49 221 3091-121
> > > Germany  | Priv: +49 221 445420 / +49 171 6900
> 310
> > >
> > 
> 
> 





Re: [BPQ] help!! any idea whats wrong with this??

2001-04-26 Thread Paul


--- Bill Lawry <[EMAIL PROTECTED]> wrote:
> Pretty cool but when used on a file it breaks hyphenated words into
> their components and counts them separately:
> 
> 17 occurrences of 'Acct'
> 3 occurrences of 'Authentic'
> etc
> 
> instead of:
> 
> 3 occurrences of Acct-Authentic
> 3 occurrences of Acct-Delay-Time
> 1 occurrences of Acct-Input-Octets
> 1 occurrences of Acct-Input-Packets
> 1 occurrences of Acct-Output-Octets
> 1 occurrences of Acct-Output-Packets
> 3 occurrences of Acct-Session-Id
> 1 occurrences of Acct-Session-Time
> 3 occurrences of Acct-Status-Type

Not what you want? Then let's edit the pattern. =o)

Instead of
  map { $count{$_}++ } $data =~ /(\w+)/sog;

try 
   $count{$_}++ } for $data =~ /([\w-]+)/sog;


> - Original Message -
> From: "Michael Lamertz" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Cc: "Chris Brown" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
> Sent: Tuesday, April 24, 2001 1:03 PM
> Subject: Re: [BPQ] help!! any idea whats wrong with this??
> 
> 
> > Paul ([EMAIL PROTECTED]) wrote:
> > >
> > >  #!/usr/local/bin/perl -w
> > >
> > >  use strict
> > >  open (FILE,$0) or die $!; # this reads itself
> > >  my($data,%count);
> > >  { local $/ = undef;   # erases the record seperator for this
> block
> > >$data = <FILE>; # slurps in the whole file to $data
> > >  }
> > >  close(FILE);  # good habit
> > >  map { $count{$_}++ } $data =~ /(\w+)/sog; # watch the context!
> > >  print map { "$count{$_} occurances of '$_'\n" } sort keys
> %count;
> > >
> > > Perl is a wonderfully concise language.
> > > The above is strictly given as an example of a few performance
> tricks
> > > that are worth researching. =o)
> >
> > I agree printing the map output, but I disagree using map to
> calculate
> > the sums.  map always generates a new array that immediately gets
> dumped
> > since it's not assigned.  A foreach would be nicer to system
> resources
> > and better to read.  To make it short, use it postfix:
> >
> > $count{$_}++ foreach ($data=~ /.../);
> >
> > Check 'perldoc perlfaq6' for reference.
> >
> > --
> >  If we fail, we will lose the war.
> >
> > Michael Lamertz  | [EMAIL PROTECTED] /
> [EMAIL PROTECTED]
> > Nordstr. 49  | http://www.lamertz.net
> > 50733 Cologne| Work: +49 221 3091-121
> > Germany  | Priv: +49 221 445420 / +49 171 6900 310
> >
> 





Re: [BPQ] help!! any idea whats wrong with this??

2001-04-26 Thread Bill Lawry

Pretty cool but when used on a file it breaks hyphenated words into their
components and counts them separately:

17 occurrences of 'Acct'
3 occurrences of 'Authentic'
etc

instead of:

3 occurrences of Acct-Authentic
3 occurrences of Acct-Delay-Time
1 occurrences of Acct-Input-Octets
1 occurrences of Acct-Input-Packets
1 occurrences of Acct-Output-Octets
1 occurrences of Acct-Output-Packets
3 occurrences of Acct-Session-Id
1 occurrences of Acct-Session-Time
3 occurrences of Acct-Status-Type

- Original Message -
From: "Michael Lamertz" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: "Chris Brown" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Tuesday, April 24, 2001 1:03 PM
Subject: Re: [BPQ] help!! any idea whats wrong with this??


> Paul ([EMAIL PROTECTED]) wrote:
> >
> >  #!/usr/local/bin/perl -w
> >
> >  use strict
> >  open (FILE,$0) or die $!; # this reads itself
> >  my($data,%count);
> >  { local $/ = undef;   # erases the record seperator for this block
> >$data = <FILE>; # slurps in the whole file to $data
> >  }
> >  close(FILE);  # good habit
> >  map { $count{$_}++ } $data =~ /(\w+)/sog; # watch the context!
> >  print map { "$count{$_} occurances of '$_'\n" } sort keys %count;
> >
> > Perl is a wonderfully concise language.
> > The above is strictly given as an example of a few performance tricks
> > that are worth researching. =o)
>
> I agree printing the map output, but I disagree using map to calculate
> the sums.  map always generates a new array that immediately gets dumped
> since it's not assigned.  A foreach would be nicer to system resources
> and better to read.  To make it short, use it postfix:
>
> $count{$_}++ foreach ($data=~ /.../);
>
> Check 'perldoc perlfaq6' for reference.
>
> --
>  If we fail, we will lose the war.
>
> Michael Lamertz  | [EMAIL PROTECTED] / [EMAIL PROTECTED]
> Nordstr. 49  | http://www.lamertz.net
> 50733 Cologne| Work: +49 221 3091-121
> Germany  | Priv: +49 221 445420 / +49 171 6900 310
>




Re: [BPQ] help!! any idea whats wrong with this??

2001-04-24 Thread Paul


Agreed, and thanks for pointing that out!

--- Michael Lamertz <[EMAIL PROTECTED]> wrote:
> Paul ([EMAIL PROTECTED]) wrote:
> > 
> >  #!/usr/local/bin/perl -w
> > 
> >  use strict 
> >  open (FILE,$0) or die $!; # this reads itself
> >  my($data,%count); 
> >  { local $/ = undef;   # erases the record seperator for this
> block
> >$data = <FILE>; # slurps in the whole file to $data
> >  }   
> >  close(FILE);  # good habit
> >  map { $count{$_}++ } $data =~ /(\w+)/sog; # watch the context!
> >  print map { "$count{$_} occurances of '$_'\n" } sort keys %count;
> > 
> > Perl is a wonderfully concise language.
> > The above is strictly given as an example of a few performance
> tricks
> > that are worth researching. =o)
> 
> I agree printing the map output, but I disagree using map to
> calculate
> the sums.  map always generates a new array that immediately gets
> dumped
> since it's not assigned.  A foreach would be nicer to system
> resources
> and better to read.  To make it short, use it postfix:
> 
> $count{$_}++ foreach ($data=~ /.../);
> 
> Check 'perldoc perlfaq6' for reference.
> 
> -- 
>  If we fail, we will lose the war.
> 
> Michael Lamertz  | [EMAIL PROTECTED] /
> [EMAIL PROTECTED]
> Nordstr. 49  | http://www.lamertz.net
> 50733 Cologne| Work: +49 221 3091-121
> Germany  | Priv: +49 221 445420 / +49 171 6900 310





Re: [BPQ] help!! any idea whats wrong with this??

2001-04-24 Thread Michael Lamertz

Paul ([EMAIL PROTECTED]) wrote:
> 
>  #!/usr/local/bin/perl -w
> 
>  use strict 
>  open (FILE,$0) or die $!; # this reads itself
>  my($data,%count); 
>  { local $/ = undef;   # erases the record seperator for this block
>$data = <FILE>; # slurps in the whole file to $data
>  }   
>  close(FILE);  # good habit
>  map { $count{$_}++ } $data =~ /(\w+)/sog; # watch the context!
>  print map { "$count{$_} occurances of '$_'\n" } sort keys %count;
> 
> Perl is a wonderfully concise language.
> The above is strictly given as an example of a few performance tricks
> that are worth researching. =o)

I agree printing the map output, but I disagree using map to calculate
the sums.  map always generates a new array that immediately gets dumped
since it's not assigned.  A foreach would be nicer to system resources
and better to read.  To make it short, use it postfix:

$count{$_}++ foreach ($data=~ /.../);

Check 'perldoc perlfaq6' for reference.

-- 
 If we fail, we will lose the war.

Michael Lamertz  | [EMAIL PROTECTED] / [EMAIL PROTECTED]
Nordstr. 49  | http://www.lamertz.net
50733 Cologne| Work: +49 221 3091-121
Germany  | Priv: +49 221 445420 / +49 171 6900 310



Re: [BPQ] help!! any idea whats wrong with this??

2001-04-24 Thread Kevin Meltzer

Hi Chris,

You are getting only the last line of the file because of this:

> foreach $i (@lines) {
> @words = split(/\s+/, $i);
> }

You reassign the @words array each time, and end up with the last line only
when exiting the foreach loop. You may want to look at 'perldoc -f push' to see
how to add to an array.
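
To make the difference concrete, here is a small sketch with made-up
input lines. It runs the assignment and the push side by side on the
same data.

```perl
use strict;
use warnings;

my @lines = ("one two\n", "three four\n", "five\n");   # stand-in for file contents

my (@clobbered, @collected);
foreach my $line (@lines) {
    @clobbered = split /\s+/, $line;        # replaces the previous contents
    push @collected, split /\s+/, $line;    # appends to what is already there
}

print "assignment: @clobbered\n";   # prints "assignment: five"
print "push:       @collected\n";   # prints "push:       one two three four five"
```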

Here is how I would likely accomplish  this task:

#!/usr/bin/perl -w 

use strict;
my %counts;

open(FILE,"file.txt") or die "Can't open file: $!";
my @lines = <FILE>;
close FILE;

for (@lines) {
$counts{$_}++ for (split /\s+/);
}

print qq{$counts{$_} occurances of the word $_\n} for keys %counts;


Cheers,
Kevin

On Tue, Apr 24, 2001 at 10:17:02AM -0500, Chris Brown ([EMAIL PROTECTED]) spew-ed 
forth:
> so...this is suposed to count the words in FILE and return how many occourances of 
>each word there were...its not working for me thoughits only returning the count 
>for the last word in the file...help
> 
> #!/usr/local/bin/perl
> 
> open (FILE,"../www/main.php3");
> @lines=<FILE>;
> close(FILE);
> 
> foreach $i (@lines) {
> @words = split(/\s+/, $i);
> }
> 
> foreach $word (@words) {
> $wordcount{"$word"}=0;
> }
> 
> foreach $word2 (@words) {
> $wordcount{"$word2"}+=1;
> }
> 
> foreach $key (keys (%wordcount)) {
> print "$wordcount{$key} occourances of the whord $key\n";
> }
> 

-- 
Down that path lies madness.  On the other hand, the road to hell is
paved with melting snowballs. 
--Larry Wall in <[EMAIL PROTECTED]>



Re: [BPQ] help!! any idea whats wrong with this??

2001-04-24 Thread Paul


--- Chris Brown <[EMAIL PROTECTED]> wrote:
> so...this is suposed to count the words in FILE and return how many
> occourances of each word there were...its not working for me
> thoughits only returning the count for the last word in the
> file...help
> 
> #!/usr/local/bin/perl
> 
> open (FILE,"../www/main.php3");
> @lines=<FILE>;
> close(FILE);
> 
> foreach $i (@lines) {
> @words = split(/\s+/, $i);
> }
> 
> foreach $word (@words) {
> $wordcount{"$word"}=0;
> }
> 
> foreach $word2 (@words) {
> $wordcount{"$word2"}+=1;
> }
> 
> foreach $key (keys (%wordcount)) {
> print "$wordcount{$key} occourances of the whord $key\n";
> }

TMTOWTDI :o]

 #!/usr/local/bin/perl -w

 use strict;
 open (FILE,$0) or die $!; # this reads itself
 my($data,%count); 
 { local $/ = undef;   # erases the record seperator for this block
   $data = <FILE>; # slurps in the whole file to $data
 }   
 close(FILE);  # good habit
 map { $count{$_}++ } $data =~ /(\w+)/sog; # watch the context!
 print map { "$count{$_} occurances of '$_'\n" } sort keys %count;

Perl is a wonderfully concise language.
The above is strictly given as an example of a few performance tricks
that are worth researching. =o)
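
For anyone researching the slurp trick, here is a minimal sketch of the
same idea wrapped in a sub. The sub name slurp, the lexical filehandle,
and the three-argument open are choices made for this illustration, not
what the script above uses, and the temporary file exists only so the
example has something to read.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a whole file into one scalar (hypothetical helper for this sketch).
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!";
    local $/;               # record separator is undef until the sub returns
    my $data = <$fh>;       # one read returns the entire file
    close $fh;
    return $data;
}

# Write a small sample file to read back.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "foo bar foo\n";
close $tmp;

my $data = slurp($path);
my %count;
$count{$_}++ for $data =~ /(\w+)/g;
print "$count{$_} occurrences of '$_'\n" for sort keys %count;
```

Because $/ is localized inside the sub, the rest of the program still
reads files line by line as usual.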

Paul
 




Re: [BPQ] help!! any idea whats wrong with this??

2001-04-24 Thread Sean O'Leary

At 11:17 AM 4/24/2001, you wrote:
>so...this is suposed to count the words in FILE and return how many 
>occourances of each word there were...its not working for me thoughits 
>only returning the count for the last word in the file...help
>
>#!/usr/local/bin/perl
>
>open (FILE,"../www/main.php3");
>@lines=<FILE>;
>close(FILE);
>
>foreach $i (@lines) {
>@words = split(/\s+/, $i);
>}

The loop above clobbers @words each time.  The loops below only work on the 
last line of the file.

>foreach $word (@words) {
>$wordcount{"$word"}=0;
>}

You're initializing every entry in the hash to 0, and you don't have to do 
that.  The rest of the code looks good. But we still have to deal with the 
problem of not getting all of the words into @words.

I think this would work, if used as the first loop of the program.

foreach (@lines) {
 push @words, split /\s+/;
}

With the loop above, as you split words out of @lines, they get _added_ to 
@words.  The assignment operator, =, would clobber the value already there, 
leaving us with just the last thing we assigned to it, the last line of the 
file.  Look in perlfunc for the push, pop, shift, and unshift 
functions.  If you don't know about stacks and queues, get a decent book on 
data structures and check them out.  They are amazingly powerful for how 
simple they are, and Perl is nice enough to have all the stuff built in so 
you can treat regular arrays like either (or both) of them.

As for hashes, when you use a hash key for the first time, the value is 
undef.  Undef, in a numeric context, looks like 0.  So all we have to do is 
add 1 for each time we see a word, and we are ok.  There's no need to 
initialize them all to zero.  So your program would become something that 
looks like this.

use strict;

open FILE, "../www/main.php3" or die "Can't open the file: $!";
# I added the "or die ..." above, because you want your
# program to halt if you have no data to work on.  You
# also want to check the return values of functions that
# do stuff outside of your program, to make sure that
# they succeed.
my @lines = <FILE>;
close(FILE);

my @words;    # declared with my, since use strict requires it
foreach (@lines) {
 push @words, split /\s+/;
}

my %wordcount = ();

foreach my $word (@words) {
 # Oh, my $word!  : )
 $wordcount{$word}++;
 # $var++ is shorthand for $var += 1
 # but either is fine.
}

But, this can be cleaned up further!  You can compress it all down to one 
loop, in a few different ways.  Take another look at it Chris, play with 
it, and crunch it down.  Less code, fewer bugs, less stuff to worry about.

Good luck!

Thank you for your time,

Sean.




Re: [BPQ] help!! any idea whats wrong with this??

2001-04-24 Thread Collin Rogowski

The problem is that you overwrite the (global) array @words for each line.
You go through @lines and the split overwrites @words!
While we are at it... ;-)

You do not need that many loops. The program will be much simpler like
this:

open(FILE, "yourFileName");

while ($line = <FILE>) { #no need to read the file into an array
  @words = split/\s+/, $line; #just process each line as it comes
  foreach $word (@words) {
    $wordcount{$word} += 1; #no need to initialize the hash with 0,
                            #perl does that for you
  }
}
close(FILE);

foreach $word (keys %wordcount) {
  print $word, ": ", $wordcount{$word}, "\n";
}


hope this helps,

cr

P.S.:
If you have questions about the code above, feel free to ask.


On Tue, 24 Apr 2001 10:17:02 -0500, Chris Brown said:

> so...this is suposed to count the words in FILE and return how many occourances of 
>each word there were...its not working for me thoughits only returning the count 
>for the last word in the file...help
>  
>  #!/usr/local/bin/perl
>  
>  open (FILE,"../www/main.php3");
>  @lines=<FILE>;
>  close(FILE);
>  
>  foreach $i (@lines) {
>  @words = split(/\s+/, $i);
>  }
>  
>  foreach $word (@words) {
>  $wordcount{"$word"}=0;
>  }
>  
>  foreach $word2 (@words) {
>  $wordcount{"$word2"}+=1;
>  }
>  
>  foreach $key (keys (%wordcount)) {
>  print "$wordcount{$key} occourances of the whord $key\n";
>  }
>  
>  




Re: [BPQ] help!! any idea whats wrong with this??

2001-04-24 Thread Casey West

On Tue, Apr 24, 2001 at 10:17:02AM -0500, Chris Brown wrote:
: so...this is suposed to count the words in FILE and return how many occourances of 
:each word there were...its not working for me thoughits only returning the count 
:for the last word in the file...help

The following is a generality, not directly related to this question (
unless it is homework ;).

Please note that I will not allow this list to do anyone's homework.
Answering homework style questions with simple 'go read xxx' is fine,
though.  Pointers are very different from doing work for a student.

-- 
Casey West



Re: [BPQ] help!! any idea whats wrong with this??

2001-04-24 Thread Timothy Kimball


: so...this is suposed to count the words in FILE and return how many occourances of 
:each word there were...its not working for me thoughits only returning the count 
:for the last word in the file...help

Think: In the first loop, what happens to the first line when you move
to the second?

(Sorry for the terse answer, but this one sounds like a homework
problem. ;)

--
Tim Kimball · ACDSD / MAST¦ 
Space Telescope Science Institute ¦ We are here on Earth to do good to others.
3700 San Martin Drive ¦ What the others are here for, I don't know.
Baltimore MD 21218 USA¦   -- W.H. Auden