Re: [BPQ] help!! any idea whats wrong with this??
: Ah, a Heisenbug. There's a problem with your benchmarking:

Yep, you're right. map is slightly slower when it actually has something to do. I stand corrected... again.

So the moral of the story is: If you want your code to run really fast, make it do nothing. ;)

That's what I love about Perl: been programming with it for 8 years, & still learning.

-- tdk
Re: [BPQ] help!! any idea whats wrong with this??
Timothy Kimball ([EMAIL PROTECTED]) wrote:
>
> : Because someone (and with apologies to all, I don't recall off the top
> : of my head who) correctly pointed out to me earlier in this thread that
> : using map() here was inefficient. map() builds and returns an array, so
> : there's no point in using it in this void context. Aside from that,
> : both do the same thing. The postfix for is cleaner. =o)
>
> I agree that the postfix is cleaner, but when I benchmark these, map
> looks faster - though several months ago, map was slower (IIRC). Maybe
> something changed in 5.6.0 to make map faster in a null context...?
>
> Here's the script & output (Perl 5.6.0 on an Ultra 10):

Whoa! It was me who noted that map should be avoided in void context, but trying your benchmark, I had the same results. However, I had done some benchmarks of my own *before* stating that map should not be used, and I got completely different results. I was suspicious, since your results are far too fast - a U10 is not that big a machine, is it? ;-)

...hack, hack, hack...

Ah, a Heisenbug. There's a problem with your benchmarking:

-- snip --
my @lines = qw(
...
timethese(500_000,{
    "1. map" => 'map { s/a// } @lines',
-- snip --

Inside 'timethese', @lines is unknown and thus empty. Looks as if the Benchmark module ignores '$@' after the eval, but I haven't checked for that. But the fact is, you're running 500_000 loops on an empty list, and map doesn't need to create any new list at all - well, at least it looks as if map is *pretty* fast on empty lists.

There are 2 modifications required in your benchmark:

a. make the @lines array global, so it's visible inside 'timethese'
b. make sure the s/// doesn't truncate the string, so there's still some work to do after 10 test loops. (I did that by replacing the s/// with a switching tr///...)

( c. as a not really necessary add-on I decided to create more random test data (yes, I *am* using nested maps there >:-> ). )

Here's my version of your benchmark:

-- snip --
kanku-dai:~$ cat check.pl
#!/usr/bin/perl -w

use strict;
use Benchmark;
use vars qw{@lines};

my @chars=('A'..'Z', 'a'..'z', 0 .. 9, ' ');

@lines=map { join('', @chars[map { rand @chars } (0 .. 63)]) } (1 .. 10);

timethese(500_000,{
    "1. map"     => 'map { tr/abAB/baBA/ } @lines',
    "2. foreach" => 'foreach ( @lines ) { tr/abAB/baBA/ }',
    "3. for"     => 'tr/abAB/baBA/ for @lines',
});
-- snip --

Here are the new results:

-- new --
kanku-dai:~$ perl check.pl
Benchmark: timing 500000 iterations of 1. map, 2. foreach, 3. for...
    1. map:  7 wallclock secs ( 6.95 usr +  0.01 sys =  6.96 CPU) @ 71839.08/s (n=500000)
2. foreach:  9 wallclock secs ( 8.38 usr +  0.01 sys =  8.39 CPU) @ 59594.76/s (n=500000)
    3. for:  9 wallclock secs ( 8.13 usr +  0.01 sys =  8.14 CPU) @ 61425.06/s (n=500000)
kanku-dai:~$
-- new --

So, it looks as if you're right, map *IS* a bit faster on small data sets, but not by the margin that your benchmark suggested. Increasing the amount of data makes that difference go away, however. Here's the data for 500 benchmark loops over 10_000 lines of data:

-- mod_new --
kanku-dai:~$ perl check.pl
Benchmark: timing 500 iterations of 1. map, 2. foreach, 3. for...
    1. map: 10 wallclock secs ( 9.71 usr +  0.04 sys =  9.75 CPU) @ 51.28/s (n=500)
2. foreach:  9 wallclock secs ( 9.58 usr +  0.01 sys =  9.59 CPU) @ 52.14/s (n=500)
    3. for:  9 wallclock secs ( 9.44 usr +  0.02 sys =  9.46 CPU) @ 52.85/s (n=500)
kanku-dai:~$
-- mod_new --

Conclusion: The perlfaq6 information seems outdated, so the only argument against map is the question of style, readability and personal taste - naturally, I stick with my style ;-)

Mike

-- 
If we fail, we will lose the war.

Michael Lamertz | [EMAIL PROTECTED] / [EMAIL PROTECTED]
Nordstr. 49     | http://www.lamertz.net
50733 Cologne   | Work: +49 221 3091-121
Germany         | Priv: +49 221 445420 / +49 171 6900 310
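The scoping problem described above can be reproduced without Benchmark at all. The following is only a sketch under stated assumptions: StringRunner and count_items are made-up names standing in for any module that evals a caller-supplied code string from inside its own subroutine, and the sample data is invented. It shows that a file-scoped "my" array is invisible to such an eval, while a package variable or a closure is not.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a module that evals caller-supplied code strings
# from inside its own subroutine (names invented for this sketch).
{
    package StringRunner;

    sub count_items {
        my ($code) = @_;
        no strict 'vars';      # unknown names fall back to package variables
        no warnings 'once';
        my @items = eval "package main; $code";
        die $@ if $@;
        return scalar @items;  # how many elements the code string produced
    }
}

my  @lexical_lines = qw(10th 1st 2nd);  # lexical: not visible to the eval above
our @global_lines  = qw(10th 1st 2nd);  # package variable: visible

printf "lexical via string:  %d\n", StringRunner::count_items('@lexical_lines');
printf "global via string:   %d\n", StringRunner::count_items('@global_lines');

# A code reference closes over the lexical, so no global is needed.
my $closure = sub { scalar @lexical_lines };
printf "lexical via coderef: %d\n", $closure->();
```

Run as written, this should report 0, 3 and 3. Benchmark's timethese also accepts code references in place of strings, so passing something like sub { tr/abAB/baBA/ for @lines } would be another way to keep @lines lexical, at the cost of one extra subroutine call per iteration.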
Re: [BPQ] help!! any idea whats wrong with this??
: Because someone (and with apologies to all, I don't recall off the top
: of my head who) correctly pointed out to me earlier in this thread that
: using map() here was inefficient. map() builds and returns an array, so
: there's no point in using it in this void context. Aside from that,
: both do the same thing. The postfix for is cleaner. =o)

I agree that the postfix is cleaner, but when I benchmark these, map looks faster - though several months ago, map was slower (IIRC). Maybe something changed in 5.6.0 to make map faster in a null context...?

Here's the script & output (Perl 5.6.0 on an Ultra 10):

archdev 10:58% more ./z
#!/archive/data1/bin/perl

use strict;
use Benchmark;

my @lines = qw(
    10th 1st 2nd 3rd 4th 5th 6th 7th 8th 9th
    a AAA AAAS Aarhus Aaron AAU ABA Ababa aback abacus
);

timethese(500_000,{
    "1. map"     => 'map { s/a// } @lines',
    "2. foreach" => 'foreach ( @lines ) { s/a// }',
    "3. for"     => 's/a// for @lines',
});

archdev 10:58am 147% ./z
Benchmark: timing 500000 iterations of 1. map, 2. foreach, 3. for...
    1. map:  0 wallclock secs ( 0.93 usr +  0.00 sys =  0.93 CPU) @ 537634.41/s (n=500000)
2. foreach:  2 wallclock secs ( 2.29 usr +  0.00 sys =  2.29 CPU) @ 218340.61/s (n=500000)
    3. for:  1 wallclock secs ( 2.32 usr +  0.00 sys =  2.32 CPU) @ 215517.24/s (n=500000)

-- tdk
Re: [BPQ] help!! any idea whats wrong with this??
--- Bill Lawry <[EMAIL PROTECTED]> wrote:
> Neato & thanks.
>
> I don't understand why one solution uses map & count and the other
> just uses count. Is map implied in the second solution?

Because someone (and with apologies to all, I don't recall off the top of my head who) correctly pointed out to me earlier in this thread that using map() here was inefficient. map() builds and returns an array, so there's no point in using it in this void context. Aside from that, both do the same thing. The postfix for is cleaner. =o)

> > > map { $count{$_}++ } $data =~ /(\w+)/sog;
>
> > $count{$_}++ for $data =~ /([\w-]+)/sog;

btw, "foreach" might have been more readable here, but "foreach" is pretty much an alias to "for", which is fewer characters to type ~wink~
Re: [BPQ] help!! any idea whats wrong with this??
Neato & thanks.

I don't understand why one solution uses map & count and the other just uses count. Is map implied in the second solution?
Re: [BPQ] help!! any idea whats wrong with this??
--- Paul <[EMAIL PROTECTED]> wrote:
> Not what you want? Then let's edit the pattern. =o)
>
> Instead of
>    map { $count{$_}++ } $data =~ /(\w+)/sog;
>
> try
>    $count{$_}++ } for $data =~ /([\w-]+)/sog;

Oops, there's a stray "}" in that last line. Make that:

   $count{$_}++ for $data =~ /([\w-]+)/sog;
Re: [BPQ] help!! any idea whats wrong with this??
--- Bill Lawry <[EMAIL PROTECTED]> wrote:
> Pretty cool but when used on a file it breaks hyphenated words into
> their components and counts them separately:
>
> 17 occurrences of 'Acct'
> 3 occurrences of 'Authentic'
> etc
>
> instead of:
>
> 3 occurrences of Acct-Authentic
> 3 occurrences of Acct-Delay-Time
> 1 occurrences of Acct-Input-Octets
> 1 occurrences of Acct-Input-Packets
> 1 occurrences of Acct-Output-Octets
> 1 occurrences of Acct-Output-Packets
> 3 occurrences of Acct-Session-Id
> 1 occurrences of Acct-Session-Time
> 3 occurrences of Acct-Status-Type

Not what you want? Then let's edit the pattern. =o)

Instead of
   map { $count{$_}++ } $data =~ /(\w+)/sog;

try
   $count{$_}++ } for $data =~ /([\w-]+)/sog;
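To see what the pattern change buys, here is a small sketch comparing the two character classes on a made-up sample string. The helper name count_tokens and the sample data are assumptions for illustration, not part of anyone's posted script.

```perl
use strict;
use warnings;

# Count every token matched by $pattern; returns a hash ref of token => count.
sub count_tokens {
    my ($data, $pattern) = @_;
    my %count;
    $count{$_}++ for $data =~ /($pattern)/g;
    return \%count;
}

# Made-up sample in the style of the attribute names above.
my $data = "Acct-Authentic Acct-Delay-Time Acct-Authentic";

my $plain  = count_tokens($data, qr/\w+/);     # splits at every hyphen
my $hyphen = count_tokens($data, qr/[\w-]+/);  # keeps hyphenated names whole

print "$plain->{Acct} occurrences of 'Acct'\n";
print "$hyphen->{'Acct-Authentic'} occurrences of 'Acct-Authentic'\n";
```

With \w+ the hyphen ends each match, so "Acct" is counted once per attribute name. Adding "-" to the character class lets a match run across the hyphens, so each full name becomes a single key.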
Re: [BPQ] help!! any idea whats wrong with this??
Pretty cool but when used on a file it breaks hyphenated words into their components and counts them separately:

17 occurrences of 'Acct'
3 occurrences of 'Authentic'
etc

instead of:

3 occurrences of Acct-Authentic
3 occurrences of Acct-Delay-Time
1 occurrences of Acct-Input-Octets
1 occurrences of Acct-Input-Packets
1 occurrences of Acct-Output-Octets
1 occurrences of Acct-Output-Packets
3 occurrences of Acct-Session-Id
1 occurrences of Acct-Session-Time
3 occurrences of Acct-Status-Type
Re: [BPQ] help!! any idea whats wrong with this??
Agreed, and thanks for pointing that out!

--- Michael Lamertz <[EMAIL PROTECTED]> wrote:
> I agree printing the map output, but I disagree using map to
> calculate the sums. map always generates a new array that immediately
> gets dumped since it's not assigned. A foreach would be nicer to
> system resources and better to read. To make it short, use it postfix:
>
> $count{$_}++ foreach ($data=~ /.../);
>
> Check 'perldoc perlfaq6' for reference.
Re: [BPQ] help!! any idea whats wrong with this??
Paul ([EMAIL PROTECTED]) wrote:
>
> map { $count{$_}++ } $data =~ /(\w+)/sog; # watch the context!
> print map { "$count{$_} occurrences of '$_'\n" } sort keys %count;
>
> Perl is a wonderfully concise language.
> The above is strictly given as an example of a few performance tricks
> that are worth researching. =o)

I agree with printing the map output, but I disagree with using map to calculate the sums. map always generates a new array that immediately gets dumped since it's not assigned. A foreach would be nicer to system resources and easier to read. To make it short, use it postfix:

    $count{$_}++ foreach ($data=~ /.../);

Check 'perldoc perlfaq6' for reference.

-- 
If we fail, we will lose the war.

Michael Lamertz | [EMAIL PROTECTED] / [EMAIL PROTECTED]
Nordstr. 49     | http://www.lamertz.net
50733 Cologne   | Work: +49 221 3091-121
Germany         | Priv: +49 221 445420 / +49 171 6900 310
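As an illustration of the point about map's return value, the sketch below captures the list that map computes and compares the resulting counts with the postfix loop. The word list is made up, and the list is captured on purpose: whether perl actually builds it when map is called in void context may depend on the perl version, so this shows only what map evaluates, not what it costs.

```perl
use strict;
use warnings;

my @words = qw(a b a);   # made-up input

# map evaluates the block for each element and returns the results as a list.
my %by_map;
my @returned = map { $by_map{$_}++ } @words;

# The postfix loop runs the same statement and returns nothing.
my %by_for;
$by_for{$_}++ for @words;

print "map returned: @returned\n";
print "same counts\n"
    if $by_map{a} == $by_for{a} && $by_map{b} == $by_for{b};
```

The returned list holds the pre-increment values (0 0 1 here), which nobody wants in a counting loop. The two hashes end up identical, so the choice between the forms comes down to intent and readability.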
Re: [BPQ] help!! any idea whats wrong with this??
Hi Chris,

You are getting only the last line of the file because of this:

> foreach $i (@lines) {
>     @words = split(/\s+/, $i);
> }

You reassign the @words array each time, and end up with the last line only when exiting the foreach loop. You may want to look at 'perldoc -f push' to see how to add to an array.

Here is how I would likely accomplish this task:

#!/usr/bin/perl -w
use strict;

my %counts;
open(FILE,"file.txt") or die "Can't open file: $!";
my @lines = <FILE>;
close FILE;

for (@lines) {
    $counts{$_}++ for (split /\s+/);
}

print qq{$counts{$_} occurrences of the word $_\n} for keys %counts;

Cheers,
Kevin

On Tue, Apr 24, 2001 at 10:17:02AM -0500, Chris Brown ([EMAIL PROTECTED]) spew-ed forth:
> so...this is suposed to count the words in FILE and return how many occourances of
> each word there were...its not working for me thoughits only returning the count
> for the last word in the file...help
>
> #!/usr/local/bin/perl
>
> open (FILE,"../www/main.php3");
> @lines=<FILE>;
> close(FILE);
>
> foreach $i (@lines) {
>     @words = split(/\s+/, $i);
> }
>
> foreach $word (@words) {
>     $wordcount{"$word"}=0;
> }
>
> foreach $word2 (@words) {
>     $wordcount{"$word2"}+=1;
> }
>
> foreach $key (keys (%wordcount)) {
>     print "$wordcount{$key} occourances of the whord $key\n";
> }
>

-- 
Down that path lies madness. On the other hand, the road to hell is paved with melting snowballs.
    --Larry Wall in <[EMAIL PROTECTED]>
Re: [BPQ] help!! any idea whats wrong with this??
--- Chris Brown <[EMAIL PROTECTED]> wrote:
> so...this is suposed to count the words in FILE and return how many
> occourances of each word there were...its not working for me
> thoughits only returning the count for the last word in the
> file...help

TMTOWTDI :o]

#!/usr/local/bin/perl -w

use strict;
open (FILE,$0) or die $!;  # this reads itself
my($data,%count);
{ local $/ = undef;        # erases the record separator for this block
  $data = <FILE>;          # slurps in the whole file to $data
}
close(FILE);               # good habit
map { $count{$_}++ } $data =~ /(\w+)/sog; # watch the context!
print map { "$count{$_} occurrences of '$_'\n" } sort keys %count;

Perl is a wonderfully concise language.
The above is strictly given as an example of a few performance tricks that are worth researching. =o)

Paul
Re: [BPQ] help!! any idea whats wrong with this??
At 11:17 AM 4/24/2001, you wrote:
>so...this is suposed to count the words in FILE and return how many
>occourances of each word there were...its not working for me thoughits
>only returning the count for the last word in the file...help
>
>#!/usr/local/bin/perl
>
>open (FILE,"../www/main.php3");
>@lines=<FILE>;
>close(FILE);
>
>foreach $i (@lines) {
>    @words = split(/\s+/, $i);
>}

The loop above clobbers @words each time. The loops below only work on the last line of the file.

>foreach $word (@words) {
>    $wordcount{"$word"}=0;
>}

You're initializing every entry in the hash to 0, and you don't have to do that. The rest of the code looks good. But we still have to deal with the problem of not getting all of the words into @words. I think this would work, if used as the first loop of the program.

foreach (@lines) {
    push @words, split /\s+/;
}

With the loop above, as you split words out of @lines, they get _added_ to @words. The assignment operator, =, would clobber the value already there, leaving us with just the last thing we assigned to it, the last line of the file.

Look in perlfunc for the push, pop, shift, and unshift functions. If you don't know about stacks and queues, get a decent book on data structures and check them out. They are amazingly powerful for how simple they are, and Perl is nice enough to have all the stuff built in so you can treat regular arrays like either (or both) of them.

As for hashes, when you use a hash key for the first time, the value is undef. Undef, in a numeric context, looks like 0. So all we have to do is add 1 for each time we see a word, and we are ok. There's no need to initialize them all to zero.

So your program would become something that looks like this.

use strict;

open FILE, "../www/main.php3" or die "Can't open the file: $!";
# I added the "or die ..." above, because you want your
# program to halt if you have no data to work on. You
# also want to check the return values of functions that
# do stuff outside of your program, to make sure that
# they succeed.

my @lines = <FILE>;
close(FILE);

my @words;
foreach (@lines) {
    push @words, split /\s+/;
}

my %wordcount = ();
foreach my $word (@words) {  # Oh, my $word!   : )
    $wordcount{$word}++;     # $var++ is shorthand for $var += 1
                             # but either is fine.
}

But, this can be cleaned up further! You can compress it all down to one loop, in a few different ways. Take another look at it Chris, play with it, and crunch it down. Less code, fewer bugs, less stuff to worry about. Good luck!

Thank you for your time,

Sean.
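The difference between assigning and pushing, and the fact that ++ needs no zero-initialised entry, can be seen side by side in a sketch like the following. The two sample lines are made up and stand in for the contents of the file. The sketch uses split ' ' rather than split /\s+/, a deliberate substitution: the ' ' form skips leading whitespace, so an indented line does not yield an empty first "word".

```perl
use strict;
use warnings;

my @lines = ("the cat\n", "  the dog\n");  # made-up stand-in for the file

# Assignment replaces the array on every pass: only the last line survives.
my @clobbered;
@clobbered = split ' ', $_ for @lines;

# push appends, so the words from every line accumulate.
my @words;
push @words, split ' ', $_ for @lines;

# No zero-initialisation needed: ++ treats a missing entry as 0.
my %wordcount;
$wordcount{$_}++ for @words;

print "clobbered: @clobbered\n";
print "$wordcount{$_} occurrences of the word $_\n" for sort keys %wordcount;
```

Here @clobbered ends up holding only "the dog", while @words holds all four words and the hash counts "the" twice.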
Re: [BPQ] help!! any idea whats wrong with this??
The problem is that you overwrite the (global) array @words for each line. You go through @lines and the split overwrites @words!

While we are at it... ;-)
You do not need that many loops. The program will be much simpler like this:

open(FILE, "yourFileName") or die "Can't open file: $!";
while ($line = <FILE>) {             #no need to read the file into an array
    @words = split /\s+/, $line;     #just process each line as it comes
    foreach $word (@words) {
        $wordcount{$word} += 1;      #no need to initialize the hash with 0,
                                     #perl does that for you
    }
}
close(FILE);

foreach $word (keys %wordcount) {
    print $word, ": ", $wordcount{$word}, "\n";
}

hope this helps,
cr

P.S.: If you have questions about the code above, feel free to ask.
Re: [BPQ] help!! any idea whats wrong with this??
On Tue, Apr 24, 2001 at 10:17:02AM -0500, Chris Brown wrote:
: so...this is suposed to count the words in FILE and return how many occourances of
: each word there were...its not working for me thoughits only returning the count
: for the last word in the file...help

The following is a generality, not directly related to this question (unless it is homework ;).

Please note that I will not allow this list to do anyone's homework. Answering homework style questions with simple 'go read xxx' is fine, though. Pointers are very different from doing work for a student.

-- 
Casey West
Re: [BPQ] help!! any idea whats wrong with this??
: so...this is suposed to count the words in FILE and return how many occourances of
: each word there were...its not working for me thoughits only returning the count
: for the last word in the file...help

Think: In the first loop, what happens to the first line when you move to the second?

(Sorry for the terse answer, but this one sounds like a homework problem. ;)

-- 
Tim Kimball · ACDSD / MAST        ¦
Space Telescope Science Institute ¦ We are here on Earth to do good to others.
3700 San Martin Drive             ¦ What the others are here for, I don't know.
Baltimore MD 21218 USA            ¦                           -- W.H. Auden