Hey Jon,

The most glaringly obvious thing I can point out is that, at least in your
perl routine (and probably in the other languages too), most of your
system-call time goes to read(), i.e. context switching to read from the disk.
Now, my perl version is indeed faster, but one has to ask: was saving
.015193256 seconds really worth the effort?  /shrug   -- If this is for the
financial industry, perhaps, but then they'd just have written it in C.
Otherwise, probably not.
Also note, there are other ways to speed this up even further, but at that
point it isn't really worth the time.  We're talking a couple of
microseconds at best.  I've included my version for your reference.

Before closing: I happen to like micro benchmarks, even though your site
says 'I know this benchmark is maybe meaningless'.
They can absolutely be useful.  Sometimes they are and sometimes not so
much; it just depends on the context.

Your perl source (doit.pl) and my perl source (doit2.pl) produce the same output:
# ./doit2.pl | md5
786be54356a5832dcd1148c18de71fc8
root@nas:~ # ./doit.pl | md5
786be54356a5832dcd1148c18de71fc8

# truss -c ./doit.pl
<!--snip-->
syscall                     seconds   calls  errors
read                    0.036828813    4140       0
<!--snip-->
                      ------------- ------- -------
                        0.037821821    5227     284



# truss -c ./doit2.pl
<!--snip-->
syscall                     seconds   calls  errors
read                    0.000245121      19       0
<!--snip-->
                      ------------- ------- -------
                        0.022628565     804      59
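
For what it's worth, here is a rough sketch of the difference I believe those
read counts reflect.  I'm assuming doit.pl reads through the default buffered
layer (I haven't reproduced it here), and the helper names count_words_buffered
and count_words_mmap are made up for illustration.  The second helper uses the
:mmap PerlIO layer, which maps the file into memory rather than issuing read()
calls for it.

```perl
use strict;
use warnings;

# Hypothetical helper: read line by line through the default buffered layer.
# Every buffer refill is a read() system call.
sub count_words_buffered {
    my ($path) = @_;
    my %count;
    open my $fh, '<', $path or die "$path: $!";
    while (my $word = <$fh>) {
        chomp $word;
        $count{$word}++;
    }
    close $fh;
    return \%count;
}

# Hypothetical helper: open with the :mmap layer and slurp the file.
# local $/ keeps slurp mode confined to this one read.
sub count_words_mmap {
    my ($path) = @_;
    my %count;
    open my $fh, '<:mmap', $path or die "$path: $!";
    my $text = do { local $/; <$fh> };
    close $fh;
    $text = '' unless defined $text;    # guard against an empty file
    $count{$_}++ for split /\n/, $text;
    return \%count;
}
```

Both helpers should return the same counts for the same file.  Running each one
under truss -c should show where the read() calls go, though I haven't measured
this exact snippet.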


-------------------------------------
use strict;
use warnings;

# Slurp mode: with $/ undefined, <$fh> returns the whole file at once.
$/ = undef;
my %stopwords = do {
        open my $fh, '<:mmap', 'stopwords.txt' or die $!;
        map { $_ => 1; } split /\n/, <$fh>;
};

# Count every word that is not a stopword.
my %count = do {
        my %res;
        open my $fh, '<:mmap', 'words.txt' or die $!;
        for (split /\n/, <$fh>) {
                $res{$_}++ unless $stopwords{$_};
        }
        %res;
};

# Print the 20 most frequent words, highest count first.
my $i = 0;
for (sort { $count{$b} <=> $count{$a} } keys %count) {
    last if $i++ >= 20;
    print "$_ -> $count{$_}\n";
}

On Sat, Jan 15, 2022 at 12:37 AM Jon Smart <j...@smartown.nl> wrote:

> Hello,
>
> May I show the result of my benchmark for perl5, ruby, and scala?
> https://blog.cloudcache.net/benchmark-for-scala-ruby-and-perl/
>
> Welcome you to give any suggestion to me for improving this.
>
> Thanks.
>


-- 
__________________

:(){ :|:& };:
