-----BEGIN PGP SIGNED MESSAGE-----

Moin,

I noticed that this test file is exceptionally slow:

        # time ./perl -Ilib t/uni/class.t
        [snip]
        ok 4670

        real    0m33.821s
        user    0m30.893s
        sys     0m0.487s

But the entire testsuite only takes:

        All tests successful.
        u=2.72  s=0.72  cu=183.71  cs=25.36  scripts=954  tests=107891

        real    6m0.340s
        user    3m9.813s
        sys     0m27.548s

In other words, that test file alone accounts for about 15% of the
user time (30 out of 180 seconds).

Looking at what the test does, it has these bits near the end (among
others):

        # test the blocks (InFoobar)
        for (grep $utf8::Canonical{$_} =~ /^In/, keys %utf8::Canonical) {
          my $filename = File::Spec->catfile(
            $updir => lib => unicore => lib => gc_sc =>
              "$utf8::Canonical{$_}.pl"
          );

          next unless -e $filename;
          my ($h1, $h2) = map hex, (split(/\t/, (do $filename), 3))[0,1];
          my $str = join "", map chr, $h1 .. (($h2 || $h1) + 1);

          my $blk = $_;

          is($str =~ /(\p{$blk}+)/ && $1, substr($str, 0, -1));
          is($str =~ /(\P{$blk}+)/ && $1, substr($str, -1));

          $blk =~ s/^In/Block:/;

          is($str =~ /(\p{$blk}+)/ && $1, substr($str, 0, -1));
          is($str =~ /(\P{$blk}+)/ && $1, substr($str, -1));
        }

Now, at first I thought some of these tests were duplicates: it tests 240
files, but there are only 145 unique file names. However, it does test
the same file under different block names, so I don't think we can cut
any of the tests, unfortunately.

My next idea was that maybe we can load the file only once, and then test
all blocks.
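
As a rough sketch of that idea (not what the patch below does), the
parsed range could be cached per file name. The helper block_range(), the
%range_cache hash, the $loads counter and the temporary stand-in file are
all assumptions made up for illustration; the real files live under
lib/unicore.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my %range_cache;     # file name => [first, last] code point
my $loads = 0;       # counts real file loads, for illustration only

# Hypothetical helper: parse a range file once, then reuse the result.
sub block_range {
    my ($filename) = @_;
    unless ($range_cache{$filename}) {
        my $data = do $filename;
        die "cannot load $filename: " . ($@ || $!) unless defined $data;
        $loads++;
        my ($h1, $h2) = map hex, (split /\s+/, $data, 3)[0, 1];
        $range_cache{$filename} = [ $h1, $h2 || $h1 ];
    }
    return @{ $range_cache{$filename} };
}

# Stand-in for a unicore range file (assumed format: "first<TAB>last").
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} 'return "0400\t04FF\n";', "\n";
close $fh or die $!;

# Two block names that map to the same file only trigger one load.
for my $blk (qw(InCyrillic Block:Cyrillic)) {
    my ($first, $last) = block_range($filename);
    printf "%-15s U+%04X..U+%04X (loads so far: %d)\n",
        $blk, $first, $last, $loads;
}
```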

However, the test setup (reading in the file, building the string, etc.)
is not what is so slow; it is the actual is() calls. To benchmark this, I
put in a time_it() routine, which should do exactly the same as the is()
tests. That alone sped up the running time on my system from roughly 30
seconds to:

        real    0m21.458s
        user    0m18.690s
        sys     0m0.452s
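
The two ways of building the match can also be compared in isolation,
without is(), using the core Benchmark module. This is only a sketch:
the block (Basic Latin), the iteration count, and the names inline_form()
and qr_form() are assumptions chosen for illustration, not taken from
the test file.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);

# Assumption: Basic Latin (U+0000..U+007F) stands in for a tested block.
my $blk = 'InBasicLatin';
my $str = join "", map chr, 0x00 .. (0x7F + 1);  # block plus one char past it

# Property name interpolated directly into the match, as in the original.
sub inline_form {
    my ($s, $name) = @_;
    my $in  = ($s =~ /(\p{$name}+)/ && $1);
    my $out = ($s =~ /(\P{$name}+)/ && $1);
    return ($in, $out);
}

# Pattern compiled with qr// first, as in time_it().
sub qr_form {
    my ($s, $name) = @_;
    my $qr = qr/(\p{$name}+)/; $s =~ /$qr/;
    my $in = $1;
    $qr = qr/(\P{$name}+)/; $s =~ /$qr/;
    my $out = $1;
    return ($in, $out);
}

for my $form ([inline => \&inline_form], [qr => \&qr_form]) {
    my ($label, $code) = @$form;
    my $t = timeit(2000, sub { $code->($str, $blk) });
    printf "%-6s %s\n", $label, timestr($t);
}
```

If both forms take about the same time here, that would suggest looking
at how the arguments reach is() rather than at the regex itself.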

Hm. Two questions:

A: If the tests are not equivalent, what did I do wrong?
B: If they are equivalent, why is my routine much faster than the
construct in the original file? Bug/problem in Perl or Test.pm?
B.2: No matter why it is faster, can we safely rewrite the tests that
way? That would cut out about 5% of the entire testsuite running time.
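
On question A, one difference that is easy to miss is what happens when
a match fails. The original form yields a false value, while reading $1
after a separate match statement returns whatever the last successful
match left there. The following is a minimal illustration with a made-up
string. It does not claim to explain the timing difference.

```perl
use strict;
use warnings;

my $str = "abc";

# Form used in the original test: a failed match yields a false value.
$str =~ /(b+)/;                         # succeeds, $1 is "b"
my $inline = ($str =~ /(x+)/ && $1);    # fails, so $inline is "" (false)

# Form used in time_it(): the match result is ignored and $1 is read anyway.
$str =~ /(b+)/;                         # succeeds, $1 is "b"
$str =~ /(x+)/;                         # fails, $1 keeps its old value
my $separate = $1;                      # still "b"

printf "inline: '%s'  separate: '%s'\n", $inline, $separate;
```

As long as every match succeeds, both forms pass the same value to is().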

Best wishes,

Tels

- --
 Signed on Sun Jun 26 11:01:56 2005 with key 0x93B84C15.
 Visit my photo gallery at http://bloodgate.com/photos/
 PGP key on http://bloodgate.com/tels.asc or per email.

 "Sacrificing minions: Is there any problem it CAN'T solve?" -- Lord
 Xykon



-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iQEVAwUBQr6CsHcLPEOTuEwVAQFG+Af8DPfc6TJAgW5StR81nBD9A+stKrJn7+ba
gDbMpKX590mS+O44yqITF82HM2p+GyVsRKood2ypnS6ulJ0EnGkT2dAWXe2Cwfwx
dhKEMXwukh0oSAzOhxxOaR/AwGfFzHJRq9VVsT11jffGBnDlmBe6PomjzvDpckLO
XihBSGSrwdtTbenpuLZ5YLNQZobP4WKpppAderGnFscY+XOCxH33znReM2mu+JjS
BkXeCaZPlhDV3X7zyWbOQHTEvYSGHe/7LDvw9/KdugwTiomWkuEpb5VZUMaG26Ms
9r4ikQLZFdo8DfzD0nKbzIgP99kRmk08EQITbuWqcjNA6CmH5A+Akw==
=rIoX
-----END PGP SIGNATURE-----
--- t/uni/class.t       2005-05-28 17:08:44.000000000 +0200
+++ t/uni/class3.t      2005-06-26 11:59:12.000000000 +0200
@@ -25,7 +25,7 @@
 END
 }
 
-
+use strict;
 my $str = join "", map chr($_), 0x20 .. 0x6F;
 
 # make sure it finds built-in class
@@ -127,7 +127,7 @@
 
   my %files;
 
-  my $dirname = File::Spec->catdir($updir => lib => unicore => lib => gc_sc);
+  my $dirname = File::Spec->catdir($updir => lib => unicore => lib => 'gc_sc');
   opendir D, $dirname or die $!;
   @files{readdir(D)} = ();
   closedir D;
@@ -153,6 +153,18 @@
   }
 }
 
+sub time_it
+  {
+  my ($str, $blk) = @_;
+
+  my $qr = qr/(\p{$blk}+)/; $str =~ /$qr/;
+  is ($1, substr($str, 0, -1));                 # all except last char
+
+  $qr = qr/(\P{$blk}+)/; $str =~ /$qr/;
+  is ($1, substr($str, -1));                    # only last char
+
+  }
+
 # test the blocks (InFoobar)
 for (grep $utf8::Canonical{$_} =~ /^In/, keys %utf8::Canonical) {
   my $filename = File::Spec->catfile(
@@ -160,16 +172,21 @@
   );
 
   next unless -e $filename;
+
+  print "# In $filename $_\n";
+
   my ($h1, $h2) = map hex, (split(/\t/, (do $filename), 3))[0,1];
   my $str = join "", map chr, $h1 .. (($h2 || $h1) + 1);
 
   my $blk = $_;
 
-  is($str =~ /(\p{$blk}+)/ && $1, substr($str, 0, -1));
-  is($str =~ /(\P{$blk}+)/ && $1, substr($str, -1));
+  time_it ($str, $blk);
+  #is($str =~ /(\p{$blk}+)/ && $1, substr($str, 0, -1));
+  #is($str =~ /(\P{$blk}+)/ && $1, substr($str, -1));
 
   $blk =~ s/^In/Block:/;
 
-  is($str =~ /(\p{$blk}+)/ && $1, substr($str, 0, -1));
-  is($str =~ /(\P{$blk}+)/ && $1, substr($str, -1));
+  time_it ($str, $blk);
+  #is($str =~ /(\p{$blk}+)/ && $1, substr($str, 0, -1));
+  #is($str =~ /(\P{$blk}+)/ && $1, substr($str, -1));
 }
