I've come to the conclusion that the current way threads are implemented in Perl 
basically are useless for anything serious, specifically for anything that looks like 
client <-> server.  Which I think many applications will use threads for.

I devised a little test program that basically sends request to a "server" thread from 
a variable number of "client" threads.  The number of requests is always the same.  
The job is basically incrementing a counter.  Summarized test output looks like this:

=====================================================================
workers = 1, 1.10 yields per request
   5 wallclock secs ( 3.77 usr + 0.94 sys = 4.71 CPU)

workers = 2, 2.69 yields per request
   5 wallclock secs ( 4.63 usr + 0.79 sys = 5.42 CPU)

workers = 10, 14.92 yields per request
  14 wallclock secs ( 12.81 usr + 1.43 sys = 14.24 CPU)

workers = 20, 34.11 yields per request
  35 wallclock secs ( 26.33 usr + 8.83 sys = 35.16 CPU)

workers = 50, 65.10 yields per request
123 wallclock secs ( 52.99 usr + 67.84 sys = 120.83 CPU)
=====================================================================

Please note that all of these runs all did the _same_ amount of work, just with a 
different number of worker threads.  And as the job itself is minimal, we're basically 
measuring overhead here.  The benchmark is not measuring thread startup or thread 
shutdown, just what happens inside the threads themselves.

Now you might say that more threads would mean more overhead.  And to an extent that 
is true.  But most of the overhead is caused by the CPU being evenly divided between 
all threads (which you will see if you run the test program at the end of this 
message), rather than the server thread getting one half of the CPU and the worker 
threads all get the other half.

This is caused by the fact that a cond_wait / cond_signal pair is not really a pair.  
The chance of the thread that is getting signalled acquiring the lock, is reduced by 
the number of client threads.  If you're running 1 worker thread, there is about a 
50/50 chance that the server thread gets the lock after a cond_signal.  If you're 
running 49 worker threads, you only have a 1/50 (2%) chance of getting the lock.  And 
all the other threads, seeing that it's not their turn yet, will just burn away CPU.  
You will see this in the number of yields in the above benchmark.  You should also 
note that even in a 1 server, 1 client situation, the number of yields is higher than 
0 (which it should be if the "other" thread would always get the lock).


I've tried to do a lot of things with Perl threads in the past 3 months.  But it's 
time to move on.  Maybe threads will mature in the next year, maybe not.  I am now 
convinced I won't be able to use them with Apache 2.0 and mod_perl, which was the 
reason for me to get into them in the first place.  I guess it's prefork MPM for me 
now.


Liz, now the "unthreaded one".


The benchmark program:
=============================================================================
use Benchmark;
use threads;
use threads::shared;

my $count : shared;
my $control : shared;
my $running : shared;
my @tries : shared;
my @benchmark : shared;

my $tocount = 100000;
foreach my $threads (1,2,10,20,50) {
   $count = 0;
   $control = 1;
   my $server = threads->new( sub { # start the server thread
     lock( $control );
     $control = 0;
     threads->yield until $running;
     my $t0 = Benchmark->new;
     while (1) {
       cond_wait( $control );
       $count++;
       last if $count == $tocount;
       $control = 0;
     }
     $running = 0;
     $benchmark[threads->tid] = timestr( timediff( Benchmark->new, $t0 ) );
   } );
   threads->yield while $control; # wait until server thread has started

   my @worker;
   push( @worker,threads->new( sub { # start a worker thread
     threads->yield until $running;
     my $t0 = Benchmark->new;
     while ($running) {
       my $tries;
       AGAIN: while ($running) {
         threads->yield if $tries++;
         {lock( $control );
          next AGAIN if $control;
          $control = 1;
          cond_signal( $control );
         }
         $tries[threads->tid] += $tries;
         $tries = 0;
       }
     }
     $benchmark[threads->tid] = timestr( timediff( Benchmark->new, $t0 ) );
   } ) ) foreach 1..$threads;

   $running = 1; # start all the threads now
   $server->join;
   $_->join foreach @worker;

   my $total;
   $total += $tries[$_] foreach 0..$#tries;
   my $ypr = sprintf( '%.2f',$total/$count );
   print "workers = $threads, $ypr yields per request\n";
   my ($worker,$secs,$usr,$sys,$CPU);
   foreach (0..$#benchmark) {
     next unless exists $benchmark[$_];
     $benchmark[$_] =~
      m#(\d+) wallclock secs \(\s+([\d\.]+) usr \+\s+([\d\.]+) sys =\s+([\d\.]+) CPU#;
     $secs = $1; $usr += $2; $sys += $3; $CPU += $4;
     print " $worker$benchmark[$_]\n";
     $worker = ' ';
   }
   printf( "%3d wallclock secs ( %.2f usr + %.2f sys = %.2f CPU)\n\n",
    $secs,$usr,$sys,$CPU );
   @benchmark = @tries = ();
}
=============================================================================

Reply via email to