This is a very interesting discussion, and it reveals a lot about the
difficulties of programming for multiple cores.

I was trying to understand what was going on and was 'messing about'
with the code, and I noticed that almost any change I made slowed the
code down considerably, by more than the difference between the 1 and 2
thread runs on Java 6. Therefore I am not sure how realistic the code
is. If the code did more than simply increment a variable or two then
the problem might go away (because contention would be less and because
the uncontested operations would be a bigger percentage).

Going back to the original code: if it is OK to miss a few increments
(hence the non-volatile statics), then the following should be OK:

          threads[ j ] =
            new Thread() {
              public void run() {
                final int total = totalSize / threadCount;
                for ( int k = 0; k < total; k++ ) {
                  if ( ( k & 0xFF ) == 0 ) {  // New
                    i += 256;                 // New
                    firedCount++;             // New
                  }                           // New
                  // Original:
                  // if ( ( i++ & 0xFF ) == 0 ) { firedCount++; }
                }
              }
            };

The above version doesn't show the problem on Java 6 and is quicker
than the original.
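
For anyone who wants to try it, here is a rough self-contained sketch of
the batched version. The class name BatchedCounter, the run() helper and
the 1,000,000,000 iteration count are my own assumptions for
illustration (the count is only inferred from the "fired" numbers in the
quoted output), so this is not the actual pastie code:

```java
// Hypothetical harness; names are illustrative, not from the original benchmark.
class BatchedCounter {
    // Deliberately non-volatile and unsynchronized: lost updates are tolerated.
    static int i = 0;
    static int firedCount = 0;

    // Runs threadCount threads that share totalSize iterations between them
    // and returns the elapsed wall-clock time in milliseconds.
    static long run(final int threadCount, final int totalSize)
            throws InterruptedException {
        i = 0;
        firedCount = 0;
        Thread[] threads = new Thread[threadCount];
        for (int j = 0; j < threadCount; j++) {
            threads[j] = new Thread() {
                @Override
                public void run() {
                    final int total = totalSize / threadCount;
                    for (int k = 0; k < total; k++) {
                        // Touch the shared statics once per 256 iterations
                        // of the thread-local loop counter k.
                        if ((k & 0xFF) == 0) {
                            i += 256;
                            firedCount++;
                        }
                    }
                }
            };
        }
        long start = System.currentTimeMillis();
        for (Thread t : threads) t.start();
        for (Thread t : threads) t.join();
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) throws InterruptedException {
        int threadCount = args.length > 0 ? Integer.parseInt(args[0]) : 1;
        for (int round = 0; round < 5; round++) {
            long ms = run(threadCount, 1000000000); // assumed total iterations
            System.out.println("time: " + ms);
            System.out.println("fired: " + firedCount);
        }
    }
}
```

Run it as "java -server BatchedCounter 1" or "java -server BatchedCounter 2".
With more than one thread the counts can still come out slightly low,
because the updates to the statics remain unsynchronized.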

On Apr 2, 6:48 pm, Charles Oliver Nutter <[EMAIL PROTECTED]>
wrote:
> I ran into a very strange effect when some Sun folks tried to benchmark
> JRuby's multi-thread scalability. In short, adding more threads actually
> caused the benchmarks to take longer.
>
> The source of the problem (at least the source that, when fixed, allowed
> normal thread scaling), was an increment, mask, and test of a static int
> field. The code in question looked like this:
>
> private static int count = 0;
>
> public void pollEvents(ThreadContext context) {
>    if ((count++ & 0xFF) == 0) context.poll();
>
> }
>
> So the basic idea was that this would call poll() every 256 hits,
> incrementing a counter all the while. My first attempt to improve
> performance was to comment out the body of poll() in case it was causing
> a threading bottleneck (it does some locking and such), but that had no
> effect. Then, as a total shot in the dark, I commented out the entire
> line above. Thread scaling went to normal.
>
> So I'm rather confused here. Is a ++ operation on a static int doing
> some kind of atomic update that causes multiple threads to contend? I
> never would have expected this, so I wrote up a small Java benchmark:
>
> http://pastie.org/173993
>
> The benchmark does basically the same thing, with a single main counter
> and another "fired" counter to prevent hotspot from optimizing things
> completely away. I've been running this on a dual-core MacBook Pro with
> both Apple's Java 5 and the soylatte Java 6 release. The results are
> very confusing:
>
> First on Apple's Java 5
>
> ~/NetBeansProjects/jruby ➔ java -server Trouble 1
> time: 3924
> fired: 3906250
> time: 3945
> fired: 3906250
> time: 1841
> fired: 3906250
> time: 1882
> fired: 3906250
> time: 1896
> fired: 3906250
> ~/NetBeansProjects/jruby ➔ java -server Trouble 2
> time: 3243
> fired: 4090645
> time: 3245
> fired: 4100505
> time: 1173
> fired: 3906049
> time: 1233
> fired: 3906188
> time: 1173
> fired: 3906134
>
> Normal scaling here...1 thread on my system uses about 60-65% CPU, so
> the extra thread uses up the remaining 35-40% and the numbers show it.
> Then there's soylatte Java 6:
>
> ~/NetBeansProjects/jruby ➔ java -server Trouble 1
> time: 1772
> fired: 3906250
> time: 1973
> fired: 3906250
> time: 2748
> fired: 3906250
> time: 2114
> fired: 3906250
> time: 2294
> fired: 3906250
> ~/NetBeansProjects/jruby ➔ java -server Trouble 2
> time: 3402
> fired: 3848648
> time: 3805
> fired: 3885471
> time: 4145
> fired: 3866850
> time: 4140
> fired: 3839130
> time: 3658
> fired: 3880202
>
> Don't compare the times directly, since these are two pretty different
> codebases and they each have different general performance
> characteristics. Instead pay attention to the trend...the soylatte Java
> 6 run with two threads is significantly slower than the run with a
> single thread. This mirrors the results with JRuby when there was a
> single static counter being incremented.
>
> So what's up here?
>
> - Charlie