Chris pointed out to me that SunToolkit.lock() currently uses a
  ReentrantLock, which is supposed to have better characteristics
  than built-in Java synchronization under contention.

  So it would be interesting to see exactly what you were
  measuring, and how.

  Also, if you're doing any kind of Java2D performance
  testing, I would encourage you to use J2DBench as the
  benchmark (it can be found in jdk/src/share/demo/J2DBench).
  You can plug in new tests if the existing ones don't
  match what you want to test.

  Thanks,
    Dmitri


Dmitri Trembovetski wrote:

  Hi Clemens.

Clemens Eisserer wrote:
Hello,

   Since most applications do render from one thread (either the
   Event Queue like Swing apps, or some kind of dedicated rendering
   thread like games), the lock is indeed very fast, given
   biased locking and such.

   I would suggest not trying to optimize things - especially tricky
   ones that involve locking - until you have identified, with some
   kind of profiling tool, that there's a problem.

I did some benchmarking to find out the best design for my new
pipeline, and these are the results I got:

10 million solid 1x1 rects, VolatileImage, server compiler, Core 2 Duo
@ 2 GHz, Intel 945GM, Linux:

  200 ms   no locking, no native call
  650 ms   locking only
  850 ms   native call, no locking
 1350 ms   as currently implemented in X11Renderer
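
A sketch of that kind of overhead measurement (not the actual test; the
two native-call cases would need a trivial JNI stub and are omitted, and
all names here are made up):

import java.util.concurrent.locks.ReentrantLock;

// Compares an (almost) empty loop against a loop that only takes and
// releases a ReentrantLock, roughly matching the "no locking, no native
// call" and "locking only" rows above.
public class LockOverheadBench {
    private static final ReentrantLock LOCK = new ReentrantLock();
    private static final int ITERATIONS = 10_000_000;
    private static volatile long sink;  // keeps the JIT from removing the loops

    public static void main(String[] args) {
        // warm-up so both loops get JIT-compiled before timing
        runNoLock();
        runLockOnly();

        long t0 = System.nanoTime();
        runNoLock();
        long t1 = System.nanoTime();
        runLockOnly();
        long t2 = System.nanoTime();

        System.out.printf("no lock:   %d ms%n", (t1 - t0) / 1_000_000);
        System.out.printf("lock only: %d ms%n", (t2 - t1) / 1_000_000);
    }

    private static void runNoLock() {
        for (int i = 0; i < ITERATIONS; i++) {
            sink = i;
        }
    }

    private static void runLockOnly() {
        for (int i = 0; i < ITERATIONS; i++) {
            LOCK.lock();
            try {
                sink = i;
            } finally {
                LOCK.unlock();
            }
        }
    }
}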

  Did you mean OGLRenderer? The X11Renderer doesn't use the
  single-threaded rendering model and thus doesn't need a render queue.

  Note that on X11 the render queue lock also doubles as the lock for
  all X11 access - for both AWT and 2D. We must lock around it because
  we all use the same display, and X11 is not multi-threaded (at
  least not in the way we use it).
  This means that the lock is likely to be promoted to a heavyweight
  lock, which is why it is expensive.

  So the problem with having separate render buffers per thread is that
  at some point you will have to synchronize on SunToolkit.awtLock()
  anyway.
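
  In code, that shared lock shows up as the usual pattern around any
  native X11 access (just a sketch; SunToolkit.awtLock()/awtUnlock()
  are the real entry points, the body is a placeholder):

  import sun.awt.SunToolkit;

  class X11LockedAccessSketch {
      void render() {
          SunToolkit.awtLock();
          try {
              // ... issue the Xlib/XRender calls through JNI here ...
          } finally {
              SunToolkit.awtUnlock();
          }
      }
  }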

I did rendering only from a single thread (though not the EDT). In this
simple pipeline-overhead test the locking itself is almost as expensive
as the "real" work (the native call), and far more expensive than an
"empty" JNI call.
However, this was on a dual-core machine; on my single-core amd64
machine locking has much less influence. As far as I know, biased
locking is only implemented for monitors.
Xorg ran on my second core; with locking it was only about 40% busy,
without locking about 80%.

However I have to admit there are probably much more important things
to do than playing with things like that ;)

  You probably can explore ways to improve the current design,
  which only allows a single rendering queue. For example,
  we had discussed the possibility of extending the STR design
  to allow a rendering thread per destination. But again,
  on Unix it will bump against the need to sync around X11 access.

  You can also play with having a render buffer per thread as
  you suggest, but your rendering thread will have to sync for
  reading from each render buffer - presumably on the same lock
  as the thread used to put stuff into that buffer.
  All doable, but risky and hard to assess the benefits before
  you have a working implementation. Just commenting out
  locks gives wrong impression, since the resulting code
  becomes incorrect and thus the benchmark results can't be
  trusted.
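
  A very rough sketch of that per-thread-buffer idea (all names are
  hypothetical and this glosses over flushing and buffer growth; the
  point is that the reader has to take the same lock as the writer):

  import java.nio.ByteBuffer;

  // Hypothetical: each client thread owns a buffer plus a lock; the single
  // rendering thread must take that same per-buffer lock before draining,
  // so the synchronization does not disappear, it just moves.
  final class PerThreadRenderBufferSketch {
      private final Object lock = new Object();   // shared by writer and reader
      private final ByteBuffer buffer = ByteBuffer.allocate(32 * 1024);

      // called by the client (writer) thread
      void putOp(int opcode, int arg) {
          synchronized (lock) {
              buffer.putInt(opcode).putInt(arg);
          }
      }

      // called by the rendering (reader) thread
      void drainTo(OpSink sink) {
          synchronized (lock) {
              buffer.flip();
              while (buffer.remaining() >= 8) {
                  sink.process(buffer.getInt(), buffer.getInt());
              }
              buffer.clear();
          }
      }

      interface OpSink {
          void process(int opcode, int arg);
      }
  }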

  Anyway, I would suggest that you look at optimizing
  this later.

   If it appears null during a sync() call, no harm is done (the
   sync is just ignored - which is fine given that the render queue
   hasn't been created yet, so there's nothing to sync), so this is
   not a problem.
But what happens if it has already been created, but the thread
calling sync() just does not see the updated "theInstance" value?
Could there be any problem if such sync() calls are effectively skipped?

  If the thread calling sync() sees theInstance as null, this means
  that it could not have anything to sync. If there's no queue,
  it could not have put anything into that queue prior to
  calling sync(). The sync() can be safely ignored.
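
  In code, the reasoning is roughly the following (a simplified sketch of
  the pattern, not the actual RenderQueue source; flushNow() is just a
  placeholder):

  // If a thread sees theInstance as null, it has never obtained the queue
  // and therefore cannot have put anything into it, so letting its sync()
  // be a no-op is safe - this is the reasoning described above.
  final class LazyQueueSketch {
      private static LazyQueueSketch theInstance;

      static synchronized LazyQueueSketch getInstance() {
          if (theInstance == null) {
              theInstance = new LazyQueueSketch();
          }
          return theInstance;
      }

      static void sync() {
          LazyQueueSketch q = theInstance;  // unsynchronized read, may be a stale null
          if (q == null) {
              return;                       // nothing this thread could have queued
          }
          q.flushNow();
      }

      private void flushNow() {
          // ... flush pending operations to the rendering thread ...
      }
  }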

  Thanks,
    Dmitri
