Travis Vitek wrote:
Since I couldn't find an existing string perf test, I wrote a
quick-and-dirty one that repeatedly copies the same string to
exercise the atomic increment/decrement. The results show roughly a
3% performance penalty when using the newer atomic functions. This
test was run with an 8d configuration, so the atomic functions were
compiled into the stdcxx DLL. The test hardware is a Lenovo T60p [Intel
Core 2 T7600 2.33GHz CPU, 2GB RAM].
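
For anyone who wants to reproduce this kind of measurement, here is a
minimal sketch of a string-copy benchmark. It is not the test used in
this thread. The names time_string_copies and CopyResult are
hypothetical, and it assumes a C++11 compiler with std::thread. Note
that copying a std::string only bumps a reference count on
implementations that share string bodies (as stdcxx does); on other
implementations each copy allocates, so the numbers measure something
different.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical result type: wall-clock time plus a checksum that keeps
// the compiler from optimizing the copies away.
struct CopyResult {
    double      elapsed_ms;
    std::size_t chars_seen;
};

// Each of `threads` workers copies `source` `copies_per_thread` times.
CopyResult time_string_copies(const std::string& source,
                              unsigned threads,
                              std::size_t copies_per_thread)
{
    assert(threads > 0);

    std::vector<std::size_t> seen(threads, 0);
    std::vector<std::thread> workers;
    workers.reserve(threads);

    const auto start = std::chrono::steady_clock::now();

    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&source, &seen, t, copies_per_thread] {
            std::size_t local = 0;
            for (std::size_t i = 0; i < copies_per_thread; ++i) {
                std::string copy(source);  // copy construction under test
                local += copy.size();
            }
            seen[t] = local;               // each worker writes its own slot
        });
    }
    for (auto& w : workers)
        w.join();

    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;

    std::size_t total = 0;
    for (std::size_t n : seen)
        total += n;

    return CopyResult{elapsed.count(), total};
}
```

Dividing elapsed_ms by the total number of copies gives a per-operation
figure comparable in spirit to the ms/op rows in the results.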

8d is not thread-safe so the atomic function templates should
be implemented in terms of ordinary increments and decrements
(if they aren't, it's a bug). They should only expand to the
atomic assembly (or the Win32 Interlocked) functions in 12X
and 15X build types.
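
To illustrate the distinction, here is a rough sketch of how such a
function template might select its implementation at compile time.
This is not the stdcxx source: SKETCH_REENTRANT and the sketch_*
names are hypothetical stand-ins, and the thread-safe branch assumes
the GCC/Clang __atomic builtins rather than the library's own
assembly or the Win32 Interlocked functions.

```cpp
#include <cassert>

// SKETCH_REENTRANT is a hypothetical macro standing in for whatever a
// thread-safe build type would define. It is not a stdcxx macro.
template <class T>
inline T sketch_preincrement(T& value)
{
#ifdef SKETCH_REENTRANT
    // Thread-safe build: atomic read-modify-write (GCC/Clang builtin).
    return __atomic_add_fetch(&value, T(1), __ATOMIC_SEQ_CST);
#else
    // Non-thread-safe build: an ordinary increment is sufficient.
    return ++value;
#endif
}

template <class T>
inline T sketch_predecrement(T& value)
{
#ifdef SKETCH_REENTRANT
    return __atomic_sub_fetch(&value, T(1), __ATOMIC_SEQ_CST);
#else
    return --value;
#endif
}

// Usage: a reference count goes up on copy and down on destruction.
inline long sketch_refcount_roundtrip()
{
    long refs = 1;
    const long after_copy = sketch_preincrement(refs);
    assert(after_copy == 2);
    return sketch_predecrement(refs);  // back to the original count
}
```

If a non-thread-safe build shows a measurable difference between old
and new, one thing to check is which branch of such a conditional the
build actually compiles.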

Martin


             Old                     New [patched]
  Threads        ms       ms/op          ms       ms/op
  -------    ------  ----------      ------  ----------
        1       714  0.00004256         737  0.00004393
        2      3911  0.00023311        4024  0.00023985
        4      7660  0.00045657        7865  0.00046879
        8     15192  0.00090551       15585  0.00092894

I wonder whether using inline assembly for the __rw_atomic_* functions
would reduce the cost. We could also evaluate the intrinsic pragma
that is available on MSVC.
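
As a sketch of what the intrinsic route could look like: on MSVC,
<intrin.h> declares _InterlockedIncrement and _InterlockedDecrement,
and #pragma intrinsic asks the compiler to expand them inline instead
of emitting a function call. The wrapper names below are hypothetical,
and the non-MSVC branch (GCC/Clang __atomic builtins) is only there so
the example builds elsewhere. Whether this recovers any of the 3%
would still have to be measured.

```cpp
#include <cassert>

#if defined(_MSC_VER)
#  include <intrin.h>
   // Request inline expansion of the interlocked operations.
#  pragma intrinsic(_InterlockedIncrement, _InterlockedDecrement)
#endif

// Hypothetical wrapper: atomically increments *counter and returns
// the new value.
inline long sketch_interlocked_increment(long volatile* counter)
{
    assert(counter != nullptr);
#if defined(_MSC_VER)
    return _InterlockedIncrement(counter);
#else
    return __atomic_add_fetch(counter, 1L, __ATOMIC_SEQ_CST);
#endif
}

// Hypothetical wrapper: atomically decrements *counter and returns
// the new value.
inline long sketch_interlocked_decrement(long volatile* counter)
{
    assert(counter != nullptr);
#if defined(_MSC_VER)
    return _InterlockedDecrement(counter);
#else
    return __atomic_sub_fetch(counter, 1L, __ATOMIC_SEQ_CST);
#endif
}
```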

Travis

-----Original Message-----

I will do a quick run using the string performance test after lunch.
I'll report the results on that later. I've pasted the source for the
bulk of my test below. If someone wants the entire thing, let me know
and I'll provide everything.

Travis

