RE: STDCXX-1056 : numpunct fix

Travis Vitek Thu, 20 Sep 2012 13:45:31 -0700


> -----Original Message-----
> From: Stefan Teleman [mailto:stefan.tele...@gmail.com]
> Sent: Thursday, September 20, 2012 10:11 AM
> To: dev@stdcxx.apache.org
> Subject: Re: STDCXX-1056 : numpunct fix
> 
> On Thu, Sep 20, 2012 at 8:07 AM, Liviu Nicoara <nikko...@hates.ms>
> wrote:
> > But have you measured the amount of memory consumed by all STDCXX
> locale data loaded in one process? How much absolute time is spent in
> resizing the locale and facet buffers? What is the gain in space and
> time performance with such a change versus without? Just how fragmented
> the heap becomes and is there a performance impact because of it, etc.?
> IOW, before changing the status quo one must show an objective defect,
> produce a body of evidence, including a failing test case for the
> argument.
> 
> sizeof(std::locale) * number_of_locales.
> 
> I'll let you in on a little secret: once you call setlocale(3C) and
> localeconv(3C), the Standard C Library doesn't release its own locale
> handles until process termination. So you might think you save a lot
> of memory by destroying and constructing the same locales. You're
> really not. It's the Standard C Library locale data which takes up a
> lot of space.


You have a working knowledge of all Standard C Library implementations?

> 
> What I do know for a fact that this "optimization" did, was to cause
> the races conditions reported by 4 different thread analyzers. Race
> conditions are a show-stopper for me, and they are not negotiable.

The following is found near the top of the _C_manage method of __rw_facet.

    // acquire lock
    _RWSTD_MT_STATIC_GUARD (_RW::__rw_facet);

None of the shared data related to is read/written outside of the critical 
section protected by that lock, and given the declaration for that shared data 
it cannot be accessed by any code outside that function. Put bluntly, there is 
no way that there is a race condition relating to the caching code itself.

Your Performance Analyzer output indicates a race (7 race accesses) for 
_C_manage...

  http://s247136804.onlinehome.us/22.locale.numpunct.mt.1.er.ts/

Specifically, it is calling out the following block of code.

##  7        0             488.             *__rw_access::_C_get_pid (*pfacet) =
                           489.                 _RWSTD_STATIC_CAST 
(_RWSTD_SIZE_T, (type + 1) / 2);

The function _C_get_pid simply exposes a reference to a data member of the 
given facet. That function is thread safe. Provided that pfacet (the parameter 
passed to _C_manage) isn't being accessed by another thread, there is no way 
that this code is not safe. It is possible that calling code is not safe, but 
this code is clean.

Regardless, the proposed patch to _C_manage does nothing to change this block 
of code. I do not understand how you can claim that this change eliminated the 
race conditions you are so offended by. It is possible that other changes you 
have made eliminated the data races, but I do not see how this change has any 
effect.

> 
> > And I love minimalistic code, and hate waste at the same time,
> especially in a general purpose library. To each its own.
> 
> Here's a helpful quote:
> 
> "We should forget about small efficiencies, say about 97% of the time:
> premature optimization is the root of all evil". It's from Donald
> Knuth.

By that measure, your entire patch could be considered evil. I've seen no 
evidence that the subsequent two allocation/copy/deallocate/sort cycles 
required to get from 8 to 64 entries is measurably more expensive, and I've 
seen nothing to indicate that a normal application using the C++ Standard 
Library would be creating and destroying locale instances in large numbers, or 
that doing so has a measureable impact on performance.

> And I love correct code which doesn't cause thread analyzers to report
> more than 12000 race conditions for just one test case. I've said it
> before and I will say it again: race conditions are a showstopper and
> are not negotiable. Period.

When the code in question has 12 threads that invoke a function 1000 times, 
you've found 1 race condition. I do agree data races are bad and should be 
fixed. But making changes to 'optimize' the code instead of fixing it is 
actually much worse.

> 
> The patch is in scope for the issue at hand. The issue is that
> std::numpunct and std::moneypunct are not thread safe. This has been
> confirmed by 4 different thread analyzers, even after applying your
> _numpunct.h patch.

I looked at the output from the thread analyzer. It points out a data race in 
__rw::__rw_allocate(), indicating that a memset() is responsible for a data 
race...

  
http://s247136804.onlinehome.us/22.locale.numpunct.mt.1.er.ts/file.14.src.txt.html#line_43

Assuming that `operator new' is indeed thread safe (I didn't bother to look), 
I'm curious to hear how this is an actual data race. I'm also curious to hear 
how you managed to avoid having the same race appear in the output that you 
submitted with the proposed patch.

> You are more than welcome to submit your own patch which eliminates
> 100% of the race conditions.
> 
> --Stefan
> 
> --
> Stefan Teleman
> KDE e.V.
> stefan.tele...@gmail.com

RE: STDCXX-1056 : numpunct fix

Reply via email to