------- Additional Comments From rittle at latour dot waar dot labs dot mot dot com 2005-05-02 16:02 ------- Subject: Re: New: Lack of Posix compliant thread safety in std::basic_string
>I am sending this to the g++ bug list on the recommendation of >Gabriel Dos Reis. From what little I've read in the g++ >documentation, I'm not convinced that the authors of the g++ >library intend for it to be supported, although Posix would seem >to require it. Firstly, POSIX says nothing about C++ thus my confusion starts here. Secondly, it is clear that your bug report is hypothetical. The library maintainers do not typically deal in hypotheticals. >For the record, the statement in Posix is: "Applications shall >ensure that access to any memory location by more than one >thread of control (threads or processes) is restricted such that >no thread of control can read or modify a memory location while >another thread of control may be modifying it." The obvious >extension to C++ is that of replacing "memory location" with >"object"; at the very least, of course, one can only require >something of "memory locations" which the user sees, directly or >indirectly. OK. >The statement in the libstdc++-v3 FAQ (question >5.6) is: "All library objects are safe to use in a multithreaded >program as long as each thread carefully locks out access by any >other thread while it uses any object visible to another thread, >i.e., treat library objects like any other shared resource. In >general, this requirement includes both read and write access to >objects; unless otherwise documented as safe, do not assume that >two threads may access a shared standard library object at the >same time." OK, I seem to recall editing that statement at one point... >A considerably weaker guarantee than what one >normally expects under Posix. (Note that the clause "like any >other shared resource" is simply false for those of us used to >the Posix model. If I replace std::string with char[] in my >code below, the behavior is perfectly defined under Posix.) No, confusion. We are guaranteeing that if you lock the visible accesses to global or otherwise shared objects, that no POSIX threading rule will be violated within our own library code. We are saying that we guarantee that any internal shared objects (which may not be visible to the user, thus not lockable by the user) will be correctly handled per POSIX threading rules. >The following is an example of a program which may cause >problems: > > #include <pthread.h> > #include <iostream> > #include <ostream> > #include <string> > > std::string g ; Please note that g is a global object which you have shared between two tasks. Thus the words in the FAQ are operative unless another guarantee is documented as operative. > extern "C" void* > thread1( > void* lock ) > { > std::string t( g ) ; This is read access of shared, global g thus incorrect usage according to the general contract that we offered you. > pthread_mutex_lock( static_cast< pthread_mutex_t* >( lock ) ) ; > std::cout << t << '\n' ; > pthread_mutex_unlock( static_cast< pthread_mutex_t* >( lock ) ) ; > return NULL ; > } > > extern "C" void* > thread2( > void* lock ) > { > std::string t( g.begin(), g.end() ) ; This is read access of shared, global g thus this is incorrect usage according to the general contract that we offered you. There may or may not be additional documentation covering the std::string<> implementation which implies that the above usage should work. In fact, I think there is such documentation. However, you should refer to that documentation not the general FAQ about all other library objects. > pthread_mutex_lock( static_cast< pthread_mutex_t* >( lock ) ) ; > std::cout << t << '\n' ; > pthread_mutex_unlock( static_cast< pthread_mutex_t* >( lock ) ) ; I note that you used the correct idiom implied by the FAW to lock shared access to the gloabl std::cout. Why do you consider g to be different? > return NULL ; > } > > int > main() > { > g = "0123456789" ; > pthread_mutex_t lock ; > pthread_mutex_init( &lock, NULL ) ; > pthread_t t1 ; > if ( pthread_create( &t1, NULL, &thread1, &lock ) != 0 ) { > std::cerr << "Could not create thread1" << std::endl ; > } > pthread_t t2 ; > if ( pthread_create( &t2, NULL, &thread2, &lock ) != 0 ) { > std::cerr << "Could not create thread } > pthread_join( t1, NULL ) ; > pthread_join( t2, NULL ) ; > return 0 ; > } > >Consider the following scenario: > > -- Thread 2 executes first. It gets to the expression > g.begin() (which for purposes of argument, we will suppose > is called before g.end() -- the problem will occur in which > ever function is called first), and starts executing it. > > At this point, the value _M_refcount in the _Rep_base is 0, > since there is only one instance, g, which shares the > representation. The representation is not "leaked", so we > call _M_leak_hard. > > _M_leak_hard calls _M_rep()->_M_is_shared(), which returns > false. > > -- Thread 1 interupts. Thread 2 calls the copy constructor, > with g as a parameter, which ultimately calls _M_grab on the > _M_is_leaked() returns false, since the _M_refcount is still > 0 in the representation. Thread 2 thus calls _M_refcopy() > on the representation, which (atomically) increments > _M_refcount. Thread 1 leaves the copy constructor. > > -- Now back to thread 2. Since _M_is_shared() returned false, > thread 2 doesn't call _M_mutate -- is simply calls > _M_set_leaked() on the representation, which sets > _M_refcount to -1. > > We will suppose that this modification is visible to all > other threads, despite the fact that there are no memory > barriers around it (which means that this supposition will > be false on certain platforms). > > -- And life goes on. The second call to begin()/end() doesn't > change anything, because it finds that the representation is > already "leaked". > > -- Finally, suppose that thread 1 finishes while thread 1 is > still using its iterators. Thread 1 calls the destructor > for its string. It sees that _M_refcount < 0, concludes > that the representation is leaked, and deletes it. Despite > the fact that thread 2 still holds iterators refering to it, > and despite the fact that there is still a global variable > (usable after the pthread_joins in main) which depends on > it. > >The problem is, of course, that the sequence which tests whether >we have to leak, and then leaks, is not atomic. >-- > Summary: Lack of Posix compliant thread safety in > std::basic_string > Product: gcc > Version: 3.4.3 > Status: UNCONFIRMED > Severity: minor > Priority: P2 > Component: libstdc++ > AssignedTo: unassigned at gcc dot gnu dot org > ReportedBy: jkanze at cheuvreux dot com > CC: gcc-bugs at gcc dot gnu dot org > GCC build triplet: sparc-sun-solaris2.8 > GCC host triplet: sparc-sun-solaris2.8 >GCC target triplet: sparc-sun-solaris2.8 > > >http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21334 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21334