On Mon, 2012-12-17 at 13:52 +0400, Dmitry Vyukov wrote:
> resend in plain text
> 
> On Mon, Dec 17, 2012 at 1:50 PM, Dmitry Vyukov <dvyu...@google.com> wrote:
> >
> > On Fri, Dec 14, 2012 at 5:43 PM, Torvald Riegel <trie...@redhat.com> wrote:
> > > On Thu, 2012-12-13 at 10:02 +0100, Jakub Jelinek wrote:
> > >> On Thu, Dec 13, 2012 at 10:38:13AM +0400, Dmitry Vyukov wrote:
> > >> > On Wed, Dec 12, 2012 at 11:50 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> > >> > > Various TM tests ICE when built with -fgnu-tm -fsanitize=address.
> > >> > > The problem is that the asan.c pass adds calls to builtins that
> > >> > > weren't there before, and TM is upset about it.  The __asan_report*
> > >> > > functions are all like abort: in a correctly written program they
> > >> > > shouldn't have a user-visible effect, and in a bad program they will
> > >> > > terminate the process.  In any case it doesn't matter how many times
> > >> > > they are retried as part of a transaction; there is no state to roll
> > >> > > back on transaction cancellation.
> > >> > > __asan_handle_no_return, while not being noreturn, just marks the
> > >> > > stack as unprotected.  So again, in a correctly written application
> > >> > > it has no effect; in a bad app it might result in some issues going
> > >> > > undetected, but it can still be done many times and isn't
> > >> > > irreversible.
> > >> >
> > >> > I was only loosely following tm-languages discussions. Does the gcc tm
> > >> > model guarantee strong consistency for all memory accesses? I mean
> > >> > there are tm implementations that allow transient inconsistencies,
> > >>
> > >> Will leave this to Torvald.
> > >
> > > This has two parts: (1) how TM fits into the C++11/C11 memory model, and
> > > (2) which guarantees the compiler and the TM runtime library agree on at
> > > the level of the TM ABI that we use in GCC.
> > >
> > > Regarding the first part, all transactions provide guarantees similar to
> > > a global lock.  Specifically, there are virtual transaction start/end
> > > events that take part in sequenced-before (similar to acquisition and
> > > > release of the global lock), which are then guaranteed to execute
> > > (without transactions interleaving with each other) in a global total
> > > order (let's call this order TO).  TO then contributes to
> > > happens-before.
> > >
> > > On the ABI level, the TM runtime is allowed to execute speculatively as
> > > long as (1) it does not expose any speculative execution to
> > > nontransactional code and (2) speculation doesn't violate the language's
> > > as-if rules (i.e., no visible side effects other than the abstract
> > > machine; no signals due to seg faults etc.).  This means that, for
> > > > example, the TM will not access data at a wider granularity than
> > > > nontransactional code that conforms to the C++11 memory model would.
> > > Second, transactions can have a tentative position in TO, but they will
> > > only expose final TO choices to nontxnal code.  So, each transaction
> > > will execute code that would be valid when executed in isolation -- but
> > > > it is possible that several noncommitted transactions are
> > > in flight concurrently that may conflict with each other.  The TM
> > > ensures that such speculative execution is safe and not visible to a
> > > race-free program.  So, when a transaction commits at a certain position
> > > in TO, it will make sure that all other active transactions reach
> > > consensus on TO (up to this position) before it returns to the execution
> > > of nontxnal code.  Those active transactions will either see that they
> > > would be still valid at a new TO position (after the committing txn), or
> > > they abort and then signal that they agree to this TO.  That means that
> > > the nontxnal code will not be affected by any speculative execution,
> > > even if the prior txn privatized some data (this is called privatization
> > > safety in TM jargon).
> > > The sort-of counterpart to privatization safety is publication safety
> > > (ie, transactions can safely make some data shared).  While
> > > privatization safety is handled by the TM runtime, publication safety is
> > > > ensured by the compiler, basically by not allowing any dangerous
> > > > load/load reordering.  Publication safety is not yet always ensured,
> > > > but Aldy is working on this.
> > >
> > >> > that are detected later and trx is restarted. Can't asan trigger false
> > >> > positives in this case?
> > >>
> > >> I can't imagine any.
> > >
> > > I also don't see any, because all loads/stores of all transactions, even
> > > those with a tentative TO position, would be a valid execution.  For
> > > malloc/free in transactions, AFAIR, ASAN hooks into malloc/free
> > > directly, so when libitm handles those including rollback of an
> > > allocation, there shouldn't be false positives.  However, there might be
> > > false negatives in case of free() because deallocations are deferred
> > > until transaction commit (they can't be rolled back).  Could we tell
> > > ASAN about free() directly using a call added by the TM instrumentation
> > > pass?  That would at least help catch the false negatives for all
> > > free()-like functions that the compiler knows about.
> >
> > I do not think that this false negative is super important.  We have a
> > similar false negative in the case of a racy use-after-free, when the
> > free just happens to occur after the use.
> >
> > However, it must be possible to support, because in the case of asan we
> > can "roll back" a free.
> > But won't it lead to false positives in case of:
> >
> > __transaction {
> >   free(p);
> >   p = 0;
> > }
> >
> > __transaction {
> >   r = 0;
> >   if (p)
> >     r = p->x;
> >   ...
> >
> > }
> > ?

(First, sorry for the late reply.)

You raise a good point.  For the thread executing the free(), the free()
should take immediate effect in the realm of asan checking.  For other
threads, it's more complicated because they can virtually execute at
older TO positions (see above).
The simplest solution would probably be, again, to execute transactions
serially.  Otherwise, we could use an STM with visible reads, which
doesn't switch TO position during execution of transactions (but there's
no STM method with visible reads implemented in libitm yet); or, we
could use invisible reads and replay all memory accesses whenever a
transaction switches its TO position.
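
To make the deferred free() discussed above more concrete, here is a
minimal sketch of a per-transaction log of deallocations.  The names
(struct txn, txn_free, txn_commit, txn_abort) and the fixed-size log are
hypothetical and chosen only for illustration; this is not libitm's
actual code.  It only shows why asan, which hooks the real free(), would
not see the deallocation until commit unless it is told about it
separately.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical fixed-size log of frees requested inside one transaction. */
#define TXN_MAX_FREES 16

struct txn {
    void  *deferred[TXN_MAX_FREES];
    size_t ndeferred;
};

/* Used in place of free() inside a transaction: only records the pointer.
 * The real free() (and thus asan's hook) is not reached yet. */
static void txn_free(struct txn *t, void *p)
{
    assert(t->ndeferred < TXN_MAX_FREES);
    t->deferred[t->ndeferred++] = p;
}

/* Commit: the deferred frees become real.  Returns how many were released. */
static size_t txn_commit(struct txn *t)
{
    size_t n = t->ndeferred;
    for (size_t i = 0; i < n; i++)
        free(t->deferred[i]);
    t->ndeferred = 0;
    return n;
}

/* Abort: drop the log.  The memory stays allocated, as if free() had
 * never been called.  Returns how many log entries were discarded. */
static size_t txn_abort(struct txn *t)
{
    size_t n = t->ndeferred;
    t->ndeferred = 0;
    return n;
}
```

Between txn_free() and txn_commit() the block is still live from asan's
point of view, so a use-after-free inside the same transaction would go
unreported in this sketch.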

> >
> >
> > > All of this is different for TSAN, of course, because there a memory
> > > access does alter the state that TSAN keeps, so if we roll back the
> > > accesses in the TM we would also have to roll back the TSAN state.
> >
> >
> > I think the simplest way to solve it for now is to use... well, a single
> > global lock.
> > I.e., replace __txn_start() with a global pthread_mutex_t acquisition and
> > __txn_commit() with a lock release, with no transactional instrumentation
> > inside of the trx. It should be a correct transformation, right? However, I
> > am not sure what to do with __txn_abort()...

This is already an execution mode supported by libitm.  It can be forced
by setting an env var: ITM_DEFAULT_METHOD=serial

Should we just tell users (in the docs) to set this env var when using
TSAN, or can TSAN set it, or should there be a way for TSAN to set the
mode using another mechanism?
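
For illustration, the single-global-lock transformation suggested above
could look roughly like the following sketch.  txn_start(), txn_commit(),
worker() and run_workers() are hypothetical stand-ins, not the TM ABI
entry points and not libitm's serial mode.  The sketch has no counterpart
to an abort, because a plain mutex cannot undo writes that were already
performed.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 16

/* One process-wide lock stands in for all transactions. */
static pthread_mutex_t txn_global_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

/* Hypothetical replacement for the transaction begin call. */
static void txn_start(void)
{
    int rc = pthread_mutex_lock(&txn_global_lock);
    assert(rc == 0);
    (void)rc;
}

/* Hypothetical replacement for the transaction commit call. */
static void txn_commit(void)
{
    int rc = pthread_mutex_unlock(&txn_global_lock);
    assert(rc == 0);
    (void)rc;
}

/* Each iteration is one "transaction" with plain, uninstrumented accesses. */
static void *worker(void *arg)
{
    long iters = *(long *)arg;
    for (long i = 0; i < iters; i++) {
        txn_start();
        shared_counter++;
        txn_commit();
    }
    return NULL;
}

/* Starts nthreads workers, waits for them, and returns the final counter. */
static long run_workers(int nthreads, long iters)
{
    pthread_t tids[MAX_THREADS];
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    shared_counter = 0;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, &iters);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return shared_counter;
}
```

Because every access to shared_counter happens with the lock held, a race
detector sees ordinary lock-protected accesses, and there is no
speculative state that would have to be rolled back.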

> > >> > Also, what is the order of instrumentation in the tm+asan setting? I
> > >> > mean that tm must not instrument asan instrumentation, and asan must
> > >> > not instrument tm instrumentation. Is it the case? There also can be
> > >> > conflicts related to ordering of instrumentation in the following
> > >> > case:
> > >> > asan_check();
> > >> > speculative_load();
> > >> > tm_check();
> > >
> > > The ordering of instrumentation is an interesting question, but I think
> > > that putting the ASAN check before the TM load/store is the right thing
> > > to do.  It will catch any illegal access before it's performed.  It
> > > doesn't suffer from false positives because the TM will only allow valid
> > > (even though speculative) executions.  For this ordering to not be valid
> > > (ie, tm_check before asan_check), transactions would need to be able to
> > > > prevent access to a certain variable by writing to it (which doesn't
> > > work).  Other cases in which the transaction would perform the access
> > > because of values of other variables that it loaded previously are
> > > covered by privatization safety.
> > >
> > >> I'm not aware of TM having speculative loads, libitm certainly doesn't
> > >> install a SIGSEGV handler (and testing whether some memory is
> > >> readable/writable without actually dereferencing would be terribly slow).
> > >> If a memory load or store segfaults, whether in a transaction or outside
> > >> of it, it is a program bug and it is right if asan terminates the program.
> > >
> > > Right, memory accesses performed by the TM are only speculative to the
> > > extent described above.
> >
> > There was a project called Transactional Locking II, if you are interested:
> >
> > https://github.com/daveboutcher/tl2-x86-mp/blob/master/tl2.c
> >     /*
> >      * Fetch tentative value
> >      * Use either SPARC non-fault loads

The only architecture that has non-faulting loads is SPARC, AFAIK.

> or complicit signal handlers.

The signal handlers are probably too fragile in practice.  The TM
runtime library would have to prevent all other parts of the program
from overwriting the handler, and the faults are still visible in the OS
and thus perhaps visible to other components.  I've not seen a
rock-solid, practical implementation of masking such faults yet.

Ensuring privatization safety avoids this issue, and it has to be added
anyway for time-based STMs such as TL2 (which don't provide privatization
safety by themselves).  This trade-off might be different for single-orec
STM algorithms such as NOrec, which provide privatization safety except
for this issue of pending loads.


Torvald
