Re: should sync builtins be full optimization barriers?

2011-09-12 Thread Paolo Bonzini
On Tue, Sep 13, 2011 at 03:52, Geert Bosch wrote: > No, it is possible, and actually likely. Basically, the issue is write > buffers. > The coherency mechanisms come into play at a lower level in the > hierarchy (typically at the last-level cache), which is why we need fences > to start with to i

Re: should sync builtins be full optimization barriers?

2011-09-12 Thread Lawrence Crowl
On 9/11/11, Andrew MacLeod wrote: > On 09/09/2011 09:09 PM, Geert Bosch wrote: >> For the C++0x atomic types there are: >> >> void A::store(C desired, memory_order order = memory_order_seq_cst) >> volatile; >> void A::store(C desired, memory_order order = memory_order_seq_cst); >> >> where the fir

Re: should sync builtins be full optimization barriers?

2011-09-12 Thread Lawrence Crowl
On 9/9/11, Geert Bosch wrote: > To be honest, I can't quite see the use of completely unordered > atomic operations, where we not even prohibit compiler optimizations. > It would seem if we guarantee that a variable will not be accessed > concurrently from any other thread, we wouldn't need the op

Re: should sync builtins be full optimization barriers?

2011-09-12 Thread Geert Bosch
On Sep 12, 2011, at 19:19, Andrew MacLeod wrote: > Lets simplify it slightly. The compiler can optimize away x=1 and x=3 as > dead stores (even valid on atomics!), leaving us with 2 modification orders.. > 2,4 or 4,2 > and what you are getting at is you don't think we should ever see > r1==

Re: should sync builtins be full optimization barriers?

2011-09-12 Thread Andy Lutomirski
On 09/12/2011 05:30 PM, Ken Raeburn wrote: > On Sep 12, 2011, at 19:19, Andrew MacLeod wrote: >> lets say the order of the writes turns out to be 2,4... is it possible for >> both writes to be travelling around some bus and have thread 4 actually read >> the second one first, followed by the fi

Re: should sync builtins be full optimization barriers?

2011-09-12 Thread Ken Raeburn
On Sep 12, 2011, at 19:19, Andrew MacLeod wrote: > lets say the order of the writes turns out to be 2,4... is it possible for > both writes to be travelling around some bus and have thread 4 actually read > the second one first, followed by the first one? It would imply a lack of > memory co

Re: should sync builtins be full optimization barriers?

2011-09-12 Thread Andrew MacLeod
On 09/12/2011 02:40 PM, Geert Bosch wrote:

thread 1   thread 2   thread 3   thread 4
x=1;       r1=x       x=3;       r3=x;
x=2;       r2=x       x=4;       r4=x;

Even with relaxed memory ordering, all modifications to x have to occur in some particular tot
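
The four-thread example above can be sketched with C++11 relaxed atomics. This is only an illustration, not code from the thread: the names `Observed`, `run_once`, `known_before`, and `coherent` are hypothetical, and the checks cover a few necessary consequences of a single modification order for `x`, not every possible violation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical record of what the two reader threads observed.
struct Observed { int r1, r2, r3, r4; };

// One run of the four-thread example, using relaxed ordering throughout.
Observed run_once() {
    std::atomic<int> x{0};
    Observed o{0, 0, 0, 0};
    const auto rlx = std::memory_order_relaxed;
    std::thread t1([&] { x.store(1, rlx); x.store(2, rlx); });
    std::thread t2([&] { o.r1 = x.load(rlx); o.r2 = x.load(rlx); });
    std::thread t3([&] { x.store(3, rlx); x.store(4, rlx); });
    std::thread t4([&] { o.r3 = x.load(rlx); o.r4 = x.load(rlx); });
    t1.join(); t2.join(); t3.join(); t4.join();
    return o;
}

// Orderings fixed by the program itself: the initial 0 precedes every
// store, 1 precedes 2 (same thread), and 3 precedes 4 (same thread).
bool known_before(int a, int b) {
    return (a == 0 && b != 0) || (a == 1 && b == 2) || (a == 3 && b == 4);
}

// Necessary (not exhaustive) checks for a single modification order of x.
bool coherent(const Observed& o) {
    // A reader must not see an older value after a newer one.
    if (known_before(o.r2, o.r1)) return false;
    if (known_before(o.r4, o.r3)) return false;
    // The two readers must not see the same two values in opposite orders.
    if (o.r1 != o.r2 && o.r1 == o.r4 && o.r2 == o.r3) return false;
    return true;
}
```

Calling `coherent(run_once())` repeatedly should always yield `true` on a conforming implementation, because each atomic object has one modification order even under `memory_order_relaxed`. An outcome such as r1==2, r2==4, r3==4, r4==2 is the kind of disagreement between readers that the last check rejects.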

Re: should sync builtins be full optimization barriers?

2011-09-12 Thread Paolo Bonzini
On Mon, Sep 12, 2011 at 20:40, Geert Bosch wrote: > Assuming that statement is true, that would imply that even for relaxed > ordering there has to be an optimization barrier. Clearly fences need to be > used for any atomic accesses, including those with relaxed memory order. > > Consider 4 threa

Re: should sync builtins be full optimization barriers?

2011-09-12 Thread Geert Bosch
On Sep 12, 2011, at 03:02, Paolo Bonzini wrote: > On 09/11/2011 09:00 PM, Geert Bosch wrote: >> So, if I understand correctly, then operations using relaxed memory >> order will still need fences, but indeed do not require any >> optimization barrier. For memory_order_seq_cst we'll need a full >>

Re: Comparison of GCC-4.6.1 and LLVM-2.9 on x86/x86-64 targets

2011-09-12 Thread Vladimir Makarov
On 09/09/2011 07:30 PM, Lawrence Crowl wrote: On 9/7/11, Vladimir Makarov wrote: Some people asked me to do comparison of GCC-4.6 and LLVM-2.9 (both released this spring) as I did GCC-LLVM comparison in previous year. You can find it on http://vmakarov.fedorapeople.org/spec under 2011 GCC-LLV

Re: GCC 4.7.0 Status Report (2011-09-09)

2011-09-12 Thread Jeff Law
On 09/09/2011 01:09 AM, Jakub Jelinek wrote: bitfield lowering? What is the status of lra, reload-2a, pph, cilkplus, gupc (I assume at least some of these are 4.8+ material)? The bits on reload-v2a provide range splitting and a second chance at assigning a hard reg for unallocated allocnos. Th

Re: GCC 4.7.0 Status Report (2011-09-09)

2011-09-12 Thread Aldy Hernandez
Jakub Jelinek writes: > In particular, is transactional-memory branch mergeable within > a month and half, at least some parts of cxx-mem-model branch, > bitfield lowering? What is the status of lra, reload-2a, pph, Torvald and I are looking into getting things merge ready, but... The main prob

[Ann] MELT plugin 0.9 rc1 for GCC 4.6

2011-09-12 Thread Basile Starynkevitch
Hello All, It is my pleasure to announce the release candidate 1 of MELT plugin 0.9 for GCC 4.6. MELT is a high-level lispy domain specific language to develop GCC extensions. A release candidate 1 of MELT plugin 0.9 for gcc 4.6 is available, as a gzipped source tar archive, from http://gc

Re: An internal compiler error when building gcc4.6 using "-flto -fuse-linker-plugin" on Win7 mingw64 target

2011-09-12 Thread Jonathan Wakely
On 12 September 2011 10:05, PcX wrote: > > > Hi, all > >    I report it to gcc bugzilla > :http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50351 Then there's no need to also email this list. Discussion of the bug should take place in Bugzilla, not here.

Re: An internal compiler error when building gcc4.6 using "-flto -fuse-linker-plugin" on Win7 mingw64 target

2011-09-12 Thread PcX
I tried using gcc version 4.7.0 20110911 (experimental) to build gcc trunk; it has the same problem, but shows a different error: i686-w64-mingw32-gcc -pipe -g0 -O2 -fomit-frame-pointer -finline-functions -minline-a

An internal compiler error when building gcc4.6 using "-flto -fuse-linker-plugin" on Win7 mingw64 target

2011-09-12 Thread PcX
Hi, all. I reported it to GCC Bugzilla: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50351 When I use mingw64 gcc to build gcc4.6.2 20110909 with "-flto -fuse-linker-plugin", an internal compiler error occurs: ---

Your site in Google

2011-09-12 Thread Votre Site
Hello, we provide intensive search engine optimization so that your site ranks first in Google. See: http://www.PremierGoogle.com We build your site or rebuild your current site, making it even more attractive. We offer accelerated training courses for your employees, your children,

Incorrect optimized (-O2) linked list code with 4.3.2

2011-09-12 Thread pavan tc
Hi, I would like to know if there have been issues with optimized linked list code with GCC 4.3.2. [optimization flag: -O2] The following is the inlined code that has the problem: static inline void list_add_tail (struct list_head *new, struct list_head *head) { new->next = head;

Re: should sync builtins be full optimization barriers?

2011-09-12 Thread Paolo Bonzini
On 09/12/2011 01:22 AM, Andrew MacLeod wrote: You're right that using lock_test_and_set as an exchange is very wrong because of the compiler barrier semantics, but I think this is entirely a red herring in this case. The same problem could happen with a fetch_and_add or even a lock_release opera

Re: should sync builtins be full optimization barriers?

2011-09-12 Thread Paolo Bonzini
On 09/11/2011 09:00 PM, Geert Bosch wrote: So, if I understand correctly, then operations using relaxed memory order will still need fences, but indeed do not require any optimization barrier. For memory_order_seq_cst we'll need a full barrier, and for the others there is a partial barrier. If