Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread DJ Delorie
> Ok, but the converse — if the general_operand is accessed by more > than one instruction, it is not safe — is correct, right? In general, I'd agree, but the ISO spec talks about "sequence points" and there are times when you *can* access a volatile multiple times as long as the state is correct

Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread Paul_Koning
> On Jan 5, 2015, at 4:11 PM, DJ Delorie wrote: > > >> To try to generalize from that: it looks like the operating >> principle is that an insn that expands into multiple references to a >> given operand isn’t volatile-safe, but one where there is only a >> single reference is safe? > > No, if

Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread DJ Delorie
> To try to generalize from that: it looks like the operating
> principle is that an insn that expands into multiple references to a
> given operand isn’t volatile-safe, but one where there is only a
> single reference is safe?

No, if the expanded list of insns does "what the standard says, no mo

Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread Paul_Koning
> On Jan 5, 2015, at 1:47 PM, DJ Delorie wrote: > > >> One question: do you have an example of a non-volatile-safe machine so >> I can get a feel for the problems one might encounter? At best I can >> imagine a machine that optimizes "add 0, [mem]" to avoid the >> read/write, but I'm not aware

Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread DJ Delorie
> I looked in the documentation and didn’t see this described.

AFAIK it's not documented. Only recently was it agreed (and even then, reluctantly) that the ISO spec could be met by such opcodes.

Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread DJ Delorie
> One question: do you have an example of a non-volatile-safe machine so
> I can get a feel for the problems one might encounter? At best I can
> imagine a machine that optimizes "add 0, [mem]" to avoid the
> read/write, but I'm not aware of such an ISA.

For example, the MSP430 backend uses a ma

Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread Paul_Koning
> On Jan 5, 2015, at 1:24 PM, DJ Delorie wrote: > > >> What is involved with the auditing? > > Each pattern that (directly or indirectly) uses general_operand, > memory_operand, or nonimmediate_operand needs to be checked to see if > it's volatile-safe. If so, you need to change the predicate

Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread Matt Godbolt
On Mon, Jan 5, 2015 at 11:53 AM, DJ Delorie wrote: > > Matt Godbolt writes: >> GCC's code generation uses a "load; add; store" for volatiles, instead >> of a single "add 1, [metric]". > > GCC doesn't know if a target's load/add/store patterns are > volatile-safe, so it must avoid them. There are

Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread DJ Delorie
> What is involved with the auditing?

Each pattern that (directly or indirectly) uses general_operand, memory_operand, or nonimmediate_operand needs to be checked to see if it's volatile-safe. If so, you need to change the predicate to something that explicitly accepts volatiles. There's been t

Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread NightStrike
On Mon, Jan 5, 2015 at 12:53 PM, DJ Delorie wrote: > > Matt Godbolt writes: >> GCC's code generation uses a "load; add; store" for volatiles, instead >> of a single "add 1, [metric]". > > GCC doesn't know if a target's load/add/store patterns are > volatile-safe, so it must avoid them. There are

Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread DJ Delorie
Matt Godbolt writes:
> GCC's code generation uses a "load; add; store" for volatiles, instead
> of a single "add 1, [metric]".

GCC doesn't know if a target's load/add/store patterns are volatile-safe, so it must avoid them. There are a few targets that have been audited for volatile-safe-ness s

Re: volatile access optimization (C++ / x86_64)

2014-12-30 Thread Paul_Koning
> On Dec 30, 2014, at 1:32 PM, Matt Godbolt wrote: > > On Tue, Dec 30, 2014 at 5:05 AM, Torvald Riegel wrote: >> I agree with Andrew. My understanding of volatile is that the generated >> code must do exactly what the abstract machine would do. > > That makes sense. I suppose I don't understa

Re: volatile access optimization (C++ / x86_64)

2014-12-30 Thread Matt Godbolt
On Tue, Dec 30, 2014 at 5:05 AM, Torvald Riegel wrote:
> I agree with Andrew. My understanding of volatile is that the generated
> code must do exactly what the abstract machine would do.

That makes sense. I suppose I don't understand what the difference is in terms of an abstract machine of "lo

Re: volatile access optimization (C++ / x86_64)

2014-12-30 Thread Torvald Riegel
On Fri, 2014-12-26 at 22:26 +0000, Andrew Haley wrote: > On 26/12/14 20:32, Matt Godbolt wrote: > > > I'm investigating ways to have single-threaded writers write to memory > > areas which are then (very infrequently) read from another thread for > > monitoring purposes. Things like "number of uni

Re: volatile access optimization (C++ / x86_64)

2014-12-30 Thread Torvald Riegel
On Fri, 2014-12-26 at 23:19 +0000, Andrew Haley wrote: > On 26/12/14 22:49, Matt Godbolt wrote: > > At the moment I think the best I can do is to use an inline assembly > > version of the increment which prevents GCC from doing any > > optimisation upon it. That seems rather ugly though, and if any
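
For readers following the thread, the inline-assembly increment mentioned in the quoted message might look roughly like the sketch below. This is not code from the thread: the helper name `bump_with_asm` is made up for illustration, and the sketch assumes GCC or Clang extended asm with the default AT&T syntax on x86_64, falling back to a plain increment elsewhere.

```cpp
#include <cstdint>

// Hypothetical helper (not from the thread): increments a counter with a
// single memory-destination add. The increment is NOT atomic with respect
// to other writers, since there is no lock prefix.
inline void bump_with_asm(std::uint64_t& counter) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    // "+m" tells the compiler the operand lives in memory and is both read
    // and written. "asm volatile" keeps the statement from being removed
    // or merged with neighbouring increments.
    asm volatile("addq $1, %0" : "+m"(counter) : : "cc");
#else
    // Portable fallback for other targets or compilers (an assumption of
    // this sketch): an ordinary increment.
    counter = counter + 1;
#endif
}
```

The trade-off raised in the message still applies: the compiler treats the asm statement as opaque, so it cannot combine or reschedule it the way it could an ordinary increment.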

Re: volatile access optimization (C++ / x86_64)

2014-12-27 Thread Matt Godbolt
> On Sat, Dec 27, 2014 at 11:57 AM, Andrew Haley wrote:
> Is it faster? Have you measured it? Is it so much faster that it's critical
> for your application?

Well, I couldn't really leave this be: I did a little bit of benchmarking using my company's proprietary benchmarking library, which I

Re: volatile access optimization (C++ / x86_64)

2014-12-27 Thread Paul_Koning
> On Dec 27, 2014, at 1:40 PM, Andrew Haley wrote: > > On 27/12/14 18:04, Matt Godbolt wrote: >> On Sat, Dec 27, 2014 at 11:57 AM, Andrew Haley wrote: > >>> if you don't need an atomic access, why do you care that it uses a >>> read-modify-write instruction instead of three instructions? Is i

Re: volatile access optimization (C++ / x86_64)

2014-12-27 Thread Andrew Haley
On 27/12/14 18:04, Matt Godbolt wrote: > On Sat, Dec 27, 2014 at 11:57 AM, Andrew Haley wrote: >> if you don't need an atomic access, why do you care that it uses a >> read-modify-write instruction instead of three instructions? Is it >> faster? Have you measured it? Is it so much faster that

Re: volatile access optimization (C++ / x86_64)

2014-12-27 Thread Matt Godbolt
On Sat, Dec 27, 2014 at 11:57 AM, Andrew Haley wrote: > On 27/12/14 00:02, Matt Godbolt wrote: >> On Fri, Dec 26, 2014 at 5:19 PM, Andrew Haley wrote: >>> On 26/12/14 22:49, Matt Godbolt wrote: On Fri, Dec 26, 2014 at 4:26 PM, Andrew Haley wrote: >>> Why? >> >> Performance. > > Okay, but th

Re: volatile access optimization (C++ / x86_64)

2014-12-27 Thread Oleg Endo
On Sat, 2014-12-27 at 09:51 -0800, H.J. Lu wrote: > On Sat, Dec 27, 2014 at 9:45 AM, Andrew Haley wrote: > > On 27/12/14 16:02, paul_kon...@dell.com wrote: > >> > >> In the case of volatile variables, the external interface in > >> question is the one at the point where that address is implemented

Re: volatile access optimization (C++ / x86_64)

2014-12-27 Thread Andrew Haley
On 27/12/14 00:02, Matt Godbolt wrote: > On Fri, Dec 26, 2014 at 5:19 PM, Andrew Haley wrote: >> On 26/12/14 22:49, Matt Godbolt wrote: >>> On Fri, Dec 26, 2014 at 4:26 PM, Andrew Haley wrote: > >>> Thanks. I realise I was unclear in my original email. I'm really >>> looking for a way to say "do

Re: volatile access optimization (C++ / x86_64)

2014-12-27 Thread H.J. Lu
On Sat, Dec 27, 2014 at 9:45 AM, Andrew Haley wrote: > On 27/12/14 16:02, paul_kon...@dell.com wrote: >> >> In the case of volatile variables, the external interface in >> question is the one at the point where that address is implemented — >> a memory cell, or memory mapped I/O device on a bus.

Re: volatile access optimization (C++ / x86_64)

2014-12-27 Thread Andrew Haley
On 27/12/14 16:02, paul_kon...@dell.com wrote: > >> On Dec 26, 2014, at 6:19 PM, Andrew Haley wrote: >> >> On 26/12/14 22:49, Matt Godbolt wrote: >>> On Fri, Dec 26, 2014 at 4:26 PM, Andrew Haley wrote: On 26/12/14 20:32, Matt Godbolt wrote: > Is there a reason why (in principle) the vo

Re: volatile access optimization (C++ / x86_64)

2014-12-27 Thread Paul_Koning
> On Dec 26, 2014, at 6:19 PM, Andrew Haley wrote: > > On 26/12/14 22:49, Matt Godbolt wrote: >> On Fri, Dec 26, 2014 at 4:26 PM, Andrew Haley wrote: >>> On 26/12/14 20:32, Matt Godbolt wrote: Is there a reason why (in principle) the volatile increment can't be made into a single add?

Re: volatile access optimization (C++ / x86_64)

2014-12-26 Thread Matt Godbolt
On Fri, Dec 26, 2014 at 5:20 PM, NightStrike wrote:
> Have you tried release and acquire/consume instead?

Yes; these emit the same instructions in this case. http://goo.gl/e94Ya7

Regards, Matt
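
As a rough illustration of the reply above, the variants being compared might look like the following sketch. The counter name `work_done` and the function names are invented here and are not taken from the linked example. On x86_64, an atomic read-modify-write is generally compiled to a lock-prefixed instruction whatever memory order is requested, which fits the observation that the weaker orderings emit the same code.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical counter, named only for this sketch.
std::atomic<std::uint64_t> work_done{0};

// Relaxed: atomicity only, no ordering with surrounding memory accesses.
void bump_relaxed() { work_done.fetch_add(1, std::memory_order_relaxed); }

// Release: earlier writes become visible to a thread that acquires the value.
void bump_release() { work_done.fetch_add(1, std::memory_order_release); }

// Acquire: later reads cannot be moved before this operation.
void bump_acquire() { work_done.fetch_add(1, std::memory_order_acquire); }
```

Comparing the generated assembly for the three functions (for example with `-O2 -S`) is a quick way to check this on a given compiler and target.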

Re: volatile access optimization (C++ / x86_64)

2014-12-26 Thread Matt Godbolt
On Fri, Dec 26, 2014 at 5:19 PM, Andrew Haley wrote: > On 26/12/14 22:49, Matt Godbolt wrote: >> On Fri, Dec 26, 2014 at 4:26 PM, Andrew Haley wrote: >>> On 26/12/14 20:32, Matt Godbolt wrote: >> I realise my understanding could be wrong here! >> If not though, both clang and icc are taking a sh

Re: volatile access optimization (C++ / x86_64)

2014-12-26 Thread Matt Godbolt
On Fri, Dec 26, 2014 at 4:51 PM, Marc Glisse wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50677

Thanks Marc

Re: volatile access optimization (C++ / x86_64)

2014-12-26 Thread Andrew Haley
On 26/12/14 22:49, Matt Godbolt wrote:
> On Fri, Dec 26, 2014 at 4:26 PM, Andrew Haley wrote:
>> On 26/12/14 20:32, Matt Godbolt wrote:
>>> Is there a reason why (in principle) the volatile increment can't be
>>> made into a single add? Clang and ICC both emit the same code for the
>>> volatile an

Re: volatile access optimization (C++ / x86_64)

2014-12-26 Thread Marc Glisse
On Fri, 26 Dec 2014, Matt Godbolt wrote: I'm investigating ways to have single-threaded writers write to memory areas which are then (very infrequently) read from another thread for monitoring purposes. Things like "number of units of work done". I initially modeled this with relaxed atomic ope

Re: volatile access optimization (C++ / x86_64)

2014-12-26 Thread Matt Godbolt
On Fri, Dec 26, 2014 at 4:26 PM, Andrew Haley wrote:
> On 26/12/14 20:32, Matt Godbolt wrote:
>> Is there a reason why (in principle) the volatile increment can't be
>> made into a single add? Clang and ICC both emit the same code for the
>> volatile and non-volatile case.
>
> Yes. Volatiles use

Re: volatile access optimization (C++ / x86_64)

2014-12-26 Thread Andrew Haley
On 26/12/14 20:32, Matt Godbolt wrote: > I'm investigating ways to have single-threaded writers write to memory > areas which are then (very infrequently) read from another thread for > monitoring purposes. Things like "number of units of work done". > > I initially modeled this with relaxed atom

volatile access optimization (C++ / x86_64)

2014-12-26 Thread Matt Godbolt
Hi all, I'm investigating ways to have single-threaded writers write to memory areas which are then (very infrequently) read from another thread for monitoring purposes. Things like "number of units of work done". I initially modeled this with relaxed atomic operations. This generates a "lock xad
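
To make the question concrete, here is a minimal sketch of the three kinds of counter the thread compares. The variable and function names are invented for illustration and are not from the original post. The comments describe the code generation reported in the thread for x86_64, which may differ with other compilers, versions, or flags.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical counters, named only for this sketch.
std::atomic<std::uint64_t> atomic_metric{0};
volatile std::uint64_t volatile_metric = 0;
std::uint64_t plain_metric = 0;

// Relaxed atomic read-modify-write. The thread reports that this produces
// a lock-prefixed instruction, which is the cost the poster wants to avoid.
void bump_atomic() {
    atomic_metric.fetch_add(1, std::memory_order_relaxed);
}

// Volatile increment, written as an explicit read followed by a write.
// Per the thread, GCC emitted separate load, add and store instructions
// here, while Clang and ICC emitted the same code as the plain case.
void bump_volatile() {
    volatile_metric = volatile_metric + 1;
}

// Plain increment: the compiler may use a single memory-destination add,
// and may also merge or hoist increments, since nothing marks the variable
// as observed by another thread.
void bump_plain() {
    ++plain_metric;
}
```

Only the atomic variant is well defined by the C++ memory model when another thread reads the counter concurrently. The volatile and plain variants are shown because they are the code-generation cases under discussion.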