Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Olivier Guilyardi

On 13/07/11 01:56, Paul Davis wrote:

> On Tue, Jul 12, 2011 at 7:31 PM, Arnold Krille  wrote:
>
>> You mean there should be a barrier to discussions about memory barriers?
>
> No. He means that there needs to be a barrier inserted into the
> discussion before it's possible to move on to the next stage.


Yes, I think that might well be it :)

But it's getting late now. Good night

--
  Olivier


___
Linux-audio-dev mailing list
Linux-audio-dev@lists.linuxaudio.org
http://lists.linuxaudio.org/listinfo/linux-audio-dev


Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Olivier Guilyardi

On 13/07/11 02:08, Fred Gleason wrote:

> On Jul 12, 2011, at 19:50 05, Olivier Guilyardi wrote:
>
>> Problem is I don't have such a device at the moment.
>
> Is your testing code online somewhere?  I do have such a setup (iPad 2
> provisioned as a development device in Xcode), and may take a crack at this.

That is nice! Yes, it is.

From a standard shell you just need to run:
$ svn co http://svn.samalyse.com/misc/rbtest
$ cd rbtest
$ make test

But you may need to adapt things a bit for the iPad.

--
  Olivier


Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Fred Gleason
On Jul 12, 2011, at 19:50 05, Olivier Guilyardi wrote:

> Problem is I don't have such a device at the moment.

Is your testing code online somewhere?  I do have such a setup (iPad 2
provisioned as a development device in Xcode), and may take a crack at this.

Cheers!


Frederick F. Gleason, Jr., Chief Developer
Paravel Systems

"Research is what I'm doing when I don't know what I'm doing."
  -- Wernher von Braun



Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Paul Davis
On Tue, Jul 12, 2011 at 7:31 PM, Arnold Krille  wrote:

> You mean there should be a barrier to discussions about memory barriers?

No. He means that there needs to be a barrier inserted into the
discussion before it's possible to move on to the next stage.


Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Olivier Guilyardi

On 13/07/11 00:23, Dan Kegel wrote:

> On Tue, Jul 12, 2011 at 2:32 PM, Olivier Guilyardi  wrote:
>
>> no one can write a test case which fails when
>> memory barriers are missing in a ringbuffer implementation.
>
> That's an interesting assertion.  It's kind of tempting to write some
> buggy circular buffers and test that assumption on common hardware.

I'm not sure what you mean by buggy circular buffers, but we already did quite a
lot of testing in the past.


That said, this article about the iPad 2 is quite frightening. I've read it
again and the author seems to know what he's talking about. His little FIFO and
his testing methodology both seem correct to me. That's the first potential
proof I have ever heard of:

http://wanderingcoder.net/2011/04/01/arm-memory-ordering/

He is quietly saying that a lot of code out there is about to break, for
real. As I mentioned previously, multi-core ARM devices are in the wild now. In
addition to the iPad 2, possibly vulnerable Android devices are being sold in
large numbers right now.


The problem is that I don't have such a device at the moment.

--
  Olivier


Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Arnold Krille
On Tuesday 12 July 2011 22:20:48 Olivier Guilyardi wrote:
> On 07/12/2011 09:45 PM, Chris Cannam wrote:
> > Thinking it over and going back over some references and earlier
> > threads here (e.g. much earlier ones from Olivier et al) it does seem
> > that this should be enough.  This particular situation isn't so
> > complicated after all.  I think the more I read earlier during this
> > thread (and reading around) generally about memory ordering, the more
> > I was beginning to feel as if the entire subject was a source of only
> > trouble.
> 
> Quite interestingly, I have noticed that discussions about memory barriers
> are often somehow endless.

You mean there should be a barrier to discussions about memory barriers?

Sorry, couldn't resist...




Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Dan Kegel
On Tue, Jul 12, 2011 at 2:32 PM, Olivier Guilyardi  wrote:
> no one can write a test case which fails when
> memory barriers are missing in a ringbuffer implementation.

That's an interesting assertion.  It's kind of tempting to write some
buggy circular buffers and test that assumption on common hardware.


Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Olivier Guilyardi
On 07/12/2011 11:37 PM, Chris Cannam wrote:
> On 12 July 2011 22:32, Olivier Guilyardi  wrote:
>> Thing is, of every single thing that has been said on this thread about
>> memory barriers and ringbuffers, no one can prove anything. On this thread,
>> on others, on LAD and elsewhere. For example, no one can write a test case
>> which fails when memory barriers are missing in a ringbuffer implementation.
> 
> There is one in the iPad 2 example Sean posted a link to earlier in the
> thread:
> 
> http://wanderingcoder.net/2011/04/01/arm-memory-ordering/
> 
> I haven't tried it, lacking an iPad 2 or any other multicore ARM computer.

Ah right, I read that too quickly... Thing is, I'm always suspicious of
quickly crafted ringbuffers such as the one in this blog post. It's never like a
mature implementation. I will try to run my little test suite on such a device.

--
  Olivier



Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Chris Cannam
On 12 July 2011 22:32, Olivier Guilyardi  wrote:
> Thing is, of every single thing that has been said on this thread about
> memory barriers and ringbuffers, no one can prove anything. On this thread,
> on others, on LAD and elsewhere. For example, no one can write a test case
> which fails when memory barriers are missing in a ringbuffer implementation.

There is one in the iPad 2 example Sean posted a link to earlier in the thread:

http://wanderingcoder.net/2011/04/01/arm-memory-ordering/

I haven't tried it, lacking an iPad 2 or any other multicore ARM computer.


Chris


Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Olivier Guilyardi
On 07/12/2011 10:36 PM, Paul Davis wrote:
> On Tue, Jul 12, 2011 at 4:20 PM, Olivier Guilyardi  wrote:
> 
>> Quite interestingly, I have noticed that discussions about memory barriers
>> are often somehow endless. What happened in the past is that I saw countless
>> discussions about whether they are needed, whether they are not, and people
>> would argue a lot and passionately.
> 
> I think the problem is that memory barriers are almost never required
> when writing "normal" code, and so people (including myself) are not
> exposed to their implementation or their use very much. and indeed,
> there are precious few library implementations of memory barriers, nor
> are they widely documented in a way that suggests that their use is
> "normal".
> 
> by contrast, they get used all over the place in the kernel
> (relatively speaking), so if you tend to have a lot of exposure to
> kernel code, then calls to mb() or whatever they use these days will
> be quite familiar.
> 
> there's the additional problem that this discussion normally ends up
> confusing two separate topics that many people seem to think are the
> same (they are not):
> 
>    1) do you need to actively ensure correct thread-level synchronization
>       between the reader and writer of a single-reader/single-writer FIFO?
>       Put differently, do you need to use synchronization mechanisms
>       semantically equivalent to a mutex to ensure that any arbitrary
>       execution order of the 2 threads does not cause incorrect behaviour?
> 
>    2) do you need memory barriers to ensure correct synchronization
>       for this kind of data structure in the face of possible hardware level
>       instruction reordering?
> 
> My feeling is that the answer to (1) is "no" and the answer to (2) is "yes".

Thing is, of every single thing that has been said on this thread about memory
barriers and ringbuffers, no one can prove anything. On this thread, on others,
on LAD and elsewhere. For example, no one can write a test case which fails when
memory barriers are missing in a ringbuffer implementation.

That's a pretty rare and intriguing situation.

--
  Olivier



Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Olivier Guilyardi
On 07/12/2011 11:07 PM, Chris Cannam wrote:
> On 12 July 2011 21:20, Olivier Guilyardi  wrote:
>> Quite interestingly, I have noticed that discussions about memory barriers
>> are often somehow endless. [...] So I thought, maybe there's a hidden topic
>> behind that. A "memory barrier"...
> 
> Indeed -- perhaps these discussions need some sort of memory write
> barrier, ensuring that everything discussed before the barrier will be
> recalled from store subsequently, instead of being discussed anew.
> But this list would be awfully quiet if we had such a thing.

I didn't say the discussions were useless or uninteresting. I just meant that I
think there's a strong symbolic aspect in this topic, which makes it even more
interesting ;)

--
  Olivier


Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Chris Cannam
On 12 July 2011 21:36, Paul Davis  wrote:
>   2) do you need memory barriers to ensure correct synchronization
>         for this kind of data structure in the face of possible hardware level
>         instruction reordering?

The transactional metaphor for this kind of thing seems useful -- the
idea that "we've written everything, now we commit for our readers"
feels like a helpful way to picture the points where barriers might be
necessary.

Since transactional integrity is not provided for us, the commit needs
to be either

 * protected with a memory barrier, if it doesn't matter that the data
   is available before it has been announced but does matter if the data
   is announced before being available
 * an atomic swap, if the new data must not be available before it has
   been announced and also there is only a single point of reference to
   the new data
 * mutex protected, if the references to any changes may be
   significant or we are not confident of either of the previous cases

...?
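
The "atomic swap" case in the list above could look roughly like the sketch below. It assumes a compiler with C11 `<stdatomic.h>` (which is newer than this thread); the `params` payload and the names `current_params`, `commit_params` and `read_params` are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload: any block of data prepared privately by the writer. */
typedef struct {
    float gain;
    float pan;
} params;

/* The single point of reference to the published data. */
static _Atomic(params *) current_params;

/* Writer: fill in `fresh` first, then commit it with one atomic exchange.
   The default ordering (sequentially consistent) keeps the stores to *fresh
   ahead of the pointer becoming visible. Returns the previously published
   block, or NULL if nothing had been published yet. */
static params *commit_params(params *fresh)
{
    assert(fresh != NULL);
    return atomic_exchange(&current_params, fresh);
}

/* Reader: sees either the old block or the new one, never a mix. */
static const params *read_params(void)
{
    return atomic_load(&current_params);
}
```

A writer would call `commit_params(&next)` once `next` is completely filled in. The returned old block can only be reused once the reader is known to be done with it, which this sketch does not try to solve.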


Chris


Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Chris Cannam
On 12 July 2011 21:20, Olivier Guilyardi  wrote:
> Quite interestingly, I have noticed that discussions about memory barriers are
> often somehow endless. [...] So I thought, maybe there's a hidden topic
> behind that. A "memory barrier"...

Indeed -- perhaps these discussions need some sort of memory write
barrier, ensuring that everything discussed before the barrier will be
recalled from store subsequently, instead of being discussed anew.
But this list would be awfully quiet if we had such a thing.


Chris


Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Paul Davis
On Tue, Jul 12, 2011 at 4:20 PM, Olivier Guilyardi  wrote:

> Quite interestingly, I have noticed that discussions about memory barriers are
> often somehow endless. What happened in the past is that I saw countless
> discussions about whether they are needed, whether they are not, and people
> would argue a lot and passionately.

I think the problem is that memory barriers are almost never required
when writing "normal" code, and so people (including myself) are not
exposed to their implementation or their use very much. and indeed,
there are precious few library implementations of memory barriers, nor
are they widely documented in a way that suggests that their use is
"normal".

by contrast, they get used all over the place in the kernel
(relatively speaking), so if you tend to have a lot of exposure to
kernel code, then calls to mb() or whatever they use these days will
be quite familiar.

there's the additional problem that this discussion normally ends up
confusing two separate topics that many people seem to think are the
same (they are not):

   1) do you need to actively ensure correct thread-level synchronization
      between the reader and writer of a single-reader/single-writer FIFO?
      Put differently, do you need to use synchronization mechanisms
      semantically equivalent to a mutex to ensure that any arbitrary
      execution order of the 2 threads does not cause incorrect behaviour?

   2) do you need memory barriers to ensure correct synchronization
      for this kind of data structure in the face of possible hardware level
      instruction reordering?

My feeling is that the answer to (1) is "no" and the answer to (2) is "yes".
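
One way to picture those two answers is a single-reader/single-writer FIFO that takes no mutex at all (answer 1) but orders its index updates explicitly (answer 2). This is only a sketch under assumptions: it uses C11 `<stdatomic.h>` acquire/release operations, which postdate this thread, and `spsc_fifo`, `FIFO_SIZE`, `fifo_init`, `fifo_write` and `fifo_read` are hypothetical names, not any existing library's API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define FIFO_SIZE 256u  /* hypothetical capacity; must be a power of two */

typedef struct {
    int slots[FIFO_SIZE];
    atomic_size_t head;  /* next slot to write; stored only by the writer */
    atomic_size_t tail;  /* next slot to read; stored only by the reader */
} spsc_fifo;

static void fifo_init(spsc_fifo *f)
{
    atomic_init(&f->head, 0);
    atomic_init(&f->tail, 0);
}

/* Writer thread only. Returns 1 on success, 0 if the FIFO is full. */
static int fifo_write(spsc_fifo *f, int value)
{
    size_t head = atomic_load_explicit(&f->head, memory_order_relaxed);
    /* Acquire: the reader's loads from a slot happen before we reuse it. */
    size_t tail = atomic_load_explicit(&f->tail, memory_order_acquire);
    if (head - tail == FIFO_SIZE)
        return 0;
    f->slots[head & (FIFO_SIZE - 1)] = value;
    /* Release: the slot store becomes visible before the new head does. */
    atomic_store_explicit(&f->head, head + 1, memory_order_release);
    return 1;
}

/* Reader thread only. Returns 1 on success, 0 if the FIFO is empty. */
static int fifo_read(spsc_fifo *f, int *out)
{
    assert(out != NULL);
    size_t tail = atomic_load_explicit(&f->tail, memory_order_relaxed);
    /* Acquire: pairs with the writer's release store to head. */
    size_t head = atomic_load_explicit(&f->head, memory_order_acquire);
    if (head == tail)
        return 0;
    *out = f->slots[tail & (FIFO_SIZE - 1)];
    /* Release: our slot load completes before the slot is handed back. */
    atomic_store_explicit(&f->tail, tail + 1, memory_order_release);
    return 1;
}
```

Either thread may run at any point relative to the other without a lock, because each index has exactly one writer. The acquire/release pairs are what stand in for the hardware-level barriers.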


Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Olivier Guilyardi
On 07/12/2011 09:45 PM, Chris Cannam wrote:

> Thinking it over and going back over some references and earlier
> threads here (e.g. much earlier ones from Olivier et al) it does seem
> that this should be enough.  This particular situation isn't so
> complicated after all.  I think the more I read earlier during this
> thread (and reading around) generally about memory ordering, the more
> I was beginning to feel as if the entire subject was a source of only
> trouble.

Quite interestingly, I have noticed that discussions about memory barriers are
often somehow endless. What happened in the past is that I saw countless
discussions about whether they are needed, whether they are not, and people
would argue a lot and passionately. So I thought, maybe there's a hidden topic
behind that. A "memory barrier"... Well, that very much reminds me of the
memory loss that happens to all of us in childhood. It turns the early years
into fuzzy memories. That is a barrier, too.

Whether such psychological barriers are needed or not, that's an interesting
question :)

--
  Olivier








Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Chris Cannam
On 12 July 2011 13:44, Dan Muresan  wrote:
> I wonder if
>
> {
> pthread_mutex_t dummy = PTHREAD_MUTEX_INITIALIZER;
> pthread_mutex_lock(&dummy);
> pthread_mutex_unlock(&dummy);
> }
>
> doesn't provide a portable full memory barrier.

According to 
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11
the answer must surely be yes -- pthread_mutex_lock provides a memory
barrier and the only (explicit) exception is for recursive mutexes
already held by the caller.

> Oh, and you probably need a barrier in the consumer too

Yes, before updating the read index. (In my earlier message I said "read
index" when I meant "write index".)

Thinking it over and going back over some references and earlier
threads here (e.g. much earlier ones from Olivier et al) it does seem
that this should be enough.  This particular situation isn't so
complicated after all.  I think the more I read earlier during this
thread (and reading around) generally about memory ordering, the more
I was beginning to feel as if the entire subject was a source of only
trouble.


Chris


Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Emanuel Rumpf
Will this solve any of the problems you are worrying about ?
Anyway, I'm mentioning it:

http://gcc.gnu.org/projects/gomp/

-- 
E.R.


Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Dan Muresan
> updating the read index, yes?  how is that expressed as portably as
> possible?  Does __sync_synchronize reliably do that?  (The
> documentation is surprisingly short...)

I wonder if

{
pthread_mutex_t dummy = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_lock(&dummy);
pthread_mutex_unlock(&dummy);
}

doesn't provide a portable full memory barrier. The dummy is different
each time, so no contention -- but still inefficient since  this would
be a 2-step full barrier. Nevertheless, it could be a portable
fallback.
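
Wrapped into a helper, that fallback might be used as in the sketch below. It rests on POSIX listing `pthread_mutex_lock()` and `pthread_mutex_unlock()` among the functions that synchronize memory; whether a given platform really makes this a full barrier is an assumption here, not something the sketch proves. `fallback_full_barrier`, `publish_sample`, `shared_sample` and `sample_ready` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

/* Lock and unlock a private, uncontended mutex purely for its
   memory-synchronization side effect. Returns 0 on success or a
   pthread error number. */
static int fallback_full_barrier(void)
{
    pthread_mutex_t dummy = PTHREAD_MUTEX_INITIALIZER;
    int err = pthread_mutex_lock(&dummy);
    if (err == 0)
        err = pthread_mutex_unlock(&dummy);
    pthread_mutex_destroy(&dummy);
    return err;
}

static float shared_sample;        /* the data being published */
static volatile int sample_ready;  /* the "announcement" flag */

/* Producer side: store the data, issue the barrier, then announce it. */
static void publish_sample(float value)
{
    shared_sample = value;
    int err = fallback_full_barrier();
    assert(err == 0);
    (void)err;                     /* keeps NDEBUG builds warning-free */
    sample_ready = 1;
}
```

The cost is two library calls per barrier, so this only makes sense where no cheaper primitive is available.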

Oh, and you probably need a barrier in the consumer too, just
reasoning from symmetry.

-- Dan


Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Olivier Guilyardi
On 07/12/2011 11:15 AM, Tim Blechmann wrote:
>>> using preempt_rt, the scheduling latency can be very low (like 10
>>> microseconds), if cpu frequency scaling is applied or smt/hyperthreading is
>>> enabled it can be as high as 250 microseconds (which is already quite
>>> significant, if one is using small signal vector sizes).
>> That's interesting. We're actually benchmarking scheduling latency at the
>> moment.
> 
> btw, i have discussed this briefly in my master thesis, section 4.1 [1]. the
> effect of the scheduling latency can also be observed in the benchmarks given
> in section 5.

Wow, that's pretty exhaustive..

> and it would be great, if you can share your results, i'd be curious to see 
> them!

I'll set something up when we're done. The tests can actually be found in:
http://code.google.com/p/andraudio/source/browse/experiments/scheduling

Beware, though, that this is about non-RT scheduling. We're trying to see if
running multiple inter-connected audio clients is feasible on Android without
adding latency.

--
  Olivier



Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Tim Blechmann
>> using preempt_rt, the scheduling latency can be very low (like 10
>> microseconds), if cpu frequency scaling is applied or smt/hyperthreading is
>> enabled it can be as high as 250 microseconds (which is already quite
>> significant, if one is using small signal vector sizes).
> 
> That's interesting. We're actually benchmarking scheduling latency at the
> moment.

btw, i have discussed this briefly in my master's thesis, section 4.1 [1]. the
effect of the scheduling latency can also be observed in the benchmarks given
in section 5.

and it would be great if you could share your results, i'd be curious to see
them!

cheers, tim

[1] http://tim.klingt.org/publications/tim_blechmann_supernova.pdf



Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Olivier Guilyardi
On 07/12/2011 10:27 AM, Tim Blechmann wrote:
>>> OTOH, if you have a number of threads at the same priority
>>> as Jack's and doing audio work (e.g. to use all the CPUs of
>>> an SMP machine) then using locks between them (but no other
>>> threads) should be OK - depending a bit on how they are used.
>> So, you can use locks as long as that's only meant to synchronize realtime
>> threads with each other? Should some master thread (could be the JACK process
>> thread) have a realtime priority slightly higher than the other (worker)
>> realtime threads? What are the caveats in general?
> 
> yes and no: it is perfectly fine to use locks to synchronize multiple
> real-time threads, but if thread A fails to acquire a lock, it will be
> suspended and woken up once thread B releases the lock. the time between
> `B releases the lock' and `A resumes its execution' is the scheduling
> latency, which is both os and hardware related.

I understand.

> using preempt_rt, the scheduling latency can be very low (like 10
> microseconds), if cpu frequency scaling is applied or smt/hyperthreading is
> enabled it can be as high as 250 microseconds (which is already quite
> significant, if one is using small signal vector sizes).

That's interesting. We're actually benchmarking scheduling latency at the 
moment.

> however one can avoid the scheduling latency by using spinlocks if one can 
> ensure that none of the synchronized threads can be preempted. personally i 
> achieve this by (a) using real-time scheduling, (b) using not more real-time 
> threads than physical cores and (c) pinning the rt threads to separate cores.

Ah yes, you ensure that threads run on separate cores. That really makes sense.
Thanks for the tip.

--
  Olivier


Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Tim Blechmann
>> OTOH, if you have a number of threads at the same priority
>> as Jack's and doing audio work (e.g. to use all the CPUs of
>> an SMP machine) then using locks between them (but no other
>> threads) should be OK - depending a bit on how they are used.
> 
> So, you can use locks as long as that's only meant to synchronize realtime
> threads with each other? Should some master thread (could be the JACK process
> thread) have a realtime priority slightly higher than the other (worker)
> realtime threads? What are the caveats in general?

yes and no: it is perfectly fine to use locks to synchronize multiple real-time
threads, but if thread A fails to acquire a lock, it will be suspended and woken
up once thread B releases the lock. the time between `B releases the lock' and
`A resumes its execution' is the scheduling latency, which is both os and
hardware related.

using preempt_rt, the scheduling latency can be very low (like 10 microseconds),
but if cpu frequency scaling is applied or smt/hyperthreading is enabled it can
be as high as 250 microseconds (which is already quite significant, if one is
using small signal vector sizes).

however one can avoid the scheduling latency by using spinlocks if one can
ensure that none of the synchronized threads can be preempted. personally i
achieve this by (a) using real-time scheduling, (b) using no more real-time
threads than physical cores and (c) pinning the rt threads to separate cores.
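
a rough sketch of points (a) and (c) plus a spinlock, assuming Linux with glibc (the `_np` affinity calls are non-portable GNU extensions). the helper names `pin_self_to_allowed_cpu`, `make_self_realtime` and `bump_counter` are hypothetical, and requesting SCHED_FIFO normally needs suitable privileges, so that call may fail with EPERM.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* (c) Pin the calling thread to the n-th CPU it is currently allowed to
   run on. Returns 0 on success or an error number. */
static int pin_self_to_allowed_cpu(int n)
{
    cpu_set_t allowed, target;
    int err = pthread_getaffinity_np(pthread_self(), sizeof allowed, &allowed);
    if (err != 0)
        return err;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed) && n-- == 0) {
            CPU_ZERO(&target);
            CPU_SET(cpu, &target);
            return pthread_setaffinity_np(pthread_self(), sizeof target,
                                          &target);
        }
    }
    return EINVAL;  /* fewer than n+1 CPUs available to this thread */
}

/* (a) Ask for real-time FIFO scheduling for the calling thread. */
static int make_self_realtime(int priority)
{
    struct sched_param sp = { .sched_priority = priority };
    return pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
}

/* A spinlock-protected counter shared between the pinned threads. The
   lock must be set up once with pthread_spin_init() before use. */
static pthread_spinlock_t counter_lock;
static long shared_counter;

static void bump_counter(void)
{
    int err = pthread_spin_lock(&counter_lock);
    assert(err == 0);
    (void)err;
    shared_counter++;
    pthread_spin_unlock(&counter_lock);
}
```

each worker would call `pin_self_to_allowed_cpu()` with a different index, so that a thread spinning on the lock never shares a core with the thread holding it.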

tim




Re: [LAD] a *simple* ring buffer, comments pls?

2011-07-12 Thread Chris Cannam
On 11 July 2011 21:58, Paul Davis  wrote:
> On Mon, Jul 11, 2011 at 4:50 PM, Chris Cannam
>  wrote:
>> Perhaps the pragmatic solution is to _lock_ the shared buffer?
>
> no, the pragmatic solution is to use memory barriers liberally applied.

Well OK, the vital library ringbuffer implementation that everyone
relies on is probably a bad place to start arguing for doing things
the wrong but easy way rather than the right way.

But I know my limitations (increasingly as I get older!) and reasoning
accurately about the behaviour of lock-free shared structures in a
system with relaxed memory ordering is probably among them.  My
existing code contains plenty of consequences of "oh, we don't need a
lock here because..." fuzz that won't work correctly in such an
environment.  I do wonder whether there isn't going to be increasingly
a case for doing it wrong in principle (through locking) but at least
getting the right answers.

Reading around for lock-free FIFO implementations I see several that
consist of "sequences of objects" (where the objects are allocated
outside the scope of the example) that rely on a single atomic
operation to update the read index -- that wouldn't be an adequate
barrier for the data itself in our buffer-of-floats, though, right?
We need a general write barrier after storing the data and before
updating the read index, yes?  How is that expressed as portably as
possible?  Does __sync_synchronize reliably do that?  (The
documentation is surprisingly short...)
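
For illustration, here is a minimal sketch of where such barriers would sit in a buffer of floats, assuming GCC or a compatible compiler, where `__sync_synchronize()` is documented as a full memory barrier. `ringbuf`, `RB_SIZE`, `rb_push` and `rb_pop` are hypothetical names, not the JACK ringbuffer API, and the sketch also assumes that aligned `size_t` loads and stores are not torn on the target. The index the producer publishes after its barrier is the write index.

```c
#include <assert.h>
#include <stddef.h>

#define RB_SIZE 1024u  /* hypothetical capacity; must be a power of two */

typedef struct {
    float data[RB_SIZE];
    volatile size_t write_index;  /* advanced only by the producer */
    volatile size_t read_index;   /* advanced only by the consumer */
} ringbuf;

/* Producer only. Returns 1 if the sample was stored, 0 if the buffer is full. */
static int rb_push(ringbuf *rb, float sample)
{
    assert(rb != NULL);
    size_t w = rb->write_index;
    size_t r = rb->read_index;
    if (w - r == RB_SIZE)
        return 0;
    rb->data[w & (RB_SIZE - 1)] = sample;
    __sync_synchronize();  /* data store must be visible before the index store */
    rb->write_index = w + 1;
    return 1;
}

/* Consumer only. Returns 1 if a sample was read, 0 if the buffer is empty. */
static int rb_pop(ringbuf *rb, float *out)
{
    assert(rb != NULL && out != NULL);
    size_t r = rb->read_index;
    size_t w = rb->write_index;
    if (r == w)
        return 0;
    __sync_synchronize();  /* index load must complete before the data load */
    *out = rb->data[r & (RB_SIZE - 1)];
    __sync_synchronize();  /* data load must complete before the slot is released */
    rb->read_index = r + 1;
    return 1;
}
```

Full barriers are stronger than strictly needed at each of these points; the sketch uses them because `__sync_synchronize()` is the only primitive asked about here.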


Chris