Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-08 Thread Felipe Balbi

Hi,

Felipe Ferreri Tonello  writes:
>> ps: can you point me to your devices shipping with f_midi ? Which
>> architecture are they using ? Which USB Peripheral Controller ? This
>> might be a good addition to my test farm depending on your answers above
>> :-p
>
> Seaboard GRAND[1]. Freescale's i.MX 6 running an ARM A9. The controller
> is Chip Idea.
>
> [1] https://www.roli.com/products/seaboard-grand

nice-looking product. But probably above my
"yet-another-device-for-some-hacking-and-testing" budget. :-p

-- 
balbi


signature.asc
Description: PGP signature


Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-08 Thread Felipe Balbi

Hi,

Felipe Ferreri Tonello  writes:
>> ps: can you point me to your devices shipping with f_midi ? Which
>> architecture are they using ? Which USB Peripheral Controller ? This
>> might be a good addition to my test farm depending on your answers above
>> :-p
>
> Seaboard GRAND[1]. Freescale's i.MX 6 running an ARM A9. The controller
> is Chip Idea.
>
> [1] https://www.roli.com/products/seaboard-grand

nice-looking product. But probably above my
"yet-another-device-for-some-hacking-and-testing" budget. :-p

-- 
balbi


signature.asc
Description: PGP signature


Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-08 Thread Felipe Ferreri Tonello
Hi Balbi,

On 08/03/16 14:01, Felipe Balbi wrote:
> 
> Hi,
> 
> Felipe Ferreri Tonello  writes:
 Since f_midi_transmit is called by both ALSA and USB frameworks, it
>>> can
 potentially cause a race condition between both calls. This is bad
>>> because the
 way f_midi_transmit is implemented can't handle concurrent calls.
>>> This is due
 to the fact that the usb request fifo looks for the next element and
>>> only if
 it has data to process it enqueues the request, otherwise re-uses it.
>>> If both
 (ALSA and USB) frameworks calls this function at the same time, the
 kfifo_seek() will return the same usb_request, which will cause a
>>> race
 condition.

 To solve this problem a syncronization mechanism is necessary. In
>>> this case it
 is used a spinlock since f_midi_transmit is also called by
>>> usb_request->complete
 callback in interrupt context.

 On benchmarks realized by me, spinlocks were more efficient then
>>> scheduling
 the f_midi_transmit tasklet in process context and using a mutex
 to synchronize. Also it performs better then previous
 implementation
>>> that
 allocated a usb_request for every new transmit made.
>>>
>>> behaves better in what way ? Also, previous implementation would not
>>> suffer from this concurrency problem, right ?
>>
>> The spin lock is faster than allocating usb requests all the time,
>> even if the udc uses da for it.
>
> did you measure ? Is the extra speed really necessary ? How did you
> benchmark this ?

 Yes I did measure and it was not that significant. This is not about
 speed. There was a bug in that approach that I already explained on
>>>
>>> you have very confusing statements. When I mentioned that previous code
>>> wouldn't have the need for the spinlock you replied that spinlock was
>>> faster.
>>>
>>> When I asked you about benchmarks you reply saying it's not about the
>>> speed.
>>>
>>> Make up your mind dude. What are you trying to achieve ?
>>>
 that patch, which was approved and applied BTW.
>>>
>>> patches can be reverted if we realise we're better off without
>>> them. Don't get cocky, please.
>>
>> Yes am I aware of that, but I honestly think that is the wrong way of
>> dealing with this.
>>
>> ?? I don't get why am I giving this impression.
> 
> re-read your emails. The gist goes like this:
> 
> . Send patch
> . Got comments
> . Well, whatever, you can just ignore if you don't agree

This is one of the problems with email. It can give the wrong impression
and feelings. :)

That was not what I meant at all. I mean that for real, not in a
childish manner. I'm sorry if I gave you that impression.

> 
 Any way, this spinlock should've been there since that patch but I
 couldn't really trigger this problem without a stress test.
>>>
>>> which tells me you sent me patches without properly testing. How much
>>> time did it take to trigger this ? How did you trigger this situation ?
>>
>> No, that is no true. The implementation I sent is working properly for
>> any real world usage.
>>
>> The stress test I made to break the current implementation is *not* a
>> real use-case. I made it in order to push as far as possible how fast
>> the driver can *reliably* handle while sending and reading data. Then I
>> noticed the bug.
>>
>> So, to answer your question. To trigger this bug is not a matter of
>> time. The following needs to happen:
>>  1. Device send MIDI message that is *bigger* than the usb request
>> length. (just this by itself is really unlikely to happen in real world
>> usage)
> 
> I wouldn't say it's unlikely. You just cannot trust the other side of
> the wire. We've seen e.g. Xbox 360's SCSI layer sending messages of the
> wrong size and we worked around them in g_mass_storage.
> 
> Broken implementations are a real thing ;-)

Fair enough. And that's why I am pushing this fix. :)

> 
>>  2. Host send a MIDI message back *exactly* at the same time as the
>> device is processing the second part of the usb request from the same
>> message.
> 
> also not that unlikely to happen ;-) You can't assume the host will only
> shift tokens on the wire at the time you're expecting it to.
> 
>> I couldn't trigger this in all the tests we've made. I just triggered
>> when I was sending huge messages back and forth (device <-> host) as
>> mentioned.
> 
> fair enough.
> 
>> In fact, we have thousands of devices out there without this patch (but
>> with my previous patch that introduced this bug).
> 
> that's thousands of devices waiting to have a problem, right ? :-)

:X

> 
>> I am not trying to say it wasn't a mistake. That patch unfortunately
>> introduces this bug, but it has real improvements over the previous
>> implementation. AFAIR the improvements are:
>>  * Fixes a bug that was 

Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-08 Thread Felipe Ferreri Tonello
Hi Balbi,

On 08/03/16 14:01, Felipe Balbi wrote:
> 
> Hi,
> 
> Felipe Ferreri Tonello  writes:
 Since f_midi_transmit is called by both ALSA and USB frameworks, it
>>> can
 potentially cause a race condition between both calls. This is bad
>>> because the
 way f_midi_transmit is implemented can't handle concurrent calls.
>>> This is due
 to the fact that the usb request fifo looks for the next element and
>>> only if
 it has data to process it enqueues the request, otherwise re-uses it.
>>> If both
 (ALSA and USB) frameworks calls this function at the same time, the
 kfifo_seek() will return the same usb_request, which will cause a
>>> race
 condition.

 To solve this problem a syncronization mechanism is necessary. In
>>> this case it
 is used a spinlock since f_midi_transmit is also called by
>>> usb_request->complete
 callback in interrupt context.

 On benchmarks realized by me, spinlocks were more efficient then
>>> scheduling
 the f_midi_transmit tasklet in process context and using a mutex
 to synchronize. Also it performs better then previous
 implementation
>>> that
 allocated a usb_request for every new transmit made.
>>>
>>> behaves better in what way ? Also, previous implementation would not
>>> suffer from this concurrency problem, right ?
>>
>> The spin lock is faster than allocating usb requests all the time,
>> even if the udc uses da for it.
>
> did you measure ? Is the extra speed really necessary ? How did you
> benchmark this ?

 Yes I did measure and it was not that significant. This is not about
 speed. There was a bug in that approach that I already explained on
>>>
>>> you have very confusing statements. When I mentioned that previous code
>>> wouldn't have the need for the spinlock you replied that spinlock was
>>> faster.
>>>
>>> When I asked you about benchmarks you reply saying it's not about the
>>> speed.
>>>
>>> Make up your mind dude. What are you trying to achieve ?
>>>
 that patch, which was approved and applied BTW.
>>>
>>> patches can be reverted if we realise we're better off without
>>> them. Don't get cocky, please.
>>
>> Yes am I aware of that, but I honestly think that is the wrong way of
>> dealing with this.
>>
>> ?? I don't get why am I giving this impression.
> 
> re-read your emails. The gist goes like this:
> 
> . Send patch
> . Got comments
> . Well, whatever, you can just ignore if you don't agree

This is one of the problems with email. It can give the wrong impression
and feelings. :)

That was not what I meant at all. I mean that for real, not in a
childish manner. I'm sorry if I gave you that impression.

> 
 Any way, this spinlock should've been there since that patch but I
 couldn't really trigger this problem without a stress test.
>>>
>>> which tells me you sent me patches without properly testing. How much
>>> time did it take to trigger this ? How did you trigger this situation ?
>>
>> No, that is no true. The implementation I sent is working properly for
>> any real world usage.
>>
>> The stress test I made to break the current implementation is *not* a
>> real use-case. I made it in order to push as far as possible how fast
>> the driver can *reliably* handle while sending and reading data. Then I
>> noticed the bug.
>>
>> So, to answer your question. To trigger this bug is not a matter of
>> time. The following needs to happen:
>>  1. Device send MIDI message that is *bigger* than the usb request
>> length. (just this by itself is really unlikely to happen in real world
>> usage)
> 
> I wouldn't say it's unlikely. You just cannot trust the other side of
> the wire. We've seen e.g. Xbox 360's SCSI layer sending messages of the
> wrong size and we worked around them in g_mass_storage.
> 
> Broken implementations are a real thing ;-)

Fair enough. And that's why I am pushing this fix. :)

> 
>>  2. Host send a MIDI message back *exactly* at the same time as the
>> device is processing the second part of the usb request from the same
>> message.
> 
> also not that unlikely to happen ;-) You can't assume the host will only
> shift tokens on the wire at the time you're expecting it to.
> 
>> I couldn't trigger this in all the tests we've made. I just triggered
>> when I was sending huge messages back and forth (device <-> host) as
>> mentioned.
> 
> fair enough.
> 
>> In fact, we have thousands of devices out there without this patch (but
>> with my previous patch that introduced this bug).
> 
> that's thousands of devices waiting to have a problem, right ? :-)

:X

> 
>> I am not trying to say it wasn't a mistake. That patch unfortunately
>> introduces this bug, but it has real improvements over the previous
>> implementation. AFAIR the improvements are:
>>  * Fixes a bug that was causing the DMA buffer to 

Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-08 Thread Felipe Balbi

Hi,

Felipe Ferreri Tonello  writes:
>>> Since f_midi_transmit is called by both ALSA and USB frameworks, it
>> can
>>> potentially cause a race condition between both calls. This is bad
>> because the
>>> way f_midi_transmit is implemented can't handle concurrent calls.
>> This is due
>>> to the fact that the usb request fifo looks for the next element and
>> only if
>>> it has data to process it enqueues the request, otherwise re-uses it.
>> If both
>>> (ALSA and USB) frameworks calls this function at the same time, the
>>> kfifo_seek() will return the same usb_request, which will cause a
>> race
>>> condition.
>>>
>>> To solve this problem a syncronization mechanism is necessary. In
>> this case it
>>> is used a spinlock since f_midi_transmit is also called by
>> usb_request->complete
>>> callback in interrupt context.
>>>
>>> On benchmarks realized by me, spinlocks were more efficient then
>> scheduling
>>> the f_midi_transmit tasklet in process context and using a mutex
>>> to synchronize. Also it performs better then previous
>>> implementation
>> that
>>> allocated a usb_request for every new transmit made.
>>
>> behaves better in what way ? Also, previous implementation would not
>> suffer from this concurrency problem, right ?
>
> The spin lock is faster than allocating usb requests all the time,
> even if the udc uses da for it.

 did you measure ? Is the extra speed really necessary ? How did you
 benchmark this ?
>>>
>>> Yes I did measure and it was not that significant. This is not about
>>> speed. There was a bug in that approach that I already explained on
>> 
>> you have very confusing statements. When I mentioned that previous code
>> wouldn't have the need for the spinlock you replied that spinlock was
>> faster.
>> 
>> When I asked you about benchmarks you reply saying it's not about the
>> speed.
>> 
>> Make up your mind dude. What are you trying to achieve ?
>> 
>>> that patch, which was approved and applied BTW.
>> 
>> patches can be reverted if we realise we're better off without
>> them. Don't get cocky, please.
>
> Yes am I aware of that, but I honestly think that is the wrong way of
> dealing with this.
>
> ?? I don't get why am I giving this impression.

re-read your emails. The gist goes like this:

. Send patch
. Got comments
. Well, whatever, you can just ignore if you don't agree

>>> Any way, this spinlock should've been there since that patch but I
>>> couldn't really trigger this problem without a stress test.
>> 
>> which tells me you sent me patches without properly testing. How much
>> time did it take to trigger this ? How did you trigger this situation ?
>
> No, that is no true. The implementation I sent is working properly for
> any real world usage.
>
> The stress test I made to break the current implementation is *not* a
> real use-case. I made it in order to push as far as possible how fast
> the driver can *reliably* handle while sending and reading data. Then I
> noticed the bug.
>
> So, to answer your question. To trigger this bug is not a matter of
> time. The following needs to happen:
>  1. Device send MIDI message that is *bigger* than the usb request
> length. (just this by itself is really unlikely to happen in real world
> usage)

I wouldn't say it's unlikely. You just cannot trust the other side of
the wire. We've seen e.g. Xbox 360's SCSI layer sending messages of the
wrong size and we worked around them in g_mass_storage.

Broken implementations are a real thing ;-)

>  2. Host send a MIDI message back *exactly* at the same time as the
> device is processing the second part of the usb request from the same
> message.

also not that unlikely to happen ;-) You can't assume the host will only
shift tokens on the wire at the time you're expecting it to.

> I couldn't trigger this in all the tests we've made. I just triggered
> when I was sending huge messages back and forth (device <-> host) as
> mentioned.

fair enough.

> In fact, we have thousands of devices out there without this patch (but
> with my previous patch that introduced this bug).

that's thousands of devices waiting to have a problem, right ? :-)

> I am not trying to say it wasn't a mistake. That patch unfortunately
> introduces this bug, but it has real improvements over the previous
> implementation. AFAIR the improvements are:
>  * Fixes a bug that was causing the DMA buffer to fill it up causing a
> kernel panic.

this is a good point. Had forgotten about that detail. Thanks

>  * Pre allocate IN usb requests so there is no allocation overhead while
> sending data (same behavior already existed for the OUT endpoint). This
> ensure that the DMA memory is not misused affecting the rest of the
> system.

also, arguably, a good idea. Recycling requests is a lot nicer and it's
what most gadget drivers do.

>  * It 

Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-08 Thread Felipe Balbi

Hi,

Felipe Ferreri Tonello  writes:
>>> Since f_midi_transmit is called by both ALSA and USB frameworks, it
>> can
>>> potentially cause a race condition between both calls. This is bad
>> because the
>>> way f_midi_transmit is implemented can't handle concurrent calls.
>> This is due
>>> to the fact that the usb request fifo looks for the next element and
>> only if
>>> it has data to process it enqueues the request, otherwise re-uses it.
>> If both
>>> (ALSA and USB) frameworks calls this function at the same time, the
>>> kfifo_seek() will return the same usb_request, which will cause a
>> race
>>> condition.
>>>
>>> To solve this problem a syncronization mechanism is necessary. In
>> this case it
>>> is used a spinlock since f_midi_transmit is also called by
>> usb_request->complete
>>> callback in interrupt context.
>>>
>>> On benchmarks realized by me, spinlocks were more efficient then
>> scheduling
>>> the f_midi_transmit tasklet in process context and using a mutex
>>> to synchronize. Also it performs better then previous
>>> implementation
>> that
>>> allocated a usb_request for every new transmit made.
>>
>> behaves better in what way ? Also, previous implementation would not
>> suffer from this concurrency problem, right ?
>
> The spin lock is faster than allocating usb requests all the time,
> even if the udc uses da for it.

 did you measure ? Is the extra speed really necessary ? How did you
 benchmark this ?
>>>
>>> Yes I did measure and it was not that significant. This is not about
>>> speed. There was a bug in that approach that I already explained on
>> 
>> you have very confusing statements. When I mentioned that previous code
>> wouldn't have the need for the spinlock you replied that spinlock was
>> faster.
>> 
>> When I asked you about benchmarks you reply saying it's not about the
>> speed.
>> 
>> Make up your mind dude. What are you trying to achieve ?
>> 
>>> that patch, which was approved and applied BTW.
>> 
>> patches can be reverted if we realise we're better off without
>> them. Don't get cocky, please.
>
> Yes am I aware of that, but I honestly think that is the wrong way of
> dealing with this.
>
> ?? I don't get why am I giving this impression.

re-read your emails. The gist goes like this:

. Send patch
. Got comments
. Well, whatever, you can just ignore if you don't agree

>>> Any way, this spinlock should've been there since that patch but I
>>> couldn't really trigger this problem without a stress test.
>> 
>> which tells me you sent me patches without properly testing. How much
>> time did it take to trigger this ? How did you trigger this situation ?
>
> No, that is no true. The implementation I sent is working properly for
> any real world usage.
>
> The stress test I made to break the current implementation is *not* a
> real use-case. I made it in order to push as far as possible how fast
> the driver can *reliably* handle while sending and reading data. Then I
> noticed the bug.
>
> So, to answer your question. To trigger this bug is not a matter of
> time. The following needs to happen:
>  1. Device send MIDI message that is *bigger* than the usb request
> length. (just this by itself is really unlikely to happen in real world
> usage)

I wouldn't say it's unlikely. You just cannot trust the other side of
the wire. We've seen e.g. Xbox 360's SCSI layer sending messages of the
wrong size and we worked around them in g_mass_storage.

Broken implementations are a real thing ;-)

>  2. Host send a MIDI message back *exactly* at the same time as the
> device is processing the second part of the usb request from the same
> message.

also not that unlikely to happen ;-) You can't assume the host will only
shift tokens on the wire at the time you're expecting it to.

> I couldn't trigger this in all the tests we've made. I just triggered
> when I was sending huge messages back and forth (device <-> host) as
> mentioned.

fair enough.

> In fact, we have thousands of devices out there without this patch (but
> with my previous patch that introduced this bug).

that's thousands of devices waiting to have a problem, right ? :-)

> I am not trying to say it wasn't a mistake. That patch unfortunately
> introduces this bug, but it has real improvements over the previous
> implementation. AFAIR the improvements are:
>  * Fixes a bug that was causing the DMA buffer to fill it up causing a
> kernel panic.

this is a good point. Had forgotten about that detail. Thanks

>  * Pre allocate IN usb requests so there is no allocation overhead while
> sending data (same behavior already existed for the OUT endpoint). This
> ensure that the DMA memory is not misused affecting the rest of the
> system.

also, arguably, a good idea. Recycling requests is a lot nicer and it's
what most gadget drivers do.

>  * It doesn't crash if the host 

Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-08 Thread Felipe Ferreri Tonello
Hi Balbi,

On 08/03/16 07:37, Felipe Balbi wrote:
> 
> Hi,
> 
> Felipe Ferreri Tonello  writes:
>> Since f_midi_transmit is called by both ALSA and USB frameworks, it
> can
>> potentially cause a race condition between both calls. This is bad
> because the
>> way f_midi_transmit is implemented can't handle concurrent calls.
> This is due
>> to the fact that the usb request fifo looks for the next element and
> only if
>> it has data to process it enqueues the request, otherwise re-uses it.
> If both
>> (ALSA and USB) frameworks calls this function at the same time, the
>> kfifo_seek() will return the same usb_request, which will cause a
> race
>> condition.
>>
>> To solve this problem a syncronization mechanism is necessary. In
> this case it
>> is used a spinlock since f_midi_transmit is also called by
> usb_request->complete
>> callback in interrupt context.
>>
>> On benchmarks realized by me, spinlocks were more efficient then
> scheduling
>> the f_midi_transmit tasklet in process context and using a mutex
>> to synchronize. Also it performs better then previous
>> implementation
> that
>> allocated a usb_request for every new transmit made.
>
> behaves better in what way ? Also, previous implementation would not
> suffer from this concurrency problem, right ?

 The spin lock is faster than allocating usb requests all the time,
 even if the udc uses da for it.
>>>
>>> did you measure ? Is the extra speed really necessary ? How did you
>>> benchmark this ?
>>
>> Yes I did measure and it was not that significant. This is not about
>> speed. There was a bug in that approach that I already explained on
> 
> you have very confusing statements. When I mentioned that previous code
> wouldn't have the need for the spinlock you replied that spinlock was
> faster.
> 
> When I asked you about benchmarks you reply saying it's not about the
> speed.
> 
> Make up your mind dude. What are you trying to achieve ?
> 
>> that patch, which was approved and applied BTW.
> 
> patches can be reverted if we realise we're better off without
> them. Don't get cocky, please.

Yes am I aware of that, but I honestly think that is the wrong way of
dealing with this.

?? I don't get why am I giving this impression.

> 
>> Any way, this spinlock should've been there since that patch but I
>> couldn't really trigger this problem without a stress test.
> 
> which tells me you sent me patches without properly testing. How much
> time did it take to trigger this ? How did you trigger this situation ?

No, that is no true. The implementation I sent is working properly for
any real world usage.

The stress test I made to break the current implementation is *not* a
real use-case. I made it in order to push as far as possible how fast
the driver can *reliably* handle while sending and reading data. Then I
noticed the bug.

So, to answer your question. To trigger this bug is not a matter of
time. The following needs to happen:
 1. Device send MIDI message that is *bigger* than the usb request
length. (just this by itself is really unlikely to happen in real world
usage)
 2. Host send a MIDI message back *exactly* at the same time as the
device is processing the second part of the usb request from the same
message.

I couldn't trigger this in all the tests we've made. I just triggered
when I was sending huge messages back and forth (device <-> host) as
mentioned.

In fact, we have thousands of devices out there without this patch (but
with my previous patch that introduced this bug).

I am not trying to say it wasn't a mistake. That patch unfortunately
introduces this bug, but it has real improvements over the previous
implementation. AFAIR the improvements are:
 * Fixes a bug that was causing the DMA buffer to fill it up causing a
kernel panic.
 * Pre allocate IN usb requests so there is no allocation overhead while
sending data (same behavior already existed for the OUT endpoint). This
ensure that the DMA memory is not misused affecting the rest of the system.
 * It doesn't crash if the host doesn't send an ACK after IN data
packets and we have reached the limit of available memory. Also, this is
useful because it causes the ALSA layer to timeout, which is the correct
userspace behavior.
 * Continuous to send data to the correct Jack (associated to each ALSA
substream) if that was interrupted somehow, for instance by the size
limit of a usb request.


> 
>> So, this patch fixes a bug in the current implementation.
> 
> fixes a regression introduced by you, true. I'm trying to figure out if
> we're better off without the original patch; to make a good decision I
> need to know if the extra "speed" we get from not allocating requests on
> demand are really that important.
> 
> So, how much faster did you get and is that extra "speed" really
> important ?

The speed is not relevant at 

Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-08 Thread Felipe Ferreri Tonello
Hi Balbi,

On 08/03/16 07:37, Felipe Balbi wrote:
> 
> Hi,
> 
> Felipe Ferreri Tonello  writes:
>> Since f_midi_transmit is called by both ALSA and USB frameworks, it
> can
>> potentially cause a race condition between both calls. This is bad
> because the
>> way f_midi_transmit is implemented can't handle concurrent calls.
> This is due
>> to the fact that the usb request fifo looks for the next element and
> only if
>> it has data to process it enqueues the request, otherwise re-uses it.
> If both
>> (ALSA and USB) frameworks calls this function at the same time, the
>> kfifo_seek() will return the same usb_request, which will cause a
> race
>> condition.
>>
>> To solve this problem a syncronization mechanism is necessary. In
> this case it
>> is used a spinlock since f_midi_transmit is also called by
> usb_request->complete
>> callback in interrupt context.
>>
>> On benchmarks realized by me, spinlocks were more efficient then
> scheduling
>> the f_midi_transmit tasklet in process context and using a mutex
>> to synchronize. Also it performs better then previous
>> implementation
> that
>> allocated a usb_request for every new transmit made.
>
> behaves better in what way ? Also, previous implementation would not
> suffer from this concurrency problem, right ?

 The spin lock is faster than allocating usb requests all the time,
 even if the udc uses da for it.
>>>
>>> did you measure ? Is the extra speed really necessary ? How did you
>>> benchmark this ?
>>
>> Yes I did measure and it was not that significant. This is not about
>> speed. There was a bug in that approach that I already explained on
> 
> you have very confusing statements. When I mentioned that previous code
> wouldn't have the need for the spinlock you replied that spinlock was
> faster.
> 
> When I asked you about benchmarks you reply saying it's not about the
> speed.
> 
> Make up your mind dude. What are you trying to achieve ?
> 
>> that patch, which was approved and applied BTW.
> 
> patches can be reverted if we realise we're better off without
> them. Don't get cocky, please.

Yes am I aware of that, but I honestly think that is the wrong way of
dealing with this.

?? I don't get why am I giving this impression.

> 
>> Any way, this spinlock should've been there since that patch but I
>> couldn't really trigger this problem without a stress test.
> 
> which tells me you sent me patches without properly testing. How much
> time did it take to trigger this ? How did you trigger this situation ?

No, that is no true. The implementation I sent is working properly for
any real world usage.

The stress test I made to break the current implementation is *not* a
real use-case. I made it in order to push as far as possible how fast
the driver can *reliably* handle while sending and reading data. Then I
noticed the bug.

So, to answer your question. To trigger this bug is not a matter of
time. The following needs to happen:
 1. Device send MIDI message that is *bigger* than the usb request
length. (just this by itself is really unlikely to happen in real world
usage)
 2. Host send a MIDI message back *exactly* at the same time as the
device is processing the second part of the usb request from the same
message.

I couldn't trigger this in all the tests we've made. I just triggered
when I was sending huge messages back and forth (device <-> host) as
mentioned.

In fact, we have thousands of devices out there without this patch (but
with my previous patch that introduced this bug).

I am not trying to say it wasn't a mistake. That patch unfortunately
introduces this bug, but it has real improvements over the previous
implementation. AFAIR the improvements are:
 * Fixes a bug that was causing the DMA buffer to fill it up causing a
kernel panic.
 * Pre allocate IN usb requests so there is no allocation overhead while
sending data (same behavior already existed for the OUT endpoint). This
ensure that the DMA memory is not misused affecting the rest of the system.
 * It doesn't crash if the host doesn't send an ACK after IN data
packets and we have reached the limit of available memory. Also, this is
useful because it causes the ALSA layer to timeout, which is the correct
userspace behavior.
 * Continuous to send data to the correct Jack (associated to each ALSA
substream) if that was interrupted somehow, for instance by the size
limit of a usb request.


> 
>> So, this patch fixes a bug in the current implementation.
> 
> fixes a regression introduced by you, true. I'm trying to figure out if
> we're better off without the original patch; to make a good decision I
> need to know if the extra "speed" we get from not allocating requests on
> demand are really that important.
> 
> So, how much faster did you get and is that extra "speed" really
> important ?

The speed is not relevant at all in this case. It was 

Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-07 Thread Felipe Balbi

Hi,

Felipe Ferreri Tonello  writes:
> Since f_midi_transmit is called by both ALSA and USB frameworks, it
 can
> potentially cause a race condition between both calls. This is bad
 because the
> way f_midi_transmit is implemented can't handle concurrent calls.
 This is due
> to the fact that the usb request fifo looks for the next element and
 only if
> it has data to process it enqueues the request, otherwise re-uses it.
 If both
> (ALSA and USB) frameworks calls this function at the same time, the
> kfifo_seek() will return the same usb_request, which will cause a
 race
> condition.
>
> To solve this problem a syncronization mechanism is necessary. In
 this case it
> is used a spinlock since f_midi_transmit is also called by
 usb_request->complete
> callback in interrupt context.
>
> On benchmarks realized by me, spinlocks were more efficient then
 scheduling
> the f_midi_transmit tasklet in process context and using a mutex
> to synchronize. Also it performs better then previous
> implementation
 that
> allocated a usb_request for every new transmit made.

 behaves better in what way ? Also, previous implementation would not
 suffer from this concurrency problem, right ?
>>>
>>> The spin lock is faster than allocating usb requests all the time,
>>> even if the udc uses da for it.
>> 
>> did you measure ? Is the extra speed really necessary ? How did you
>> benchmark this ?
>
> Yes I did measure and it was not that significant. This is not about
> speed. There was a bug in that approach that I already explained on

you have very confusing statements. When I mentioned that previous code
wouldn't have the need for the spinlock you replied that spinlock was
faster.

When I asked you about benchmarks you reply saying it's not about the
speed.

Make up your mind dude. What are you trying to achieve ?

> that patch, which was approved and applied BTW.

patches can be reverted if we realise we're better off without
them. Don't get cocky, please.

> Any way, this spinlock should've been there since that patch but I
> couldn't really trigger this problem without a stress test.

which tells me you sent me patches without properly testing. How much
time did it take to trigger this ? How did you trigger this situation ?

> So, this patch fixes a bug in the current implementation.

fixes a regression introduced by you, true. I'm trying to figure out if
we're better off without the original patch; to make a good decision I
need to know if the extra "speed" we get from not allocating requests on
demand are really that important.

So, how much faster did you get and is that extra "speed" really
important ?

-- 
balbi


signature.asc
Description: PGP signature


Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-07 Thread Felipe Balbi

Hi,

Felipe Ferreri Tonello  writes:
> Since f_midi_transmit is called by both ALSA and USB frameworks, it
 can
> potentially cause a race condition between both calls. This is bad
 because the
> way f_midi_transmit is implemented can't handle concurrent calls.
 This is due
> to the fact that the usb request fifo looks for the next element and
 only if
> it has data to process it enqueues the request, otherwise re-uses it.
 If both
> (ALSA and USB) frameworks calls this function at the same time, the
> kfifo_seek() will return the same usb_request, which will cause a
 race
> condition.
>
> To solve this problem a syncronization mechanism is necessary. In
 this case it
> is used a spinlock since f_midi_transmit is also called by
 usb_request->complete
> callback in interrupt context.
>
> On benchmarks realized by me, spinlocks were more efficient then
 scheduling
> the f_midi_transmit tasklet in process context and using a mutex
> to synchronize. Also it performs better then previous
> implementation
 that
> allocated a usb_request for every new transmit made.

 behaves better in what way ? Also, previous implementation would not
 suffer from this concurrency problem, right ?
>>>
>>> The spin lock is faster than allocating usb requests all the time,
>>> even if the udc uses da for it.
>> 
>> did you measure ? Is the extra speed really necessary ? How did you
>> benchmark this ?
>
> Yes I did measure and it was not that significant. This is not about
> speed. There was a bug in that approach that I already explained on

you have very confusing statements. When I mentioned that previous code
wouldn't have the need for the spinlock you replied that spinlock was
faster.

When I asked you about benchmarks you reply saying it's not about the
speed.

Make up your mind dude. What are you trying to achieve ?

> that patch, which was approved and applied BTW.

patches can be reverted if we realise we're better off without
them. Don't get cocky, please.

> Any way, this spinlock should've been there since that patch but I
> couldn't really trigger this problem without a stress test.

which tells me you sent me patches without properly testing. How much
time did it take to trigger this ? How did you trigger this situation ?

> So, this patch fixes a bug in the current implementation.

fixes a regression introduced by you, true. I'm trying to figure out if
we're better off without the original patch; to make a good decision I
need to know if the extra "speed" we get from not allocating requests on
demand are really that important.

So, how much faster did you get and is that extra "speed" really
important ?

-- 
balbi


signature.asc
Description: PGP signature


Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-07 Thread Felipe Ferreri Tonello
Hi Balbi,

On 07/03/16 07:32, Felipe Balbi wrote:
> 
> Hi,
> 
> (please break your lines at 80-characters, have a look at
> Documentation/email-clients.txt if needed)
> 
> Felipe Ferreri Tonello  writes:
>> [ text/plain ]
>> Hi Balbi, 
>>
>> On March 4, 2016 7:20:10 AM GMT+00:00, Felipe Balbi  wrote:
>>>
>>> Hi,
>>>
>>> "Felipe F. Tonello"  writes:
 [ text/plain ]
 Since f_midi_transmit is called by both ALSA and USB frameworks, it
>>> can
 potentially cause a race condition between both calls. This is bad
>>> because the
 way f_midi_transmit is implemented can't handle concurrent calls.
>>> This is due
 to the fact that the usb request fifo looks for the next element and
>>> only if
 it has data to process it enqueues the request, otherwise re-uses it.
>>> If both
 (ALSA and USB) frameworks calls this function at the same time, the
 kfifo_seek() will return the same usb_request, which will cause a
>>> race
 condition.

 To solve this problem a syncronization mechanism is necessary. In
>>> this case it
 is used a spinlock since f_midi_transmit is also called by
>>> usb_request->complete
 callback in interrupt context.

 On benchmarks realized by me, spinlocks were more efficient then
>>> scheduling
 the f_midi_transmit tasklet in process context and using a mutex to
 synchronize. Also it performs better then previous implementation
>>> that
 allocated a usb_request for every new transmit made.
>>>
>>> behaves better in what way ? Also, previous implementation would not
>>> suffer from this concurrency problem, right ?
>>
>> The spin lock is faster than allocating usb requests all the time,
>> even if the udc uses da for it.
> 
> did you measure ? Is the extra speed really necessary ? How did you
> benchmark this ?

Yes I did measure and it was not that significant. This is not about
speed. There was a bug in that approach that I already explained on that
patch, which was approved and applied BTW.

Any way, this spinlock should've been there since that patch but I
couldn't really trigger this problem without a stress test.

So, this patch fixes a bug in the current implementation.

Felipe


0x92698E6A.asc
Description: application/pgp-keys


Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-07 Thread Felipe Ferreri Tonello
Hi Balbi,

On 07/03/16 07:32, Felipe Balbi wrote:
> 
> Hi,
> 
> (please break your lines at 80-characters, have a look at
> Documentation/email-clients.txt if needed)
> 
> Felipe Ferreri Tonello  writes:
>> [ text/plain ]
>> Hi Balbi, 
>>
>> On March 4, 2016 7:20:10 AM GMT+00:00, Felipe Balbi  wrote:
>>>
>>> Hi,
>>>
>>> "Felipe F. Tonello"  writes:
 [ text/plain ]
 Since f_midi_transmit is called by both ALSA and USB frameworks, it
>>> can
 potentially cause a race condition between both calls. This is bad
>>> because the
 way f_midi_transmit is implemented can't handle concurrent calls.
>>> This is due
 to the fact that the usb request fifo looks for the next element and
>>> only if
 it has data to process it enqueues the request, otherwise re-uses it.
>>> If both
 (ALSA and USB) frameworks calls this function at the same time, the
 kfifo_seek() will return the same usb_request, which will cause a
>>> race
 condition.

 To solve this problem a syncronization mechanism is necessary. In
>>> this case it
 is used a spinlock since f_midi_transmit is also called by
>>> usb_request->complete
 callback in interrupt context.

 On benchmarks realized by me, spinlocks were more efficient then
>>> scheduling
 the f_midi_transmit tasklet in process context and using a mutex to
 synchronize. Also it performs better then previous implementation
>>> that
 allocated a usb_request for every new transmit made.
>>>
>>> behaves better in what way ? Also, previous implementation would not
>>> suffer from this concurrency problem, right ?
>>
>> The spin lock is faster than allocating usb requests all the time,
>> even if the udc uses da for it.
> 
> did you measure ? Is the extra speed really necessary ? How did you
> benchmark this ?

Yes I did measure and it was not that significant. This is not about
speed. There was a bug in that approach that I already explained on that
patch, which was approved and applied BTW.

Any way, this spinlock should've been there since that patch but I
couldn't really trigger this problem without a stress test.

So, this patch fixes a bug in the current implementation.

Felipe


0x92698E6A.asc
Description: application/pgp-keys


Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-06 Thread Felipe Balbi

Hi,

(please break your lines at 80-characters, have a look at
Documentation/email-clients.txt if needed)

Felipe Ferreri Tonello  writes:
> [ text/plain ]
> Hi Balbi, 
>
> On March 4, 2016 7:20:10 AM GMT+00:00, Felipe Balbi  wrote:
>>
>>Hi,
>>
>>"Felipe F. Tonello"  writes:
>>> [ text/plain ]
>>> Since f_midi_transmit is called by both ALSA and USB frameworks, it
>>can
>>> potentially cause a race condition between both calls. This is bad
>>because the
>>> way f_midi_transmit is implemented can't handle concurrent calls.
>>This is due
>>> to the fact that the usb request fifo looks for the next element and
>>only if
>>> it has data to process it enqueues the request, otherwise re-uses it.
>>If both
>>> (ALSA and USB) frameworks calls this function at the same time, the
>>> kfifo_seek() will return the same usb_request, which will cause a
>>race
>>> condition.
>>>
>>> To solve this problem a syncronization mechanism is necessary. In
>>this case it
>>> is used a spinlock since f_midi_transmit is also called by
>>usb_request->complete
>>> callback in interrupt context.
>>>
>>> On benchmarks realized by me, spinlocks were more efficient then
>>scheduling
>>> the f_midi_transmit tasklet in process context and using a mutex to
>>> synchronize. Also it performs better then previous implementation
>>that
>>> allocated a usb_request for every new transmit made.
>>
>>behaves better in what way ? Also, previous implementation would not
>>suffer from this concurrency problem, right ?
>
> The spin lock is faster than allocating usb requests all the time,
> even if the udc uses da for it.

did you measure ? Is the extra speed really necessary ? How did you
benchmark this ?

-- 
balbi


signature.asc
Description: PGP signature


Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-06 Thread Felipe Balbi

Hi,

(please break your lines at 80-characters, have a look at
Documentation/email-clients.txt if needed)

Felipe Ferreri Tonello  writes:
> [ text/plain ]
> Hi Balbi, 
>
> On March 4, 2016 7:20:10 AM GMT+00:00, Felipe Balbi  wrote:
>>
>>Hi,
>>
>>"Felipe F. Tonello"  writes:
>>> [ text/plain ]
>>> Since f_midi_transmit is called by both ALSA and USB frameworks, it
>>can
>>> potentially cause a race condition between both calls. This is bad
>>because the
>>> way f_midi_transmit is implemented can't handle concurrent calls.
>>This is due
>>> to the fact that the usb request fifo looks for the next element and
>>only if
>>> it has data to process it enqueues the request, otherwise re-uses it.
>>If both
>>> (ALSA and USB) frameworks calls this function at the same time, the
>>> kfifo_seek() will return the same usb_request, which will cause a
>>race
>>> condition.
>>>
>>> To solve this problem a syncronization mechanism is necessary. In
>>this case it
>>> is used a spinlock since f_midi_transmit is also called by
>>usb_request->complete
>>> callback in interrupt context.
>>>
>>> On benchmarks realized by me, spinlocks were more efficient then
>>scheduling
>>> the f_midi_transmit tasklet in process context and using a mutex to
>>> synchronize. Also it performs better then previous implementation
>>that
>>> allocated a usb_request for every new transmit made.
>>
>>behaves better in what way ? Also, previous implementation would not
>>suffer from this concurrency problem, right ?
>
> The spin lock is faster than allocating usb requests all the time,
> even if the udc uses da for it.

did you measure ? Is the extra speed really necessary ? How did you
benchmark this ?

-- 
balbi


signature.asc
Description: PGP signature


Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-04 Thread Felipe Ferreri Tonello
Hi Balbi, 

On March 4, 2016 7:20:10 AM GMT+00:00, Felipe Balbi  wrote:
>
>Hi,
>
>"Felipe F. Tonello"  writes:
>> [ text/plain ]
>> Since f_midi_transmit is called by both ALSA and USB frameworks, it
>can
>> potentially cause a race condition between both calls. This is bad
>because the
>> way f_midi_transmit is implemented can't handle concurrent calls.
>This is due
>> to the fact that the usb request fifo looks for the next element and
>only if
>> it has data to process it enqueues the request, otherwise re-uses it.
>If both
>> (ALSA and USB) frameworks calls this function at the same time, the
>> kfifo_seek() will return the same usb_request, which will cause a
>race
>> condition.
>>
>> To solve this problem a syncronization mechanism is necessary. In
>this case it
>> is used a spinlock since f_midi_transmit is also called by
>usb_request->complete
>> callback in interrupt context.
>>
>> On benchmarks realized by me, spinlocks were more efficient then
>scheduling
>> the f_midi_transmit tasklet in process context and using a mutex to
>> synchronize. Also it performs better then previous implementation
>that
>> allocated a usb_request for every new transmit made.
>
>behaves better in what way ? Also, previous implementation would not
>suffer from this concurrency problem, right ?

The spin lock is faster than allocating usb requests all the time, even if the 
udc uses da for it. 

That's true it wasn't necessary to put a spin lock in the gadget driver because 
the udc driver does it when allocating a new request. 

Felipe 

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-04 Thread Felipe Ferreri Tonello
Hi Balbi, 

On March 4, 2016 7:20:10 AM GMT+00:00, Felipe Balbi  wrote:
>
>Hi,
>
>"Felipe F. Tonello"  writes:
>> [ text/plain ]
>> Since f_midi_transmit is called by both ALSA and USB frameworks, it
>can
>> potentially cause a race condition between both calls. This is bad
>because the
>> way f_midi_transmit is implemented can't handle concurrent calls.
>This is due
>> to the fact that the usb request fifo looks for the next element and
>only if
>> it has data to process it enqueues the request, otherwise re-uses it.
>If both
>> (ALSA and USB) frameworks calls this function at the same time, the
>> kfifo_seek() will return the same usb_request, which will cause a
>race
>> condition.
>>
>> To solve this problem a syncronization mechanism is necessary. In
>this case it
>> is used a spinlock since f_midi_transmit is also called by
>usb_request->complete
>> callback in interrupt context.
>>
>> On benchmarks realized by me, spinlocks were more efficient then
>scheduling
>> the f_midi_transmit tasklet in process context and using a mutex to
>> synchronize. Also it performs better then previous implementation
>that
>> allocated a usb_request for every new transmit made.
>
>behaves better in what way ? Also, previous implementation would not
>suffer from this concurrency problem, right ?

The spin lock is faster than allocating usb requests all the time, even if the 
udc uses da for it. 

That's true it wasn't necessary to put a spin lock in the gadget driver because 
the udc driver does it when allocating a new request. 

Felipe 

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-03 Thread Felipe Balbi

Hi,

"Felipe F. Tonello"  writes:
> [ text/plain ]
> Since f_midi_transmit is called by both ALSA and USB frameworks, it can
> potentially cause a race condition between both calls. This is bad because the
> way f_midi_transmit is implemented can't handle concurrent calls. This is due
> to the fact that the usb request fifo looks for the next element and only if
> it has data to process it enqueues the request, otherwise re-uses it. If both
> (ALSA and USB) frameworks calls this function at the same time, the
> kfifo_seek() will return the same usb_request, which will cause a race
> condition.
>
> To solve this problem a syncronization mechanism is necessary. In this case it
> is used a spinlock since f_midi_transmit is also called by 
> usb_request->complete
> callback in interrupt context.
>
> On benchmarks realized by me, spinlocks were more efficient then scheduling
> the f_midi_transmit tasklet in process context and using a mutex to
> synchronize. Also it performs better then previous implementation that
> allocated a usb_request for every new transmit made.

behaves better in what way ? Also, previous implementation would not
suffer from this concurrency problem, right ?

-- 
balbi


signature.asc
Description: PGP signature


Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-03 Thread Felipe Balbi

Hi,

"Felipe F. Tonello"  writes:
> [ text/plain ]
> Since f_midi_transmit is called by both ALSA and USB frameworks, it can
> potentially cause a race condition between both calls. This is bad because the
> way f_midi_transmit is implemented can't handle concurrent calls. This is due
> to the fact that the usb request fifo looks for the next element and only if
> it has data to process it enqueues the request, otherwise re-uses it. If both
> (ALSA and USB) frameworks calls this function at the same time, the
> kfifo_seek() will return the same usb_request, which will cause a race
> condition.
>
> To solve this problem a syncronization mechanism is necessary. In this case it
> is used a spinlock since f_midi_transmit is also called by 
> usb_request->complete
> callback in interrupt context.
>
> On benchmarks realized by me, spinlocks were more efficient then scheduling
> the f_midi_transmit tasklet in process context and using a mutex to
> synchronize. Also it performs better then previous implementation that
> allocated a usb_request for every new transmit made.

behaves better in what way ? Also, previous implementation would not
suffer from this concurrency problem, right ?

-- 
balbi


signature.asc
Description: PGP signature