Re: [beagleboard] Re: remoteproc write to PRU over rpmsg device blocks even when set non-blocking

2020-12-08 Thread Andrew P. Lentvorski
Bumping this.  Again.

I'd like to *NOT* have to keep supporting the fix for this on the user side 
in the 5.X series when this really needs to get fixed on the kernel side.  
I've filed the bug reports.  They're just sitting.

In reality, the rpmsg system doesn't have the hooks to even support 
a user-side fix, as I can't query the size and depth of the 
buffers.  This needs to get fixed in the PRU rpmsg kernel subsystem.

Thanks.

On Tuesday, June 30, 2020 at 3:43:01 AM UTC-7 Andrew P. Lentvorski wrote:

> So, we're still back at the original question of "Where do I file this bug 
> so that it gets tracked?"
>
> I see some recent work on rpmsg bugs at 
> https://github.com/beagleboard/linux/issues, so I'll file a bug there.  
> But, is there somewhere else I should file it?
>
> Thanks.
>

-- 
For more options, visit http://beagleboard.org/discuss
--- 
You received this message because you are subscribed to the Google Groups 
"BeagleBoard" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to beagleboard+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/beagleboard/8c22cc26-7c2e-4fc9-a239-db3573efc1ean%40googlegroups.com.


Re: [beagleboard] Re: remoteproc write to PRU over rpmsg device blocks even when set non-blocking

2020-06-30 Thread Andrew P. Lentvorski
So, we're still back at the original question of "Where do I file this bug 
so that it gets tracked?"

I see some recent work on rpmsg bugs at 
https://github.com/beagleboard/linux/issues, so I'll file a bug there.  
But, is there somewhere else I should file it?

Thanks.



Re: [beagleboard] Re: remoteproc write to PRU over rpmsg device blocks even when set non-blocking

2020-06-23 Thread Andrew P. Lentvorski
Sure.  Right now, I just keep track of how many messages are in flight and 
I don't allow it to queue too many.
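
For illustration, the in-flight bookkeeping described above might look roughly like the sketch below. The names (inflight_gate, gate_try_acquire, gate_release) are hypothetical, the limit has to be guessed because the buffer depth cannot be queried, and the sketch assumes a single gatekeeper thread and exactly one reply per request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: counts messages written to the PRU but not yet answered. */
struct inflight_gate {
    size_t limit;      /* assumed safe number of outstanding messages */
    size_t in_flight;  /* messages written but not yet acknowledged   */
};

static void gate_init(struct inflight_gate *g, size_t limit)
{
    assert(g != NULL);
    g->limit = limit;
    g->in_flight = 0;
}

/* Call before write(); returns false when the request must be held back. */
static bool gate_try_acquire(struct inflight_gate *g)
{
    assert(g != NULL);
    if (g->in_flight >= g->limit)
        return false;
    g->in_flight++;
    return true;
}

/* Call once for every reply read back from the PRU. */
static void gate_release(struct inflight_gate *g)
{
    assert(g != NULL);
    if (g->in_flight > 0)
        g->in_flight--;
}
```

The gatekeeper thread would call gate_try_acquire() before each write to the rpmsg device and gate_release() for each reply it reads; when the acquire fails, it holds the request back instead of writing.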

That's useful once you know what the bug is.  Fortunately, I hit this bug 
before I had two threads (one receiving USB and one receiving ethernet) 
which would have made hunting it down quite painful.  So, at least now I 
know that I *must* have a single thread acting as a gatekeeper on top of 
the rpmsg system.

If, however, you try to use a library on top of this bug that actually 
expects the O_NONBLOCK behavior to work, you will have a long debugging 
chain.

What *originally* tripped all of this was that I tried to use Rust and 
Tokio, which failed mysteriously.  After far too much fruitless debugging, 
I switched down to Rust and mio, which also failed weirdly.

So, I switched down to C, poll, and O_NONBLOCK, which then gave the 
incorrect blocking behavior and the ERESTARTSYS.  After *that*, I could 
actually pinpoint the incorrect behavior as belonging to pru_rpmsg and as 
being due to a full queue with incorrect blocking semantics.
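
A minimal sketch of that kind of C test follows, assuming an already-open file descriptor. The helper names (set_nonblocking, try_send) are hypothetical; fcntl() and write() are the standard POSIX calls. On a driver with correct O_NONBLOCK semantics, a full queue should show up as SEND_WOULD_BLOCK right away, rather than as a stall of several seconds followed by errno 512.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

enum send_status { SEND_OK, SEND_WOULD_BLOCK, SEND_ERROR };

/* Put an already-open descriptor into non-blocking mode. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* One write attempt. A full queue is expected to be reported as EAGAIN
 * (or ENOMEM, as suggested in this thread), not as a long stall. */
static enum send_status try_send(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    ssize_t n = write(fd, buf, len);
    if (n == (ssize_t)len)
        return SEND_OK;
    if (n == -1 && (errno == EAGAIN || errno == ENOMEM))
        return SEND_WOULD_BLOCK;
    return SEND_ERROR;   /* short write or any other errno */
}
```

Running the same helpers against an ordinary pipe gives the expected behavior (writes succeed until the pipe is full, then fail immediately), which is a quick way to confirm that the test program itself is not at fault.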

Getting to that point, however, was neither pleasant nor straightforward.



Re: [beagleboard] Re: remoteproc write to PRU over rpmsg device blocks even when set non-blocking

2020-06-23 Thread 'Mark Lazarewicz' via BeagleBoard
You could increase the vring buffers or check for full and retry, depending on 
how critical the timing is.
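
As a rough sketch of the "check for full and retry" idea on the user side: the function name send_with_retry, the attempt count, and the delay are all assumptions, and the loop only helps if the driver actually returns EAGAIN or ENOMEM promptly when the buffers are full.

```c
#define _POSIX_C_SOURCE 200809L  /* for nanosleep() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>   /* O_NONBLOCK, for callers that open the device */
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Retry a non-blocking write a bounded number of times, sleeping
 * delay_ns (must be below one second) between attempts.
 * Returns 0 on success, -1 with errno set otherwise. */
static int send_with_retry(int fd, const void *buf, size_t len,
                           int max_attempts, long delay_ns)
{
    assert(buf != NULL && max_attempts > 0);
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        ssize_t n = write(fd, buf, len);
        if (n == (ssize_t)len)
            return 0;
        if (n >= 0) {            /* short write: treated as an error here */
            errno = EIO;
            return -1;
        }
        if (errno != EAGAIN && errno != ENOMEM)
            return -1;           /* not a "queue full" condition */
        struct timespec ts = { .tv_sec = 0, .tv_nsec = delay_ns };
        nanosleep(&ts, NULL);
    }
    errno = EAGAIN;              /* still full after all attempts */
    return -1;
}
```

How many attempts and how long a delay are acceptable depends on the timing budget; a tight budget leaves little room for retries.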

On Tue, Jun 23, 2020 at 2:04 AM, Andrew P. Lentvorski wrote:

> Urk, sorry I didn't quite get the implications of this statement:
>
>> The kfifo is used only on the receive path because of the asynchronous
>> callbacks. The Tx-path is synchronous, the copy is attempted directly on
>> the vring buffers
>
> That means that kfifo doesn't exist on send so the only available solution
> appears to be calling rpmsg_trysend when in O_NONBLOCK mode.
>
> That will hit the full vring buffers and should bounce back immediately with
> ENOMEM.
>
> Thanks.




Re: [beagleboard] Re: remoteproc write to PRU over rpmsg device blocks even when set non-blocking

2020-06-23 Thread Andrew P. Lentvorski
Urk, sorry I didn't quite get the implications of this statement:

> The kfifo is used only on the receive path because of the asynchronous 
> callbacks. The Tx-path is synchronous, the copy is attempted directly on
> the vring buffers
>

That means that kfifo doesn't exist on send so the only available solution 
appears to be calling rpmsg_trysend when in O_NONBLOCK mode.

That will hit the full vring buffers and should bounce back immediately 
with ENOMEM.

Thanks.



Re: [beagleboard] Re: remoteproc write to PRU over rpmsg device blocks even when set non-blocking

2020-06-22 Thread Andrew P. Lentvorski
Hi, folks,

The issue is that requests cause the rpmsg channels to the PRU to fill.  
That is actually fine: the PRU in this case is servicing slow requests, and 
the rpmsg channel being full should exert backpressure.

The problem is that the rpmsg system *HANGS* for several seconds before timing 
out and then throws a fairly bizarre error.  Quoting my original message:

> Except that my overrun writes to "/dev/rpmsg_pru30" *still* block for 
> several seconds (very bad) and then terminate with an Error 512 (huh?).

This is not good behavior from all manner of perspectives:

1) Why does the write time out *at all* when not O_NONBLOCK?  That's 
certainly not expected behavior.  There is no reason why the PRU might not 
take a couple seconds to service a request.  If that's a problem, you 
either set a timeout manually (usually only valid for file descriptors of 
sockets) or you put the file descriptor into non-blocking mode.  (It 
appears that this is the fault of the rpmsg driver, which will time out 
after 15 seconds and then return ERESTARTSYS, i.e. the Error 512 above.)

2) Why does the write hang *at all* when in O_NONBLOCK?  That's also not 
expected behavior.  If the queue is full, an attempt to write to it should 
return *IMMEDIATELY* with something like ENOMEM/EAGAIN.  (This appears to 
be the fault of the rpmsg_pru driver).

The file I was looking at is here:
https://github.com/beagleboard/linux/blob/4.19/drivers/rpmsg/rpmsg_pru.c

Three options seem to present themselves:

1) Use rpmsg_trysend when O_NONBLOCK is set  (see rpmsg_eptdev_write_iter 
in rpmsg_char.c line 243 for an example)

2) Check the queue for space and return immediately with ENOMEM.  (Saves 
the call to rpmsg_trysend and all its indirections).

3) Do both.  (It's possible that trysend covers other cases than just kfifo 
full--but the kfifo check may be a useful optimization and catch 99%+ or 
all the cases quickly).
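
The first option can be illustrated with a small userspace model. This is not kernel code and not the rpmsg_pru driver: model_vring, model_trysend, and model_write are made-up names, and the only behavior borrowed from the discussion is that a try-send fails immediately when every buffer is busy. Returning -EAGAIN to a non-blocking caller is one possible choice; ENOMEM is the other candidate mentioned above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the transmit buffers: a fixed-size pool. */
struct model_vring {
    size_t num_bufs;   /* total buffers available for sending      */
    size_t busy;       /* buffers the "PRU" has not consumed yet   */
};

/* Models a try-send: fails at once with -ENOMEM when no buffer is free. */
static int model_trysend(struct model_vring *vr)
{
    assert(vr != NULL);
    if (vr->busy >= vr->num_bufs)
        return -ENOMEM;
    vr->busy++;
    return 0;
}

/* Models the decision a write handler could make. The blocking branch is
 * not simulated; it only reports that the caller would have to wait. */
static int model_write(struct model_vring *vr, bool nonblock, bool *would_wait)
{
    int ret = model_trysend(vr);
    *would_wait = false;
    if (ret == -ENOMEM) {
        if (nonblock)
            return -EAGAIN;   /* option 1: bounce back immediately */
        *would_wait = true;   /* blocking mode: a real send would wait here */
    }
    return ret;
}
```

With two buffers, the third non-blocking write in this model returns -EAGAIN without waiting, which is the behavior being asked for.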

Thanks.



Re: [beagleboard] Re: remoteproc write to PRU over rpmsg device blocks even when set non-blocking

2020-06-22 Thread 'Mark Lazarewicz' via BeagleBoard
Hi Suman,
Here is the original thread so you have the background info and time to respond if 
Andrew has more to add. 

https://groups.google.com/forum/m/?utm_medium=email_source=footer#!msg/beagleboard/6Ch7Do4Hm7k/CAcSRi1pBQAJ

Regards Mark 

On Mon, Jun 22, 2020 at 10:16 AM, 'Suman Anna' via BeagleBoard wrote:

If it is for support from a TI SDK, please post a query to E2E.

Can someone clarify meanwhile exactly what the issue is? The kfifo is 
used only on the receive path because of the asynchronous callbacks. The 
Tx-path is synchronous, the copy is attempted directly on the vring 
buffers, and you have a number of vring buffers (dictated by firmware), 
and if all of them are busy (implies PRU has either stopped processing 
or is overwhelmed), then you get a failure.

regards
Suman

On 6/22/20 8:11 AM, Jason Kridner wrote:
> Which repo has the code that is causing problems?
> 
> I took a quick look at 
> https://git.ti.com/cgit/pru-software-support-package/pru-software-support-package/tree/lib/src/rpmsg_lib/pru_rpmsg.c
>  
> and it seems to be structured a fair bit differently. If the same issue 
> had been there, I'd recommend posting to e2e.ti.com .
> 
> Switching over to the kernel, I see the function you mention:
> https://github.com/beagleboard/linux/blob/4.14/drivers/rpmsg/rpmsg_pru.c#L106-L129
>  
> 
> 
> The driver isn't upstream yet: 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/rpmsg
> 
> The post to a public list seems to be here:
> * https://patchwork.kernel.org/patch/10795751/
> 
> The development tree seems to be here:
> * https://git.ti.com/cgit/rpmsg/rpmsg/
> 
> The code seems the same in the latest development branch:
> * https://git.ti.com/cgit/rpmsg/rpmsg/tree/drivers/rpmsg/rpmsg_pru.c#n108
> 
> Er, I guess that is an example of doing it right and the issue is here?
> * https://git.ti.com/cgit/rpmsg/rpmsg/tree/drivers/rpmsg/rpmsg_pru.c#n142
> 
> Since it isn't upstream, I'd think an e2e post might be OK, but it might 
> be more productive to reply to the latest post on linux-omap:
> * 
> https://lore.kernel.org/linux-omap/e97f7bfc-a3c2-92a9-953e-572d9438d...@ti.com/
> 
> Copy Jason Reeder, Anthony F. Davis and Suman Anna. Not sure why it has 
> been so long between revision posts.
> 
> Personally, I don't see any harm in modifying the _write code with a 
> fifo check on O_NONBLOCK.
> 
> On Mon, Jun 22, 2020 at 2:01 AM Andrew P. Lentvorski wrote:
> 
>    Nobody knows where I should file this bug?
> 
>    On Saturday, June 6, 2020 at 6:34:26 PM UTC-7, Andrew P. Lentvorski
>    wrote:
> 
>        It appears that the problem is in rpmsg_pru.c.
> 
>        rpmsg_pru_read has the following code:
> 
>        if (kfifo_is_empty(&prudev->msg_fifo) &&
>            (filp->f_flags & O_NONBLOCK))
>                return -EAGAIN;
> 
> 
>        rpmsg_pru_write presumably needs a similar piece of code with
>        kfifo_is_full() or it needs to look for O_NONBLOCK and then use
>        rpmsg_trysend instead of rpmsg_send.
> 
>        Unfortunately, I've got nowhere near the Linux kernel
>        programming chops to debate the implications of that.
> 
>        Presumably, I need to file a bug somewhere?
> 
>        Thanks.
> 
> 
> 
> -- 
> https://beagleboard.org/about - a 501c3 non-profit educating around open 
> hardware computing



Re: [beagleboard] Re: remoteproc write to PRU over rpmsg device blocks even when set non-blocking

2020-06-22 Thread 'Suman Anna' via BeagleBoard

If it is for support from a TI SDK, please post a query to E2E.

Can someone clarify meanwhile exactly what the issue is? The kfifo is 
used only on the receive path because of the asynchronous callbacks. The 
Tx-path is synchronous, the copy is attempted directly on the vring 
buffers, and you have a number of vring buffers (dictated by firmware), 
and if all of them are busy (implies PRU has either stopped processing 
or is overwhelmed), then you get a failure.


regards
Suman

On 6/22/20 8:11 AM, Jason Kridner wrote:

Which repo has the code that is causing problems?

I took a quick look at 
https://git.ti.com/cgit/pru-software-support-package/pru-software-support-package/tree/lib/src/rpmsg_lib/pru_rpmsg.c 
and it seems to be structured a fair bit differently. If the same issue 
had been there, I'd recommend posting to e2e.ti.com .


Switching over to the kernel, I see the function you mention:
https://github.com/beagleboard/linux/blob/4.14/drivers/rpmsg/rpmsg_pru.c#L106-L129 



The driver isn't upstream yet: 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/rpmsg


The post to a public list seems to be here:
* https://patchwork.kernel.org/patch/10795751/

The development tree seems to be here:
* https://git.ti.com/cgit/rpmsg/rpmsg/

The code seems the same in the latest development branch:
* https://git.ti.com/cgit/rpmsg/rpmsg/tree/drivers/rpmsg/rpmsg_pru.c#n108

Er, I guess that is an example of doing it right and the issue is here?
* https://git.ti.com/cgit/rpmsg/rpmsg/tree/drivers/rpmsg/rpmsg_pru.c#n142

Since it isn't upstream, I'd think an e2e post might be OK, but it might 
be more productive to reply to the latest post on linux-omap:
* 
https://lore.kernel.org/linux-omap/e97f7bfc-a3c2-92a9-953e-572d9438d...@ti.com/


Copy Jason Reeder, Anthony F. Davis and Suman Anna. Not sure why it has 
been so long between revision posts.


Personally, I don't see any harm in modifying the _write code with a 
fifo check on O_NONBLOCK.


On Mon, Jun 22, 2020 at 2:01 AM Andrew P. Lentvorski wrote:


Nobody knows where I should file this bug?

On Saturday, June 6, 2020 at 6:34:26 PM UTC-7, Andrew P. Lentvorski
wrote:

It appears that the problem is in rpmsg_pru.c.

rpmsg_pru_read has the following code:

if (kfifo_is_empty(&prudev->msg_fifo) &&
    (filp->f_flags & O_NONBLOCK))
        return -EAGAIN;


rpmsg_pru_write presumably needs a similar piece of code with
kfifo_is_full() or it needs to look for O_NONBLOCK and then use
rpmsg_trysend instead of rpmsg_send.

Unfortunately, I've got nowhere near the Linux kernel
programming chops to debate the implications of that.

Presumably, I need to file a bug somewhere?

Thanks.




--
https://beagleboard.org/about - a 501c3 non-profit educating around open 
hardware computing




Re: [beagleboard] Re: remoteproc write to PRU over rpmsg device blocks even when set non-blocking

2020-06-22 Thread Jason Kridner
Which repo has the code that is causing problems?

I took a quick look at
https://git.ti.com/cgit/pru-software-support-package/pru-software-support-package/tree/lib/src/rpmsg_lib/pru_rpmsg.c
and it seems to be structured a fair bit differently. If the same issue had
been there, I'd recommend posting to e2e.ti.com.

Switching over to the kernel, I see the function you mention:
https://github.com/beagleboard/linux/blob/4.14/drivers/rpmsg/rpmsg_pru.c#L106-L129

The driver isn't upstream yet:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/rpmsg

The post to a public list seems to be here:
* https://patchwork.kernel.org/patch/10795751/

The development tree seems to be here:
* https://git.ti.com/cgit/rpmsg/rpmsg/

The code seems the same in the latest development branch:
* https://git.ti.com/cgit/rpmsg/rpmsg/tree/drivers/rpmsg/rpmsg_pru.c#n108

Er, I guess that is an example of doing it right and the issue is here?
* https://git.ti.com/cgit/rpmsg/rpmsg/tree/drivers/rpmsg/rpmsg_pru.c#n142

Since it isn't upstream, I'd think an e2e post might be OK, but it might be
more productive to reply to the latest post on linux-omap:
*
https://lore.kernel.org/linux-omap/e97f7bfc-a3c2-92a9-953e-572d9438d...@ti.com/

Copy Jason Reeder, Anthony F. Davis and Suman Anna. Not sure why it has
been so long between revision posts.

Personally, I don't see any harm in modifying the _write code with a fifo
check on O_NONBLOCK.

On Mon, Jun 22, 2020 at 2:01 AM Andrew P. Lentvorski 
wrote:

> Nobody knows where I should file this bug?
>
> On Saturday, June 6, 2020 at 6:34:26 PM UTC-7, Andrew P. Lentvorski wrote:
>>
>> It appears that the problem is in rpmsg_pru.c.
>>
>> rpmsg_pru_read has the following code:
>>
>> if (kfifo_is_empty(&prudev->msg_fifo) &&
>>     (filp->f_flags & O_NONBLOCK))
>>         return -EAGAIN;
>>
>>
>>
>> rpmsg_pru_write presumably needs a similar piece of code with
>> kfifo_is_full() or it needs to look for O_NONBLOCK and then use
>> rpmsg_trysend instead of rpmsg_send.
>>
>> Unfortunately, I've got nowhere near the Linux kernel programming chops
>> to debate the implications of that.
>>
>> Presumably, I need to file a bug somewhere?
>>
>> Thanks.
>>


-- 
https://beagleboard.org/about - a 501c3 non-profit educating around open
hardware computing
