[beagleboard] Re: GPIO from PRU via OCP on a Beaglebone Black - Timing Issues

2021-04-01 Thread Remy Porter
I've been playing with building my own kernel, and I *did* get 
cross-compilation working, even with the bindeb-pkg target, which is cool. 
The weird side effect of trying and disabling various bits and modules is 
that… the PRU cycle counter stops working. I have no idea how or why that 
happens, but if I use my own kernel and ensure that all the appropriate 
modules are loaded: bupkis. I suspect I have the same problem with the FPP 
kernel (I threw that on, tried running my software, saw it didn't work, and 
just went to work making my own without digging into the root cause, 
chalking it up as an incompatibility).

So yeah, playing with the cpufrequtils is probably going to be the next 
thing to try. Thanks again!
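
For reference, the two settings from the message quoted below can also be 
inspected and applied through sysfs. This is only a sketch: it assumes the 
standard Linux cpufreq/cpuidle sysfs layout, it has not been checked against 
the TI or FPP kernels, and the helper names are made up for illustration.

```python
from pathlib import Path

# Assumption: standard Linux sysfs layout; the BBB has a single core (cpu0).
CPU_ROOT = Path("/sys/devices/system/cpu")


def planned_writes(cpu_root=CPU_ROOT, cpu="cpu0", disable_states=(1,)):
    """Return (path, value) pairs for the performance governor and idle states.

    disable_states lists cpuidle state indices to disable, in the spirit of
    `cpupower idle-set -d 1`. States that do not exist are skipped.
    """
    base = Path(cpu_root) / cpu
    writes = [(base / "cpufreq" / "scaling_governor", "performance")]
    for index in disable_states:
        state_dir = base / "cpuidle" / f"state{index}"
        if state_dir.is_dir():
            writes.append((state_dir / "disable", "1"))
    return writes


def apply_writes(writes):
    """Write each value to its file; needs root on a real board."""
    for path, value in writes:
        path.write_text(value)


if __name__ == "__main__":
    # Dry run: only show what would be written.
    for path, value in planned_writes():
        print(f"{path} <- {value}")
```

Calling apply_writes(planned_writes()) as root at boot would make the 
settings stick until the next reboot; sysfs changes are not persistent.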

On Friday, March 26, 2021 at 10:04:52 AM UTC-4 d...@kulp.com wrote:

> One more thing you can try with the TI kernel if your kernel build has 
> issues:
>
> The biggest problem is the CPU going in and out of idle states. Thus, 
> you can install the cpufrequtils and linux-cpupower packages and then at 
> boot run:
>
> cpufreq-set -g performance
> cpupower idle-set -d 1
>
> (and maybe cpupower idle-set -d 0 )
>
> The first will lock the CPU frequency at 1 GHz. The second will disable the 
> very costly idle states in the processor. I believe the M3 processor in 
> the L4_WAKEUP is in charge of the power management stuff, which includes the 
> CPU idle settings. Flipping the CPU out of idle seems to take a long time 
> and blocks the bus while it waits. Disabling that state helped a lot.
> Alternatively, you can install the "bone" kernel, which doesn't have the 
> idle driver (or at least didn't early last year, not sure anymore). 
> Anyway, those had a huge impact, but it still wasn't 100%, which is why we 
> decided to compile our own kernel, completely disabling everything on the 
> L4_WAKEUP.
>
>
> Dan

[beagleboard] Re: GPIO from PRU via OCP on a Beaglebone Black - Timing Issues

2021-03-25 Thread Remy Porter
> Personal plug: I'd be happy to sell capes that don't use gpio0. 
> https://kulplights.com
Heh, we've already got all the boards for this project. Maybe we'll revisit 
that design in future projects, though. 

Your software is definitely doing a *lot* more than ours, and certainly 
much more than we need- we just listen for RGB data on a UDP socket. We, 
uh… don't really treat them like lights, but instead as very large pixels 
in a screen. All the mapping/direction/orientation stuff is handled in the 
render stack, which we build custom for pretty much every project. Last one 
was a Unity app that was part of a kiosk connected to a gigantic 
chandelier. Our current project is *kinda* like an architectural-scale 
video wall with a C++/OpenCV app driving the pixels.

I might give the FPP image a shot though, if building my own custom kernel 
doesn't help. Your guidance on that was *super* helpful, though your FPP 
kernel did *not* get along with our software (LEDs just didn't work- in 
lieu of diagnosing that, I just opted to compile my own, which is going on… 
right now).  Thanks a bunch!

On Tuesday, March 23, 2021 at 4:17:37 PM UTC-4 d...@kulp.com wrote:

> The debs for the kernel are at:
> https://github.com/FalconChristmas/fpp-linux-kernel/tree/master/debs
> so you should be able to update to our kernel fairly easily. If you need 
> to start building your own kernel, I'd suggest grabbing a Beaglebone AI and 
> building on that. It's WAY faster for kernel building. :) You can 
> cross-compile from a Debian x86_64, but I was never able to get that to 
> actually produce proper .deb files that could be installed cleanly on the 
> BBB, so I pretty much just use the AI for kernel builds. (It's actually the 
> ONLY thing I use my AI for.)
>
> FPP provides a complete UI frontend for configuring the pixel strings and 
> such, and we do allow the various 4-channel types. It does a lot of 
> other things as well. That said, most of these things are done on the ARM 
> side and not the PRU. Part of trying to figure out the latency issue was 
> seeing what makes sense to do on the ARM side and what works best on the PRU 
> side. If you actually wanted to try FPP and see if FPP's optimized PRU 
> code and kernel combination would work, you could use the FPP 4.6.1 image 
> on an SD card (see the release assets on GitHub). You would just need to 
> create a small JSON file in /opt/fpp/capes/bbb/strings to describe the 
> pinout of your cape (use any of them in that directory as a starting point) 
> and it should then "just work". You would need to configure e1.31/artnet 
> input universes on the Channel Input tab, put FPP in "bridge" mode, and 
> then it should work like a normal light controller and accept pixel data. 
> (Or use the DDP protocol, which doesn't require configuring the input.)
>
> Personal plug: I'd be happy to sell capes that don't use gpio0. 
> https://kulplights.com

[beagleboard] Re: GPIO from PRU via OCP on a Beaglebone Black - Timing Issues

2021-03-23 Thread Remy Porter
That is *super* helpful. Thanks a bunch. The pin layout we're using on our 
boards uses a lot of GPIO0 already, so it's definitely too late to change 
that. The way we're banging things out, all the GPIOs are being hit at 
the same time, so the latency does appear to hit our strings. I'll try 
giving your kernel a shot, though- that'll definitely help. And maybe I'll 
move the GPIO0 bits over to the other PRU. I hate to have to do that, but 
if it's what needs done, it's what needs done.
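
A rough sketch of what that split could look like on the host side: group 
the output pins by GPIO bank, build one bitmask per bank, and hand GPIO0 to 
one PRU and the remaining banks to the other. The pin list and function 
names are hypothetical and do not reflect a real board layout or the 
ThrowingBagels code.

```python
from collections import defaultdict

# Hypothetical pin list as (gpio_bank, bit) pairs; not a real cape layout.
EXAMPLE_PINS = [(0, 2), (0, 3), (0, 30), (1, 12), (1, 13), (2, 1), (3, 19)]


def bank_masks(pins):
    """Collapse (bank, bit) pairs into one 32-bit mask per GPIO bank."""
    masks = defaultdict(int)
    for bank, bit in pins:
        if not (0 <= bank <= 3 and 0 <= bit <= 31):
            raise ValueError(f"bad pin: GPIO{bank}_{bit}")
        masks[bank] |= 1 << bit
    return dict(masks)


def split_by_pru(pins, slow_bank=0):
    """Give the latency-prone bank to one PRU and every other bank to the other."""
    masks = bank_masks(pins)
    slow = {b: m for b, m in masks.items() if b == slow_bank}
    fast = {b: m for b, m in masks.items() if b != slow_bank}
    return {"pru0": fast, "pru1": slow}


if __name__ == "__main__":
    for pru, masks in split_by_pru(EXAMPLE_PINS).items():
        for bank, mask in sorted(masks.items()):
            print(f"{pru}: GPIO{bank} mask 0x{mask:08x}")
```

With the masks separated like this, a stall on GPIO0 would only delay the 
PRU that owns GPIO0, not the strings on the other banks.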

Also, off topic, but poking at FPP: SK6812s support the WS281x protocol, so 
you mostly already support them, but if you poke around at ThrowingBagels' 
approach a little, it's not a big push to get 32-bit support for RGBW LEDs 
(we use a lot of SK6812s with the warm-white LED, and it looks *great*). 
We've been running a custom hack of LEDScape for *ages*, so ThrowingBagels 
is sorta a consolidation of the features we use, stripped down to the bare 
minimum.
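
To illustrate why the 32-bit case is a small step: it is one more byte per 
pixel on the wire. A sketch, assuming the common G, R, B, W byte order for 
RGBW parts and MSB-first bit order as in WS281x; check the datasheet for 
your LEDs. The function names are made up and are not taken from 
ThrowingBagels or FPP.

```python
def pack_pixel(r, g, b, w=None):
    """Pack one pixel into a wire word: 24 bits for RGB, 32 bits for RGBW.

    Assumption: G, R, B(, W) byte order, most significant byte sent first.
    Returns (word, number_of_bits).
    """
    channels = [g, r, b] if w is None else [g, r, b, w]
    word = 0
    for value in channels:
        if not 0 <= value <= 255:
            raise ValueError("channel values must be 0..255")
        word = (word << 8) | value
    return word, 8 * len(channels)


def wire_bits(word, nbits):
    """Yield the bits of a packed pixel MSB-first, the order sent to the strip."""
    for shift in range(nbits - 1, -1, -1):
        yield (word >> shift) & 1


if __name__ == "__main__":
    word, nbits = pack_pixel(0x11, 0x22, 0x33, 0x44)
    print(f"{nbits} bits: 0x{word:08x}")
```

The bit-banging loop itself only needs to know whether to shift out 24 or 
32 bits per pixel; the timing of each bit is unchanged.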

On Tuesday, March 23, 2021 at 3:17:25 PM UTC-4 d...@kulp.com wrote:

> Wow... You should have contacted us before doing a lot of that. I 
> completely rewrote most of the LEDScape code over the last couple of years to 
> optimize things in an attempt to reduce some of the timing 
> issues. Porting to clpru and rproc was already part of that. All my 
> updates are in FPP ( https://github.com/FalconChristmas/fpp ).
>
> Anyway, to answer your question, the issue is specific to GPIO0. GPIO1-3 
> are not affected by the massive latency issues. Thus, the best option is 
> to choose GPIO pins on GPIO1-3 and not use the GPIO0 pins. That wasn't an 
> option for me as we needed to output 48 strings. In the FPP code, if 
> nothing is using the second PRU (the second PRU could be used for DMX or 
> pixelnet output), we divide the work and have one PRU do the GPIO1-3 and 
> the other do the GPIO0. If something IS using the other PRU, and the 
> strings are short enough, then we split it on the one PRU and do GPIO1-3 
> first, then do the GPIO0s. For the most part, that keeps the GPIO0 
> problems from affecting all the strings, so the random flashes would really 
> just be on the GPIO0 strings. In the case where the second PRU is used 
> for something else AND the strings are longer, then we do have to do all 4 
> GPIOs at once and all of them can be affected, so it's definitely not a 
> perfect solution.
>
> To minimize the issues (but not entirely eliminate them), I do now build a 
> custom 4.19 kernel that disables most of the devices on the L4_WAKEUP 
> interconnect. Any power management and frequency scaling stuff causes huge 
> issues with GPIO0 latencies, so those are the most important things to 
> disable. I think my notes are at:
>
> https://github.com/FalconChristmas/fpp-linux-kernel/tree/master/bbb-kernel
>
> Not sure if that helps enough for you.  Feel free to ask more questions.  
> :)
> Dan

[beagleboard] GPIO from PRU via OCP on a Beaglebone Black - Timing Issues

2021-03-23 Thread Remy Porter
For those that may remember the old LEDScape library, I've been working on 
an updated version of that library, which focuses on strips instead of 
matrices, uses rproc instead of UIO PRUSS, and updates the PRU assembly to 
clpru from pasm. 

Link: https://github.com/iontank/ThrowingBagels

The key thing you need to know is that we hook up 32 addressable LED strips 
and then use the PRU to bitbang out RGB(W) data. We use the PRU because our 
timings need to be pretty precise- a few hundred nanoseconds for each key 
phase of the operation. 

Here's the important issue: we need to address all 32 GPIO pins from the 
PRU, but not all of them are bound to the r30 register. So we need to go 
through the OCP port. This is exactly how LEDScape worked, and continues to 
work, just fine. We've never been able to get LEDScape working under 4.x 
kernels, mostly because of UIO problems (which is what kicked off this 
whole "move to rproc" thing).

My upgrade, ThrowingBagels, uses basically the same core logic on the PRU, 
just ported to clpru assembly and running on a 4.19 kernel. And seemingly 
randomly, the timings hitch, which causes the LEDs to flicker to the wrong 
color. Phases of our bitbang operation will sometimes take almost twice as 
long as they should- a sleep that should have been 600ns ends up taking 
1100ns. The only operation happening that doesn't have guaranteed timings 
is writing to the GPIO pins via OCP; everything else we do happens entirely 
in PRU DRAM. Since this appears to happen randomly, the hitch *must* be 
coming from that OCP step, I assume.
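
To put numbers on that: the PRU core is clocked at 200 MHz, so one cycle is 
5 ns and a 600ns phase is 120 cycles. A small host-side sketch for turning 
captured cycle counts into nanoseconds and flagging phases that overran; 
the sample values and the 10% tolerance are made up for illustration.

```python
PRU_CLOCK_HZ = 200_000_000  # AM335x PRU core clock: 5 ns per cycle


def cycles_to_ns(cycles, clock_hz=PRU_CLOCK_HZ):
    """Convert a PRU cycle count to nanoseconds."""
    return cycles * 1_000_000_000 / clock_hz


def find_overruns(samples, expected_ns, tolerance=0.10):
    """Return (index, measured_ns) for samples longer than expected_ns plus tolerance."""
    limit = expected_ns * (1 + tolerance)
    return [
        (i, cycles_to_ns(c))
        for i, c in enumerate(samples)
        if cycles_to_ns(c) > limit
    ]


if __name__ == "__main__":
    # Hypothetical cycle counts for a phase that should take 600 ns.
    samples = [120, 121, 120, 220, 119]
    for index, ns in find_overruns(samples, expected_ns=600):
        print(f"sample {index}: {ns:.0f} ns (expected 600 ns)")
```

A 1100ns phase shows up as roughly 220 cycles, so a hitch of this size is 
about 100 cycles of extra stall on a single OCP access.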

In support of that hypothesis, if I upgrade from the kernel that ships with 
the "AM3358 Debian 10.3 2020-04-06 4GB eMMC IoT Flasher" image to the most 
recent 4.19 kernel, the problem becomes a lot less frequent. We're blasting 
this data out at 30fps, like video, and when I cut down on the number of 
services running and update the kernel, I can get the glitches down from 
happening every few seconds to happening every few tens of seconds.

My suspicion, and I can't quite prove anything, is that on 4.19 there's 
something about the kernel or configuration that sometimes adds latency to 
OCP writes, which wasn't there on 3.16. So my key question is: how do I 
improve the timing consistency when the PRU uses OCP to write to the GPIOs? I 
understand that it will never have *guaranteed* timing, but sometimes it's 
hitting me with latencies of up to 500ns. Anything I can do to minimize 
that latency would be a huge help.

TL;DR: how can I make PRU->OCP->GPIO more consistent in its timing under a 
4.19 kernel?
