I don't have a test case that I can post or give away yet. I may put one together later.
Yes. All 4 tasks are on that same conditional wait. The condition is a response from a block device. The test case is hammering the RAM disk. I wrote some of the flash drivers, but as far as I know the RAM disk was used as-is, so I haven't looked at that code. That conditional wait does not have a timeout, so it just pends forever.

Thanks. I'll try the RFS trace when I have time. Unfortunately, this is late in the development cycle, so I don't have ready access to the article under test, and loading it with non-production code takes time. I'm trying to reproduce the failure in a non-production environment. For now, all I have is shell commands and manually decoding dumped memory.

Somebody more knowledgeable about the inner workings of the RTEMS kernel expressed concern a while ago that heavy use of the RAM disk could basically outrun the kernel worker thread. I noted it but didn't ask about the mechanics of how or why that could happen. Are requests to the worker threads in a lossy queue? Is it possible the request is getting dropped?

On Mon, Oct 7, 2019 at 11:30 PM Chris Johns <chr...@rtems.org> wrote:
> On 8/10/19 12:53 pm, Mathew Benson wrote:
> > I'm using RTEMS 5 on a LEON3. I'm troubleshooting a failure condition that
> > occurs when stress-testing reading and writing to and from RAM disk. RAM
> > disk to RAM disk. When the condition is tripped, it appears that I have 4
> > tasks that are pending on conditions that just never happen.
>
> Do you have a test case?
>
> > The task command shows:
> >
> >  ID       NAME SHED PRI STATE MODES  EVENTS WAITINFO
> > ------------------------------------------------------------------------------
> > 0a01000c TSKA UPD  135 MTX   P:T:nA NONE   RFS
> > 0a01001f TSKB UPD  135 CV    P:T:nA NONE   bdbuf access
> > 0a010020 TSKC UPD  150 MTX   P:T:nA NONE   RFS
> > 0a010032 TSKD UPD  245 MTX   P:T:nA NONE   RFS
>
> It looks like TSKA, TSKC and TSKD are waiting for the RFS lock and TSKB is
> blocked in a bdbuf access. I wonder why that is blocked?
>
> The RFS holds its lock over the bdbuf calls.
>
> > None of my tasks appear to have failed. Nobody is pending on anything
> > noticeable except the 4 above. The conditional wait is a single shared
> > resource, so any attempt to access the file system after this happens
> > results in yet another forever-pended task.
> >
> > Digging into the source code, it appears that the kernel is waiting for a
> > specific response from a block device but just didn't get what it's
> > expecting. The next thing is to determine which block device the kernel is
> > pending on, what the expected response is, and what the block device
> > actually did. Can anybody shed some light on this or recommend some
> > debugging steps? I'm trying to exhaust all I can do before I start
> > manually decoding machine code.
>
> The RFS has trace support you can access via `rtems/rfs/rtems-rfs-trace.h`.
> You can set the trace mask in your code, or you can call
> `rtems_rfs_trace_shell_command()` with suitable arguments or hook it to an
> existing shell. There is a buffer trace flag that shows the release calls
> to bdbuf:
>
>   RTEMS_RFS_TRACE_BUFFER_RELEASE
>
> There is no trace call for get or read. Maybe add a get/read trace as well.
>
> The RAM disk also has trace in the code which can be enabled by editing
> the file.
>
> Chris

--
Mathew Benson
CEO | Chief Engineer
Windhover Labs, LLC
832-640-4018
www.windhoverlabs.com
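[Editor's note: a minimal sketch of enabling the RFS trace mask Chris describes. This assumes `rtems_rfs_trace_set_mask()` is declared in `rtems/rfs/rtems-rfs-trace.h` as in the RTEMS 5 tree; tracing only produces output when the RFS is built with `RTEMS_RFS_TRACE` enabled, so treat this as a configuration fragment, not a drop-in.]

```c
/* Sketch: enable RFS buffer-release tracing from application code.
 * Only effective when the RFS is compiled with RTEMS_RFS_TRACE;
 * otherwise the call is a no-op stub. */
#include <rtems/rfs/rtems-rfs-trace.h>

static void enable_rfs_buffer_trace(void)
{
  /* RTEMS_RFS_TRACE_BUFFER_RELEASE is the flag mentioned above; OR in
   * other RTEMS_RFS_TRACE_* bits from the header to widen the trace. */
  rtems_rfs_trace_set_mask(RTEMS_RFS_TRACE_BUFFER_RELEASE);
}
```

Alternatively, `rtems_rfs_trace_shell_command()` can be hooked into an existing shell so the mask can be changed at run time without rebuilding.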
_______________________________________________
users mailing list
users@rtems.org
http://lists.rtems.org/mailman/listinfo/users