There have been numerous fixes for Cortex-A SMP since 7.31. You might
want to upgrade.
On 7/29/2020 10:19 AM, Mansfield, Ryan wrote:
Hi,
We are trying to enable SMP for a dual-core ARM Cortex-R8. We are using a
kernel based on NuttX-7.31, with added proprietary support for our embedded
Cortex-R8 SoC, which has separate instruction and data caches. We modeled the
R-core SMP implementation on the A-core implementation and the IMX6 as they
stood as recently as February 2020.
Our current method of stress testing is to enable our features, which consist
of some threads (around 6 or so) and low-priority work queue work, and then
run ostest many times.
We would like to run with both I-cache and D-cache enabled, but we spent some
time trying to enable coherent D-caches with the SCU with no luck, and noticed
some stability issues called out in the Sabre6 readme. So we settled on I-cache
only. However, even with just I-cache enabled, we have been experiencing issues
booting into the operating system. When we enable more features to start on
boot, both cores seem to get caught in the ldrex/strex assembly loop of the
up_testset.S function. We cannot reproduce this with I-cache disabled. We added
some DSB instructions before and after the ldrex/strex pair, and we don't see
that issue anymore. Is there anything you are aware of that would cause this
issue?
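For reference, the behavior we expect from the test-and-set can be sketched in
portable C11. This is only a model of the semantics, not the NuttX
up_testset.S code: the names testset_sketch, spin_lock_sketch,
spin_unlock_sketch and spinlock_sketch_t are hypothetical, and the values
SP_UNLOCKED = 0 and SP_LOCKED = 1 are assumed here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed lock values for this sketch only. */
#define SP_UNLOCKED 0
#define SP_LOCKED   1

/* Hypothetical lock type; not the NuttX spinlock_t. */
typedef atomic_uint_fast8_t spinlock_sketch_t;

/* Atomically store SP_LOCKED and return the previous value.  The caller
 * owns the lock only if the previous value was SP_UNLOCKED.  The acquire
 * ordering keeps later accesses from moving ahead of the exchange.
 */
static inline uint_fast8_t testset_sketch(spinlock_sketch_t *lock)
{
  return atomic_exchange_explicit(lock, SP_LOCKED, memory_order_acquire);
}

/* Spin until the test-and-set reports that the lock was free. */
static inline void spin_lock_sketch(spinlock_sketch_t *lock)
{
  while (testset_sketch(lock) == SP_LOCKED)
    {
    }
}

/* Release the lock.  The release ordering makes writes done while holding
 * the lock visible before the lock is seen as free.
 */
static inline void spin_unlock_sketch(spinlock_sketch_t *lock)
{
  assert(atomic_load(lock) == SP_LOCKED);
  atomic_store_explicit(lock, SP_UNLOCKED, memory_order_release);
}
```

If the assembly loop with our added DSB instructions no longer behaves like
this model on both cores, that would point at our up_testset changes rather
than at the generic OS code.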
However, with the DSB instructions in place, we start hitting some system
asserts related to critical sections and sched_lock/sched_unlock calls during
stress testing. For example, we hit an enter_critical_section assert where the
irqlock on the tcb is 0, while g_cpu_irqset indicates that the calling CPU
holds the irq lock. We saw this recently when calling nxsem_post from
ramlog_read.
    DEBUGASSERT((g_cpu_irqset & (1 << cpu)) == 0);
We have also hit system asserts while running ostest, during the signest test
in the waiter_action function. During the sched_unlock after we increment the
nest_count var, we hit the assert where the lockcount of the tcb is 1, as
expected, but the locks are not held by the CPU in question.
    DEBUGASSERT(g_cpu_schedlock == SP_LOCKED &&
                (g_cpu_lockset & (1 << cpu)) != 0);
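The relationship that these two asserts appear to check can be written as a
small standalone model. This is a sketch of our understanding only: struct
tcb_sketch and both helper functions are hypothetical names, and the model
assumes that a CPU's bit is set in the irq or lock set exactly when the task
running on that CPU has a non-zero count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two per-task counters discussed above. */
struct tcb_sketch
{
  int16_t irqcount;   /* Critical section nesting count (0 = not held) */
  int16_t lockcount;  /* sched_lock nesting count (0 = not locked) */
};

/* Assumed invariant: the CPU's bit in the irq set is set if and only if
 * the task running on that CPU is inside a critical section.
 */
static inline bool irq_state_consistent(const struct tcb_sketch *tcb,
                                        uint32_t cpu_irqset, int cpu)
{
  bool cpu_holds = (cpu_irqset & (UINT32_C(1) << cpu)) != 0;
  return (tcb->irqcount > 0) == cpu_holds;
}

/* Assumed invariant: the CPU's bit in the lock set is set if and only if
 * the task running on that CPU has the scheduler locked.
 */
static inline bool sched_state_consistent(const struct tcb_sketch *tcb,
                                          uint32_t cpu_lockset, int cpu)
{
  bool cpu_holds = (cpu_lockset & (UINT32_C(1) << cpu)) != 0;
  return (tcb->lockcount > 0) == cpu_holds;
}
```

In these terms, the first failure is a count of 0 with the CPU's bit set, and
the second is a count of 1 with the CPU's bit clear. In both cases the
per-task count and the global set have gone out of step.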
We think the issue might be one of two things: either the changes to up_testset
are somehow ruining the atomic access to the irq/sched_lock and corresponding
set bits via the set_locks, or we pulled in architecture-specific changes
that require some generic OS code to be updated as well.
Two questions:
1) What conditions have you seen that trigger those specific
asserts that we should look out for?
2) Do any of the current working SMP implementations support
I-cache?
Thank you,
Ryan