On Mon, Mar 1, 2021 at 11:20 AM Jan Kiszka via Xenomai <xenomai@xenomai.org>
wrote:

> On 25.02.21 15:18, Philippe Gerum wrote:
> >
> > Jan Kiszka <jan.kis...@siemens.com> writes:
> >
> >> On 25.02.21 14:54, Philippe Gerum wrote:
> >>>
> >>> Jan Kiszka <jan.kis...@siemens.com> writes:
> >>>
> >>>> On 24.02.21 12:35, Henning Schild via Xenomai wrote:
> >>>>> On Wed, 24 Feb 2021 11:24:55 +0100,
> >>>>> Henning Schild via Xenomai <xenomai@xenomai.org> wrote:
> >>>>>
> >>>>>> On Wed, 10 Feb 2021 12:08:43 +0100,
> >>>>>> Jan Kiszka via Xenomai <xenomai@xenomai.org> wrote:
> >>>>>>
> >>>>>>> On 10.02.21 11:07, Bezdeka, Florian (T RDA IOT SES-DE) wrote:
> >>>>>>>> On Wed, 2021-02-10 at 09:15 +0100, Jan Kiszka via Xenomai wrote:
> >>>>>>>>
> >>>>>>>>> On 10.02.21 07:22, xenomai--- via Xenomai wrote:
> >>>>>>>>>> Download URL:
> >>>>>>>>>>
> >>>>>>>>>> https://xenomai.org/downloads/ipipe/v4.x/arm64/ipipe-core-4.19.165-cip41-arm64-09.patch
> >>>>>>>>>>
> >>>>>>>>>> Repository: https://git.xenomai.org/ipipe-arm64
> >>>>>>>>>> Release tag: ipipe-core-4.19.165-cip41-arm64-09
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Hmm, now we have the 5.4-arm64 issue also on 4.19:
> >>>>>>>>> https://gitlab.denx.de/Xenomai/xenomai-images/-/jobs/219984
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> I don't know much about the things going on here, but found this
> >>>>>>>> line in the log. Maybe a starting point...
> >>>>>>>>
> >>>>>>>> 2021-02-10T07:51:47 setsched.c:120, assertion failed: stats.msw == msw
> >>>>>>>
> >>>>>>> Exactly, that is causing the overall failure. And it was first seen
> >>>>>>> with the newly added 5.4 kernel.
> >>>>>>
> >>>>>> Seeing the same on amd64 when testing on qemu; real HW is fine.
> >>>>>>
> >>>>>> Managed to bisect it down to 4.19.147-cip (good) 4.19.150-cip (bad)
> >>>>>>
> >>>>>> Which also means that ipipe-core-4.19.152-cip37-x86-15 is affected.
> >>>>>>
> >>>>>> https://gitlab.denx.de/Xenomai/xenomai-images/-/jobs/200646
> >>>>>> did not find it, so maybe our config differs
> >>>>
> >>>> Already compared yours against the one in xenomai-images? That would be
> >>>> useful.
> >>>>
> >>>>>
> >>>>> Digging further, I found 0f0b6099c45ff3e06d2487816cf1ff30d21835f6 is likely
> >>>>> causing the problem.
> >>>>>
> >>>>> ipipe-core-4.19.152-cip37-x86-15 <- bad
> >>>>> revert 2b294ac325c7ce3f36854b74d0d1d89dc1d1d8b8
> >>>>> revert 8579a0440381353e0a71dd6a4d4371be8457eac4 <- bad
> >>>>> revert 0f0b6099c45ff3e06d2487816cf1ff30d <- good
> >>>>>
> >>>>> I think here Jan or Philippe should take over.
> >>>>
> >>>> Thanks for bisecting, this is helpful!
> >>>>
> >>>> Philippe, any immediate idea why all that is failing now?
> >>>
> >>> Something may be going wrong with MAP_SHARED mappings wrt commit_vma()
> >>> in Dovetail. I'm adding this to my debug queue.
> >>>
> >>
> >> This is still I-pipe, not a Dovetail-related issue.
> >
> > This I-pipe release mimics what Dovetail does wrt mm pinning.
> >
>
> Any news on this from your side?
>
> Florian took a trace from the system where this was observed on x86. It
> seems to confirm that we have an unexpected minor fault here:
>
> smokey-568   [000]    74.233945: cobalt_head_sysentry: syscall=thread_getschedparam_ex
> smokey-568   [000]    74.233950: cobalt_pthread_getschedparam: pth=0x7f2a18ebf700 policy=fifo param={ priority=3 }
> smokey-568   [000]    74.233950: cobalt_head_sysexit:  result=0
> smokey-568   [000]    74.233952: cobalt_head_sysentry: syscall=thread_getstat
> smokey-568   [000]    74.233952: cobalt_pthread_stat:  pid=568
> smokey-568   [000]    74.233953: cobalt_head_sysexit:  result=0
> smokey-568   [000]    74.233962: cobalt_thread_fault:  ip=0x7f2a19cd6b46 type=e
> smokey-568   [000]    74.233962: cobalt_shadow_gorelax: reason=fault
> smokey-568   [000]    74.233963: cobalt_lostage_request: request=ffffffffbcd85992 pid=568 comm=smokey
> ...
> smokey-568   [000]    74.234005: cobalt_shadow_relaxed: state=0x480c0 info=0x0
> ...
> smokey-568   [000]    74.235027: cobalt_head_sysentry: syscall=ftrace_puts
> smokey-568   [000]    74.235028: cobalt_root_sysentry: syscall=ftrace_puts
> smokey-568   [000]    74.235028: print:                CoBaLt_ftrace_puts: Second assertion failed
> (that's around line 120 in smokey/setsched/setsched.c)
>
> Now, before I dig into the code you pointed to, I just wanted to sync.
>
> Jan
>
> --
> Siemens AG, T RDA IOT
> Corporate Competence Center Embedded Linux
>
>
On my end, I spent the weekend trying to reproduce the issue on ARM64. I've
been running under qemu with both the default defconfig and the defconfig
from the CI machines. To save time, I run only the setsched test in smokey by
passing in its test id. So far I haven't been able to reproduce it, so I'm
now running the test in a loop in the hope of triggering the failure.
When you guys reproduce the issue, do all the tests get run?
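
For reference, the loop I have in mind looks roughly like the sketch below. It
is only an illustration, not what CI runs: the helper name run_until_failure
and the max_runs limit are made up here, the smokey arguments are left to the
caller, and it assumes the command exits non-zero when the test fails.

```python
import subprocess
import sys


def run_until_failure(cmd, max_runs=1000):
    """Run cmd repeatedly; return the 1-based run that failed, or None.

    cmd is an argument list, e.g. the smokey command line for one test
    (the exact arguments are up to the caller and not assumed here).
    """
    for i in range(1, max_runs + 1):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # Keep the output of the failing run for later inspection.
            sys.stderr.write(result.stdout)
            sys.stderr.write(result.stderr)
            return i
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the command to repeat.
    failed_at = run_until_failure(sys.argv[1:])
    if failed_at is None:
        print("no failure seen")
    else:
        print(f"failed on run {failed_at}")
```

Stopping at the first failing run keeps the output of that run close to the
point of failure, which should make it easier to line up with a trace.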

-Greg
