Re: sideeffect of rcu_nocbs on periodic Alchemy task
On 17.09.19 15:57, jk.beh...@web.de wrote:
> Hi Jan,
>
> Wow, that was a really quick response. Does "testing the latest rt
> release" mean that I also have to use a new Linux kernel version?
> Currently I am running linux 4.14.71-rt44. The most recent rt-patch for
> linux 4.14 seems to be "patch-4.14.137-rt66.patch". Would you recommend
> using that?

I would recommend v5.2.14-rt7 to cross-check if the issue is
version-agnostic.

Jan

> Thanks for your comments
> Jochen
>
>>> Hello,
>>>
>>> I am running an Alchemy application (Xenomai 3.0.9) over Linux
>>> preempt-rt on a dual-core Atom (E3930) and noticed a side effect of
>>> the Linux boot parameter "rcu_nocbs".
>>
>> To clarify: You are using Xenomai in Mercury mode (userspace libs only)
>> on a stock preempt-rt kernel, right? Then a kernel splat is best
>> reported to the preempt-rt community. But first make sure to have also
>> tested the latest release to clarify whether it is a stable -rt issue
>> or a general one.
>>
>> Jan
>>
>>> Whenever the kernel boot parameter "rcu_nocbs=1" is set, I get the
>>> following Linux kernel warning when the Alchemy task (TestTask) is
>>> terminated.
>>> [kernel warning log snipped — quoted in full in the original post
>>> below]
>>>
>>> The issue can be reproduced with the following simple program.
Re: sideeffect of rcu_nocbs on periodic Alchemy task
On 17.09.19 15:16, JK.Behnke--- via Xenomai wrote:
> Hello,
>
> I am running an Alchemy application (Xenomai 3.0.9) over Linux
> preempt-rt on a dual-core Atom (E3930) and noticed a side effect of the
> Linux boot parameter "rcu_nocbs".

To clarify: You are using Xenomai in Mercury mode (userspace libs only) on
a stock preempt-rt kernel, right? Then a kernel splat is best reported to
the preempt-rt community. But first make sure to have also tested the
latest release to clarify whether it is a stable -rt issue or a general
one.

Jan

> Whenever the kernel boot parameter "rcu_nocbs=1" is set, I get the
> following Linux kernel warning when the Alchemy task (TestTask) is
> terminated.
>
> [kernel warning log and test program snipped — quoted in full in the
> original post below]
sideeffect of rcu_nocbs on periodic Alchemy task
Hello,

I am running an Alchemy application (Xenomai 3.0.9) over Linux preempt-rt
on a dual-core Atom (E3930) and noticed a side effect of the Linux boot
parameter "rcu_nocbs".

Whenever the kernel boot parameter "rcu_nocbs=1" is set, I get the
following Linux kernel warning when the Alchemy task (TestTask) is
terminated. (Note: several hex values below were eaten by the archive.)

Sep 17 13:02:43 localhost kernel: [ 97.342398] [ cut here ]
Sep 17 13:02:43 localhost kernel: [ 97.342412] WARNING: CPU: 0 PID: 530 at /home/behnkjoc/prc2020/poky/build-tca5-32/tmp/work-shared/congatec-tca5-32/kernel-source/kernel/rcu/tree_plugin.h:310 rcu_note_context_switch+0x2a0/0x4d0
Sep 17 13:02:43 localhost kernel: [ 97.342414] Modules linked in: ec_generic(O) ec_master(O) spidev nls_iso8859_1 cmdlinepart intel_spi_platform intel_spi spi_nor mtd spi_pxa2xx_platform joydev intel_rapl intel_powerclamp coretemp crc32_pclmul snd_hda_codec_hdmi pcbc aesni_intel aes_i586 crypto_simd cryptd intel_cstate intel_rapl_perf i2c_i801 lpc_ich pcspkr idma64 virt_dma intel_lpss_pci intel_lpss input_leds mac_hid i915 video snd_hda_intel drm_kms_helper snd_hda_codec snd_hda_core snd_hwdep snd_pcm hid_multitouch drm mei_me snd_timer fb_sys_fops syscopyarea sysfillrect snd mei sysimgblt soundcore shpchp sch_fq_codel nfsd autofs4
Sep 17 13:02:43 localhost kernel: [ 97.342470] CPU: 0 PID: 530 Comm: TestTask Tainted: G O 4.14.71-rt44 #1
Sep 17 13:02:43 localhost kernel: [ 97.342472] task: f3347300 task.stack: f32ea000
Sep 17 13:02:43 localhost kernel: [ 97.342476] EIP: rcu_note_context_switch+0x2a0/0x4d0
Sep 17 13:02:43 localhost kernel: [ 97.342478] EFLAGS: 00010002 CPU: 0
Sep 17 13:02:43 localhost kernel: [ 97.342480] EAX: 0001 EBX: ECX: 0001 EDX:
Sep 17 13:02:43 localhost kernel: [ 97.342482] ESI: EDI: f3347300 EBP: f32ebeec ESP: f32ebed0
Sep 17 13:02:43 localhost kernel: [ 97.342485] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Sep 17 13:02:43 localhost kernel: [ 97.342487] CR0: 80050033 CR2: 06338490 CR3: 332d4000 CR4: 003406d0
Sep 17 13:02:43 localhost kernel: [ 97.342489] Call Trace:
Sep 17 13:02:43 localhost kernel: [ 97.342501] ? unpin_current_cpu+0x53/0x80
Sep 17 13:02:43 localhost kernel: [ 97.342507] __schedule+0x85/0x700
Sep 17 13:02:43 localhost kernel: [ 97.342511] ? _raw_spin_unlock_irqrestore+0x17/0x50
Sep 17 13:02:43 localhost kernel: [ 97.342514] ? rt_spin_unlock+0x24/0x50
Sep 17 13:02:43 localhost kernel: [ 97.342517] schedule+0x41/0xe0
Sep 17 13:02:43 localhost kernel: [ 97.342521] hrtimer_wait_for_timer+0x5d/0x90
Sep 17 13:02:43 localhost kernel: [ 97.342525] ? wait_woken+0x70/0x70
Sep 17 13:02:43 localhost kernel: [ 97.342530] timer_wait_for_callback+0x40/0x50
Sep 17 13:02:43 localhost kernel: [ 97.342533] SyS_timer_delete+0x6b/0x140
Sep 17 13:02:43 localhost kernel: [ 97.342538] do_int80_syscall_32+0x6b/0xf0
Sep 17 13:02:43 localhost kernel: [ 97.342542] entry_INT80_32+0x31/0x31
Sep 17 13:02:43 localhost kernel: [ 97.342545] EIP: 0xb22c68d0
Sep 17 13:02:43 localhost kernel: [ 97.342546] EFLAGS: 0282 CPU: 0
Sep 17 13:02:43 localhost kernel: [ 97.342548] EAX: ffda EBX: 0002 ECX: EDX: b1d00480
Sep 17 13:02:43 localhost kernel: [ 97.342550] ESI: b1d005e0 EDI: b22cc000 EBP: b1fc7318 ESP: b1fc72b0
Sep 17 13:02:43 localhost kernel: [ 97.342552] DS: 007b ES: 007b FS: GS: 0033 SS: 007b
Sep 17 13:02:43 localhost kernel: [ 97.342556] Code: c3 83 e8 01 39 c2 0f 85 27 02 00 00 83 f9 0f 8d 97 f8 02 00 00 0f 87 78 01 00 00 8b 04 8d f8 52 51 c3 e9 a4 9b 83 00 8d 74 26 00 <0f> 0b 80 bf f4 02 00 00 00 0f 85 a6 fd ff ff e9 1c ff ff ff 8d
Sep 17 13:02:43 localhost kernel: [ 97.342600] ---[ end trace 0002 ]---

The issue can be reproduced with the following simple program (header
names and some constants were lost in the archive; the listing is
truncated):

///
// Test application
///
#include
#include  // usleep
#include

#define CPU_AFFINITY_DEFAULT 0

#define MAIN_TASK_NAME "MainTask"
#define MAIN_TASK_PRIO 0
#define MAIN_TASK_MODE 0

#define TESTTASK_NAME "TestTask"
#define TESTTASK_PRIO 10
#define TESTTASK_MODE 0
#define TESTTASK_STACKSIZE 0x10l // 1 MB
#define TESTTASK_PERIOD_NS (5* 100)

typedef struct {
    RT_TASK TaskDescr;
    int nEndTask;
    int Period_ns;
} TESTTASK_CONTEXT;

RT_TASK g_MainTask;
int g_nRun = 1;

void TestTask(void *pData)
{
    TESTTASK_CONTEXT *pCtx = (TESTTASK_CONTEXT *)pData;
    unsigned long overrun;
    int nErr = 0;

    printf("TestTask starting..
Re: [PATCH] net/tcp: Account for connection teardown handshake
On 17.09.19 14:01, Sebastian Smolorz wrote:
> When closing a TCP connection a handshake procedure is executed between
> the peers. The close routine of the rttcp driver did not participate in
> detecting the end of this handshake but rather waited one second inside
> a close call unconditionally. Especially when peers are directly
> connected this is a waste of time which can hurt a lot in some
> situations.
>
> This patch replaces the msleep(1000) call with a timed wait on a
> semaphore which gets signalled when the termination handshake is
> complete.
>
> Signed-off-by: Sebastian Smolorz
> ---
> [most hunks snipped — see the full patch below]
>
> @@ -147,6 +147,9 @@ struct tcp_socket {
>  	struct rtskb_queue retransmit_queue;
>  	struct timerwheel_timer timer;
>
> +	struct semaphore close_sem;

Sounds rather like a job for struct completion.

> +	rtdm_nrtsig_t close_sig;
> +
> @@ -1380,7 +1399,10 @@ static void rt_tcp_close(struct rtdm_fd *fd)
>  	/* result is ignored */
>
>  	/* Give the peer some time to reply to our FIN. */
> -	msleep(1000);
> +	ret = down_timeout(&ts->close_sem, msecs_to_jiffies(1000));
> +	if (ret)
> +		rtdm_printk("rttcp: waiting for FIN-ACK handshake returned %d\n",
> +			    ret);

Do we consider a timeout to be worth a kernel log entry? This could also be
caused by a lost connection, right? And should we make the waiting time
configurable?

> } else if (ts->tcp_state == TCP_CLOSE_WAIT) {
[PATCH] net/tcp: Account for connection teardown handshake
When closing a TCP connection a handshake procedure is executed between
the peers. The close routine of the rttcp driver did not participate in
detecting the end of this handshake but rather waited one second inside
a close call unconditionally. Especially when peers are directly
connected this is a waste of time which can hurt a lot in some
situations.

This patch replaces the msleep(1000) call with a timed wait on a
semaphore which gets signalled when the termination handshake is
complete.

Signed-off-by: Sebastian Smolorz
---
 kernel/drivers/net/stack/ipv4/tcp/tcp.c | 29 +++-
 1 file changed, 27 insertions(+), 2 deletions(-)

(Note: the diff below was flattened by the archive; indentation of
context lines is reconstructed and the patch is truncated at the end.)

diff --git a/kernel/drivers/net/stack/ipv4/tcp/tcp.c b/kernel/drivers/net/stack/ipv4/tcp/tcp.c
index 54bafa80f..81089afd1 100644
--- a/kernel/drivers/net/stack/ipv4/tcp/tcp.c
+++ b/kernel/drivers/net/stack/ipv4/tcp/tcp.c
@@ -147,6 +147,9 @@ struct tcp_socket {
 	struct rtskb_queue retransmit_queue;
 	struct timerwheel_timer timer;
 
+	struct semaphore close_sem;
+	rtdm_nrtsig_t close_sig;
+
 #ifdef CONFIG_XENO_DRIVERS_NET_RTIPV4_TCP_ERROR_INJECTION
 	unsigned int packet_counter;
 	unsigned int error_rate;
@@ -1042,6 +1045,7 @@ static void rt_tcp_rcv(struct rtskb *skb)
 			rt_tcp_send(ts, TCP_FLAG_ACK);
 			/* data receiving is not possible anymore */
 			rtdm_sem_destroy(&ts->sock.pending_sem);
+			rtdm_nrtsig_pend(&ts->close_sig);
 			goto feed;
 		} else if (ts->tcp_state == TCP_FIN_WAIT1) {
 			/* Send ACK */
@@ -1105,6 +1109,7 @@ static void rt_tcp_rcv(struct rtskb *skb)
 			ts->tcp_state = TCP_CLOSE;
 			rtdm_lock_put_irqrestore(&ts->socket_lock, context);
 			/* socket destruction will be done on close() */
+			rtdm_nrtsig_pend(&ts->close_sig);
 			goto drop;
 		} else if (ts->tcp_state == TCP_FIN_WAIT1) {
 			ts->tcp_state = TCP_FIN_WAIT2;
@@ -1119,6 +1124,7 @@ static void rt_tcp_rcv(struct rtskb *skb)
 			ts->tcp_state = TCP_TIME_WAIT;
 			rtdm_lock_put_irqrestore(&ts->socket_lock, context);
 			/* socket destruction will be done on close() */
+			rtdm_nrtsig_pend(&ts->close_sig);
 			goto feed;
 		}
 	}
@@ -1190,6 +1196,11 @@ static int rt_tcp_window_send(struct tcp_socket *ts, u32 data_len, u8 *data_ptr)
 	return ret;
 }
 
+static void rt_tcp_close_signal_handler(rtdm_nrtsig_t *nrtsig, void *arg)
+{
+	up((struct semaphore *)arg);
+}
+
 static int rt_tcp_socket_create(struct tcp_socket *ts)
 {
 	rtdm_lockctx_t context;
@@ -1226,6 +1237,10 @@ static int rt_tcp_socket_create(struct tcp_socket *ts)
 	timerwheel_init_timer(&ts->timer, rt_tcp_retransmit_handler, ts);
 	rtskb_queue_init(&ts->retransmit_queue);
 
+	sema_init(&ts->close_sem, 0);
+	rtdm_nrtsig_init(&ts->close_sig, rt_tcp_close_signal_handler,
+			 &ts->close_sem);
+
 #ifdef CONFIG_XENO_DRIVERS_NET_RTIPV4_TCP_ERROR_INJECTION
 	ts->packet_counter = counter_start;
 	ts->error_rate = error_rate;
@@ -1237,6 +1252,7 @@ static int rt_tcp_socket_create(struct tcp_socket *ts)
 	/* enforce maximum number of TCP sockets */
 	if (free_ports == 0) {
 		rtdm_lock_put_irqrestore(&tcp_socket_base_lock, context);
+		rtdm_nrtsig_destroy(&ts->close_sig);
 		return -EAGAIN;
 	}
 	free_ports--;
@@ -1338,6 +1354,8 @@ static void rt_tcp_socket_destruct(struct tcp_socket *ts)
 
 	rtdm_event_destroy(&ts->conn_evt);
 
+	rtdm_nrtsig_destroy(&ts->close_sig);
+
 	/* cleanup already collected fragments */
 	rt_ip_frag_invalidate_socket(sock);
 
@@ -1362,6 +1380,7 @@ static void rt_tcp_close(struct rtdm_fd *fd)
 	struct rt_tcp_dispatched_packet_send_cmd send_cmd;
 	rtdm_lockctx_t context;
 	int signal = 0;
+	int ret;
 
 	rtdm_lock_get_irqsave(&ts->socket_lock, context);
 
@@ -1380,7 +1399,10 @@ static void rt_tcp_close(struct rtdm_fd *fd)
 		/* result is ignored */
 
 		/* Give the peer some time to reply to our FIN. */
-		msleep(1000);
+		ret = down_timeout(&ts->close_sem, msecs_to_jiffies(1000));
+		if (ret)
+			rtdm_printk("rttcp: waiting for FIN-ACK handshake returned %d\n",
+				    ret);
 	} else if (ts->tcp_state == TCP_CLOSE_WAIT) {
 		/* Send FIN in CLOSE_WAIT */
 		send_cmd.ts = ts;
@@ -1394,7 +1416,10 @@ static void rt_tcp_close(struct rtdm_fd *fd)
 		/* result is ignored */
 
 		/* Give the peer some time to reply to our FIN.
 		 */
-		msleep(1
Re: Static build of rtnet
On 17.09.19 10:29, Lange Norbert wrote:
>> -----Original Message-----
>> From: Jan Kiszka
>> Sent: Dienstag, 17. September 2019 09:42
>> To: Lange Norbert; Xenomai (xenomai@xenomai.org)
>> Subject: Re: Static build of rtnet
>>
>> On 16.09.19 11:13, Lange Norbert via Xenomai wrote:
>>> Hello,
>>>
>>> I haven't tested this in a while, but building rtnet static will crash
>>> the kernel when this module initializes. With the various fixes and
>>> cleanups in master/next (like rtdm_available) that might be worth a
>>> look?
>>>
>>> I would hope to build a static kernel one day, and so far there are 2
>>> roadblocks:
>>>
>>> - rtnet (+ rtpacket) crashing when built statically
>>>
>>> - symbol name clashes with linux + rt drivers enabled (I could work on
>>>   fixing that for rt_igb at least)
>>
>> Do you mean removing the "depends on m"?
>
> Yes, ideally I would use a kernel without loadable modules, so kernel
> upgrades/changes don't affect the rootfs (ideally read-only apart from a
> few places).
>
>> Possibly, that moves the initialization order in a way that causes
>> troubles. I also just added another case that exploits the module [1],
>> but that would be solvable. More critical is understanding the crashes.
>
> I had a quick test removing the "depends on m" about a year ago. I
> brought this up now because it might fit with the recent cleanups.

I don't think recent cleanups have changed the situation. Someone has to
sit down, analyze the crashes, resolve them, and propose all changes for
upstream. Also, startup scripts need to be adjusted to accept non-module
RTnet.

Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
RE: Static build of rtnet
> -----Original Message-----
> From: Jan Kiszka
> Sent: Dienstag, 17. September 2019 09:42
> To: Lange Norbert; Xenomai (xenomai@xenomai.org)
> Subject: Re: Static build of rtnet
>
> On 16.09.19 11:13, Lange Norbert via Xenomai wrote:
>> Hello,
>>
>> I haven't tested this in a while, but building rtnet static will crash
>> the kernel when this module initializes. With the various fixes and
>> cleanups in master/next (like rtdm_available) that might be worth a
>> look?
>>
>> I would hope to build a static kernel one day, and so far there are 2
>> roadblocks:
>>
>> - rtnet (+ rtpacket) crashing when built statically
>>
>> - symbol name clashes with linux + rt drivers enabled (I could work on
>>    fixing that for rt_igb at least)
>
> Do you mean removing the "depends on m"?

Yes, ideally I would use a kernel without loadable modules, so kernel
upgrades/changes don't affect the rootfs (ideally read-only apart from a
few places).

> Possibly, that moves the initialization order in a way that causes
> troubles. I also just added another case that exploits the module [1],
> but that would be solvable. More critical is understanding the crashes.

I had a quick test removing the "depends on m" about a year ago. I brought
this up now because it might fit with the recent cleanups.

Regards,
Norbert
Re: Static build of rtnet
On 16.09.19 11:13, Lange Norbert via Xenomai wrote:
> Hello,
>
> I haven't tested this in a while, but building rtnet static will crash
> the kernel when this module initializes. With the various fixes and
> cleanups in master/next (like rtdm_available) that might be worth a
> look?
>
> I would hope to build a static kernel one day, and so far there are 2
> roadblocks:
>
> - rtnet (+ rtpacket) crashing when built statically
>
> - symbol name clashes with linux + rt drivers enabled (I could work on
>   fixing that for rt_igb at least)

Do you mean removing the "depends on m"? Possibly, that moves the
initialization order in a way that causes troubles. I also just added
another case that exploits the module [1], but that would be solvable.
More critical is understanding the crashes.

Jan

[1] https://xenomai.org/pipermail/xenomai/2019-September/041583.html

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
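For readers following along: the "depends on m" under discussion is a Kconfig restriction that forces RTnet to be built as modules only. A hedged sketch of the kind of change a static build would start from — the symbol name and surrounding lines are assumptions about the RTnet Kconfig, not quoted from it:

```
# Sketch only: check the actual Kconfig under kernel/drivers/net/ in the
# Xenomai tree; the symbol name XENO_DRIVERS_NET is an assumption here.
menuconfig XENO_DRIVERS_NET
	tristate "RTnet"
	depends on m    # <-- removing this line would permit a built-in (=y)
	                #     RTnet, which is what triggers the init-order
	                #     crashes discussed above
```

As Jan notes, flipping this alone is not enough: the crashes on built-in initialization need to be understood first, and the RTnet startup scripts assume module loading.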
Re: RTDM open, open_rt, and open_nrt
----- On 16 Sep 2019, at 19:01, Jan Kiszka jan.kis...@siemens.com wrote:

> On 16.09.19 17:33, Per Oberg wrote:
>> ----- On 16 Sep 2019, at 16:59, Jan Kiszka jan.kis...@siemens.com wrote:
>>> On 16.09.19 14:41, Per Oberg via Xenomai wrote:
>>>> ----- On 16 Sep 2019, at 14:36, Per Öberg p...@wolfram.com wrote:
>>>>> ----- On 16 Sep 2019, at 11:34, Jan Kiszka jan.kis...@siemens.com wrote:
>>>>>> On 16.09.19 09:32, Per Oberg via Xenomai wrote:
>>>>>>> Hello list
>>>>>>>
>>>>>>> I am trying to understand how rtdm works, and possibly why, out of
>>>>>>> a historical context. Perhaps there is a good place to read up on
>>>>>>> this stuff; then please let me know.
>>>>>>>
>>>>>>> It seems like in the rtdm-api there is only open, but no open_rt or
>>>>>>> open_nrt. More specifically we have:
>>>>>>> - read_rt / read_nrt
>>>>>>> - recvmsg_rt / recvmsg_nrt
>>>>>>> - ioctl_rt / ioctl_nrt
>>>>>>> - .. etc.
>>>>>>>
>>>>>>> However, when studying an old xenomai2->3 ported driver it seems
>>>>>>> like there used to be open_rt and open_nrt. The problem I was
>>>>>>> having before (see my background comment below) was because the
>>>>>>> open had been mapped to the old open_nrt code, which in turn used a
>>>>>>> rt-lock, thus a mix of the two. When switching to a regular mutex
>>>>>>> it "worked", as in it didn't complain.
>>>>>>>
>>>>>>> In a short discussion Jan Kiszka gave me the impression that open
>>>>>>> could possibly end up being rt or nrt depending on situation.
>>>>>>> PÖ: I'm guessing that open is always non-rt and therefore a
>>>>>>> rtdm_lock should be used? ...
>>>>>>> JK: This depends. If the open code needs to synchronize only with
>>>>>>> other non-RT paths, normal Linux locks are fine. If there is the
>>>>>>> need to sync with the interrupt handler or some of the _rt
>>>>>>> callbacks, rtdm_lock & Co. is needed.
>>>>>>>
>>>>>>> So, how does this work? And why was (if it was) open_nrt and
>>>>>>> open_rt replaced with a common open?
>>>>>>
>>>>>> The original RTDM design was foreseeing the use case of creating
>>>>>> and destroying resources like file descriptors for devices in RT
>>>>>> context. That idea was dropped as also the trend for the core was
>>>>>> clearly making this less realistic. Therefore, we removed
>>>>>> open/socket_rt from Xenomai 3.
>>>>>>
>>>>>> If you have a driver that exploited open_rt, you need to remove all
>>>>>> rt-sleeping operations from its open function. Whether rtdm_lock is
>>>>>> an appropriate alternative depends on the driver locking structure
>>>>>> and the code run under the lock. rtdm_lock_get makes the lock
>>>>>> holder unpreemptible. So, if rtdm_mutex was chosen because of
>>>>>> lengthy code under the lock, that would not be a good alternative.
>>>>>> Then we would have to discuss what exactly is run there, and why.
>>>>>
>>>>> Ok, can I read up on this somewhere? I found [1], is that still
>>>>> valid in this context? ( Oh, and can we expect a third edition
>>>>> perhaps ? =) )
>>>>> [1] Building Embedded Linux Systems: Concepts, Techniques, Tricks,
>>>>> and Traps, 2nd Edition, Kindle Edition
>>>
>>> Basic locking principles should be covered there, not sure if it had a
>>> Xenomai/RTDM section. If so, check if it was written/updated after 2015.
>>
>> It has, but it's written in 2008. With references for a paper you wrote.
>> ( "The Real-Time Driver Model and First Applications" )
>>
>>>>>>> Background
>>>>>>>
>>>>>>> I recently wrote about a driver which warned about "drvlib.c:1349
>>>>>>> rtdm_mutex_timedlock". I got good answers which led me to some
>>>>>>> more general questions, but instead of continuing in the old
>>>>>>> thread I thought it better to start a new one since it's not about
>>>>>>> the initial problem. The driver in case is the Peak Linux Driver
>>>>>>> for their CAN hardware, see [1]
>>>>>>> [1] https://www.peak-system.com/fileadmin/media/linux/index.htm
>>>>>>
>>>>>> Did you inform them about their problem already? Maybe they are
>>>>>> willing to fix it. We can't, it's not upstream code.
>>>>>
>>>>> No, I haven't, but I will. The reason I haven't yet is because I was
>>>>> under the impression that this didn't happen to them. I'm trying to
>>>>> compile everything (driver, lib, and application) in a Yocto based
>>>>> SDK setup and it seems like compilation flags and environment
>>>>> variables are getting squashed in interesting ways. My reasoning so
>>>>> far was that I got this wrong somehow.
>>>>
>>>> Forget that, I did actually ask them and they answered in a manner
>>>> that suggested that I was doing something wrong (wrong compilation
>>>> flags or user privileges). I never got rid of the warning though and
>>>> it fell into the dark corners of the backlo
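Jan's locking rule in the exchange above — plain Linux locks when open() synchronizes only with non-RT paths, rtdm_lock & Co. when it must synchronize with the interrupt handler or _rt callbacks — can be illustrated with a C-flavoured sketch. This is not a buildable driver: the device structure and the get_dev() helper are invented for illustration; only the RTDM lock API names come from the thread.

```c
/* Sketch only -- my_dev and get_dev() are hypothetical. */
struct my_dev {
	rtdm_lock_t irq_lock;      /* protects state shared with the IRQ
	                            * handler / _rt callbacks; the holder is
	                            * unpreemptible, so keep sections short */
	struct mutex config_lock;  /* protects state touched only from
	                            * non-RT paths like open()/close();
	                            * a plain Linux mutex may sleep here */
};

static int my_open(struct rtdm_fd *fd, int oflags)
{
	struct my_dev *dev = get_dev(fd);  /* hypothetical lookup */
	rtdm_lockctx_t ctx;

	/* Case 1: state seen only by other non-RT paths. */
	mutex_lock(&dev->config_lock);
	/* ... lengthy descriptor setup, may allocate/sleep ... */
	mutex_unlock(&dev->config_lock);

	/* Case 2: state also read by the interrupt handler. */
	rtdm_lock_get_irqsave(&dev->irq_lock, ctx);
	/* ... brief update, no sleeping, no lengthy work ... */
	rtdm_lock_put_irqrestore(&dev->irq_lock, ctx);

	return 0;
}
```

This also shows why mechanically swapping an old open_nrt's rt-lock for a mutex (or vice versa) is not automatically right: the choice follows from what each piece of state synchronizes with, not from the callback it lives in.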