Philippe Gerum via Xenomai <[email protected]> writes:
> François Legal <[email protected]> writes: > >> Le Lundi, Juin 21, 2021 16:57 CEST, Philippe Gerum <[email protected]> a >> écrit: >> >>> >>> Jan Kiszka <[email protected]> writes: >>> >>> > On 21.06.21 16:28, Philippe Gerum wrote: >>> >> >>> >> Jan Kiszka <[email protected]> writes: >>> >> >>> >>> On 21.06.21 15:54, Philippe Gerum wrote: >>> >>>> >>> >>>> Jan Kiszka <[email protected]> writes: >>> >>>> >>> >>>>> On 21.06.21 11:39, Philippe Gerum wrote: >>> >>>>>> >>> >>>>>> Jan Kiszka <[email protected]> writes: >>> >>>>>> >>> >>>>>>> On 18.06.21 20:41, François Legal wrote: >>> >>>>>>>> Le Mercredi, Juin 16, 2021 17:10 CEST, Jan Kiszka >>> >>>>>>>> <[email protected]> a écrit: >>> >>>>>>>> >>> >>>>>>>>> On 16.06.21 15:51, François Legal wrote: >>> >>>>>>>>>> Le Mercredi, Juin 16, 2021 15:38 CEST, Jan Kiszka >>> >>>>>>>>>> <[email protected]> a écrit: >>> >>>>>>>>>> >>> >>>>>>>>>>> On 16.06.21 15:29, François Legal wrote: >>> >>>>>>>>>>>> Le Mercredi, Juin 16, 2021 11:40 CEST, Jan Kiszka >>> >>>>>>>>>>>> <[email protected]> a écrit: >>> >>>>>>>>>>>> >>> >>>>>>>>>>>>> On 16.06.21 11:12, François Legal via Xenomai wrote: >>> >>>>>>>>>>>>>> Le Mercredi, Juin 16, 2021 11:05 CEST, "Chen, Hongzhan" >>> >>>>>>>>>>>>>> <[email protected]> a écrit: >>> >>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>> -----Original Message----- >>> >>>>>>>>>>>>>>>> From: François Legal <[email protected]> >>> >>>>>>>>>>>>>>>> Sent: Wednesday, June 16, 2021 4:19 PM >>> >>>>>>>>>>>>>>>> To: Chen, Hongzhan <[email protected]> >>> >>>>>>>>>>>>>>>> Cc: [email protected] >>> >>>>>>>>>>>>>>>> Subject: RE: Large gpio interrupt latency >>> >>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>> Le Mercredi, Juin 16, 2021 10:10 CEST, "Chen, Hongzhan" >>> >>>>>>>>>>>>>>>> <[email protected]> a écrit: >>> >>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>>> -----Original Message----- >>> >>>>>>>>>>>>>>>>>> From: Xenomai <[email protected]> On Behalf Of >>> >>>>>>>>>>>>>>>>>> François Legal via Xenomai >>> >>>>>>>>>>>>>>>>>> Sent: Wednesday, June 16, 2021 3:16 PM >>> >>>>>>>>>>>>>>>>>> To: [email protected] >>> >>>>>>>>>>>>>>>>>> Subject: Large gpio interrupt latency >>> >>>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>>> Hello, >>> >>>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>>> working on a realtime data recorder (for which I >>> >>>>>>>>>>>>>>>>>> submitted a patch to add timestamp retrieval for net >>> >>>>>>>>>>>>>>>>>> packets), I experience a strange latency problem with >>> >>>>>>>>>>>>>>>>>> taking GPIO interrupts. >>> >>>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>>> So basically my app monitors network packets (which as >>> >>>>>>>>>>>>>>>>>> received by the DUT would trigger toggling of a GPIO on >>> >>>>>>>>>>>>>>>>>> the DUT) and one GPIO. >>> >>>>>>>>>>>>>>>>>> We know (from some ohter reference instrument) that the >>> >>>>>>>>>>>>>>>>>> DUT would toggle the GPIO 1-2ms after receiving a >>> >>>>>>>>>>>>>>>>>> specific network packet. >>> >>>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>>> My recorder app relies on the timestamping of events >>> >>>>>>>>>>>>>>>>>> done in interrupts service routines for GPIOs and >>> >>>>>>>>>>>>>>>>>> network interface. By checking the timestamps returned >>> >>>>>>>>>>>>>>>>>> by the application, I get an about 50ms delay between >>> >>>>>>>>>>>>>>>>>> the network packet and the GPIO (so something between 51 >>> >>>>>>>>>>>>>>>>>> and 52ms delay). >>> >>>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> Are you toggling gpio and access gpio device through >>> >>>>>>>>>>>>>>>>> your rtdm device like on path >>> >>>>>>>>>>>>>>>>> /dev/rtdm/your_gpiocontroller/gpio*? >>> >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>> Maybe my setup was not very clear. >>> >>>>>>>>>>>>>>>> I've got a DUT that receives network data, and toggles one >>> >>>>>>>>>>>>>>>> of its GPIOs depending one the network data. >>> >>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>> I've got another device running my realtime recording app, >>> >>>>>>>>>>>>>>>> that receives the same network data as the DUT, and that >>> >>>>>>>>>>>>>>>> has one of its GPIO connected to the DUT GPIO. >>> >>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>> On the recording app, I use the RTDM device to open, ioctl >>> >>>>>>>>>>>>>>>> (enable interrupt + timestamp), then select & read the >>> >>>>>>>>>>>>>>>> GPIO value. >>> >>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>> When issue happen, the recording app side have got same >>> >>>>>>>>>>>>>>> number of network data packages and gpio interrupts with >>> >>>>>>>>>>>>>>> that DUT have been toggling? >>> >>>>>>>>>>>>>>> I am asking this because I met gpio hardware issue that >>> >>>>>>>>>>>>>>> would cause gpio interrupt missing or invalid gpio >>> >>>>>>>>>>>>>>> interrupts when gpio is connecting on two different boards. >>> >>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>> >>> >>>>>>>>>>>>>> AFAICT, I'm not missing GPIO edges/interrupts in the app. >>> >>>>>>>>>>>>>> Regarding network, the DUT and the recoding device are >>> >>>>>>>>>>>>>> connected to the same switch with the same port config. >>> >>>>>>>>>>>>>> >>> >>>>>>>>>>>>>> Moreover, I tried changing the interrupt polarity of the >>> >>>>>>>>>>>>>> recording device (switched from rising edge to falling >>> >>>>>>>>>>>>>> edge), and the result is 1-2ms + ~30ms (the width of the >>> >>>>>>>>>>>>>> pulse of the DUT) + 50ms latency >>> >>>>>>>>>>>>>> >>> >>>>>>>>>>>>> >>> >>>>>>>>>>>>> Can you (or did you already) break down the latencies on the >>> >>>>>>>>>>>>> DUT via >>> >>>>>>>>>>>>> tracing? Is it really the GPIO output path? What is happening >>> >>>>>>>>>>>>> in it, >>> >>>>>>>>>>>>> starting with RT/Linux task switches, mode transitions etc.? >>> >>>>>>>>>>>>> >>> >>>>>>>>>>>> >>> >>>>>>>>>>>> I just did it. >>> >>>>>>>>>>>> >>> >>>>>>>>>>>> I did put gpio_pin_interrupt as trigger, then ran my >>> >>>>>>>>>>>> application, triggered the GPIO. >>> >>>>>>>>>>>> This is what I get in frozen, but I'm not quite sure how to >>> >>>>>>>>>>>> read it and what conclusion I could derive : >>> >>>>>>>>>>>> cat frozen >>> >>>>>>>>>>>> I-pipe frozen back-tracing service on 4.4.227+/ipipe release >>> >>>>>>>>>>>> #10 >>> >>>>>>>>>>>> ------------------------------------------------------------ >>> >>>>>>>>>>>> CPU: 0, Freeze: 218343820846 cycles, Trace Points: 100 (+10) >>> >>>>>>>>>>>> Calibrated minimum trace-point overhead: 0.461 us >>> >>>>>>>>>>>> >>> >>>>>>>>>>>> +----- Hard IRQs ('|': locked) >>> >>>>>>>>>>>> |+-- Xenomai >>> >>>>>>>>>>>> ||+- Linux ('*': domain stalled, '+': current, '#': >>> >>>>>>>>>>>> current+stalled) >>> >>>>>>>>>>>> ||| +---------- Delay flag ('+': > 1 us, >>> >>>>>>>>>>>> '!': > 10 us) >>> >>>>>>>>>>>> ||| | +- NMI noise ('N') >>> >>>>>>>>>>>> ||| | | >>> >>>>>>>>>>>> Type User Val. Time Delay Function (Parent) >>> >>>>>>>>>>>> : +func -6087+ 1.302 load_balance+0x14 >>> >>>>>>>>>>>> (run_rebalance_domains+0x7e8) >>> >>>>>>>>>>>> : +func -6085 0.826 idle_cpu+0x10 >>> >>>>>>>>>>>> (load_balance+0x180) >>> >>>>>>>>>>>> : +func -6084 0.892 >>> >>>>>>>>>>>> find_busiest_group+0x14 (load_balance+0x1a4) >>> >>>>>>>>>>>> : +func -6084 0.757 >>> >>>>>>>>>>>> update_group_capacity+0x14 (find_busiest_group+0x128) >>> >>>>>>>>>>>> : +func -6083+ 1.452 __msecs_to_jiffies+0x10 >>> >>>>>>>>>>>> (update_group_capacity+0x30) >>> >>>>>>>>>>>> : +func -6081+ 1.535 idle_cpu+0x10 >>> >>>>>>>>>>>> (find_busiest_group+0x1e4) >>> >>>>>>>>>>>> : +func -6080+ 1.410 idle_cpu+0x10 >>> >>>>>>>>>>>> (find_busiest_group+0x1e4) >>> >>>>>>>>>>>> : +func -6078 0.967 >>> >>>>>>>>>>>> __msecs_to_jiffies+0x10 (run_rebalance_domains+0x810) >>> >>>>>>>>>>>> : +func -6077 0.886 >>> >>>>>>>>>>>> __rcu_read_unlock+0x10 (run_rebalance_domains+0x648) >>> >>>>>>>>>>>> : +func -6077 0.820 rcu_bh_qs+0x10 >>> >>>>>>>>>>>> (__do_softirq+0x1b0) >>> >>>>>>>>>>>> : +func -6076 0.742 ipipe_stall_root+0x10 >>> >>>>>>>>>>>> (__do_softirq+0x1b4) >>> >>>>>>>>>>>> : +func -6075 0.766 ipipe_root_only+0x10 >>> >>>>>>>>>>>> (ipipe_stall_root+0x18) >>> >>>>>>>>>>>> :| +begin 0x80000001 -6074 0.934 >>> >>>>>>>>>>>> ipipe_trace_begin+0x24 (ipipe_root_only+0xb8) >>> >>>>>>>>>>>> :| +end 0x80000001 -6073 0.811 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (ipipe_root_only+0xfc) >>> >>>>>>>>>>>> :| +begin 0x80000001 -6072 0.895 >>> >>>>>>>>>>>> ipipe_trace_begin+0x24 (ipipe_stall_root+0x78) >>> >>>>>>>>>>>> :| #end 0x80000001 -6072 0.847 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (ipipe_stall_root+0xb8) >>> >>>>>>>>>>>> : #func -6071 0.814 >>> >>>>>>>>>>>> __local_bh_enable+0x10 (__do_softirq+0x214) >>> >>>>>>>>>>>> : #func -6070 0.760 ipipe_test_root+0x10 >>> >>>>>>>>>>>> (__local_bh_enable+0x1c) >>> >>>>>>>>>>>> :| #begin 0x80000001 -6069 0.907 >>> >>>>>>>>>>>> ipipe_trace_begin+0x24 (ipipe_test_root+0x74) >>> >>>>>>>>>>>> :| #end 0x80000001 -6068 0.898 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (ipipe_test_root+0xb8) >>> >>>>>>>>>>>> : #func -6067 0.811 rcu_irq_exit+0x10 >>> >>>>>>>>>>>> (irq_exit+0x84) >>> >>>>>>>>>>>> : #func -6067 0.781 >>> >>>>>>>>>>>> ipipe_test_and_stall_root+0x10 (rcu_irq_exit+0x18) >>> >>>>>>>>>>>> : #func -6066 0.799 ipipe_root_only+0x10 >>> >>>>>>>>>>>> (ipipe_test_and_stall_root+0x18) >>> >>>>>>>>>>>> :| #begin 0x80000001 -6065+ 1.041 ipipe_trace_begin+0x24 >>> >>>>>>>>>>>> (ipipe_root_only+0xb8) >>> >>>>>>>>>>>> :| #end 0x80000001 -6064 0.769 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (ipipe_root_only+0xfc) >>> >>>>>>>>>>>> :| #begin 0x80000001 -6063 0.895 >>> >>>>>>>>>>>> ipipe_trace_begin+0x24 (ipipe_test_and_stall_root+0x80) >>> >>>>>>>>>>>> :| #end 0x80000001 -6062 0.841 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (ipipe_test_and_stall_root+0xc4) >>> >>>>>>>>>>>> : #func -6061+ 1.197 >>> >>>>>>>>>>>> rcu_eqs_enter_common.constprop.21+0x10 (rcu_irq_exit+0x80) >>> >>>>>>>>>>>> :| #begin 0x80000000 -6060+ 1.413 ipipe_trace_begin+0x24 >>> >>>>>>>>>>>> (__ipipe_do_sync_stage+0x2b8) >>> >>>>>>>>>>>> :| +end 0x00000012 -6059+ 1.044 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (__ipipe_grab_irq+0x84) >>> >>>>>>>>>>>> :| +func -6058 0.988 >>> >>>>>>>>>>>> __ipipe_check_root_interruptible+0x10 (__irq_svc+0x70) >>> >>>>>>>>>>>> :| +func -6057 0.976 ipipe_test_root+0x10 >>> >>>>>>>>>>>> (__ipipe_check_root_interruptible+0x68) >>> >>>>>>>>>>>> :| +func -6056 0.829 >>> >>>>>>>>>>>> __ipipe_bugon_irqs_enabled+0x10 (__ipipe_fast_svc_irq_exit+0x4) >>> >>>>>>>>>>>> :| +end 0x90000000 -6055 0.913 >>> >>>>>>>>>>>> __ipipe_fast_svc_irq_exit+0x20 (ipipe_unstall_root+0x88) >>> >>>>>>>>>>>> : +func -6054 0.781 ipipe_test_root+0x10 >>> >>>>>>>>>>>> (cpu_startup_entry+0x12c) >>> >>>>>>>>>>>> :| +begin 0x80000001 -6053 0.868 >>> >>>>>>>>>>>> ipipe_trace_begin+0x24 (ipipe_test_root+0x74) >>> >>>>>>>>>>>> :| +end 0x80000001 -6052 0.781 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (ipipe_test_root+0xb8) >>> >>>>>>>>>>>> : +func -6052 0.748 rcu_idle_exit+0x10 >>> >>>>>>>>>>>> (cpu_startup_entry+0x138) >>> >>>>>>>>>>>> : +func -6051 0.739 >>> >>>>>>>>>>>> ipipe_test_and_stall_root+0x10 (rcu_idle_exit+0x18) >>> >>>>>>>>>>>> : +func -6050 0.775 ipipe_root_only+0x10 >>> >>>>>>>>>>>> (ipipe_test_and_stall_root+0x18) >>> >>>>>>>>>>>> :| +begin 0x80000001 -6049+ 1.011 ipipe_trace_begin+0x24 >>> >>>>>>>>>>>> (ipipe_root_only+0xb8) >>> >>>>>>>>>>>> :| +end 0x80000001 -6048 0.742 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (ipipe_root_only+0xfc) >>> >>>>>>>>>>>> :| +begin 0x80000001 -6048 0.940 >>> >>>>>>>>>>>> ipipe_trace_begin+0x24 (ipipe_test_and_stall_root+0x80) >>> >>>>>>>>>>>> :| #end 0x80000001 -6047 0.790 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (ipipe_test_and_stall_root+0xc4) >>> >>>>>>>>>>>> : #func -6046 0.859 >>> >>>>>>>>>>>> rcu_eqs_exit_common.constprop.19+0x10 (rcu_idle_exit+0x8c) >>> >>>>>>>>>>>> : #func -6045 0.772 >>> >>>>>>>>>>>> ipipe_unstall_root+0x10 (rcu_idle_exit+0x78) >>> >>>>>>>>>>>> :| #begin 0x80000000 -6044 0.814 >>> >>>>>>>>>>>> ipipe_trace_begin+0x24 (ipipe_unstall_root+0x98) >>> >>>>>>>>>>>> :| #func -6043+ 1.077 ipipe_root_only+0x10 >>> >>>>>>>>>>>> (ipipe_unstall_root+0x24) >>> >>>>>>>>>>>> :| +end 0x80000000 -6042 0.835 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (ipipe_unstall_root+0x84) >>> >>>>>>>>>>>> : +func -6042 0.922 >>> >>>>>>>>>>>> arch_cpu_idle_exit+0x10 (cpu_startup_entry+0xfc) >>> >>>>>>>>>>>> : +func -6041 0.793 ipipe_stall_root+0x10 >>> >>>>>>>>>>>> (cpu_startup_entry+0xc4) >>> >>>>>>>>>>>> : +func -6040 0.724 ipipe_root_only+0x10 >>> >>>>>>>>>>>> (ipipe_stall_root+0x18) >>> >>>>>>>>>>>> :| +begin 0x80000001 -6039+ 1.098 ipipe_trace_begin+0x24 >>> >>>>>>>>>>>> (ipipe_root_only+0xb8) >>> >>>>>>>>>>>> :| +end 0x80000001 -6038 0.772 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (ipipe_root_only+0xfc) >>> >>>>>>>>>>>> :| +begin 0x80000001 -6037 0.841 >>> >>>>>>>>>>>> ipipe_trace_begin+0x24 (ipipe_stall_root+0x78) >>> >>>>>>>>>>>> :| #end 0x80000001 -6036 0.763 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (ipipe_stall_root+0xb8) >>> >>>>>>>>>>>> : #func -6036 0.838 >>> >>>>>>>>>>>> arch_cpu_idle_enter+0x10 (cpu_startup_entry+0xc8) >>> >>>>>>>>>>>> : #func -6035 0.745 arm_heavy_mb+0x10 >>> >>>>>>>>>>>> (arch_cpu_idle_enter+0x1c) >>> >>>>>>>>>>>> : #func -6034 0.916 l2c210_sync+0x10 >>> >>>>>>>>>>>> (arm_heavy_mb+0x2c) >>> >>>>>>>>>>>> : #func -6033+ 1.062 >>> >>>>>>>>>>>> tick_check_broadcast_expired+0x10 (cpu_startup_entry+0xd8) >>> >>>>>>>>>>>> : #func -6032 0.787 rcu_idle_enter+0x10 >>> >>>>>>>>>>>> (cpu_startup_entry+0x124) >>> >>>>>>>>>>>> : #func -6031 0.745 >>> >>>>>>>>>>>> ipipe_test_and_stall_root+0x10 (rcu_idle_enter+0x18) >>> >>>>>>>>>>>> : #func -6031 0.751 ipipe_root_only+0x10 >>> >>>>>>>>>>>> (ipipe_test_and_stall_root+0x18) >>> >>>>>>>>>>>> :| #begin 0x80000001 -6030 0.991 >>> >>>>>>>>>>>> ipipe_trace_begin+0x24 (ipipe_root_only+0xb8) >>> >>>>>>>>>>>> :| #end 0x80000001 -6029 0.772 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (ipipe_root_only+0xfc) >>> >>>>>>>>>>>> :| #begin 0x80000001 -6028 0.892 >>> >>>>>>>>>>>> ipipe_trace_begin+0x24 (ipipe_test_and_stall_root+0x80) >>> >>>>>>>>>>>> :| #end 0x80000001 -6027 0.847 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (ipipe_test_and_stall_root+0xc4) >>> >>>>>>>>>>>> : #func -6026 0.922 >>> >>>>>>>>>>>> rcu_eqs_enter_common.constprop.21+0x10 (rcu_idle_enter+0x90) >>> >>>>>>>>>>>> : #func -6025 0.862 >>> >>>>>>>>>>>> default_idle_call+0x10 (cpu_startup_entry+0x128) >>> >>>>>>>>>>>> : #func -6024 0.877 arch_cpu_idle+0x10 >>> >>>>>>>>>>>> (default_idle_call+0x38) >>> >>>>>>>>>>>> :| #begin 0x80000000 -6024! 5992.167 ipipe_trace_begin+0x24 >>> >>>>>>>>>>>> (arch_cpu_idle+0xb8) >>> >>>>>>>>>>> >>> >>>>>>>>>>> Here your system (or this core) went idle, waiting for the next >>> >>>>>>>>>>> event. >>> >>>>>>>>>>> >>> >>>>>>>>>>>> :| +func -31 0.760 >>> >>>>>>>>>>>> ipipe_unstall_root+0x10 (arch_cpu_idle+0x30) >>> >>>>>>>>>>> >>> >>>>>>>>>>> Comming back from idle. >>> >>>>>>>>>>> >>> >>>>>>>>>>>> :| +func -31+ 1.116 ipipe_root_only+0x10 >>> >>>>>>>>>>>> (ipipe_unstall_root+0x24) >>> >>>>>>>>>>>> :| +end 0x80000000 -30 0.931 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (ipipe_unstall_root+0x84) >>> >>>>>>>>>>>> :| +begin 0x90000000 -29 0.844 __irq_svc+0x58 >>> >>>>>>>>>>>> (ipipe_unstall_root+0x88) >>> >>>>>>>>>>>> :| +func -28 0.925 gic_handle_irq+0x10 >>> >>>>>>>>>>>> (__irq_svc+0x6c) >>> >>>>>>>>>>>> :| +func -27 0.904 irq_find_mapping+0x10 >>> >>>>>>>>>>>> (gic_handle_irq+0x50) >>> >>>>>>>>>>>> :| +func -26 0.940 __ipipe_grab_irq+0x10 >>> >>>>>>>>>>>> (gic_handle_irq+0x58) >>> >>>>>>>>>>>> :| +begin 0x000000c9 -25 0.826 >>> >>>>>>>>>>>> ipipe_trace_begin+0x24 (__ipipe_grab_irq+0x58) >>> >>>>>>>>>>>> :| +func -24 0.814 >>> >>>>>>>>>>>> __ipipe_dispatch_irq+0x10 (__ipipe_grab_irq+0x7c) >>> >>>>>>>>>>>> :| +func -23+ 1.275 irq_to_desc+0x10 >>> >>>>>>>>>>>> (__ipipe_dispatch_irq+0x184) >>> >>>>>>>>>>>> :| +func -22+ 1.679 irq_to_desc+0x10 >>> >>>>>>>>>>>> (__ipipe_dispatch_irq+0x198) >>> >>>>>>>>>>>> :| +func -20+ 2.092 >>> >>>>>>>>>>>> ucc_gpio_irqhandler+0x14 (__ipipe_dispatch_irq+0x1fc) >>> >>>>>>>>>>>> :| +func -18+ 1.413 irq_find_mapping+0x10 >>> >>>>>>>>>>>> (ucc_gpio_irqhandler+0x84) >>> >>>>>>>>>>>> :| +begin 0x000000e3 -17 0.757 >>> >>>>>>>>>>>> ipipe_trace_begin+0x24 (ucc_gpio_irqhandler+0x8c) >>> >>>>>>>>>>>> :| +func -16 0.778 >>> >>>>>>>>>>>> __ipipe_dispatch_irq+0x10 (ucc_gpio_irqhandler+0x98) >>> >>>>>>>>>>>> :| +func -15+ 1.023 irq_to_desc+0x10 >>> >>>>>>>>>>>> (__ipipe_dispatch_irq+0x184) >>> >>>>>>>>>>>> :| +func -14+ 1.494 irq_to_desc+0x10 >>> >>>>>>>>>>>> (__ipipe_dispatch_irq+0x198) >>> >>>>>>>>>>>> :| +func -13+ 1.014 >>> >>>>>>>>>>>> __ipipe_ack_level_irq+0x10 (__ipipe_dispatch_irq+0x1fc) >>> >>>>>>>>>>>> :| +func -12 0.763 >>> >>>>>>>>>>>> ucc_gpio_irq_mask+0x10 (__ipipe_ack_level_irq+0x54) >>> >>>>>>>>>>>> :| +func -11+ 1.248 >>> >>>>>>>>>>>> __ipipe_spin_lock_irqsave+0x10 (ucc_gpio_irq_mask+0x2c) >>> >>>>>>>>>>>> :| #func -10+ 1.619 >>> >>>>>>>>>>>> __ipipe_spin_unlock_irqrestore+0x10 (ucc_gpio_irq_mask+0x4c) >>> >>>>>>>>>>>> :| +func -8+ 1.239 >>> >>>>>>>>>>>> __ipipe_set_irq_pending+0x10 (__ipipe_dispatch_irq+0x3bc) >>> >>>>>>>>>>>> :| +end 0x000000e3 -7 0.994 ipipe_trace_end+0x24 >>> >>>>>>>>>>>> (ucc_gpio_irqhandler+0xa0) >>> >>>>>>>>>>>> :| +func -6+ 1.029 gic_eoi_irq+0x10 >>> >>>>>>>>>>>> (ucc_gpio_irqhandler+0xd4) >>> >>>>>>>>>>>> :| +func -5+ 1.353 >>> >>>>>>>>>>>> __ipipe_do_sync_pipeline+0x14 (__ipipe_dispatch_irq+0x17c) >>> >>>>>>>>>>>> :|+ func -4+ 1.449 >>> >>>>>>>>>>>> __ipipe_do_sync_stage+0x14 (__ipipe_do_sync_pipeline+0xf0) >>> >>>>>>>>>>>> :|# func -2+ 1.191 xnintr_irq_handler+0x14 >>> >>>>>>>>>>>> (__ipipe_do_sync_stage+0x200) >>> >>>>>>>>>>>> :|# func -1+ 1.455 ___xnlock_get+0x10 >>> >>>>>>>>>>>> (xnintr_irq_handler+0xc0) >>> >>>>>>>>>>>> <|# func 0 1.107 >>> >>>>>>>>>>>> gpio_pin_interrupt+0x10 (xnintr_irq_handler+0xf4) >>> >>>>>>>>>>> >>> >>>>>>>>>>> And here we start to process that GPIO interrupt in the RTDM >>> >>>>>>>>>>> handler, >>> >>>>>>>>>>> roughly after 30 µs (which are also due to tracing overhead). >>> >>>>>>>>>>> So far >>> >>>>>>>>>>> nothing suspiciuos. >>> >>>>>>>>>>> >>> >>>>>>>>>>>> |# func 1 1.236 >>> >>>>>>>>>>>> xnclock_core_read_monotonic+0x10 (gpio_pin_interrupt+0x1c) >>> >>>>>>>>>>>> |# func 2 1.224 >>> >>>>>>>>>>>> rtdm_event_signal+0x10 (gpio_pin_interrupt+0x2c) >>> >>>>>>>>>>> >>> >>>>>>>>>>> Here some likely waiting RT task is signalled. Does that one >>> >>>>>>>>>>> notice a >>> >>>>>>>>>>> too high latency? >>> >>>>>>>>>> >>> >>>>>>>>>> I may have been wrong speaking of "latency". The problem I'm >>> >>>>>>>>>> trying to fix, is understanding why my reference instrument >>> >>>>>>>>>> indicates a delay from network packet to GPIO on the DUT of >>> >>>>>>>>>> about 1 to 2ms, whereas my xenomai powered realtime recorder >>> >>>>>>>>>> application, given the same network stream and gpio access gives >>> >>>>>>>>>> me ~50 to 52 ms. >>> >>>>>>>>>> >>> >>>>>>>>>> I first though there was something wrong/delayed on the GPIO >>> >>>>>>>>>> interrupt side (the network packet could not have been received >>> >>>>>>>>>> before it is sent right). >>> >>>>>>>>>> The trace seem to demonstrate ~30µs of interrupt latency (a >>> >>>>>>>>>> number I expected for interrupt latency on that system) for the >>> >>>>>>>>>> GPIO. So something is wrong on my system, but I don't know what ! >>> >>>>>>>>>> >>> >>>>>>>>> >>> >>>>>>>>> Try to trace events, not functions, using regular ftrace >>> >>>>>>>>> ("trace-cmd >>> >>>>>>>>> record -e cobalt* -e sched* -e irq* -e signal*" e.g.). Check when >>> >>>>>>>>> the >>> >>>>>>>>> NIC gets the interrupt and compare that to when the GPIO event is >>> >>>>>>>>> triggered (or whatever is trigger and reaction). Function tracing >>> >>>>>>>>> is for >>> >>>>>>>>> zooming in when you know where to zoom. >>> >>>>>>>>> >>> >>>>>>>> >>> >>>>>>>> So I think I found the answer to my problem. >>> >>>>>>>> Is there any reason why NET events (in NIC irq functions) are >>> >>>>>>>> dated with rtdm_clock_read, whereas GPIO event are dated with >>> >>>>>>>> rtdm_clock_read_monotonic ? >>> >>>>>>>> >>> >>>>>>> >>> >>>>>>> Inconsistency of the GPIO drivers, UART drivers use rtdm_clock_read >>> >>>>>>> for >>> >>>>>>> user-exposed timestamping as well. Maybe Philippe can comment on >>> >>>>>>> thoughts behind this deviation. >>> >>>>>>> >>> >>>>>> >>> >>>>>> When running over the I-pipe, rtdm_clock_read() is based on Xenomai's >>> >>>>>> idea of real time, which is the Cobalt monotonic clock plus an >>> >>>>>> arbitrary >>> >>>>>> offset. For this reason, rtdm_clock_read() is not SMP-consistent >>> >>>>>> (different CPUs might read different timestamps at the exact same >>> >>>>>> time), >>> >>>>>> is not in sync with linux's wall clock either. For these reasons, I >>> >>>>>> don't see any practical way to synchronize multiple systems on the >>> >>>>>> clock >>> >>>>>> underlying rtdm_clock_read(). >>> >>>>>> >>> >>>>>> Therefore, there is no upside in using rtdm_clock_read() for >>> >>>>>> timestamping in this context, only adding the potential for even more >>> >>>>>> surprising results due to the mono->real-time offset changing under >>> >>>>>> our >>> >>>>>> feet, since the epoch of the Xenomai real-time clock can be (re)set >>> >>>>>> during runtime. >>> >>>>>> >>> >>>>>> I believe the UART driver is wrong here, it should return timestamps >>> >>>>>> based on the monotonic source, which best fits the common need: >>> >>>>>> getting >>> >>>>>> timestamps from the local CPU for measuring delays between events >>> >>>>>> received by drivers and the actions taken by the applications which >>> >>>>>> consume them, immune from updates to the underlying clock >>> >>>>>> epoch. Granted, there might be a catch when a timestamp is taken from >>> >>>>>> IRQ context, which is then consumed from a thread living on a >>> >>>>>> different >>> >>>>>> CPU, if per-CPU clocks are not/badly synchronized. But that would >>> >>>>>> happen >>> >>>>>> the exact same way with rtdm_clock_read() anyway. >>> >>>>>> >>> >>>>>> The situation improves when running on top of Dovetail, since Xenomai >>> >>>>>> now refers to the common linux clocks (mono / real) instead of >>> >>>>>> providing >>> >>>>>> its own idea of time, but the UART driver code predates the Dovetail >>> >>>>>> port. >>> >>>>>> >>> >>>>> >>> >>>>> Well, it's not just UART. All hardware drivers - except for GPIO - use >>> >>>>> rtdm_clock_read. That was no problem in practice for their use cases >>> >>>>> so >>> >>>>> far. One problem is that GPIO timestamps are now not comparable to >>> >>>>> others. >>> >>>>> >>> >>>>> But IIRC, most Xenomai APIs using absolute timestamps are based on >>> >>>>> Xenomai's real-time clock. Therefore, providing timestamps for that >>> >>>> >>> >>>> /* >>> >>>> * The Copperplate clock shall be monotonic unless the threading >>> >>>> * library has restrictions to support this over Mercury. >>> >>>> * >>> >>>> * In the normal case, this means that ongoing delays and timeouts >>> >>>> * won't be affected when the host system date is changed. In the >>> >>>> * restricted case by contrast, ongoing delays and timeouts may be >>> >>>> * impacted by changes to the host system date. >>> >>>> * >>> >>>> * The implementation maintains a per-clock epoch value, so that >>> >>>> * different emulators can have different (virtual) system dates. >>> >>>> */ >>> >>>> #ifdef CONFIG_XENO_COPPERPLATE_CLOCK_RESTRICTED >>> >>>> #define CLOCK_COPPERPLATE CLOCK_REALTIME >>> >>>> #else >>> >>>> #define CLOCK_COPPERPLATE CLOCK_MONOTONIC >>> >>>> #endif >>> >>>> >>> >>>> So no, only the POSIX API is using what the standard mandates, which is >>> >>>> CLOCK_REALTIME. All other APIs are based on copperplate, and they are >>> >>>> using a monotonic source as documented above. >>> >>> >>> >>> OK, but that changed in 3.x. At the time that RTDM API was originally >>> >>> added and then promoted, it was the other not this way. We became >>> >>> inconsistent then. >>> >>> >>> >>>> >>> >>>>> particular clock was the original idea of rtdm_clock_read (which >>> >>>>> predates rtdm_clock_read_monotonic). GPIO breaks that and should be >>> >>>>> fixed - unless I'm wrong with that assumption. >>> >>>>> >>> >>>> >>> >>>> We cannot assume the epoch is going to remain stable with >>> >>>> rtdm_clock_read() the way it is implemented, which is quite of a >>> >>>> problem wrt the common use case. For timestamping, a majority of >>> >>>> mainline drivers is using ktime_get() or a variant thereof which is >>> >>>> based on the monotonic clock source, not the _real form. Why would the >>> >>>> real-time I/O drivers be different? >>> >>> >>> >>> We have two cases here: >>> >>> >>> >>> - I-pipe-based version where the realtime clock is under full >>> >>> application control -> no problem to use rtdm_clock_read >>> >> >>> >> Well, there is still the issue that rtdm_clock_read() is not immune to >>> >> some part of userland changing the CLOCK_REALTIME epoch Xenomai-wise >>> >> e.g. via a call to clock_settime(), which is the same problem than Linux >>> >> changing the epoch of CLOCK_REALTIME over Dovetail. This would break the >>> >> application. >>> > >>> > Yes, but that's about the application(s) breaking themselves. Nothing >>> > new, not going to change when we only avoid clock_realtime for stamps >>> > but still use POSIX services basing timers on that clock. The key point >>> > is that I-pipe gave that into RT application hands (with all related >>> > downsides), with Dovetail it's in system hands. >>> > >>> >> >>> >>> - Dovetail where we share the realtime clock - with all its tuning - >>> >>> with Linux -> here we have a problem with rtdm_clock_read and should >>> >>> reconsider its usage (and promotion!) >>> >>> >>> >>> For stable 3.1, the proper fix is with GPIO going to rtdm_clock_read. >>> >> >>> >> Wait, you have downstream users already depending on GPIO returning >>> >> monotonic timestamps, and this is a _stable_ release. So why not fixing >>> >> other drivers based on the fact that timestamping with rtdm_clock_read() >>> >> is wrong instead? Same issue, right? >>> > >>> > GPIO was broken, but you are right that we may have users relying on >>> > that breakage now. Obviously, we can change the other drivers for the >>> > very same reasons: They are working like they work for more than 10 >>> > years now. >>> > >>> >>> Sorry, but I my views, rtdm_read_clock() was broken since day one. GPIO >>> had to work around the breakage.. :) >>> >>> >> >>> >> So the best course of action to sort this out for 3.1.x may be to extend >>> >> GPIO_RTIOC_TS with say, GPIO_RTIOC_TS_REAL, which would log and return >>> >> timestamps based on the Xenomai wallclock. Applications which do want to >>> >> align on that clock would simply have to issue GPIO_RTIOC_TS_REAL >>> >> instead of GPIO_RTIOC_TS. This would break backward ABI compat only for >>> >> users of GPIO_RTIOC_TS_REAL, but that would be much better than >>> >> introducing a sneaky change in behavior for the GPIO driver. >>> > >>> > Yeah, likely the way to go. >>> > >>> >> >>> >>> For 3.2, I'm not sure yet what to do with rtdm_clock_read. >>> >>> >>> >> >>> >> The decision looks pretty simple for the common use case: when >>> >> timestamps are needed for performance/delay measurements, we want to get >>> >> them from a clock source which won't play funny games, warping back to >>> >> the future. >>> > >>> > You only think of measurements. The other drivers used the stamping also >>> > for real applications, means to calculate absolute clock-realtime >>> > timeouts and wait for them to arrive. We will likely need to enhance >>> > also the other driver APIs to select the desired clocksource, just like >>> > for GPIO. >>> >>> I'm referring to what is in the GPIO code, which is the problem at hand: >>> that timestamping was designed since day #1 to provide a way to measure >>> the scheduling latency. >>> >>> I agree that the only way out is to enable all timestamp consumers to >>> pick their base clock of choice (mono or wallclock). >>> >>> -- >>> Philippe. >> >> Hi there, >> >> as we speak of breaking ABI, why not introducing some specific IOCTL to >> select which timesource to use for each driver using it, then using a >> function pointer to call the right rtdm_time_get_whatever function ? >> > > That would mean adding a generic ioctl, and many changes all over the > place (we would need no function pointer with distinct ioctl > codes). Doable for sure, but likely too much for 3.1.x though. wrt function pointer: I mean that a plain simple branch on some selector is likely to perform better than a function pointer, with the spectre mitigations around. -- Philippe.
