> -----Original Message-----
> From: Alexander Frolov <fro...@nicevt.ru>
> Sent: Montag, 13. Juli 2020 13:09
> To: Lange Norbert <norbert.la...@andritz.com>; Xenomai (xenomai@xenomai.org) <xenomai@xenomai.org>
> Subject: Re: FW: Xenomai with isolcpus and workqueue task
>
> On 7/13/20 1:48 PM, Lange Norbert wrote:
> >
> >> -----Original Message-----
> >> From: Xenomai <xenomai-boun...@xenomai.org> On Behalf Of Alexander Frolov via Xenomai
> >> Sent: Montag, 13. Juli 2020 12:27
> >> To: xenomai@xenomai.org
> >> Subject: Re: FW: Xenomai with isolcpus and workqueue task
> >>
> >>>> -----Original Message-----
> >>>> From: Lange Norbert
> >>>> Sent: Montag, 13. Juli 2020 10:34
> >>>> To: Alexander Frolov <fro...@nicevt.ru>
> >>>> Subject: RE: Xenomai with isolcpus and workqueue task
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Xenomai <xenomai-boun...@xenomai.org> On Behalf Of Alexander Frolov via Xenomai
> >>>>> Sent: Samstag, 11. Juli 2020 16:26
> >>>>> To: xenomai@xenomai.org
> >>>>> Subject: Xenomai with isolcpus and workqueue task
> >>>>>
> >>>>> Hi all!
> >>>>>
> >>>>> I am using Xenomai 3.1 with a 4.19.124 I-pipe patch on an SMP motherboard.
> >>>>> For my RT task I allocate a few CPU cores with the isolcpus option.
> >>>>> However, I notice large latency spikes caused by igb watchdog
> >>>>> activity (I am using the common igb driver, not rt_igb).
> >>>>>
> >>>>> Looking into the igb sources, I found that a workqueue is used
> >>>>> for some tasks (AFAIU, for link status monitoring):
> >>>>>
> >>>>> from igb_main.c
> >>>>> ...
> >>>>> INIT_WORK(&adapter->reset_task, igb_reset_task);
> >>>>> INIT_WORK(&adapter->watchdog_task, igb_watchdog_task);
> >>>>> ...
> >>>>>
> >>>>> The Linux kernel scheduler runs these igb activities on the isolated
> >>>>> CPUs in disregard of the isolcpus option, ruining the real-time
> >>>>> behavior of the system.
> >>>>
> >>>> isolcpus does not mean the CPUs aren't used; it means they are
> >>>> excluded from the normal CPU scheduler. No process will automatically
> >>>> be moved from/to isolated CPUs, but you still need to make sure to
> >>>> keep them free of any tasks.
> >>>> IRQ handlers still run anywhere, and processes may still set their
> >>>> affinity to use those CPUs.
> >>>>
> >>>>> So the question: is it correct to use the normal igb on Xenomai at
> >>>>> all, or is it not recommended? What can be done to prevent the Linux
> >>>>> scheduler from placing those tasks on isolated cores?
> >>>>
> >>>> I use the normal igb and rt_igb concurrently; I doubt it is
> >>>> recommended, but it is possible ;)
> >>>>
> >>>> You should add irqaffinity=0 to the cmdline (CPU0 is apparently
> >>>> always used for IRQs), then check 'cat /proc/irq/*/smp_affinity'.
> >>>> This keeps the other CPUs free from Linux IRQs.
> >>>> You can use some measures to bind Linux tasks to CPU0 as well. One of:
> >>>>
> >>>> - isolcpus (sets the default affinity mask as well)
> >>>> - set affinity early (e.g. in the ramdisk)
> >>>> - use cgroups (cset-shield)
> >>>>
> >>>> Only cgroups actually prevent processes from ignoring your defaults
> >>>> and using other CPUs. I did not get around to playing with this, and
> >>>> just use isolcpus.
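
To make that concrete, the cmdline plus a verification pass might look
roughly like the sketch below; the IRQ number at the end is made up for
illustration, and some per-CPU or chained IRQs will refuse a new mask:

    # kernel cmdline: route Linux IRQs to CPU0, isolate CPUs 1-3 for RT
    #   ... irqaffinity=0 isolcpus=1-3 ...

    # check the default mask new IRQs will get (1 == CPU0 only)
    cat /proc/irq/default_smp_affinity

    # list the current mask of every IRQ
    grep . /proc/irq/*/smp_affinity

    # push a stray IRQ (here 24, hypothetical) back onto CPU0
    echo 1 > /proc/irq/24/smp_affinity
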
> >>>> But the most important part is not to run RT on cores dealing with
> >>>> Linux interrupts; some handlers/drivers don't expect being preempted.
> >>>> I had the MMC driver bail out because of a timeout.
> >>>>
> >>>> I haven't solved moving the rtnet stack and rtnet-rpc off CPU0, and
> >>>> the rt_igb IRQs will use all CPUs.
> >>>>
> >>>> Norbert
> >>
> >> Thank you! Using the IRQ affinity feature to move handlers to specified
> >> cores is very practical, but in this case we experience a problem with
> >> another artifact of igb.
> >>
> >> Just as an example of the influence of igb activity (igb_watchdog_task)
> >> on CPU4 (which is an isolated one):
> >
> > Hmm, I don't have that task.
> > You can use 'taskset -p 0x1 <pid of igb_watchdog_task>' to change its
> > affinity. Not pretty, but it should work.
> >
> > (isolcpus doesn't work the way you think; it only affects CPU migration.)
>
> Not sure that I can find out the pid of igb_watchdog_task, because it can
> be any kworker thread that picks this work item up for execution. I can
> try to move all kworkers to the general-purpose cores, which looks a bit
> crazy.

Ok, I missed the general workqueue part. No, moving kworkers around is
likely dangerous. Still, I wonder why it is picking core 4; maybe binding
IRQs to core #0 will indirectly affect that.
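
For completeness, inspecting kworker affinity (and restricting the unbound
pools on reasonably recent kernels) can be sketched as below; note that
per-CPU kworkers reject a foreign mask, and the igb watchdog work appears to
run on a bound (per-CPU) workqueue, so the unbound cpumask knob will probably
not catch it:

    # show the current CPU affinity of all kworker threads
    for pid in $(pgrep kworker); do taskset -cp "$pid"; done

    # restrict unbound (WQ_UNBOUND) workqueues to CPU0;
    # per-CPU workqueues are unaffected by this knob
    echo 1 > /sys/devices/virtual/workqueue/cpumask
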
> >> # cat /proc/ipipe/trace/frozen | grep '\!'
> >> ...
> >> :  +func  -1231!  52.379  igb_rd32+0x0 [igb]  (igb_update_stats+0x520 [igb])
> >> :  +func  -1145!  45.864  igb_rd32+0x0 [igb]  (igb_update_stats+0x536 [igb])
> >> :  +func  -1099!  51.917  igb_rd32+0x0 [igb]  (igb_update_stats+0x75a [igb])
> >> :  +func  -1047!  51.517  igb_rd32+0x0 [igb]  (igb_update_stats+0x782 [igb])
> >> :  +func   -996!  51.988  igb_rd32+0x0 [igb]  (igb_update_stats+0x54e [igb])
> >> :  +func   -944!  51.436  igb_rd32+0x0 [igb]  (igb_update_stats+0x564 [igb])
> >> :  +func   -893!  52.569  igb_rd32+0x0 [igb]  (igb_update_stats+0x57a [igb])
> >> :  +func   -840!  52.529  igb_rd32+0x0 [igb]  (igb_update_stats+0x590 [igb])
> >> :  +func   -787!  52.018  igb_rd32+0x0 [igb]  (igb_update_stats+0x5a6 [igb])
> >> :  +func   -735!  52.058  igb_rd32+0x0 [igb]  (igb_update_stats+0x5bc [igb])
> >> :  +func   -683!  51.497  igb_rd32+0x0 [igb]  (igb_update_stats+0x5d2 [igb])
> >> :  +func   -632!  51.436  igb_rd32+0x0 [igb]  (igb_update_stats+0x5e8 [igb])
> >> :  +func   -580!  51.416  igb_rd32+0x0 [igb]  (igb_update_stats+0x5fe [igb])
> >> :  +func   -529!  52.038  igb_rd32+0x0 [igb]  (igb_update_stats+0x614 [igb])
> >> :  +func   -477!  52.058  igb_rd32+0x0 [igb]  (igb_update_stats+0x62a [igb])
> >> :  +func   -425!  51.436  igb_rd32+0x0 [igb]  (igb_update_stats+0x6f4 [igb])
> >> :  +func   -373!  51.416  igb_rd32+0x0 [igb]  (igb_update_stats+0x70a [igb])
> >> :  +func   -322!  51.517  igb_rd32+0x0 [igb]  (igb_update_stats+0x720 [igb])
> >> :  +func   -271!  51.847  igb_rd32+0x0 [igb]  (igb_update_stats+0x736 [igb])
> >> :  +func   -189!  47.247  igb_rd32+0x0 [igb]  (igb_ptp_rx_hang+0x1e [igb])
> >> :  +func   -103!  72.735  igb_rd32+0x0 [igb]  (igb_watchdog_task+0x66a [igb])
> >>
> >> On the other hand, RT kernels do not have this issue, probably because
> >> of the modified workqueue subsystem.
> >>
> >> Any ideas how to keep this work out of the critical path?
> >
> > Is this task blocking RT from running? I mean, it's better to run it on
> > another core, particularly because register accesses are painfully slow
> > on that hardware. But I don't see why it should make a big impact.
>
> Yes, that's what it does. I think igb_rd32 is too slow; I do not know why,
> however.

It does some weird stuff, handling PCI read errors AFAIU (see the sketch
below). For the rt_igb part, I have this disabled (not sure if it's
upstream):

#define rd32(__reg) readl(hw->hw_addr + (__reg))
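
For reference, this is roughly what mainline igb_rd32() does, per my reading
of the driver (a simplified from-memory sketch, not verbatim kernel code;
check drivers/net/ethernet/intel/igb/ in your tree for the real thing):

    /* kernel context: u32, readl(), READ_ONCE() come from the usual headers */
    u32 igb_rd32(struct e1000_hw *hw, u32 reg)
    {
            u8 __iomem *hw_addr = READ_ONCE(hw->hw_addr);
            u32 value = 0;

            /* device was already marked as removed: fail fast */
            if (E1000_REMOVED(hw_addr))
                    return ~value;

            value = readl(&hw_addr[reg]);

            /* all-ones usually means the PCIe link dropped: re-read
             * register 0 and, if that is all-ones too, mark the device
             * as removed (mainline also logs an error here) */
            if (!(~value) && (!reg || !(~readl(hw_addr))))
                    hw->hw_addr = NULL;

            return value;
    }

So every read is a slow uncached MMIO access plus, in the failure case, a
second one; the plain readl() macro above skips the surprise-removal check.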
What kind of spikes are we talking about, btw (i.e. normal latency and
outliers)? I don't think the workqueue should block Xenomai, so those reads
alone should not affect latency by more than ~20 µs (at least on my
hardware).

Norbert
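
PS: to put numbers on "spikes", Xenomai's latency test is the usual tool; a
typical invocation (options from memory, check the usage output of your
build):

    # sample a 100 us periodic user-space task for 60 s and
    # print a histogram at the end
    latency -t0 -p100 -T60 -h

Comparing the worst-case column with and without the igb interface up should
show whether the watchdog really is the culprit.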