RE: [ANNOUNCE] Xenomai v3.1-rc3 available
I am building a Xenomai Cobalt kernel under Ubuntu 18.04 based on:

  Linux 4.19.75
  Xenomai 3.1-rc3
  ipipe-core-4.19.75-x86-7.patch

After running prepare-kernel.sh and "make olddefconfig" (using the unmodified default .config), I get this error:

WARNING: unmet direct dependencies detected for PARAVIRT
  Depends on [n]: HYPERVISOR_GUEST [=y] && !IPIPE [=y]
  Selected by [m]:
  - HYPERV [=m] && X86 [=y] && ACPI [=y] && X86_LOCAL_APIC [=y] && HYPERVISOR_GUEST [=y]

I turned off HYPERVISOR_GUEST manually to fix the warning. I hadn't seen this type of error before, but perhaps something could be done in the patch to make the default config consistent... Just FYI.

Otherwise, everything went fine with the build process and initial testing. I will be giving this build some more testing over the next week or so and will let you know if I find any issues.

Thanks,
-Jeff
Re: INTR-REMAP error with UDD driver
‐‐‐ Original Message ‐‐‐ On Thursday, November 14, 2019 3:41 PM, Jan Kiszka wrote: > On 14.11.19 14:16, Jeff Webb wrote: > > ‐‐‐ Original Message ‐‐‐ > > On Thursday, November 14, 2019 1:50 AM, Jan Kiszka jan.kis...@siemens.com > > wrote: > > > > > On 14.11.19 06:05, Jeff Webb via Xenomai wrote: > > > > > > > I would like to revive this thread from several months ago: > > > > https://xenomai.org/pipermail/xenomai/2019-March/040498.html > > > > The issue is that on some hardware (a specific rack-mount PC with a > > > > PICMG daughtercard on a backplane containing PCI and PCIe slots) I get > > > > an INTR-REMAP error when trying to receive legacy (not MSI) interrupts > > > > from a custom FPGA-based PCI card using a UDD driver. The card did work > > > > properly in one out of the five PCI slots on that machine, but UDD > > > > interrupts did not work in the other four slots. > > > > Please review the original thread for more details about the specific > > > > error. > > > > Here are a few more tidbits I have gathered: > > > > > > > > - The UDD driver / userspace code works fine on the other hardware > > > > > > > > - The UDD driver / userspace code works fine in one PCI slot out of > > > > five on this hardware. > > > > > > > > - With another backplane model, but same processor card, the problem > > > > occurs in all four of the PCI slots. > > > > > > > > - An almost identical pure-linux UIO version of the driver / > > > > userspace code works in all the cases I tested, even when the UDD > > > > version fails, and even with the same xenomai-patched kernel used for > > > > UDD testing. > > > > > > > > > > > > In one of the previous posts in this thread a few months ago, Per Öberg > > > > mentioned experiencing something similar. Based on the information that > > > > was shared, I tried my code with linux version 4.9.38, but it still > > > > failed. This prompted me to try other linux / ipipe / xenomai > > > > combinations. 
These are my findings: > > > > Interrupts work: > > > > xenomai-2.6.5 ipipe-core-3.18.20-x86-7.patch (2016-07-05) > > > > xenomai-3.0.9+ ipipe-core-3.18.20-x86-7.patch (2016-07-05) > > > > xenomai-3.0.9+ ipipe-core-4.1.18-x86-9.patch (2017-05-25) > > > > INTR-REMAP error: > > > > xenomai-3.0.9+ ipipe-core-4.4.43-x86-6.patch (2017-02-25) > > > > xenomai-3.0.9+ ipipe-core-4.4.43-x86-7.patch (2017-05-25) > > > > xenomai-3.0.9+ ipipe-core-4.4.43-x86-8.patch (2017-06-14) > > > > xenomai-3.1-rc3 ipipe-core-4.4.196-cip38-x86-19.patch (2019-11-04) > > > > xenomai-3.0.9+ ipipe-core-4.9.38-x86-4.patch (2017-10-03) > > > > xenomai-3.0.9 ipipe-core-4.14.132-x86-6.patch (2019-07-03) > > > > The Xenomai 2.6.5 version of course does not use UDD, but uses the old > > > > pthread_intr_* userspace functions. > > > > Hopefully this additional information can shed a little light on the > > > > matter. > > > > > > This sounds like some RT interrupt enabling issue related to the IOAPIC > > > in the x86 I-pipe patch. Please also test 4.19. > > > > Ok, I will do this. Today I had a chance to make a build based on the ipipe-core-4.19.75-x86-7.patch with xenomai-3.1-rc3. I haven't had a chance to test it thoroughly, but I think it works. I am sorry I didn't try this earlier, and thanks for the reminder to do so. Since a 4.19 patch wasn't available back in early March when I first had the trouble, I got fixated on the fact that things worked with very old kernels and forgot to try 4.19 when I started looking into this again. I will continue to test and let you know if I find any issues. > > > Are you using UDD_IRQ_CUSTOM or do you leave the interrupt registration > > > to the UDD core? > > > > I just tell UDD the IRQ number and let it register the interrupt. > > > > > And please share your kernel config. > > > > I attached one to my original post earlier this year -- you should be able > > to download it from the link in the mailing list archive. 
Let me know if > > you need something different. I started with the standard Ubuntu desktop > > kernel config and tweaked options from there, so there is a lot of stuff > > enabled, obviously. > > > > > BTW, interrupt remapping issues can be worked around by disabling the > > > interrupt remapping feature (e.g. "intremap=off"). But that does not > > > solve the underlying issue, of course. > > > > I can't remember if I tried this or not. I will give it a go. Obviously, it > > would be good to get this fixed in the patch, though. > > I've reproduced some problem with INTx/IOAPIC and intremap=on on 4.4 in > KVM. If we are lucky, it's the same as yours. Will debug that tomorrow. Thank you so much for looking into this, Jan. -Jeff
Re: INTR-REMAP error with UDD driver
On 14.11.19 14:16, Jeff Webb wrote: ‐‐‐ Original Message ‐‐‐ On Thursday, November 14, 2019 1:50 AM, Jan Kiszka wrote: On 14.11.19 06:05, Jeff Webb via Xenomai wrote: I would like to revive this thread from several months ago: https://xenomai.org/pipermail/xenomai/2019-March/040498.html The issue is that on some hardware (a specific rack-mount PC with a PICMG daughtercard on a backplane containing PCI and PCIe slots) I get an INTR-REMAP error when trying to receive legacy (not MSI) interrupts from a custom FPGA-based PCI card using a UDD driver. The card did work properly in one out of the five PCI slots on that machine, but UDD interrupts did not work in the other four slots. Please review the original thread for more details about the specific error. Here are a few more tidbits I have gathered: - The UDD driver / userspace code works fine on the other hardware - The UDD driver / userspace code works fine in one PCI slot out of five on this hardware. - With another backplane model, but same processor card, the problem occurs in all four of the PCI slots. - An almost identical pure-linux UIO version of the driver / userspace code works in all the cases I tested, even when the UDD version fails, and even with the same xenomai-patched kernel used for UDD testing. In one of the previous posts in this thread a few months ago, Per Öberg mentioned experiencing something similar. Based on the information that was shared, I tried my code with linux version 4.9.38, but it still failed. This prompted me to try other linux / ipipe / xenomai combinations. 
These are my findings:

Interrupts work:
  xenomai-2.6.5 ipipe-core-3.18.20-x86-7.patch (2016-07-05)
  xenomai-3.0.9+ ipipe-core-3.18.20-x86-7.patch (2016-07-05)
  xenomai-3.0.9+ ipipe-core-4.1.18-x86-9.patch (2017-05-25)

INTR-REMAP error:
  xenomai-3.0.9+ ipipe-core-4.4.43-x86-6.patch (2017-02-25)
  xenomai-3.0.9+ ipipe-core-4.4.43-x86-7.patch (2017-05-25)
  xenomai-3.0.9+ ipipe-core-4.4.43-x86-8.patch (2017-06-14)
  xenomai-3.1-rc3 ipipe-core-4.4.196-cip38-x86-19.patch (2019-11-04)
  xenomai-3.0.9+ ipipe-core-4.9.38-x86-4.patch (2017-10-03)
  xenomai-3.0.9 ipipe-core-4.14.132-x86-6.patch (2019-07-03)

The Xenomai 2.6.5 version of course does not use UDD, but uses the old pthread_intr_* userspace functions. Hopefully this additional information can shed a little light on the matter.

This sounds like some RT interrupt enabling issue related to the IOAPIC in the x86 I-pipe patch. Please also test 4.19.

Ok, I will do this.

Are you using UDD_IRQ_CUSTOM or do you leave the interrupt registration to the UDD core?

I just tell UDD the IRQ number and let it register the interrupt.

And please share your kernel config.

I attached one to my original post earlier this year -- you should be able to download it from the link in the mailing list archive. Let me know if you need something different. I started with the standard Ubuntu desktop kernel config and tweaked options from there, so there is a lot of stuff enabled, obviously.

BTW, interrupt remapping issues can be worked around by disabling the interrupt remapping feature (e.g. "intremap=off"). But that does not solve the underlying issue, of course.

I can't remember if I tried this or not. I will give it a go. Obviously, it would be good to get this fixed in the patch, though.

I've reproduced some problem with INTx/IOAPIC and intremap=on on 4.4 in KVM. If we are lucky, it's the same as yours. Will debug that tomorrow.

Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
[PATCH] rtdm: Do not return an error from send/recvmmsg if there are packets
From: Jan Kiszka

This is in line with Linux behavior. We likely still miss an equivalent
to sk_err in recvmmsg, though.

Reported-by: Lange Norbert
Signed-off-by: Jan Kiszka
---
 kernel/cobalt/rtdm/fd.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/kernel/cobalt/rtdm/fd.c b/kernel/cobalt/rtdm/fd.c
index 0a1c6e44ed..5b2c3834da 100644
--- a/kernel/cobalt/rtdm/fd.c
+++ b/kernel/cobalt/rtdm/fd.c
@@ -734,14 +734,12 @@ int __rtdm_fd_recvmmsg(int ufd, void __user *u_msgvec, unsigned int vlen,
 		xnlock_put_irqrestore(&nklock, s);
 	}
 
-	if (datagrams > 0 &&
-	    (ret == 0 || ret == -ETIMEDOUT || ret == -EWOULDBLOCK)) {
-		/* NOTE: SO_ERROR should be honored for other errors. */
-		rtdm_fd_put(fd);
-		return datagrams;
-	}
 fail:
 	rtdm_fd_put(fd);
+
+	if (datagrams > 0)
+		ret = datagrams;
+
 out:
 	trace_cobalt_fd_recvmmsg_status(current, fd, ufd, ret);
 
@@ -826,13 +824,11 @@ int __rtdm_fd_sendmmsg(int ufd, void __user *u_msgvec, unsigned int vlen,
 		datagrams++;
 	}
 
-	if (datagrams > 0 && (ret == 0 || ret == -EWOULDBLOCK)) {
-		/* NOTE: SO_ERROR should be honored for other errors. */
-		rtdm_fd_put(fd);
-		return datagrams;
-	}
-
 	rtdm_fd_put(fd);
+
+	if (datagrams > 0)
+		ret = datagrams;
+
 out:
 	trace_cobalt_fd_sendmmsg_status(current, fd, ufd, ret);
 
-- 
2.16.4

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
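The return-value rule that the patch applies in both functions can be sketched as a small pure function outside the kernel. This is only an illustration: the helper name mmsg_result is hypothetical and does not exist in Xenomai, where the logic sits inline in __rtdm_fd_recvmmsg and __rtdm_fd_sendmmsg as shown in the diff.

```c
#include <errno.h>

/*
 * Hypothetical helper (not part of Xenomai): mirrors the rule applied by
 * the patch. If at least one datagram was transferred, report the count
 * and drop the pending error; otherwise pass the status through as-is.
 */
static int mmsg_result(int datagrams, int ret)
{
	if (datagrams > 0)
		return datagrams;

	return ret;
}
```

With this rule, a call that moved three packets before hitting -ENOBUFS reports 3, and the error only surfaces on a later call that moves nothing.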
Re: RTnet sendmmsg and ENOBUFS
On 14.11.19 19:18, Jan Kiszka wrote:
> On 14.11.19 18:55, Lange Norbert wrote:
> > According to the code in __rtdm_fd_sendmmsg, that's not what happens; ENOBUFS would be returned instead, and the number of sent packets is lost forever.
> >
> > if (datagrams > 0 && (ret == 0 || ret == -EWOULDBLOCK)) {
> >     /* NOTE: SO_ERROR should be honored for other errors. */
> >     rtdm_fd_put(fd);
> >     return datagrams;
> > }
> >
> > IMHO this condition would need to be added:
> > ((flags & MSG_DONTWAIT) && ret == -ENOBUFS)
> >
> > (Recvmmsg possibly similarly, haven't checked yet)
>
> sendmmsg was only added to Xenomai 3.1. There might be room for improvements, if not corrections. So, if we do not return the number of sent messages or signal an error where we should not (this is how I read the man page currently), this needs a patch...

The implementation of sendmmsg is wrong when comparing it to the man page and the reference in the kernel (as this is Linux-only):

    /* We only return an error if no datagrams were able to be sent */

says the kernel, for example, and does

    if (datagrams != 0)
        return datagrams;

It also fails to trace on certain exits. Will write a patch.

Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
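From the application side, the semantics discussed in this thread (a positive return is a count of packets sent, an error is reported only when nothing was sent) suggest a submission loop along the following lines. This is a rough sketch under stated assumptions, not RTnet code: batch_send_fn is a hypothetical stand-in for a sendmmsg()-style call, and struct fake_nic / fake_send are demo-only helpers that simulate a driver running out of buffers.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical callback standing in for a sendmmsg()-like call: tries to
 * send up to `count` packets starting at index `first`. Returns the number
 * sent (> 0), or -1 with errno set if no packet could be sent.
 */
typedef int (*batch_send_fn)(void *ctx, size_t first, size_t count);

/*
 * Submit `total` packets without blocking. Returns how many were accepted,
 * so the caller can drop or re-queue the remainder. *err receives 0, or the
 * errno value that stopped submission early.
 */
static size_t submit_batch(batch_send_fn send_fn, void *ctx, size_t total,
			   int *err)
{
	size_t sent = 0;

	*err = 0;
	while (sent < total) {
		int n = send_fn(ctx, sent, total - sent);

		if (n > 0) {
			sent += (size_t)n;	/* partial success: advance */
			continue;
		}
		/* Nothing sent: handle ENOBUFS like EAGAIN/EWOULDBLOCK. */
		*err = (n < 0) ? errno : EAGAIN;
		break;
	}

	return sent;
}

/* Demo-only stand-in for a NIC with a limited number of free buffers. */
struct fake_nic {
	size_t free_bufs;
};

static int fake_send(void *ctx, size_t first, size_t count)
{
	struct fake_nic *nic = ctx;
	size_t n = count < nic->free_bufs ? count : nic->free_bufs;

	(void)first;	/* a real sender would index its message vector */
	if (n == 0) {
		errno = ENOBUFS;
		return -1;
	}
	nic->free_bufs -= n;

	return (int)n;
}
```

Keeping the count and the error separate means the caller never loses track of how many packets were queued, which is the information that went missing with the pre-patch behaviour.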
Re: RTnet sendmmsg and ENOBUFS
On 14.11.19 18:55, Lange Norbert wrote:
> So, for my setup socket_rtskbs is 16, the rt_igp driver rtskbs are 256TX + 256RX.
>
> As said, our software prepares packets before a timeslice, and would aim to minimize system calls and interrupts; packets are sent over raw rtsockets.
>
> If I understand __rtdm_fd_sendmmsg and rt_packet_sendmsg correctly, sendmsg will pick one socket_rtskbs, copy data from userspace and then pass this rtskb to rtdev_xmit. I don't see how a free buffer gets passed back, like README.pools describes; I guess rtskb_acquire should somehow do this.

The buffer returns (not necessarily the same one, though) when the packet was truly sent, and the driver ran its TX cleanup. If you submit many packets as a chunk, that may block them for a while.

> So in short, I am using only one socket_rtskbs temporarily, as the function passes the buffer to the rtdev (rt_igp driver)?

You are using as many rtskbs as it takes to get the data you passed down forwarded as packets to the NIC, and that as long as the NIC needs to get that data DMA'ed to the transmitter.

> I suppose the receive path works similarly.

RX works by accepting a global-pool buffer (this is where incoming packets first end up) filled with data in exchange for an empty rtskb from the socket pool. That filled rtskb is put into the socket pool once the data was transferred to userspace.

> Now if I would want to send nonblocking, i.e. as many packets as possible, exhausting the rtskbs, then I would expect the EAGAIN/EWOULDBLOCK error and getting back the number of successfully queued packets (so I could drop them and send the remaining later).

I don't recall why anymore, but we decided to use a different error code in RTnet for this back then, possibly to differentiate this "should never ever happen in a deterministic network" from other errors.

> According to the code in __rtdm_fd_sendmmsg, that's not what happens; ENOBUFS would be returned instead, and the number of sent packets is lost forever.
> if (datagrams > 0 && (ret == 0 || ret == -EWOULDBLOCK)) {
>     /* NOTE: SO_ERROR should be honored for other errors. */
>     rtdm_fd_put(fd);
>     return datagrams;
> }
>
> IMHO this condition would need to be added:
> ((flags & MSG_DONTWAIT) && ret == -ENOBUFS)
>
> (Recvmmsg possibly similarly, haven't checked yet)

sendmmsg was only added to Xenomai 3.1. There might be room for improvements, if not corrections. So, if we do not return the number of sent messages or signal an error where we should not (this is how I read the man page currently), this needs a patch...

Jan

-----Original Message-----
From: Xenomai On Behalf Of Lange Norbert via Xenomai
Sent: Wednesday, 13 November 2019 18:53
To: Jan Kiszka; Xenomai (xenomai@xenomai.org)
Subject: RE: RTnet sendmmsg and ENOBUFS

-----Original Message-----
From: Jan Kiszka
Sent: Wednesday, 13 November 2019 18:39
To: Lange Norbert; Xenomai (xenomai@xenomai.org)
Subject: Re: RTnet sendmmsg and ENOBUFS

On 13.11.19 16:10, Lange Norbert via Xenomai wrote:

Hello,

for one of our applications, we have (unfortunately) a single Ethernet connection for realtime and non-realtime traffic.

We solve this by sending timeslices with RT first, then filling up the remaining space. When stressing the limits (quite possibly beyond if accounting for bugs), the sendmmsg call over a raw socket returns ENOBUFS (even with a single small packet). I was expecting this call to just block until the resources are available.

Blocking would mean that the sites which make buffers available again had to signal this. The original design idea was to avoid such overhead and rather rely on the applications to schedule their submissions properly and preallocate resources accordingly.

Ok. In other words, this is the same behaviour as using MSG_DONTWAIT (with a different errno value)

Timeslices are 1 ms, so that could be around 12Kbyte total or ~190 60Byte packets (theoretical max).
What variables are involved (what are the Xenomai buffer limits, are they shared or per interface) and what choices do I have?

- I could send the packets nonblocking and wait or drop the remaining myself
- I could deal with ENOBUFS the same way as EAGAIN (is there any difference actually)
- I could raise the amount of internal buffer somehow

Check kernel/drivers/net/doc/README.pools

Also while stress testing I get these messages:

[ 5572.044934] hard_start_xmit returned 16
[ 5572.054989] hard_start_xmit returned 16
[ 5572.064007] hard_start_xmit returned 16
[ 5572.067893] hard_start_xmit returned 16
[ 5572.071739] hard_start_xmit returned 16
[ 5572.075586] hard_start_xmit returned 16
[ 5575.096116] hard_start_xmit returned 16
[ 5579.377038] hard_start_xmit returned 16

This likely comes from NETDEV_TX_BUSY signaled by the driver. Check the one you use for reasons. May include "I don't have
RE: RTnet sendmmsg and ENOBUFS
So, for my setup socket_rtskbs is 16, the rt_igp driver rtskbs are 256TX + 256RX.

As said, our software prepares packets before a timeslice, and would aim to minimize system calls and interrupts; packets are sent over raw rtsockets.

If I understand __rtdm_fd_sendmmsg and rt_packet_sendmsg correctly, sendmsg will pick one socket_rtskbs, copy data from userspace and then pass this rtskb to rtdev_xmit. I don't see how a free buffer gets passed back, like README.pools describes; I guess rtskb_acquire should somehow do this.

So in short, I am using only one socket_rtskbs temporarily, as the function passes the buffer to the rtdev (rt_igp driver)? I suppose the receive path works similarly.

Now if I would want to send nonblocking, i.e. as many packets as possible, exhausting the rtskbs, then I would expect the EAGAIN/EWOULDBLOCK error and getting back the number of successfully queued packets (so I could drop them and send the remaining later).

According to the code in __rtdm_fd_sendmmsg, that's not what happens; ENOBUFS would be returned instead, and the number of sent packets is lost forever.

    if (datagrams > 0 && (ret == 0 || ret == -EWOULDBLOCK)) {
        /* NOTE: SO_ERROR should be honored for other errors. */
        rtdm_fd_put(fd);
        return datagrams;
    }

IMHO this condition would need to be added:

    ((flags & MSG_DONTWAIT) && ret == -ENOBUFS)

(Recvmmsg possibly similarly, haven't checked yet)

Thanks for the help,
Norbert

> -----Original Message-----
> From: Xenomai On Behalf Of Lange Norbert via Xenomai
> Sent: Wednesday, 13 November 2019 18:53
> To: Jan Kiszka; Xenomai (xenomai@xenomai.org)
> Subject: RE: RTnet sendmmsg and ENOBUFS
>
> > -----Original Message-----
> > From: Jan Kiszka
> > Sent: Wednesday, 13 November 2019 18:39
> > To: Lange Norbert; Xenomai (xenomai@xenomai.org)
> > Subject: Re: RTnet sendmmsg and ENOBUFS
> >
> > On 13.11.19 16:10, Lange Norbert via Xenomai wrote:
> > > Hello,
> > >
> > > for one of our applications, we have (unfortunately) a single Ethernet connection for realtime and non-realtime traffic.
> > >
> > > We solve this by sending timeslices with RT first, then filling up the remaining space. When stressing the limits (quite possibly beyond if accounting for bugs), the sendmmsg call over a raw socket returns ENOBUFS (even with a single small packet).
> > > I was expecting this call to just block until the resources are available.
> >
> > Blocking would mean that the sites which make buffers available again had to signal this. The original design idea was to avoid such overhead and rather rely on the applications to schedule their submissions properly and preallocate resources accordingly.
>
> Ok.
> In other words, this is the same behaviour as using MSG_DONTWAIT (with a different errno value)
>
> > > Timeslices are 1 ms, so that could be around 12Kbyte total or ~190 60Byte packets (theoretical max).
> > >
> > > What variables are involved (what are the Xenomai buffer limits, are they shared or per interface) and what choices do I have?
> > > > > > - I could send the packages nonblocking and wait or drop the remaining > > > myself > > > - I could deal with ENOBUFS the same way as EAGAIN (is there any > > > difference actually) > > > - I could raise the amount of internal buffer somehow > > > > Check kernel/drivers/net/doc/README.pools > > > > > > > > Also while stresstesting I get these messages: > > > > > > [ 5572.044934] hard_start_xmit returned 16 [ 5572.054989] > > > hard_start_xmit returned 16 [ 5572.064007] hard_start_xmit returned 16 > > > [ 5572.067893] hard_start_xmit returned 16 [ 5572.071739] > > > hard_start_xmit returned 16 [ 5572.075586] hard_start_xmit returned 16 > > > [ 5575.096116] hard_start_xmit returned 16 [ 5579.377038] > > > hard_start_xmit returned 16 > > > > This likely comes from NETDEV_TX_BUSY signaled by the driver. Check the > > one you use for reasons. May include "I don't have buffers left". > > Yes it does, I was afraid this would indicate some leaked buffers. > > Norbert
Re: Xenomai crashes when breaking into the debugger
‐‐‐ Original Message ‐‐‐ On Thursday, November 14, 2019 1:43 AM, Jan Kiszka wrote: > On 14.11.19 02:58, Jeff Webb via Xenomai wrote: > > > Lange Norbert via Xenomai wrote: > > > > > > From: Jan Kiszka > > > > On 13.11.19 16:18, Lange Norbert via Xenomai wrote: > > > > > > > > > I am running into some bad issues with debugging, can't really narrow > > > > > down when they happen, but usually when I run through GDB and want to > > > > > "break" (pause execution), it seems to be related to other Xenomai > > > > > programs running at the same time (as said it's hard to narrow down). > > > > > > > > We have a gdb test case. Does it trigger for you as well when you run > > > > some > > > > other program in parallel? > > > > Also, could you provide the full kernel log? Possibly, enabling the > > > > I-pipe > > > > tracer with panic dump could be useful as well. But the most important > > > > step > > > > would be to create reproducibility for a third party like me. > > > > > > Currently the issue is gone, and I don't have time for researching the > > > cause. > > > Is panic dump a kernel compilation config? > > > > I think one of my colleagues has experienced something similar. > > He said that when one application was stopped in a breakpoint, > > it caused sem_timedwait calls in another application to not time > > out until execution of the other program was resumed. I will ask > > and see if he can put together a reproducible test case. I know > > the problem was repeatable at one point with the two applications > > he was working with. > > This particular behavior is solved in 3.1 by > https://gitlab.denx.de/Xenomai/xenomai/commit/9ebc2b6ea49406026e9e69d8fa490b3f8d8f0a24. That is great. Thanks for pointing this out.
> > I have personally experienced what seems (to me) to be a similar > > issue involving signal handling where a signal handling thread > > received a SIGINT via sigwait (other threads had SIGINT blocked), > > and tried to set a global variable that should have caused the > > other threads to terminate. The other threads had an issue where > > they would not wake up from sem_timedwait calls (or even sleep > > calls) after the SIGINT was received by the other thread, so they > > would not terminate properly. The same code worked fine under > > Xenomai 2.6. I tried to create a standalone example to reproduce > > this today, but I could not recreate the problem. I know it was very > > reproducible when I was constructing a work-around for it. > > Could it be that some fault occurs that causes subsequent bad > > behavior with respect to signal handling (SIGINT/debugging) that > > is fixed by a reboot? > > Just trying to shed some light on the problem. I think there is > > a bug here somewhere... > > Stand-alone test cases or test sequences are always welcome! Just please > also make sure to test 3.1-rc, as the debugging code changed there quite a bit. Also good to know. Thanks again! -Jeff
Re: INTR-REMAP error with UDD driver
‐‐‐ Original Message ‐‐‐ On Thursday, November 14, 2019 1:50 AM, Jan Kiszka wrote: > On 14.11.19 06:05, Jeff Webb via Xenomai wrote: > > I would like to revive this thread from several months ago: > > https://xenomai.org/pipermail/xenomai/2019-March/040498.html > > The issue is that on some hardware (a specific rack-mount PC with a PICMG > > daughtercard on a backplane containing PCI and PCIe slots) I get an > > INTR-REMAP error when trying to receive legacy (not MSI) interrupts from a > > custom FPGA-based PCI card using a UDD driver. The card did work properly > > in one out of the five PCI slots on that machine, but UDD interrupts did > > not work in the other four slots. > > Please review the original thread for more details about the specific error. > > Here are a few more tidbits I have gathered: > > > > - The UDD driver / userspace code works fine on the other hardware > > > > - The UDD driver / userspace code works fine in one PCI slot out of five > > on this hardware. > > > > - With another backplane model, but same processor card, the problem > > occurs in all four of the PCI slots. > > > > - An almost identical pure-linux UIO version of the driver / userspace > > code works in all the cases I tested, even when the UDD version fails, and > > even with the same xenomai-patched kernel used for UDD testing. > > > > > > In one of the previous posts in this thread a few months ago, Per Öberg > > mentioned experiencing something similar. Based on the information that was > > shared, I tried my code with linux version 4.9.38, but it still failed. > > This prompted me to try other linux / ipipe / xenomai combinations. 
These > > are my findings: > > Interrupts work: > > xenomai-2.6.5 ipipe-core-3.18.20-x86-7.patch (2016-07-05) > > xenomai-3.0.9+ ipipe-core-3.18.20-x86-7.patch (2016-07-05) > > xenomai-3.0.9+ ipipe-core-4.1.18-x86-9.patch (2017-05-25) > > INTR-REMAP error: > > xenomai-3.0.9+ ipipe-core-4.4.43-x86-6.patch (2017-02-25) > > xenomai-3.0.9+ ipipe-core-4.4.43-x86-7.patch (2017-05-25) > > xenomai-3.0.9+ ipipe-core-4.4.43-x86-8.patch (2017-06-14) > > xenomai-3.1-rc3 ipipe-core-4.4.196-cip38-x86-19.patch (2019-11-04) > > xenomai-3.0.9+ ipipe-core-4.9.38-x86-4.patch (2017-10-03) > > xenomai-3.0.9 ipipe-core-4.14.132-x86-6.patch (2019-07-03) > > The Xenomai 2.6.5 version of course does not use UDD, but uses the old > > pthread_intr_* userspace functions. > > Hopefully this additional information can shed a little light on the matter. > > This sounds like some RT interrupt enabling issue related to the IOAPIC > in the x86 I-pipe patch. Please also test 4.19. Ok, I will do this. > Are you using UDD_IRQ_CUSTOM or do you leave the interrupt registration > to the UDD core? I just tell UDD the IRQ number and let it register the interrupt. > And please share your kernel config. I attached one to my original post earlier this year -- you should be able to download it from the link in the mailing list archive. Let me know if you need something different. I started with the standard Ubuntu desktop kernel config and tweaked options from there, so there is a lot of stuff enabled, obviously. > BTW, interrupt remapping issues can be worked around by disabling the > interrupt remapping feature (e.g. "intremap=off"). But that does not > solve the underlying issue, of course. I can't remember if I tried this or not. I will give it a go. Obviously, it would be good to get this fixed in the patch, though. Thank you (and Per Öberg) for your help. -Jeff