Re: [Xenomai] RTDM serial illicit call from head domain 'Xenomai'

2018-05-13 Thread Steve Freyder
   Did a build with that fix, and then installed it and ran the same
   test program - no more traceback - excellent!
   Thanks!
   Steve

   On 5/13/2018 12:53 PM, Philippe Gerum wrote:
   > On 05/13/2018 07:37 PM, Greg Gallagher wrote:
   >> I can take a look at it this week. I'd like to trace through some of the
   >> drivers to get a better understanding of when or if they cause a domain
   >> switch.
   >
   > I pushed a quick fix for the i.MX serial driver, but I believe that a
   > general review of the oldish drivers we have there would be very useful.
___
Xenomai mailing list
Xenomai@xenomai.org
https://xenomai.org/mailman/listinfo/xenomai


Re: [Xenomai] RTDM serial illicit call from head domain 'Xenomai'

2018-05-13 Thread Philippe Gerum
On 05/13/2018 07:37 PM, Greg Gallagher wrote:
> I can take a look at it this week. I'd like to trace through some of the 
> drivers to get a better understanding of when or if they cause a domain 
> switch.
> 

I pushed a quick fix for the i.MX serial driver, but I believe that a
general review of the oldish drivers we have there would be very useful.

-- 
Philippe.



Re: [Xenomai] RTDM serial illicit call from head domain 'Xenomai'

2018-05-13 Thread Greg Gallagher
I can take a look at it this week. I'd like to trace through some of the 
drivers to get a better understanding of when or if they cause a domain switch.



Re: [Xenomai] RTDM serial illicit call from head domain 'Xenomai'

2018-05-13 Thread Philippe Gerum
On 05/13/2018 07:02 AM, Greg Gallagher wrote:
> I wouldn't say the driver is improperly written; this ioctl call may not
> be expected to be used in a low-latency situation.  Some code may be
> expected to be called at initialization time and then never again, so
> it doesn't impact RT operation.  The rt_imx_uart driver is part of the
> Xenomai code base.  I'm not 100% sure whether these functions shouldn't
> be called from an RTDM driver at all, or whether the author just needs
> to be aware that they could introduce latency when used in the rt context.
> These panics aren't kernel or driver errors; we create them on
> purpose to track down domain switches that could impact
> latency.  So as long as your latency numbers are acceptable, you don't
> need to enable the debug flag to track down domain switches, and these
> panics can be ignored for now.
> 

I would recommend fixing this in the RTDM driver as soon as possible
[1], because it may actually lead to crashes.

Those debug checks trap invalid re-entries into regular kernel code
from the co-kernel (rt) context. Here is an illustration of a possible
scenario involving such a bad re-entry:

   /* regular kernel context */
   spin_lock_irqsave(&lock, flags);

      handle_rt_event();

         /* rt thread context. */
         ioctl(imx_fd, ...);
            /* invalid call of regular kernel code */
            spin_lock_irqsave(&lock, flags);

IOW, we have to keep in mind that any IRQ routed to the real-time
co-kernel may preempt most of the regular kernel code, even though the
latter has [only virtually] disabled interrupts: making this possible
is the whole purpose of the interrupt pipeline. The restriction that
comes with this feature is that we may not re-enter the preempted
code, because the logic there still rightfully assumes this cannot happen.

As a matter of fact, the interrupt pipeline potentially turns any device
IRQ into an NMI from the standpoint of the Linux kernel. In addition, the
co-kernel has its own scheduler to manage rt threads. In short, rt
activities are not serialized by the regular kernel locking scheme.
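
The re-entry rule described above can be modelled in plain user-space C.
This is only a toy sketch, not kernel or I-pipe code: the names
current_domain, root_only_check(), regular_lock() and rt_ioctl() are all
hypothetical stand-ins. A flag records which domain is running, and a
"regular" service refuses to run (and counts the attempt) when it is
entered from the head domain, which is roughly what the debug check
reports as an illicit call.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the two pipeline domains. */
enum domain { DOMAIN_ROOT, DOMAIN_HEAD };

static enum domain current_domain = DOMAIN_ROOT;
static int illicit_calls;

/* Toy debug check: regular services may only run from the root domain. */
static bool root_only_check(const char *service)
{
	if (current_domain != DOMAIN_ROOT) {
		illicit_calls++;
		fprintf(stderr, "illicit call from head domain into %s\n",
			service);
		return false;
	}
	return true;
}

/* A non-recursive lock owned by regular (root domain) code. */
struct toy_lock {
	bool held;
};

/* Returns 0 on success, -1 if the call is illicit or would self-deadlock. */
static int regular_lock(struct toy_lock *l)
{
	if (!root_only_check("regular_lock"))
		return -1;
	if (l->held)
		return -1;
	l->held = true;
	return 0;
}

static void regular_unlock(struct toy_lock *l)
{
	l->held = false;
}

/*
 * Toy rt handler: runs in the head domain, possibly while root code
 * still holds the lock, and wrongly calls the regular locking service.
 */
static int rt_ioctl(struct toy_lock *l)
{
	enum domain saved = current_domain;
	int ret;

	current_domain = DOMAIN_HEAD;
	ret = regular_lock(l);
	if (ret == 0)
		regular_unlock(l);
	current_domain = saved;

	return ret;
}
```

In this model, calling rt_ioctl() while root code holds the lock fails and
bumps illicit_calls. It also fails when the lock happens to be free, which
mirrors the point that the call is invalid whether or not it ends up
hurting on a given run.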

[1]
diff --git a/kernel/drivers/serial/rt_imx_uart.c b/kernel/drivers/serial/rt_imx_uart.c
index 61836ae09..1aec219b7 100644
--- a/kernel/drivers/serial/rt_imx_uart.c
+++ b/kernel/drivers/serial/rt_imx_uart.c
@@ -963,6 +963,13 @@ static int rt_imx_uart_ioctl(struct rtdm_fd *fd,
 		struct rtser_config config_buf;
 		uint64_t *hist_buf = NULL;

+		/*
+		 * We may call regular kernel services ahead, ask for
+		 * re-entering secondary mode if need be.
+		 */
+		if (rtdm_in_rt_context())
+			return -ENOSYS;
+
 		config = (struct rtser_config *)arg;

 		if (rtdm_fd_is_user(fd)) {
@@ -984,13 +991,6 @@ static int rt_imx_uart_ioctl(struct rtdm_fd *fd,
 			return -EINVAL;

 		if (config->config_mask & RTSER_SET_TIMESTAMP_HISTORY) {
-			/*
-			 * Reflect the call to non-RT as we will likely
-			 * allocate or free the buffer.
-			 */
-			if (rtdm_in_rt_context())
-				return -ENOSYS;
-
 			if (config->timestamp_history &
 			    RTSER_RX_TIMESTAMP_HISTORY)
 				hist_buf = kmalloc(IN_BUFFER_SIZE *
@@ -1000,7 +1000,8 @@ static int rt_imx_uart_ioctl(struct rtdm_fd *fd,

 		rt_imx_uart_set_config(ctx, config, &hist_buf);

-		kfree(hist_buf);
+		if (hist_buf)
+			kfree(hist_buf);
 		break;
 	}
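
The pattern used by the patch (return -ENOSYS when entered from rt
context, so that the request is reflected to non-RT context) can be
sketched in user-space C. This is a simplified model under stated
assumptions, not the RTDM core: in_rt_context stands in for
rtdm_in_rt_context(), and toy_set_config() and toy_ioctl() are
hypothetical names. The dispatcher here simply retries the handler once
after a simulated switch to secondary mode.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the result of rtdm_in_rt_context(). */
static bool in_rt_context;

static int config_applied;	/* last baud rate accepted by the handler */
static int mode_switches;	/* how often the toy dispatcher had to relax */

/*
 * Toy handler in the spirit of the patch: it may need regular kernel
 * services, so it refuses to run from rt context.
 */
static int toy_set_config(int baud)
{
	if (in_rt_context)
		return -ENOSYS;	/* ask to be called again from non-rt context */

	if (baud <= 0)
		return -EINVAL;

	config_applied = baud;
	return 0;
}

/*
 * Toy dispatcher: call the handler in the caller's context first; on
 * -ENOSYS from rt context, simulate the switch to secondary mode and
 * call the handler again.
 */
static int toy_ioctl(bool caller_is_rt, int baud)
{
	int ret;

	in_rt_context = caller_is_rt;
	ret = toy_set_config(baud);
	if (ret == -ENOSYS && in_rt_context) {
		in_rt_context = false;	/* models the mode switch */
		mode_switches++;
		ret = toy_set_config(baud);
	}

	return ret;
}
```

The cost of this approach is the mode switch itself: a caller in rt
context still gets its configuration applied, but only after leaving
primary mode, so such calls are best kept out of latency-critical paths.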


-- 
Philippe.



Re: [Xenomai] RTDM serial illicit call from head domain 'Xenomai'

2018-05-12 Thread Greg Gallagher
I wouldn't say the driver is improperly written; this ioctl call may not
be expected to be used in a low-latency situation.  Some code may be
expected to be called at initialization time and then never again, so
it doesn't impact RT operation.  The rt_imx_uart driver is part of the
Xenomai code base.  I'm not 100% sure whether these functions shouldn't
be called from an RTDM driver at all, or whether the author just needs
to be aware that they could introduce latency when used in the rt context.
These panics aren't kernel or driver errors; we create them on
purpose to track down domain switches that could impact
latency.  So as long as your latency numbers are acceptable, you don't
need to enable the debug flag to track down domain switches, and these
panics can be ignored for now.

-Greg


Re: [Xenomai] RTDM serial illicit call from head domain 'Xenomai'

2018-05-12 Thread Steve Freyder
   So the fundamental issue here seems to be "how bad is bad enough" when
   it comes to these mode switches.

   The write() call is wrapped with __RT(write)(...), so I assume it is
   doing an RTDM-based write request, and not a standard Linux write()
   syscall.  If I remove that wrapper, I get an EPERM error from the
   un-wrapped write() call.

   Had I not been running with a debug kernel, this would never have
   shown up at all as far as I have seen.  Perhaps this imx_uart driver
   is improperly coded - I assume it is not part of the standard Xenomai
   code base, is that correct?  Perhaps the writers of this driver
   improperly mixed functions that cause the code path inside the RTDM
   calling sequence to invoke code that should not really be getting
   invoked "by convention" but, due to the specifics, is known to be
   benign overall, and so when the debug code is enabled, it results in a
   false-positive detection of a generic problem scenario that in this
   specific case is truly always 100% benign.  If that's true, then I'm
   going to take this off my list of things to concern myself about.
   It's only recently that we've been running kernels with this debug
   capability enabled, so I've never seen this before, and we have never
   had any issues with it at all, so that tends to suggest this is truly
   benign for these specific code paths.

   Thanks Greg,
   Steve


Re: [Xenomai] RTDM serial illicit call from head domain 'Xenomai'

2018-05-12 Thread Greg Gallagher
I'll try to answer part of this. The detection of a cross-domain call
comes from the ipipe code in the kernel.  This is being called
because the ipipe debug flags are on and it's detecting the switch
from the root domain and then causing a panic so we can see the stack
trace.
I'm not sure why the ioctl call is causing this.  It looks like when
we try to get the clock rate we start to access a normal Linux service,
which causes the stall and triggers the panic you see in your logs.
I'm assuming that in your write you aren't accessing a normal Linux
resource, so we don't see the attempt to switch domains and therefore
no panic.
Panics in general happen because the system is in a "bad" state. In
this scenario it's not really "bad"; we are just detecting the mode
switch and then getting enough information to fix the issue.  This is
why your system seems sane, but if you hit a worse panic then your
system may not be stable enough to do anything.

-Greg


[Xenomai] RTDM serial illicit call from head domain 'Xenomai'

2018-05-12 Thread Steve Freyder

Greetings again,

Xenomai 3.0.6, armv7, imx6, imx_uart rtdm driver

I've seen many postings about this, and about symbol wrapping, etc, etc.
I'm still not understanding something very basic here, I'm sure.

When I run a program built with --alchemy (no --posix) skin, and I
execute these lines of code (error checking is omitted here but being
done in the real program and not failing):


#define SER_BAUD     9600      /**< Baud rate for SYNC interface */
#define SYNC_DEVICE  "rtser0"  /**< serial device used for SYNC */

static const struct rtser_config sync_config = {
    .config_mask       = 0x,
    .baud_rate         = SER_BAUD,
    .parity            = RTSER_NO_PARITY,
    .data_bits         = RTSER_8_BITS,
    .stop_bits         = RTSER_1_STOPB,
    .handshake         = RTSER_NO_HAND,
    .fifo_depth        = RTSER_FIFO_DEPTH_1,
    .rx_timeout        = RTSER_TIMEOUT_NONE,
    .tx_timeout        = 1e9,
    .event_timeout     = 1e9,
    .timestamp_history = RTSER_DEF_TIMESTAMP_HISTORY,
    .event_mask        = RTSER_EVENT_RXPEND,
};

fd = __RT(open)(SYNC_DEVICE, 0);

err = __RT(ioctl)(fd, RTSER_RTIOC_SET_CONFIG, &sync_config);

I get this traceback (once only per system boot):


[  411.088376] I-pipe: Detected illicit call from head domain 'Xenomai'
[  411.088376] into a regular Linux service
[  411.100666] CPU: 1 PID: 875 Comm: rtserE Not tainted 4.1.18_C01571-15S00-00.000.zimg+83fdace666 #1
[  411.109644] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[  411.116189] Backtrace:
[  411.118694] [<80014a64>] (dump_backtrace) from [<80014c9c>] (show_stack+0x20/0x24)
[  411.126280]  r7: r6:0080 r5: r4:80b81c94
[  411.132072] [<80014c7c>] (show_stack) from [<806b5f3c>] (dump_stack+0xa0/0xc4)
[  411.139326] [<806b5e9c>] (dump_stack) from [<800ab000>] (ipipe_root_only+0x11c/0x188)
[  411.147171]  r9:80c58300 r8: r7:80c45380 r6:80b34e6c r5:600d0013 r4:809abba4
[  411.155073] [<800aaee4>] (ipipe_root_only) from [<8001f5ac>] (ipipe_test_and_stall_root+0x18/0xc0)
[  411.164046]  r10:bc5c0024 r9: r8:40480201 r7:0005 r6:2580 r5:80bc154c
[  411.172023]  r4:80ba5c9c r3:
[  411.175675] [<8001f594>] (ipipe_test_and_stall_root) from [<806b8274>] (mutex_trylock+0x40/0x1ec)
[  411.184561]  r7:0005 r6:2580 r5:80bc154c r4:80ba5c9c
[  411.190358] [<806b8234>] (mutex_trylock) from [<80580d78>] (clk_prepare_lock+0x1c/0xfc)
[  411.198376]  r7:0005 r6:2580 r5:bece7e50 r4:bec36480
[  411.204164] [<80580d5c>] (clk_prepare_lock) from [<80581e8c>] (clk_core_get_rate+0x1c/0x70)
[  411.212530]  r5:bece7e50 r4:bec36480
[  411.216180] [<80581e70>] (clk_core_get_rate) from [<80581f04>] (clk_get_rate+0x24/0x28)
[  411.224198]  r5:bece7e50 r4:bc5c2000
[  411.227863] [<80581ee0>] (clk_get_rate) from [<7f08d2c4>] (rt_imx_uart_ioctl+0xa88/0xe5c [xeno_imx_uart])
[  411.237464] [<7f08c83c>] (rt_imx_uart_ioctl [xeno_imx_uart]) from [<8010779c>] (rtdm_fd_ioctl+0xc0/0x218)
[  411.247048]  r10:00011638 r9: r8:40480201 r7:0005 r6:bc5c r5:600d0013
[  411.255025]  r4:80c58300
[  411.257609] [<801076e0>] (rtdm_fd_ioctl) from [<8010dc70>] (CoBaLt_ioctl+0x18/0x1c)
[  411.265280]  r3:00011638 r2:00011638 r1:40480201
[  411.269989]  r10:bf648800 r9:c0943008 r8:8010dc58 r7:80b34e6c r6:0001 r5:0052
[  411.277964]  r4:bece7fb0
[  411.280548] [<8010dc58>] (CoBaLt_ioctl) from [<8011efc4>] (ipipe_syscall_hook+0x174/0x380)
[  411.288839] [<8011ee50>] (ipipe_syscall_hook) from [<800ad6d8>] (__ipipe_notify_syscall+0xa4/0x3e0)
[  411.297899]  r10:bf648800 r9:80c45380 r8:80b34e6c r7:bf649800 r6:80c45380 r5:0001
[  411.305873]  r4:200d0013
[  411.308464] [<800ad634>] (__ipipe_notify_syscall) from [<80010868>] (pipeline_syscall+0x8/0x24)
[  411.317177]  r10:0002 r9:bece6000 r8:80010928 r7:000f0042 r6:0005 r5:40480201
[  411.325153]  r4:00011638


If I do not execute the ioctl call, and I instead call:

err = __RT(write)(fd,"x",1) ;

I do not get the traceback, and the write is successful.  This tells me
that the ioctl() path has some kind of check in it that the write() path
doesn't have.  Is the detection of a cross-domain call something that an
RTDM driver is doing, or is something at a higher level making these
checks?

What's more, I've seen many comments that this is a problem scenario,
and that it will put the system into a "bad state".  But all of my
testing says that this is completely benign and everything is working
as I expect it to.  It can't be both ways - which way is it, and why?

Thanks in advance,
Steve

