Re: high latency when CPU is fully loaded

2019-11-18 Thread chensong via Xenomai




On 2019-11-19 14:49, Jan Kiszka wrote:

On 19.11.19 02:01, chensong via Xenomai wrote:

Dear experts,

i'm new in xenomai, i got an issue, here is the detail:

Main processor architect: ARM64 phytium ft2000ahk
Kernel release number: 4.14.4
cmdline:BOOT_IMAGE=/Image-tmp
root=UUID=9fea0634-a9c9-4e9f-906c-9c36b7249822 console=ttyS1,115200
earlyprintk=uart8250-32bit,0x28001000 rw rootdelay=10 KEYBOARDTYPE=pc
KEYTABLE=us security=
xenomai release number:3.1-devel
xenomai configuration:
kylin@kylin-os:~/workspace/code/nudt-hgj-xenomai-tjrd$ grep
configure config.status
# Generated by configure.
   # Compiler output produced by configure, useful for debugging
# configure, is in config.log if it exists.
configured by ./configure, generated by GNU Autoconf 2.69,
   ac_configure_extra_args=
   ac_configure_extra_args="$ac_configure_extra_args --silent"
   set X /bin/bash './configure'  '--with-core=cobalt'
'--enable-smp' '--enable-pshared' $ac_configure_extra_args --no-create
--no-recursion
   configure_time_dlsearch_path='/lib /usr/lib
/lib/aarch64-linux-gnu /usr/lib/aarch64-linux-gnu
/usr/lib/aarch64-linux-gnu/mesa-egl /usr/lib/aarch64-linux-gnu/mesa
/usr/local/lib '
   configure_time_lt_sys_library_path=''
  for var in reload_cmds old_postinstall_cmds
old_postuninstall_cmds old_archive_cmds extract_expsyms_cmds
old_archive_from_new_cmds old_archive_from_expsyms_cmds archive_cmds
archive_expsym_cmds module_cmds module_expsym_cmds export_symbols_cmds
prelink_cmds postlink_cmds postinstall_cmds postuninstall_cmds
finish_cmds sys_lib_search_path_spec configure_time_dlsearch_path
configure_time_lt_sys_library_path; do
   # on some systems where configure will not decide to define it.
   # Let's still pretend it is `configure' which instantiates
(i.e., don't
   configure_input='Generated from '`
`' by configure.'
   configure_input="$ac_file.  $configure_input"
case $configure_input in #(
ac_sed_conf_input=`$as_echo "$configure_input" |
*) ac_sed_conf_input=$configure_input;;
 s|@configure_input@|$ac_sed_conf_input|;t t
   $as_echo "/* $configure_input  */" \
 $as_echo "/* $configure_input  */" \
  # Libtool was configured on host `(hostname || uname -n)
2>/dev/null | sed 1q`:
   : \${LT_SYS_LIBRARY_PATH="$configure_time_lt_sys_library_path"}
sys_lib_dlsearch_path_spec=$lt_configure_time_dlsearch_path
   # Explicit LT_SYS_LIBRARY_PATH set during ./configure time.
configure_time_lt_sys_library_path=$lt_configure_time_lt_sys_library_path

OR:
kylin@kylin-os:~$ xeno-config --info
 Xenomai version: Xenomai/cobalt v3.1-devel -- # ()
Linux kylin-os 4.14.4.kylin.rt-1118-ipipe-trace+ #2 SMP
PREEMPT Mon Nov 18 18:28:17 CST 2019 aarch64 aarch64 aarch64 GNU/Linux
Kernel parameters: BOOT_IMAGE=/Image-tmp
root=UUID=9fea0634-a9c9-4e9f-906c-9c36b7249822 console=ttyS1,115200
earlyprintk=uart8250-32bit,0x28001000 rw rootdelay=10 KEYBOARDTYPE=pc
KEYTABLE=us security=
I-pipe release #2 detected
Cobalt core 3.1-devel detected
Compiler: gcc version 5.4.0 20160609 (Ubuntu/Linaro
5.4.0-6kord1~16.04.10)
Build args: --prefix=/usr --includedir=/usr/include/xenomai
--mandir=/usr/share/man --with-testdir=/usr/lib/xenomai/testsuite
--enable-smp --build aarch64-linux-gnu build_alias=aarch64-linux-gnu

Desktop: kylin 4.0.2 (ubuntu likely desktop)

Issue description:
   latency and cyclictest work fine in my system in most of cases,
the worst latency is around 100us ~ 200us. however, when i ran a
script to increase the cpu load in the system, the worst latency
reached 2000us ~ 5000us or even worse. Basically, the script forks 6
processes by default and each process applies a four-pages buffer and
keeps writing without any breath, no warning or error messages in
dmesg. below is the script:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGE_SIZE 4096
#define TEST_THREADS 6

unsigned int test_threads;

void do_thread_test(void)
{
 void *mm;
 char i = 0;

 printf("mem test thread start \n");
 mm = malloc(PAGE_SIZE * 4); // 1M
 while(1) {
 for (i = 0; i<100; i++)
 memset(mm, i, PAGE_SIZE * 4);
 }


You cannot run Xenomai threads at 100% on Linux. You need to leave some
time for the rest of the system to do housekeeping. That explains the
"deadlock" you see. If you turn on CONFIG_XENO_OPT_WATCHDOG, it will
detect such mistakes and kick the task out of RT.

Jan



"do_thread_test" is not an RT task. It runs in the Linux domain, yet the
latency test running in the Xenomai domain was affected.


/Song
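
For context: tools such as latency and cyclictest arrive at a worst-case
figure roughly by sleeping until a known deadline and recording how late
the wake-up actually happens. A minimal sketch of that measurement idea
follows, using only plain POSIX calls rather than the Xenomai API. The
function name measure_wakeup_latency and its parameters are assumptions
made for illustration; they appear neither in this thread nor in Xenomai.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Convert a timespec to nanoseconds. */
static int64_t ts_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec;
}

/*
 * Sleep until successive absolute deadlines spaced period_ns apart and
 * record how late each wake-up is. Stores the largest lateness seen in
 * *worst_ns. Returns 0 on success, -1 on bad arguments or clock errors.
 * Hypothetical helper for illustration only.
 */
int measure_wakeup_latency(int64_t period_ns, int loops, int64_t *worst_ns)
{
    struct timespec next, now;
    int64_t deadline, late, worst = 0;
    int i;

    assert(worst_ns != NULL);
    if (period_ns <= 0 || loops <= 0)
        return -1;
    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;
    deadline = ts_to_ns(&next);

    for (i = 0; i < loops; i++) {
        deadline += period_ns;
        next.tv_sec = deadline / NSEC_PER_SEC;
        next.tv_nsec = deadline % NSEC_PER_SEC;
        /* Absolute sleep, so timing errors do not accumulate. */
        if (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL) != 0)
            return -1;
        if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -1;
        late = ts_to_ns(&now) - deadline;
        if (late > worst)
            worst = late;
    }
    *worst_ns = worst;
    return 0;
}
```

Calling measure_wakeup_latency(1000000, 1000, &worst) would sample 1000
wake-ups at a 1 ms period. Run as an ordinary Linux process, this measures
Linux scheduling latency only; it says nothing about the Cobalt core.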





Re: Deadlock during debugging

2019-11-18 Thread Jan Kiszka via Xenomai

On 18.11.19 18:31, Lange Norbert wrote:




-----Original Message-----
From: Jan Kiszka 
Sent: Montag, 18. November 2019 18:22
To: Lange Norbert ; Xenomai
(xenomai@xenomai.org) 
Subject: Re: Deadlock during debugging



On 18.11.19 17:24, Lange Norbert via Xenomai wrote:

One more,

Note that there seem to be quite different reports, from a recursive fault
to some threads getting marked as "runaway".

I can reproduce the issue easily now, but it's proprietary software that I
can't pass around.

Understood. Will try to read something from the traces.

This is a regression over 3.0 now, correct?


No, I can't say that. I have had various recurring issues with 4.9, 4.14 and
4.19 kernels, as well as with Xenomai 3.0 and 3.1.
They are hard to narrow down and often just vanished after a while; my only
gut feeling is that condition variables are involved.
I also have a couple of Cobalt threads that are *not* pinned to a single CPU.


I'm only talking about the crash during debug - one issue after the other.

Jan



At least I can now say it's a single app causing the issue, with no RTnet in
use and no additional Cobalt applications running.
Since I can easily reproduce the issue, I will now try Debian's gcc-8 to
rule out trouble with the toolchain.

Norbert.








--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



Re: high latency when CPU is fully loaded

2019-11-18 Thread Jan Kiszka via Xenomai

On 19.11.19 02:01, chensong via Xenomai wrote:

Dear experts,

i'm new in xenomai, i got an issue, here is the detail:

Main processor architect: ARM64 phytium ft2000ahk
Kernel release number: 4.14.4
cmdline:BOOT_IMAGE=/Image-tmp 
root=UUID=9fea0634-a9c9-4e9f-906c-9c36b7249822 console=ttyS1,115200 
earlyprintk=uart8250-32bit,0x28001000 rw rootdelay=10 KEYBOARDTYPE=pc 
KEYTABLE=us security=

xenomai release number:3.1-devel
xenomai configuration:
    kylin@kylin-os:~/workspace/code/nudt-hgj-xenomai-tjrd$ grep 
configure config.status

    # Generated by configure.
   # Compiler output produced by configure, useful for debugging
    # configure, is in config.log if it exists.
    configured by ./configure, generated by GNU Autoconf 2.69,
   ac_configure_extra_args=
   ac_configure_extra_args="$ac_configure_extra_args --silent"
   set X /bin/bash './configure'  '--with-core=cobalt' 
'--enable-smp' '--enable-pshared' $ac_configure_extra_args --no-create 
--no-recursion
   configure_time_dlsearch_path='/lib /usr/lib 
/lib/aarch64-linux-gnu /usr/lib/aarch64-linux-gnu 
/usr/lib/aarch64-linux-gnu/mesa-egl /usr/lib/aarch64-linux-gnu/mesa 
/usr/local/lib '

   configure_time_lt_sys_library_path=''
  for var in reload_cmds old_postinstall_cmds old_postuninstall_cmds 
old_archive_cmds extract_expsyms_cmds old_archive_from_new_cmds 
old_archive_from_expsyms_cmds archive_cmds archive_expsym_cmds 
module_cmds module_expsym_cmds export_symbols_cmds prelink_cmds 
postlink_cmds postinstall_cmds postuninstall_cmds finish_cmds 
sys_lib_search_path_spec configure_time_dlsearch_path 
configure_time_lt_sys_library_path; do

   # on some systems where configure will not decide to define it.
   # Let's still pretend it is `configure' which instantiates (i.e., 
don't

   configure_input='Generated from '`
    `' by configure.'
   configure_input="$ac_file.  $configure_input"
    case $configure_input in #(
    ac_sed_conf_input=`$as_echo "$configure_input" |
    *) ac_sed_conf_input=$configure_input;;
     s|@configure_input@|$ac_sed_conf_input|;t t
   $as_echo "/* $configure_input  */" \
     $as_echo "/* $configure_input  */" \
  # Libtool was configured on host `(hostname || uname -n) 
2>/dev/null | sed 1q`:

   : \${LT_SYS_LIBRARY_PATH="$configure_time_lt_sys_library_path"}
    sys_lib_dlsearch_path_spec=$lt_configure_time_dlsearch_path
   # Explicit LT_SYS_LIBRARY_PATH set during ./configure time.
configure_time_lt_sys_library_path=$lt_configure_time_lt_sys_library_path

OR:
    kylin@kylin-os:~$ xeno-config --info
     Xenomai version: Xenomai/cobalt v3.1-devel -- # ()
    Linux kylin-os 4.14.4.kylin.rt-1118-ipipe-trace+ #2 SMP PREEMPT 
Mon Nov 18 18:28:17 CST 2019 aarch64 aarch64 aarch64 GNU/Linux
    Kernel parameters: BOOT_IMAGE=/Image-tmp 
root=UUID=9fea0634-a9c9-4e9f-906c-9c36b7249822 console=ttyS1,115200 
earlyprintk=uart8250-32bit,0x28001000 rw rootdelay=10 KEYBOARDTYPE=pc 
KEYTABLE=us security=

    I-pipe release #2 detected
    Cobalt core 3.1-devel detected
    Compiler: gcc version 5.4.0 20160609 (Ubuntu/Linaro 
5.4.0-6kord1~16.04.10)
    Build args: --prefix=/usr --includedir=/usr/include/xenomai 
--mandir=/usr/share/man --with-testdir=/usr/lib/xenomai/testsuite 
--enable-smp --build aarch64-linux-gnu build_alias=aarch64-linux-gnu


Desktop: kylin 4.0.2 (ubuntu likely desktop)

Issue description:
   latency and cyclictest work fine in my system in most of cases, 
the worst latency is around 100us ~ 200us. however, when i ran a script 
to increase the cpu load in the system, the worst latency reached 2000us 
~ 5000us or even worse. Basically, the script forks 6 processes by 
default and each process applies a four-pages buffer and keeps writing 
without any breath, no warning or error messages in dmesg. below is the 
script:


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGE_SIZE 4096
#define TEST_THREADS 6

unsigned int test_threads;

void do_thread_test(void)
{
     void *mm;
     char i = 0;

     printf("mem test thread start \n");
     mm = malloc(PAGE_SIZE * 4); // 1M
     while(1) {
     for (i = 0; i<100; i++)
     memset(mm, i, PAGE_SIZE * 4);
     }


You cannot run Xenomai threads at 100% on Linux. You need to leave some 
time for the rest of the system to do housekeeping. That explains the 
"deadlock" you see. If you turn on CONFIG_XENO_OPT_WATCHDOG, it will 
detect such mistakes and kick the task out of RT.


Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



high latency when CPU is fully loaded

2019-11-18 Thread chensong via Xenomai

Dear experts,

I'm new to Xenomai and have run into an issue. Here are the details:

Main processor architecture: ARM64 phytium ft2000ahk
Kernel release number: 4.14.4
cmdline:BOOT_IMAGE=/Image-tmp 
root=UUID=9fea0634-a9c9-4e9f-906c-9c36b7249822 console=ttyS1,115200 
earlyprintk=uart8250-32bit,0x28001000 rw rootdelay=10 KEYBOARDTYPE=pc 
KEYTABLE=us security=

xenomai release number:3.1-devel
xenomai configuration:
   kylin@kylin-os:~/workspace/code/nudt-hgj-xenomai-tjrd$ grep 
configure config.status

   # Generated by configure.
  # Compiler output produced by configure, useful for debugging
   # configure, is in config.log if it exists.
   configured by ./configure, generated by GNU Autoconf 2.69,
  ac_configure_extra_args=
  ac_configure_extra_args="$ac_configure_extra_args --silent"
  set X /bin/bash './configure'  '--with-core=cobalt' 
'--enable-smp' '--enable-pshared' $ac_configure_extra_args --no-create 
--no-recursion
  configure_time_dlsearch_path='/lib /usr/lib 
/lib/aarch64-linux-gnu /usr/lib/aarch64-linux-gnu 
/usr/lib/aarch64-linux-gnu/mesa-egl /usr/lib/aarch64-linux-gnu/mesa 
/usr/local/lib '

  configure_time_lt_sys_library_path=''
 for var in reload_cmds old_postinstall_cmds old_postuninstall_cmds 
old_archive_cmds extract_expsyms_cmds old_archive_from_new_cmds 
old_archive_from_expsyms_cmds archive_cmds archive_expsym_cmds 
module_cmds module_expsym_cmds export_symbols_cmds prelink_cmds 
postlink_cmds postinstall_cmds postuninstall_cmds finish_cmds 
sys_lib_search_path_spec configure_time_dlsearch_path 
configure_time_lt_sys_library_path; do

  # on some systems where configure will not decide to define it.
  # Let's still pretend it is `configure' which instantiates (i.e., 
don't

  configure_input='Generated from '`
   `' by configure.'
  configure_input="$ac_file.  $configure_input"
   case $configure_input in #(
   ac_sed_conf_input=`$as_echo "$configure_input" |
   *) ac_sed_conf_input=$configure_input;;
s|@configure_input@|$ac_sed_conf_input|;t t
  $as_echo "/* $configure_input  */" \
$as_echo "/* $configure_input  */" \
 # Libtool was configured on host `(hostname || uname -n) 
2>/dev/null | sed 1q`:

  : \${LT_SYS_LIBRARY_PATH="$configure_time_lt_sys_library_path"}
   sys_lib_dlsearch_path_spec=$lt_configure_time_dlsearch_path
  # Explicit LT_SYS_LIBRARY_PATH set during ./configure time.
configure_time_lt_sys_library_path=$lt_configure_time_lt_sys_library_path

OR:
   kylin@kylin-os:~$ xeno-config --info
Xenomai version: Xenomai/cobalt v3.1-devel -- # ()
   Linux kylin-os 4.14.4.kylin.rt-1118-ipipe-trace+ #2 SMP PREEMPT 
Mon Nov 18 18:28:17 CST 2019 aarch64 aarch64 aarch64 GNU/Linux
   Kernel parameters: BOOT_IMAGE=/Image-tmp 
root=UUID=9fea0634-a9c9-4e9f-906c-9c36b7249822 console=ttyS1,115200 
earlyprintk=uart8250-32bit,0x28001000 rw rootdelay=10 KEYBOARDTYPE=pc 
KEYTABLE=us security=

   I-pipe release #2 detected
   Cobalt core 3.1-devel detected
   Compiler: gcc version 5.4.0 20160609 (Ubuntu/Linaro 
5.4.0-6kord1~16.04.10)
   Build args: --prefix=/usr --includedir=/usr/include/xenomai 
--mandir=/usr/share/man --with-testdir=/usr/lib/xenomai/testsuite 
--enable-smp --build aarch64-linux-gnu build_alias=aarch64-linux-gnu


Desktop: Kylin 4.0.2 (Ubuntu-like desktop)

Issue description:
  latency and cyclictest work fine on my system in most cases; the
worst-case latency is around 100 us to 200 us. However, when I run a program
to increase the CPU load, the worst-case latency reaches 2000 us to 5000 us
or even more. By default the program forks 6 processes; each process
allocates a four-page buffer and writes to it continuously, without pause.
There are no warning or error messages in dmesg. The program is below:


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGE_SIZE 4096
#define TEST_THREADS 6

unsigned int test_threads;

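void do_thread_test(void)
{
    void *mm;
    char i = 0;

    printf("mem test thread start \n");
    mm = malloc(PAGE_SIZE * 4); /* four pages = 16 KiB, not 1 MiB */
    if (mm == NULL)
        exit(1);
    /* Rewrite the buffer forever to keep the CPU and memory bus busy. */
    while (1) {
        for (i = 0; i < 100; i++)
            memset(mm, i, PAGE_SIZE * 4);
    }
}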

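void thread_test(void)
{
    pid_t pid[TEST_THREADS];
    unsigned int i;

    printf("begin start\n");
    /* Fork one busy-writing child per worker; pid[] holds TEST_THREADS entries. */
    for (i = 0; i < test_threads && i < TEST_THREADS; i++) {
        pid[i] = fork();
        if (pid[i] == 0)
            do_thread_test(); /* the child never returns */
    }
}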


int get_cpu_idle_info(int *idle, int *total)
{
    FILE *fp;
    unsigned int var[5][7]; /* unsigned, to match the %u conversions */
    char name[5][20];
    char *line = NULL;      /* getline() allocates the buffer */
    size_t len = 0;
    ssize_t read;
    int i = 0;

    if ((fp = fopen("/proc/stat", "r")) == NULL) {
        printf("open /proc/stat err !\n");
        return -1;
    }

    while ((read = getline(&line, &len, fp)) != -1) {

        //  fgets(buff, sizeof(buff), fp);
        sscanf(line, "%s %u %u %u %u %u %u %u",
               &name[i][0], &var[i][0], &var[i][1],
               &var[i][2], &var[i][3],
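
The posted program is cut off in the archive inside get_cpu_idle_info(). As
a rough illustration of what parsing a CPU line from /proc/stat can look
like, here is a minimal sketch. It is not the missing remainder of the
original function: parse_cpu_line is a hypothetical helper, it assumes the
first seven counters are user, nice, system, idle, iowait, irq and softirq
(the order documented in proc(5)), and it counts only the idle column as
idle time.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one "cpu..." line in /proc/stat format and derive idle and total
 * jiffies from its first seven counters:
 *   user nice system idle iowait irq softirq
 * Returns 0 on success, -1 if the line is not a complete CPU line.
 * Hypothetical helper for illustration only.
 */
int parse_cpu_line(const char *line, unsigned long long *idle,
                   unsigned long long *total)
{
    char name[20];
    unsigned long long v[7];
    int i;

    assert(line != NULL && idle != NULL && total != NULL);

    /* %19s bounds the label so it cannot overflow name[]. */
    if (sscanf(line, "%19s %llu %llu %llu %llu %llu %llu %llu",
               name, &v[0], &v[1], &v[2], &v[3], &v[4], &v[5], &v[6]) != 8)
        return -1;
    if (strncmp(name, "cpu", 3) != 0)
        return -1;

    *idle = v[3];
    *total = 0;
    for (i = 0; i < 7; i++)
        *total += v[i];
    return 0;
}
```

Sampling the aggregate "cpu" line twice and dividing the idle delta by the
total delta gives the idle fraction over that interval. Whether iowait
should also count as idle is a design choice this sketch leaves out.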

 

RE: Deadlock during debugging

2019-11-18 Thread Lange Norbert via Xenomai
One more,

Note that there seem to be quite different reports, from a recursive fault to 
some threads getting marked as "runaway".
I can reproduce the issue easily now, but it's proprietary software that I
can't pass around.

Norbert

[  226.354729] I-pipe: Detected stalled head domain, probably caused by a bug.
[  226.354729] A critical section may have been left unterminated.
[  226.370156] CPU: 1 PID: 0 Comm: swapper/2 Tainted: GW 
4.19.84-xenod8-static #1
[  226.370160] CPU: 2 PID: 732 Comm: fup.fast Tainted: GW 
4.19.84-xenod8-static #1
[  226.378775] Hardware name: TQ-Group TQMxE39M/Type2 - Board Product Name, 
BIOS 5.12.30.21.16 01/31/2019
[  226.387475] Hardware name: TQ-Group TQMxE39M/Type2 - Board Product Name, 
BIOS 5.12.30.21.16 01/31/2019
[  226.396782] I-pipe domain: Linux
[  226.406089] I-pipe domain: Linux
[  226.409320] RIP: 0010:do_idle+0xaf/0x140
[  226.412549] Call Trace:
[  226.416476] Code: 85 92 00 00 00 e8 51 f5 04 00 e8 bc 65 03 00 e8 77 36 7c 
00 f0 80 4d 02 20 9c 58 f6 c4 02 74 7e e8 66 2d 07 00 48 85 c0 74 6b <0f> 0b e8 
0a 42 07 00 e8 45 68 03 00 9c 58 f6 c4 02 0f 85 79 ff ff
[  226.418936]  dump_stack+0x8c/0xc0
[  226.437687] RSP: 0018:932cc00afef8 EFLAGS: 00010002
[  226.441009]  ipipe_root_only.cold+0x11/0x32
[  226.446240]  ipipe_stall_root+0xe/0x60
[  226.450424] RAX: 0001 RBX: 0002 RCX: 000b
[  226.454182]  __ipipe_trap_prologue+0x2ae/0x2f0
[  226.461319] RDX: a3fc RSI: 8f63f99c8208 RDI: 
[  226.465767]  ? __ipipe_complete_domain_migration+0x40/0x40
[  226.472899] RBP: 8f63f815a7c0 R08:  R09: 0002e248
[  226.478386]  invalid_op+0x26/0x51
[  226.485518] R10: 00015800 R11: 003480cf3801 R12: 8f63f815a7c0
[  226.488839] RIP: 0010:xnthread_suspend+0x3ef/0x540
[  226.495973] R13:  R14:  R15: 
[  226.500766] Code: 58 12 00 00 4c 89 e7 e8 ef ca ff ff 41 83 8c 24 c4 11 00 
00 01 e9 82 fd ff ff 0f 0b 48 83 bf 58 12 00 00 00 0f 84 49 fc ff ff <0f> 0b 0f 
0b 9c 58 f6 c4 02 0f 84 85 fd ff ff fa bf 00 00 00 80 e8
[  226.507900] FS:  () GS:8f63f980() 
knlGS:
[  226.52] RSP: 0018:932cc083bd60 EFLAGS: 00010082
[  226.534755] CS:  0010 DS:  ES:  CR0: 80050033
[  226.539986] CR2: 7ff8dca27000 CR3: 000174c54000 CR4: 003406e0
[  226.545738] RAX: 932cc0617e30 RBX: 00025090 RCX: 
[  226.552870] Call Trace:
[  226.560005] RDX:  RSI: 0002 RDI: 932cc0616240
[  226.562461]  cpu_startup_entry+0x6f/0x80
[  226.569590] RBP: 932cc0617e08 R08: 932cc0617e08 R09: 0005cc88
[  226.573520]  start_secondary+0x169/0x1b0
[  226.580655] R10:  R11:  R12: 932cc0616240
[  226.584585]  secondary_startup_64+0xa4/0xb0
[  226.591716] R13:  R14:  R15: 932cc0617e08
[  226.595905] ---[ end trace aa5dc96dbf303c58 ]---
[  226.603042]  xnsynch_sleep_on+0x117/0x2d0
[  226.611670]  __cobalt_cond_wait_prologue+0x29f/0x950
[  226.616647]  ? __cobalt_cond_wait_prologue+0x950/0x950
[  226.621798]  CoBaLt_cond_wait_prologue+0x23/0x30
[  226.626425]  handle_head_syscall+0xe1/0x370
[  226.630618]  ipipe_fastcall_hook+0x14/0x20
[  226.634724]  ipipe_handle_syscall+0x57/0xe0
[  226.638920]  do_syscall_64+0x4b/0x500
[  226.642598]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  226.647660] RIP: 0033:0x77f9c134
[  226.651244] Code: 8b 73 04 49 89 dc e8 fb ef ff ff 48 89 de 48 8b 5c 24 10 
45 31 c0 b9 23 00 00 10 48 8d 54 24 44 45 31 d2 48 89 df 89 c8 0f 05 <8b> 7c 24 
2c 31 f6 49 89 c5 89 c5 e8 cc ef ff ff 4c 89 ff e8 74 e9
[  226.670014] RSP: 002b:7fffe1a6bb10 EFLAGS: 0246 ORIG_RAX: 
1023
[  226.677599] RAX: ffda RBX: 74d91c78 RCX: 77f9c134
[  226.684744] RDX: 7fffe1a6bb54 RSI: 74d91c48 RDI: 74d91c78
[  226.691885] RBP: 7fffe1a6bc30 R08:  R09: 
[  226.699027] R10:  R11: 0246 R12: 74d91c48
[  226.706166] R13:  R14: 0001 R15: 7fffe1a6bb60
[  226.713325] I-pipe tracer log (100 points):
[  226.717520]  |*+func0 ipipe_trace_panic_freeze+0x0 
(ipipe_root_only+0xcf)
[  226.726114]  |*+func0 ipipe_root_only+0x0 
(ipipe_stall_root+0xe)
[  226.733926]  |*+func   -1 ipipe_stall_root+0x0 
(__ipipe_trap_prologue+0x2ae)
[  226.742431]  |# func   -2 ipipe_trap_hook+0x0 
(__ipipe_notify_trap+0x98)
[  226.750590]  |# func   -3 __ipipe_notify_trap+0x0 
(__ipipe_trap_prologue+0x7f)
[  226.759268]  |# func   -3 __ipipe_trap_prologue+0x0 
(invalid_op+0x26)
[  226.767167]  |# func   -5 xnthread_suspend+0x0 
(xnsynch_sleep_on+0x117)
[  

RE: Deadlock during debugging

2019-11-18 Thread Lange Norbert via Xenomai
New crash, same thing with ipipe panic trace (the decoded log does not add 
information to the relevant parts).

Is the dump_stack function itself trashing the stack?

[  168.411205] [Xenomai] watchdog triggered on CPU #1 -- runaway thread 'main' 
signaled
[  209.176742] ------------[ cut here ]------------
[  209.181381] xnthread_relax() failed for thread aboard_runner[790]
[  209.181389] BUG: Unhandled exception over domain Xenomai at 
0x7fed - switching to ROOT
[  209.196451] CPU: 0 PID: 790 Comm: aboard_runner Tainted: GW 
4.19.84-xenod8-static #1
[  209.205588] Hardware name: TQ-Group TQMxE39M/Type2 - Board Product Name, 
BIOS 5.12.30.21.16 01/31/2019
[  209.214900] I-pipe domain: Linux
[  209.218137] Call Trace:
[  209.220593]  dump_stack+0x8c/0xc0
[  209.223919]  __ipipe_trap_prologue.cold+0x1f/0x5e
[  209.228629]  invalid_op+0x26/0x51
[  209.231952] RIP: 0010:xnthread_relax+0x46d/0x4a0
[  209.236576] Code: f6 83 c2 11 00 00 01 75 0e 48 8b 03 48 85 c0 74 33 8b 90 
c0 04 00 00 48 8d b3 5c 14 00 00 48 c7 c7 90 00 8b 9a e8 02 02 ef ff <0f> 0b e9 
42 fd ff ff 89 c6 48 c7 c7 c4 f8 a3 9a e8 2e 71 f3 ff e9
[  209.255347] RSP: 0018:9a0e4074fd90 EFLAGS: 00010286
[  209.260586] RAX:  RBX: 9a0e4065aa40 RCX: 000b
[  209.267728] RDX: 5129 RSI: 902a794791f8 RDI: 007800c0
[  209.274869] RBP: 9a0e4074fe68 R08: 007800c0 R09: 0002e248
[  209.282013] R10: 9bb72040 R11: 9bb3209c R12: 9bbfdc80
[  209.289157] R13: 902a76da8000 R14: 0001 R15: 0292
[  209.296299]  ? xnthread_prepare_wait+0x20/0x20
[  209.300752]  ? trace+0x59/0x8d
[  209.303814]  ? __cobalt_clock_nanosleep+0x540/0x540
[  209.308700]  handle_head_syscall+0x307/0x370
[  209.312979]  ipipe_fastcall_hook+0x14/0x20
[  209.317083]  ipipe_handle_syscall+0x57/0xe0
[  209.321280]  do_syscall_64+0x4b/0x500
[  209.324950]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  209.330011] RIP: 0033:0x77f9bd68
[  209.333598] Code: 89 fb bf 01 00 00 00 48 83 ec 18 48 8d 74 24 0c e8 bd f3 
ff ff b9 19 00 00 10 48 63 f5 48 63 fb 4d 89 ea 4c 89 e2 89 c8 0f 05 <8b> 7c 24 
0c 48 89 c3 31 f6 e8 9a f3 ff ff 48 83 c4 18 89 d8 f7 d8
[  209.352370] RSP: 002b:7fffe7d0 EFLAGS: 0246 ORIG_RAX: 
1019
[  209.359954] RAX: fe00 RBX: 0001 RCX: 77f9bd68
[  209.367098] RDX: 7fffe820 RSI: 0001 RDI: 0001
[  209.374237] RBP: 0001 R08: 0001 R09: 0014
[  209.381381] R10: 7fffe820 R11: 0246 R12: 7fffe820
[  209.388524] R13: 7fffe820 R14:  R15: 
[  209.395665] I-pipe tracer log (100 points):
[  209.399857]  | #func0 ipipe_trace_panic_freeze+0x0 
(__ipipe_trap_prologue+0x237)
[  209.409056]  | +func0 ipipe_root_only+0x0 
(ipipe_stall_root+0xe)
[  209.416862]  | +func   -1 ipipe_stall_root+0x0 
(__ipipe_trap_prologue+0x2ae)
[  209.425365]  |+ func   -2 ipipe_trap_hook+0x0 
(__ipipe_notify_trap+0x98)
[  209.433523]  |+ func   -3 __ipipe_notify_trap+0x0 
(__ipipe_trap_prologue+0x7f)
[  209.442199]  |+ func   -4 __ipipe_trap_prologue+0x0 
(invalid_op+0x26)
[  209.450097]  |+ end 0x8001 -5 
__ipipe_spin_unlock_irqrestore+0x4f (<>)
[  209.458425]  |# func   -6 __ipipe_spin_unlock_irqrestore+0x0 
(__ipipe_log_printk+0x69)
[  209.467797]  |+ begin   0x8001-10 __ipipe_spin_lock_irqsave+0x5e 
(<>)
[  209.475693]   + func  -10 __ipipe_spin_lock_irqsave+0x0 
(__ipipe_log_printk+0x22)
[  209.484630]   + func  -10 __ipipe_log_printk+0x0 
(__warn_printk+0x6c)
[  209.492525]  |+ end 0x8001-11 do_vprintk+0xf6 (<>)
[  209.499120]  |+ begin   0x8001-11 do_vprintk+0x106 (<>)
[  209.505799]   + func  -12 do_vprintk+0x0 (__warn_printk+0x6c)
[  209.513000]   + func  -12 vprintk+0x0 (__warn_printk+0x6c)
[  209.519939]  |+ end 0x8001-12 ipipe_raise_irq+0x70 (<>)
[  209.526969]  |+ func  -13 __ipipe_set_irq_pending+0x0 
(__ipipe_dispatch_irq+0xad)
[  209.535905]  |+ func  -14 __ipipe_dispatch_irq+0x0 
(ipipe_raise_irq+0x7e)
[  209.544148]  |+ begin   0x8001-14 ipipe_raise_irq+0x64 (<>)
[  209.551178]   + func  -15 ipipe_raise_irq+0x0 
(__ipipe_log_printk+0x84)
[  209.559250]  |+ end 0x8001-15 
__ipipe_spin_unlock_irqrestore+0x4f (<>)
[  209.567581]  |# func  -15 __ipipe_spin_unlock_irqrestore+0x0 
(__ipipe_log_printk+0x69)
[  209.576951]  |+ begin   0x8001-17 __ipipe_spin_lock_irqsave+0x5e 
(<>)
[  209.584847]   + func  -18 __ipipe_spin_lock_irqsave+0x0 
(__ipipe_log_printk+0x22)
[  

Deadlock during debugging

2019-11-18 Thread Lange Norbert via Xenomai
Hello,

Here's one of my deadlocks. The output seems to be interleaved from two
concurrent dumps; I ran the crash log through decode_stacktrace.sh.

I got to this after setting a breakpoint in gdb (execution did stop there),
setting another breakpoint and hitting continue.

[  135.414273] CPU: 1 PID: 0 Comm: swapper/2 Tainted: GW 
4.19.84-xeno8-static #1
[  135.414275] I-pipe: Detected stalled head domain, probably caused by a bug.
[  135.414275] A critical section may have been left unterminated.
[  135.414287] CPU: 2 PID: 798 Comm: fup.fast Tainted: GW 
4.19.84-xeno8-static #1
[  135.422810] Hardware name: TQ-Group TQMxE39M/Type2 - Board Product Name, 
BIOS 5.12.30.21.16 01/31/2019
[  135.436373] Hardware name: TQ-Group TQMxE39M/Type2 - Board Product Name, 
BIOS 5.12.30.21.16 01/31/2019
[  135.444984] I-pipe domain: Linux
[  135.454290] I-pipe domain: Linux
[  135.463598] RIP: 0010:rcu_nmi_exit+0x140/0x150
[  135.466825] Call Trace:
[  135.470057] Code: 45 89 f0 4c 89 f9 4c 89 e2 4c 89 ee ff d0 48 8b 03 48 85 
c0 75 e2 48 8b 45 08 4c 8d 78 fe e9 5b ff ff ff 0f 0b e9 ee fe ff ff <0f> 0b e9 
f8
[  135.474513]  dump_stack+0x8c/0xc0
[  135.476950] RSP: 0018:a3513bb03f18 EFLAGS: 00010046
[  135.495720]  ipipe_stall_root+0xc/0x30
[  135.504264]  __ipipe_trap_prologue+0x209/0x210
[  135.508011] RAX: 000573f4 RBX: 00019480 RCX: 001f
[  135.512458]  invalid_op+0x26/0x51
[  135.519592] RDX:  RSI: 50523fbe RDI: 0001
[  135.522914] RIP: 0010:xnthread_suspend+0x3d5/0x4e0
[  135.530050] RBP: a3513ba99480 R08: 0001 R09: 
[  135.534843] Code: 58 12 00 00 4c 89 e7 e8 f9 cf ff ff 41 83 8c 24 c4 11 00 
00 01 e9 92 fd ff ff 0f 0b 48 83 bf 58 12 00 00 00 0f 84 63 fc ff ff <0f> 0b 0f 
0b
[  135.541979] R10: a35139832440 R11: 0424 R12: 
[  135.560746] RSP: 0018:bddd0073fd60 EFLAGS: 00010082
[  135.567878] R13: 0022 R14:  R15: 
[  135.580241] FS:  () GS:a3513ba8() 
knlGS:
[  135.580246] RAX: bddd005fbe30 RBX: 00025090 RCX: 
[  135.588336] CS:  0010 DS:  ES:  CR0: 80050033
[  135.595477] RDX:  RSI: 0002 RDI: bddd005fa240
[  135.601225] CR2: 7f8899c36a10 CR3: 00017b31c000 CR4: 003406e0
[  135.608362] RBP: bddd005fbe08 R08: bddd005fbe08 R09: 
[  135.615500] Call Trace:
[  135.622637] R10:  R11:  R12: bddd005fa240
[  135.625085] ---[ end trace adb8b44963759cc1 ]---
[  135.632220] R13:  R14:  R15: bddd005fbe08
[  135.636851] WARNING: CPU: 1 PID: 0 at kernel/rcu/tree.c:941 
rcu_nmi_enter+0xe4/0xf0
[  135.643982]  xnsynch_sleep_on+0x102/0x260
[  135.651634] Modules linked in:
[  135.655649]  __cobalt_cond_wait_prologue+0x295/0x8c0
[  135.655653]  rt_igb
[  135.658713]  ? __cobalt_cond_wait_prologue+0x8c0/0x8c0
[  135.663677]  plusb
[  135.665781]  CoBaLt_cond_wait_prologue+0x23/0x30
[  135.670918]  usbnet
[  135.672936]  handle_head_syscall+0xe1/0x370
[  135.677555]  mii
[  135.679658]  ipipe_fastcall_hook+0x14/0x20
[  135.685687]  ipipe_handle_syscall+0x4a/0xa0
[  135.689784] CPU: 1 PID: 0 Comm: swapper/2 Tainted: GW 
4.19.84-xeno8-static #1
[  135.693971]  do_syscall_64+0x41/0x3d0
[  135.702495] Hardware name: TQ-Group TQMxE39M/Type2 - Board Product Name, 
BIOS 5.12.30.21.16 01/31/2019
[  135.706160]  ? __ipipe_handle_irq+0xb7/0x200
[  135.715464] I-pipe domain: Linux
[  135.719738]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  135.722972] RIP: 0010:rcu_nmi_enter+0xe4/0xf0
[  135.728025] RIP: 0033:0x77f9c134
[  135.732386] Code: 48 85 c0 75 d9 48 8b 6b 08 eb 9a e8 b6 a9 ff ff 48 8b 6b 
08 41 bd 01 00 00 00 4c 8b 35 5d cb 23 01 4c 8d 7d 01 e9 72 ff ff ff <0f> 0b e9 
44
[  135.735966] Code: 8b 73 04 49 89 dc e8 fb ef ff ff 48 89 de 48 8b 5c 24 10 
45 31 c0 b9 23 00 00 10 48 8d 54 24 44 45 31 d2 48 89 df 89 c8 0f 05 <8b> 7c 24 
29
[  135.754730] RSP: 0018:a3513bb03f38 EFLAGS: 00010082
[  135.773496] RSP: 002b:7fffe82dab10 EFLAGS: 0246
[  135.778728]  ORIG_RAX: 1023
[  135.783954] RAX: 00019480 RBX: a3513ba99480 RCX: a3513ba9c008
[  135.787792] RAX: ffda RBX: 74127c78 RCX: 77f9c134
[  135.794927] RDX: a3513ba9c000 RSI: 0001 RDI: 1140
[  135.802065] RDX: 7fffe82dab54 RSI: 74127c48 RDI: 74127c78
[  135.809201] RBP: fffe R08: a3513ba9c228 R09: 0045
[  135.816337] RBP: 7fffe82dac30 R08:  R09: 
[  135.823470] R10: a35139832440 R11: 0424 R12: 9e7af080
[  135.830604] R10:  R11: 0246 R12: 74127c48
[  135.837738] R13: 00045000 R14: 

Re: [PATCH 3/3] boilerplate/avl: fix NULL link representation in pshared mode - take #2

2019-11-18 Thread Jan Kiszka via Xenomai

On 18.11.19 09:20, Philippe Gerum wrote:

On 11/18/19 9:12 AM, Jan Kiszka wrote:

On 17.11.19 12:41, Philippe Gerum via Xenomai wrote:

Since zero is the offset pointing at the AVL tree anchor, it cannot be
used for representing a NULL link. Use (ptrdiff_t)-1 instead.

Signed-off-by: Philippe Gerum 
---
 include/boilerplate/avl-inner.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/boilerplate/avl-inner.h b/include/boilerplate/avl-inner.h
index 8e4de8487..9c0576213 100644
--- a/include/boilerplate/avl-inner.h
+++ b/include/boilerplate/avl-inner.h
@@ -105,14 +105,14 @@ shavlh_link(const struct shavl *const avl,
         const struct shavlh *const holder, unsigned int dir)
 {
     ptrdiff_t offset = holder->link[avl_type2index(dir)].offset;
-    return offset ? (void *)avl + offset : NULL;
+    return offset == (ptrdiff_t)-1 ? NULL : (void *)avl + offset;
 }
 
 static inline void
 shavlh_set_link(struct shavl *const avl, struct shavlh *lhs,
         int dir, struct shavlh *rhs)
 {
-    ptrdiff_t offset = rhs ? (void *)rhs - (void *)avl : 0;
+    ptrdiff_t offset = rhs ? (void *)rhs - (void *)avl : (ptrdiff_t)-1;
     lhs->link[avl_type2index(dir)].offset = offset;
 }
 
@@ -120,13 +120,13 @@ static inline
 struct shavlh *shavl_end(const struct shavl *const avl, int dir)
 {
     ptrdiff_t offset = avl->end[avl_type2index(dir)].offset;
-    return offset ? (void *)avl + offset : NULL;
+    return offset == (ptrdiff_t)-1 ? NULL : (void *)avl + offset;
 }
 
 static inline void
 shavl_set_end(struct shavl *const avl, int dir, struct shavlh *holder)
 {
-    ptrdiff_t offset = holder ? (void *)holder - (void *)avl : 0;
+    ptrdiff_t offset = holder ? (void *)holder - (void *)avl : (ptrdiff_t)-1;
     avl->end[avl_type2index(dir)].offset = offset;
 }


Thanks, all applied to next.

But, again, please always add a proper "From:" line to the commit so that I do not need to 
manually edit all of them to avoid the infamous "Philippe Gerum via Xenomai 
" entries. TIA.



This is the output of git send-email. Will check.




The trick I use for scripted submission is to run git format-patch with
--from set to a dummy value. That forces git format-patch to add the
in-body "From:" line. When sending, I use the correct email address, of
course.


Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
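
The reasoning behind the patch quoted above can be shown in isolation. When
links are stored as byte offsets from a base address so that they stay valid
across processes mapping the region at different addresses, offset 0 names a
real object (whatever sits at the base), so it cannot double as "no link".
The sketch below is a simplified stand-in, not the Xenomai AVL code: struct
node, NIL_OFFSET, link_get, link_make and demo_chain are hypothetical names,
and it uses char * arithmetic where the patch relies on the GNU extension
for void * arithmetic.

```c
#include <assert.h>
#include <stddef.h>

#define NIL_OFFSET ((ptrdiff_t)-1) /* sentinel meaning "no link" */

/* Hypothetical miniature of an offset-linked object in shared memory. */
struct node {
    ptrdiff_t next; /* byte offset from the region base, or NIL_OFFSET */
    int value;
};

/* Turn a stored offset back into a pointer valid in this process. */
static struct node *link_get(void *base, ptrdiff_t offset)
{
    return offset == NIL_OFFSET ? NULL
                                : (struct node *)((char *)base + offset);
}

/* Encode a pointer as an offset relative to the region base. */
static ptrdiff_t link_make(void *base, struct node *target)
{
    return target ? (char *)target - (char *)base : NIL_OFFSET;
}

/* Build a two-node cycle; node 0 sits at the base, i.e. at offset 0. */
static void demo_chain(struct node region[2])
{
    region[0].value = 1;
    region[0].next = link_make(region, &region[1]);
    region[1].value = 2;
    region[1].next = link_make(region, &region[0]); /* offset 0, still valid */

    /* With 0 as the sentinel, this lookup would wrongly yield NULL. */
    assert(link_get(region, region[1].next) == &region[0]);
}
```

The value (ptrdiff_t)-1 works as a sentinel here because no object can start
one byte before the base of the region.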



Re: [PATCH 3/3] boilerplate/avl: fix NULL link representation in pshared mode - take #2

2019-11-18 Thread Philippe Gerum via Xenomai
On 11/18/19 9:12 AM, Jan Kiszka wrote:
> On 17.11.19 12:41, Philippe Gerum via Xenomai wrote:
>> Since zero is the offset pointing at the AVL tree anchor, it cannot be
>> used for representing a NULL link. Use (ptrdiff_t)-1 instead.
>>
>> Signed-off-by: Philippe Gerum 
>> ---
>>   include/boilerplate/avl-inner.h | 8 ++++----
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/boilerplate/avl-inner.h b/include/boilerplate/avl-inner.h
>> index 8e4de8487..9c0576213 100644
>> --- a/include/boilerplate/avl-inner.h
>> +++ b/include/boilerplate/avl-inner.h
>> @@ -105,14 +105,14 @@ shavlh_link(const struct shavl *const avl,
>>   const struct shavlh *const holder, unsigned int dir)
>>   {
>>   ptrdiff_t offset = holder->link[avl_type2index(dir)].offset;
>> -    return offset ? (void *)avl + offset : NULL;
>> +    return offset == (ptrdiff_t)-1 ? NULL : (void *)avl + offset;
>>   }
>>     static inline void
>>   shavlh_set_link(struct shavl *const avl, struct shavlh *lhs,
>>   int dir, struct shavlh *rhs)
>>   {
>> -    ptrdiff_t offset = rhs ? (void *)rhs - (void *)avl : 0;
>> +    ptrdiff_t offset = rhs ? (void *)rhs - (void *)avl : (ptrdiff_t)-1;
>>   lhs->link[avl_type2index(dir)].offset = offset;
>>   }
>>   @@ -120,13 +120,13 @@ static inline
>>   struct shavlh *shavl_end(const struct shavl *const avl, int dir)
>>   {
>>   ptrdiff_t offset = avl->end[avl_type2index(dir)].offset;
>> -    return offset ? (void *)avl + offset : NULL;
>> +    return offset == (ptrdiff_t)-1 ? NULL : (void *)avl + offset;
>>   }
>>     static inline void
>>   shavl_set_end(struct shavl *const avl, int dir, struct shavlh *holder)
>>   {
>> -    ptrdiff_t offset = holder ? (void *)holder - (void *)avl : 0;
>> +    ptrdiff_t offset = holder ? (void *)holder - (void *)avl : (ptrdiff_t)-1;
>>   avl->end[avl_type2index(dir)].offset = offset;
>>   }
>>  
> 
> Thanks, all applied to next.
> 
> But, again, please always add a proper "From:" line to the commit so that I 
> do not need to manually edit all of them to avoid the infamous "Philippe 
> Gerum via Xenomai " entries. TIA.
> 

This is the output of git send-email. Will check.


-- 
Philippe.



Re: [PATCH 3/3] boilerplate/avl: fix NULL link representation in pshared mode - take #2

2019-11-18 Thread Jan Kiszka via Xenomai

On 17.11.19 12:41, Philippe Gerum via Xenomai wrote:

Since zero is the offset pointing at the AVL tree anchor, it cannot be
used for representing a NULL link. Use (ptrdiff_t)-1 instead.

Signed-off-by: Philippe Gerum 
---
  include/boilerplate/avl-inner.h | 8 ++++----
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/boilerplate/avl-inner.h b/include/boilerplate/avl-inner.h
index 8e4de8487..9c0576213 100644
--- a/include/boilerplate/avl-inner.h
+++ b/include/boilerplate/avl-inner.h
@@ -105,14 +105,14 @@ shavlh_link(const struct shavl *const avl,
              const struct shavlh *const holder, unsigned int dir)
  {
    ptrdiff_t offset = holder->link[avl_type2index(dir)].offset;
-   return offset ? (void *)avl + offset : NULL;
+   return offset == (ptrdiff_t)-1 ? NULL : (void *)avl + offset;
  }
  
  static inline void
  shavlh_set_link(struct shavl *const avl, struct shavlh *lhs,
                  int dir, struct shavlh *rhs)
  {
-   ptrdiff_t offset = rhs ? (void *)rhs - (void *)avl : 0;
+   ptrdiff_t offset = rhs ? (void *)rhs - (void *)avl : (ptrdiff_t)-1;
    lhs->link[avl_type2index(dir)].offset = offset;
  }
  
@@ -120,13 +120,13 @@ static inline
  struct shavlh *shavl_end(const struct shavl *const avl, int dir)
  {
    ptrdiff_t offset = avl->end[avl_type2index(dir)].offset;
-   return offset ? (void *)avl + offset : NULL;
+   return offset == (ptrdiff_t)-1 ? NULL : (void *)avl + offset;
  }
  
  static inline void
  shavl_set_end(struct shavl *const avl, int dir, struct shavlh *holder)
  {
-   ptrdiff_t offset = holder ? (void *)holder - (void *)avl : 0;
+   ptrdiff_t offset = holder ? (void *)holder - (void *)avl : (ptrdiff_t)-1;
    avl->end[avl_type2index(dir)].offset = offset;
  }
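
The reasoning in the commit message can be reproduced outside of the AVL code. What follows is only a minimal sketch, not the Xenomai implementation: every demo_* name and DEMO_NIL is made up for illustration, and it uses char * arithmetic where the patch relies on the GNU void * extension. It shows that a link to the object located at the anchor address encodes as offset 0, which the old decoder mistakes for NULL, while (ptrdiff_t)-1 stays unambiguous as long as no node can ever sit one byte before the anchor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel for "no link", mirroring the (ptrdiff_t)-1 of the patch. */
#define DEMO_NIL ((ptrdiff_t)-1)

/* Hypothetical node type: links are stored as offsets from the anchor. */
struct demo_node {
	ptrdiff_t next;		/* offset from the anchor, or DEMO_NIL */
	int value;
};

/* Hypothetical anchor: 'head' is the first member, so it sits at offset 0. */
struct demo_anchor {
	struct demo_node head;
	struct demo_node other;
};

/* Encode a pointer as an offset from base; NULL becomes DEMO_NIL. */
static inline ptrdiff_t demo_to_offset(const void *base, const void *ptr)
{
	return ptr ? (const char *)ptr - (const char *)base : DEMO_NIL;
}

/* Decode with the new convention: only DEMO_NIL means NULL. */
static inline void *demo_from_offset(void *base, ptrdiff_t offset)
{
	return offset == DEMO_NIL ? NULL : (void *)((char *)base + offset);
}

/* Decode with the old convention (0 means NULL), kept to show the ambiguity. */
static inline void *demo_from_offset_zero_nil(void *base, ptrdiff_t offset)
{
	return offset ? (void *)((char *)base + offset) : NULL;
}

/* Returns 0 when all checks pass; aborts through assert() otherwise. */
int demo_selfcheck(void)
{
	struct demo_anchor a;

	a.head.next = DEMO_NIL;
	a.head.value = 1;
	a.other.value = 2;

	/* Link 'other' to the node living at the anchor address. */
	a.other.next = demo_to_offset(&a, &a.head);
	assert(a.other.next == 0);

	/* New decoder resolves the link, old decoder loses it. */
	assert(demo_from_offset(&a, a.other.next) == (void *)&a.head);
	assert(demo_from_offset_zero_nil(&a, a.other.next) == NULL);

	/* A genuinely empty link still decodes to NULL. */
	assert(demo_from_offset(&a, a.head.next) == NULL);
	return 0;
}
```

Calling demo_selfcheck() from a test program exercises both decoders on the same offset. The sentinel costs nothing in storage since it reuses the existing ptrdiff_t field.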
  



Thanks, all applied to next.

But, again, please always add a proper "From:" line to the commit so 
that I do not need to manually edit all of them to avoid the infamous 
"Philippe Gerum via Xenomai " entries. TIA.


Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux