Re: [Xenomai-help] Non-deterministic behaviour in primary mode

2006-04-12 Thread Jan Kiszka
Saul wrote:
 Gilles Chanteperdrix wrote:
   I thought SMI was supposed to increase latencies, but I only
   see latencies  60us at any load, unless I start video
   capture. Does the latency test absolutely rule out SMI as
   the cause?
 
  Not really, e.g. if the video capture device is USB-attached,
  then SMI could possibly remain in the picture.
   
I have used the bttv driver with xawtv or mythtv and nvidia
proprietary drivers on a dual pIII machine running latency and
the maximum latency was far less than 100 us. What produces large
spots, however, is switching virtual terminals or changing
resolution. Doesn't xawtv change resolution for full screen mode
? Did not you use the 768x576 modeline hack ?
  
  
   The video capture is not USB, it's a PCI card. Since I don't see
   poor latencies I'm guessing my problem is *not* SMI related.

 The latencies above 100us that you observe when starting video capture
 are abnormal. I was suggesting that these latencies are not due to
 video capture but to the fact that xawtv change resolution. Big
 latencies when switching virtual terminal are a well-known issue, not
 related to SMI. So my question was: is xawtv configured on your
 machine to change resolution when starting capture ?
 
 Oh, well xawtv starts up in a window and does not change the screen
 resolution. The large latencies occur during program startup. I
 have seen no other program produce poor latencies in the latency
 test program.
 

There is a way to narrow down your issue: apply the ipipe latency tracer
patch on top of your current Xenomai kernel, switch tracing on in the
config (ipipe feature), and recompile the kernel. Then you should run
the latency test with the -f parameter.

After hitting a high latency, grab the content of
/proc/ipipe/trace/frozen. This back-trace is taken always for the
highest latency the test tool measured.

You may want to play with /proc/ipipe/trace/back_trace_points (even
belatedly), or .../verbose to see more details. Look for high delays
before or after the timer interrupt - or post a (packed) version of the
trace to the list.

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-help mailing list
Xenomai-help@gna.org
https://mail.gna.org/listinfo/xenomai-help


Re: [Xenomai-help] Non-deterministic behaviour in primary mode

2006-04-11 Thread Gilles Chanteperdrix
Saul wrote:
  I thought SMI was supposed to increase latencies, but I only see
  latencies  60us at any load, unless I start video capture. Does
  the latency test absolutely rule out SMI as the cause?

 Not really, e.g. if the video capture device is USB-attached, then
 SMI could possibly remain in the picture.
  
   I have used the bttv driver with xawtv or mythtv and nvidia
   proprietary drivers on a dual pIII machine running latency and the
   maximum latency was far less than 100 us. What produces large spots,
   however, is switching virtual terminals or changing resolution.
   Doesn't xawtv change resolution for full screen mode ? Did not you use
   the 768x576 modeline hack ?
  
  
  The video capture is not USB, it's a PCI card. Since I don't see poor
  latencies I'm guessing my problem is *not* SMI related.

The latencies above 100us that you observe when starting video capture
are abnormal. I was suggesting that these latencies are not due to video
capture but to the fact that xawtv change resolution. Big latencies
when switching virtual terminal are a well-known issue, not related to
SMI. So my question was: is xawtv configured on your machine to change
resolution when starting capture ?

-- 


Gilles Chanteperdrix.

___
Xenomai-help mailing list
Xenomai-help@gna.org
https://mail.gna.org/listinfo/xenomai-help


Re: [Xenomai-help] Non-deterministic behaviour in primary mode

2006-04-11 Thread Saul
Gilles Chanteperdrix wrote:
   I thought SMI was supposed to increase latencies, but I only
   see latencies  60us at any load, unless I start video
   capture. Does the latency test absolutely rule out SMI as
   the cause?
 
  Not really, e.g. if the video capture device is USB-attached,
  then SMI could possibly remain in the picture.
   
I have used the bttv driver with xawtv or mythtv and nvidia
proprietary drivers on a dual pIII machine running latency and
the maximum latency was far less than 100 us. What produces large
spots, however, is switching virtual terminals or changing
resolution. Doesn't xawtv change resolution for full screen mode
? Did not you use the 768x576 modeline hack ?
  
  
   The video capture is not USB, it's a PCI card. Since I don't see
   poor latencies I'm guessing my problem is *not* SMI related.

 The latencies above 100us that you observe when starting video capture
 are abnormal. I was suggesting that these latencies are not due to
 video capture but to the fact that xawtv change resolution. Big
 latencies when switching virtual terminal are a well-known issue, not
 related to SMI. So my question was: is xawtv configured on your
 machine to change resolution when starting capture ?

Oh, well xawtv starts up in a window and does not change the screen
resolution. The large latencies occur during program startup. I
have seen no other program produce poor latencies in the latency
test program.

Saul.

___
Xenomai-help mailing list
Xenomai-help@gna.org
https://mail.gna.org/listinfo/xenomai-help


Re: [Xenomai-help] Non-deterministic behaviour in primary mode

2006-04-10 Thread Saul
I thought SMI was supposed to increase latencies, but I only see
latencies  60us at any load, unless I start video capture. Does
the latency test absolutely rule out SMI as the cause?
  
   Not really, e.g. if the video capture device is USB-attached, then
   SMI could possibly remain in the picture.

 I have used the bttv driver with xawtv or mythtv and nvidia
 proprietary drivers on a dual pIII machine running latency and the
 maximum latency was far less than 100 us. What produces large spots,
 however, is switching virtual terminals or changing resolution.
 Doesn't xawtv change resolution for full screen mode ? Did not you use
 the 768x576 modeline hack ?


The video capture is not USB, it's a PCI card. Since I don't see poor
latencies I'm guessing my problem is *not* SMI related.


 Is there any way to
check whether SMI is actually disabled?
   
  
   Gilles will likely have more clue here, but I guess that reading
   back the  state of the GBL_SMI_EN_BIT from the SMI's MMIO space
   would probably tell us if a previous request to disable SMI
   globally did succeed or not. The code in question is in
   ksrc/arch/i386/smi.c.

 That is a good idea. Another idea would also be to read the state of
 the lock bit (named SMI_LOCK).


At boot GEN_PMCON_1 = 0x0220. Bit 4 is SMI_LOCK, which is zero, so I
should be allowed to disable SMI. This was read using the following
code, which I hope is correct:

u_int8_t hi, lo;
int gen_pmcon_1;
pci_read_config_byte(smi_dev, 0xA0, lo);
pci_read_config_byte(smi_dev, 0xA1, hi);
gen_pmcon_1 = ((int)hi8) | lo;

Further, reading SMI_CTRL_ADDR at boot:
 - before clearing GBL_SMI_EN_BIT - 0x203B
 - after clearing GBL_SMI_EN_BIT - 0x203A

GBL_SMI_EN_BIT is bit zero, which is reported as cleared i.e. SMI
is disabled.

If there are no further ideas on how to fix this, I would be interested
in hearing from people about the PC's that you use that are know to work
well with Xenomai. I'd be looking for a fast new desktop PC. Is AMD a
better bet? I deliberately avoided AMD when purchasing this new Intel PC
(the I/O controller is ICH6) since new AMD PC's have always given me
trouble with I/O controller drivers lagging behind hardware changes. Any
suggestions much appreciated.

Saul.

___
Xenomai-help mailing list
Xenomai-help@gna.org
https://mail.gna.org/listinfo/xenomai-help


Re: [Xenomai-help] Non-deterministic behaviour in primary mode

2006-03-31 Thread Philippe Gerum

Saul wrote:

Philippe Gerum wrote:

Mm, this really looks like some firmware/BIOS cyclic activity, that 
would hit the busy loop more or less frequently, depending on the 
internal timing of the outer loop. This reminds me that some recent 
Intel chipset are known to prevent global SMI disabling actually, maybe 
this is the case here, so our work-around would be basically useless. 
Another usual suspect is NMI handling (which the Adeos layer does not 
pipeline but rather delivers immediately), but I guess that you did not 
activate the NMI watchdog anyway, so back to square #1, I guess.



No, I never enabled the NMI watchdog.

Yeah, that doesn't sound good for me. I just tried, for the first time,
allowing SMI to remain enabled. Behaves exactly the same :(  A failure
to disable SMI would also explain why none of my peripherals were ever
affected.

I thought SMI was supposed to increase latencies, but I only see
latencies  60us at any load, unless I start video capture. Does the
latency test absolutely rule out SMI as the cause?


Not really, e.g. if the video capture device is USB-attached, then SMI 
could possibly remain in the picture.


 Is there any way to

check whether SMI is actually disabled?



Gilles will likely have more clue here, but I guess that reading back 
the  state of the GBL_SMI_EN_BIT from the SMI's MMIO space would 
probably tell us if a previous request to disable SMI globally did 
succeed or not. The code in question is in ksrc/arch/i386/smi.c.


Another option to cross-check that we might have an SMI issue would be 
to find a box with an ICH4 chipset which is known to be reasonably 
controllable by our SMI work-around, and try to do video capture using 
it after Xenomai has globally disabled SMI sources.



Thanks,
Saul.

___
Xenomai-help mailing list
Xenomai-help@gna.org
https://mail.gna.org/listinfo/xenomai-help




--

Philippe.

___
Xenomai-help mailing list
Xenomai-help@gna.org
https://mail.gna.org/listinfo/xenomai-help


Re: [Xenomai-help] Non-deterministic behaviour in primary mode

2006-03-31 Thread Gilles Chanteperdrix
Philippe Gerum wrote:
  Saul wrote:
   Philippe Gerum wrote:
   
  Mm, this really looks like some firmware/BIOS cyclic activity, that 
  would hit the busy loop more or less frequently, depending on the 
  internal timing of the outer loop. This reminds me that some recent 
  Intel chipset are known to prevent global SMI disabling actually, maybe 
  this is the case here, so our work-around would be basically useless. 
  Another usual suspect is NMI handling (which the Adeos layer does not 
  pipeline but rather delivers immediately), but I guess that you did not 
  activate the NMI watchdog anyway, so back to square #1, I guess.
   
   
   No, I never enabled the NMI watchdog.
   
   Yeah, that doesn't sound good for me. I just tried, for the first time,
   allowing SMI to remain enabled. Behaves exactly the same :(  A failure
   to disable SMI would also explain why none of my peripherals were ever
   affected.
   
   I thought SMI was supposed to increase latencies, but I only see
   latencies  60us at any load, unless I start video capture. Does the
   latency test absolutely rule out SMI as the cause?
  
  Not really, e.g. if the video capture device is USB-attached, then SMI 
  could possibly remain in the picture.

I have used the bttv driver with xawtv or mythtv and nvidia proprietary
drivers on a dual pIII machine running latency and the maximum latency
was far less than 100 us. What produces large spots, however, is
switching virtual terminals or changing resolution. Doesn't xawtv change
resolution for full screen mode ? Did not you use the 768x576 modeline
hack ?

  
Is there any way to
   check whether SMI is actually disabled?
   
  
  Gilles will likely have more clue here, but I guess that reading back 
  the  state of the GBL_SMI_EN_BIT from the SMI's MMIO space would 
  probably tell us if a previous request to disable SMI globally did 
  succeed or not. The code in question is in ksrc/arch/i386/smi.c.

That is a good idea. Another idea would also be to read the state of the
lock bit (named SMI_LOCK).

-- 


Gilles Chanteperdrix.

___
Xenomai-help mailing list
Xenomai-help@gna.org
https://mail.gna.org/listinfo/xenomai-help


Re: [Xenomai-help] Non-deterministic behaviour in primary mode

2006-03-30 Thread Gilles Chanteperdrix
Saul wrote:
  I have no idea what to try next but thanks for the advice. I think some
  progress was made :)

If it is enabled, could you try disabling cpufreq ?

-- 


Gilles Chanteperdrix.

___
Xenomai-help mailing list
Xenomai-help@gna.org
https://mail.gna.org/listinfo/xenomai-help


Re: [Xenomai-help] Non-deterministic behaviour in primary mode

2006-03-30 Thread Philippe Gerum

Saul wrote:

Philippe Gerum wrote:

It would be nice to known whether the latency test from the testsuite 
has correct latency figures, so that we could rule out any SMI issue.



See my reply to Jan Kiszka. The max is 55us, but I've just noticed that
starting xawtv using the bttv driver causes an overrun in the latency
test, but that may be unrelated.



Anyway, the erratic execution times come and go with the smallest
changes in the program e.g. moving the delay into it's own function or
creating functions for timing the delay loop, so I wouldn't be too
surprised if the problem can't be reproduced.



What /proc/xenomai/stat says during execution btw?



Here's /proc/xenomai/stat sampled every four seconds:

CPU  PIDMSWCSWPFSTAT  NAME
  0  0  0/042572094   0 00500080  ROOT

CPU  PIDMSWCSWPFSTAT  NAME
  0  0  0/042572110   0 00500080  ROOT
  0  28088  1/08  0 00380084  timed-task
  0  28089  8/715 0 00300081  display-task

CPU  PIDMSWCSWPFSTAT  NAME
  0  0  0/042572164   0 00500080  ROOT
  0  28088  1/035 0 00380084  timed-task
  0  28089  35/34  69 0 00300081  display-task

CPU  PIDMSWCSWPFSTAT  NAME
  0  0  0/042572216   0 00500080  ROOT
  0  28088  1/061 0 00380084  timed-task
  0  28089  61/60  1210 00300081  display-task

CPU  PIDMSWCSWPFSTAT  NAME
  0  0  0/042572272   0 00500080  ROOT
  0  28088  1/089 0 00380084  timed-task
  0  28089  89/88  1770 00300081  display-task

CPU  PIDMSWCSWPFSTAT  NAME
  0  0  0/042572296   0 00500080  ROOT



Also, I found that using rt_timer_spin() instead of a real busy loop
does not have the same problem, which indicates that the simple busy
loop is in fact being interrupted during execution.



Just to rule this out, maybe you could try something along these lines:

iopl(3)
...
asm(cli);
 RTIME prev_tsc = rt_timer_tsc();
 my_processing();
 RTIME tsc = rt_timer_tsc();
asm(sti);



Case 1: the original data was the case of the busy loop in a function 
interrupts enabled.
  1:  40.017 ms
  2:  40.017 ms
  3:  53.357 ms
  4:  53.356 ms
  5:  40.017 ms
  6:  40.020 ms
  7:  53.355 ms
  8:  40.020 ms
  9:  40.017 ms
10:  53.356 ms

Case 2: as suggested, this is the case of having the busy loop in a
function  interrupts disabled.
  1:  53.316 ms
  2:  53.316 ms
  3:  53.316 ms
  4:  53.316 ms
  5:  53.316 ms
  6:  106.632 ms
  7:  53.316 ms
  8:  106.632 ms
  9:  35.544 ms
 10:  39.987 ms

It appears to have made the situation worse?! Note:
35.544 = 39.987 * 8/9
53.316 = 39.987 * 12/9
106.632 = 39.987 * 24/9
OK, sending me crazy enough without delving to numerological lows.
Next I tried two more cases...

Case 3: busy loop inline  interrupts enabled.
  1:  35.571 ms
  2:  40.019 ms
  3:  40.020 ms
  4:  40.017 ms
  5:  40.017 ms
  6:  40.019 ms
  7:  40.017 ms
  8:  40.020 ms
  9:  40.017 ms
 10:  40.019 ms
...
 55:  53.357 ms

Case 4: busy loop inline  interrupts disabled.
  1:  39.987 ms
  2:  39.987 ms
  3:  39.987 ms
  4:  39.987 ms
  5:  39.987 ms
  6:  39.987 ms
  7:  39.987 ms
  8:  39.987 ms
  9:  39.987 ms
 10:  39.987 ms

Oh, that last one looks good :) But even it has produced a 106.632ms
measure.


Out of curiosity, I'd also be interested in looking at the assembly dump 
of my_process in the working and non-working cases.



Well, from the above tests the problem appears to always be there, just
to varying degrees - I no longer believe there is a working case on my
PC. Disabling interrupts removes some noise from the signal, but the
real problem persists. The large increase in computation time (many ms)
appears to be quantized, made more visible by disabling interrupts. I
guess placing the delay in a function is simply pushing the execution
time into the next quantization level. The assembly all looks about the
same between cases:

--
Case 1: busy loop in a function  interrupts enabled.
my_processing:
pushl   %ebp
movl%esp, %ebp
movl$8000, %eax
.p2align 4,,15
.L2:
decl%eax
jne .L2
popl%ebp
ret
...snip...
callrt_timer_tsc
movl%eax, %ebx
movl%edx, %esi
callmy_processing
callrt_timer_tsc

--
Case 2: busy loop in a function  interrupts disabled.
my_processing:
pushl   %ebp
movl%esp, %ebp
movl$8000, %eax
.p2align 4,,15
.L2:
decl%eax
jne .L2
popl%ebp
ret
...snip...

Re: [Xenomai-help] Non-deterministic behaviour in primary mode

2006-03-30 Thread Saul
Gilles Chanteperdrix wrote:
   I have no idea what to try next but thanks for the advice. I think some
   progress was made :)
 
 If it is enabled, could you try disabling cpufreq ?

Previously at boot there was a message saying my CPU doesn't support cpufreq. 
But just to be sure, the tests that I last posted were all performed with 
frequency scaling and all ACPI and power management disabled in the kernel.

Saul.

___
Xenomai-help mailing list
Xenomai-help@gna.org
https://mail.gna.org/listinfo/xenomai-help


Re: [Xenomai-help] Non-deterministic behaviour in primary mode

2006-03-29 Thread Jan Kiszka
Saul wrote:
 Hi,
 
 I'm trying to run a short test program, just to get into the swing
 of this shiny new real-time system. I'm running Xenomai 2.1.0 on
 kernel 2.6.15.6
 
 The following program runs a busy delay in primary mode (verified by
 enabling the primary mode switch warning) and calculates the time taken.
 Another task is resumed and prints the time.
 
 I get output something like this:
   1:  40.017 ms
   2:  40.017 ms
   3:  53.357 ms
   4:  53.356 ms
   5:  40.017 ms
   6:  40.020 ms
   7:  53.355 ms
   8:  40.020 ms
   9:  40.017 ms
  10:  53.356 ms
  11:  40.020 ms
  12:  40.019 ms
  13:  106.710 ms
  14:  40.017 ms
  15:  40.018 ms ...
 
 I assume I've done something silly but I can't find anything wrong. My
 computer is a 3GHz P4. Xenomai is built-in statically, and at boot says
 Xenomai: SMI-enabled Intel chipset found, enabling SMI workaround.
 
 Anyway, the erratic execution times come and go with the smallest
 changes in the program e.g. moving the delay into it's own function or
 creating functions for timing the delay loop, so I wouldn't be too
 surprised if the problem can't be reproduced.
 
 Also, I found that using rt_timer_spin() instead of a real busy loop
 does not have the same problem, which indicates that the simple busy
 loop is in fact being interrupted during execution.
 
 Is it a problem that I'm not explicitly destroying the tasks? They don't
 linger in /proc/xenomai/sched once the program is killed. How much
 is automatically cleaned up upon process exit?

Tasks are cleaned up, other stuff not.

 
 #include stdio.h
 #include stdlib.h // exit
 #include unistd.h // pause
 #include sys/mman.h // mlockall
 #include signal.h
 #include xenomai/native/task.h
 #include xenomai/native/timer.h
 
 #define MS 100
 
 RT_TASK timed_task;
 RT_TASK display_task;
 int timed_count = 0;
 double timed_delay = 0;
 
 void my_processing(void)
 {
 int i;
 //  rt_timer_spin(40*MS);
 for (i=8000; --i; );
 }
 
 void timed_task_handler(void *cookie)
 {
 RTIME start_tsc = rt_timer_tsc();
 rt_task_set_mode(0, T_WARNSW, NULL);
 for (timed_count=0;;) {
 rt_task_sleep(40*MS); /* allow linux to run */
 
 RTIME prev_tsc = rt_timer_tsc();
 my_processing();
 RTIME tsc = rt_timer_tsc();
 timed_delay = rt_timer_tsc2ns(tsc-prev_tsc)*1e-9;
 
 timed_count++;
 rt_task_resume(display_task);
 if (timed_count = 100)
 break;
 }
 rt_task_set_mode(T_WARNSW, 0, NULL);
 printf(total time: %.3f\n,
 rt_timer_tsc2ns(rt_timer_tsc()-start_tsc)*1e-6);
 exit(0);
 }
 
 void display_task_handler(void *cookie)
 {
 for (;;) {
 rt_task_suspend(NULL);
 printf(%3d:  %.3f ms\n,
 timed_count, timed_delay * 1e3);
 }
 exit(0);
 }
 
 void context_switch_warn_handler(int sig)
 {
 printf(WARNING: detected switch to secondary mode\n);
 }
 
 int main(int argc, char *argv[])
 {
 int err;
 mlockall(MCL_CURRENT|MCL_FUTURE);
 signal(SIGXCPU, context_switch_warn_handler);
 rt_timer_set_mode(TM_ONESHOT);
 rt_task_spawn(timed_task, timed-task,
 0, 99, 0, timed_task_handler, 0);
 rt_task_spawn(display_task, display-task,
 0, 98, 0, display_task_handler, 0);
 pause();
 return 0;
 }
 
 Any clues?
 Thanks in advance,
 Saul.
 

No obvious problem with your program. Does the latency test of Xenomai
show any significant times? Maybe the SMI workaround does not catch all
issues on your system (though up to 60 ms jitter is still quite a lot).

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-help mailing list
Xenomai-help@gna.org
https://mail.gna.org/listinfo/xenomai-help


Re: [Xenomai-help] Non-deterministic behaviour in primary mode

2006-03-29 Thread Philippe Gerum

Saul wrote:

Hi,

I'm trying to run a short test program, just to get into the swing
of this shiny new real-time system. I'm running Xenomai 2.1.0 on
kernel 2.6.15.6

The following program runs a busy delay in primary mode (verified by
enabling the primary mode switch warning) and calculates the time taken.
Another task is resumed and prints the time.

I get output something like this:
  1:  40.017 ms
  2:  40.017 ms
  3:  53.357 ms
  4:  53.356 ms
  5:  40.017 ms
  6:  40.020 ms
  7:  53.355 ms
  8:  40.020 ms
  9:  40.017 ms
 10:  53.356 ms
 11:  40.020 ms
 12:  40.019 ms
 13:  106.710 ms
 14:  40.017 ms
 15:  40.018 ms ...

I assume I've done something silly but I can't find anything wrong. My
computer is a 3GHz P4. Xenomai is built-in statically, and at boot says
Xenomai: SMI-enabled Intel chipset found, enabling SMI workaround.



It would be nice to known whether the latency test from the testsuite 
has correct latency figures, so that we could rule out any SMI issue.



Anyway, the erratic execution times come and go with the smallest
changes in the program e.g. moving the delay into it's own function or
creating functions for timing the delay loop, so I wouldn't be too
surprised if the problem can't be reproduced.



What /proc/xenomai/stat says during execution btw?


Also, I found that using rt_timer_spin() instead of a real busy loop
does not have the same problem, which indicates that the simple busy
loop is in fact being interrupted during execution.



Just to rule this out, maybe you could try something along these lines:

iopl(3)
...
asm(cli);
 RTIME prev_tsc = rt_timer_tsc();
 my_processing();
 RTIME tsc = rt_timer_tsc();
asm(sti);

Out of curiosity, I'd also be interested in looking at the assembly dump 
of my_process in the working and non-working cases.



Is it a problem that I'm not explicitly destroying the tasks? They don't
linger in /proc/xenomai/sched once the program is killed. How much
is automatically cleaned up upon process exit?


Only tasks are automatically wiped out by the nucleus, basically because 
they could not exist anymore after their parent process terminates. 
OTOH, other objects of the native API can linger since they do not 
depend on process resources, until they are explicitely destroyed or the 
API module is unloaded.




#include stdio.h
#include stdlib.h // exit
#include unistd.h // pause
#include sys/mman.h // mlockall
#include signal.h
#include xenomai/native/task.h
#include xenomai/native/timer.h



I'll look at this on my side too.

--

Philippe.

___
Xenomai-help mailing list
Xenomai-help@gna.org
https://mail.gna.org/listinfo/xenomai-help


Re: [Xenomai-help] Non-deterministic behaviour in primary mode

2006-03-29 Thread Saul
Philippe Gerum wrote:
 It would be nice to known whether the latency test from the testsuite 
 has correct latency figures, so that we could rule out any SMI issue.

See my reply to Jan Kiszka. The max is 55us, but I've just noticed that
starting xawtv using the bttv driver causes an overrun in the latency
test, but that may be unrelated.

  Anyway, the erratic execution times come and go with the smallest
  changes in the program e.g. moving the delay into it's own function or
  creating functions for timing the delay loop, so I wouldn't be too
  surprised if the problem can't be reproduced.
  
 
 What /proc/xenomai/stat says during execution btw?

Here's /proc/xenomai/stat sampled every four seconds:

CPU  PIDMSWCSWPFSTAT  NAME
  0  0  0/042572094   0 00500080  ROOT

CPU  PIDMSWCSWPFSTAT  NAME
  0  0  0/042572110   0 00500080  ROOT
  0  28088  1/08  0 00380084  timed-task
  0  28089  8/715 0 00300081  display-task

CPU  PIDMSWCSWPFSTAT  NAME
  0  0  0/042572164   0 00500080  ROOT
  0  28088  1/035 0 00380084  timed-task
  0  28089  35/34  69 0 00300081  display-task

CPU  PIDMSWCSWPFSTAT  NAME
  0  0  0/042572216   0 00500080  ROOT
  0  28088  1/061 0 00380084  timed-task
  0  28089  61/60  1210 00300081  display-task

CPU  PIDMSWCSWPFSTAT  NAME
  0  0  0/042572272   0 00500080  ROOT
  0  28088  1/089 0 00380084  timed-task
  0  28089  89/88  1770 00300081  display-task

CPU  PIDMSWCSWPFSTAT  NAME
  0  0  0/042572296   0 00500080  ROOT

  Also, I found that using rt_timer_spin() instead of a real busy loop
  does not have the same problem, which indicates that the simple busy
  loop is in fact being interrupted during execution.
  
 
 Just to rule this out, maybe you could try something along these lines:
 
   iopl(3)
 ...
   asm(cli);
   RTIME prev_tsc = rt_timer_tsc();
   my_processing();
   RTIME tsc = rt_timer_tsc();
   asm(sti);

Case 1: the original data was the case of the busy loop in a function 
interrupts enabled.
  1:  40.017 ms
  2:  40.017 ms
  3:  53.357 ms
  4:  53.356 ms
  5:  40.017 ms
  6:  40.020 ms
  7:  53.355 ms
  8:  40.020 ms
  9:  40.017 ms
10:  53.356 ms

Case 2: as suggested, this is the case of having the busy loop in a
function  interrupts disabled.
  1:  53.316 ms
  2:  53.316 ms
  3:  53.316 ms
  4:  53.316 ms
  5:  53.316 ms
  6:  106.632 ms
  7:  53.316 ms
  8:  106.632 ms
  9:  35.544 ms
 10:  39.987 ms

It appears to have made the situation worse?! Note:
35.544 = 39.987 * 8/9
53.316 = 39.987 * 12/9
106.632 = 39.987 * 24/9
OK, sending me crazy enough without delving to numerological lows.
Next I tried two more cases...

Case 3: busy loop inline  interrupts enabled.
  1:  35.571 ms
  2:  40.019 ms
  3:  40.020 ms
  4:  40.017 ms
  5:  40.017 ms
  6:  40.019 ms
  7:  40.017 ms
  8:  40.020 ms
  9:  40.017 ms
 10:  40.019 ms
...
 55:  53.357 ms

Case 4: busy loop inline  interrupts disabled.
  1:  39.987 ms
  2:  39.987 ms
  3:  39.987 ms
  4:  39.987 ms
  5:  39.987 ms
  6:  39.987 ms
  7:  39.987 ms
  8:  39.987 ms
  9:  39.987 ms
 10:  39.987 ms

Oh, that last one looks good :) But even it has produced a 106.632ms
measure.

 Out of curiosity, I'd also be interested in looking at the assembly dump 
 of my_process in the working and non-working cases.

Well, from the above tests the problem appears to always be there, just
to varying degrees - I no longer believe there is a working case on my
PC. Disabling interrupts removes some noise from the signal, but the
real problem persists. The large increase in computation time (many ms)
appears to be quantized, made more visible by disabling interrupts. I
guess placing the delay in a function is simply pushing the execution
time into the next quantization level. The assembly all looks about the
same between cases:

--
Case 1: busy loop in a function  interrupts enabled.
my_processing:
pushl   %ebp
movl%esp, %ebp
movl$8000, %eax
.p2align 4,,15
.L2:
decl%eax
jne .L2
popl%ebp
ret
...snip...
callrt_timer_tsc
movl%eax, %ebx
movl%edx, %esi
callmy_processing
callrt_timer_tsc

--
Case 2: busy loop in a function  interrupts disabled.
my_processing:
pushl   %ebp
movl%esp, %ebp
movl$8000, %eax
.p2align 4,,15
.L2:
decl%eax
jne .L2
popl%ebp
ret
...snip...

Re: [Xenomai-help] Non-deterministic behaviour in primary mode

2006-03-29 Thread Saul
Jan Kiszka said:
 No obvious problem with your program. Does the latency test of
 Xenomai show any significant times? Maybe the SMI workaround does not
 catch all issues on your system (though up to 60 ms jitter is still
 quite a lot).

This is after a few minutes with dd running to disk simultaneously from
/dev/zero and /dev/urandom:

RTD| 0.609| 7.218| 25.352|   0|  0.063| 53.592

This is the highest latency I've seen on this PC, except when I run
another task on the same priority :) This latency is in microseconds,
while the extra delays I'm seeing during processing are in ms, often
taking double or triple the expected execution time.

Curiously the worst case latency often shows up during low-load
conditions.

The SMI workaround is set to globally disable SMI. I haven't tried with
SMI enabled since disabling it has had no noticeable effects - USB
works, firewire works, etc.

Actually, I just noticed that starting video capture using xawtv and the
bttv driver causes an overrun in the latency test, then the worst sits
on 100+ us until the test is restarted. This may be a different issue.

Saul.

___
Xenomai-help mailing list
Xenomai-help@gna.org
https://mail.gna.org/listinfo/xenomai-help


[Xenomai-help] Non-deterministic behaviour in primary mode

2006-03-28 Thread Saul
Hi,

I'm trying to run a short test program, just to get into the swing
of this shiny new real-time system. I'm running Xenomai 2.1.0 on
kernel 2.6.15.6

The following program runs a busy delay in primary mode (verified by
enabling the primary mode switch warning) and calculates the time taken.
Another task is resumed and prints the time.

I get output something like this:
  1:  40.017 ms
  2:  40.017 ms
  3:  53.357 ms
  4:  53.356 ms
  5:  40.017 ms
  6:  40.020 ms
  7:  53.355 ms
  8:  40.020 ms
  9:  40.017 ms
 10:  53.356 ms
 11:  40.020 ms
 12:  40.019 ms
 13:  106.710 ms
 14:  40.017 ms
 15:  40.018 ms ...

I assume I've done something silly but I can't find anything wrong. My
computer is a 3GHz P4. Xenomai is built-in statically, and at boot says
Xenomai: SMI-enabled Intel chipset found, enabling SMI workaround.

Anyway, the erratic execution times come and go with the smallest
changes in the program e.g. moving the delay into it's own function or
creating functions for timing the delay loop, so I wouldn't be too
surprised if the problem can't be reproduced.

Also, I found that using rt_timer_spin() instead of a real busy loop
does not have the same problem, which indicates that the simple busy
loop is in fact being interrupted during execution.

Is it a problem that I'm not explicitly destroying the tasks? They don't
linger in /proc/xenomai/sched once the program is killed. How much
is automatically cleaned up upon process exit?

#include stdio.h
#include stdlib.h // exit
#include unistd.h // pause
#include sys/mman.h // mlockall
#include signal.h
#include xenomai/native/task.h
#include xenomai/native/timer.h

#define MS 100

RT_TASK timed_task;
RT_TASK display_task;
int timed_count = 0;
double timed_delay = 0;

void my_processing(void)
{
int i;
//  rt_timer_spin(40*MS);
for (i=8000; --i; );
}

void timed_task_handler(void *cookie)
{
RTIME start_tsc = rt_timer_tsc();
rt_task_set_mode(0, T_WARNSW, NULL);
for (timed_count=0;;) {
rt_task_sleep(40*MS); /* allow linux to run */

RTIME prev_tsc = rt_timer_tsc();
my_processing();
RTIME tsc = rt_timer_tsc();
timed_delay = rt_timer_tsc2ns(tsc-prev_tsc)*1e-9;

timed_count++;
rt_task_resume(display_task);
if (timed_count = 100)
break;
}
rt_task_set_mode(T_WARNSW, 0, NULL);
printf(total time: %.3f\n,
rt_timer_tsc2ns(rt_timer_tsc()-start_tsc)*1e-6);
exit(0);
}

void display_task_handler(void *cookie)
{
for (;;) {
rt_task_suspend(NULL);
printf(%3d:  %.3f ms\n,
timed_count, timed_delay * 1e3);
}
exit(0);
}

void context_switch_warn_handler(int sig)
{
printf(WARNING: detected switch to secondary mode\n);
}

int main(int argc, char *argv[])
{
int err;
mlockall(MCL_CURRENT|MCL_FUTURE);
signal(SIGXCPU, context_switch_warn_handler);
rt_timer_set_mode(TM_ONESHOT);
rt_task_spawn(timed_task, timed-task,
0, 99, 0, timed_task_handler, 0);
rt_task_spawn(display_task, display-task,
0, 98, 0, display_task_handler, 0);
pause();
return 0;
}

Any clues?
Thanks in advance,
Saul.

___
Xenomai-help mailing list
Xenomai-help@gna.org
https://mail.gna.org/listinfo/xenomai-help