On 06/07/2018 11:04, Philippe Gerum wrote:
On 07/04/2018 07:06 PM, Federico Sbalchiero wrote:
Hi,
first I want to say thanks to everyone involved in Xenomai for their work.

I'm testing Xenomai 3.0.7 and ipipe-arm/4.14 on Freescale/NXP i.MX6q
sabresd board using Yocto. System boots fine and is stable, but latency
under load (xeno-test) is higher than in my reference system (Xenomai
2.6.5 on Freescale kernel 3.10.17 + ipipe 3.10.18).
This is with power management, frequency scaling, CMA, graphics, tracing,
and debugging all disabled.

I have found that a simple non-realtime user-space process writing to a
buffer in memory (memwrite) is able to trigger such high latencies.
Latency worsens considerably when running a copy of the process on each core.
There is a correlation between buffer size and cache size, suggesting
an L2 cache issue like the L2 write-allocate behavior discussed on the
mailing list, but I can confirm L2 WA is disabled (see log).

I'm looking for comments or suggestions.

Thanks,
Federico


"memwrite" test case:
#include <stdlib.h>
#include <stdio.h>

unsigned char *buffer;

int main(int argc, char **argv)
{
     int i;
     int count = 0;
     int n;
     int size = 10 * 1024 * 1024;
     volatile unsigned *pt;

     printf("load system by writing in memory\n");
     buffer = malloc(size);
     if (buffer == NULL) {
         printf("buffer allocation failed\n");
         exit(1);
     }
     n = size / sizeof(unsigned);
     while (1) {
         // sweep the whole buffer, writing one word at a time
         pt = (volatile unsigned *) buffer;
         for (i = 0; i < n; i++)
             *pt++ = i;
         count++; // completed passes over the buffer
     }
     return 0;
}

xeno-test on Xenomai 3.0.7 and ipipe-arm/4.14:
RTT|  00:00:01  (periodic user-mode task, 1000 us period, priority 99)
RTH|----lat min|----lat avg|----lat max|-overrun|---msw|---lat best|--lat worst
RTD|     18.000|     26.504|     42.667|       0|     0| 18.000|     42.667
RTD|     19.000|     25.198|     41.000|       0|     0| 18.000|     42.667
RTD|     18.999|     25.494|     40.999|       0|     0| 18.000|     42.667
RTD|     18.666|     25.060|     38.999|       0|     0| 18.000|     42.667
RTD|     18.999|     24.464|     38.332|       0|     0| 18.000|     42.667
RTD|     18.332|     24.546|     41.999|       0|     0| 18.000|     42.667
RTD|     13.332|     22.445|     45.665|       0|     0| 13.332|     45.665
RTD|     13.331|     21.164|     43.665|       0|     0| 13.331|     45.665
RTD|     13.331|     21.930|     43.665|       0|     0| 13.331|     45.665
RTD|     13.331|     22.254|     48.664|       0|     0| 13.331|     48.664
RTD|     13.331|     22.037|     46.664|       0|     0| 13.331|     48.664
RTD|     13.330|     21.053|     42.664|       0|     0| 13.330|     48.664
RTD|     13.330|     20.610|     37.330|       0|     0| 13.330|     48.664
RTD|     13.330|     20.520|     34.997|       0|     0| 13.330|     48.664
RTD|     13.330|     20.398|     39.330|       0|     0| 13.330|     48.664
RTD|     13.663|     21.249|     37.996|       0|     0| 13.330|     48.664
RTD|     13.329|     20.983|     35.663|       0|     0| 13.329|     48.664
RTD|     12.996|     20.039|     34.329|       0|     0| 12.996|     48.664
RTD|     13.329|     20.580|     42.662|       0|     0| 12.996|     48.664
RTD|     12.995|     20.518|     39.329|       0|     0| 12.995|     48.664
RTD|     13.328|     20.168|     35.662|       0|     0| 12.995|     48.664

xeno-test on Xenomai 2.6.5 and Freescale Linux 3.10.17 + ipipe 3.10.18:
RTT|  00:00:01  (periodic user-mode task, 1000 us period, priority 99)
RTH|----lat min|----lat avg|----lat max|-overrun|---msw|---lat best|--lat worst
RTD|      4.957|     17.575|     28.088|       0|     0| 4.957|     28.088
RTD|      4.904|     17.560|     26.828|       0|     0| 4.904|     28.088
RTD|      4.479|     13.472|     29.767|       0|     0| 4.479|     29.767
RTD|      4.522|     12.724|     23.275|       0|     0| 4.479|     29.767
RTD|      4.512|     12.904|     25.641|       0|     0| 4.479|     29.767
RTD|      4.542|     12.818|     27.878|       0|     0| 4.479|     29.767
RTD|      4.520|     13.068|     27.926|       0|     0| 4.479|     29.767
RTD|      4.409|     12.770|     26.689|       0|     0| 4.409|     29.767
RTD|      4.568|     12.265|     27.065|       0|     0| 4.409|     29.767
RTD|      4.492|     12.017|     25.898|       0|     0| 4.409|     29.767
RTD|      4.469|     12.303|     24.540|       0|     0| 4.409|     29.767
RTD|      4.489|     12.030|     27.924|       0|     0| 4.409|     29.767
RTD|      4.590|     11.851|     23.651|       0|     0| 4.409|     29.767
RTD|      4.479|     13.371|     24.838|       0|     0| 4.409|     29.767
RTD|      4.396|     13.204|     28.797|       0|     0| 4.396|     29.767
RTD|      4.411|     12.454|     26.002|       0|     0| 4.396|     29.767
RTD|      4.560|     12.234|     27.146|       0|     0| 4.396|     29.767
RTD|      4.593|     12.441|     24.686|       0|     0| 4.396|     29.767
RTD|      4.520|     12.510|     24.275|       0|     0| 4.396|     29.767
RTD|      4.568|     11.797|     24.982|       0|     0| 4.396|     29.767
RTD|      4.482|     12.631|     24.972|       0|     0| 4.396|     29.767

Worst-case on 2.6.5 + 3.18.20 is 67 us here, after 10 hrs of runtime on
imx6q (definitely not 30 us), stressing the latency test with:

- dd loop (zero -> null, 16M bs)
- switchtest -s 200

The 30 us worst case is over a very short term (1-2 minutes), with just one
instance of memwrite in the background.
Using the dd loop and switchtest gives 50 us over the short term. I suppose
that compares reasonably to 67 us after 10 hours.

I can also confirm that
dd if=/dev/zero of=/dev/null bs=16M
has the same effect on latency as memwrite. Thanks.

_______________________________________________
Xenomai mailing list
[email protected]
https://xenomai.org/mailman/listinfo/xenomai
