On 28.07.20 15:28, Stéphane Ancelot wrote:

On 27/07/2020 at 15:17, Jan Kiszka wrote:
On 27.07.20 14:44, Stéphane Ancelot via Xenomai wrote:
Hi,

We are using a pipe created with poolsize = 0, meaning all message allocations for this pipe are performed on the Cobalt core heap.

Unfortunately, when calling rt_pipe_write() while no user task is consuming the data, we discovered that after a large number of rt_pipe_write() cycles (at least 700000 in our process), the Cobalt heap and the system heap appear to become corrupted.

This leads to system issues such as unexpected task crashes.


"3.x" implies both 3.1 and 3.0 are affected?

Do you see a constantly growing use of system heap (leak)? If that is not the case, we might have some wrap-around issue somewhere.

The version we are using is based on release b3e18b6d of the master branch.

We don't see system memory usage increasing (checked with top).

Comparing it with the latest releases, we have not found any significant differences in the XDDP code.

Testing with other releases, applications or kernel builds does not guarantee that we can tell whether the problem has been solved, since the memory layout needed to reproduce it changes.

For certification reasons, we cannot validate the latest source code; we can only cherry-pick a localised hotfix into the Xenomai code.


Reproduction case would be nice.

It is not easy. The initial problem was reported by one of our users, and we spent a lot of time before we managed to reproduce it in our context.

Some graphical user tasks were locking up or crashing after some days of use in production.

At first, we went in the wrong direction while trying to identify where the problem came from.

In our system, we had to go back and test each code commit in order to isolate the problem, and to understand that it became visible after almost 700000 rt_pipe_write calls in our case.


As a unit test, we can provide the enclosed snippet. It is the extracted code that causes the problem.


Under which condition does that test_pipe.cpp cause the issue? I've given it a quick try, and as it's late, I disabled the delay in the loop. That so far did not trigger an issue. Is the delay important?

Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
