On 4/12/2018 5:23 AM, Philippe Gerum wrote:
On 04/12/2018 11:31 AM, Philippe Gerum wrote:
On 04/09/2018 01:01 AM, Steve Freyder wrote:
On 4/2/2018 11:51 AM, Philippe Gerum wrote:
On 04/02/2018 06:11 PM, Steve Freyder wrote:
On 4/2/2018 10:20 AM, Philippe Gerum wrote:
On 04/02/2018 04:54 PM, Steve Freyder wrote:
On 4/2/2018 8:41 AM, Philippe Gerum wrote:
On 04/01/2018 07:28 PM, Steve Freyder wrote:
Greetings again.
As I understand it, for each rt_queue there's supposed to be a
"status file" located in the fuse filesystem underneath the
"/run/xenomai/user/session/pid/alchemy/queues" directory, with
the file name being the queue name. This used to contain very
useful info about queue status, message counts, etc. I don't know
when it broke or whether it's something I'm doing wrong but I'm
now getting a "memory exhausted" message on the console when I
attempt to do a "cat" on the status file.
Here's a small C program that just creates a queue, and then does
a pause to hold the accessor count non-zero.
<snip>
The resulting output (logged in via the system console):
# sh qtest.sh
+ sleep 1
+ ./qc --mem-pool-size=64M --session=mysession foo
+ find /run/xenomai
/run/xenomai
/run/xenomai/root
/run/xenomai/root/mysession
/run/xenomai/root/mysession/821
/run/xenomai/root/mysession/821/alchemy
/run/xenomai/root/mysession/821/alchemy/tasks
/run/xenomai/root/mysession/821/alchemy/tasks/task@1[821]
/run/xenomai/root/mysession/821/alchemy/queues
/run/xenomai/root/mysession/821/alchemy/queues/foo
/run/xenomai/root/mysession/system
/run/xenomai/root/mysession/system/threads
/run/xenomai/root/mysession/system/heaps
/run/xenomai/root/mysession/system/version
+ qfile='/run/xenomai/*/*/*/alchemy/queues/foo'
+ cat /run/xenomai/root/mysession/821/alchemy/queues/foo
memory exhausted
At this point, it hangs, although SIGINT usually terminates it.
I've seen some cases where SIGINT won't terminate it, and a reboot is
required to clean things up. I see this message appears to be logged
in the obstack error handler. I don't think I'm running out of memory,
which makes me think "heap corruption". Not much of an analysis! I did
try varying queue sizes and max message counts - no change.
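For reference, "memory exhausted" matches the text that glibc's default obstack failure handler prints before exiting when a chunk allocation returns NULL. The following is a minimal sketch of that path, assuming glibc's <obstack.h>. The names failing_alloc, report_failure and obstack_failure_is_trapped are hypothetical and are not Xenomai code; the always-failing allocator only stands in for whatever allocator is failing in the real setup.

```c
#include <obstack.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical chunk allocator that always fails, standing in for an
 * allocator that cannot satisfy the request. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/* The obstack macros pick up whatever these two names expand to. */
#define obstack_chunk_alloc failing_alloc
#define obstack_chunk_free free

static jmp_buf on_failure;

/* Replacement failure handler. It must not return, so it jumps back
 * to the caller instead of printing a message and exiting. */
static void report_failure(void)
{
	longjmp(on_failure, 1);
}

/* Returns 1 if the failure handler ran, 0 otherwise. */
int obstack_failure_is_trapped(void)
{
	struct obstack ob;

	obstack_alloc_failed_handler = report_failure;

	if (setjmp(on_failure))
		return 1;	/* reached via report_failure() */

	/* Initialization requests the first chunk; the allocator returns
	 * NULL, so the failure handler is invoked. */
	obstack_init(&ob);

	return 0;
}
```

With the default handler left in place, the same NULL return would end the process with the "memory exhausted" message instead of returning to the caller.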
I can't reproduce this. I would suspect a rampant memory corruption too,
although running the test code over valgrind (mercury build) did not
reveal any issue.
- which Xenomai version are you using?
- cobalt / mercury ?
- do you enable the shared heap when configuring ? (--enable-pshared)
I'm using Cobalt. uname -a reports:
Linux sdftest 4.1.18_C01571-15S00-00.000.zimg+83fdace666 #2 SMP Fri Mar 9 11:07:52 CST 2018 armv7l GNU/Linux
Here is the config dump:
CONFIG_XENO_PSHARED=1
Any chance you could have some leftover files in /dev/shm from aborted
runs, which would steal RAM?
I've been rebooting before each test run, but I'll keep that in mind for
future testing.
Sounds like I need to try rolling back to an older build, I have a 3.0.5
and a 3.0.3 build handy.
The standalone test should work with the shared heap disabled, could you
check it against a build configured with --disable-pshared? Thanks,
Philippe,
Sorry for the delay - our vendor had been doing all of our kernel and SDK
builds so I had to do a lot of learning to get this all going.
With the --disable-pshared in effect:
/.g3l # ./qc --dump-config | grep SHARED
based on Xenomai/cobalt v3.0.6 -- #6e34bb5 (2018-04-01 10:50:59 +0200)
CONFIG_XENO_PSHARED is OFF
/.g3l # ./qc foo &
/.g3l # find /run/xenomai/
/run/xenomai/
/run/xenomai/root
/run/xenomai/root/opus
/run/xenomai/root/opus/3477
/run/xenomai/root/opus/3477/alchemy
/run/xenomai/root/opus/3477/alchemy/tasks
/run/xenomai/root/opus/3477/alchemy/tasks/qcreate3477
/run/xenomai/root/opus/3477/alchemy/queues
/run/xenomai/root/opus/3477/alchemy/queues/foo
/run/xenomai/root/opus/system
/run/xenomai/root/opus/system/threads
/run/xenomai/root/opus/system/heaps
/run/xenomai/root/opus/system/version
root@ICB-G3L:/.g3l # cat run/xenomai/root/opus/3477/alchemy/queues/foo
[TYPE] [TOTALMEM] [USEDMEM] [QLIMIT] [MCOUNT]
FIFO 5344 3248 10 0
Perfect!
What's the next step?
I can reproduce this issue. I'm on it.
The patch below should solve the problem for the registry, however this
may have uncovered a bug in the "tlsf" allocator (once again), which
should not have failed allocating memory. Two separate issues then.
diff --git a/include/copperplate/registry-obstack.h b/include/copperplate/registry-obstack.h
index fe192faf7..48e453bc3 100644
--- a/include/copperplate/registry-obstack.h
+++ b/include/copperplate/registry-obstack.h
@@ -29,11 +29,12 @@ struct threadobj;
struct syncobj;
/*
- * Assume we may want fast allocation of private memory from real-time
- * mode when growing the obstack.
+ * Obstacks are grown from handlers called by the fusefs server
+ * thread, which has no real-time requirement: malloc/free is fine for
+ * memory management.
*/
-#define obstack_chunk_alloc pvmalloc
-#define obstack_chunk_free pvfree
+#define obstack_chunk_alloc malloc
+#define obstack_chunk_free free
struct threadobj;
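To illustrate what the two redefined macros select, here is a minimal sketch of an obstack backed by plain malloc/free, assuming glibc's <obstack.h>. The helper build_status_line and its arguments are hypothetical and do not reproduce the registry code; the sketch only shows that every chunk the obstack grows by comes from whatever obstack_chunk_alloc expands to.

```c
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

/* Same choice as in the patch: chunks come from the standard heap. */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical helper: builds "<type> <qlimit>" inside an obstack and
 * returns a malloc'ed copy that the caller must free(). */
char *build_status_line(const char *type, const char *qlimit)
{
	struct obstack ob;
	size_t len;
	char *in_obstack, *copy;

	obstack_init(&ob);

	/* Grow the current object piece by piece. */
	obstack_grow(&ob, type, strlen(type));
	obstack_1grow(&ob, ' ');
	obstack_grow(&ob, qlimit, strlen(qlimit));
	obstack_1grow(&ob, '\0');

	/* Size must be read before the object is finished. */
	len = obstack_object_size(&ob);
	in_obstack = obstack_finish(&ob);

	copy = malloc(len);
	if (copy != NULL)
		memcpy(copy, in_obstack, len);

	/* Passing NULL releases every chunk held by the obstack. */
	obstack_free(&ob, NULL);

	return copy;
}
```

Calling build_status_line("FIFO", "10") yields the string "FIFO 10".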
Thanks Philippe,
I shall add this to my build ASAP. If I understand correctly, this is
switching the entire registry-obstack-related dynamic storage allocation
mechanism from the "pv" routines (TLSF allocator?) paradigm to the
standard malloc/free paradigm.
I ask because my next issue report was going to be about a SEGV that I have
been seeing occasionally in registry_add_file() after having called pvstrdup()
and having gotten a NULL return back. The caller there apparently does not
expect a NULL return, so when you said "should not have failed allocating
memory" that brought my attention back to the SEGV issue. This appears to be
related to what I will call "heavy registry activity" when I am initializing:
creating lots of RT tasks, queues, mutexes, etc., causing hot activity in
registry_add_file(), I would expect.
I'm thinking creation of a "registry exerciser" program may be in order...
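As an illustration of the NULL-return case described above, here is a minimal sketch of a caller that checks a strdup-style allocation and reports -ENOMEM instead of dereferencing the result. Everything in it is hypothetical (struct file_node, node_set_name, heap_strdup, failing_strdup); it is not the actual registry_add_file() code, and it only assumes that pvstrdup() has a strdup-like signature.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a registry file node. */
struct file_node {
	char *basename;
};

/* Signature assumed to be shared by strdup-like allocators. */
typedef char *(*strdup_fn)(const char *);

/* malloc-backed duplicate, used as the working allocator. */
char *heap_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, s, len);

	return copy;
}

/* Allocator that always fails, simulating an exhausted pool. */
char *failing_strdup(const char *s)
{
	(void)s;
	return NULL;
}

/* Stores a copy of name in the node. Returns 0 on success, -ENOMEM if
 * the allocator returned NULL; the node is left untouched on failure. */
int node_set_name(struct file_node *node, const char *name, strdup_fn dup)
{
	char *copy = dup(name);

	if (copy == NULL)
		return -ENOMEM;

	node->basename = copy;

	return 0;
}
```

With failing_strdup, node_set_name() returns -ENOMEM and leaves basename as it was; with heap_strdup it returns 0 and the caller owns the copy.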