Jorge Almeida wrote:
> On Tuesday, 19 September 2006 16:50, wrote:
>> Jorge Almeida wrote:
>>> Hello to all,
>>>
>>> I'm testing the SOCK_RAW functionality in the rtnet framework for long
>>> periods of time and a problem is happening.
>>> I will try to describe it:
>>>
>>> I'm making tests sending 1.000.000 (one million) messages, with an interval
>>> of 5 ms each.
>>> After some time, more than 100.000 messages, the host where the test is
>>> running shows a strange behaviour: the program does not return but the bash
>>> dies. I must log in again to enter the host.
>>> The messages stop being sent (I'm monitoring the network with ethereal).
>>>
>>> In /proc I find some data about the file descriptor used by the socket
>>> (/proc/rtai/rtdm/open_fildes)
>>> Index Locked Device
>>> 0 0 PACKET_RAW
>>>
>>> I think this is OK because the socket was never closed.
>> Because the sender somehow died I think. But why does the console also
>> die? That's not a typical program error. Anything on the kernel console?
>
> Attached is the messages file for one session where the problem happens

Ok, your job (+ its shell) got OOM-killed (terminated for lack of
memory). Is the test program allocating some memory in a loop? Unless it
is a kernel leak (RTnet or even lower), the OOM-killer typically (not
always) picks The Right process...

>
>>> But the behaviour is strange.
>>> My guess is that this problem is due to some kind of semaphore or any
>>> synchronization mechanism.
>> The guess is based on which information?
> Because it only happens in a very high number of messages and not in a small
> number. Maybe a variable that overflows or anything like that.

dmesg tells some other story so far.

>>>
>>> Any clues for what is happening?
>> Nope.
>>
>> If there are no signs anywhere, I would first try to run your scenario
>> over a similar time using some vanilla RTnet version with normal packet
>> sockets. Have you tried this before? Just to exclude that there are
>> major stability issues.
>
> I've tested with SOCK_DGRAM two times, one OK the other the same problem. I'm
> doing some more tests with SOCK_DGRAM. I think this is not related to latest
> changes.

FWIW: that SOCK_DGRAM test took place over RTnet, say, 0.9.5 vanilla?

>
>> BTW, you are on RTAI? What version, patch, gcc?
> I'm using RTAI 3.4 test1, with gcc-4.1.0, with patch HAL IPIPE-NOTHREADS
> 1.3-08

I'm definitely no RTAI expert anymore, but the last time I tried it with
gcc-4.1 (a few months ago) it also jumped out of the window by just
running the latency test for half an hour or so. I think I read on the
RTAI list that gcc-4.1 is not producing reliable RTAI code.

>
> I'm gonna try with rtai 3.4 now.

I got flamed for such suggestions before, but to reduce the number of
unknowns I would really recommend running a similar setup over Xenomai
(2.2.2 recommended for now due to pending FPU issues in 2.2.3). This can
help to find out where we have to dig deeper for the problem.

Jan
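
To illustrate the question about allocating memory in a loop: below is a minimal sketch, not the actual test program, of the kind of per-message allocation that can slowly eat memory, next to a variant that reuses one buffer. The names `send_fn`, `discard_send`, `send_loop_leaky`, `send_loop_reuse` and the size `MSG_SIZE` are all hypothetical; the real send call on the socket is replaced by a function pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MSG_SIZE 64 /* hypothetical payload size in bytes */

/* Hypothetical stand-in for the real send call on the socket. */
typedef int (*send_fn)(const void *buf, size_t len);

/* Dummy sender that discards the data and reports len bytes as sent. */
int discard_send(const void *buf, size_t len)
{
    (void)buf;
    return (int)len;
}

/* Leaky pattern: one malloc per message, never freed.
 * Returns the number of bytes still allocated when the loop ends. */
size_t send_loop_leaky(unsigned long count, send_fn do_send)
{
    size_t leaked = 0;

    assert(do_send != NULL);
    for (unsigned long i = 0; i < count; i++) {
        char *msg = malloc(MSG_SIZE);
        if (msg == NULL)
            break;              /* out of memory: stop sending */
        memset(msg, 0, MSG_SIZE);
        do_send(msg, MSG_SIZE);
        leaked += MSG_SIZE;     /* no free(msg): this memory is lost */
    }
    return leaked;
}

/* Safe pattern: one stack buffer declared before the loop and reused.
 * Returns 0 because nothing is allocated on the heap. */
size_t send_loop_reuse(unsigned long count, send_fn do_send)
{
    char msg[MSG_SIZE];

    assert(do_send != NULL);
    for (unsigned long i = 0; i < count; i++) {
        memset(msg, 0, sizeof msg);
        do_send(msg, sizeof msg);
    }
    return 0;
}
```

With the assumed 64-byte payload, the leaky variant would leave roughly 64 MB allocated after 1.000.000 messages. Whether that is enough to wake the OOM-killer depends on the memory of the host, so this only shows the pattern to look for.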
_______________________________________________
RTnet-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rtnet-developers

