John,

Thank you for your detailed reply.

> > I was finally able to get it to configure similar to Eric's setup (although I am running 2.6.27-magma due to problems with the 2.6.24 kernel not playing nicely with my CPU fan -- a known problem), but have an odd thing which seems to require a CPU hog process in another terminal. If the process is not there, then the RT thread seems to stall waiting for the scheduler to give it a time slice.
>
> How long of a stall?

Here is my experimental setup... Open up two windows. In the first, run either the configuration program or the latency test. The test writes out the header block and just sits there. I mean it does not appear to run the thread at all. Now go and run a CPU hog in the other terminal. My hog of choice is "top -d 0", which grabs process info with a 0 second period and just flashes. Once I start the second process, the first (RT) starts chunking out the latency info at 1 second intervals.

Now for the really interesting part. Kill the first process. Killing the CPU hog process causes the RT thread to stall again, until I restart the CPU hog. Every once in a while it will give another tick, but not continuously as expected. So, while the 1 second wait does check the latency, control never goes back to the calling process to get ready for the next tick. This completely disrupts the actual flow of the latency code, and it does not really check the overall smoothness of the requested ticks -- which has huge implications for the tool velocities of a machine driven by a processing thread on a CPU with this problem (and I would call it a bug). It may just be some weirdness with my kernel config, but I personally consider this a warning sign...

> In my experience, the "cpu hog" is able to reduce latencies from 10-20 microseconds down to perhaps 5-7uS. If your stalls are much longer than that you must be seeing something new.

The stall is not happening inside the RT loop, but outside it in the calling process. As I was playing with various configurations (like Steve's? suggestion of isolcpus=1, turning off hyperthreading, etc.) I did see a similar reduction. There are some interesting patterns in the actual results. I was watching not only the ovr_max, but also the lat_max. For me the lat_max will bounce around between, say, -200ns and maybe 300ns, then jump into blocks of 2000ns to 4000ns, and then sometimes settle back to near 0.

> My own theory (and it is only a theory) about why the cpu hog works is related to cache. The hog uses very little memory, and since it keeps one CPU busy, that CPU never runs any other code. So the RT code doesn't get flushed out of cache, and doesn't have to get fetched back into cache later.

The cache theory (which makes sense) would explain the jumping blocks seen above, but it does not explain my current problem with the non-RT side of the latency test stalling the way it does. Maybe what is needed is another latency test which uses a continuous/periodic interrupt. This would have caught my problem -- or, if it is already written that way, then something hinky is definitely going on, because it is not only the I/O which is getting backed up: the process appears to be actually stalling, since it resumes at 1 second intervals (similar to putting the process to sleep and then resuming it).

> I saw some other cache related behavior a long time ago when doing some latency testing. The latency results improved noticeably when I lowered the thread period below some threshold. (I don't remember the threshold period, it was at least a year ago.)
>
> I eventually realized that when I was running the thread very frequently, the RT code never got pushed out of cache. When I increased the period, other processes had enough time to replace the RT code in cache between invocations of the thread.
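
To make the periodic-test idea above a bit more concrete, here is a rough user-space sketch. It is not the existing latency test and it does not use a real interrupt or the RT thread. It only shows the principle of checking every tick against an absolute deadline, so that a stalled caller shows up as a run of very late ticks instead of hiding behind a once-per-second summary. The names periodic_latency, summarize and stall_factor are made up for illustration, and Python timing is far too coarse to reproduce the nanosecond figures discussed here.

```python
import time


def periodic_latency(period_ns, ticks):
    """Wake up on absolute deadlines and record how late each wake-up is (ns)."""
    lateness = []
    deadline = time.monotonic_ns() + period_ns
    for _ in range(ticks):
        remaining = deadline - time.monotonic_ns()
        if remaining > 0:
            time.sleep(remaining / 1e9)
        # Positive value: we woke up after the deadline by that many ns.
        lateness.append(time.monotonic_ns() - deadline)
        # Advance by a fixed period so a stall is not silently absorbed.
        deadline += period_ns
    return lateness


def summarize(lateness, period_ns, stall_factor=2):
    """Report min/max lateness and the ticks that look like stalls.

    A tick counts as stalled when it is more than stall_factor periods
    late. That threshold is an arbitrary choice for this sketch.
    """
    stalled = [i for i, late in enumerate(lateness)
               if late > stall_factor * period_ns]
    return {"min_ns": min(lateness), "max_ns": max(lateness),
            "stalled_ticks": stalled}


if __name__ == "__main__":
    period = 1_000_000  # 1 ms, chosen only for the example
    print(summarize(periodic_latency(period, 1000), period))
```

Running this once with and once without the CPU hog in the other terminal would at least show whether an ordinary periodic process sees the same stall pattern.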

John, thanks again for your reply, and I will keep this in mind while I trudge along. As a note, I am going to go back and completely reconfigure my machine from scratch and see if I can sort this out.

Best regards,

EBo

_______________________________________________
Emc-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/emc-developers
