Hi John,

Please see below.

On Tuesday, June 7, 2016 at 5:26:32 AM UTC-4, John leger wrote:
>
> Hi Islam,
>
> I like the definition of 95% hard real time; it suits my needs. Thanks for 
> this good paper.
>
> On Monday, June 6, 2016 at 18:45:35 UTC+2, Islam Badreldin wrote:
>>
>> Hi John,
>>
>> I am currently pursuing a similar effort. I got a GPIO pin on the 
>> BeagleBone Black embedded board toggling in hard real-time and verified the 
>> jitter with an oscilloscope. For that, I used a vanilla Linux 4.4.11 kernel 
>> with the PREEMPT_RT patch applied. I also released an initial version of a 
>> Julia package that wraps the clock_nanosleep() and clock_gettime() 
>> functions from the POSIX real-time extensions. Please see this other thread:
>> https://groups.google.com/forum/#!topic/julia-users/0Vr2rCRwJY4
>>
>> I tested that package on both an Intel-based laptop and the BeagleBone 
>> Black. I am giving some of the relevant details below.
>>
>> On Monday, June 6, 2016 at 5:41:29 AM UTC-4, John leger wrote:
>>>
>>> Since it seems you have a good overview in this domain I will give more 
>>> details:
>>> We are working in signal processing, and especially in image processing. 
>>> The goal here is just adaptive optics: we only want to stabilize the 
>>> image, not produce the final image.
>>> The consequence is that we will not store anything on the hard drive: we 
>>> read an image, process it and destroy it. We stay in RAM all the time.
>>> The processing is done with our own algorithms, so for now there is no 
>>> need for any external library (and I don't see any reason for one at the 
>>> moment).
>>>
>>> First I would like to apologize: just after posting my answer I went to 
>>> Wikipedia to look up the difference between soft and hard real time. 
>>> I should have done it before so that you would not have to spend more time 
>>> explaining.
>>>
>>> In the end I still don't know if I am hard real time or soft real time: 
>>> the timing is given by the camera speed and the processing should be done 
>>> between the acquisition of two images.
>>> We don't want to miss an image or delay the processing, but I still need to 
>>> clarify the consequences of a delay or of a missed image.
>>> For now let's just say that we can miss some images, so we want soft real 
>>> time.
>>>
>>
>> The real-time performance you are after could be 95% hard real-time. See 
>> e.g. here: https://www.osadl.org/fileadmin/dam/rtlws/12/Brown.pdf
>>  
>>
>>>
>>> I'm making a benchmark that should match the system in terms of 
>>> complexity. These are my first remarks:
>>>
>>> When you say that one allocation is unacceptable, I say it's shockingly 
>>> true: in my case I had 2 allocations caused by:
>>>     A += 1, where A is an array
>>> and in 7 seconds I had 600k allocations. 
>>> Moral: in a closed loop you cannot accept any allocation, so you have to 
>>> write out all loops explicitly.
>>>
>>
>> Yes, try to completely avoid memory allocations while developing your own 
>> algorithms in Julia. Pre-allocations and in-place operations are your 
>> friends! The example script available in the POSIXClock package is one way 
>> to do this (
>> https://github.com/ibadr/POSIXClock.jl/blob/master/examples/rt_histogram.jl).
>>  
>> The real-time section of the code is marked by a ccall to mlockall() in 
>> order to cause immediate failure upon memory allocations in the real-time 
>> section. You can also use the --track-allocation option to hunt down 
>> memory allocations while developing your algorithm. See e.g. 
>> http://docs.julialang.org/en/release-0.4/manual/profile/#man-track-allocation
>>  
>>
>
> I discovered --track-allocation not so long ago and it is a good tool. 
> For now I think I will rely on tracking allocations manually. I am a little 
> afraid of using mlockall(): in soft or hard real time, crashing (failure) is 
> not a good option for me...
> Since you are talking about --track-allocation I have a question:
>
>
>         -     function deflat(v::globalVar)
>         0         @simd for i in 1:v.len_sub
>         0             @inbounds v.sub_imagef[i] = v.flat[i]*v.image[i]
>         -         end
>         -         
>         0         @simd for i in 1:v.len_ref
>         0             @inbounds v.ref_imagef[i] = v.flat[i]*v.image[i]
>         -         end
>         0         return
>         -     end
>         - 
>         -     # get min max
>         -     # apply norm_coef
>         -     # MORE TO DO HERE
>         -     function normalization(v::globalVar)
>         0         min::Float32 = Float32(4095)
>         0         max::Float32 = Float32(0)
>         0         tmp::Float32 = Float32(0)
>         0         norm_fact::Float32 = Float32(0)
>         0         norm_coef::Float32 = Float32(0)
>         -         # find min max
>         0         @simd for i in 1:v.nb_mat
>         0             # Doing something with no allocs
>         0         end
>         0     end
>         0 
>   1226415     # SAD[70] 16x16 of Ref_Image over Sub_Image[60]
>         -     function correlation_SAD(v::globalVar)
>         0 
>         -     end
>         - 
>
> In the .mem output file I have this information: at the end of 
> normalization I have no allocations, but next to the SAD comment, just before 
> the empty correlation function, I have 1226415 allocations.
> It would be logical for these allocations to have happened in normalization, 
> but why are they reported here, between two functions?
>

Yes, I noticed the same thing when I used --track-allocation=user. The 
following lines from the manual solved the puzzle:
"In interpreting the results, there are a few important details. Under the 
user setting, the first line of any function directly called from the REPL 
will exhibit allocation due to events that happen in the REPL code itself. 
More significantly, JIT-compilation also adds to allocation counts, because 
much of Julia’s compiler is written in Julia (and compilation usually 
requires memory allocation). The recommended procedure is to force 
compilation by executing all the commands you want to analyze, then call 
Profile.clear_malloc_data() to reset all allocation counters."
http://docs.julialang.org/en/release-0.4/manual/profile/#memory-allocation-analysis
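
The procedure described in that passage can be sketched roughly as follows. This is only a minimal illustration, not code from your benchmark or from POSIXClock: scale! and allocated_bytes are hypothetical names standing in for a kernel like your deflat(), and the `using Profile` line assumes a recent Julia 1.x release (the 0.4 manual quoted above calls Profile.clear_malloc_data() directly, so that line may not be needed there).

```julia
using Profile  # assumption: Julia 1.x, where Profile is a standard library

# Hypothetical in-place kernel (stand-in for deflat): writes into a
# pre-allocated output buffer instead of creating a new array.
function scale!(out::Vector{Float32}, flat::Vector{Float32}, image::Vector{Float32})
    @inbounds @simd for i in 1:length(out)
        out[i] = flat[i] * image[i]
    end
    return nothing
end

# Hypothetical helper: warm up first, then measure the bytes allocated by
# one more call. Measuring inside a function avoids global-scope effects.
function allocated_bytes(out, flat, image)
    scale!(out, flat, image)                    # warm-up, forces compilation
    return @allocated scale!(out, flat, image)  # bytes allocated by this call
end

n     = 1024
flat  = ones(Float32, n)
image = fill(2.0f0, n)
out   = zeros(Float32, n)    # allocated once, outside the time-critical part

scale!(out, flat, image)     # 1. run everything you want to analyze once
Profile.clear_malloc_data()  # 2. reset the counters so compilation is excluded
scale!(out, flat, image)     # 3. only calls from here on show up in the *.mem files

bytes = allocated_bytes(out, flat, image)
println("bytes allocated by scale!: ", bytes)
```

When run with --track-allocation=user, the counts attributed to the lines of scale! should then reflect only the call made after the reset. The @allocated check is an optional in-process sanity test that does not need the command-line option.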


 

>  
>
>>
>>> I have two problems now:
>>>
>>> 1/ Many times, the first run, which includes the compilation, was the 
>>> fastest, and every other run was slower by a factor of 2.
>>> 2/ If I relaunch the main function, which is in a module, many times, 
>>> some runs are very different (slower) from the previous ones.
>>>
>>> About 1/, although I find it strange, I don't really care.
>>> 2/ is far more problematic: once the code is compiled I want it to behave 
>>> the same regardless of the number of launches.
>>> I have some ideas why, but no certainty. What bothers me the most is that 
>>> all the runs in the benchmark will be slower: it's not a temporary 
>>> slowdown, it's the whole current benchmark that is slower.
>>> If I launch again it is back to the best performance.
>>>
>>> Thank you for the links they are very interesting and I keep that in 
>>> mind.
>>>
>>> Note: I disabled hyper-threading and overclocking, so it should not be the 
>>> CPU doing funky things.
>>>
>>>
>>>
>> Regarding these two issues, I encountered similar ones. Are you running 
>> on an Intel-based computer? I had to do many tweaks to get to acceptable 
>> real-time performance with Intel processors. Many factors could be at play. 
>> As you said, you have to make sure hyper-threading is disabled and not to 
>> overclock the processor. Also, monitor the kernel dmesg log for any errors 
>> or warnings regarding RT throttling or local_softirq_pending.
>>
>> Additionally, I had to use the following options on the Linux kernel 
>> command line (pass them from the bootloader):
>>
>> intel_idle.max_cstate=0 processor.max_cstate=0 idle=poll
>>
>> Together with removing the intel_powerclamp kernel module (sudo rmmod 
>> intel_powerclamp). Caution: be extremely careful with such a configuration as 
>> it disables many power saving features in the processor and can potentially 
>> overheat it. Keep an eye on the kernel dmesg log and try to monitor the CPU 
>> temperature.
>>
>> I also found it useful to isolate one CPU core using the isolcpus=1 
>> kernel command line option and then set the affinity of the real-time Julia 
>> process to run on that isolated CPU (using the taskset command). This way, 
>> you can almost guarantee the Linux kernel and all other user-space processes 
>> will not run on that isolated CPU so it becomes wholly dedicated to running 
>> the real-time Julia process. I am planning to post more details to the 
>> POSIXClock package in the near future.
>>
>>
> I do have an Intel processor, and thanks for all the tips. I will first 
> try isolating a CPU, then disabling the Intel options.
>  
>
>> Best,
>> Islam
>>
>>
> Again thanks a lot for all the help.
>  
>

You're welcome!

Cheers,
Islam 
