On Wed, 2012-02-08 at 00:16 +0100, Zygmunt Krynicki wrote:
> On 02/07/2012 11:43 PM, Andrew Richardson wrote:
> > Greetings,
> >
> > I'm experiencing what appears to be a minimum clock resolution issue in
> > using clock_gettime() on a PandaBoard ES running ubuntu.
> >
> >     *> uname -r*
> >     3.1.1-8-linaro-lt-omap
> >
> >     *> cat /proc/version*
> >     Linux version 3.1.1-8-linaro-lt-omap (buildd@diphda) (gcc version
> >     4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3) )
> >     #8~lt~ci~20120118001257+025756-Ubuntu SMP PREEMPT Thu Jan 19 09:
> >
> > I'm using clock_gettime() (and have tried gettimeofday()) to compute the
> 
> Which clock_t were you using? I think CLOCK_MONOTONIC makes sense for 
> what you are trying to do and perhaps it has different resolution/accuracy.

Hrm. No, that shouldn't be the case. CLOCK_MONOTONIC and CLOCK_REALTIME
are driven by the same accumulation, and are only different by an
offset.

That said, in the test case you're using CLOCK_MONOTONIC_RAW, which I
don't think you really want, as its not NTP freq corrected. In addition
it is driven by some slightly different accumulation logic.  But you
said CLOCK_REALTIME showed the same issue, so its probably not some
CLOCK_MONOTONIC_RAW specific bug.

> > elapsed time around roughly 15ms of computation (image processing).
> > While the computed time is stable on my x86_64 machine, it is not on my
> > PandaBoard ES. I have tried various clocks (e.g. CLOCK_REALTIME), but
> > the issue remains. No error codes are returned by clock_gettime().
> >
> > The result on my x86_64 machine looks like this:
> >
> >     *elapsed (s) elapsed (ns) elapsed (us) time (after) time (before)*
> >     0s 532260ns *532us* (t1: 73741s 92573265ns) (t0: 73741s 92041005ns)
> >     0s 544413ns *544us* (t1: 73741s 109390136ns) (t0: 73741s 108845723ns)
> >     0s 529328ns *529us* (t1: 73741s 126024860ns) (t0: 73741s 125495532ns)
> >
> >     A: 1.7s in total. *0.536ms* on average.
> >
> >
> > If I move over to my PandaBoard ES, I calculate elapsed times of 0us on
> > some iterations.
> >
> >     *elapsed (s) elapsed (ns) elapsed (us) time (after) time (before)*
> >     0s 0ns *0us* (t1: 269529s 192626951ns) (t0: 269529s 192626951ns)
> >     0s 0ns *0us* (t1: 269529s 215606688ns) (t0: 269529s 215606688ns)
> >     0s 2655030ns *2655us* (t1: 269529s 252349852ns) (t0: 269529s
> >     249694822ns)
> >     0s 2593994ns *2593us* (t1: 269529s 286163328ns) (t0: 269529s
> >     283569334ns)
> >     0s 30518ns *30us* (t1: 269529s 317657469ns) (t0: 269529s 317626951ns)
> >
> >
> > If I crank up the amount of work done between the time calls
> > (timetest.c:18: inneriters = 1e7;) such that the timed loop takes around
> > 72ms, the timing results seem accurate and none of the intermediate
> > calculations result in a 0us elapsed time. If I reduce it to around
> > 10-25ms (inneriters=1e6), I get occasional 0us elapsed times. Around 2ms
> > (inneriters=1e5), most results measure an elapsed time of 0us.

Hrm. So I'm not familiar with the clocksource on panda. It may be so
coarse grained as to not allow for better intervals, but 2.5ms intervals
are a little outrageous. In the above you do have a 30us interval, so
clearly there are smaller intervals, so I doubt that is the real issue.

2.5ms is much closer to a tick length when HZ=300. Or a sched out and in
w/ HZ=1000.


> > I'm trying to optimize image processing functions, which take on the
> > order of 2-15ms to process. Am I stuck with this timing resolution? I
> > want to be careful to not omit issues like cache performance when
> > timing, as I might if I repeatedly process an image to average the
> > results. Currently, that seems like the best option.

Might the compiler be out smarting you, and you end up with basically
two calls to clock_gettime() next to each other? Then it would be more
normal to see 0 ns time intervals (if the clocksource is somewhat coarse
grained), with the occasional scheduling blip hitting inbetween the
timings? 

This explanation doesn't match your image timing results though, as I
assume you're doing actual work in that case. 

Hmmm. I'm a little stumped. Can anyone closer to the OMAP hardware
comment?

thanks
-john




_______________________________________________
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev

Reply via email to