On 04/25 22:47:11, Maxim Uvarov wrote:
> On 04/25/17 21:51, Brian Brooks wrote:
> > On 04/25 12:25:03, Brian Brooks wrote:
> >> On 04/24 08:07:58, Savolainen, Petri (Nokia - FI/Espoo) wrote:
> >>>>> diff --git a/platform/linux-generic/arch/x86/odp_cpu_arch.c b/platform/linux-generic/arch/x86/odp_cpu_arch.c
> >>>>> index c8cf27b6..9ba601a3 100644
> >>>>> --- a/platform/linux-generic/arch/x86/odp_cpu_arch.c
> >>>>> +++ b/platform/linux-generic/arch/x86/odp_cpu_arch.c
> >>>>> @@ -3,7 +3,14 @@
> >>>>>   *
> >>>>>   * SPDX-License-Identifier:     BSD-3-Clause
> >>>>>   */
> >>>>> +
> >>>>> +#include <odp_posix_extensions.h>
> >>>>> +
> >>>>>  #include <odp/api/cpu.h>
> >>>>> +#include <odp_time_internal.h>
> >>>>> +#include <odp_debug_internal.h>
> >>>>> +
> >>>>> +#include <time.h>
> >>>>>
> >>>>>  uint64_t odp_cpu_cycles(void)
> >>>>>  {
> >>>>> @@ -31,3 +38,55 @@ uint64_t odp_cpu_cycles_resolution(void)
> >>>>>  {
> >>>>>         return 1;
> >>>>>  }
> >>>>> +
> >>>>> +uint64_t cpu_global_time(void)
> >>>>> +{
> >>>>> +       return odp_cpu_cycles();
> >>>>
> >>>> A cycle counter cannot always be used to measure time. Even on x86,
> >>>> odp_cpu_cycles() will return the value of RDTSC which is not actually
> >>>> representative of the cycle count. Even if the x86 processor is set
> >>>> to a fixed frequency, the Invariant TSC may run at a different fixed
> >>>> frequency. Please take a look at the odp_tick_t proposal here:
> >>>>
> >>>> https://docs.google.com/document/d/1sY7rOxqCNu-bMqjBiT5_keAIohrX1ZW-eL0oGLAQ4OM/edit?usp=sharing
> >>>>
> >>>
> >>> From the cover letter:
> >>> "This patch set modifies time implementation to use TSC when running
> >>> on an x86 CPU that has the invariant TSC CPU flag set. Otherwise, the
> >>> same Linux system time is used as before. TSC is much more efficient
> >>> than a Linux system call, both in performance and in latency/jitter.
> >>> This can also be seen with the scheduler latency test, which time
> >>> stamps events with this API. All latency measurements (min, ave, max)
> >>> improved significantly."
> >>>
> >>> This function (cpu_global_time()) is called only when we have first 
> >>> checked that TSC is invariant. Also we measure the TSC frequency in that 
> >>> case. This function is defined in the same file as cpu_cycles(), and the 
> >>> file is x86 specific. So, we know what we are doing, and just re-using 
> >>> the code to read TSC.
> > 
> > What sort of timing accuracy is expected from the app?
> > 
> > From benchmarking the maximum single-threaded rate of these reads:
> > 
> >  x86_64:
> > 
> >    read       7 ns/op
> >    read_sync  22 ns/op
> > 
> >  A57:
> > 
> >    read       4 ns/op
> >    read_sync  26 ns/op
> > 
> > read_sync issues a synchronizing instruction for greater timing accuracy
> > but clearly takes more time to return the time value read from the core.
> > 
> 
> 
> It has to depend on the CPU frequency.
> 
> Maxim.

We are showing the difference between 'read' and 'read_sync' on the
same machine here.
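
For reference, a minimal sketch of how such a 'read' vs 'read_sync'
comparison could be measured. This is not the benchmark code behind the
numbers above. The names counter_read(), counter_read_sync() and
ns_per_op() are hypothetical, and the choice of LFENCE (x86_64) and ISB
(AArch64) as the synchronizing instruction is an assumption. On other
architectures both reads fall back to clock_gettime().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__)
#include <x86intrin.h>

/* Plain read: RDTSC may be reordered with surrounding instructions. */
static uint64_t counter_read(void)
{
	return __rdtsc();
}

/* Synchronized read: LFENCE waits for earlier instructions to complete
 * before RDTSC executes (assumed choice of synchronizing instruction). */
static uint64_t counter_read_sync(void)
{
	_mm_lfence();
	return __rdtsc();
}
#elif defined(__aarch64__)
/* Plain read of the generic timer's virtual counter. */
static uint64_t counter_read(void)
{
	uint64_t v;

	__asm__ volatile("mrs %0, cntvct_el0" : "=r"(v));
	return v;
}

/* Synchronized read: ISB before reading the counter (assumed choice). */
static uint64_t counter_read_sync(void)
{
	uint64_t v;

	__asm__ volatile("isb\n\tmrs %0, cntvct_el0" : "=r"(v) : : "memory");
	return v;
}
#else
/* Fallback for other architectures: both variants use the system clock. */
static uint64_t counter_read(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

static uint64_t counter_read_sync(void)
{
	return counter_read();
}
#endif

typedef uint64_t (*read_fn)(void);

/* Average cost of one read in nanoseconds, single-threaded, timed with
 * CLOCK_MONOTONIC. The function-pointer call overhead is included. */
static double ns_per_op(read_fn fn, uint64_t iters)
{
	struct timespec t0, t1;
	volatile uint64_t sink = 0; /* keeps the reads from being optimized out */
	uint64_t i;

	assert(iters > 0);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++)
		sink = fn();
	clock_gettime(CLOCK_MONOTONIC, &t1);
	(void)sink;

	return ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
		(double)(t1.tv_nsec - t0.tv_nsec)) / (double)iters;
}
```

Calling ns_per_op(counter_read, N) and ns_per_op(counter_read_sync, N)
back to back on the same core gives the two ns/op figures to compare.
The absolute values will differ between machines, but the gap between
the two is what matters here.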
