On 03.09.15 14:32, Savolainen, Petri (Nokia - FI/Espoo) wrote:
-----Original Message-----
From: ext Ivan Khoronzhuk [mailto:ivan.khoronz...@linaro.org]
Sent: Thursday, September 03, 2015 1:29 AM
To: Savolainen, Petri (Nokia - FI/Espoo); lng-odp@lists.linaro.org
Subject: Re: [lng-odp] RFC: New time API
Hi, Petri
We have to look at it from the standpoint of performance, platform portability
and simplicity.
If you want to split time into hi-res and low-res, they must have separate
functions and
must not sit under one common opaque time type, so that hi-res
measurements are not broken.
But in fact you split timers of the same quality; more on that further below...
The API has two goals (as any other API under ODP)
- solve a user problem (take timestamps and work with those)
- enable good performance on multiple HW platforms (enable direct HW time
counter(s) usage)
On 02.09.15 18:21, Savolainen, Petri (Nokia - FI/Espoo) wrote:
Hi,
I think we need to restart the time API discussion and specify it
with only wall time in mind.
Let's suppose.
CPU cycle count APIs can be tuned as a next step. CPU cycle counters
are affected by frequency scaling, which makes those difficult to use
for counting linear, real time.
The time API should specify an easy way to check and use the real,
wall clock time.
We need at least one time source that will not wrap in years - here
it's the "global" time
(e.g. in POSIX it's CLOCK_MONOTONIC).
Don't mix it with this API.
CLOCK_MONOTONIC is guaranteed by the OS, which can handle wraps; the OS can use
interrupts for that, you cannot.
A monotonic, very long wrap around time source is an application requirement.
CLOCK_MONOTONIC is an example of solving the same requirement in POSIX world.
Yes, an ODP implementation should not be interrupt driven, but still an
implementation can and likely will serve some interrupts: in worst case on a
worker core, in a better case on a control core and in the best case on a
system core outside of the ODP application (e.g. linux kernel on core #0). The
key is how often those interrupts need to be served and how long it takes in
the worst case. E.g. one interrupt due to counter wrap in 5 years on the core
running linux kernel, does not matter much.
Counter wraps are really an issue with short time counters. Today most chips
provide large enough counters, so that wrap around is not really an issue (max
one wrap in several years).
Also it must be zero at platform init and begin counting time when the
application starts. For each application that starts.
Can you guarantee that it's inited to zero on each platform? I
hesitate to answer this question.
The API does not specify that HW counter is reset at any point in time. It
specifies that wall time (nsec time) is zero in an application start up. In
practice, implementation needs to read HW counter once in start up and store
it. Basic stuff.
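A minimal sketch of the "read the HW counter once at start-up and store it" step described above. All names here (`hw_counter_read()`, `counter_hz`, `time_init()`, `wall_time_ns()`) are hypothetical and not part of the proposal, and a plain variable stands in for the free-running HW counter so the example is self-contained.

```c
#include <stdint.h>

/* Hypothetical stand-in for a free-running HW counter that is never reset. */
static uint64_t fake_hw_counter;

static uint64_t hw_counter_read(void)
{
	return fake_hw_counter;
}

/* Assumed counter rate in ticks per second; the real value is platform specific. */
static const uint64_t counter_hz = 100000000ULL; /* 100 MHz */

/* Counter value captured once at application start-up. */
static uint64_t start_count;

static void time_init(void)
{
	start_count = hw_counter_read();
}

/* Wall time in nsec since time_init(): zero at start-up whatever the counter holds. */
static uint64_t wall_time_ns(void)
{
	uint64_t ticks = hw_counter_read() - start_count;
	uint64_t sec   = ticks / counter_hz;
	uint64_t rem   = ticks % counter_hz;

	/* Split into seconds + remainder so that ticks * 1e9 cannot overflow
	 * (safe as long as counter_hz is below roughly 1.8e10). */
	return sec * 1000000000ULL + (rem * 1000000000ULL) / counter_hz;
}
```

Only read access to the counter is needed: the counter itself keeps whatever value it had, and the stored start value is subtracted on conversion.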
But let's suppose that we can guarantee it.
In this case time should be aligned for all executed applications to
start from zero.
Let's suppose some start_time = odp_time() at application init.
As was noted earlier, in some fast loop, init_count must be
subtracted in the diff function.
I'm not even talking about checking the time type, as you are going to
put all of them under one type.
Global time can be also compared between threads. Local time is
defined for optimizing short interval time checks.
In fact global has same quality as local. As noted earlier, global can
be emulated with local.
Why do we need to split them in this case? It only adds load on the user.
It's thread local, may wrap sooner than global time, but may be lower
overhead to use.
They are both 64-bit. You are going to use the same function for both,
overhead the same.
First, this API, like any other ODP API, affects only one ODP application (instance).
Global time is global between threads of a single application.
These are two different application use cases:
- global == time that can be shared between threads
- local == time that does not need to be shared between threads
Ability to share is the key difference in quality. Yes, when there's a SoC
level, low latency, high frequency, 64-bit time counter - it's sensible to use
that to implement both local and global. In this case, implementation also
avoids check between global and local time (it's all the same). I'm expecting
that this is the common case.
BUT, what if the SoC level HW counter has high latency to access (e.g. 150 CPU
cycles) and is low frequency? And the HW would have low latency, high frequency
per CPU counter that you could use for counting local time? If API has only
global definition, you could not use that HW resource even when application is
not interested in sharing timestamps with other threads (needs only local time).
The application would run slower on your HW, since every (local) timestamp
would consume 150 cycles instead of e.g. 1 cycle.
There could be actually four time bases defined in the API, if we'd
want to optimize for each use case (global.hi_res, global.low_res,
local.hi_res and local.low_res).
I'd propose to have only two and give system specific way to
configure those (rate and duration).
What do you mean by "give system specific way...", do you mean adding some API?
If so, I disagree. It's not ODP's responsibility and it's not applicable to
every platform.
Time counters are likely used also for OS, etc. So, vendor would need to document e.g. if
and how ODP global time resolution can be tuned. It may be e.g. through a Linux boot
parameter, because ODP reads the same counter that Linux uses for its wall clock time.
It's "system specific" how the HW resource (e.g. time counter rate) can be
configured and it's unlikely that an ODP application could change the setting, so we
don't need an API to set the rate, only an API to get the rate.
Typical config would be global.low_res and local.hi_res. User can
check hz and max time value (wrap around time) with odp_time_info() and
adapt (fall back to use global time) if e.g. local time wraps too
quickly (e.g. in 4 sec).
If this time wrap every 4s, it shouldn't be used at all...(any 32-bits)
Agree. Frequent wraps are the main problem, but can we rule today that any HW
interesting to ODP must have a HW time counter large enough to wrap no more
often than once per X years? We cannot rule the implementation (e.g. 32, 48, 64 bit
counter) only the API spec. API can require that there's at least one time
source that won't wrap often (global). It's then up to the implementation how
that's guaranteed (natively due to large enough counter, co-operating with OS,
using per core interrupts, ...).
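The "check the info and fall back to global time" idea could be sketched as follows. The struct fields are the ones from the RFC's odp_time_info_t; `can_use_local_time()` is a hypothetical helper added only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define ODP_TIME_NS_SEC 1000000000ULL /* Second in nsec */

/* Time info struct with the fields proposed in the RFC. */
typedef struct {
	uint64_t global_hz;
	uint64_t global_nsec_max;
	uint64_t local_hz;
	uint64_t local_nsec_max;
} odp_time_info_t;

/* Hypothetical helper: true if local time can cover the longest interval the
 * application wants to measure without wrapping, false if the application
 * should fall back to global time. */
static bool can_use_local_time(const odp_time_info_t *info,
			       uint64_t longest_interval_ns)
{
	return info->local_nsec_max >= longest_interval_ns;
}
```

An application would fill the struct once with odp_time_info() at start-up and pick the time source per use case from the result.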
See the proposal below.
-Petri
//
// Use cases
//
//          high resolution               low resolution
//          short interval                long interval
//          low overhead                  high overhead
//
//
// global   timestamp packets or        | timestamp log entries or
//          other global resources      | other global resources
//          at high rate                | at low rate
//                                      |
// -------------------------------------+------------------------------
//                                      |
// local    timestamp and sort items    | measure execution time over
//          in thread local work queue, | many iterations or over
//          measure execution time      | a "long" function
//          of a "short" function,      |
//          spin and wait for a short   |
//          while                       |
//
//
I see no reason to overload the user with this stuff.
In fact we always need one hi-resolution time with the best quality, no
matter what we measure.
Whatever resolution it has, it should be the max that the platform can
provide.
At this moment all counters are 64-bit and cannot wrap for years.
In my opinion, we shouldn't take 32-bit counters into account.
These are *application use cases*. One use case could be for example: stamp every
log entry (average 1 entry per minute) with millisecond resolution in global
time... => does not need low overhead or high resolution, but globally
synchronized linear time.
// time in nsec
// renamed to leave room for sec or other units in the future
#define ODP_TIME_NS_USEC 1000ULL /**< Microsecond in nsec */
#define ODP_TIME_NS_MSEC 1000000ULL /**< Millisecond in nsec */
#define ODP_TIME_NS_SEC 1000000000ULL /**< Second in nsec */
#define ODP_TIME_NS_DAY ((24*60*60)*ODP_TIME_NS_SEC) /**< Day in nsec */
// Abstract time type
// Implementation specific type, includes e.g.
// - counter value
// - potentially other information: global vs. local time, ...
typedef odp_time_t
This type can be added with only one aim: to make the user call the appropriate
API that can handle wraps correctly. Otherwise, as with the global time you are
proposing (no wrap), uint64_t can be used; there is no need to overload the API
with odp_time_t and APIs like diff, cmp, etc.
Main benefit from abstract time is that implementation can work in native
counter values. If API would specify that time is always nsec (or sec+nsec like
in POSIX struct timespec), every timestamp operation would need to convert
between counter cycles and nsec (which may add e.g. division operations in the
calls).
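To illustrate the point about native counter values, here is a sketch in which intervals are accumulated in ticks and converted to nsec only once at the end. The struct layout and `SKETCH_HZ` are assumptions made for this example; the function names follow the RFC, but the bodies are not taken from any actual ODP implementation.

```c
#include <stdint.h>

/* Assumed layout: the opaque time type simply wraps a native counter value. */
typedef struct {
	uint64_t ticks;
} odp_time_t;

/* Assumed counter rate for illustration: 250 MHz, i.e. 4 nsec per tick. */
#define SKETCH_HZ 250000000ULL

/* Sum works directly on native ticks: one addition, no unit conversion. */
static odp_time_t odp_time_sum(odp_time_t t1, odp_time_t t2)
{
	odp_time_t sum = { t1.ticks + t2.ticks };

	return sum;
}

/* The division is paid only here, when the user actually asks for nsec. */
static uint64_t odp_time_to_ns(odp_time_t time)
{
	uint64_t sec = time.ticks / SKETCH_HZ;
	uint64_t rem = time.ticks % SKETCH_HZ;

	return sec * 1000000000ULL + (rem * 1000000000ULL) / SKETCH_HZ;
}
```

With an nsec-based type, each of the accumulated intervals would have needed its own conversion before it could be summed.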
// Get global time
// Global time is common over all threads. Global timestamps can be compared
// between threads.
odp_time_t odp_time(void);
// Get thread local time
// Thread local time is only meaningful for the calling thread. It cannot be
// compared with other timestamps (global or local from other threads).
// May run from different clock source and different rate than global time.
// User must take care not to mix local and global time values in API calls.
odp_time_t odp_time_local(void);
I dislike the idea of local time. Theoretically it can be added, but I
see no reason for that.
Even if it's required, it should be handled with separate functions, since
according to the RFC it can wrap and global cannot. In every function the time
type has to be checked and a different approach chosen.
It's time consuming redundancy for short periods and this reduces the
actual resolution.
The reason is implementation efficiency. Implementation can be optimized for
local time (e.g. CPU local counters), when user doesn't need globally sharable
time value.
The spec says: local can wrap, NOT that it must wrap.
Implementation decides and knows:
- if it's possible to wrap (in practice)
- if it's identical to global time (== no redundancy, no checks needed, same
code serves both)
Yes, in most cases it will be so, but there is no guarantee.
// Compare time values
//
// Check if t1 is before t2 in absolute time, or if interval t1 is shorter
// than interval t2
//
// -1: t2 < t1
// 0: t2 == t1
// 1: t2 > t1
int odp_time_cmp(odp_time_t t1, odp_time_t t2);
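The return convention above is stated relative to t2, which is easy to get backwards. A minimal sketch, assuming odp_time_t wraps a plain tick count and that no wrap happens between the two values (the layout is an assumption of this example, not part of the RFC):

```c
#include <stdint.h>

/* Assumed layout: opaque time wraps a native counter value. */
typedef struct {
	uint64_t ticks;
} odp_time_t;

/* Returns -1 if t2 < t1, 0 if equal, 1 if t2 > t1.
 * Only meaningful when the counter did not wrap between t1 and t2. */
static int odp_time_cmp(odp_time_t t1, odp_time_t t2)
{
	if (t2.ticks < t1.ticks)
		return -1;
	if (t2.ticks > t1.ticks)
		return 1;
	return 0;
}
```

So a result of 1 means t1 is the earlier timestamp (or the shorter interval).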
This function, according to the RFC, must behave differently for local and
global time.
The use cases are also different: for time that can wrap, it can be used
only for ranges.
But, again, I dislike guaranteeing any timer linearity.
I added this function with only one intention: simply to compare time
ranges, no more.
Time is linear - the API needs to support that. Application can check if local
time stays linear long enough for its use case.
It doesn't sound like a simplification. In the current variant the user doesn't
need to worry about this.
If it does not, the global time should be the fall back (wrap only after
several years).
Range is a relative term - ranges longer than the wrap around time (in real
time) would again cause problems.
I think there is no need to compare ranges longer than years. It's not for this use case.
One option would be to force all time sources to have very long wrap around
times, which may cause low resolution on all of them (not only global). Maybe
it's better to just specify that cmp() must not be used if (nsec) time can wrap
between t1 and t2.
This also doesn't sound like a simplification. How can the user know whether he
is comparing wrapped time or not?
He cannot - that's the problem. No one can. You cannot predict which points
the user compares.
You cannot emulate it in the implementation either (suppose the worst case:
the implementation cannot guarantee the counter is inited to 0 at board start),
as the first wrap can happen at any time, while the second takes years. Relying
on this makes all applications very configuration dependent.
That is one of the main and clearest examples showing why we don't
need to hide wraps.
So this function cannot be used with timestamps at all, only ranges. To get a
range, you must use the diff function,
and the diff function can handle wraps inside. That's it. If you must use diff
and cmp, then why bother with wall time?
Why should the user still think about wraps, if you want to equate it with wall
time?
Or even use this function (it was one of your ideas) to check time order with a
function that requires order......
What about not bothering with the chicken/egg issue and always assuming that a
wrap either can happen or cannot happen at all?
Just state in the API file that it must be > 10 years, for instance, before the
first wrap.
And if your application runs for more than 10 years it can suddenly fail.
Uh.. or add to the description: ..never change your dtb file to another init value
or freq
if you don't know what you are doing... otherwise your application can
suddenly fail...
It would be a threshold for orientation, and both implementation and application
could rely on it. And it can hardly be controlled.
// Sum of t1 and t2
//
// User can sum timestamps or accumulate multiple intervals before
// comparing or converting to nsec
odp_time_t odp_time_sum(odp_time_t t1, odp_time_t t2);
// Time difference between t1 and t2
//
// Calculate interval from timestamp t1 to t2, or difference of two intervals.
// t2 must be the later timestamp, or the longer interval (t2 >= t1).
// Use cmp() first, if you don't know which timestamp is the later/longer.
Even if we suppose that it's split into local/global,
you cannot use it to compare timestamps that can wrap (local).
Compare can be used only to compare RANGES.
odp_time_t odp_time_diff(odp_time_t t1, odp_time_t t2);
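One way a diff function can "handle wraps inside", as discussed above, is modular subtraction over the counter width. This is a sketch under stated assumptions: a 32-bit counter (`SKETCH_COUNTER_BITS` is made up for the example), odp_time_t holding the raw counter value, and at most one wrap between t1 and t2.

```c
#include <stdint.h>

/* Assumed layout: opaque time wraps a native counter value. */
typedef struct {
	uint64_t ticks;
} odp_time_t;

/* Assumed HW counter width for this example. */
#define SKETCH_COUNTER_BITS 32
#define SKETCH_COUNTER_MASK ((1ULL << SKETCH_COUNTER_BITS) - 1)

/* Interval from t1 to t2, where t2 was taken later than t1.
 * Unsigned subtraction modulo 2^32 gives the right interval even if the
 * counter wrapped once in between; a second wrap cannot be detected. */
static odp_time_t odp_time_diff(odp_time_t t1, odp_time_t t2)
{
	odp_time_t diff = { (t2.ticks - t1.ticks) & SKETCH_COUNTER_MASK };

	return diff;
}
```

Note that this only works for intervals: comparing two raw timestamps with cmp() across a wrap would still give the wrong order.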
// Convert ODP time to wall clock time in nsec
//
// Wall clock time advances linearly in realtime and starts from 0 in ODP init.
//
// Global time must not wrap in several years (max time value is defined by
// info.global_nsec_max). Local time may have shorter wrap around time
// (info.local_nsec_max) than global, but it's also recommended to be years.
//
// Global and local time may run from different time base and thus result
// in different nsec values.
uint64_t odp_time_to_ns(odp_time_t time);
As I see it, it can be used for "local" time also.
You cannot get wall clock time from a time counter that can wrap with
this function.
It can be done only in this way:
start_time = odp_time(); // at init.
....
odp_time_to_ns(odp_time_diff(start_time, odp_time()))
Yes, this is what implementation needs to do when converting odp_time_t to nsec
time. User can see from the info struct when (and how often) a time source will
wrap. Both global and local nsec time may wrap the first time after e.g. >100
years when implemented with 64 bit counters.
Only if it can wrap; if it cannot, odp_time_to_ns(odp_time() - start_time) is
enough.
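The wrap-around figures mentioned in this thread follow from simple arithmetic on counter width and rate. A small sketch, with the hypothetical helper `wrap_period_sec()` and a 1 GHz rate chosen purely for illustration:

```c
#include <stdint.h>

/* Hypothetical helper: whole seconds until a counter of the given width,
 * running at hz ticks per second, wraps around. hz must be non-zero. */
static uint64_t wrap_period_sec(unsigned int bits, uint64_t hz)
{
	if (bits >= 64)
		return UINT64_MAX / hz; /* 2^64 - 1 ticks, one tick short of the wrap */

	return (1ULL << bits) / hz;
}
```

At an assumed 1 GHz, a 32-bit counter wraps after about 4 seconds, a 48-bit counter after about 3 days, and a 64-bit counter after more than 500 years, which is consistent with the "4 sec" and ">100 years" figures quoted above.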
// convert nsec value to global time
odp_time_t odp_time_from_ns(uint64_t ns);
// convert nsec value to local time
odp_time_t odp_time_local_from_ns(uint64_t ns);
// Time info structure
typedef struct {
// Global timestamp resolution in hz
uint64_t global_hz;
// Max global time value in nsec. Global time values (timestamps or
// intervals) larger than this are not handled correctly.
// Global wall clock time wraps back to zero after this value.
uint64_t global_nsec_max;
The user doesn't need to worry about this parameter.
Why do we need this? I see no use case, unless the user wants to catch
wraps.
But then why add global wall time if he needs to worry about this?
Strange.
We can easily spec that this should be in minimum "several years". It gets
trickier to spec that it must be at least X years. What would be good number that
everybody can support efficiently in HW? If we find a number let's put it here.
// Local timestamp resolution in hz
uint64_t local_hz;
// Max local time value in nsec. Local time values (timestamps or
// intervals) larger than this are not handled correctly.
// Local wall clock time wraps back to zero after this value.
uint64_t local_nsec_max;
} odp_time_info_t;
// Time info request
//
// Fill in time info struct. User can check resolutions and max time values
// in nsec.
//
// 0 on success
// <0 on failure
int odp_time_info(odp_time_info_t *info);
_______________________________________________
lng-odp mailing list
lng-odp@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lng-odp
In summary:
* I like wall time, but:
- it requires subtracting the init value in places where it's not needed.
- it requires a guarantee that the counter is set to 0 at init.
- it can be replaced with:
odp_time_diff(start_init_time, odp_time()) // gives you wall time,
// then convert to ns if you need,
which doesn't require a guarantee of 0 at init.
Better to do this once and inside implementation. Only read access to HW
counter is needed.
* Regarding local time
- it increases complexity.
- it requires holding different types of time under one opaque type, thus
- it requires checking the type of time on every call to odp_time_diff(), which
can be used in places sensitive to that.
- if local counters have better characteristics they can emulate the global
timer; in turn the global timer can be used everywhere. And it doesn't matter if
they are the same on some platforms, you have to worry about it in the
application anyway.
Application gives information (I need to share this timestamp, I don't need to
share this one), implementation uses that as it wishes.
I propose to always use global time and an API that is enough for all
cases.
Mostly it includes and follows the existing time API:
odp_time_t odp_time(void);
odp_time_t odp_time_diff(odp_time_t t1, odp_time_t t2); // ranges and timestamps
odp_time_t odp_time_sum(odp_time_t t1, odp_time_t t2);
uint64_t odp_time_to_ns(odp_time_t time);
odp_time_t odp_time_from_ns(uint64_t ns);
int odp_time_cmp(odp_time_t t1, odp_time_t t2); // only ranges
uint64_t odp_time_to_u64(odp_time_t time); // debug purposes
ODP_TIME_NULL // for init and comparison
To_u64 can be added.
In general, odp_time_t could be a struct and thus pointer could be used for
reference. Output would be through param and return value could indicate error
(e.g. too large time value is input).
Also #defines should be minimized for possible future binary compatibility. So,
odp_time_zero(odp_time_t *t) could be a better option
-Petri
I would prefer a separate API for the global timer, without any hidden wraps.
Everything should be written correctly, without rose-tinted glasses.
--
Regards,
Ivan Khoronzhuk