Re: jmx-dev RFR 6523160: RuntimeMXBean.getUptime() returns negative values

David Holmes Wed, 16 Oct 2013 19:28:20 -0700

Hi Jaroslav,

Minor nit: os::elapsed_time should really be defined in terms of theother functions ie:


return ((double) os::elapsed_counter()) / os::elapsed_frequency();

I also prefer the cast above as it is very clear that we will be doing afloating-point division.


Aside: AFAICS os::elapsed_time() is never actually used ??

I agree that it appears that changing the frequency should be okay.

Thanks,
David

On 17/10/2013 2:16 AM, Jaroslav Bachorik wrote:

On 15.10.2013 08:49, David Holmes wrote:

Hi Jaroslav,

os_bsd.cpp / os_linux.cpp:

If you don't have a monotonic clock you leave timer_frequency set to 0!
(So you need to test on a system without a monotonic clock, or else
force it to act as-if not present.)

That aside I don't trust clock_getres to give values that actually allow
the timer frequency to be determined. As per the comments in
os_linux.cpp:

// It's fixed in newer kernels, however clock_getres() still returns
// 1/HZ. We check if clock_getres() works, but will ignore its reported
// resolution for now. Hopefully as people move to new kernels, this
// won't be a problem.

we don't know what kernels provide real values here and which provide
dummy ones.

On BSD you haven't modified os::elapsed_counter.

Looking at the linux changes I don't think the logic is correct even if
clock_getres is accurate. In the existing code we have:

elapsed_counter -> elapsed time in microseconds
elapsed_frequency -> 1000 * 1000 (ie micros per second)
elapsed_time -> elapsed_counter*0.000001 -> time in seconds

Now we have:

elapsed_counter -> elapsed time in nanoseconds
elapsed_frequency -> 1x10^9 / whatever clock_getres says
elapsed_time -> counter/frequency -> ???

So elapsed_time not, in general, going to give the elapsed time in
seconds. And elapsed_time is not dependent on the "frequency" at all
because elapsed_counter is not reporting ticks but an actual elapsed
"time" in nanoseconds.


Also note that we constants for:

NANOSECS_PER_SEC
NANOSECS_PER_MILLISEC

to aid with time conversions.

The linux webrev contains unrelated UseLargePages changes!


Sorry for the mess with UseLargePages changes :/

I've fixed the problems with the frequency (using a fixed number as
before) and I kept the changes to minimum.

I was hesitating about changing the elapsed_counter precision from
microseconds to nanoseconds but since solaris and windows versions
already use nanosecond ticks for elapsed_counter I think the change is
safe.

The update webrev: http://cr.openjdk.java.net/~jbachorik/6523160/webrev.03



David
-----


On 15/10/2013 12:13 AM, Jaroslav Bachorik wrote:

On 10.10.2013 13:15, Staffan Larsen wrote:


On 10 okt 2013, at 13:02, Jaroslav Bachorik
<[email protected]> wrote:

On 10.10.2013 05:44, David Holmes wrote:

On 10/10/2013 4:12 AM, Staffan Larsen wrote:


On 9 okt 2013, at 16:19, Jaroslav Bachorik
<[email protected]> wrote:

On 9.10.2013 16:10, Staffan Larsen wrote:

There is now an awful amount of different timestamps in the
Management class. Can they be consolidated somehow? At least
_begin_vm_creation_time and the new _begin_vm_creation_ns.

This discussion also implies that the "elapsed time" we print in
the
hs_err file is also wrong. It uses os::elapsedTime() which uses
os::elapsed_counter().

And I guess the same thing for the VM.uptime Diagnostic Command
(class VMUptimeDCmd) which also relies on os::elapsed_counter().


Also the reported GC pauses duration might be wrong since it uses
Management::timestamp().

On the first sight the change looks rather trivial. But, honestly,
I'm not sure which other parts could for whatever reason break once
the time-of-day timestamp is replaced with a monotonic equivalent.
One would think that it shouldn't matter but one never knows ...

Staffan, do you think this kind of change is suitable for the
current
phase of JDK release cycle? I think I could improve the patch in
few
days and then it should probably be able to pass the review before
ZBB. But, it's only P3  ...


I think it is a bit late in the release cycle to clean this up in
the
way it should be cleaned up. I think we should wait until the
first 8
update release and do a more thorough job than we have time for
right
now.


I second that. The elapsed_counter/elpased_timer APIs and
implementations are a tangled mess. But part of the problem has been
that people want/expect monotonic time-of-day based timestamps (yes a
contradiction - though some people make sure TOD does not get
modified
on their production systems). The use of timestamps in logging has
to be
examined carefully - mainly GC logging. I recall a "simple" attempted
change in the past that resulted in trying to compare a nanoTime
based
timestamp with the TOD. :(


Actually, if I'm reading the sources right for Solaris and Win the
monotonic clock source is used to provide elapsed_counter() value. It
falls back to TOD when the monotonic clock source is not available.
For Linux/BSD the TOD is used directly.

This makes me wonder if changing the linux/bsd implementation to
follow the same logic would be really that disruptive.


Good point. I would like a world where elapsed_counter is monotonic
(where possible). Still a bit scary this late in the release, but an
interesting experiment.


The change is rather simple and tests ok. All the means to get a
monotonic timestamp are already there and proved to work. The core tests
in JPRT went fine.

The updated webrev is at
http://cr.openjdk.java.net/~jbachorik/6523160/webrev.02

-JB-


/Staffan


-JB-


David
-----

/Staffan


-JB-


/Staffan


On 9 okt 2013, at 13:26, Jaroslav Bachorik
<[email protected]> wrote:

On 8.10.2013 23:46, David Holmes wrote:

On 8/10/2013 10:36 PM, Jaroslav Bachorik wrote:

On 8.10.2013 09:34, David Holmes wrote:

Jaroslav,

On 2/10/2013 6:47 PM, Jaroslav Bachorik wrote:

Hello,

currently the JVM uptime reported by the RuntimeMXBean is
based on
System.currentTimeMillis() which makes it susceptible to
changes of the
OS time (eg. changing timezone, NTP synchronization etc.).
The
uptime
should not depend on the system time and should be calculated
using a
monotonic clock source.

There is already the way to get the actual JVM uptime in
ticks.
It is
accessible as Management::timestamp() and the ticks are
convertible to
milliseconds using Management::ticks_to_ms(ts_ticks) thus
making it
very
easy to switch to the monotonic clock based uptime.


Maybe I'm missing something but TiumeStamp updates using
os::elapsed_counter() which on Linux uses gettimeofday so is
not a
monotonic clock source.


Hm, yes. I wasn't aware of this linux/bsd specific.

Is there any reason why a non monotonic clock source is used
for
timestamping except of the historical one? os::javaTimeNanos()
uses
montonic clock when available - why can't be the same used for
os::elapsed_counter() especially when a counter based on
"gettimeofday"
is not really a counter?


It is all historical. These elapsed_counters and elapsed_timers
make me
cringe. But changing it has a lot of potential consequences
because of
the way these are used in logging etc. Certainly not something
to be
contemplated at this stage of JDK 8.

Perhaps a simpler fix here is to expose a startUpTimeNanos that
can then
be used for the uptime.


My attempt at this is at
http://cr.openjdk.java.net/~jbachorik/6523160/webrev.01/hotspot
I am using os::javaTimeNanos() to get the monotonic ticks where
possible.

The JDK part stays the same as for webrev.00

-JB-


David

-JB-


David
-----

The patch consists of the hotspot and jdk parts.

For the hotspot a new constant needs to be introduced in
src/share/vm/services/jmm.h and the actual logic to obtain
the
uptime in
milliseconds is added in
src/share/vm/services/management.cpp.

For the jdk the changes comprise of adding the necessary JNI
bridging
methods in order to get the new uptime, introducing the same
constant
that is used in hotspot and changes to mapfile-vers files in
order to
properly build the native library.

Issue:   https://bugs.openjdk.java.net/browse/JDK-6523160
Webrev:
http://cr.openjdk.java.net/~jbachorik/6523160/webrev.00

Thanks,

-JB-

Re: jmx-dev RFR 6523160: RuntimeMXBean.getUptime() returns negative values

Reply via email to