Hi all,
thanks a lot for all the input! I have prepared a new version of the webrev
incorporating the suggestions you made (at least I tried):
http://cr.openjdk.java.net/~ghaug/webrevs/8200720.v3/
This version outputs the thread times unconditionally while the other
information is guarded by a flags. I think, most participants found the thread
times the most valuable information and had no (strong) objections to adding
them unconditionally.
@David
Implementation is much simpler with conditional output only for JavaThreads. I
could get rid of the terrible hack (changing the flag) without having to change
too many source files.
The information on allocated bytes is present in the Thread class already
before this proposed change, I just print it. It might be sensible to move
_allocated_bytes and the respective methods to the ThreadStatisticInfo class as
Götz suggested. I haven’t done that in the current version, though.
@David and Thomas
I've removed the pthread-id output. I'm unsure myself what it could be good
for. Anyway, we could add it with a separate change (with a better
implementation) if there is a need to.
@Chris
As you have written, the user has no direct contact to the attach listener of
the VM and jstack won't misinterpret e.g. -help. I agree that the parsing in
attachListener.cpp could be more robust. However, it hasn't been so far either.
All that is done in the current implementation is a strcmp to -l everything
else will be silently ignored. I can try to make this more robust or we could
say that the new output is only available via jcmd. OTH nothing bad can happen
with the current version of the proposed change, so we could also leave it as
it is. What would you prefer?
@Kirk and Thomas
Implementing a new diagnostic command is certainly well worth a thought.
However, it appears to me that it's not necessary in this case. There are
already flags to jstack and jcmd Thread.print, so it's not that uncommon. The
amount of information that would be available by a new diagnostic command and
that is now added to the thread dump is quite small and I think that is
tolerable. Moreover, and most importantly, our support team is use to find the
information in the thread dump and it would make things easier to us if I could
leave it there.
Thanks again and have a nice weekend,
Gunter
On 06.06.18, 04:56, "David Holmes" <david.hol...@oracle.com> wrote:
Hi Goetz,
On 5/06/2018 11:07 PM, Lindenmaier, Goetz wrote:
> Hi
>
> this discussion touched a lot of points so far, which seem
> to lead to different conclusions.
>
> I think we should look at the values printed:
>
> 1. cpu=6300.65ms elapsed=123.28s
> Overhead
> cpu time:
> * system calls at thread dump time
> elapsed time:
> * 1 system call at thread creation time
> * 1 64-bit field per thread for the thread start time
> * 1 system call at thread dump time
>
> As I understand, JDK-8143176 would have had to get the
> time at each locking, which is much more time critical
> than doing this during thread creation. While
Correct.
> the time a lock was held would be much more useful,
> the ratio here, combined with knowledge about the application,
> could lead to the conclusion that the thread is wrongly
> blocked at much lower cost.
Agreed. I see no issue with unconditionally adding this information as
it should address some of the concerns of 8143176 as well.
>
> 2. pthread-id=0x109708000
> Overhead:
> * a field access at thread dump time. Negligible I would say.
I'd overlooked this addition, but it is of course the chunk of ifdef
code in osThread.cpp that I objected to. The problem I have here is the
convoluted mess of "thread identifiers" that we already have. We
currently print:
st->print("tid=" INTPTR_FORMAT " ", p2i(this));
which is just the address of the Thread/JavaThread object. Then:
st->print("nid=0x%x ", thread_id());
where 'n' is supposed to represent "native thread id", which is:
- linux: kernel thread id from syscall gettid
- solaris: thread library identity from thr_create() or thr_self()
- windows: thread id from _beginthreadex
- OS X: kernel thread id from pthread_mach_thread_np(pthread_self());
- Other BSD: kernel (?) thread id from syscall thr_self or getthrid
- AIX: thread library identity from pthread_create() or pthread_self()
It's telling that the code Gunter added gets the thread library id on
linux, but a "kernel thread id" on other platforms! Do we want to see a
thread library id and a kernel thread id? IIRC nid is supposed be the id
you need to see the thread in a debugger. In which case I'm unclear what
role the thread id Gunter wants to add is playing? I don't think we need
to see kernel ids in general, and pthread-id isn't useful for debugging
is it?
>
> 3. allocated=242236760B defined_classes=1725
> Overhead:
> * 1 64-bit field per thread.
> * counter increment on class creation
>
> Especially defined_classes seems to be controversial.
> As this only makes sense for Java threads, this could
> be printed in a line of it's own in Threads::print_on_error()
> (which already gets a flag to customize for jstack:
> print_concurrent_locks) by calling a dedicated function in Thread.
> No need for flag PrintExtendedThreadInfo thus.
>
> But I would propose to drop this information as it is
> too controversial.
Agreed.
> This leaves the times and the pthread-id to be decided whether
> they are worth the cost. If so, I think they should be printed
> unconditional.
Agreed. With some refinement of the "thread id" issue.
> If a flag is required to avoid bloating the info in the default case,
> I would include more infos, as os_prio and nid under that flag.
Of course those things exist for good reason. The native id is supposed
to be what aids in you matching what you see in our thread dumps with
what you see in a debugger. Priority is less of an issue these days, but
there was a time ... ;-)
> As jstack is deprecated, changes to jstack could be skipped always
> disabling the new printouts.
>
> Finally, I would propose to move _allocated_bytes into
> threadStatisticInfo.hpp.
Didn't that part get dropped? (You don't need it for the time info or
thread id). That's a GC logging issue to me.
Thanks,
David
> Best regards,
> Goetz.
>
>
>
>
>> -----Original Message-----
>> From: serviceability-dev [mailto:serviceability-dev-
>> boun...@openjdk.java.net] On Behalf Of David Holmes
>> Sent: Dienstag, 5. Juni 2018 04:53
>> To: Haug, Gunter <gunter.h...@sap.com>; Chris Plummer
>> <chris.plum...@oracle.com>; serviceability-dev <serviceability-
>> d...@openjdk.java.net>; Langer, Christoph <christoph.lan...@sap.com>;
>> hotspot-runtime-...@openjdk.java.net
>> Subject: Re: RFR(S): 8200720: Print additional information in thread
dump
>> (times, allocated bytes etc.)
>>
>> Hi Gunter,
>>
>> On 5/06/2018 1:27 AM, Haug, Gunter wrote:
>>> Hi David, Chris,
>>>
>>> would it be an option to unconditionally print the additional
information?
>> Regardless which way this is implemented it will be rather complicated
and it
>> only switches on/off a few field in the thread dump.
>>
>> I'm not convinced this is all suitable information for a thread stack
>> dump. I can see the time information being useful if using the dump to
>> try and diagnoze a hang or responsiveness issue. But the allocated-bytes
>> and classes-defined just doesn't seem useful in the context of a thread
>> dump to me. Anyone interested in allocation stats is going to be looking
>> at GC logs etc which is where this belongs. Ditto the class-stats belong
>> in some kind of classloading logging IMHO.
>>
>> I'm very reluctant to burden the implementation with capturing these
>> kinds of stats, just to use for diagnostic purposes that may not even be
>> asked for. I'm very much in the "you shouldn't pay for what you don't
>> use" camp in that regard. (See my comments in JDK-8143176 referenced by
>> Sean.)
>>
>> The ThreadStatisticInfo adds overhead to every thread object - and I'd
>> have to suspect there could be overlap with whatever flight recorder
>> sticks in there too. (Thread has become even more bloated in recent
time!).
>>
>> Cheers,
>> David
>>
>>
>>> Thanks,
>>> Gunter
>>>
>>> On 04.06.18, 01:13, "David Holmes" <david.hol...@oracle.com> wrote:
>>>
>>> Hi Gunter, Chris,
>>>
>>> Sorry just trying to catch up and this is only a partial look so
far ...
>>>
>>> BTW these changes are not limited to serviceability code so
adding in
>>> runtime team as well.
>>>
>>> On 2/06/2018 9:12 AM, Chris Plummer wrote:
>>> > Hi Gunter,
>>> >
>>> > On 6/1/18 3:17 AM, Haug, Gunter wrote:
>>> >> Hi Chris,
>>> >>
>>> >> thanks for looking into this!
>>> >>
>>> >> Re the synchronization: The value is stored only in a VM
operation at
>>> >> a safepoint by the VM thread. I was of the opinion, that this
could
>>> >> not be interrupted by a second VM operation (of the same
type). Or
>> am
>>> >> I missing something else?
>>> > I was really thinking more about the temporary changing of
>>> > PrintExtendedThreadInfo, not the value stored in the VMOp. You
may
>> be be
>>> > correct that no more than one VMOp is executing, but while it
is
>>> > executing it is has changed the value of
PrintExtendedThreadInfo,
>> which
>>> > might have an impact on anything else that triggers printing
thread info
>>> > while the VMOp is executing.
>>>
>>> Even if nothing else can possibly be running during the
safepoint I find
>>> it extremely bad form to change a command-line flag in this way,
>>> particularly from a safepoint!
>>>
>>> If this flag is intended to be dynamically enabled as part of a
dcmd
>>> then it should also be a Manageable flag IMHO.
>>>
>>> That said ...
>>>
>>> >> I did think about passing an argument to the various
print_on
>> member
>>> >> functions of the thread classes, but this would require rather
>>> >> extensive code changes for a rather tiny extension. Therefore
I feel
>>> >> doing it like this is the lesser evil.
>>>
>>> ... it's obviously not truly a global setting, but one that is
needed on
>>> a per-print-request basis. The flag is just the default, but if
the
>>> default is off you still want to enable extended printing on a
>>> per-request basis.
>>>
>>> The current print_on mechanics is not set up for this kind of
>>> flexibility. I think this needs more thought.
>>>
>>> ---
>>>
>>> Re osThread.cpp shared code changes ... I really hate to see all
the
>>> ifdefs in there. Please add a pd_print_on function to support
this if
>>> you truly want platform specific information.
>>>
>>> ---
>>>
>>> threadStatisticInfo.hpp
>>>
>>> typo: getElepsedTime() -> getElapsedTime()
>>>
>>> Thanks,
>>> David
>>>
>>>
>>>
>>> >>
>>> >> Re thread_dump(): I think it's correct or, well, at least it
works ;-)
>>> >> In fact jstack will transfer the "-e" and "-l" in only one
string,
>>> >> i.e. op->arg(0).
>>> > So you are saying that op->arg(0) is "-e -l" when using
jstack? I think
>>> > you really need to clean up the parsing. As it stands right
now passing,
>>> > I get the feeling that if a user erroneously asks for help by
using
>>> > "jcmd <pid> Thread.Print -help", it will end up executing with
-e an -l
>>> > enabled, and no failure for the invalid "-help" option.
>>> >> Christoph has already explained my reasoning. But I agree,
it's not
>>> >> nice and I would be happy to do it like Christoph suggested.
>>> > I'll respond separately to his suggestion.
>>> >
>>> > thanks,
>>> >
>>> > Chris
>>> >>
>>> >> And typo fixed, sorry.
>>> >>
>>> >> Thanks again,
>>> >> Gunter
>>> >>
>>> >> On 01.06.18, 00:03, "Chris Plummer" <chris.plum...@oracle.com>
>> wrote:
>>> >>
>>> >> Hi Gunter,
>>> >> globals.hpp: fix typo "informatiuon"
>>> >> I worry a little bit about the synchronizing (if that's
the right
>>> >> word)
>>> >> of PrintExtendedThreadInfo and the dcmd's -e flag. When
using -e,
>>> >> you
>>> >> are temporarily enabling PrintExtendedThreadInfo if it
was false.
>>> >> This
>>> >> temporarily changes the behavior of thread dumps, and
could
>>> >> impact other
>>> >> uses that happen in parallel. Also, could two
simultaneous uses
>>> >> of -e
>>> >> result in PrintExtendedThreadInfo not getting restored
properly?
>>> >> thread_dump() doesn't look right. It looks like you are
iterating
>>> >> char
>>> >> by char over the argument, and expect something like
"-el" to be
>>> >> specified rather then "-e -l". The loop should be
iterating over
>>> >> op->arg(i), not op->arg(0)[i].
>>> >> The rest of the changes look fine.
>>> >> thanks,
>>> >> Chris
>>> >> On 5/30/18 8:12 AM, Haug, Gunter wrote:
>>> >> > Hi all,
>>> >> >
>>> >> > As Chris proposed, I have made an the extended output
>>> >> switchable. There is an VM flag (PrintExtendedThreadInfo),
which is
>>> >> false by default. Moreover, there is an Option (-e) which can
be used
>>> >> with jcmd Thread.print as well as with jstack. The -e option
>>> >> essentially sets PrintExtendedThreadInfo true just for the
respective
>>> >> thread dump.
>>> >> >
>>> >> > Here is the updated webrev:
>>> >> >
>>> >> > http://cr.openjdk.java.net/~ghaug/webrevs/8200720.v2
>>> >> >
>>> >> > (https://bugs.openjdk.java.net/browse/JDK-8200720)
>>> >> >
>>> >> > Thanks,
>>> >> > Gunter
>>> >> >
>>> >> >
>>> >> > On 02.05.18, 17:07, "serviceability-dev on behalf of
Haug,
>>> >> Gunter" <serviceability-dev-boun...@openjdk.java.net on
behalf of
>>> >> gunter.h...@sap.com> wrote:
>>> >> >
>>> >> > Hi Chris,
>>> >> >
>>> >> > Thanks for looking into this.
>>> >> > You're right, there is a little more we have. We
have
>>> >> implemented an IO tracing mechanism which - rather as a
byproduct -
>>> >> keeps track of bytes read and written per thread. However,
this of
>>> >> course requires changes not only in hotspot. We would be
happy to
>>> >> contribute this as well, but this is a far bigger change and
will
>>> >> probably lead to a far bigger discussion. Anyway, with the
number of
>>> >> bytes read, the number of classes defined doesn't look that
arbitrary
>>> >> anymore, as one can correlate IO to class loading.
>>> >> >
>>> >> > Regarding the verbose option I think that's a
good idea!
>>> >> >
>>> >> > Thanks again,
>>> >> > Gunter
>>> >> >
>>> >> > On 01.05.18, 22:55, "Chris Plummer"
>>> >> <chris.plum...@oracle.com> wrote:
>>> >> >
>>> >> > Hi Gunter,
>>> >> >
>>> >> > The output you are adding is all useful. I
think the
>>> >> question is (and
>>> >> > I'd like to see a few people chime in on this
for this
>>> >> review) is
>>> >> > whether or not all of it is the appropriate
for a
>>> >> thread's stack dump.
>>> >> > For example, defined_classes is on the fringe
of what
>>> >> I would call
>>> >> > generally useful info in a stack dump. Sure,
there
>>> >> might be that rare
>>> >> > case when you need it, but how often compared
to other
>>> >> useful info
>>> >> > maintained on a per thread basis. How many
other bits
>>> >> of useful info are
>>> >> > not being printed in favor of
defined_classes? It
>>> >> seems you have more in
>>> >> > the queue. How many? I'm worried about how
cluttered
>>> >> the stack dumps
>>> >> > will get. Maybe we should add some sort of
verbose
>>> >> thread dumping
>>> >> > option. Just a thought.
>>> >> >
>>> >> > As for the implementation, overall it looks
good, but
>>> >> I can't speak to
>>> >> > whether or not you are doing proper
accounting of
>>> >> defined_classes and
>>> >> > bytes allocated. You'll need input from
someone with
>>> >> more knowledge of
>>> >> > those areas. We'll also need to do some
testing to
>>> >> make sure tool tests
>>> >> > are not impacted.
>>> >> >
>>> >> > thanks,
>>> >> >
>>> >> > Chris
>>> >> >
>>> >> > On 4/30/18 2:51 AM, Haug, Gunter wrote:
>>> >> > > Hi,
>>> >> > >
>>> >> > > this is an update to an RFR I posted on
hotspot-dev,
>>> >> but it is probably more suitable to post it here. Can I
please have a
>>> >> review and a sponsor for the following enhancement:
>>> >> > >
>>> >> > >
http://cr.openjdk.java.net/~ghaug/webrevs/8200720.v1
>>> >> > >
https://bugs.openjdk.java.net/browse/JDK-8200720
>>> >> > >
>>> >> > > We at SAP have extended the thread dumps
(obtained
>>> >> by jstack or jcmd) by several fields providing thread specific
>>> >> information. These extensions are quite popular with our
support
>> team.
>>> >> With some knowledge of the architecture of the application,
they
>> often
>>> >> allow for quick and simple diagnosis of a running system.
Therefore
>> we
>>> >> would like to contribute these enhancements.
>>> >> > >
>>> >> > > I took a few simple examples here, namely
cpu time,
>>> >> elapsed time since thread creation, bytes allocated and
classes
>>> >> defined by the thread and the pthread-id or equivalent on
platforms
>>> >> where it makes sense. Provided it is known how the application
>> should
>>> >> behave, a misbehaving thread can identified easily.
>>> >> > >
>>> >> > > There is no measurable overhead for this
>>> >> enhancement. However, it may be a problem that the format of
the
>>> >> output is changed. Tools parsing the output may have to be
changed.
>>> >> > >
>>> >> > > Here is an example of the output generated:
>>> >> > >
>>> >> > >
------------------------------------------------------
>>> >> > >
>>> >> > > "main" #1 prio=5 os_prio=31 cpu=6300.65ms
>>> >> elapsed=123.28s allocated=242236760B defined_classes=1725
>>> >> tid=0x00007fa13a806000 nid=0x1c03 pthread-id=0x109708000
waiting
>> on
>>> >> condition [0x0000000109707000]
>>> >> > > java.lang.Thread.State: TIMED_WAITING
(sleeping)
>>> >> > > JavaThread state: _thread_blocked
>>> >> > > Thread: 0x00007fa13a806000 [0x1c03] State:
>>> >> _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>> >> > > JavaThread state: _thread_blocked
>>> >> > > at java.lang.Thread.sleep(java.base/Native
Method)
>>> >> > > ...
>>> >> > >
------------------------------------------------------
>>> >> > >
>>> >> > > As mentioned above, we have a whole bunch
of other
>>> >> enhancements to the thread dump similar to this one and would
be
>>> >> willing to contribute them if there is any interest.
>>> >> > >
>>> >> > > Thanks and best regards,
>>> >> > > Gunter
>>> >> > >
>>> >> > >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >>
>>> >
>>> >
>>>
>>>