Hi -

In testing I wasn't seeing any of the Dwarf code triggered.

With LIBSAPROC_DEBUG set I'm getting the "Could not find executable section in" for lots of / maybe all the libraries...

src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c

   if (fill_instr_info(newlib)) {
     if (!read_eh_frame(ph, newlib)) {

fill_instr_info is failing, and we never get to read_eh_frame().

output like:

libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
libsaproc DEBUG: Could not find executable section in /lib/x86_64-linux-gnu/libnss_nis-2.27.so

(similar for all libraries).

fill_instr fails if:

 if ((lib->exec_start == 0L) || (lib->exec_end == 0L))

...but isn't exec_start relative to the library address? It's the value of ph->vaddr and it is often zero.

I added some booleans and did:

185       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
186         lib->exec_start = ph->p_vaddr;
187         found_start =true;
188       }

(similarly for end) and only failed if:

201   if (!found_start || !found_end) {
202     return false;

...and now it's better.   I go from:

----------------- 3306 -----------------
0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d

to:

----------------- 31127 -----------------
0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
0x00007fa2857a74ed      ContinueInNewThread + 0x4d
0x00007fa2857a8c49      JLI_Launch + 0x1529
0x000055af1b78db1c      main + 0x11c


Thanks
Kevin




On 10/03/2020 12:36, Yasumasa Suenaga wrote:

Hi Kevin,

Thanks for your comment!

On 2020/03/10 18:58, Kevin Walls wrote:
Hi Yasumasa ,

The changes build OK for me in the latest jdk, and things still work.
I have not yet seen the dwarf usage in action: I've tried a couple of different systems and so far have not reproduced the problem, i.e. jstack has not failed on native frames.

I may need more recent basic libraries, will look again for somewhere where the problem happens and get back to you as I really want to run the changes.

You can see the problem with JShell.
Some Java frames would not be seen in mixed jstack.


I have mostly minor other comments which don't need a new webrev, some just comments for the future:

src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:

DW_CFA_nop - shouldn't this continue instead of return?
(It may "never" happen, but a nop could appear within some other instructions?)

DW_CFA_nop is used for padding, so we can ignore (return immediately) it.


DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state".

I will fix it.


We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not DW_CFA_advance_loc4.  I thought that was odd, but maybe addresses in these tables never increase by 4-byte amounts, would this mean a lot of code on one line. 8-) So maybe it's never used in practice, if you think it's unnecessary no problem, maybe a comment, or add it for robustness.

I will add DW_CFA_advance_loc4.


General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser.

DwarfParser::process_dwarf() moves _buf.
It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is. I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout.

I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary.

I saw GDB and binutils source for creating this patch.
They seems to process similar code because we need to calculate DWARF instructions one-by-one to get the value which relates to specified PC.


Similarly in future, if this DWARF support code became used more widely, it might want to move to an
OS-neutral directory?  It's odd to label it as Linux-specific.

Windows does not use DWARF at least, it uses another feature.

https://urldefense.com/v3/__https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019__;!!GqivPVa7Brio!J801oKj34Q7f-4SzAWGKL67e6Xq2yMlV6f01eqp_fqqhqgKktCBiUi2RUaTtALtpRg$
I'm not sure other platforms (Solaris, macOS) uses DWARF.
If DWARF is used in them, I can move DWARF related code to posix directory.


src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
Thanks for changing "can_parsable" which was in the earlier version. 8-)


These are just comments to mainly say it looks good, and somebody else out there has read it. I will look for a system that shows the problem, and get back to you again!


Thanks,

Yasumasa


Many thanks
Kevin

On 27/02/2020 05:13, Yasumasa Suenaga wrote:
Hi all,

webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 8239462 changes (they updated copyright year). So I modified webrev (only copyright year changes) to be able to apply to current jdk/jdk.
Could you review it?

  http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/

I need one more reviewer to push.


Thanks,

Yasumasa


On 2020/02/17 13:07, Yasumasa Suenaga wrote:
PING: Could you review it?

   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/

This change has been already reviewed by Serguei.
I need one more reviewer to push.


Thanks,

Yasumasa


On 2020/02/03 1:37, Yasumasa Suenaga wrote:
PING: Could you reveiw this change?

   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/

I believe this change helps troubleshooter to fight to postmortem analysis.


Thanks,

Yasumasa


On 2020/01/19 3:16, Yasumasa Suenaga wrote:
PING: Could you review it?

   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/

I updated webrev. I discussed with Serguei in off list, and I refactored webrev.02 . It has passed tests on submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).


Thanks,

Yasumasa


On 2019/12/15 10:51, Yasumasa Suenaga wrote:
Hi Serguei,

Thanks for your comment!
I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev.
Also I fixed to free lib->eh_frame.data in libproc_impl.c as Dmitry said.

http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/

This change has been passed all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).


Thanks,

Yasumasa


On 2019/12/14 10:02, serguei.spit...@oracle.com wrote:
Hi Yasumasa,

This is nice move in general.
Thank you for working on this!

http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html

96 long libptr = dbg.findLibPtrByAddress(pc); 97 if (libptr == 0L) { // Java frame 98 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 99 if (rbp == null) { 100 return null; 101 } 102 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 103 } else { // Native frame 104 DwarfParser dwarf; 105 try { 106 dwarf = new DwarfParser(libptr); 107 } catch (DebuggerException e) { 108 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 109 if (rbp == null) { 110 return null; 111 } 112 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 113 } 114 dwarf.processDwarf(pc); 115 Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && 116 !dwarf.isBPOffsetAvailable()) 117 ? context.getRegisterAsAddress(AMD64ThreadContext.RBP) 118 : context.getRegisterAsAddress(dwarf.getCFARegister()) 119 .addOffsetTo(dwarf.getCFAOffset()); 120 if (cfa == null) { 121 return null; 122 } 123 return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); 124 }


I'd suggest to simplify the logic by refactoring to something like below:

           long libptr = dbg.findLibPtrByAddress(pc);
           Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
           DwarfParser dwarf = null;

           if (libptr != 0L) { // Native frame
             try {
               dwarf = new DwarfParser(libptr);
               dwarf.processDwarf(pc);
               Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
!dwarf.isBPOffsetAvailable())
                                 ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)                                  : context.getRegisterAsAddress(dwarf.getCFARegister())
.addOffsetTo(dwarf.getCFAOffset());

            } catch (DebuggerException e) { // bail out to Java frame case
            }
          }
          if (cfa == null) {
            return null;
          }
          return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);

http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html

58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()

   Better to rename 'ofs' => 'offs'.

77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());

   Extra space after '-' sign.

71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {

   It feels like the logic has to be somehow refactored/simplified as    several typical fragments appears in slightly different contexts.
   But it is not easy to understand what it is.
   Could you, please, add some comments to key places explaining this logic.    Then I'll check if it is possible to make it a little bit simpler.

109 private CFrame javaSender(ThreadContext context) { 110 Address nextCFA; 111 Address nextPC; 112 113 nextPC = getNextPC(false); 114 if (nextPC == null) { 115 return null; 116 } 117 118 DwarfParser nextDwarf = null; 119 long libptr = dbg.findLibPtrByAddress(nextPC); 120 if (libptr != 0L) { // Native frame 121 try { 122 nextDwarf = new DwarfParser(libptr); 123 } catch (DebuggerException e) { 124 nextCFA = getNextCFA(null, context); 125 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 126 } 127 nextDwarf.processDwarf(nextPC); 128 } 129 130 nextCFA = getNextCFA(nextDwarf, context); 131 return (nextCFA == null) ? null 132 : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 133 }

  The above can be simplified if a DebuggerException can not be thrown from processDwarf(nextPC):
      private CFrame javaSender(ThreadContext context) {
        Address nextPC = getNextPC(false);
        if (nextPC == null) {
          return null;
        }
        long libptr = dbg.findLibPtrByAddress(nextPC);
        DwarfParser nextDwarf = null;

        if (libptr != 0L) { // Native frame
          try {
            nextDwarf = new DwarfParser(libptr);
            nextDwarf.processDwarf(nextPC);
          } catch (DebuggerException e) { // Bail out to Java frame
          }
        }
        Address nextCFA = getNextCFA(nextDwarf, context);
        return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
      }

135 public CFrame sender(ThreadProxy thread) { 136 ThreadContext context = thread.getContext(); 137 138 if (dwarf == null) { // Java frame 139 return javaSender(context); 140 } 141 142 Address nextPC = getNextPC(true); 143 if (nextPC == null) { 144 return null; 145 } 146 147 Address nextCFA; 148 DwarfParser nextDwarf = dwarf; 149 if (!dwarf.isIn(nextPC)) { 150 long libptr = dbg.findLibPtrByAddress(nextPC); 151 if (libptr == 0L) { 152 // Next frame might be Java frame 153 nextCFA = getNextCFA(null, context); 154 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 155 } 156 try { 157 nextDwarf = new DwarfParser(libptr); 158 } catch (DebuggerException e) { 159 nextCFA = getNextCFA(null, context); 160 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 161 } 162 } 163 164 nextDwarf.processDwarf(nextPC); 165 nextCFA = getNextCFA(nextDwarf, context); 166 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 167 }

  This one can be also simplified a little:

      public CFrame sender(ThreadProxy thread) {
        ThreadContext context = thread.getContext();

        if (dwarf == null) { // Java frame
          return javaSender(context);
        }
        Address nextPC = getNextPC(true);
        if (nextPC == null) {
          return null;
        }
        DwarfParser nextDwarf = null;
        if (!dwarf.isIn(nextPC)) {
          long libptr = dbg.findLibPtrByAddress(nextPC);
          if (libptr != 0L) {
            try {
              nextDwarf = new DwarfParser(libptr);
              nextDwarf.processDwarf(nextPC);
            } catch (DebuggerException e) { // Bail out to Java frame
            }
          }
        }
        Address nextCFA = getNextCFA(nextDwarf, context);
        return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
      }

Finally, it looks like just one method could replace both
sender(ThreadProxy thread) and javaSender(ThreadContext context):

      private CFrame commonSender(ThreadProxy thread) {
        ThreadContext context = thread.getContext();
        Address nextPC = getNextPC(false);
        if (nextPC == null) {
          return null;
        }
        DwarfParser nextDwarf = null;

        long libptr = dbg.findLibPtrByAddress(nextPC);
        if (dwarf == null || !dwarf.isIn(nextPC)) {
          long libptr = dbg.findLibPtrByAddress(nextPC);
          if (libptr != 0L) {
            try {
              nextDwarf = new DwarfParser(libptr);
              nextDwarf.processDwarf(nextPC);
            } catch (DebuggerException e) { // Bail out to Java frame
            }
          }
        }
        Address nextCFA = getNextCFA(nextDwarf, context);
        return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
      }

I'm still reviewing the dwarf parser files.

Thanks,
Serguei


On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
Hi,

I refactored LinuxAMD64CFrame.java . It works fine in serviceability/sa tests and all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
Could you review new webrev?

http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/

The diff from previous webrev is here:
http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b


Thanks,

Yasumasa


On 2019/11/25 14:08, Yasumasa Suenaga wrote:
Hi all,

Please review this change:

   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/


According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64 Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
for stack unwinding.

As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system
library (e.g. libc) might be compiled with this feature.

However `jhsdb jstack --mixed` does not do so, it uses base pointer register (RBP).
So it might be lack of stack frames.

I guess JDK-8219201 is caused by same issue.


Thanks,

Yasumasa


[1] https://urldefense.com/v3/__https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf__;!!GqivPVa7Brio!J801oKj34Q7f-4SzAWGKL67e6Xq2yMlV6f01eqp_fqqhqgKktCBiUi2RUaQusmjOqA$


Reply via email to