Re: RFR (trivial): 8250826: jhsdb does not work with coredump which comes from Substrate VM

Yasumasa Suenaga Sat, 01 Aug 2020 17:23:27 -0700

Hi Chris,

Thanks for your comment!
I pushed a new change to the submit repo, but the build failed on macOS. Could 
you share the details?
(I do not have a Mac.)


  commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297
  job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989

On 2020/08/01 13:06, Chris Plummer wrote:

On 7/30/20 6:18 PM, Yasumasa Suenaga wrote:

Hi Chris,

On 2020/07/31 7:29, Chris Plummer wrote:

Hi Yasumasa,

If I understand correctly, we first call add_map_info() for all the PT_LOAD 
segments in the core file. We then process all the library segments, calling 
add_map_info() for them if the target_vaddr has not already been added. If it 
has already been added, which I assume is the case for any library segment 
that is already in the core file, then the core file version is replaced with 
the library version. I'm a little unclear on the purpose of replacing the 
core PT_LOAD segments with those found in the libraries. If you could explain 
this, that would help me understand your change.


Read-only segments in ELF should not be any different from PT_LOAD segments in 
the core.
Also, the head of the ELF header might be included in the coredump (see 
JDK-7133122). Thus we need to replace the PT_LOAD segments with the library 
version.

Ok. The code in this area really should have been commented better when first 
written. The purpose is not understandable simply by reading the code.


I added some comments to existing code. Please tell me if it is insufficient.

I'm also unsure why existing_map->fd would ever be something other than the 
core file. Why would another library map the same target_vaddr?


When mmap() is called on read-only ELF segments / sections, the Linux kernel 
seems to allocate other memory segments which have the same top virtual memory 
address. I have not yet confirmed this from the Linux kernel code, but I 
observed this behavior in GDB.

Ok. Same comment as above. This should have been explained with comments in the 
code.


Added some comments.

As for your fix, if I understand correctly the issue is that a single segment 
in the library is being split into two segments in the process (and therefore 
in the core file) due to an mprotect being done on part of the segment. Because 
of this the segment size in the library does not match the segment size in the 
core file. So with your fix the library segment is used, but what about the 
other half of the segment that is in the core file? Don't we now have 
overlapping segments: the full original segment from the library, and then a 
second segment that overlaps the tail end of the library segment? Will that 
cause any confusion later on?


As long as the vaddr is valid, it doesn't matter even if the segments overlap, 
because SA sorts the maps by vaddr and looks them up by vaddr.
In Substrate VM, there are RO and RW sections in that order, so webrev.00 is 
ok for it. However, it might not be appropriate in general because the RW 
section might be at the top of PT_LOAD.
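
The lookup described above can be sketched roughly as follows. This is only a 
minimal illustration, not the actual SA code: sketch_region_t, sort_regions() 
and lookup_region() are hypothetical names, and the real map_info structure 
and lookup routine in libsaproc may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a memory map entry. */
typedef struct {
  uintptr_t vaddr;  /* start virtual address of the region */
  size_t    memsz;  /* size of the region in bytes */
  int       fd;     /* file that backs the region (core file or library) */
} sketch_region_t;

/* qsort() comparator: order regions by ascending start address. */
static int compare_by_vaddr(const void *a, const void *b) {
  const sketch_region_t *x = a;
  const sketch_region_t *y = b;
  return (x->vaddr > y->vaddr) - (x->vaddr < y->vaddr);
}

static void sort_regions(sketch_region_t *regions, size_t count) {
  qsort(regions, count, sizeof(regions[0]), compare_by_vaddr);
}

/* Binary search for the region with the largest start address that is
 * <= addr, then check that addr really falls inside that region.
 * Only this nearest region is checked, so an overlapping region that
 * starts earlier is never consulted for the same address. */
static const sketch_region_t *lookup_region(const sketch_region_t *regions,
                                            size_t count, uintptr_t addr) {
  assert(regions != NULL || count == 0);
  size_t lo = 0, hi = count;  /* search window is [lo, hi) */
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (regions[mid].vaddr <= addr) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  if (lo == 0) {
    return NULL;  /* every region starts above addr */
  }
  const sketch_region_t *candidate = &regions[lo - 1];
  return (addr - candidate->vaddr < candidate->memsz) ? candidate : NULL;
}
```

With a full library segment and a core segment that overlaps its tail, an 
address in the head resolves to the library segment and an address in the 
tail resolves to the core segment, so the overlap is harmless in this sketch.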

To make it more general, I changed it to the commit on the submit repo.
It compares the access flags in the coredump with those in the binary. If they 
are different, we respect the current map (loaded from the coredump) because 
it might have been changed at runtime.
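
The decision described above could look roughly like the sketch below. It is 
an illustration under assumptions, not the change on the submit repo: 
sketch_map_t, sketch_action_t, choose_action() and the SKETCH_PF_* constants 
are hypothetical names (the constants mirror the ELF PF_X / PF_W / PF_R 
values), and it assumes that only read-only library segments are considered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants mirroring the ELF segment permission bits. */
#define SKETCH_PF_X 0x1u
#define SKETCH_PF_W 0x2u
#define SKETCH_PF_R 0x4u

/* Hypothetical, simplified stand-in for a map entry built from the core. */
typedef struct {
  int       fd;     /* file that currently backs this map */
  uintptr_t vaddr;  /* start virtual address */
  size_t    memsz;  /* size in bytes */
  uint32_t  flags;  /* access flags recorded for this map */
} sketch_map_t;

typedef enum {
  SKETCH_ADD_NEW_MAP,         /* no map exists at this address yet */
  SKETCH_KEEP_CORE_MAP,       /* flags differ: trust the coredump */
  SKETCH_USE_LIBRARY_SEGMENT  /* flags match: use the library's contents */
} sketch_action_t;

/* Decide what to do with a read-only library segment whose target address
 * may already be covered by a map loaded from the coredump. */
static sketch_action_t choose_action(const sketch_map_t *existing,
                                     uint32_t lib_flags) {
  /* Assumption: the caller passes read-only library segments only. */
  assert((lib_flags & SKETCH_PF_W) == 0);

  if (existing == NULL) {
    return SKETCH_ADD_NEW_MAP;
  }
  if (existing->flags != lib_flags) {
    /* The protection may have been changed at runtime (for example by
     * mprotect()), so the map loaded from the coredump is kept. */
    return SKETCH_KEEP_CORE_MAP;
  }
  return SKETCH_USE_LIBRARY_SEGMENT;
}
```

Comparing flags rather than sizes avoids depending on the order of the RO and 
RW parts within the original segment.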

The change for LabsJDK 11 is simpler because JDK 11 does not have 
ps_core_common.c .
So I am sharing it with you. It may help:

  http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch


Thanks,

Yasumasa

thanks,

Chris



Thanks,

Yasumasa

thanks,

Chris

On 7/30/20 1:18 PM, Chris Plummer wrote:

Hi Yasumasa,

I'm reviewing this RFR, and I'd like to ask that it not be pushed as trivial. 
Although it is just a one-line change, it takes extensive knowledge to 
understand the impact. I'll read up on the filed graal issue and try to 
understand the ELF code a bit better.

thanks,

Chris

On 7/30/20 6:45 AM, Yasumasa Suenaga wrote:

Hi all,

Please review this trivial change:

  JBS: https://bugs.openjdk.java.net/browse/JDK-8250826
  webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/

I played with Truffle NFI on GraalVM, but I could not get Java stacks from a 
coredump via jhsdb.

I've reported this issue to the GraalVM community [1], and I've found out that 
the cause is that .svm_heap is separated into RO and RW areas by mprotect() 
calls at run time, even though .svm_heap is an RO section in ELF (please see 
[1] for details).
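
The general effect can be reproduced with a small sketch like the one below. 
It is not Substrate VM code: split_readonly_mapping() is a hypothetical helper 
that uses an anonymous mapping instead of an ELF section, and it only shows 
that mprotect() on part of a read-only mapping leaves one part RO and the 
other RW.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Maps two read-only pages, then makes only the second page writable.
 * Returns 0 on success and -1 on failure. */
static int split_readonly_mapping(void) {
  long page = sysconf(_SC_PAGESIZE);
  assert(page > 0);
  size_t len = 2 * (size_t)page;

  /* One contiguous read-only mapping covering two pages. */
  char *base = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED) {
    return -1;
  }

  /* Change the protection of the second page only. From here on the
   * first page is RO and the second page is RW. */
  if (mprotect(base + page, (size_t)page, PROT_READ | PROT_WRITE) != 0) {
    munmap(base, len);
    return -1;
  }

  base[page] = 1;  /* writing to the second page is now allowed */
  int ok = (base[0] == 0 && base[page] == 1);

  munmap(base, len);
  return ok ? 0 : -1;
}
```

On Linux, the two parts then show up as separate entries in /proc/<pid>/maps, 
which is why the sizes seen at runtime no longer match the single section in 
the ELF file.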

It is a corner case, but we will see the same problem in jhsdb whenever we try 
to analyze a coredump from an application or library that splits RO sections 
in ELF at runtime, as Substrate VM does.

I sent a PR to fix libsaproc.so in LabsJDK 11 for this issue [2], and community 
members then suggested that I discuss it on serviceability-dev.


Thanks,

Yasumasa


[1] https://github.com/oracle/graal/issues/2579
[2] https://github.com/graalvm/labs-openjdk-11/pull/9
