On Tue, 30 Aug 2022 23:57:55 GMT, Chris Plummer <[email protected]> wrote:
> While dumping all registers (and doing a findpc on each), the following
> exception occurred for r8:
>
>
> r8: 0x000000750e4fdffc
> Error: java.lang.ArrayIndexOutOfBoundsException: Index 4099 out of bounds for
> length 4096
> java.lang.ArrayIndexOutOfBoundsException: Index 4099 out of bounds for length
> 4096
> at jdk.hotspot.agent/sun.jvm.hotspot.debugger.Page.getLong(Page.java:182)
> at
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.getLong(PageCache.java:100)
> at
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readCInteger(DebuggerBase.java:364)
> at
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readAddressValue(DebuggerBase.java:462)
> at
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readAddress(WindbgDebuggerLocal.java:312)
> at
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgAddress.getAddressAt(WindbgAddress.java:71)
> at
> jdk.hotspot.agent/sun.jvm.hotspot.utilities.PointerFinder.find(PointerFinder.java:58)
> at
> jdk.hotspot.agent/sun.jvm.hotspot.runtime.JavaThread.printThreadContextOn(JavaThread.java:493)
>
>
> This exception is the result of using PointerFinder ("findpc") on invalid
> address that starts on the last 32-bit word of a page. "page" in this case is
> referring to a page in SA's PageCache, which are always 4k in size. Note
> findpc is frequently done using an invalid address. In this test case it is
> being called on each x64 register, which could contain anything. findpc
> relies on getting some sort of AddressException when this happens, and will
> then say the address points to something that is unknown. However, in the
> case of the address pointing to the last 32-bot word of a page, it results in
> an ArrayIndexOutOfBoundsException when the page cache tries to read past the
> end of the page. This is happening in Page.getLong(), which you can see in
> the stack trace.
>
> The main culprit here is some weakening of the alignment checks being done.
> The alignment check should have resulted in an UnalignedAddressException long
> before we ever got to Page.getLong(). However, we have the following
> override, which is allowing the unaligned address to pass the alignment check.
>
>
> public void checkAlignment(long address, long alignment) {
> // Need to override default checkAlignment because we need to
> // relax alignment constraints on Bsd/x86
> if ( (address % alignment != 0)
> &&(alignment != 8 || address % 4 != 0)) {
> throw new UnalignedAddressException(
> "Trying to read at address: "
> + addressValueToString(address)
> + " with alignment: " + alignment,
> address);
> }
> }
> };
>
>
> This allows a pointer to a 64-bit value to only be 32-bit aligned. But
> there's more to it than that. I modified ClhsdbFindPC.java to fetch a tid (a
> thread address), and did a findpc on the address OR'd with 0xffc. `findpc`
> uses PointerFinder. This forced a read of a 64-bit value that starts in the
> last 32-bits of a page. It passed on linux-x64 but failed on windowx-x64 in
> the same manner described in the description of this CR. The difference
> between the two implementations is that windows relies on the default
> implementation of DebuggerBase.readCInteger() whereas linux has an override.
> DebuggerBase.readCInteger() does the following:
>
>
> public long readCInteger(long address, long numBytes, boolean isUnsigned) {
> utils.checkAlignment(address, numBytes);
> if (useFastAccessors) {
> if (isUnsigned) {
> switch((int) numBytes) {
> case 1: return cache.getByte(address) & 0xFF;
> case 2: return cache.getShort(address, bigEndian) & 0xFFFF;
> case 4: return cache.getInt(address, bigEndian) & 0xFFFFFFFFL;
> case 8: return cache.getLong(address, bigEndian);
> ...
>
>
> There is an alignment check here, but it is the "relaxed" override shown
> above, which allows 64-bit addresses on 32-bit boundaries. The override in
> LinuxDebuggerLocal is:
>
>
> /** Need to override this to relax alignment checks on x86. */
> public long readCInteger(long address, long numBytes, boolean isUnsigned)
> throws UnmappedAddressException, UnalignedAddressException {
> // Only slightly relaxed semantics -- this is a hack, but is
> // necessary on x86 where it seems the compiler is
> // putting some global 64-bit data on 32-bit boundaries
> if (numBytes == 8) {
> utils.checkAlignment(address, 4);
> } else {
> utils.checkAlignment(address, numBytes);
> }
> byte[] data = readBytes(address, numBytes);
> return utils.dataToCInteger(data, isUnsigned);
>
>
> Although there is a relaxed alignment check here also, the code that reads
> from the address does not assume all the bytes are on the same page. It uses
> readBytes() intead of cache.getLong(), so it won't run into the PageCache
> issue with reading a 64-bit value that starts in the last 32-bit word of a
> page.
>
> I think the introduction of these relaxed alignment checks has a muddled
> history, probably made more muddled by ports cloning code that maybe wasn't
> necessary. Probably also initial fixes (the relaxed alignment check) seemed
> to work at first, but later the PageCache issue was discovered, leading to
> readBytes() workaround in the LinuxDebuggerLocal.readCInteger() override, but
> was not also done on other ports (so we got this CR for windows).
>
> For 64-bit support it's clear this easing of the 64-bit alignment is not
> needed, and getting rid of it would result in the proper
> UnalignedAddressException being thrown. The question is whether it is still
> needed for 32-bit x86, and if so is it needed on all ports.
>
> I can't test linux-x86, so I can't tell if it still allows 64-bit values on
> 32-bit aligned addresses, so for now I'll assume it does. So the approach
> being taken is whenever an address of a 64-bit type points to the last 32-bit
> word of a page, use readBytes() to get the 64-bit value one byte at a time.
> It still uses the page cache in the end, but it doesn't try to get all 8
> bytes from the same page. Note for 64-bit systems, the conditoin will never
> arise because of the removal of the relaxed alignment check. Instead there
> will be an UnalignedAddressException at an early point when the alignment
> check is made.
>
> One windfall of this change is now we always read 64-bit values from the page
> cache in a way that is much more efficient than reading a byte at a time.
> This has resulted in about a 25% performance improvement in a test I have
> that does a heap dump.
This pull request has now been integrated.
Changeset: 767262e6
Author: Chris Plummer <[email protected]>
URL:
https://git.openjdk.org/jdk/commit/767262e67cec8e7a5e5eba2c6ebea7f60186d2cb
Stats: 227 lines in 11 files changed: 52 ins; 154 del; 21 mod
8292201: serviceability/sa/ClhsdbThreadContext.java fails with "'Thread
"Common-Cleaner"' missing from stdout/stderr"
Reviewed-by: amenkov, sspitsyn
-------------
PR: https://git.openjdk.org/jdk/pull/10090