On Thu, 1 Sep 2022 17:20:33 GMT, Chris Plummer <[email protected]> wrote:
>> While dumping all registers (and doing a findpc on each), the following
>> exception occurred for r8:
>>
>>
>> r8: 0x000000750e4fdffc
>> Error: java.lang.ArrayIndexOutOfBoundsException: Index 4099 out of bounds
>> for length 4096
>> java.lang.ArrayIndexOutOfBoundsException: Index 4099 out of bounds for
>> length 4096
>> at jdk.hotspot.agent/sun.jvm.hotspot.debugger.Page.getLong(Page.java:182)
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.getLong(PageCache.java:100)
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readCInteger(DebuggerBase.java:364)
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readAddressValue(DebuggerBase.java:462)
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readAddress(WindbgDebuggerLocal.java:312)
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgAddress.getAddressAt(WindbgAddress.java:71)
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.utilities.PointerFinder.find(PointerFinder.java:58)
>> at
>> jdk.hotspot.agent/sun.jvm.hotspot.runtime.JavaThread.printThreadContextOn(JavaThread.java:493)
>>
>>
>> This exception is the result of using PointerFinder ("findpc") on invalid
>> address that starts on the last 32-bit word of a page. "page" in this case
>> is referring to a page in SA's PageCache, which are always 4k in size. Note
>> findpc is frequently done using an invalid address. In this test case it is
>> being called on each x64 register, which could contain anything. findpc
>> relies on getting some sort of AddressException when this happens, and will
>> then say the address points to something that is unknown. However, in the
>> case of the address pointing to the last 32-bot word of a page, it results
>> in an ArrayIndexOutOfBoundsException when the page cache tries to read past
>> the end of the page. This is happening in Page.getLong(), which you can see
>> in the stack trace.
>>
>> The main culprit here is some weakening of the alignment checks being done.
>> The alignment check should have resulted in an UnalignedAddressException
>> long before we ever got to Page.getLong(). However, we have the following
>> override, which is allowing the unaligned address to pass the alignment
>> check.
>>
>>
>> public void checkAlignment(long address, long alignment) {
>> // Need to override default checkAlignment because we need to
>> // relax alignment constraints on Bsd/x86
>> if ( (address % alignment != 0)
>> &&(alignment != 8 || address % 4 != 0)) {
>> throw new UnalignedAddressException(
>> "Trying to read at address: "
>> + addressValueToString(address)
>> + " with alignment: " + alignment,
>> address);
>> }
>> }
>> };
>>
>>
>> This allows a pointer to a 64-bit value to only be 32-bit aligned. But
>> there's more to it than that. I modified ClhsdbFindPC.java to fetch a tid (a
>> thread address), and did a findpc on the address OR'd with 0xffc. `findpc`
>> uses PointerFinder. This forced a read of a 64-bit value that starts in the
>> last 32-bits of a page. It passed on linux-x64 but failed on windowx-x64 in
>> the same manner described in the description of this CR. The difference
>> between the two implementations is that windows relies on the default
>> implementation of DebuggerBase.readCInteger() whereas linux has an override.
>> DebuggerBase.readCInteger() does the following:
>>
>>
>> public long readCInteger(long address, long numBytes, boolean isUnsigned) {
>> utils.checkAlignment(address, numBytes);
>> if (useFastAccessors) {
>> if (isUnsigned) {
>> switch((int) numBytes) {
>> case 1: return cache.getByte(address) & 0xFF;
>> case 2: return cache.getShort(address, bigEndian) & 0xFFFF;
>> case 4: return cache.getInt(address, bigEndian) & 0xFFFFFFFFL;
>> case 8: return cache.getLong(address, bigEndian);
>> ...
>>
>>
>> There is an alignment check here, but it is the "relaxed" override shown
>> above, which allows 64-bit addresses on 32-bit boundaries. The override in
>> LinuxDebuggerLocal is:
>>
>>
>> /** Need to override this to relax alignment checks on x86. */
>> public long readCInteger(long address, long numBytes, boolean isUnsigned)
>> throws UnmappedAddressException, UnalignedAddressException {
>> // Only slightly relaxed semantics -- this is a hack, but is
>> // necessary on x86 where it seems the compiler is
>> // putting some global 64-bit data on 32-bit boundaries
>> if (numBytes == 8) {
>> utils.checkAlignment(address, 4);
>> } else {
>> utils.checkAlignment(address, numBytes);
>> }
>> byte[] data = readBytes(address, numBytes);
>> return utils.dataToCInteger(data, isUnsigned);
>>
>>
>> Although there is a relaxed alignment check here also, the code that reads
>> from the address does not assume all the bytes are on the same page. It uses
>> readBytes() intead of cache.getLong(), so it won't run into the PageCache
>> issue with reading a 64-bit value that starts in the last 32-bit word of a
>> page.
>>
>> I think the introduction of these relaxed alignment checks has a muddled
>> history, probably made more muddled by ports cloning code that maybe wasn't
>> necessary. Probably also initial fixes (the relaxed alignment check) seemed
>> to work at first, but later the PageCache issue was discovered, leading to
>> readBytes() workaround in the LinuxDebuggerLocal.readCInteger() override,
>> but was not also done on other ports (so we got this CR for windows).
>>
>> For 64-bit support it's clear this easing of the 64-bit alignment is not
>> needed, and getting rid of it would result in the proper
>> UnalignedAddressException being thrown. The question is whether it is still
>> needed for 32-bit x86, and if so is it needed on all ports.
>>
>> I can't test linux-x86, so I can't tell if it still allows 64-bit values on
>> 32-bit aligned addresses, so for now I'll assume it does. So the approach
>> being taken is whenever an address of a 64-bit type points to the last
>> 32-bit word of a page, use readBytes() to get the 64-bit value one byte at a
>> time. It still uses the page cache in the end, but it doesn't try to get all
>> 8 bytes from the same page. Note for 64-bit systems, the conditoin will
>> never arise because of the removal of the relaxed alignment check. Instead
>> there will be an UnalignedAddressException at an early point when the
>> alignment check is made.
>>
>> One windfall of this change is now we always read 64-bit values from the
>> page cache in a way that is much more efficient than reading a byte at a
>> time. This has resulted in about a 25% performance improvement in a test I
>> have that does a heap dump.
>
> Chris Plummer has updated the pull request with a new target base due to a
> merge or a rebase. The pull request now contains four commits:
>
> - Merge
> - Fix jcheck error
> - Undo some temp test changes.
> - Fix 64-bit alignment requirements
src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/DebuggerBase.java
line 242:
> 240: private boolean canUsePageCacheFor64bitRead(long address) {
> 241: long pageMask = ~(pageSize - 1);
> 242: if ((address & pageMask) != ((address + 4) & pageMask)) {
This looks a bit over-complicated.
Maybe something like
` long pageMask = pageSize - 1;
if ((address & pageMask) > (pageSize - 8)) {
return false;
}`
-------------
PR: https://git.openjdk.org/jdk/pull/10090