Hi Vladimir,

I submitted a pull request an hour or so ago as this was a P1 bug; feel free to use that or ignore it.
https://git.openjdk.java.net/jdk/pull/772

Best Regards,
Sandhya

-----Original Message-----
From: Vladimir Kozlov <[email protected]>
Sent: Tuesday, October 20, 2020 2:40 PM
To: Viswanathan, Sandhya <[email protected]>; Tatton, Jason <[email protected]>; David Holmes <[email protected]>; [email protected]; [email protected]; Hohensee, Paul <[email protected]>
Subject: Re: Howto replicate failure of 8254790?

Thank you, Sandhya

Very nice analysis.

I just finished running the dsig/GenerationTests.java test multiple times (to be sure) on our systems and confirmed your proposed fix:

     bsfl(ch, tmp);
+    if (UseNewCode) {
+      addptr(result, ch);
+    } else {
     addl(result, ch);
+    }

It always fails with addl() and always passes with addptr().

I will assign the bug to myself and file a PR now. I will also fix the Unicode string index intrinsic code.

Thanks,
Vladimir

On 10/20/20 10:27 AM, Viswanathan, Sandhya wrote:
> Hi Vladimir,
>
> I analyzed the instruction dump yesterday to find out where the issue is. I
> have attached it to the bug report as 8254790.asm:
> https://bugs.openjdk.java.net/browse/JDK-8254790
>
> The crash is reported at:
> 100: 450FB64C1810 movzx r9d, byte ptr [r8+rbx*1+0x10]
>
> This is just after the intrinsic and uses the rbx register (containing the
> index of the char from the intrinsic).
>
> RBX has the large value 0xfffffff900000008 instead of 8. The length of the
> string is 34 bytes. The match is found in the first 32 bytes at index 8.
> After doing the 32 bytes with the following instructions:
> 6b: C5FE6F13 vmovdqu ymm2, ymmword ptr [rbx]
> 6f: C5ED74D1 vpcmpeqb ymm2, ymm2, ymm1
> 73: C4E27D17C2 vptest ymm0, ymm2
> 78: 0F8369000000 jnb 0xe7
> The control goes to 0xe7.
>
> The code snippet at 0xe7 is:
> e7: C5FDD7CA vpmovmskb ecx, ymm2
> eb: 0FBCC1 bsf eax, ecx
> ee: 03D8 add ebx, eax
> f0: 482BDF sub rbx, rdi
> f3: 0F1F4000 nop dword ptr [rax], eax
> f7: 413BDB cmp ebx, r11d
> fa: 0F83DF290000 jnb 0x2adf
> 100: 450FB64C1810 movzx r9d, byte ptr [r8+rbx*1+0x10]
>
> After vpmovmskb, the bit mask in ecx is 0x1100, showing the match at 8th and
> 9th byte.
> The register rbx at this point must be holding the address of the base of the
> array: 0x00000007e41d2700, same as rdi.
> Bsf puts 8 in eax.
> Then 8 is added to ebx instead of rbx using a 32-bit add, which zeroes the
> upper 32 bits, resulting in rbx = 0xe41d2708.
> If the add had been a 64-bit add, everything would have worked.
> Then sub rbx, rdi results in 0xe41d2708 - 0x00000007e41d2700 =
> 0xFFFFFFF900000008 being loaded in rbx.
> This is the value we see at the crash.
>
> Best Regards,
> Sandhya
>
>
> -----Original Message-----
> From: Vladimir Kozlov <[email protected]>
> Sent: Tuesday, October 20, 2020 10:01 AM
> To: Viswanathan, Sandhya <[email protected]>; Tatton,
> Jason <[email protected]>; David Holmes <[email protected]>;
> [email protected]; [email protected];
> Hohensee, Paul <[email protected]>
> Subject: Re: Howto replicate failure of 8254790?
>
> Yes, I saw it too but I was not sure because we never hit the issue with
> the Unicode string index intrinsic.
> Another thing is that we see the failure only on MacOS.
>
> I also want someone to decode the asm dump I provided in the bug to see the
> actual instructions where it happened.
>
> Vladimir K
>
> On 10/19/20 5:38 PM, Viswanathan, Sandhya wrote:
>> Hi Jason,
>>
>> I think I found the problem looking at the error log from Vladimir Kozlov.
>> In the stringL_indexof_char() function, the following snippet is the cause
>> of the problem:
>>
>>   bind(FOUND_CHAR);
>>   if (UseAVX >= 2) {
>>     vpmovmskb(tmp, vec3);
>>   } else {
>>     pmovmskb(tmp, vec3);
>>   }
>>   bsfl(ch, tmp);
>>   addl(result, ch);   <==== The problem is here
>>
>>   bind(FOUND_SEQ_CHAR);
>>   subptr(result, str1);
>>
>> The line addl(result, ch) should have been addptr(result, ch).
>>
>> The same problem exists in the Unicode string index of char intrinsic as
>> well and needs to be fixed.
>>
>> Hope this helps.
>>
>> Best Regards,
>> Sandhya
>>
>> -----Original Message-----
>> From: hotspot-compiler-dev <[email protected]> On Behalf Of Vladimir Kozlov
>> Sent: Thursday, October 15, 2020 3:59 PM
>> To: Tatton, Jason <[email protected]>; David Holmes
>> <[email protected]>; [email protected];
>> [email protected]
>> Subject: Re: Howto replicate failure of 8254790?
>>
>> Hi Jason,
>>
>> I added a dump of the surrounding instructions from the hs_err file we have
>> so you can reconstruct the x86 assembler from it.
>>
>> If you look at si_addr: 0x00000000e41d2718, which caused the memory map
>> failure, it looks like R8 =0x00000007e41d2700 (which is an oop: [B) with the
>> upper 32 bits zeroed. It seems the upper 32 bits of the address were cut.
>>
>> But I don't see how it can happen in the stringL_indexof_char() sub. You
>> correctly used the movptr() and addptr() instructions.
>>
>> Vladimir K
>>
>> On 10/15/20 2:10 PM, Tatton, Jason wrote:
>>> Thanks Vladimir and David, I have access to a new macbook with an Intel
>>> i7-9750H (supports AVX2) so I will try on that.
>>>
>>> -----Original Message-----
>>> From: Vladimir Kozlov <[email protected]>
>>> Sent: 15 October 2020 20:25
>>> To: David Holmes <[email protected]>; Tatton, Jason
>>> <[email protected]>; [email protected];
>>> [email protected]
>>> Subject: RE: [EXTERNAL] Howto replicate failure of 8254790?
>>>
>>> CAUTION: This email originated from outside of the organization.
>>> Do not click links or open attachments unless you can confirm the sender
>>> and know the content is safe.
>>>
>>> Note, we have old Mac machines in our testing env:
>>> cx8, cmov, fxsr, ht, mmx, 3dnowpref, sse, sse2, sse3, ssse3, sse4.1,
>>> sse4.2, popcnt, lzcnt, tsc, tscinvbit, avx, avx2, aes, erms, clmul,
>>> bmi1, bmi2, rtm, adx, fma, vzeroupper, clflush, clflushopt
>>>
>>> Use -XX:UseAVX=2
>>>
>>> But I was not able to reproduce the failure on my Skylake Linux machine
>>> even with -XX:UseAVX=2. Maybe there are other factors on MacOS.
>>>
>>> Regards,
>>> Vladimir K
>>>
>>> On 10/14/20 5:48 PM, David Holmes wrote:
>>>> Hi Jason,
>>>>
>>>> On 15/10/2020 10:42 am, Tatton, Jason wrote:
>>>>> Hi all,
>>>>>
>>>>> I am trying to replicate the failure of the tier2 test mentioned
>>>>> in 8254790<https://bugs.openjdk.java.net/browse/JDK-8254790> but I
>>>>> am only seeing it pass on an x86 Linux machine. Are there any specific
>>>>> architectural constraints under which this test should be run in order
>>>>> to make it fail?
>>>>
>>>> It failed on a Mac, not Linux.
>>>>
>>>> Cheers,
>>>> David
>>>>
>>>>>
>>>>> I am running the test via: make test
>>>>> TEST="test/jdk/javax/xml/crypto/dsig/GenerationTests.java"
>>>>>
>>>>> Note that I am running the test against master without the commit:
>>>>> "8254792: Disable intrinsic StringLatin1.indexOf until 8254790 is fixed"
>>>>> which disables the intrinsic that is causing the test to fail.
>>>>>
>>>>> Thanks
>>>>> --
>>>>> Jason
>>>>>
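
For anyone reading the thread later, the arithmetic in the analysis above can be checked with a small stand-alone sketch. This is not HotSpot code: the class and the add32/add64 helpers are hypothetical names made up for illustration, and they only model the effect of "add ebx, eax" (a 32-bit add, which clears the upper 32 bits of the 64-bit register) versus a full 64-bit add. The base address and the bsf result of 8 are the values quoted from the dump.

```java
class TruncatedAddDemo {
    // Models a 32-bit add such as `add ebx, eax` on x86-64:
    // only the low 32 bits of the sum survive, the upper 32 bits become 0.
    static long add32(long dst, long src) {
        return (dst + src) & 0xFFFFFFFFL;
    }

    // Models a full 64-bit add such as `add rbx, rax`.
    static long add64(long dst, long src) {
        return dst + src;
    }

    public static void main(String[] args) {
        long base = 0x00000007e41d2700L; // array base (rdi, and rbx before the add)
        long offset = 8;                 // match offset produced by bsf

        // Subtract the base afterwards, as `sub rbx, rdi` does.
        long bad = add32(base, offset) - base;
        long good = add64(base, offset) - base;

        System.out.println("32-bit add then sub: 0x" + Long.toHexString(bad));
        System.out.println("64-bit add then sub: 0x" + Long.toHexString(good));
    }
}
```

With the values from the dump, the 32-bit path yields 0xfffffff900000008 (the value seen in rbx at the crash) and the 64-bit path yields 8. The problem only shows up when the array base lies above 4 GB, since otherwise the upper 32 bits are already zero; that may be part of why it was hard to reproduce, though the thread does not confirm this.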
