Hi Timo,

Can you please open an issue for this? It's easier to track on GitHub.
Thanks,
Pekka

On 14.3.2019 1.49, Timo Betcke wrote:
> Hi,
>
> I have pinned down the next failed test. It still seems related to the
> multi-indexing even with your bugfixed version. The corresponding gist
> is here:
>
> https://gist.github.com/tbetcke/0bf7e12a2f3ab8032339cc38b8441b6e
>
> At the end of the kernel all entries in shapeIntegral should have the
> value 1.0. However, while shapeIntegral[0][0] is correct,
> shapeIntegral[1][0] is not.
> If I move the second print statement for shapeIntegral[1][0] into the
> for loop the variables are correctly updated.
>
> Just something for context. The actual kernel from which this example is
> derived, is doing a finite element integral on a triangle. The test
> values are from the test space and the trial values from the domain
> space. Via C Macros I am adapting the dimensions of the arrays to the
> actual number of test and trial functions. The crash happens for trial
> dimension 1 and test dimension 3.
>
> Thanks again for your help. I am excited about getting Pocl to work with
> our software.
>
> Best wishes
>
> Timo
>
>
> On Wed, 13 Mar 2019 at 23:23, Timo Betcke <[email protected]> wrote:
>
> Hi Michal,
>
> thanks for the bugfix. The crashes have now disappeared and more
> tests are passing with your bugfix version. However, several unit
> tests still fail that work with AMD and Intel. Briefly looking at
> the results I see lots of nan entries in the pocl output. I will try
> to pin this down more and then report back to you.
>
> Best wishes
>
> Timo
>
> On Mon, 11 Mar 2019 at 10:50, Michal Babej (TAU) <[email protected]> wrote:
>
> Hello,
>
> I remember trying to fix this bug last year, but then i got
> sidetracked by other things. (BTW it would be preferable if you
> reported bugs as github issues in the future)
>
> Anyway, i've hopefully fixed it. Can you test your program with
> master branch from https://github.com/franz/pocl
>
> Regards,
>
> -- mb
>
> ------------------------------------------------------------------------
> *From:* Timo Betcke <[email protected]>
> *Sent:* Friday, March 8, 2019 3:48:34 AM
> *To:* Portable Computing Language development discussion
> *Subject:* Re: [pocl-devel] POCL Crash in vmovaps operation
>
> Dear Pekka,
>
> I have now cooked up a small example that crashes in vmovaps.
> The gist is available here (uses PyOpenCL to run):
>
> https://gist.github.com/tbetcke/b4da01465b587e85cc88801aafdced0a
>
> The example is fairly nonsensical and was derived by reducing a
> crashing kernel as far as possible while retaining the crash.
> It runs fine under Intel CPU OpenCL on a Xeon and Rocm OpenCL on
> an AMD GPU. My platform is Ubuntu 18.04 with llvm 6. If necessary
> I can create an environment with updated llvm, but would like to
> avoid it (unless it is llvm 6 related). Pocl is the most recent
> git master.
>
> The code crashes at the following assembler instructions:
>
>    0x00007fffe02575e3 <+195>: xor r9d,r9d
>    0x00007fffe02575e6 <+198>: xor r10d,r10d
>    0x00007fffe02575e9 <+201>: nop DWORD PTR [rax+0x0]
>    0x00007fffe02575f0 <+208>: mov QWORD PTR [rdx+r9*1],0x0
> => 0x00007fffe02575f8 <+216>: vmovaps XMMWORD PTR [rdi+r9*1-0x10],xmm0
>    0x00007fffe02575ff <+223>: mov QWORD PTR [rdi+r9*1],0x0
>    0x00007fffe0257607 <+231>: vmovaps XMMWORD PTR [rdx+r9*1-0x10],xmm0
>    0x00007fffe025760e <+238>: vmovupd xmm1,XMMWORD PTR [rdi+r9*1-0x8]
>    0x00007fffe0257615 <+245>: vaddpd xmm1,xmm1,XMMWORD PTR [rdx+r9*1-0x8]
>    0x00007fffe025761c <+252>: vmovupd XMMWORD PTR [rdx+r9*1-0x8],xmm1
>    0x00007fffe0257623 <+259>: mov r8,r11
>    0x00007fffe0257626 <+262>: sar r8,0x20
>    0x00007fffe025762a <+266>: lea rsi,[r8+r8*2]
>
> Removing any of the for loops or the localResult variable (or
> removing its __local attribute) leads to the kernel working on Pocl.
> It would be great to get to the source of this. Please let me
> know if you need more information from me.
>
> Best wishes
>
> Timo
>
>
> On Wed, 6 Mar 2019 at 21:21, Timo Betcke <[email protected]> wrote:
>
> Hi Pekka,
>
> thanks for your hints and the link. I had one buffer in the
> kernel call that had a cast from a float type to a vector
> type. I have fixed this. But the segfault remains. In the
> next few days I will try to cook up a simple example that
> produces the segfault. Fortunately, the kernel itself is not
> too complicated, so should be able to reduce it.
>
> Best wishes
>
> Timo
>
> On Wed, 6 Mar 2019 at 10:20, Pekka Jääskeläinen (TAU) <[email protected]> wrote:
>
> Yes, now that I look at it more closely,
> your stack trace looks _very_ much to the common data alignment
> issues people have. I think this might be worth a FAQ
> item somewhere.
>
> https://stackoverflow.com/questions/5983389/how-to-align-stack-at-32-byte-boundary-in-gcc
>
> On 6.3.2019 8.45, Pekka Jääskeläinen (TAU) wrote:
> > Hi Timo,
> >
> > Shooting in the dark here, but since just yesterday I debugged a
> > similar looking issue
> > which was caused by an illegal cast in the source code from float* to
> > float4*. It trusted
> > the alignment is still fine, which it wasn't after vectorization. A
> > very target specific programming
> > error which many ocl targets can easily hide.
> >
> > If this is something else, we need a test case, smaller the better, to
> > help you here.
> > Before opening an issue though, please with the latest master and
> > LLVM 8.
> >
> > Pekka
> >
> > ------------------------------------------------------------------------
> > *From:* Timo Betcke <[email protected]>
> > *Sent:* Tuesday, March 5, 2019 11:27:12 PM
> > *To:* Portable Computing Language development discussion
> > *Subject:* [pocl-devel] POCL Crash in vmovaps operation
> >
> > Dear Pocl community,
> >
> > I was just testing the newest Pocl Version (github master branch) with
> > our software. During execution of one of our kernels Pocl crashed.
> > Disassembling the crash shows the following operations during the
> > crash:
> >
> > ------------------
> >    0x00007fffb81efdd8 <+664>: vmulpd xmm2,xmm2,xmm6
> >    0x00007fffb81efddc <+668>: vsubpd xmm2,xmm5,xmm2
> >    0x00007fffb81efde0 <+672>: vpermilpd xmm5,xmm4,0x1
> >    0x00007fffb81efde6 <+678>: vmulsd xmm3,xmm3,xmm5
> >    0x00007fffb81efdea <+682>: vmulsd xmm4,xmm15,xmm4
> >    0x00007fffb81efdee <+686>: vsubsd xmm3,xmm3,xmm4
> >    0x00007fffb81efdf2 <+690>: vpermilpd xmm1,xmm1,0x1
> >    0x00007fffb81efdf8 <+696>: vmulpd xmm0,xmm0,xmm1
> >    0x00007fffb81efdfc <+700>: vpermilpd xmm1,xmm0,0x1
> >    0x00007fffb81efe02 <+706>: vsubsd xmm0,xmm0,xmm1
> >    0x00007fffb81efe06 <+710>: lea rsi,[rdx+rdx*2]
> >    0x00007fffb81efe0a <+714>: mov rdx,QWORD PTR [rbx+0x38]
> > => 0x00007fffb81efe0e <+718>: vmovaps XMMWORD PTR [rdx+rsi*8],xmm12
> >    0x00007fffb81efe13 <+723>: mov QWORD PTR [rbx+0x40],rsi
> >    0x00007fffb81efe17 <+727>: mov QWORD PTR [rdx+rsi*8+0x10],0x0
> >    0x00007fffb81efe20 <+736>: vinsertf32x4 ymm1,ymm16,xmm0,0x1
> > -----------------------------
> >
> > This seems to be a similar bug that I discussed a year ago on the
> > mailing list. See the thread here:
> >
> > https://www.mail-archive.com/[email protected]/msg01087.html.
> >
> > In summary, the issue was related to us using arrays of arrays within
> > our kernels and pocl creating wrong code for it.
> >
> > During that time a gist was suggested for Pocl, which I tested but did
> > not improve things. Afterwards I let it drop for a while as we were in
> > early development and had loads of building sites. But our software is
> > now close to release ready and it would be great to get it working
> > with pocl.
> >
> > Any help would be greatly appreciated.
> >
> > Best wishes
> >
> > Timo
> >
> > --
> > Timo Betcke
> > Professor of Computational Mathematics
> > University College London
> > Department of Mathematics
> > E-Mail: [email protected]
> > Tel.: +44 (0) 20-3108-4068
> >
> > _______________________________________________
> > pocl-devel mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/pocl-devel
>
> --
> Pekka

--
Pekka
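
Note on the alignment point raised in the thread: the following is a minimal sketch, not taken from any of the messages, of why an aligned 16-byte store such as vmovaps can fault when rows of three doubles are indexed. It assumes that the array base is 16-byte aligned and that rows are packed with a 24-byte stride, which is what the `lea rsi,[rdx+rdx*2]` followed by `[rdx+rsi*8]` addressing in the first disassembly appears to suggest. The helper names `is_aligned` and `row_addresses` and the base address 0x1000 are hypothetical and chosen only for illustration.

```python
def is_aligned(address, alignment=16):
    """Return True if address is a multiple of alignment (in bytes)."""
    return address % alignment == 0


def row_addresses(base, row_stride_bytes, rows):
    """Byte address of the first element of each row in a packed 2D array."""
    return [base + i * row_stride_bytes for i in range(rows)]


# Assumption: a 16-byte aligned base address (hypothetical value).
base = 0x1000
# Assumption: each row holds three 8-byte doubles, packed without padding.
stride = 3 * 8

addresses = row_addresses(base, stride, 4)
flags = [is_aligned(a) for a in addresses]

for row, (addr, ok) in enumerate(zip(addresses, flags)):
    status = "ok for an aligned 16-byte store" if ok else "misaligned"
    print(f"row {row}: {hex(addr)} -> {status}")
```

Under these assumptions every odd row starts 8 bytes past a 16-byte boundary, so a compiler that emits an aligned vector store to the start of such a row would fault there, while an unaligned store (vmovups/vmovupd) would not. Whether this is what actually happened in the reported kernels is not established by the thread.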
