Hi Timo,

Could you please open an issue for this? It's easier to track
on GitHub.

Thanks,
Pekka

On 14.3.2019 1.49, Timo Betcke wrote:
> Hi,
> 
> I have pinned down the next failed test. It still seems related to the 
> multi-indexing even with your bugfixed version. The corresponding gist 
> is here:
> 
> https://gist.github.com/tbetcke/0bf7e12a2f3ab8032339cc38b8441b6e
> 
> At the end of the kernel all entries in shapeIntegral should have the 
> value 1.0. However, while shapeIntegral[0][0] is correct, 
> shapeIntegral[1][0] is not.
> If I move the second print statement for shapeIntegral[1][0] into the 
> for loop the variables are correctly updated.
> 
> Just something for context. The actual kernel from which this example is 
> derived, is doing a finite element integral on a triangle. The test 
> values are from the test space and the trial values from the domain 
> space. Via C Macros I am adapting the dimensions of the arrays to the 
> actual number of test and trial functions. The crash happens for trial 
> dimension 1 and test dimension 3.
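
For readers without access to the gist, here is a minimal host-side sketch in plain C of the pattern described above: a two-dimensional accumulator whose dimensions come from preprocessor macros, where every entry should end up as 1.0. The macro names (N_TEST, N_TRIAL, N_QUAD), the function names, and the constant basis values are assumptions made for illustration only; this is not the code from the gist or the real kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dimensions. In a real kernel these would be injected at
   build time (e.g. with -DN_TEST=3 -DN_TRIAL=1). */
#ifndef N_TEST
#define N_TEST 3
#endif
#ifndef N_TRIAL
#define N_TRIAL 1
#endif
#define N_QUAD 4

/* Accumulate sum_q w_q * test_i(q) * trial_j(q) into shape_integral[i][j].
   All basis values are taken to be 1 and the weights sum to 1, so every
   entry must equal 1.0 afterwards. */
void assemble_shape_integral(double shape_integral[N_TEST][N_TRIAL])
{
    const double weights[N_QUAD] = {0.25, 0.25, 0.25, 0.25};
    const double test_value = 1.0;   /* assumed constant test basis value */
    const double trial_value = 1.0;  /* assumed constant trial basis value */

    assert(shape_integral != NULL);

    for (size_t i = 0; i < N_TEST; ++i)
        for (size_t j = 0; j < N_TRIAL; ++j)
            shape_integral[i][j] = 0.0;

    for (size_t q = 0; q < N_QUAD; ++q)
        for (size_t i = 0; i < N_TEST; ++i)
            for (size_t j = 0; j < N_TRIAL; ++j)
                shape_integral[i][j] += weights[q] * test_value * trial_value;
}

/* Count the entries that differ from 1.0 by more than a small tolerance. */
size_t count_wrong_entries(double shape_integral[N_TEST][N_TRIAL])
{
    size_t wrong = 0;
    for (size_t i = 0; i < N_TEST; ++i)
        for (size_t j = 0; j < N_TRIAL; ++j) {
            double diff = shape_integral[i][j] - 1.0;
            if (diff < -1e-12 || diff > 1e-12)
                ++wrong;
        }
    return wrong;
}
```

Checking the whole array with something like count_wrong_entries after the loops, rather than printing single entries, gives a reference result that does not depend on where a print statement sits in the kernel.
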
> 
> Thanks again for your help. I am excited about getting Pocl to work with 
> our software.
> 
> Best wishes
> 
> Timo
> 
> 
> On Wed, 13 Mar 2019 at 23:23, Timo Betcke <[email protected] 
> <mailto:[email protected]>> wrote:
> 
>     Hi Michal,
> 
>     thanks for the bugfix. The crashes have now disappeared and more
>     tests are passing with your bugfix version. However, several unit
>     tests still fail that work with AMD and Intel. Briefly looking at
>     the results I see lots of nan entries in the pocl output. I will try
>     to pin this down more and then report back to you.
> 
>     Best wishes
> 
>     Timo
> 
>     On Mon, 11 Mar 2019 at 10:50, Michal Babej (TAU)
>     <[email protected] <mailto:[email protected]>> wrote:
> 
>         Hello,
> 
> 
>         I remember trying to fix this bug last year, but then I got
>         sidetracked by other things. (BTW, it would be preferable if you
>         reported bugs as GitHub issues in the future.)
> 
> 
>         Anyway, I've hopefully fixed it. Can you test your program with
>         the master branch from https://github.com/franz/pocl
> 
> 
>         Regards,
> 
>         -- mb
> 
>         ------------------------------------------------------------------------
>         *From:* Timo Betcke <[email protected]
>         <mailto:[email protected]>>
>         *Sent:* Friday, March 8, 2019 3:48:34 AM
>         *To:* Portable Computing Language development discussion
>         *Subject:* Re: [pocl-devel] POCL Crash in vmovaps operation
>         Dear Pekka,
> 
>         I have now cooked up a small example that crashes in vmovaps.
>         The gist is available here (uses PyOpenCL to run):
> 
>         https://gist.github.com/tbetcke/b4da01465b587e85cc88801aafdced0a
> 
>         The example is fairly nonsensical and was derived by reducing a
>         crashing kernel as far as possible while retaining the crash.
>         It runs fine under Intel CPU OpenCL on a Xeon and Rocm OpenCL on
>         an AMD GPU. My platform is Ubuntu 18.04 with llvm 6. If necessary
>         I can create an environment with updated llvm, but would like to
>         avoid it (unless it is llvm 6 related). Pocl is the most recent
>         git master.
> 
>         The code crashes at the following assembler instructions:
> 
>             0x00007fffe02575e3 <+195>:   xor    r9d,r9d
>             0x00007fffe02575e6 <+198>:   xor    r10d,r10d
>             0x00007fffe02575e9 <+201>:   nop    DWORD PTR [rax+0x0]
>             0x00007fffe02575f0 <+208>:   mov    QWORD PTR [rdx+r9*1],0x0
>         => 0x00007fffe02575f8 <+216>:   vmovaps XMMWORD PTR [rdi+r9*1-0x10],xmm0
>             0x00007fffe02575ff <+223>:   mov    QWORD PTR [rdi+r9*1],0x0
>             0x00007fffe0257607 <+231>:   vmovaps XMMWORD PTR [rdx+r9*1-0x10],xmm0
>             0x00007fffe025760e <+238>:   vmovupd xmm1,XMMWORD PTR [rdi+r9*1-0x8]
>             0x00007fffe0257615 <+245>:   vaddpd xmm1,xmm1,XMMWORD PTR [rdx+r9*1-0x8]
>             0x00007fffe025761c <+252>:   vmovupd XMMWORD PTR [rdx+r9*1-0x8],xmm1
>             0x00007fffe0257623 <+259>:   mov    r8,r11
>             0x00007fffe0257626 <+262>:   sar    r8,0x20
>             0x00007fffe025762a <+266>:   lea    rsi,[r8+r8*2]
> 
>         Removing any of the for loops or the localResult variable (or
>         removing its __local attribute) leads to the kernel working on Pocl.
>         It would be great to get to the source of this. Please let me
>         know if you need more information from me.
> 
>         Best wishes
> 
>         Timo
> 
> 
>         On Wed, 6 Mar 2019 at 21:21, Timo Betcke <[email protected]
>         <mailto:[email protected]>> wrote:
> 
>             Hi Pekka,
> 
>             thanks for your hints and the link. I had one buffer in the
>             kernel call that had a cast from a float type to a vector
>             type. I have fixed this. But the segfault remains. In the
>             next few days I will try to cook up a simple example that
>             produces the segfault. Fortunately, the kernel itself is not
>             too complicated, so I should be able to reduce it.
> 
>             Best wishes
> 
>             Timo
> 
>             On Wed, 6 Mar 2019 at 10:20, Pekka Jääskeläinen (TAU)
>             <[email protected]
>             <mailto:[email protected]>> wrote:
> 
>                 Yes, now that I look at it more closely,
>                 your stack trace looks _very_ much like the common data
>                 alignment issues people have. I think this might be worth
>                 a FAQ item somewhere.
> 
>                 https://stackoverflow.com/questions/5983389/how-to-align-stack-at-32-byte-boundary-in-gcc
> 
>                 On 6.3.2019 8.45, Pekka Jääskeläinen (TAU) wrote:
>                  > Hi Timo,
>                  >
>                  > Shooting in the dark here, but just yesterday I debugged a
>                  > similar-looking issue, which was caused by an illegal cast
>                  > in the source code from float* to float4*. The code trusted
>                  > that the alignment was still fine, which it wasn't after
>                  > vectorization. A very target-specific programming error
>                  > which many OpenCL targets can easily hide.
>                  >
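
To make the alignment point above concrete, here is a minimal sketch in plain C. The struct vec4f is only a stand-in for OpenCL's float4, and is_aligned and load4_unaligned are hypothetical helper names, not pocl or OpenCL API. It shows how to check whether a float pointer satisfies the 16-byte alignment that a float4 access may assume, and how to load four floats without relying on that alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for OpenCL's float4: four packed floats, 16 bytes in total. */
typedef struct { float x, y, z, w; } vec4f;

/* Return 1 if p is aligned to `alignment` bytes, 0 otherwise.
   `alignment` must be a non-zero power of two. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Load four consecutive floats starting at p without assuming anything
   beyond float alignment. memcpy lets the compiler pick an unaligned
   load instead of an aligned one such as vmovaps. */
vec4f load4_unaligned(const float *p)
{
    vec4f v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

A pointer such as buf + 1 into a 16-byte-aligned float array is still valid as a float*, but is_aligned(buf + 1, 16) returns 0, so dereferencing it as a float4 is exactly the kind of cast that may only fail on some targets. Inside an OpenCL C kernel, the built-in vload4 plays the role of load4_unaligned here, since it only requires the alignment of the element type.
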
>                  > If this is something else, we need a test case, the smaller
>                  > the better, to help you here. Before opening an issue
>                  > though, please test with the latest master and LLVM 8.
>                  >
>                  > Pekka
>                  >
>                  >
>                  > ------------------------------------------------------------------------
>                  > *From:* Timo Betcke <[email protected]
>                 <mailto:[email protected]>>
>                  > *Sent:* Tuesday, March 5, 2019 11:27:12 PM
>                  > *To:* Portable Computing Language development discussion
>                  > *Subject:* [pocl-devel] POCL Crash in vmovaps operation
>                  > Dear Pocl community,
>                  >
>                  > I was just testing the newest Pocl version (GitHub master
>                  > branch) with our software. During execution of one of our
>                  > kernels Pocl crashed. Disassembling the crash shows the
>                  > following operations during the crash:
>                  >
>                  > ------------------
>                  >     0x00007fffb81efdd8 <+664>:   vmulpd xmm2,xmm2,xmm6
>                  >     0x00007fffb81efddc <+668>:   vsubpd xmm2,xmm5,xmm2
>                  >     0x00007fffb81efde0 <+672>:   vpermilpd xmm5,xmm4,0x1
>                  >     0x00007fffb81efde6 <+678>:   vmulsd xmm3,xmm3,xmm5
>                  >     0x00007fffb81efdea <+682>:   vmulsd xmm4,xmm15,xmm4
>                  >     0x00007fffb81efdee <+686>:   vsubsd xmm3,xmm3,xmm4
>                  >     0x00007fffb81efdf2 <+690>:   vpermilpd xmm1,xmm1,0x1
>                  >     0x00007fffb81efdf8 <+696>:   vmulpd xmm0,xmm0,xmm1
>                  >     0x00007fffb81efdfc <+700>:   vpermilpd xmm1,xmm0,0x1
>                  >     0x00007fffb81efe02 <+706>:   vsubsd xmm0,xmm0,xmm1
>                  >     0x00007fffb81efe06 <+710>:   lea    rsi,[rdx+rdx*2]
>                  >     0x00007fffb81efe0a <+714>:   mov    rdx,QWORD PTR [rbx+0x38]
>                  > => 0x00007fffb81efe0e <+718>:   vmovaps XMMWORD PTR [rdx+rsi*8],xmm12
>                  >     0x00007fffb81efe13 <+723>:   mov    QWORD PTR [rbx+0x40],rsi
>                  >     0x00007fffb81efe17 <+727>:   mov    QWORD PTR [rdx+rsi*8+0x10],0x0
>                  >     0x00007fffb81efe20 <+736>:   vinsertf32x4 ymm1,ymm16,xmm0,0x1
>                  > -----------------------------
>                  > This seems to be a bug similar to the one I discussed a
>                  > year ago on the mailing list. See the thread here:
>                  >
>                  > https://www.mail-archive.com/[email protected]/msg01087.html.
> 
>                  > In summary, the issue was related to us using arrays of
>                  > arrays within our kernels and pocl generating wrong code
>                  > for them.
>                  >
>                  > At that time a gist was suggested for Pocl, which I tested,
>                  > but it did not improve things. Afterwards I let it drop for
>                  > a while, as we were in early development and had plenty of
>                  > other things to work on. But our software is now close to
>                  > release-ready, and it would be great to get it working
>                  > with pocl.
>                  >
>                  > Any help would be greatly appreciated.
>                  > Best wishes
>                  >
>                  > Timo
>                  >
>                  > --
>                  > Timo Betcke
>                  > Professor of Computational Mathematics
>                  > University College London
>                  > Department of Mathematics
>                  > E-Mail: [email protected]
>                 <mailto:[email protected]> <mailto:[email protected]
>                 <mailto:[email protected]>>
>                  > Tel.: +44 (0) 20-3108-4068
>                  >
>                  >
>                  > _______________________________________________
>                  > pocl-devel mailing list
>                  > [email protected]
>                 <mailto:[email protected]>
>                  > https://lists.sourceforge.net/lists/listinfo/pocl-devel
>                  >
> 
>                 -- 
>                 Pekka
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 

-- 
Pekka

