Hi,

I have pinned down the next failed test. It still seems related to the
multi-indexing even with your bugfixed version. The corresponding gist is
here:

https://gist.github.com/tbetcke/0bf7e12a2f3ab8032339cc38b8441b6e

At the end of the kernel all entries in shapeIntegral should have the value
1.0. However, while shapeIntegral[0][0] is correct, shapeIntegral[1][0] is
not. If I move the second print statement for shapeIntegral[1][0] into the
for loop, the variables are updated correctly.

Some context: the actual kernel from which this example is derived computes
a finite element integral on a triangle. The test values are from the test
space and the trial values from the domain space. Via C macros I adapt the
dimensions of the arrays to the actual number of test and trial functions.
The crash happens for trial dimension 1 and test dimension 3.
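
To make the expected result concrete, here is a rough host-side reference in
plain Python. It is only a sketch and not the code from the gist: the names
NUMBER_OF_TEST_FUNCTIONS, NUMBER_OF_TRIAL_FUNCTIONS, reference_shape_integral
and find_mismatches are made up for illustration. I assume that
shapeIntegral is indexed as [test][trial], that the test and trial values
are constant 1.0, and that the quadrature weights sum to 1.0, so every entry
should come out as 1.0.

```python
# Hypothetical stand-ins for the C macros that size the kernel arrays.
NUMBER_OF_TEST_FUNCTIONS = 3
NUMBER_OF_TRIAL_FUNCTIONS = 1


def reference_shape_integral(test_values, trial_values, weights):
    """Accumulate sum_q w[q] * test[i][q] * trial[j][q] for every (i, j)."""
    result = [[0.0] * len(trial_values) for _ in test_values]
    for q, weight in enumerate(weights):
        for i, test in enumerate(test_values):
            for j, trial in enumerate(trial_values):
                result[i][j] += weight * test[q] * trial[q]
    return result


def find_mismatches(shape_integral, expected=1.0, tol=1e-12):
    """Return the (i, j) index pairs whose value differs from `expected`.

    The comparison is written as `not (... <= tol)` so that NaN entries,
    for which every comparison is False, are reported as mismatches too.
    """
    return [
        (i, j)
        for i, row in enumerate(shape_integral)
        for j, value in enumerate(row)
        if not abs(value - expected) <= tol
    ]


# Assumed data: four quadrature points with equal weights summing to 1.0,
# and basis functions that are constant 1.0 at every point.
weights = [0.25] * 4
test_values = [[1.0] * len(weights) for _ in range(NUMBER_OF_TEST_FUNCTIONS)]
trial_values = [[1.0] * len(weights) for _ in range(NUMBER_OF_TRIAL_FUNCTIONS)]

reference = reference_shape_integral(test_values, trial_values, weights)
print(find_mismatches(reference))  # an empty list means every entry is 1.0
```

Passing the array read back from the device to find_mismatches instead of
the reference would list the entries that are wrong, which makes it easy to
compare the Pocl output against the other platforms.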

Thanks again for your help. I am excited about getting Pocl to work with
our software.

Best wishes

Timo


On Wed, 13 Mar 2019 at 23:23, Timo Betcke <[email protected]> wrote:

> Hi Michal,
>
> thanks for the bugfix. The crashes have now disappeared and more tests are
> passing with your bugfix version. However, several unit tests still fail
> that work with AMD and Intel. Briefly looking at the results I see lots of
> nan entries in the pocl output. I will try to pin this down more and then
> report back to you.
>
> Best wishes
>
> Timo
>
> On Mon, 11 Mar 2019 at 10:50, Michal Babej (TAU) <[email protected]>
> wrote:
>
>> Hello,
>>
>>
>> I remember trying to fix this bug last year, but then I got sidetracked
>> by other things. (BTW, it would be preferable if you reported bugs as GitHub
>> issues in the future.)
>>
>>
>> Anyway, I've hopefully fixed it. Can you test your program with the master
>> branch from https://github.com/franz/pocl ?
>>
>>
>> Regards,
>>
>> -- mb
>> ------------------------------
>> *From:* Timo Betcke <[email protected]>
>> *Sent:* Friday, March 8, 2019 3:48:34 AM
>> *To:* Portable Computing Language development discussion
>> *Subject:* Re: [pocl-devel] POCL Crash in vmovaps operation
>>
>> Dear Pekka,
>>
>> I have now cooked up a small example that crashes in vmovaps. The gist is
>> available here (uses PyOpenCL to run):
>>
>> https://gist.github.com/tbetcke/b4da01465b587e85cc88801aafdced0a
>>
>> The example is fairly nonsensical and was derived by reducing a crashing
>> kernel as far as possible while retaining the crash.
>> It runs fine under Intel CPU OpenCL on a Xeon and ROCm OpenCL on an AMD
>> GPU. My platform is Ubuntu 18.04 with LLVM 6. If necessary,
>> I can create an environment with an updated LLVM, but would like to avoid it
>> (unless the issue is LLVM 6 related). Pocl is the most recent git master.
>>
>> The code crashes at the following assembler instructions:
>>
>>    0x00007fffe02575e3 <+195>:   xor    r9d,r9d
>>    0x00007fffe02575e6 <+198>:   xor    r10d,r10d
>>    0x00007fffe02575e9 <+201>:   nop    DWORD PTR [rax+0x0]
>>    0x00007fffe02575f0 <+208>:   mov    QWORD PTR [rdx+r9*1],0x0
>> => 0x00007fffe02575f8 <+216>:   vmovaps XMMWORD PTR [rdi+r9*1-0x10],xmm0
>>    0x00007fffe02575ff <+223>:   mov    QWORD PTR [rdi+r9*1],0x0
>>    0x00007fffe0257607 <+231>:   vmovaps XMMWORD PTR [rdx+r9*1-0x10],xmm0
>>    0x00007fffe025760e <+238>:   vmovupd xmm1,XMMWORD PTR [rdi+r9*1-0x8]
>>    0x00007fffe0257615 <+245>:   vaddpd xmm1,xmm1,XMMWORD PTR [rdx+r9*1-0x8]
>>    0x00007fffe025761c <+252>:   vmovupd XMMWORD PTR [rdx+r9*1-0x8],xmm1
>>    0x00007fffe0257623 <+259>:   mov    r8,r11
>>    0x00007fffe0257626 <+262>:   sar    r8,0x20
>>    0x00007fffe025762a <+266>:   lea    rsi,[r8+r8*2]
>>
>> Removing any of the for loops or the localResult variable (or removing
>> its __local attribute) leads to the kernel working on Pocl.
>> It would be great to get to the source of this. Please let me know if you
>> need more information from me.
>>
>> Best wishes
>>
>> Timo
>>
>>
>> On Wed, 6 Mar 2019 at 21:21, Timo Betcke <[email protected]> wrote:
>>
>> Hi Pekka,
>>
>> thanks for your hints and the link. I had one buffer in the kernel call
>> that had a cast from a float type to a vector type. I have fixed this. But
>> the segfault remains. In the next few days I will try to cook up a simple
>> example that produces the segfault. Fortunately, the kernel itself is not
>> too complicated, so I should be able to reduce it.
>>
>> Best wishes
>>
>> Timo
>>
>> On Wed, 6 Mar 2019 at 10:20, Pekka Jääskeläinen (TAU) <
>> [email protected]> wrote:
>>
>> Yes, now that I look at it more closely,
>> your stack trace looks _very_ much like the common data alignment
>> issues people have. I think this might be worth a FAQ item somewhere.
>>
>>
>> https://stackoverflow.com/questions/5983389/how-to-align-stack-at-32-byte-boundary-in-gcc
>>
>> On 6.3.2019 8.45, Pekka Jääskeläinen (TAU) wrote:
>> > Hi Timo,
>> >
>> > Shooting in the dark here, but just yesterday I debugged a
>> > similar-looking issue which was caused by an illegal cast in the source
>> > code from float* to float4*. The code trusted that the alignment was still
>> > fine, which it wasn't after vectorization. A very target-specific
>> > programming error which many OpenCL targets can easily hide.
>> >
>> > If this is something else, we need a test case, the smaller the better, to
>> > help you here. Before opening an issue though, please test with the latest
>> > master and LLVM 8.
>> >
>> > Pekka
>> >
>> > ------------------------------------------------------------------------
>> > *From:* Timo Betcke <[email protected]>
>> > *Sent:* Tuesday, March 5, 2019 11:27:12 PM
>> > *To:* Portable Computing Language development discussion
>> > *Subject:* [pocl-devel] POCL Crash in vmovaps operation
>> > Dear Pocl community,
>> >
>> > I was just testing the newest Pocl Version (github master branch) with
>> > our software. During execution of one of our kernels Pocl crashed.
>> > Disassembling the crash shows the following operations during the crash:
>> >
>> > ------------------
>> >     0x00007fffb81efdd8 <+664>:   vmulpd xmm2,xmm2,xmm6
>> >     0x00007fffb81efddc <+668>:   vsubpd xmm2,xmm5,xmm2
>> >     0x00007fffb81efde0 <+672>:   vpermilpd xmm5,xmm4,0x1
>> >     0x00007fffb81efde6 <+678>:   vmulsd xmm3,xmm3,xmm5
>> >     0x00007fffb81efdea <+682>:   vmulsd xmm4,xmm15,xmm4
>> >     0x00007fffb81efdee <+686>:   vsubsd xmm3,xmm3,xmm4
>> >     0x00007fffb81efdf2 <+690>:   vpermilpd xmm1,xmm1,0x1
>> >     0x00007fffb81efdf8 <+696>:   vmulpd xmm0,xmm0,xmm1
>> >     0x00007fffb81efdfc <+700>:   vpermilpd xmm1,xmm0,0x1
>> >     0x00007fffb81efe02 <+706>:   vsubsd xmm0,xmm0,xmm1
>> >     0x00007fffb81efe06 <+710>:   lea    rsi,[rdx+rdx*2]
>> >     0x00007fffb81efe0a <+714>:   mov    rdx,QWORD PTR [rbx+0x38]
>> > => 0x00007fffb81efe0e <+718>:   vmovaps XMMWORD PTR [rdx+rsi*8],xmm12
>> >     0x00007fffb81efe13 <+723>:   mov    QWORD PTR [rbx+0x40],rsi
>> >     0x00007fffb81efe17 <+727>:   mov    QWORD PTR [rdx+rsi*8+0x10],0x0
>> >     0x00007fffb81efe20 <+736>:   vinsertf32x4 ymm1,ymm16,xmm0,0x1
>> > -----------------------------
>> > This seems to be similar to a bug that I discussed a year ago on the
>> > mailing list. See the thread here:
>> > https://www.mail-archive.com/[email protected]/msg01087.html.
>> >
>> > In summary, the issue was related to us using arrays of arrays within
>> > our kernels and pocl creating wrong code for it.
>> >
>> > At that time a gist was suggested for Pocl, which I tested, but it did
>> > not improve things. Afterwards I let it drop for a while as we were in
>> > early development and had plenty of other open issues. But our software is
>> > now close to release ready and it would be great to get it working with
>> > pocl.
>> >
>> > Any help would be greatly appreciated.
>> > Best wishes
>> >
>> > Timo
>> >
>> > --
>> > Timo Betcke
>> > Professor of Computational Mathematics
>> > University College London
>> > Department of Mathematics
>> > E-Mail: [email protected] <mailto:[email protected]>
>> > Tel.: +44 (0) 20-3108-4068
>> >
>> >
>> > _______________________________________________
>> > pocl-devel mailing list
>> > [email protected]
>> > https://lists.sourceforge.net/lists/listinfo/pocl-devel
>> >
>>
>> --
>> Pekka
>>
>>
>>
>


-- 
Timo Betcke
Professor of Computational Mathematics
University College London
Department of Mathematics
E-Mail: [email protected]
Tel.: +44 (0) 20-3108-4068
