[Bug 99488] [r600g]OpenCL driver causes ImageMagick to hang on JPEG input in Gaussian Blur kernel

2017-02-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99488

--- Comment #12 from Jan Vesely  ---
The first patch is already in the mainline (https://reviews.llvm.org/D29792)
The second one is under review (https://reviews.llvm.org/D30230)

Feel free to leave this bug open until jpeg conversion works.
The kernel calls the sinpi/cospi functions, which are rather register-hungry at
the moment (they fail even on their own).

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


2017-02-25 Thread bugzilla-daemon

--- Comment #11 from nixscrip...@gmail.com ---
Also, a link to the patch in the build system would be nice, so I know when
that happens.



2017-02-25 Thread bugzilla-daemon

--- Comment #10 from nixscrip...@gmail.com ---
I have finally gotten a build done, and the patch does indeed fix the hang. I
now get the same "Register number out of range!" assertion that you reported.

Once that patch lands, I will verify that SVN build number, and close this bug.

(And then, I'll be opening another bug for the assertion, once I've gathered
more information about it. There is an even simpler case in the ImageMagick
self-test suite that I'm trying to figure out how to run in a debuggable
manner.)



2017-02-23 Thread bugzilla-daemon

--- Comment #9 from nixscrip...@gmail.com ---
Thanks for your continued work on this.

I've been fighting with Arch Linux packaging for a week of quiet frustration.
It's also really slow to try and fix it, because the debug build seems to be
10x the size of the release version (7 GB total instead of 600 MB).

I will spend some more time on it in the coming days, and let you know how your
current patch goes when I am able.



2017-02-21 Thread bugzilla-daemon

--- Comment #8 from Jan Vesely  ---
Created attachment 129815
  --> https://bugs.freedesktop.org/attachment.cgi?id=129815&action=edit
Fix-ALU-clause-markers-use-detection

This patch fixes the GROMACS build for me. I tested blur and it now results in a
"Register number out of range!" failure.



2017-02-05 Thread bugzilla-daemon

--- Comment #7 from Jan Vesely  ---
Thanks for the info. I think it's the same bug that hangs the GROMACS kernel (my
patch was originally written to debug GROMACS).
You can change the assert to an "if" and use "I->dump()" to print the triggering
instruction.
If it's "MOVA_INT_eg", then it's the same bug that hangs GROMACS.



2017-02-05 Thread bugzilla-daemon

--- Comment #6 from nixscrip...@gmail.com ---
Replicating the issue again, here is the backtrace you requested:

(gdb) bt
#0  0x03b11dea95d4 in
llvm::MachineInstr::findRegisterUseOperandIdx(unsigned int, bool,
llvm::TargetRegisterInfo const*) const ()
   from /usr/lib/libLLVM-5.0svn.so
#1  0x03b11ed8c96e in (anonymous
namespace)::R600EmitClauseMarkers::MakeALUClause(llvm::MachineBasicBlock&,
llvm::MachineInstrBundleIterator) () from
/usr/lib/libLLVM-5.0svn.so
#2  0x03b11ed8d7fe in (anonymous
namespace)::R600EmitClauseMarkers::runOnMachineFunction(llvm::MachineFunction&)
() from /usr/lib/libLLVM-5.0svn.so
#3  0x03b11dea3bf1 in
llvm::MachineFunctionPass::runOnFunction(llvm::Function&) () from
/usr/lib/libLLVM-5.0svn.so
#4  0x03b11dd19342 in llvm::FPPassManager::runOnFunction(llvm::Function&)
() from /usr/lib/libLLVM-5.0svn.so
#5  0x03b11dd193e3 in llvm::FPPassManager::runOnModule(llvm::Module&) ()
   from /usr/lib/libLLVM-5.0svn.so
#6  0x03b11dd19d94 in llvm::legacy::PassManagerImpl::run(llvm::Module&) ()
   from /usr/lib/libLLVM-5.0svn.so
#7  0x03b120ede12c in ?? () from /usr/lib/libMesaOpenCL.so.1
#8  0x03b120ede790 in ?? () from /usr/lib/libMesaOpenCL.so.1
#9  0x03b120eda9ad in ?? () from /usr/lib/libMesaOpenCL.so.1
#10 0x03b120ecb9f9 in ?? () from /usr/lib/libMesaOpenCL.so.1
#11 0x03b120ea98dc in ?? () from /usr/lib/libMesaOpenCL.so.1
#12 0x03b1220f450b in clBuildProgram () from /usr/lib/libOpenCL.so
#13 0x03b127156afe in CompileOpenCLKernels ()
   from /usr/lib/libMagickCore-6.Q16HDRI.so.4
#14 0x03b12715728d in InitOpenCLEnvInternal ()
   from /usr/lib/libMagickCore-6.Q16HDRI.so.4
#15 0x03b1271573f1 in AcceleratePerfEvaluator ()
   from /usr/lib/libMagickCore-6.Q16HDRI.so.4
#16 0x03b127157e5a in autoSelectDevice ()
   from /usr/lib/libMagickCore-6.Q16HDRI.so.4
#17 0x03b127158944 in InitOpenCLEnv ()
   from /usr/lib/libMagickCore-6.Q16HDRI.so.4
#18 0x03b127051400 in checkOpenCLEnvironment ()
   from /usr/lib/libMagickCore-6.Q16HDRI.so.4
#19 0x03b127054a2d in AccelerateBlurImage ()
   from /usr/lib/libMagickCore-6.Q16HDRI.so.4
#20 0x03b1270f0c13 in BlurImageChannel ()
   from /usr/lib/libMagickCore-6.Q16HDRI.so.4
#21 0x03b126d9efd0 in MogrifyImage ()
   from /usr/lib/libMagickWand-6.Q16HDRI.so.4
#22 0x03b126da6200 in MogrifyImages ()
   from /usr/lib/libMagickWand-6.Q16HDRI.so.4
#23 0x03b126d2bb86 in ConvertImageCommand ()
   from /usr/lib/libMagickWand-6.Q16HDRI.so.4
#24 0x03b126d9b3ee in MagickCommandGenesis ()
   from /usr/lib/libMagickWand-6.Q16HDRI.so.4
#25 0x004007c7 in main ()

As you can see, a different function is at the top of the stack this time.
That's because, as you suspected, it is not the culprit. It was just chance that
I found findRegisterDefOperandIdx at the top of the stack the several times I
tested this before.

Per your suggestion, I ran "finish" repeatedly until I found the hanging
function:

(gdb) finish
Run till exit from #0  0x03b11dea95d4 in
llvm::MachineInstr::findRegisterUseOperandIdx(unsigned int, bool,
llvm::TargetRegisterInfo const*) const ()
   from /usr/lib/libLLVM-5.0svn.so
0x03b11ed8c96e in (anonymous
namespace)::R600EmitClauseMarkers::MakeALUClause(llvm::MachineBasicBlock&,
llvm::MachineInstrBundleIterator) () from
/usr/lib/libLLVM-5.0svn.so
(gdb) finish
Run till exit from #0  0x03b11ed8c96e in (anonymous
namespace)::R600EmitClauseMarkers::MakeALUClause(llvm::MachineBasicBlock&,
llvm::MachineInstrBundleIterator) () from
/usr/lib/libLLVM-5.0svn.so
0x03b11ed8d7fe in (anonymous
namespace)::R600EmitClauseMarkers::runOnMachineFunction(llvm::MachineFunction&)
() from /usr/lib/libLLVM-5.0svn.so
(gdb) finish
Run till exit from #0  0x03b11ed8d7fe in (anonymous
namespace)::R600EmitClauseMarkers::runOnMachineFunction(llvm::MachineFunction&)
()
   from /usr/lib/libLLVM-5.0svn.so
[ ... hangs... ]

So it seems your patch is on the right track.

In addition, I was making a release build, because that was the default in the
recipe file. I am building a debug build as I write this, including your
attached patch.



2017-02-04 Thread bugzilla-daemon

--- Comment #5 from Jan Vesely  ---
Created attachment 129340
  --> https://bugs.freedesktop.org/attachment.cgi?id=129340&action=edit
assert on infinite loop

This patch adds an assert for a possible infinite loop in emit clause markers.
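
The general idea of asserting on a possible infinite loop can be sketched as
follows. This is only an illustration in Python, not the contents of
attachment 129340 and not LLVM code: the function name, the `step` callback,
and the list-based scan are all hypothetical stand-ins for a pass that walks
instructions and is expected to advance on every iteration.

```python
def scan_with_progress_check(items, step):
    """Walk `items`, asking step(items, pos) for the next position.

    Hypothetical sketch: the assert fires as soon as one iteration fails
    to move forward, so a scan that would otherwise spin forever stops
    with a clear message instead.
    """
    pos = 0
    visited = []
    while pos < len(items):
        nxt = step(items, pos)
        # Guard against a scan that never advances.
        assert nxt > pos, f"no forward progress at position {pos}"
        visited.append(items[pos])
        pos = nxt
    return visited


if __name__ == "__main__":
    # A well-behaved step function advances by one item each time.
    print(scan_with_progress_check(["a", "b", "c"], lambda items, pos: pos + 1))
```

With assertions enabled, a step function that returns the same position trips
the assert on the first iteration rather than hanging.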



2017-02-03 Thread bugzilla-daemon

--- Comment #4 from Michel Dänzer  ---
(In reply to Jan Vesely from comment #3)
> > (gdb) info threads 
> >   Id   Target Id         Frame 
> > * 1    Thread 0x39ac9cdf7c0 (LWP 3806) "display" 0x039abefef921 in
> > llvm::MachineInstr::findRegisterDefOperandIdx(unsigned int, bool, bool,
> > llvm::TargetRegisterInfo const*) const () from /usr/lib/libLLVM-5.0svn.so
> 
> can you get backtrace of this thread?
> does it ever leave this function? you can check by adding breakpoint on that
> function and checking if it gets hit.
> this can be repeated going up the stack to find the function that won't exit.

FWIW, for this purpose I usually just use "finish" repeatedly until it doesn't
terminate.



2017-02-02 Thread bugzilla-daemon

--- Comment #3 from Jan Vesely  ---
(In reply to nixscripter from comment #1)
> I'm still trying some versions in order to help you guys pin this down (it's
> not always easy to tell what reinstall is having what effect, since Arch
> Linux has three packages involved). In the mean time, I did the basics on
> the process in its hung state.
> 
> It's currently running three threads, two blocked, one continuing to run:
> 
> (gdb) info threads 
>   Id   Target Id         Frame 
> * 1    Thread 0x39ac9cdf7c0 (LWP 3806) "display" 0x039abefef921 in
> llvm::MachineInstr::findRegisterDefOperandIdx(unsigned int, bool, bool,
> llvm::TargetRegisterInfo const*) const () from /usr/lib/libLLVM-5.0svn.so

Can you get a backtrace of this thread?
Does it ever leave this function? You can check by adding a breakpoint on that
function and checking whether it gets hit.
This can be repeated going up the stack to find the function that won't exit.

>   2    Thread 0x39abd04f700 (LWP 3809) "radeon_cs:0" 0x039ac6b0310f in
> pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0
>   3    Thread 0x39abadd4700 (LWP 3814) "display" futex_wait (val=8, 
> addr=0x25349d4)
> at /build/gcc-multilib/src/gcc/libgomp/config/linux/x86/futex.h:44
> (gdb)
> 
> 
> What is that call to findRegisterDefOperandIdx doing?

There's a loop. It can't be infinite, but if the number of operands is corrupted,
it can take a very long time to finish. Can you check "p e" in gdb?

> It's not entirely
> clear, but it's sucking up a lot of memory. Running strace confirms that: 
> 
> strace: Process 3806 attached with 3 threads
> strace: [ Process PID=3806 runs in x32 mode. ]
> [pid  3809] futex(0x2599e64, FUTEX_WAIT_PRIVATE, 1, NULL 
> [pid  3814] futex(0x25349d4, FUTEX_WAIT_PRIVATE, 8, NULL 
> [pid  3806] mmap(NULL, 8392704, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x640f4000
> strace: [ Process PID=3806 runs in 64 bit mode. ]
> [pid  3806] mmap(NULL, 8392704, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x39a638f3000
> [pid  3806] mmap(NULL, 8392704, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x39a630f2000
> [pid  3806] mmap(NULL, 8392704, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x39a628f1000
> [...]
> 
> And down the address space it goes, 0x1000 bytes (4k) a time or two per
> second.

The mmaps above show 8M (+4K, probably for bookkeeping) allocations. Are there
any others not shown? I haven't found anything in the mentioned function that
would need such a large amount of memory; the hang is probably higher in the
call stack.
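
The 8M + 4K figure can be checked directly from the quoted strace output. The
snippet below is a small sketch that uses only the length and the three 64-bit
addresses shown in that output; the helper names are made up for illustration.

```python
# Length and addresses copied from the strace lines quoted in this comment.
MMAP_LENGTH = 8392704
ADDRESSES = [0x39A638F3000, 0x39A630F2000, 0x39A628F1000]

MIB = 1024 * 1024


def describe_length(length):
    """Split a byte count into whole MiB and the leftover bytes."""
    return divmod(length, MIB)


def strides(addresses):
    """Distance between consecutive mappings (addresses are descending)."""
    return [a - b for a, b in zip(addresses, addresses[1:])]


if __name__ == "__main__":
    mib, rest = describe_length(MMAP_LENGTH)
    print(f"each mapping: {mib} MiB + {rest} bytes")
    for step in strides(ADDRESSES):
        print(f"stride between mappings: {hex(step)} ({step} bytes)")
```

Each mapping is 8 MiB plus one 4096-byte page, and consecutive addresses sit
exactly one mapping length (0x801000 bytes) apart, so the address space is
consumed in steps of about 8M rather than 4K.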

> 
> Looking at the function name, I'm thinking about what Jan said on another
> bug:
> 
> > the hang is probably a separate bug. ImageMagick test suite results on my 
> > Turks GPU are:
> > # TOTAL: 86
> > # PASS:  78
> > # SKIP:  0
> > # XFAIL: 0
> > # FAIL:  3
> > # XPASS: 0
> > # ERROR: 5
> >
> > the errors and failures are accompanied by:
> > Assertion `i < getNumRegs() && "Register number out of range!"' failed.
> 
> Could this be perhaps the same registers that were out of range on a
> different card?

All cards of one class have the same number of architecturally available
registers.
I see you have debug symbols. Is that a debug build? If not, it may be that the
assert is not hit, and the hang is just fallout.

> 
> Either way, I will continue to investigate, and hope to narrow down the
> issue soon.

thanks.



2017-02-02 Thread bugzilla-daemon

nixscrip...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[r600g]OpenCL driver causes |[r600g]OpenCL driver causes
                   |process to hang in          |ImageMagick to hang on JPEG
                   |ImageMagick's Gaussian Blur |input in Gaussian Blur
                   |kernel                      |kernel
