There are actually three separate issues here.  As (a) is already known
and (b) is not a bug, I am treating this bug report as being about (c).

To understand them, it is necessary to know that OpenCL computations are
asynchronous: a clmath expression like "aCL=bCL+cCL" places this
operation in a CommandQueue and returns without waiting for it to
finish.  (This is to allow the CPU to do other work during the GPU
computation.)

(a) Running out of memory can hang the entire system, rather than ending
just the OpenCL application with CL_OUT_OF_RESOURCES.

This is probably the same long-standing issue (e.g. bug 620074, bug
1504914, bug 1592813) that makes Linux out-of-memory conditions in
general do this.  (The integrated GPUs supported by beignet share the
host's memory.)

(b) In both beignet and pocl (and probably all ICDs), a long sequence of
allocate/deallocate operations (e.g. clmath creating a new array for
each operation) *without* waiting for results uses up memory; waiting
for results regularly avoids this.

This is because allocating memory (clCreateBuffer) happens immediately,
but the actual computations are queued, and memory can't be freed until
the computations using it have finished.  Hence, if many operations are
queued without waiting for a result, memory allocation can run far ahead
of computation, filling up the memory.
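
To make the argument above concrete, here is a toy model in plain
Python (no OpenCL involved).  It is only a sketch: the function name
peak_buffers and the assumption that queued work runs only when the
host waits are mine, chosen as a worst case for a device much slower
than the host.

```python
def peak_buffers(n_ops, wait_every=None):
    """Toy model: peak number of live buffers for a chain of n_ops operations.

    Each operation reads the previous result and writes a new buffer.
    The new buffer is allocated at enqueue time, but an old buffer can
    only be released once the queued operation reading it has run.
    Assumption: queued work runs only when the host waits.
    """
    live = 1   # the initial input buffer
    peak = live
    for i in range(1, n_ops + 1):
        live += 1                  # result buffer is allocated immediately
        peak = max(peak, live)     # ...while older buffers are still pinned
        if wait_every and i % wait_every == 0:
            # Waiting drains the queue, so every buffer except the
            # latest result can now be freed.
            live = 1
    return peak


print(peak_buffers(1000))                 # 1001: allocation runs far ahead
print(peak_buffers(1000, wait_every=10))  # 11: bounded by the wait interval
```

In this model the peak is bounded by the wait interval rather than by
the total number of operations, which is the behaviour described above.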

This is not a bug: don't do that.  Either wait for results often enough
that this doesn't build up to the point of running out of memory, or
(better for performance) re-use existing memory objects instead of
allocating/deallocating.  (To do the latter with clmath, use
pyopencl.tools.MemoryPool.)
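
A minimal sketch of both workarounds together, assuming a working
OpenCL platform is installed.  The function name run_chain and the
particular computation are made up for illustration; the sizes and the
wait interval are arbitrary and would need tuning.

```python
def run_chain(n_ops=1000, wait_every=100, n=1 << 20):
    """Run a chain of clmath operations with a memory pool and regular waits."""
    # Imported lazily so that merely defining this function does not
    # require numpy or an OpenCL installation.
    import numpy as np
    import pyopencl as cl
    import pyopencl.array as cl_array
    import pyopencl.clmath as clmath
    import pyopencl.tools as cl_tools

    ctx = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(ctx)

    # The pool keeps released buffers and hands them out again, instead
    # of asking the OpenCL implementation for a new one each time.
    pool = cl_tools.MemoryPool(cl_tools.ImmediateAllocator(queue))

    host = np.linspace(0, 1, n).astype(np.float32)
    a = cl_array.to_device(queue, host, allocator=pool)
    for i in range(1, n_ops + 1):
        a = clmath.sin(a) + a      # each operation creates new arrays
        if i % wait_every == 0:
            queue.finish()         # let computation catch up with allocation
    return a.get()                 # copies the result back (and waits for it)
```

Arrays derived from one that was created with allocator=pool should
use the same pool, so only the initial to_device call needs it.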

While investigating this I discovered that beignet executes all queues
out of order, even if the user requested in-order execution.  That is a
bug, but it is not the cause of this issue.

(c) In beignet but not pocl, a long sequence of clmath operations leaks
memory, even with regular waits.

To ensure that intermediate results are calculated before they are used,
clmath arrays use Event objects to track dependencies.  A beignet event
includes references to the event(s) it depends on
(https://sources.debian.org/src/beignet/1.3.2-2/src/cl_event.h/?hl=47#L40),
and continues to hold these as long as the event object exists, even if
it has completed and been waited for.  As OpenCL objects are freed by
reference counting, this means that as long as the last event in a
dependency tree exists, the whole tree of (recursive) dependencies also
exists, taking up memory (~20kB per event).
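
The effect can be reproduced in plain Python, whose objects are also
freed by reference counting.  This is only a model: the Event class
below is a stand-in of my own, not beignet's or pocl's actual
structure, and the drop_dependencies flag stands for releasing the
dependency references once an event has completed.

```python
import gc
import weakref


class Event:
    """Stand-in for an OpenCL event (hypothetical, for illustration only)."""

    def __init__(self, depends_on=()):
        self.depends_on = list(depends_on)   # strong references to dependencies
        self.payload = bytearray(20 * 1024)  # ~20kB per event, as observed

    def complete(self, drop_dependencies):
        # Releasing the references here lets finished dependencies be freed.
        if drop_dependencies:
            self.depends_on.clear()


def live_events_after_chain(n_events, drop_dependencies):
    """Build a linear dependency chain and count the events still alive
    while only the last one is referenced."""
    refs = []
    last = None
    for _ in range(n_events):
        ev = Event(depends_on=[last] if last is not None else [])
        refs.append(weakref.ref(ev))   # weak: does not keep the event alive
        ev.complete(drop_dependencies)
        last = ev
    gc.collect()
    return sum(r() is not None for r in refs)


print(live_events_after_chain(200, drop_dependencies=False))  # 200
print(live_events_after_chain(200, drop_dependencies=True))   # 1
```

With the references kept, holding the last event keeps all 200 alive
(about 4MB here); with them dropped on completion, only the last one
remains.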

pocl avoids this by dropping these references after completion
(https://sources.debian.org/src/pocl/1.1-5/lib/CL/devices/common.c/?hl=722#L714);
the attached patch makes beignet do so.  Checking the source suggests
mesa is also affected
(https://sources.debian.org/src/mesa/18.1.3-1/src/gallium/state_trackers/clover/core/event.hpp/?hl=84#L34),
but I don't have the hardware to try it.  (The OpenCL part of mesa is
AMD/Radeon only.)


** Also affects: mesa (Ubuntu)
   Importance: Undecided
       Status: New

** Changed in: pyopencl (Ubuntu)
       Status: New => Invalid

https://bugs.launchpad.net/bugs/1354086

Title:
  [i5-3230] Tight pyopencl.clmath loops cause out-of-memory system hang
