Hello everyone.

Some time ago (2010-09-11) I volunteered to fix PyCUDA
to use the non-deprecated OpenGL interoperability functions.
Here is the patch.

The biggest problem is that, in my opinion (I might be wrong;
I do not know PyCUDA in such intimate detail), it introduces
a backward-incompatible change.

I have split BufferObject into two classes:
BufferObject, responsible for buffers (VBOs, etc.), and
ImageObject, responsible for textures, render buffers, etc.

Previously BufferObject was responsible for both of those
cases; now anyone who wants to use CUDA on images
must use ImageObject, which is an API-breaking change.


All tests and examples run without any problem.
The test undistributed/reduction-perf.py throws an
out-of-memory error in cuMemAlloc, but that error is
not related to the changes in the attached patch.

I also attach sample code (Qt+OpenGL+PyCUDA) demonstrating
the problem, which uses the new BufferObject.
I have checked that it works after applying my patch;
the only change needed was to the initialisation code
to get the VBO-based program running.

I have not checked ImageObject.

I have also started writing documentation for the new
pycuda.gl functions.

I have removed gl_init, as the documentation says that cuGLInit()
is deprecated and does nothing (Reference Manual 4.41.4.1).

Instead I have created gl_select_device. This function, however,
must be called first (before all other CUDA functions), is a
cuda* function rather than a cu* one, and does not cooperate with
CUDAPP_CALL_GUARDED, so I do not know what to do with it.
I was not able to test it, as I do not have a setup with two devices.


Other problems:

The function pycuda.tools.get_default_device is deprecated,
but pycuda.gl.autoinit uses it. I have changed autoinit
to use Device(0), but that is not elegant and should be changed.

The function pycuda.tools.make_default_context contains
repeated code checking for the presence of a CUDA device:

def make_default_context():
    ndevices = cuda.Device.count()
    if ndevices == 0:
        errmsg = "No CUDA enabled device found. Please check your installation."
        raise RuntimeError, errmsg

    ndevices = cuda.Device.count()
    if ndevices == 0:
        raise RuntimeError("No CUDA enabled device found. "
                "Please check your installation.")

Why? Is that an omission, or does this double check serve
some purpose?
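If the duplication is indeed an oversight, the two blocks could be collapsed into one. As a stand-alone sketch (using a hypothetical check_device_count helper in place of the real cuda.Device.count() call, and keeping the second block's exception style):

```python
def check_device_count(ndevices):
    # Single device-presence check, replacing the duplicated
    # block at the top of make_default_context.
    if ndevices == 0:
        raise RuntimeError("No CUDA enabled device found. "
                "Please check your installation.")

# make_default_context would then start with something like:
#     check_device_count(cuda.Device.count())
```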


Please look at this patch and send remarks.
The biggest open question is the necessity of moving
OpenGL image operations to ImageObject instead of
BufferObject.

This weekend is a holiday in Poland and I will be at my
parents' home, so I am not sure whether I will be able
to work on an example for ImageObject.

Regards.

-- 
Tomasz Rybak <bogom...@post.pl> GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A  488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak
diff --git a/doc/source/gl.rst b/doc/source/gl.rst
index e5d196f..b367c24 100644
--- a/doc/source/gl.rst
+++ b/doc/source/gl.rst
@@ -8,7 +8,7 @@ GL Interoperability
 
 .. module:: pycuda.gl
 
-.. function :: init()
+.. function :: select_device(device)
     
     Enable GL interoperability for the already-created (so far non-GL)
     and currently active :class:`pycuda.driver.Context`.
@@ -33,7 +33,36 @@ GL Interoperability
         This will fail with a rather unhelpful error message if you don't already 
         have a GL context created and active.
 
-.. class :: BufferObject(bufobj)
+.. class :: map_flags
+
+    .. attribute :: cudaGraphicsMapFlagsNone
+
+      Read and write access to mapped OpenGL object from CUDA code.
+
+    .. attribute :: cudaGraphicsMapFlagsReadOnly
+
+      Read only access to mapped OpenGL object from CUDA code.
+
+    .. attribute :: cudaGraphicsMapFlagsWriteDiscard
+
+      Write only access to mapped OpenGL object from CUDA code. Reading
+      is prohibited.
+
+.. class :: map_targets
+
+  Type of OpenGL Image object that is mapped to CUDA.
+
+    .. attribute :: GL_TEXTURE_2D
+    .. attribute :: GL_TEXTURE_RECTANGLE
+    .. attribute :: GL_TEXTURE_CUBE_MAP
+    .. attribute :: GL_TEXTURE_3D
+    .. attribute :: GL_TEXTURE_2D_ARRAY
+    .. attribute :: GL_RENDERBUFFER
+
+.. class :: BufferObject(bufobj, flags = cudaGraphicsMapFlagsNone)
+
+  Object managing mapping of OpenGL buffers to CUDA. Cannot be used to
+  map images.
 
     .. method :: unregister()
     .. method :: handle()
@@ -45,6 +74,20 @@ GL Interoperability
     .. method :: device_ptr()
     .. method :: size()
 
+.. class :: ImageObject(bufobj, target, flags = cudaGraphicsMapFlagsNone)
+
+  Object managing mapping of OpenGL textures and render buffers to CUDA.
+
+    .. method :: unregister()
+    .. method :: handle()
+    .. method :: map()
+    
+.. class :: ImageObjectMapping
+
+    .. method :: unmap()
+    .. method :: device_ptr()
+    .. method :: size()
+
 .. note ::
 
     See this `post <http://forums.nvidia.com/index.php?showtopic=88152>`_ on the
diff --git a/pycuda/gl/__init__.py b/pycuda/gl/__init__.py
index a18b3d2..379b42e 100644
--- a/pycuda/gl/__init__.py
+++ b/pycuda/gl/__init__.py
@@ -3,7 +3,11 @@ import pycuda._driver as _drv
 if not _drv.have_gl_ext(): 
     raise ImportError("PyCUDA was compiled without GL extension support")
 
-init = _drv.gl_init
+select_device = _drv.gl_select_device
 make_context = _drv.make_gl_context
+map_flags = _drv.map_flags
+target_flags = _drv.target_flags
 BufferObject = _drv.BufferObject
 BufferObjectMapping = _drv.BufferObjectMapping
+ImageObject = _drv.ImageObject
+ImageObjectMapping = _drv.ImageObjectMapping
diff --git a/pycuda/gl/autoinit.py b/pycuda/gl/autoinit.py
index 9fb8e88..fc0d2b0 100644
--- a/pycuda/gl/autoinit.py
+++ b/pycuda/gl/autoinit.py
@@ -5,8 +5,9 @@ import pycuda.tools
 cuda.init()
 assert cuda.Device.count() >= 1
 
-device = pycuda.tools.get_default_device()
+device = cuda.Device(0)
 context = cudagl.make_context(device)
 
+
 import atexit
 atexit.register(context.pop)
diff --git a/src/cpp/cuda_gl.hpp b/src/cpp/cuda_gl.hpp
index df8d289..c6d41c3 100644
--- a/src/cpp/cuda_gl.hpp
+++ b/src/cpp/cuda_gl.hpp
@@ -11,13 +11,22 @@
 #endif 
 
 #include <cudaGL.h>
+#include <cuda_gl_interop.h>
+#include <driver_types.h>
 
 
 namespace cuda { namespace gl {
   inline
-  void gl_init()
+  void gl_select_device(int device)
   {
-    CUDAPP_CALL_GUARDED(cuGLInit, ());
+    CUDAPP_PRINT_CALL_TRACE("cudaGLSetGLDevice");
+    cudaError_t cu_status_code = cudaSuccess;
+//    cu_status_code = cudaGLSetGLDevice(device);
+    CUDAPP_PRINT_ERROR_TRACE("cudaGLSetGLDevice", cu_status_code);
+    if (cu_status_code != cudaSuccess) {
+      throw cuda::error("cudaGLSetGLDevice", CUDA_SUCCESS);
+    }
+//    CUDAPP_CALL_GUARDED(cudaGLSetGLDevice, (device));
   }
 
 
@@ -40,12 +49,13 @@ namespace cuda { namespace gl {
   {
     private:
       GLuint m_handle;
+      CUgraphicsResource m_resource;
       bool m_valid;
 
     public:
-      buffer_object(GLuint handle)
+      buffer_object(GLuint handle, cudaGraphicsMapFlags flags = cudaGraphicsMapFlagsNone)
         : m_handle(handle), m_valid(true)
-      { CUDAPP_CALL_GUARDED(cuGLRegisterBufferObject, (handle)); }
+      { CUDAPP_CALL_GUARDED(cuGraphicsGLRegisterBuffer, (&m_resource, handle, flags)); }
 
       ~buffer_object()
       {
@@ -56,6 +66,9 @@ namespace cuda { namespace gl {
       GLuint handle()
       { return m_handle; }
 
+      CUgraphicsResource * resource()
+      { return &m_resource; }
+
       void unregister()
       {
         if (m_valid)
@@ -63,7 +76,7 @@ namespace cuda { namespace gl {
           try
           {
             scoped_context_activation ca(get_context());
-            CUDAPP_CALL_GUARDED_CLEANUP(cuGLUnregisterBufferObject, (m_handle));
+            CUDAPP_CALL_GUARDED_CLEANUP(cuGraphicsUnregisterResource, (m_resource));
             m_valid = false;
           }
           CUDAPP_CATCH_CLEANUP_ON_DEAD_CONTEXT(buffer_object);
@@ -104,7 +117,9 @@ namespace cuda { namespace gl {
           try
           {
             scoped_context_activation ca(get_context());
-            CUDAPP_CALL_GUARDED_CLEANUP(cuGLUnmapBufferObject, (m_buffer_object->handle()));
+// Stream as third parameter
+// Is that a problem, or all main tasks are done in stream 0 in PyCUDA?
+            CUDAPP_CALL_GUARDED_CLEANUP(cuGraphicsUnmapResources, (1, m_buffer_object->resource(), NULL));
             m_valid = false;
           }
           CUDAPP_CATCH_CLEANUP_ON_DEAD_CONTEXT(buffer_object_mapping)
@@ -128,10 +143,123 @@ namespace cuda { namespace gl {
   {
     CUdeviceptr devptr;
     pycuda_size_t size;
-    CUDAPP_CALL_GUARDED(cuGLMapBufferObject, (&devptr, &size, bobj->handle()));
+// Stream as third parameter
+// Is that a problem, or all main tasks are done in stream 0 in PyCUDA?
+    CUDAPP_CALL_GUARDED_CLEANUP(cuGraphicsMapResources, (1, bobj->resource(), NULL));
+    CUDAPP_CALL_GUARDED(cuGraphicsResourceGetMappedPointer, (&devptr, &size, *(bobj->resource())));
 
     return new buffer_object_mapping(bobj, devptr, size);
   }
+
+
+
+
+  class image_object : public context_dependent
+  {
+    private:
+      GLuint m_handle;
+      GLenum m_target;
+      CUgraphicsResource m_resource;
+      bool m_valid;
+
+    public:
+      image_object(GLuint handle, GLenum target, cudaGraphicsMapFlags flags = cudaGraphicsMapFlagsNone)
+        : m_handle(handle), m_target(target), m_resource(NULL), m_valid(true)
+      { CUDAPP_CALL_GUARDED(cuGraphicsGLRegisterImage, (&m_resource, handle, target, flags)); }
+
+      ~image_object()
+      {
+        if (m_valid)
+          unregister();
+      }
+
+      GLuint handle()
+      { return m_handle; }
+
+      CUgraphicsResource * resource()
+      { return &m_resource; }
+
+      void unregister()
+      {
+        if (m_valid)
+        {
+          try
+          {
+            scoped_context_activation ca(get_context());
+            CUDAPP_CALL_GUARDED_CLEANUP(cuGraphicsUnregisterResource, (m_resource));
+            m_valid = false;
+          }
+          CUDAPP_CATCH_CLEANUP_ON_DEAD_CONTEXT(image_object);
+        }
+        else
+          throw cuda::error("image_object::unregister", CUDA_ERROR_INVALID_HANDLE);
+      }
+  };
+
+
+
+  class image_object_mapping : public context_dependent
+  {
+    private:
+      boost::shared_ptr<image_object> m_image_object;
+      CUdeviceptr m_devptr;
+      unsigned int m_size;
+      bool m_valid;
+
+    public:
+      image_object_mapping(
+          boost::shared_ptr<image_object> iobj,
+          CUdeviceptr devptr,
+          unsigned int size)
+        : m_image_object(iobj), m_devptr(devptr), m_size(size), m_valid(true)
+      { }
+
+      ~image_object_mapping()
+      {
+        if (m_valid)
+          unmap();
+      }
+
+      void unmap()
+      {
+        if (m_valid)
+        {
+          try
+          {
+            scoped_context_activation ca(get_context());
+// Stream as third parameter
+// Is that a problem, or all main tasks are done in stream 0 in PyCUDA?
+            CUDAPP_CALL_GUARDED_CLEANUP(cuGraphicsUnmapResources, (1, m_image_object->resource(), NULL));
+            m_valid = false;
+          }
+          CUDAPP_CATCH_CLEANUP_ON_DEAD_CONTEXT(image_object_mapping)
+        }
+        else
+          throw cuda::error("image_object_mapping::unmap", CUDA_ERROR_INVALID_HANDLE);
+      }
+
+      CUdeviceptr device_ptr() const
+      { return m_devptr; }
+
+      unsigned int size() const
+      { return m_size; }
+  };
+
+
+
+
+  inline image_object_mapping *map_image_object(
+      boost::shared_ptr<image_object> iobj)
+  {
+    CUdeviceptr devptr;
+    pycuda_size_t size;
+// Stream as third parameter
+// Is that a problem, or all main tasks are done in stream 0 in PyCUDA?
+    CUDAPP_CALL_GUARDED_CLEANUP(cuGraphicsMapResources, (1, iobj->resource(), NULL));
+    CUDAPP_CALL_GUARDED(cuGraphicsResourceGetMappedPointer, (&devptr, &size, *(iobj->resource())));
+
+    return new image_object_mapping(iobj, devptr, size);
+  }
 } }
 
 
diff --git a/src/wrapper/wrap_cudagl.cpp b/src/wrapper/wrap_cudagl.cpp
index 8e9ba02..2532262 100644
--- a/src/wrapper/wrap_cudagl.cpp
+++ b/src/wrapper/wrap_cudagl.cpp
@@ -1,5 +1,6 @@
 #include <cuda.hpp>
 #include <cuda_gl.hpp>
+#include <driver_types.h>
 
 #include "tools.hpp"
 #include "wrap_helpers.hpp"
@@ -19,13 +20,29 @@ void pycuda_expose_gl()
   using py::arg;
   using py::args;
 
-  DEF_SIMPLE_FUNCTION(gl_init);
+  py::def("gl_select_device", gl_select_device, (arg("device")));
 
   py::def("make_gl_context", make_gl_context, (arg("dev"), arg("flags")=0));
 
+// Reference Manual 4.9 and 4.23.3.6
+  py::enum_<cudaGraphicsMapFlags>("map_flags")
+    .value("cudaGraphicsMapFlagsNone", cudaGraphicsMapFlagsNone)
+    .value("cudaGraphicsMapFlagsReadOnly", cudaGraphicsMapFlagsReadOnly)
+    .value("cudaGraphicsMapFlagsWriteDiscard", cudaGraphicsMapFlagsWriteDiscard)
+  ;
+
+  py::enum_<GLenum>("target_flags")
+    .value("GL_TEXTURE_2D", GL_TEXTURE_2D)
+    .value("GL_TEXTURE_RECTANGLE", GL_TEXTURE_RECTANGLE)
+    .value("GL_TEXTURE_CUBE_MAP", GL_TEXTURE_CUBE_MAP)
+    .value("GL_TEXTURE_3D", GL_TEXTURE_3D)
+    .value("GL_TEXTURE_2D_ARRAY", GL_TEXTURE_2D_ARRAY)
+    .value("GL_RENDERBUFFER", GL_RENDERBUFFER)
+  ;
+
   {
     typedef buffer_object cl;
-    py::class_<cl, shared_ptr<cl> >("BufferObject", py::init<GLuint>())
+    py::class_<cl, shared_ptr<cl> >("BufferObject", py::init<GLuint, py::optional<cudaGraphicsMapFlags> >())
       .DEF_SIMPLE_METHOD(handle)
       .DEF_SIMPLE_METHOD(unregister)
       .def("map", map_buffer_object,
@@ -41,4 +58,24 @@ void pycuda_expose_gl()
       .DEF_SIMPLE_METHOD(size)
       ;
   }
+
+
+  {
+    typedef image_object cl;
+    py::class_<cl, shared_ptr<cl> >("ImageObject", py::init<GLuint, GLenum, py::optional<cudaGraphicsMapFlags> >())
+      .DEF_SIMPLE_METHOD(handle)
+      .DEF_SIMPLE_METHOD(unregister)
+      .def("map", map_image_object,
+          py::return_value_policy<py::manage_new_object>())
+      ;
+  }
+
+  {
+    typedef image_object_mapping cl;
+    py::class_<cl>("ImageObjectMapping", py::no_init)
+      .DEF_SIMPLE_METHOD(unmap)
+      .DEF_SIMPLE_METHOD(device_ptr)
+      .DEF_SIMPLE_METHOD(size)
+      ;
+  }
 }
#! /usr/bin/python

import sys
import array
import math
import numpy

from OpenGL.GL import *

import pycuda
import pycuda.compiler
import pycuda.driver
import pycuda.gl
import pycuda.gpuarray
#pycuda.init()
#import pycuda.autoinit

from PyQt4 import QtCore, QtGui, QtOpenGL


vertexSource = """#version 150 core
uniform mat4 projectionMatrix;
in vec3 vertexCoordinate;

void main() {
	vec4 a = vec4(vertexCoordinate, 1.0);
	a.z = a.z - 2.0;

	gl_Position = projectionMatrix * a;
}
"""

fragmentSource = """#version 150 core
void main() {
	gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);
}
"""

kernel = """
__global__ void compute(float3 *pos) {
	unsigned int x = threadIdx.x;
//	if (x < N) {
		if (x % 2 == 0) {
			pos[x].z = pos[x].z+1;
		} else {
			pos[x].z = pos[x].z-1;
		}
//	}
}
"""


class Widget(QtOpenGL.QGLWidget):
	def __init__(self, parent = None):
		super(Widget, self).__init__(parent)
	def minimumSizeHint(self):
		return QtCore.QSize(400, 300)
	def sizeHint(self):
		return QtCore.QSize(800, 600)
	def initializeGL(self):
		self.qglClearColor(QtCore.Qt.black)
		glEnable(GL_DEPTH_TEST)

		program = glCreateProgram()
		vertexShader = glCreateShader(GL_VERTEX_SHADER)
		glShaderSource(vertexShader, [vertexSource])
		glCompileShader(vertexShader)
		glAttachShader(program, vertexShader)
		fragmentShader = glCreateShader(GL_FRAGMENT_SHADER)
		glShaderSource(fragmentShader, [fragmentSource])
		glCompileShader(fragmentShader)
		glAttachShader(program, fragmentShader)
		glBindAttribLocation(program, 0, 'vertexCoordinate')
		glLinkProgram(program)
		glValidateProgram(program)
		glUseProgram(program)
		self.projectionMatrixLocation = glGetUniformLocation(program, 'projectionMatrix')

		self.data = numpy.zeros((99, 3), numpy.float32)
		for i in range(33):
			self.data[i*3+0, 0] = 0
			self.data[i*3+0, 1] = 0
			self.data[i*3+0, 2] = 0

			self.data[i*3+1, 0] = 1*math.cos(math.pi*(i+0)/16)
			self.data[i*3+1, 1] = 1*math.sin(math.pi*(i+0)/16)
			self.data[i*3+1, 2] = 0

			self.data[i*3+2, 0] = 1*math.cos(math.pi*(i+1)/16)
			self.data[i*3+2, 1] = 1*math.sin(math.pi*(i+1)/16)
			self.data[i*3+2, 2] = 0

		self.gl_buffer = glGenBuffers(1)
		glBindBuffer(GL_ARRAY_BUFFER, self.gl_buffer)
		glBufferData(GL_ARRAY_BUFFER, self.data, GL_DYNAMIC_DRAW)
		glVertexAttribPointer(0, 3, GL_FLOAT, False, 0, None)
		glEnableVertexAttribArray(0)
		import pycuda.gl.autoinit
#		pycuda.gl.init(0)
		k = pycuda.compiler.SourceModule(kernel)
		self.cuda_function = k.get_function("compute")
		self.cuda_function.prepare("P", (99, 1, 1))
		self.cuda_buffer = pycuda.gl.BufferObject(long(self.gl_buffer))
	def paintGL(self):
		cuda_object = self.cuda_buffer.map()
		self.cuda_function.prepared_call((1, 1), cuda_object.device_ptr())
		pycuda.driver.Context.synchronize()
		cuda_object.unmap()
		glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)
		glDrawArrays(GL_TRIANGLES, 0, 99)
	def resizeGL(self, width, height):
		side = min(width, height)
		glViewport((width-side)/2, (height-side)/2, side, side)
		projectionMatrix = numpy.zeros((4, 4), 'f')
		top = +1.5
		bottom = -1.5
		left = -1.5
		right = +1.5
		near = 1.0
		far = 3.0
		projectionMatrix[0, 0] = (2.0)/(right-left)
		projectionMatrix[0, 3] = -(right+left)/(right-left)
		projectionMatrix[1, 1] = (2.0)/(top-bottom)
		projectionMatrix[1, 3] = -(top+bottom)/(top-bottom)
		projectionMatrix[2, 2] = -(2.0)/(far-near)
		projectionMatrix[2, 3] = -(far+near)/(far-near)
		projectionMatrix[3, 3] = 1.0
		glUniformMatrix4fv(self.projectionMatrixLocation, 1, True, projectionMatrix)
	def closeEvent(self, event):
		glDeleteBuffers(1, long(self.gl_buffer))
		self.gl_buffer = None
		self.cuda_buffer = None
		super(Widget, self).closeEvent(event)

class Window(QtGui.QWidget):
	def __init__(self):
		super(Window, self).__init__()
		self.widget = Widget()
		self.setWindowTitle("CUDA OpenGL")
		layout = QtGui.QHBoxLayout()
		layout.addWidget(self.widget)
		self.setLayout(layout)
	def closeEvent(self, event):
		self.widget.close()
		super(Window, self).closeEvent(event)


app = QtGui.QApplication(sys.argv)
window = Window()
window.show()
app.exec_()
