[Bug 52054] New: gallium/opencl doesn't support includes for opencl kernels

2012-07-13 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=52054

 Bug #: 52054
   Summary: gallium/opencl doesn't support includes for opencl kernels
Classification: Unclassified
   Product: Mesa
   Version: git
  Platform: x86-64 (AMD64)
OS/Version: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Gallium/r600
AssignedTo: dri-devel at lists.freedesktop.org
ReportedBy: alexxy at gentoo.org


When running tests for the OpenCL-enabled jtr (http://www.openwall.com/john/),
I get the following error:

OpenCL platform 0: Default, 1 device(s).
Using device 0: AMD JUNIPER
1 error generated.
Compilation log: cl_input:17:10: fatal error: 'opencl_rar.h' file not found
#include "opencl_rar.h"
 ^

OpenCL error (CL_INVALID_PROGRAM_EXECUTABLE) in file (rar_fmt.c) at line (588)
- (Error creating kernel. Double-check kernel name?)

xeon ~ # ./clInfo 
Found 1 platform(s).
platform[(nil)]: profile: FULL_PROFILE
platform[(nil)]: version: OpenCL 1.1 MESA 
platform[(nil)]: name: Default
platform[(nil)]: vendor: Mesa
platform[(nil)]: extensions: 
platform[(nil)]: Found 1 device(s).
device[0xc82360]: NAME: AMD JUNIPER
device[0xc82360]: VENDOR: X.Org
device[0xc82360]: PROFILE: FULL_PROFILE
device[0xc82360]: VERSION: OpenCL 1.1 MESA 
device[0xc82360]: EXTENSIONS: 
device[0xc82360]: DRIVER_VERSION: 

device[0xc82360]: Type: GPU 
device[0xc82360]: EXECUTION_CAPABILITIES: Kernel 
device[0xc82360]: GLOBAL_MEM_CACHE_TYPE: None (0)
device[0xc82360]: CL_DEVICE_LOCAL_MEM_TYPE: Local (1)
device[0xc82360]: SINGLE_FP_CONFIG: 0x7
device[0xc82360]: QUEUE_PROPERTIES: 0x2

device[0xc82360]: VENDOR_ID: 4098
device[0xc82360]: MAX_COMPUTE_UNITS: 1
device[0xc82360]: MAX_WORK_ITEM_DIMENSIONS: 3
device[0xc82360]: MAX_WORK_GROUP_SIZE: 256
device[0xc82360]: PREFERRED_VECTOR_WIDTH_CHAR: 16
device[0xc82360]: PREFERRED_VECTOR_WIDTH_SHORT: 8
device[0xc82360]: PREFERRED_VECTOR_WIDTH_INT: 4
device[0xc82360]: PREFERRED_VECTOR_WIDTH_LONG: 2
device[0xc82360]: PREFERRED_VECTOR_WIDTH_FLOAT: 4
device[0xc82360]: PREFERRED_VECTOR_WIDTH_DOUBLE: 2
device[0xc82360]: MAX_CLOCK_FREQUENCY: 0
device[0xc82360]: ADDRESS_BITS: 32
device[0xc82360]: MAX_MEM_ALLOC_SIZE: 0
device[0xc82360]: IMAGE_SUPPORT: 1
device[0xc82360]: MAX_READ_IMAGE_ARGS: 32
device[0xc82360]: MAX_WRITE_IMAGE_ARGS: 32
device[0xc82360]: IMAGE2D_MAX_WIDTH: 32768
device[0xc82360]: IMAGE2D_MAX_HEIGHT: 32768
device[0xc82360]: IMAGE3D_MAX_WIDTH: 32768
device[0xc82360]: IMAGE3D_MAX_HEIGHT: 32768
device[0xc82360]: IMAGE3D_MAX_DEPTH: 32768
device[0xc82360]: MAX_SAMPLERS: 16
device[0xc82360]: MAX_PARAMETER_SIZE: 1024
device[0xc82360]: MEM_BASE_ADDR_ALIGN: 128
device[0xc82360]: MIN_DATA_TYPE_ALIGN_SIZE: 128
device[0xc82360]: GLOBAL_MEM_CACHELINE_SIZE: 0
device[0xc82360]: GLOBAL_MEM_CACHE_SIZE: 0
device[0xc82360]: GLOBAL_MEM_SIZE: 201326592
device[0xc82360]: MAX_CONSTANT_BUFFER_SIZE: 65536
device[0xc82360]: MAX_CONSTANT_ARGS: 1
device[0xc82360]: LOCAL_MEM_SIZE: 32768
device[0xc82360]: ERROR_CORRECTION_SUPPORT: 0
device[0xc82360]: PROFILING_TIMER_RESOLUTION: 0
device[0xc82360]: ENDIAN_LITTLE: 1
device[0xc82360]: AVAILABLE: 1
device[0xc82360]: COMPILER_AVAILABLE: 1

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[pull] drm-intel-next

2012-07-13 Thread Daniel Vetter
Hi Dave,

New pull for -next. Highlights:
- rc6/turbo support for hsw (Eugeni)
- improve corner-case of the reset handling code - gpu reset handling
  should be rock-solid now
- support for fb offset > 4096 pixels on gen4+ (yeah, you need some fairly
  big screens to hit that)
- the "Flush Me Harder" patch to fix the gen6+ fallout from disabling the
  flushing_list
- no more /dev/agpgart on gen6+!
- HAS_PCH_xxx improvements from Paulo
- a few minor bits all over, most of it in the hsw code

QA reported 2 regressions, one due to a bad cable (fixed by a walk to the next
RadioShack) and one due to the HPD v2 patch - I owe you one for refusing
to take v2 for -fixes after v1 blew up on Linus' machine, I guess ;-) The
latter has a confirmed fix already queued up in my tree.

Regressions from the last pull are all fixed and some really good news:
We've finally fixed the last DP regression from 3.2. Although I'm wary of
that blowing up elsewhere, hence I prefer that we soak it in 3.6 a bit
before submitting it to stable.

Otherwise Chris is hunting down an obscure bug that got recently
introduced due to a funny interaction between two seemingly unrelated
patches, one improving our gpu death handling, the other preparing the
removal of the flushing_list. But he has patches already, although I'm
still complaining a bit about the commit messages ...

Wrt further pulls for 3.6 I'll merge feature-y stuff only at the end of
the current drm-intel-next cycle so that if this will miss 3.6 I can just
send you a pull for the bugfixes that are currently merged (or in the case
of Chris' patches, hopefully merged soon).

Yours, Daniel

PS: This pull will make the already existing conflict with Linus' tree a
bit more fun, but I think it should be still doable (the important thing
is to keep the revert from -fixes, but don't kill any other changes from
-next).

The following changes since commit 7b0cfee1a24efdfe0235bac62e53f686fe8a8e24:

  Merge tag 'v3.5-rc4' into drm-intel-next-queued (2012-06-25 19:10:36 +0200)

are available in the git repository at:


  git://people.freedesktop.org/~danvet/drm-intel tags/drm-intel-next-2012-07-06

for you to fetch changes up to 4acf518626cdad5bbf7aac9869bd4accbbfb4ad3:

  drm/i915: program FDI_RX TP and FDI delays (2012-07-05 15:09:03 +0200)


Ben Widawsky (1):
  drm/i915: linuxify create_hw_context()

Chris Wilson (2):
  drm/i915: Group the GT routines together in both code and vtable
  drm/i915: Implement w/a for sporadic read failures on waking from rc6

Daniel Vetter (15):
  drm/i915: wrap up gt powersave enabling functions
  drm/i915: make enable/disable_gt_powersave locking consistent
  drm/i915: don't use dev->agp
  drm/i915: disable drm agp support for !gen3 with kms enabled
  agp/intel-agp: remove snb+ host bridge pciids
  drm/i915: "Flush Me Harder" required on gen6+
  drm/i915: fix up ilk rc6 disabling confusion
  drm/i915: don't trylock in the gpu reset code
  drm/i915: non-interruptible sleeps can't handle -EAGAIN
  drm/i915: don't hang userspace when the gpu reset is stuck
  drm/i915: properly SIGBUS on I/O errors
  drm/i915: don't return a spurious -EIO from intel_ring_begin
  drm/i915: introduce crtc->dspaddr_offset
  drm/i915: adjust framebuffer base address on gen4+
  drm/i915: introduce for_each_encoder_on_crtc

Eugeni Dodonov (11):
  drm/i915: support Haswell force waking
  drm/i915: add RPS configuration for Haswell
  drm/i915: slightly improve gt enable/disable routines
  drm/i915: enable RC6 by default on Haswell
  drm/i915: disable RC6 when disabling rps
  drm/i915: introduce haswell_init_clock_gating
  drm/i915: enable RC6 workaround on Haswell
  drm/i915: move force wake support into intel_pm
  drm/i915: re-initialize DDI buffer translations after resume
  drm/i915: prevent bogus intel_update_fbc notifications
  drm/i915: program FDI_RX TP and FDI delays

Jesper Juhl (1):
  drm/i915/sprite: Fix mem leak in intel_plane_init()

Jesse Barnes (3):
  drm/i915: mask tiled bit when updating IVB sprites
  drm/i915: correct IVB default sprite format
  drm/i915: prefer wide & slow to fast & narrow in DP configs

Paulo Zanoni (5):
  drm/i915: fix PIPE_WM_LINETIME definition
  drm/i915: add PCH_NONE to enum intel_pch
  drm/i915: get rid of dev_priv->info->has_pch_split
  drm/i915: don't ironlake_init_pch_refclk() on LPT
  drm/i915: fix PIPE_DDI_PORT_MASK

Ville Syrjälä (2):
  drm/i915: Zero initialize mode_cmd
  drm/i915: Reject page flips with changed format/offset/pitch

 drivers/char/agp/intel-agp.c|   11 -
 drivers/gpu/drm/i915/i915_dma.c |9 +-
 drivers/gpu/drm/i915/i915_drv.c |  172 ++
 drivers/gpu/drm/i915/i915_drv.h |   28 ++-
 drivers/gpu/drm/i915/i915_gem.c |   44 +++-
 

[PATCH 6/6] modetest.c: Add return 0 in bit_name_fn(res) macro.

2012-07-13 Thread Johannes Obermayr
---
 tests/modetest/modetest.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/tests/modetest/modetest.c b/tests/modetest/modetest.c
index ec3121e..00129fa 100644
--- a/tests/modetest/modetest.c
+++ b/tests/modetest/modetest.c
@@ -128,6 +128,7 @@ char * res##_str(int type) {  \
sep = ", "; \
}   \
}   \
+   return 0;   \
 }

 static const char *mode_type_names[] = {
-- 
1.7.7



[PATCH 5/6] xf86drm.c: Fix two memory leaks.

2012-07-13 Thread Johannes Obermayr
---
 xf86drm.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/xf86drm.c b/xf86drm.c
index e652731..c1cc170 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -1399,8 +1399,11 @@ drm_context_t *drmGetReservedContextList(int fd, int *count)
 }

 res.contexts = list;
-if (drmIoctl(fd, DRM_IOCTL_RES_CTX, &res))
+if (drmIoctl(fd, DRM_IOCTL_RES_CTX, &res)) {
+   drmFree(list);
+   drmFree(retval);
return NULL;
+}

 for (i = 0; i < res.count; i++)
retval[i] = list[i].handle;
-- 
1.7.7



[PATCH 4/6] xf86drm.c: Make more code UDEV-irrelevant.

2012-07-13 Thread Johannes Obermayr
---
 xf86drm.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/xf86drm.c b/xf86drm.c
index 6ea068f..e652731 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -255,6 +255,7 @@ static int drmMatchBusID(const char *id1, const char *id2, int pci_domain_ok)
 return 0;
 }

+#if !defined(UDEV)
 /**
  * Handles error checking for chown call.
  *
@@ -284,6 +285,7 @@ static int chown_check_return(const char *path, uid_t owner, gid_t group)
path, errno, strerror(errno));
return -1;
 }
+#endif

 /**
  * Open the DRM device, creating it if necessary.
@@ -303,14 +305,17 @@ static int drmOpenDevice(long dev, int minor, int type)
 stat_t          st;
 char            buf[64];
 int fd;
+#if !defined(UDEV)
 mode_t  devmode = DRM_DEV_MODE, serv_mode;
 int isroot  = !geteuid();
 uid_t   user= DRM_DEV_UID;
 gid_t   group   = DRM_DEV_GID, serv_group;
-
+#endif
+
 sprintf(buf, type ? DRM_DEV_NAME : DRM_CONTROL_DEV_NAME, DRM_DIR_NAME, minor);
 drmMsg("drmOpenDevice: node name is %s\n", buf);

+#if !defined(UDEV)
 if (drm_server_info) {
 drm_server_info->get_perms(&serv_group, &serv_mode);
devmode  = serv_mode ? serv_mode : DRM_DEV_MODE;
@@ -318,7 +323,6 @@ static int drmOpenDevice(long dev, int minor, int type)
group = (serv_group >= 0) ? serv_group : DRM_DEV_GID;
 }

-#if !defined(UDEV)
 if (stat(DRM_DIR_NAME, &st)) {
if (!isroot)
return DRM_ERR_NOT_ROOT;
-- 
1.7.7



[PATCH 3/6] nouveau/nouveau.c: Fix two memory leaks.

2012-07-13 Thread Johannes Obermayr
---
 nouveau/nouveau.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/nouveau/nouveau.c b/nouveau/nouveau.c
index 5aa4107..e91287f 100644
--- a/nouveau/nouveau.c
+++ b/nouveau/nouveau.c
@@ -95,6 +95,7 @@ nouveau_device_wrap(int fd, int close, struct nouveau_device **pdev)
(dev->drm_version <  0x0100 ||
 dev->drm_version >= 0x0200)) {
 nouveau_device_del(&dev);
+   free(nvdev);
return -EINVAL;
}

@@ -105,6 +106,7 @@ nouveau_device_wrap(int fd, int close, struct nouveau_device **pdev)
ret = nouveau_getparam(dev, NOUVEAU_GETPARAM_AGP_SIZE, );
if (ret) {
 nouveau_device_del(&dev);
+   free(nvdev);
return ret;
}

-- 
1.7.7



[PATCH 2/6] libkms/nouveau.c: Fix a memory leak and make some code easier to read.

2012-07-13 Thread Johannes Obermayr
---
 libkms/nouveau.c |   27 ++-
 1 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/libkms/nouveau.c b/libkms/nouveau.c
index 0e24a15..fbca6fe 100644
--- a/libkms/nouveau.c
+++ b/libkms/nouveau.c
@@ -90,21 +90,24 @@ nouveau_bo_create(struct kms_driver *kms,
}
}

-   bo = calloc(1, sizeof(*bo));
-   if (!bo)
-   return -ENOMEM;
-
-   if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) {
+   switch (type) {
+   case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8:
pitch = 64 * 4;
size = 64 * 64 * 4;
-   } else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) {
+   break;
+   case KMS_BO_TYPE_SCANOUT_X8R8G8B8:
pitch = width * 4;
pitch = (pitch + 512 - 1) & ~(512 - 1);
size = pitch * height;
-   } else {
+   break;
+   default:
return -EINVAL;
}

+   bo = calloc(1, sizeof(*bo));
+   if (!bo)
+   return -ENOMEM;
+
 memset(&arg, 0, sizeof(arg));
arg.info.size = size;
arg.info.domain = NOUVEAU_GEM_DOMAIN_MAPPABLE | NOUVEAU_GEM_DOMAIN_VRAM;
@@ -114,8 +117,10 @@ nouveau_bo_create(struct kms_driver *kms,
arg.channel_hint = 0;

 ret = drmCommandWriteRead(kms->fd, DRM_NOUVEAU_GEM_NEW, &arg, sizeof(arg));
-   if (ret)
-   goto err_free;
+   if (ret) {
+   free(bo);
+   return ret;
+   }

bo->base.kms = kms;
bo->base.handle = arg.info.handle;
@@ -126,10 +131,6 @@ nouveau_bo_create(struct kms_driver *kms,
 *out = &bo->base;

return 0;
-
-err_free:
-   free(bo);
-   return ret;
 }

 static int
-- 
1.7.7



[PATCH 1/6] libkms/intel.c: Fix a memory leak and a dead assignment as well as make some code easier to read.

2012-07-13 Thread Johannes Obermayr
---
 libkms/intel.c |   32 +---
 1 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/libkms/intel.c b/libkms/intel.c
index 8b8249b..12175b0 100644
--- a/libkms/intel.c
+++ b/libkms/intel.c
@@ -89,27 +89,32 @@ intel_bo_create(struct kms_driver *kms,
}
}

-   bo = calloc(1, sizeof(*bo));
-   if (!bo)
-   return -ENOMEM;
-
-   if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) {
+   switch (type) {
+   case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8:
pitch = 64 * 4;
size = 64 * 64 * 4;
-   } else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) {
+   break;
+   case KMS_BO_TYPE_SCANOUT_X8R8G8B8:
pitch = width * 4;
pitch = (pitch + 512 - 1) & ~(512 - 1);
size = pitch * ((height + 4 - 1) & ~(4 - 1));
-   } else {
+   break;
+   default:
return -EINVAL;
}

+   bo = calloc(1, sizeof(*bo));
+   if (!bo)
+   return -ENOMEM;
+
 memset(&arg, 0, sizeof(arg));
arg.size = size;

 ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_CREATE, &arg, sizeof(arg));
-   if (ret)
-   goto err_free;
+   if (ret) {
+   free(bo);
+   return ret;
+   }

bo->base.kms = kms;
bo->base.handle = arg.handle;
@@ -124,21 +129,18 @@ intel_bo_create(struct kms_driver *kms,
tile.handle = bo->base.handle;
tile.tiling_mode = I915_TILING_X;
tile.stride = bo->base.pitch;
-
-   ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, sizeof(tile));
 #if 0
+   ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, sizeof(tile));
if (ret) {
kms_bo_destroy(out);
return ret;
}
+#else
+   drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, sizeof(tile));
 #endif
}

return 0;
-
-err_free:
-   free(bo);
-   return ret;
 }

 static int
-- 
1.7.7



libdrm: Fix some warnings reported by clang's scan-build tool [try 2]

2012-07-13 Thread Johannes Obermayr

On Friday, 13 July 2012 at 18:47:50, Marcin Slusarz wrote:
> On Fri, Jul 13, 2012 at 05:49:12PM +0200, Johannes Obermayr wrote:
> > 
> > Patches 1 to 4 were sent to mesa-dev.
> 
> And you chose to ignore most of my comments.
> Fine. Don't expect further reviews from me.
> 
> Marcin

Patch 1 and 2:
- Adapted
- I want to keep the proposed, easier-to-read "switch" case

Patch 3:
- Resend
- Waiting on your response: 
http://lists.freedesktop.org/archives/mesa-dev/2012-June/023456.html

Patch 4 and 5:
- Split
- http://llvm.org/bugs/show_bug.cgi?id=13358 (forgot to split and to add 
'drmFree(list);')
- The 'more if's case' seems better to me

Patch 6:
- Resend

Marcin, it's not that I ignore comments. But sometimes I also want to hear
opinions from (some more) other people.
I hope I can calm the waves ...

Johannes


libdrm: Fix some warnings reported by clang's scan-build tool

2012-07-13 Thread Marcin Slusarz
On Fri, Jul 13, 2012 at 05:49:12PM +0200, Johannes Obermayr wrote:
> 
> Patches 1 to 4 were sent to mesa-dev.

And you chose to ignore most of my comments.
Fine. Don't expect further reviews from me.

Marcin


[RFC] dma-fence: dma-buf synchronization (v2)

2012-07-13 Thread Tom Cooksey
Hi Rob,

Yes, sorry we've been a bit slack progressing KDS publicly. Your
approach looks interesting and seems like it could enable both implicit
and explicit synchronization. A good compromise.


> From: Rob Clark 
> 
> A dma-fence can be attached to a buffer which is being filled or
> consumed by hw, to allow userspace to pass the buffer without waiting
> to another device.  For example, userspace can call page_flip ioctl to
> display the next frame of graphics after kicking the GPU but while the
> GPU is still rendering.  The display device sharing the buffer with the
> GPU would attach a callback to get notified when the GPU's rendering-
> complete IRQ fires, to update the scan-out address of the display,
> without having to wake up userspace.
> 
> A dma-fence is a transient, one-shot deal.  It is allocated and attached
> to a dma-buf's list of fences.  When the one that attached it is done
> with the pending operation, it can signal the fence, removing it from
> the dma-buf's list of fences:
> 
>   + dma_buf_attach_fence()
>   + dma_fence_signal()

It would be useful to have two lists of fences, those around writes to
the buffer and those around reads. The idea being that if you only want
to read from a buffer, you don't need to wait for fences around other
read operations, you only need to wait for the "last" writer fence. If
you do want to write to the buffer however, you need to wait for all
the read fences and the last writer fence. The use-case is when EGL
swap behaviour is EGL_BUFFER_PRESERVED. You have the display controller
reading the buffer with its fence defined to be signalled when it is
no longer scanning out that buffer. It can only stop scanning out that
buffer when it is given another buffer to scan-out. If that next buffer
must be rendered by copying the currently scanned-out buffer into it
(one possible option for implementing EGL_BUFFER_PRESERVED) then you
essentially deadlock if the scan-out job blocks the "render the next
frame" job. 

There's probably variations of this idea, perhaps you only need a flag
to indicate if a fence is around a read-only or rw access?


> The intention is to provide a userspace interface (presumably via
> eventfd) later, to be used in conjunction with dma-buf's mmap support
> for sw access to buffers (or for userspace apps that would prefer to
> do their own synchronization).

From our experience of our own KDS, we've come up with an interesting
approach to synchronizing userspace applications which have a buffer
mmap'd. We wanted to avoid userspace being able to block jobs running
on hardware while still allowing userspace to participate. Our original
idea was to have a lock/unlock ioctl interface on a dma_buf but have
a timeout whereby the application's lock would be broken if held for
too long. That at least bounded how long userspace could potentially
block hardware making progress, though was pretty "harsh".

The approach we have now settled on is to instead only allow an
application to wait for all jobs currently pending for a buffer. So
there's no way userspace can prevent anything else from using a
buffer, other than not issuing jobs which will use that buffer.
Also, the interface we settled on was to add a poll handler to
dma_buf, that way userspace can select() on multiple dma_buf
buffers in one syscall. It can also choose if it wants to wait for
only the last writer fence, i.e. wait until it can read (POLLIN)
or wait for all fences as it wants to write to the buffer (POLLOUT).
We kinda like this, but does restrict the utility a little. An idea
worth considering anyway.


My other thought is around atomicity. Could this be extended to
(safely) allow for hardware devices which might want to access
multiple buffers simultaneously? I think it probably can with
some tweaks to the interface? An atomic function which does 
something like "give me all the fences for all these buffers 
and add this fence to each instead/as-well-as"?


Cheers,

Tom






[PATCH 5/5] modetest.c: Add return 0 in bit_name_fn(res) macro.

2012-07-13 Thread Johannes Obermayr
---
 tests/modetest/modetest.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/tests/modetest/modetest.c b/tests/modetest/modetest.c
index ec3121e..00129fa 100644
--- a/tests/modetest/modetest.c
+++ b/tests/modetest/modetest.c
@@ -128,6 +128,7 @@ char * res##_str(int type) {  \
sep = ", "; \
}   \
}   \
+   return 0;   \
 }

 static const char *mode_type_names[] = {
-- 
1.7.7



[PATCH 4/5] xf86drm.c: Make more code UDEV-irrelevant and fix a memory leak.

2012-07-13 Thread Johannes Obermayr
---
 xf86drm.c |   12 +---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/xf86drm.c b/xf86drm.c
index 6ea068f..e3789c8 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -255,6 +255,7 @@ static int drmMatchBusID(const char *id1, const char *id2, int pci_domain_ok)
 return 0;
 }

+#if !defined(UDEV)
 /**
  * Handles error checking for chown call.
  *
@@ -284,6 +285,7 @@ static int chown_check_return(const char *path, uid_t owner, gid_t group)
path, errno, strerror(errno));
return -1;
 }
+#endif

 /**
  * Open the DRM device, creating it if necessary.
@@ -303,14 +305,17 @@ static int drmOpenDevice(long dev, int minor, int type)
 stat_t          st;
 char            buf[64];
 int fd;
+#if !defined(UDEV)
 mode_t  devmode = DRM_DEV_MODE, serv_mode;
 int isroot  = !geteuid();
 uid_t   user= DRM_DEV_UID;
 gid_t   group   = DRM_DEV_GID, serv_group;
-
+#endif
+
 sprintf(buf, type ? DRM_DEV_NAME : DRM_CONTROL_DEV_NAME, DRM_DIR_NAME, minor);
 drmMsg("drmOpenDevice: node name is %s\n", buf);

+#if !defined(UDEV)
 if (drm_server_info) {
 drm_server_info->get_perms(&serv_group, &serv_mode);
devmode  = serv_mode ? serv_mode : DRM_DEV_MODE;
@@ -318,7 +323,6 @@ static int drmOpenDevice(long dev, int minor, int type)
group = (serv_group >= 0) ? serv_group : DRM_DEV_GID;
 }

-#if !defined(UDEV)
 if (stat(DRM_DIR_NAME, &st)) {
if (!isroot)
return DRM_ERR_NOT_ROOT;
@@ -1395,8 +1399,10 @@ drm_context_t *drmGetReservedContextList(int fd, int *count)
 }

 res.contexts = list;
-    if (drmIoctl(fd, DRM_IOCTL_RES_CTX, &res))
+    if (drmIoctl(fd, DRM_IOCTL_RES_CTX, &res)) {
+   drmFree(retval);
return NULL;
+}

 for (i = 0; i < res.count; i++)
retval[i] = list[i].handle;
-- 
1.7.7



[PATCH 3/5] nouveau/nouveau.c: Fix two memory leaks.

2012-07-13 Thread Johannes Obermayr
---
 nouveau/nouveau.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/nouveau/nouveau.c b/nouveau/nouveau.c
index 5aa4107..e91287f 100644
--- a/nouveau/nouveau.c
+++ b/nouveau/nouveau.c
@@ -95,6 +95,7 @@ nouveau_device_wrap(int fd, int close, struct nouveau_device **pdev)
(dev->drm_version <  0x0100 ||
 dev->drm_version >= 0x0200)) {
 nouveau_device_del(&dev);
+   free(nvdev);
return -EINVAL;
}

@@ -105,6 +106,7 @@ nouveau_device_wrap(int fd, int close, struct nouveau_device **pdev)
ret = nouveau_getparam(dev, NOUVEAU_GETPARAM_AGP_SIZE, );
if (ret) {
 nouveau_device_del(&dev);
+   free(nvdev);
return ret;
}

-- 
1.7.7



[PATCH 2/5] libkms/nouveau.c: Fix a memory leak and cleanup code a bit.

2012-07-13 Thread Johannes Obermayr
---
 libkms/nouveau.c |   20 +++-
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/libkms/nouveau.c b/libkms/nouveau.c
index 0e24a15..4cbca96 100644
--- a/libkms/nouveau.c
+++ b/libkms/nouveau.c
@@ -94,14 +94,18 @@ nouveau_bo_create(struct kms_driver *kms,
if (!bo)
return -ENOMEM;

-   if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) {
+   switch (type) {
+   case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8:
pitch = 64 * 4;
size = 64 * 64 * 4;
-   } else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) {
+   break;
+   case KMS_BO_TYPE_SCANOUT_X8R8G8B8:
pitch = width * 4;
pitch = (pitch + 512 - 1) & ~(512 - 1);
size = pitch * height;
-   } else {
+   break;
+   default:
+   free(bo);
return -EINVAL;
}

@@ -114,8 +118,10 @@ nouveau_bo_create(struct kms_driver *kms,
arg.channel_hint = 0;

 ret = drmCommandWriteRead(kms->fd, DRM_NOUVEAU_GEM_NEW, &arg, sizeof(arg));
-   if (ret)
-   goto err_free;
+   if (ret) {
+   free(bo);
+   return ret;
+   }

bo->base.kms = kms;
bo->base.handle = arg.info.handle;
@@ -126,10 +132,6 @@ nouveau_bo_create(struct kms_driver *kms,
 *out = &bo->base;

return 0;
-
-err_free:
-   free(bo);
-   return ret;
 }

 static int
-- 
1.7.7



[PATCH 1/5] libkms/intel.c: Fix a memory leak and a dead assignment as well as cleanup code a bit.

2012-07-13 Thread Johannes Obermayr
---
 libkms/intel.c |   25 ++---
 1 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/libkms/intel.c b/libkms/intel.c
index 8b8249b..b8ac343 100644
--- a/libkms/intel.c
+++ b/libkms/intel.c
@@ -93,14 +93,18 @@ intel_bo_create(struct kms_driver *kms,
if (!bo)
return -ENOMEM;

-   if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) {
+   switch (type) {
+   case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8:
pitch = 64 * 4;
size = 64 * 64 * 4;
-   } else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) {
+   break;
+   case KMS_BO_TYPE_SCANOUT_X8R8G8B8:
pitch = width * 4;
pitch = (pitch + 512 - 1) & ~(512 - 1);
size = pitch * ((height + 4 - 1) & ~(4 - 1));
-   } else {
+   break;
+   default:
+   free(bo);
return -EINVAL;
}

@@ -108,8 +112,10 @@ intel_bo_create(struct kms_driver *kms,
arg.size = size;

 ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_CREATE, &arg, sizeof(arg));
-   if (ret)
-   goto err_free;
+   if (ret) {
+   free(bo);
+   return ret;
+   }

bo->base.kms = kms;
bo->base.handle = arg.handle;
@@ -124,21 +130,18 @@ intel_bo_create(struct kms_driver *kms,
tile.handle = bo->base.handle;
tile.tiling_mode = I915_TILING_X;
tile.stride = bo->base.pitch;
-
-   ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, sizeof(tile));
 #if 0
+   ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, sizeof(tile));
if (ret) {
kms_bo_destroy(out);
return ret;
}
+#else
+   drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, sizeof(tile));
 #endif
}

return 0;
-
-err_free:
-   free(bo);
-   return ret;
 }

 static int
-- 
1.7.7



libdrm: Fix some warnings reported by clang's scan-build tool

2012-07-13 Thread Johannes Obermayr

Patches 1 to 4 were sent to mesa-dev.


[RFC] dma-fence: dma-buf synchronization (v2)

2012-07-13 Thread Rob Clark
On Fri, Jul 13, 2012 at 4:44 PM, Maarten Lankhorst wrote:
> Hey,
>
> Op 13-07-12 20:52, Rob Clark schreef:
>> On Fri, Jul 13, 2012 at 12:35 PM, Tom Cooksey  wrote:
>>> My other thought is around atomicity. Could this be extended to
>>> (safely) allow for hardware devices which might want to access
>>> multiple buffers simultaneously? I think it probably can with
>>> some tweaks to the interface? An atomic function which does
>>> something like "give me all the fences for all these buffers
>>> and add this fence to each instead/as-well-as"?
>> fwiw, what I'm leaning towards right now is combining dma-fence w/
>> Maarten's idea of dma-buf-mgr (not sure if you saw his patches?).  And
>> let dmabufmgr handle the multi-buffer reservation stuff.  And possibly
>> the read vs write access, although this I'm not 100% sure on... the
>> other option being the concept of read vs write (or
>> exclusive/non-exclusive) fences.
> Agreed, dmabufmgr is meant for reserving multiple buffers without deadlocks.
> The underlying mechanism for synchronization can be dma-fences, it wouldn't
> really change dmabufmgr much.
>> In the current state, the fence is quite simple, and doesn't care
>> *what* it is fencing, which seems advantageous when you get into
>> trying to deal with combinations of devices sharing buffers, some of
>> whom can do hw sync, and some who can't.  So having a bit of
>> partitioning from the code dealing w/ sequencing who can access the
>> buffers when and for what purpose seems like it might not be a bad
>> idea.  Although I'm still working through the different alternatives.
>>
> Yeah, I managed to get nouveau hooked up with generating irqs on
> completion today using an invalid command. It's also no longer a
> performance regression, so software syncing is no longer a problem
> for nouveau. i915 already generates irqs and r600 presumably too.
>
> Monday I'll take a better look at your patch, end of day now. :)

let me send you a slightly updated version.. I fixed locally some
locking fail in attach_fence() and get_fence() that I managed to
introduce when converting from global spinlock to using the
waitqueue's spinlock.

BR,
-R

> ~Maarten
> --
> To unsubscribe from this list: send the line "unsubscribe linux-media" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/7] drm/exynos: Add IOMMU support for mapping gem object

2012-07-13 Thread Inki Dae


> -Original Message-
> From: Prathyush K [mailto:prathyush.k at samsung.com]
> Sent: Wednesday, July 11, 2012 6:40 PM
> To: dri-devel at lists.freedesktop.org
> Cc: prathyush at chromium.org; m.szyprowski at samsung.com;
inki.dae at samsung.com;
> subash.ramaswamy at linaro.org
> Subject: [PATCH 7/7] drm/exynos: Add IOMMU support for mapping gem object
> 
> A gem object is created using dma_alloc_writecombine. Currently, this
> buffer is assumed to be contiguous. If a IOMMU mapping is created for
> DRM, this buffer would be non-contig so the map functions are modified
> to call dma_mmap_writecombine. This works for both contig and non-contig
> buffers.
> 
> Signed-off-by: Prathyush K 
> ---
>  drivers/gpu/drm/exynos/exynos_drm_gem.c |   35 ++----
>  1 files changed, 16 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> index 5c8b683..59240f7 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> @@ -162,17 +162,22 @@ static int exynos_drm_gem_map_pages(struct drm_gem_object *obj,
>  {
>   struct exynos_drm_gem_obj *exynos_gem_obj = to_exynos_gem_obj(obj);
>   struct exynos_drm_gem_buf *buf = exynos_gem_obj->buffer;
> - unsigned long pfn;
> 
>   if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG) {
> + unsigned long pfn;
>   if (!buf->pages)
>   return -EINTR;
> 
>   pfn = page_to_pfn(buf->pages[page_offset++]);
> - } else
> - pfn = (buf->dma_addr >> PAGE_SHIFT) + page_offset;
> -
> - return vm_insert_mixed(vma, f_vaddr, pfn);
> + return vm_insert_mixed(vma, f_vaddr, pfn);
> + } else {

It's not good. EXYNOS_BO_NONCONTIG means physically non-contiguous memory,
otherwise the memory is physically contiguous. But with your patch, when the
IOMMU is used, the memory type of the gem object loses its meaning: the memory
type is EXYNOS_BO_CONTIG yet the memory is physically non-contiguous.

> + int ret;
> + ret = dma_mmap_writecombine(obj->dev->dev, vma, buf->kvaddr,
> + buf->dma_addr, buf->size);
> + if (ret)
> + DRM_ERROR("dma_mmap_writecombine failed\n");
> + return ret;
> + }
>  }
> 
>  static int exynos_drm_gem_get_pages(struct drm_gem_object *obj)
> @@ -503,7 +508,7 @@ static int exynos_drm_gem_mmap_buffer(struct file *filp,
>   struct drm_gem_object *obj = filp->private_data;
>   struct exynos_drm_gem_obj *exynos_gem_obj = to_exynos_gem_obj(obj);
>   struct exynos_drm_gem_buf *buffer;
> - unsigned long pfn, vm_size, usize, uaddr = vma->vm_start;
> + unsigned long vm_size, usize, uaddr = vma->vm_start;
>   int ret;
> 
>   DRM_DEBUG_KMS("%s\n", __FILE__);
> @@ -543,19 +548,11 @@ static int exynos_drm_gem_mmap_buffer(struct file *filp,
>   usize -= PAGE_SIZE;
>   } while (usize > 0);
>   } else {
> - /*
> -  * get page frame number to physical memory to be mapped
> -  * to user space.
> -  */
> - pfn = ((unsigned long)exynos_gem_obj->buffer->dma_addr) >>
> - PAGE_SHIFT;
> -
> - DRM_DEBUG_KMS("pfn = 0x%lx\n", pfn);
> -
> - if (remap_pfn_range(vma, vma->vm_start, pfn, vm_size,
> - vma->vm_page_prot)) {
> - DRM_ERROR("failed to remap pfn range.\n");
> - return -EAGAIN;
> + ret = dma_mmap_writecombine(obj->dev->dev, vma, buffer->kvaddr,

What if we don't use the IOMMU and the memory type of this buffer is non-contiguous?

> + buffer->dma_addr, buffer->size);
> + if (ret) {
> + DRM_ERROR("dma_mmap_writecombine failed\n");
> + return ret;
>   }
>   }
> 
> --
> 1.7.0.4



[PATCH 3/3] drm/radeon: fix const IB handling

2012-07-13 Thread Christian König
Const IBs are executed on the CE, not the CP, so we can't
fence them in the normal way.

So submit them directly before the IB instead, just as
the documentation says.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/r100.c|2 +-
 drivers/gpu/drm/radeon/r600.c|2 +-
 drivers/gpu/drm/radeon/radeon.h  |3 ++-
 drivers/gpu/drm/radeon/radeon_cs.c   |   25 +++--
 drivers/gpu/drm/radeon/radeon_ring.c |   10 +-
 5 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index e0f5ae8..4ee5a74 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -3693,7 +3693,7 @@ int r100_ib_test(struct radeon_device *rdev, struct 
radeon_ring *ring)
ib.ptr[6] = PACKET2(0);
ib.ptr[7] = PACKET2(0);
ib.length_dw = 8;
-   r = radeon_ib_schedule(rdev, );
+   r = radeon_ib_schedule(rdev, , NULL);
if (r) {
radeon_scratch_free(rdev, scratch);
radeon_ib_free(rdev, );
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 3156d25..c2e5069 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2619,7 +2619,7 @@ int r600_ib_test(struct radeon_device *rdev, struct 
radeon_ring *ring)
ib.ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) >> 2);
ib.ptr[2] = 0xDEADBEEF;
ib.length_dw = 3;
-   r = radeon_ib_schedule(rdev, );
+   r = radeon_ib_schedule(rdev, , NULL);
if (r) {
radeon_scratch_free(rdev, scratch);
radeon_ib_free(rdev, );
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 2cb355b..2d7f06c 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -751,7 +751,8 @@ struct si_rlc {
 int radeon_ib_get(struct radeon_device *rdev, int ring,
  struct radeon_ib *ib, unsigned size);
 void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib);
-int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib);
+int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib,
+  struct radeon_ib *const_ib);
 int radeon_ib_pool_init(struct radeon_device *rdev);
 void radeon_ib_pool_fini(struct radeon_device *rdev);
 int radeon_ib_ring_tests(struct radeon_device *rdev);
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
b/drivers/gpu/drm/radeon/radeon_cs.c
index 553da67..d0be5d5 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -354,7 +354,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
}
radeon_cs_sync_rings(parser);
parser->ib.vm_id = 0;
-   r = radeon_ib_schedule(rdev, >ib);
+   r = radeon_ib_schedule(rdev, >ib, NULL);
if (r) {
DRM_ERROR("Failed to schedule IB !\n");
}
@@ -452,25 +452,22 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device 
*rdev,
}
radeon_cs_sync_rings(parser);

+   parser->ib.vm_id = vm->id;
+   /* ib pool is bind at 0 in virtual address space,
+* so gpu_addr is the offset inside the pool bo
+*/
+   parser->ib.gpu_addr = parser->ib.sa_bo->soffset;
+
if ((rdev->family >= CHIP_TAHITI) &&
(parser->chunk_const_ib_idx != -1)) {
parser->const_ib.vm_id = vm->id;
-   /* ib pool is bind at 0 in virtual address space to gpu_addr is 
the
-* offset inside the pool bo
-*/
+   /* same reason as above */
parser->const_ib.gpu_addr = parser->const_ib.sa_bo->soffset;
-   r = radeon_ib_schedule(rdev, >const_ib);
-   if (r)
-   goto out;
+   r = radeon_ib_schedule(rdev, >ib, >const_ib);
+   } else {
+   r = radeon_ib_schedule(rdev, >ib, NULL);
}

-   parser->ib.vm_id = vm->id;
-   /* ib pool is bind at 0 in virtual address space to gpu_addr is the
-* offset inside the pool bo
-*/
-   parser->ib.gpu_addr = parser->ib.sa_bo->soffset;
-   parser->ib.is_const_ib = false;
-   r = radeon_ib_schedule(rdev, >ib);
 out:
if (!r) {
if (vm->fence) {
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c 
b/drivers/gpu/drm/radeon/radeon_ring.c
index 75cbe46..c48c354 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -74,7 +74,8 @@ void radeon_ib_free(struct radeon_device *rdev, struct 
radeon_ib *ib)
radeon_fence_unref(>fence);
 }

-int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib)
+int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib,
+  struct radeon_ib *const_ib)
 {
struct radeon_ring *ring = >ring[ib->ring];
bool need_sync = false;
@@ -105,6 

[PATCH 2/3] drm/radeon: let sa manager block for fences to wait for

2012-07-13 Thread Christian König
Otherwise we can encounter out-of-memory situations under extreme load.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/radeon.h|2 +-
 drivers/gpu/drm/radeon/radeon_sa.c |   72 +---
 2 files changed, 51 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 6715e4c..2cb355b 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -362,7 +362,7 @@ struct radeon_bo_list {
  * alignment).
  */
 struct radeon_sa_manager {
-   spinlock_t  lock;
+   wait_queue_head_t   wq;
struct radeon_bo*bo;
struct list_head*hole;
struct list_headflist[RADEON_NUM_RINGS];
diff --git a/drivers/gpu/drm/radeon/radeon_sa.c 
b/drivers/gpu/drm/radeon/radeon_sa.c
index 81dbb5b..b535fc4 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -54,7 +54,7 @@ int radeon_sa_bo_manager_init(struct radeon_device *rdev,
 {
int i, r;

-   spin_lock_init(_manager->lock);
+   init_waitqueue_head(_manager->wq);
sa_manager->bo = NULL;
sa_manager->size = size;
sa_manager->domain = domain;
@@ -211,6 +211,29 @@ static bool radeon_sa_bo_try_alloc(struct 
radeon_sa_manager *sa_manager,
return false;
 }

+static bool radeon_sa_event(struct radeon_sa_manager *sa_manager,
+   unsigned size, unsigned align)
+{
+   unsigned soffset, eoffset, wasted;
+   int i;
+
+   for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+   if (!list_empty(_manager->flist[i])) {
+   return true;
+   }
+   }
+
+   soffset = radeon_sa_bo_hole_soffset(sa_manager);
+   eoffset = radeon_sa_bo_hole_eoffset(sa_manager);
+   wasted = (align - (soffset % align)) % align;
+
+   if ((eoffset - soffset) >= (size + wasted)) {
+   return true;
+   }
+
+   return false;
+}
+
 static bool radeon_sa_bo_next_hole(struct radeon_sa_manager *sa_manager,
   struct radeon_fence **fences,
   unsigned *tries)
@@ -297,8 +320,8 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
INIT_LIST_HEAD(&(*sa_bo)->olist);
INIT_LIST_HEAD(&(*sa_bo)->flist);

-   spin_lock(_manager->lock);
-   do {
+   spin_lock(_manager->wq.lock);
+   while(1) {
for (i = 0; i < RADEON_NUM_RINGS; ++i) {
fences[i] = NULL;
tries[i] = 0;
@@ -309,30 +332,34 @@ int radeon_sa_bo_new(struct radeon_device *rdev,

if (radeon_sa_bo_try_alloc(sa_manager, *sa_bo,
   size, align)) {
-   spin_unlock(_manager->lock);
+   spin_unlock(_manager->wq.lock);
return 0;
}

/* see if we can skip over some allocations */
} while (radeon_sa_bo_next_hole(sa_manager, fences, tries));

-   if (block) {
-   spin_unlock(_manager->lock);
-   r = radeon_fence_wait_any(rdev, fences, false);
-   spin_lock(_manager->lock);
-   if (r) {
-   /* if we have nothing to wait for we
-  are practically out of memory */
-   if (r == -ENOENT) {
-   r = -ENOMEM;
-   }
-   goto out_err;
-   }
+   if (!block) {
+   break;
+   }
+
+   spin_unlock(_manager->wq.lock);
+   r = radeon_fence_wait_any(rdev, fences, false);
+   spin_lock(_manager->wq.lock);
+   /* if we have nothing to wait for block */
+   if (r == -ENOENT) {
+   r = wait_event_interruptible_locked(
+   sa_manager->wq, 
+   radeon_sa_event(sa_manager, size, align)
+   );
+   }
+   if (r) {
+   goto out_err;
}
-   } while (block);
+   };

 out_err:
-   spin_unlock(_manager->lock);
+   spin_unlock(_manager->wq.lock);
kfree(*sa_bo);
*sa_bo = NULL;
return r;
@@ -348,7 +375,7 @@ void radeon_sa_bo_free(struct radeon_device *rdev, struct 
radeon_sa_bo **sa_bo,
}

sa_manager = (*sa_bo)->manager;
-   spin_lock(_manager->lock);
+   spin_lock(_manager->wq.lock);
if (fence && !radeon_fence_signaled(fence)) {
(*sa_bo)->fence = radeon_fence_ref(fence);
list_add_tail(&(*sa_bo)->flist,
@@ -356,7 +383,8 @@ void 
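
As an aside on the radeon_sa_event() check introduced above, here is a tiny
standalone model (my own sketch, not the kernel function) of the
hole/alignment arithmetic: soffset is rounded up to the requested alignment
and the rounding is counted as wasted space:

```c
#include <assert.h>

/* An allocation of `size` bytes at alignment `align` fits in the hole
 * [soffset, eoffset) iff the hole also covers the bytes "wasted" by
 * rounding soffset up to the alignment boundary. */
static int hole_fits(unsigned int soffset, unsigned int eoffset,
		     unsigned int size, unsigned int align)
{
	unsigned int wasted = (align - (soffset % align)) % align;

	return (eoffset - soffset) >= (size + wasted);
}
```

For example, a 248-byte allocation at 16-byte alignment starting from
soffset 8 needs 8 wasted bytes, so a 248-byte hole is not enough.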

[PATCH 1/3] drm/radeon: return an error if there is nothing to wait for

2012-07-13 Thread Christian König
Otherwise the SA manager's out-of-memory
handling doesn't work.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/radeon_fence.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_fence.c 
b/drivers/gpu/drm/radeon/radeon_fence.c
index 76c5b22..7a181c3 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -331,7 +331,7 @@ static int radeon_fence_wait_any_seq(struct radeon_device 
*rdev,

/* nothing to wait for ? */
if (ring == RADEON_NUM_RINGS) {
-   return 0;
+   return -ENOENT;
}

while (!radeon_fence_any_seq_signaled(rdev, target_seq)) {
-- 
1.7.9.5



[PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function

2012-07-13 Thread Inki Dae


> -Original Message-
> From: Prathyush K [mailto:prathyush.k at samsung.com]
> Sent: Wednesday, July 11, 2012 6:40 PM
> To: dri-devel at lists.freedesktop.org
> Cc: prathyush at chromium.org; m.szyprowski at samsung.com;
inki.dae at samsung.com;
> subash.ramaswamy at linaro.org
> Subject: [PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function
> 
> This patch adds an exynos drm specific implementation of fb_mmap
> which supports mapping a non-contiguous buffer to user space.
> This new function does not assume that the frame buffer is contiguous
> and calls dma_mmap_writecombine for mapping the buffer to user space.
> dma_mmap_writecombine will be able to map a contiguous buffer as well
> as non-contig buffer depending on whether an IOMMU mapping is created
> for drm or not.
> 
> Signed-off-by: Prathyush K 
> ---
>  drivers/gpu/drm/exynos/exynos_drm_fbdev.c |   16 
>  1 files changed, 16 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
> b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
> index d5586cc..b53e638 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
> @@ -46,8 +46,24 @@ struct exynos_drm_fbdev {
>   struct exynos_drm_gem_obj   *exynos_gem_obj;
>  };
> 
> +static int exynos_drm_fb_mmap(struct fb_info *info,
> +   struct vm_area_struct *vma)
> +{
> + if ((vma->vm_end - vma->vm_start) > info->fix.smem_len)
> + return -EINVAL;
> +
> + vma->vm_pgoff = 0;
> + vma->vm_flags |= VM_IO | VM_RESERVED;
> + if (dma_mmap_writecombine(info->device, vma, info->screen_base,
> + info->fix.smem_start, vma->vm_end - vma->vm_start))
> + return -EAGAIN;
> +
> + return 0;
> +}
> +

Ok, it's a good feature. The physically non-contiguous gem buffer allocated
for the console framebuffer does have to be mappable to user space.

Thanks.
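
As a minimal model of the bounds check in the proposed fb_mmap (a standalone
sketch with hypothetical names, not the driver function itself):

```c
#include <assert.h>

/* Reject any mapping request larger than the framebuffer memory,
 * mirroring the (vm_end - vm_start) > info->fix.smem_len check. */
static int fb_mmap_size_ok(unsigned long vm_start, unsigned long vm_end,
			   unsigned long smem_len)
{
	return (vm_end - vm_start) <= smem_len;
}
```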

>  static struct fb_ops exynos_drm_fb_ops = {
>   .owner  = THIS_MODULE,
> + .fb_mmap= exynos_drm_fb_mmap,
>   .fb_fillrect= cfb_fillrect,
>   .fb_copyarea= cfb_copyarea,
>   .fb_imageblit   = cfb_imageblit,
> --
> 1.7.0.4



[PATCH 5/7] drm/exynos: attach drm device with common drm mapping

2012-07-13 Thread Inki Dae


> -Original Message-
> From: Prathyush K [mailto:prathyush.k at samsung.com]
> Sent: Wednesday, July 11, 2012 6:40 PM
> To: dri-devel at lists.freedesktop.org
> Cc: prathyush at chromium.org; m.szyprowski at samsung.com;
inki.dae at samsung.com;
> subash.ramaswamy at linaro.org
> Subject: [PATCH 5/7] drm/exynos: attach drm device with common drm mapping
> 
> This patch sets the common mapping created during drm init in the
> drm device's archdata. The dma_ops of the drm device are set to arm_iommu_ops.
> The common mapping is shared across all the drm devices which ensures
> that any buffer allocated with drm is accessible by drm-fimd or drm-hdmi
> or both.
> 
> Signed-off-by: Prathyush K 
> ---
>  drivers/gpu/drm/exynos/exynos_drm_drv.c |9 +
>  1 files changed, 9 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c
> b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> index c3ad87e..2e40ca8 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> @@ -276,6 +276,15 @@ static struct drm_driver exynos_drm_driver = {
> 
>  static int exynos_drm_platform_probe(struct platform_device *pdev)
>  {
> +#ifdef CONFIG_EXYNOS_IOMMU
> + struct device *dev = >dev;
> +
> + kref_get(_drm_common_mapping->kref);
> + dev->archdata.mapping = exynos_drm_common_mapping;

Ok, so exynos_drm_common_mapping is shared with the other drivers through
dev->archdata.mapping.

> + set_dma_ops(dev, _iommu_ops);
> +
> + DRM_INFO("drm common mapping set to drm device.\n");
> +#endif
>   DRM_DEBUG_DRIVER("%s\n", __FILE__);
> 
>   exynos_drm_driver.num_ioctls = DRM_ARRAY_SIZE(exynos_ioctls);
> --
> 1.7.0.4



[PATCH 3/7] drm/exynos: add IOMMU support to drm fimd

2012-07-13 Thread Inki Dae


> -Original Message-
> From: Prathyush K [mailto:prathyush.k at samsung.com]
> Sent: Wednesday, July 11, 2012 6:40 PM
> To: dri-devel at lists.freedesktop.org
> Cc: prathyush at chromium.org; m.szyprowski at samsung.com;
inki.dae at samsung.com;
> subash.ramaswamy at linaro.org
> Subject: [PATCH 3/7] drm/exynos: add IOMMU support to drm fimd
> 
> This patch adds device tree based IOMMU support to DRM FIMD. During
> probe, the driver searches for a 'sysmmu' field in the device node. The
> sysmmu field points to the corresponding sysmmu device of fimd.
> This sysmmu device is retrieved and set as fimd's sysmmu. The common
> IOMMU mapping created during DRM init is then attached to drm fimd.
> 
> Signed-off-by: Prathyush K 
> ---
>  drivers/gpu/drm/exynos/exynos_drm_fimd.c |   54
> +-
>  1 files changed, 53 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> index 15b5286..6d4048a 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> @@ -19,7 +19,7 @@
>  #include 
>  #include 
>  #include 
> -
> +#include 
>  #include 
>  #include 
> 
> @@ -790,12 +790,56 @@ static int fimd_power_on(struct fimd_context *ctx,
> bool enable)
>  }
> 
>  #ifdef CONFIG_OF
> +
> +#ifdef CONFIG_EXYNOS_IOMMU
> +static int iommu_init(struct device *dev)
> +{
> + struct platform_device *pds;
> + struct device_node *dn, *dns;
> + const __be32 *parp;
> + int ret;
> +
> + dn = dev->of_node;
> + parp = of_get_property(dn, "sysmmu", NULL);
> + if (parp == NULL) {
> + dev_err(dev, "failed to find sysmmu property\n");
> + return -EINVAL;
> + }
> + dns = of_find_node_by_phandle(be32_to_cpup(parp));
> + if (dns == NULL) {
> + dev_err(dev, "failed to find sysmmu node\n");
> + return -EINVAL;
> + }
> + pds = of_find_device_by_node(dns);
> + if (pds == NULL) {
> + dev_err(dev, "failed to find sysmmu platform device\n");
> + return -EINVAL;
> + }
> +
> + platform_set_sysmmu(>dev, dev);
> + dev->dma_parms = kzalloc(sizeof(*dev->dma_parms), GFP_KERNEL);
> + if (!dev->dma_parms) {
> + dev_err(dev, "failed to allocate dma parms\n");
> + return -ENOMEM;
> + }
> + dma_set_max_seg_size(dev, 0xu);
> +
> + ret = arm_iommu_attach_device(dev, exynos_drm_common_mapping);

Where is exynos_drm_common_mapping declared? You could get at this through the
exynos_drm_private structure instead.


> + if (ret) {
> + dev_err(dev, "failed to attach device\n");
> + return ret;
> + }
> + return 0;
> +}
> +#endif
> +

With your patch, the IOMMU feature can be used only with device tree. I think
the IOMMU feature should be usable in the non-DT case as well.


>  static struct exynos_drm_fimd_pdata *drm_fimd_dt_parse_pdata(struct
> device *dev)
>  {
>   struct device_node *np = dev->of_node;
>   struct device_node *disp_np;
>   struct exynos_drm_fimd_pdata *pd;
>   u32 data[4];
> + int ret;
> 
>   pd = kzalloc(sizeof(*pd), GFP_KERNEL);
>   if (!pd) {
> @@ -803,6 +847,14 @@ static struct exynos_drm_fimd_pdata
> *drm_fimd_dt_parse_pdata(struct device *dev)
>   return ERR_PTR(-ENOMEM);
>   }
> 
> +#ifdef CONFIG_EXYNOS_IOMMU

And please avoid such #ifdefs in device drivers.

> + ret = iommu_init(dev);
> + if (ret) {
> + dev_err(dev, "failed to initialize iommu\n");
> + return ERR_PTR(ret);
> + }
> +#endif
> +
>   if (of_get_property(np, "samsung,fimd-vidout-rgb", NULL))
>   pd->vidcon0 |= VIDCON0_VIDOUT_RGB | VIDCON0_PNRMODE_RGB;
>   if (of_get_property(np, "samsung,fimd-vidout-tv", NULL))
> --
> 1.7.0.4



[PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3

2012-07-13 Thread Christian König
On 13.07.2012 14:27, Alex Deucher wrote:
> On Fri, Jul 13, 2012 at 5:09 AM, Christian König
>  wrote:
>> On 12.07.2012 18:36, Alex Deucher wrote:
>>> On Thu, Jul 12, 2012 at 12:12 PM, Christian König
>>>  wrote:
 Before emitting any indirect buffer, emit the offset of the next
 valid ring content, if any. This allows code that wants to resume
 the ring to do so right after the IB that caused the GPU lockup.

 v2: use scratch registers instead of storing it into memory
 v3: skip over the surface sync for ni and si as well

 Signed-off-by: Jerome Glisse 
 Signed-off-by: Christian König 
 ---
drivers/gpu/drm/radeon/evergreen.c   |8 +++-
drivers/gpu/drm/radeon/ni.c  |   11 ++-
drivers/gpu/drm/radeon/r600.c|   18 --
drivers/gpu/drm/radeon/radeon.h  |1 +
drivers/gpu/drm/radeon/radeon_ring.c |4 
drivers/gpu/drm/radeon/rv770.c   |4 +++-
drivers/gpu/drm/radeon/si.c  |   22 +++---
7 files changed, 60 insertions(+), 8 deletions(-)

 diff --git a/drivers/gpu/drm/radeon/evergreen.c
 b/drivers/gpu/drm/radeon/evergreen.c
 index f39b900..40de347 100644
 --- a/drivers/gpu/drm/radeon/evergreen.c
 +++ b/drivers/gpu/drm/radeon/evergreen.c
 @@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct
 radeon_device *rdev, struct radeon_ib *ib)
   /* set to DX10/11 mode */
   radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
   radeon_ring_write(ring, 1);
 -   /* FIXME: implement */
 +
 +   if (ring->rptr_save_reg) {
 +   uint32_t next_rptr = ring->wptr + 2 + 4;
 +   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
 +   radeon_ring_write(ring, next_rptr);
 +   }
>>> On r600 and newer please use SET_CONFIG_REG rather than Packet0.
>> Why? Please note that it's on purpose that this doesn't interfere with the
>> top/bottom of pipe handling and the draw commands, e.g. the register write
>> isn't associated with drawing but instead just marks the beginning of
>> parsing the IB.
> Packet0s have been semi-deprecated since r600.  They still work,
> but the CP guys recommend using the appropriate packet3 whenever
> possible.
Ok, that makes sense.

Any further comments on the patchset, or can I send that to Dave for 
merging now?

Cheers,
Christian.
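
To make the next_rptr offset arithmetic discussed above concrete, here is a
small standalone sketch (illustrative helpers, not driver code; the constants
are copied from the quoted patch — 2 dwords for the register write plus 4 for
the INDIRECT_BUFFER packet, with the cayman path accounting for 8 more):

```c
#include <assert.h>

/* The value written to the scratch register is the wptr a ring-resume
 * path should skip to once the marker write and the INDIRECT_BUFFER
 * packet have been parsed. */
static unsigned int next_rptr_evergreen(unsigned int wptr)
{
	return wptr + 2 + 4;
}

/* The cayman path in the patch adds 8 additional dwords. */
static unsigned int next_rptr_cayman(unsigned int wptr)
{
	return wptr + 2 + 4 + 8;
}
```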



[PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM

2012-07-13 Thread Inki Dae


> -Original Message-
> From: Prathyush K [mailto:prathyush.k at samsung.com]
> Sent: Wednesday, July 11, 2012 6:40 PM
> To: dri-devel at lists.freedesktop.org
> Cc: prathyush at chromium.org; m.szyprowski at samsung.com;
inki.dae at samsung.com;
> subash.ramaswamy at linaro.org
> Subject: [PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM
> 
> The dma-mapping framework needs an IOMMU mapping to be created for the
> device which allocates/maps/frees the non-contig buffer. In the DRM
> framework, a gem buffer is created by the DRM virtual device and not
> directly by any of the physical devices (FIMD, HDMI etc). Each gem object
> can be set as a framebuffer to one or many of the drm devices. So a gem
> object cannot be allocated for any one device. All the DRM devices should
> be able to access this buffer.
>

Using a unified IOMMU table is good, so I agree with your opinion, but we
haven't decided yet whether to use the dma-mapping API. Right now the
dma-mapping API has one issue: when the IOMMU is used through it, we can't
map a physically contiguous memory region through the IOMMU. There are cases
where we need exactly that, because we may sometimes use the MFC (hardware
video codec) with a secure zone such as ARM TrustZone, which requires a
physically contiguous memory region.

Thanks,
Inki Dae

> The proposed method is to create a common IOMMU mapping during drm init.
> This
> mapping is then attached to all of the drm devices including the drm
> device.
> [PATCH 1/7] drm/exynos: create common IOMMU mapping for DRM
> 
> During the probe of drm fimd, the driver retrieves a 'sysmmu' field
> in the device node for fimd. If such a field exists, the driver retrieves
> the
> platform device of the sysmmu device. This sysmmu is set as the sysmmu
> for fimd. The common mapping created is then attached to fimd.
> This needs to be done for all the other devices (hdmi, vidi etc).
> [PATCH 2/7] ARM: EXYNOS5: add sysmmu field to fimd device node
> [PATCH 3/7] drm/exynos: add IOMMU support to drm fimd
> 
> During DRM's probe which happens last, the common mapping is set to its
> archdata
> and iommu ops are set as its dma ops. This requires a modification in the
> dma-mapping framework so that the iommu ops can be visible to all drivers.
> [PATCH 4/7] ARM: dma-mapping: rename and export iommu_ops
> [PATCH 5/7] drm/exynos: attach drm device with common drm mapping
> 
> Currently allocation and free use the iommu framework by calling
> dma_alloc_writecombine and dma_free_writecombine respectively.
> For mapping the buffers to user space, the mmap functions assume that
> the buffer is contiguous. This is modified by calling
> dma_mmap_writecombine.
> [PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function
> [PATCH 7/7] Add IOMMU support for mapping gem object
> 
> The device tree based patches are based on Leela's patch which was posted
> last week for adding DT support to DRM FIMD. The patch to add sysmmu
> field is for reference only and will be posted to the device tree
> mailing list. Same with the rename and export iommu_ops patch.
> 
> These patches are tested on Exynos5250 SMDK board and tested with modetest
> from libdrm tests.
> 
> Prathyush K (7):
>   drm/exynos: create common IOMMU mapping for DRM
>   ARM: EXYNOS5: add sysmmu field to fimd device node
>   drm/exynos: add IOMMU support to drm fimd
>   ARM: dma-mapping: rename and export iommu_ops
>   drm/exynos: attach drm device with common drm mapping
>   drm/exynos: Add exynos drm specific fb_mmap function
>   drm/exynos: Add IOMMU support for mapping gem object
> 
>  arch/arm/boot/dts/exynos5250.dtsi |1 +
>  arch/arm/include/asm/dma-mapping.h|1 +
>  arch/arm/mm/dma-mapping.c |5 ++-
>  drivers/gpu/drm/exynos/exynos_drm_core.c  |3 ++
>  drivers/gpu/drm/exynos/exynos_drm_drv.c   |   30 
>  drivers/gpu/drm/exynos/exynos_drm_drv.h   |   10 +
>  drivers/gpu/drm/exynos/exynos_drm_fbdev.c |   16 
>  drivers/gpu/drm/exynos/exynos_drm_fimd.c  |   54
> -
>  drivers/gpu/drm/exynos/exynos_drm_gem.c   |   35 --
>  9 files changed, 133 insertions(+), 22 deletions(-)



[PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM

2012-07-13 Thread Subash Patel
On 07/13/2012 12:09 PM, Inki Dae wrote:
>
>> -Original Message-
>> From: Prathyush K [mailto:prathyush.k at samsung.com]
>> Sent: Wednesday, July 11, 2012 6:40 PM
>> To: dri-devel at lists.freedesktop.org
>> Cc: prathyush at chromium.org; m.szyprowski at samsung.com;
> inki.dae at samsung.com;
>> subash.ramaswamy at linaro.org
>> Subject: [PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM
>>
>> The dma-mapping framework needs an IOMMU mapping to be created for the
>> device which allocates/maps/frees the non-contig buffer. In the DRM
>> framework, a gem buffer is created by the DRM virtual device and not
>> directly by any of the physical devices (FIMD, HDMI etc). Each gem object
>> can be set as a framebuffer to one or many of the drm devices. So a gem
>> object cannot be allocated for any one device. All the DRM devices should
>> be able to access this buffer.
>>
> Using a unified IOMMU table is good, so I agree with your opinion, but we
> haven't decided yet whether to use the dma-mapping API. Right now the
> dma-mapping API has one issue: when the IOMMU is used through it, we can't
> map a physically contiguous memory region through the IOMMU. There are cases
> where we need exactly that, because we may sometimes use the MFC (hardware
> video codec) with a secure zone such as ARM TrustZone, which requires a
> physically contiguous memory region.
>
> Thanks,
> Inki Dae
I agree. In the mainline code, as of now only arm_dma_ops supports
allocating from CMA. But in arm_iommu_alloc_attrs() there is no way to know
whether the device declared a contiguous memory range, because we don't
store that cookie in the device during dma_declare_contiguous(). So is it
advisable to store such information (like the mapping in the IOMMU
operations) in device.archdata?

Regards,
Subash
>
>> The proposed method is to create a common IOMMU mapping during drm init.
>> This
>> mapping is then attached to all of the drm devices including the drm
>> device.
>> [PATCH 1/7] drm/exynos: create common IOMMU mapping for DRM
>>
>> During the probe of drm fimd, the driver retrieves a 'sysmmu' field
>> in the device node for fimd. If such a field exists, the driver retrieves
>> the
>> platform device of the sysmmu device. This sysmmu is set as the sysmmu
>> for fimd. The common mapping created is then attached to fimd.
>> This needs to be done for all the other devices (hdmi, vidi etc).
>> [PATCH 2/7] ARM: EXYNOS5: add sysmmu field to fimd device node
>> [PATCH 3/7] drm/exynos: add IOMMU support to drm fimd
>>
>> During DRM's probe which happens last, the common mapping is set to its
>> archdata
>> and iommu ops are set as its dma ops. This requires a modification in the
>> dma-mapping framework so that the iommu ops can be visible to all drivers.
>> [PATCH 4/7] ARM: dma-mapping: rename and export iommu_ops
>> [PATCH 5/7] drm/exynos: attach drm device with common drm mapping
>>
>> Currently allocation and free use the iommu framework by calling
>> dma_alloc_writecombine and dma_free_writecombine respectively.
>> For mapping the buffers to user space, the mmap functions assume that
>> the buffer is contiguous. This is modified by calling
>> dma_mmap_writecombine.
>> [PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function
>> [PATCH 7/7] Add IOMMU support for mapping gem object
>>
>> The device tree based patches are based on Leela's patch which was posted
>> last week for adding DT support to DRM FIMD. The patch to add sysmmu
>> field is for reference only and will be posted to the device tree
>> mailing list. Same with the rename and export iommu_ops patch.
>>
>> These patches are tested on Exynos5250 SMDK board and tested with modetest
>> from libdrm tests.
>>
>> Prathyush K (7):
>>drm/exynos: create common IOMMU mapping for DRM
>>ARM: EXYNOS5: add sysmmu field to fimd device node
>>drm/exynos: add IOMMU support to drm fimd
>>ARM: dma-mapping: rename and export iommu_ops
>>drm/exynos: attach drm device with common drm mapping
>>drm/exynos: Add exynos drm specific fb_mmap function
>>drm/exynos: Add IOMMU support for mapping gem object
>>
>>   arch/arm/boot/dts/exynos5250.dtsi |1 +
>>   arch/arm/include/asm/dma-mapping.h|1 +
>>   arch/arm/mm/dma-mapping.c |5 ++-
>>   drivers/gpu/drm/exynos/exynos_drm_core.c  |3 ++
>>   drivers/gpu/drm/exynos/exynos_drm_drv.c   |   30 
>>   drivers/gpu/drm/exynos/exynos_drm_drv.h   |   10 +
>>   drivers/gpu/drm/exynos/exynos_drm_fbdev.c |   16 
>>   drivers/gpu/drm/exynos/exynos_drm_fimd.c  |   54
>> -
>>   drivers/gpu/drm/exynos/exynos_drm_gem.c   |   35 --
>>   9 files changed, 133 insertions(+), 22 deletions(-)



[PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3

2012-07-13 Thread Christian König
On 12.07.2012 18:36, Alex Deucher wrote:
> On Thu, Jul 12, 2012 at 12:12 PM, Christian König
>  wrote:
>> Before emitting any indirect buffer, emit the offset of the next
>> valid ring content, if any. This allows code that wants to resume
>> the ring to do so right after the IB that caused the GPU lockup.
>>
>> v2: use scratch registers instead of storing it into memory
>> v3: skip over the surface sync for ni and si as well
>>
>> Signed-off-by: Jerome Glisse 
>> Signed-off-by: Christian König 
>> ---
>>   drivers/gpu/drm/radeon/evergreen.c   |8 +++-
>>   drivers/gpu/drm/radeon/ni.c  |   11 ++-
>>   drivers/gpu/drm/radeon/r600.c|   18 --
>>   drivers/gpu/drm/radeon/radeon.h  |1 +
>>   drivers/gpu/drm/radeon/radeon_ring.c |4 
>>   drivers/gpu/drm/radeon/rv770.c   |4 +++-
>>   drivers/gpu/drm/radeon/si.c  |   22 +++---
>>   7 files changed, 60 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/evergreen.c 
>> b/drivers/gpu/drm/radeon/evergreen.c
>> index f39b900..40de347 100644
>> --- a/drivers/gpu/drm/radeon/evergreen.c
>> +++ b/drivers/gpu/drm/radeon/evergreen.c
>> @@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct radeon_device 
>> *rdev, struct radeon_ib *ib)
>>  /* set to DX10/11 mode */
>>  radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
>>  radeon_ring_write(ring, 1);
>> -   /* FIXME: implement */
>> +
>> +   if (ring->rptr_save_reg) {
>> +   uint32_t next_rptr = ring->wptr + 2 + 4;
>> +   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
>> +   radeon_ring_write(ring, next_rptr);
>> +   }
> On r600 and newer please use SET_CONFIG_REG rather than Packet0.
Why? Please note that it's on purpose that this doesn't interfere with 
the top/bottom of pipe handling and the draw commands, e.g. the register 
write isn't associated with drawing but instead just marks the beginning 
of parsing the IB.

Christian.
>
> Alex
>
>> +
>>  radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
>>  radeon_ring_write(ring,
>>   #ifdef __BIG_ENDIAN
>> diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
>> index f2afefb..5b7ce2c 100644
>> --- a/drivers/gpu/drm/radeon/ni.c
>> +++ b/drivers/gpu/drm/radeon/ni.c
>> @@ -855,6 +855,13 @@ void cayman_ring_ib_execute(struct radeon_device *rdev, 
>> struct radeon_ib *ib)
>>  /* set to DX10/11 mode */
>>  radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
>>  radeon_ring_write(ring, 1);
>> +
>> +   if (ring->rptr_save_reg) {
>> +   uint32_t next_rptr = ring->wptr + 2 + 4 + 8;
>> +   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
>> +   radeon_ring_write(ring, next_rptr);
>> +   }
>> +
>>  radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
>>  radeon_ring_write(ring,
>>   #ifdef __BIG_ENDIAN
>> @@ -981,8 +988,10 @@ static int cayman_cp_start(struct radeon_device *rdev)
>>
>>   static void cayman_cp_fini(struct radeon_device *rdev)
>>   {
>> +   struct radeon_ring *ring = >ring[RADEON_RING_TYPE_GFX_INDEX];
>>  cayman_cp_enable(rdev, false);
>> -   radeon_ring_fini(rdev, >ring[RADEON_RING_TYPE_GFX_INDEX]);
>> +   radeon_ring_fini(rdev, ring);
>> +   radeon_scratch_free(rdev, ring->rptr_save_reg);
>>   }
>>
>>   int cayman_cp_resume(struct radeon_device *rdev)
>> diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
>> index c808fa9..74fca15 100644
>> --- a/drivers/gpu/drm/radeon/r600.c
>> +++ b/drivers/gpu/drm/radeon/r600.c
>> @@ -2155,18 +2155,27 @@ int r600_cp_resume(struct radeon_device *rdev)
>>   void r600_ring_init(struct radeon_device *rdev, struct radeon_ring *ring, 
>> unsigned ring_size)
>>   {
>>  u32 rb_bufsz;
>> +   int r;
>>
>>  /* Align ring size */
>>  rb_bufsz = drm_order(ring_size / 8);
>>  ring_size = (1 << (rb_bufsz + 1)) * 4;
>>  ring->ring_size = ring_size;
>>  ring->align_mask = 16 - 1;
>> +
>> +   r = radeon_scratch_get(rdev, >rptr_save_reg);
>> +   if (r) {
>> +   DRM_ERROR("failed to get scratch reg for rptr save (%d).\n", 
>> r);
>> +   ring->rptr_save_reg = 0;
>> +   }
>>   }
>>
>>   void r600_cp_fini(struct radeon_device *rdev)
>>   {
>> +   struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
>>  r600_cp_stop(rdev);
>> -   radeon_ring_fini(rdev, &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]);
>> +   radeon_ring_fini(rdev, ring);
>> +   radeon_scratch_free(rdev, ring->rptr_save_reg);
>>   }
>>
>>
>> @@ -2568,7 +2577,12 @@ void r600_ring_ib_execute(struct radeon_device *rdev, 
>> struct radeon_ib *ib)
>>   {
>>  struct radeon_ring *ring = &rdev->ring[ib->ring];
>>
>> -   /* FIXME: implement */
>> +   if (ring->rptr_save_reg) {
>> +

[PATCH 3/3] drm/radeon: fix const IB handling

2012-07-13 Thread Jerome Glisse
On Fri, Jul 13, 2012 at 10:08 AM, Christian König
 wrote:
> Const IBs are executed on the CE not the CP, so we can't
> fence them in the normal way.
>
> So submit them directly before the IB instead, just as
> the documentation says.
>
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/radeon/r100.c|2 +-
>  drivers/gpu/drm/radeon/r600.c|2 +-
>  drivers/gpu/drm/radeon/radeon.h  |3 ++-
>  drivers/gpu/drm/radeon/radeon_cs.c   |   25 +++--
>  drivers/gpu/drm/radeon/radeon_ring.c |   10 +-
>  5 files changed, 24 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
> index e0f5ae8..4ee5a74 100644
> --- a/drivers/gpu/drm/radeon/r100.c
> +++ b/drivers/gpu/drm/radeon/r100.c
> @@ -3693,7 +3693,7 @@ int r100_ib_test(struct radeon_device *rdev, struct 
> radeon_ring *ring)
> ib.ptr[6] = PACKET2(0);
> ib.ptr[7] = PACKET2(0);
> ib.length_dw = 8;
> -   r = radeon_ib_schedule(rdev, &ib);
> +   r = radeon_ib_schedule(rdev, &ib, NULL);
> if (r) {
> radeon_scratch_free(rdev, scratch);
> radeon_ib_free(rdev, &ib);
> diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
> index 3156d25..c2e5069 100644
> --- a/drivers/gpu/drm/radeon/r600.c
> +++ b/drivers/gpu/drm/radeon/r600.c
> @@ -2619,7 +2619,7 @@ int r600_ib_test(struct radeon_device *rdev, struct 
> radeon_ring *ring)
> ib.ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) >> 2);
> ib.ptr[2] = 0xDEADBEEF;
> ib.length_dw = 3;
> -   r = radeon_ib_schedule(rdev, &ib);
> +   r = radeon_ib_schedule(rdev, &ib, NULL);
> if (r) {
> radeon_scratch_free(rdev, scratch);
> radeon_ib_free(rdev, &ib);
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index 2cb355b..2d7f06c 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -751,7 +751,8 @@ struct si_rlc {
>  int radeon_ib_get(struct radeon_device *rdev, int ring,
>   struct radeon_ib *ib, unsigned size);
>  void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib);
> -int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib);
> +int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib,
> +  struct radeon_ib *const_ib);
>  int radeon_ib_pool_init(struct radeon_device *rdev);
>  void radeon_ib_pool_fini(struct radeon_device *rdev);
>  int radeon_ib_ring_tests(struct radeon_device *rdev);
> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
> b/drivers/gpu/drm/radeon/radeon_cs.c
> index 553da67..d0be5d5 100644
> --- a/drivers/gpu/drm/radeon/radeon_cs.c
> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
> @@ -354,7 +354,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
> }
> radeon_cs_sync_rings(parser);
> parser->ib.vm_id = 0;
> -   r = radeon_ib_schedule(rdev, &parser->ib);
> +   r = radeon_ib_schedule(rdev, &parser->ib, NULL);
> if (r) {
> DRM_ERROR("Failed to schedule IB !\n");
> }
> @@ -452,25 +452,22 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device 
> *rdev,
> }
> radeon_cs_sync_rings(parser);
>
> +   parser->ib.vm_id = vm->id;
> +   /* ib pool is bind at 0 in virtual address space,
> +* so gpu_addr is the offset inside the pool bo
> +*/
> +   parser->ib.gpu_addr = parser->ib.sa_bo->soffset;
> +
> if ((rdev->family >= CHIP_TAHITI) &&
> (parser->chunk_const_ib_idx != -1)) {
> parser->const_ib.vm_id = vm->id;
> -   /* ib pool is bind at 0 in virtual address space to gpu_addr 
> is the
> -* offset inside the pool bo
> -*/
> +   /* same reason as above */

Don't remove the comment; the code might move and then the comment above
might no longer apply. It's better to duplicate the comment than to try to
cross-reference comments across the file.

> parser->const_ib.gpu_addr = parser->const_ib.sa_bo->soffset;
> -   r = radeon_ib_schedule(rdev, &parser->const_ib);
> -   if (r)
> -   goto out;
> +   r = radeon_ib_schedule(rdev, &parser->ib, &parser->const_ib);
> +   } else {
> +   r = radeon_ib_schedule(rdev, &parser->ib, NULL);
> }
>
> -   parser->ib.vm_id = vm->id;
> -   /* ib pool is bind at 0 in virtual address space to gpu_addr is the
> -* offset inside the pool bo
> -*/
> -   parser->ib.gpu_addr = parser->ib.sa_bo->soffset;
> -   parser->ib.is_const_ib = false;
> -   r = radeon_ib_schedule(rdev, &parser->ib);
>  out:
> if (!r) {
> if (vm->fence) {
> diff --git a/drivers/gpu/drm/radeon/radeon_ring.c 
> b/drivers/gpu/drm/radeon/radeon_ring.c
> index 75cbe46..c48c354 100644
> --- a/drivers/gpu/drm/radeon/radeon_ring.c
> +++ 

[RFC] dma-fence: dma-buf synchronization (v2)

2012-07-13 Thread Rob Clark
From: Rob Clark 

A dma-fence can be attached to a buffer which is being filled or consumed
by hw, to allow userspace to pass the buffer without waiting to another
device.  For example, userspace can call page_flip ioctl to display the
next frame of graphics after kicking the GPU but while the GPU is still
rendering.  The display device sharing the buffer with the GPU would
attach a callback to get notified when the GPU's rendering-complete IRQ
fires, to update the scan-out address of the display, without having to
wake up userspace.

A dma-fence is a transient, one-shot deal.  It is allocated and attached
to a dma-buf's list of fences.  When the one that attached it is done
with the pending operation, it can signal the fence, removing it from the
dma-buf's list of fences:

  + dma_buf_attach_fence()
  + dma_fence_signal()

Other drivers can access the current fence on the dma-buf (if any),
which increments the fence's refcnt:

  + dma_buf_get_fence()
  + dma_fence_put()

The one pending on the fence can add an async callback (and optionally
cancel it.. for example, to recover from GPU hangs):

  + dma_fence_add_callback()
  + dma_fence_cancel_callback()

Or wait synchronously (optionally with timeout or from atomic context):

  + dma_fence_wait()

A default software-only implementation is provided, which can be used
by drivers attaching a fence to a buffer when they have no other means
for hw sync.  But a memory-backed fence is also envisioned, because it
is common that GPUs can write to, or poll on, some memory location for
synchronization.  For example:

  fence = dma_buf_get_fence(dmabuf);
  if (fence->ops == &mem_dma_fence_ops) {
dma_buf *fence_buf;
mem_dma_fence_get_buf(fence, &fence_buf, &offset);
... tell the hw the memory location to wait on ...
  } else {
/* fall-back to sw sync */
dma_fence_add_callback(fence, my_cb);
  }

The memory location is itself backed by dma-buf, to simplify mapping
to the device's address space, an idea borrowed from Maarten Lankhorst.

NOTE: the memory location fence is not implemented yet, the above is
just for explaining how it would work.

On SoC platforms, if some other hw mechanism is provided for synchronizing
between IP blocks, it could be supported as an alternate implementation
with its own fence ops in a similar way.

The other non-sw implementations would wrap the add/cancel_callback and
wait fence ops, so that they can keep track of whether a device that does
not support hw sync is waiting on the fence; in that case they should
arrange to call dma_fence_signal() at some point after the condition has
changed, to notify other devices waiting on the fence.  If there are no sw
waiters, this can be skipped to avoid waking the CPU unnecessarily.

The intention is to provide a userspace interface (presumably via eventfd)
later, to be used in conjunction with dma-buf's mmap support for sw access
to buffers (or for userspace apps that would prefer to do their own
synchronization).

v1: original
v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided
that dma-fence didn't need to care about the sw->hw signaling path
(it can be handled same as sw->sw case), and therefore the fence->ops
can be simplified and more handled in the core.  So remove the signal,
add_callback, cancel_callback, and wait ops, and replace with a simple
enable_signaling() op which can be used to inform a fence supporting
hw->hw signaling that one or more devices which do not support hw
signaling are waiting (and therefore it should enable an irq or do
whatever is necessary in order that the CPU is notified when the
fence is passed).
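The v2 scheme could look roughly like the sketch below; `sk_fence`, `arm_irq`, and the helper names are made up for illustration, not the proposed interface:

```c
/* Sketch of the simplified v2 ops: the core tracks waiters, and a fence
 * that supports hw->hw signaling only provides enable_signaling(), which
 * the core calls lazily the first time a sw waiter appears. */
struct sk_fence;

struct sk_fence_ops {
    /* arrange for the fence to be signaled (e.g. enable an irq) */
    int (*enable_signaling)(struct sk_fence *f);
};

struct sk_fence {
    const struct sk_fence_ops *ops;
    int signaled;
    int signaling_enabled;
};

static int irqs_armed;  /* test instrumentation: counts enable calls */

static int arm_irq(struct sk_fence *f)
{
    (void)f;
    irqs_armed++;
    return 0;
}

static const struct sk_fence_ops hw_ops = { .enable_signaling = arm_irq };

/* core-side helper: enable hw signaling at most once, and only if the
 * fence hasn't already signaled */
static int sk_fence_prepare_wait(struct sk_fence *f)
{
    if (f->signaled || f->signaling_enabled)
        return 0;
    f->signaling_enabled = 1;
    return f->ops->enable_signaling(f);
}
```

The point of the design is visible in `sk_fence_prepare_wait()`: a purely hw->hw chain never calls `enable_signaling`, so the CPU is only woken when a sw waiter actually exists.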
---
 drivers/base/Makefile |2 +-
 drivers/base/dma-buf.c|3 +
 drivers/base/dma-fence.c  |  364 +
 include/linux/dma-buf.h   |2 +
 include/linux/dma-fence.h |  128 
 5 files changed, 498 insertions(+), 1 deletion(-)
 create mode 100644 drivers/base/dma-fence.c
 create mode 100644 include/linux/dma-fence.h

diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 5aa2d70..6e9f217 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -10,7 +10,7 @@ obj-$(CONFIG_CMA) += dma-contiguous.o
 obj-y  += power/
 obj-$(CONFIG_HAS_DMA)  += dma-mapping.o
 obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o
-obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o
+obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o dma-fence.o
 obj-$(CONFIG_ISA)  += isa.o
 obj-$(CONFIG_FW_LOADER)+= firmware_class.o
 obj-$(CONFIG_NUMA) += node.o
diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 24e88fe..b053236 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -39,6 +39,8 @@ static int dma_buf_release(struct inode *inode, struct file 
*file)

dmabuf = file->private_data;

+   WARN_ON(!list_empty(&dmabuf->fence_list));
+
dmabuf->ops->release(dmabuf);
kfree(dmabuf);
return 

[PATCH 3/3] drm/radeon: fix const IB handling

2012-07-13 Thread Tom Stellard
On Fri, Jul 13, 2012 at 04:08:15PM +0200, Christian König wrote:
> Const IBs are executed on the CE not the CP, so we can't
> fence them in the normal way.
> 
> So submit them directly before the IB instead, just as
> the documentation says.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/radeon/r100.c|2 +-
>  drivers/gpu/drm/radeon/r600.c|2 +-
>  drivers/gpu/drm/radeon/radeon.h  |3 ++-
>  drivers/gpu/drm/radeon/radeon_cs.c   |   25 +++--
>  drivers/gpu/drm/radeon/radeon_ring.c |   10 +-
>  5 files changed, 24 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
> index e0f5ae8..4ee5a74 100644
> --- a/drivers/gpu/drm/radeon/r100.c
> +++ b/drivers/gpu/drm/radeon/r100.c
> @@ -3693,7 +3693,7 @@ int r100_ib_test(struct radeon_device *rdev, struct 
> radeon_ring *ring)
>   ib.ptr[6] = PACKET2(0);
>   ib.ptr[7] = PACKET2(0);
>   ib.length_dw = 8;
> - r = radeon_ib_schedule(rdev, &ib);
> + r = radeon_ib_schedule(rdev, &ib, NULL);
>   if (r) {
>   radeon_scratch_free(rdev, scratch);
>   radeon_ib_free(rdev, &ib);
> diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
> index 3156d25..c2e5069 100644
> --- a/drivers/gpu/drm/radeon/r600.c
> +++ b/drivers/gpu/drm/radeon/r600.c
> @@ -2619,7 +2619,7 @@ int r600_ib_test(struct radeon_device *rdev, struct 
> radeon_ring *ring)
>   ib.ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) >> 2);
>   ib.ptr[2] = 0xDEADBEEF;
>   ib.length_dw = 3;
> - r = radeon_ib_schedule(rdev, &ib);
> + r = radeon_ib_schedule(rdev, &ib, NULL);
>   if (r) {
>   radeon_scratch_free(rdev, scratch);
>   radeon_ib_free(rdev, &ib);
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index 2cb355b..2d7f06c 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -751,7 +751,8 @@ struct si_rlc {
>  int radeon_ib_get(struct radeon_device *rdev, int ring,
> struct radeon_ib *ib, unsigned size);
>  void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib);
> -int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib);
> +int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib,
> +struct radeon_ib *const_ib);
>  int radeon_ib_pool_init(struct radeon_device *rdev);
>  void radeon_ib_pool_fini(struct radeon_device *rdev);
>  int radeon_ib_ring_tests(struct radeon_device *rdev);
> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
> b/drivers/gpu/drm/radeon/radeon_cs.c
> index 553da67..d0be5d5 100644
> --- a/drivers/gpu/drm/radeon/radeon_cs.c
> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
> @@ -354,7 +354,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
>   }
>   radeon_cs_sync_rings(parser);
>   parser->ib.vm_id = 0;
> - r = radeon_ib_schedule(rdev, &parser->ib);
> + r = radeon_ib_schedule(rdev, &parser->ib, NULL);
>   if (r) {
>   DRM_ERROR("Failed to schedule IB !\n");
>   }
> @@ -452,25 +452,22 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device 
> *rdev,
>   }
>   radeon_cs_sync_rings(parser);
>  
> + parser->ib.vm_id = vm->id;
> + /* ib pool is bind at 0 in virtual address space,
> +  * so gpu_addr is the offset inside the pool bo
> +  */
> + parser->ib.gpu_addr = parser->ib.sa_bo->soffset;
> +
>   if ((rdev->family >= CHIP_TAHITI) &&
>   (parser->chunk_const_ib_idx != -1)) {
>   parser->const_ib.vm_id = vm->id;
> - /* ib pool is bind at 0 in virtual address space to gpu_addr is 
> the
> -  * offset inside the pool bo
> -  */
> + /* same reason as above */
>   parser->const_ib.gpu_addr = parser->const_ib.sa_bo->soffset;
> - r = radeon_ib_schedule(rdev, &parser->const_ib);
> - if (r)
> - goto out;
> + r = radeon_ib_schedule(rdev, &parser->ib, &parser->const_ib);
> + } else {
> + r = radeon_ib_schedule(rdev, &parser->ib, NULL);
>   }
>  
> - parser->ib.vm_id = vm->id;
> - /* ib pool is bind at 0 in virtual address space to gpu_addr is the
> -  * offset inside the pool bo
> -  */
> - parser->ib.gpu_addr = parser->ib.sa_bo->soffset;
> - parser->ib.is_const_ib = false;
> - r = radeon_ib_schedule(rdev, &parser->ib);
>  out:
>   if (!r) {
>   if (vm->fence) {
> diff --git a/drivers/gpu/drm/radeon/radeon_ring.c 
> b/drivers/gpu/drm/radeon/radeon_ring.c
> index 75cbe46..c48c354 100644
> --- a/drivers/gpu/drm/radeon/radeon_ring.c
> +++ b/drivers/gpu/drm/radeon/radeon_ring.c
> @@ -74,7 +74,8 @@ void radeon_ib_free(struct radeon_device *rdev, struct 
> radeon_ib *ib)
>   radeon_fence_unref(&ib->fence);
>  }
>

> -int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib)
> +int radeon_ib_schedule(struct 

[PATCH 2/3] drm/radeon: let sa manager block for fences to wait for

2012-07-13 Thread Tom Stellard
On Fri, Jul 13, 2012 at 04:08:14PM +0200, Christian König wrote:
> Otherwise we can encounter out of memory situations under extreme load.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/radeon/radeon.h|2 +-
>  drivers/gpu/drm/radeon/radeon_sa.c |   72 
> +---
>  2 files changed, 51 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index 6715e4c..2cb355b 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -362,7 +362,7 @@ struct radeon_bo_list {
>   * alignment).
>   */
>  struct radeon_sa_manager {
> - spinlock_t  lock;
> + wait_queue_head_t   wq;
>   struct radeon_bo*bo;
>   struct list_head*hole;
>   struct list_headflist[RADEON_NUM_RINGS];
> diff --git a/drivers/gpu/drm/radeon/radeon_sa.c 
> b/drivers/gpu/drm/radeon/radeon_sa.c
> index 81dbb5b..b535fc4 100644
> --- a/drivers/gpu/drm/radeon/radeon_sa.c
> +++ b/drivers/gpu/drm/radeon/radeon_sa.c
> @@ -54,7 +54,7 @@ int radeon_sa_bo_manager_init(struct radeon_device *rdev,
>  {
>   int i, r;
>  
> - spin_lock_init(&sa_manager->lock);
> + init_waitqueue_head(&sa_manager->wq);
>   sa_manager->bo = NULL;
>   sa_manager->size = size;
>   sa_manager->domain = domain;
> @@ -211,6 +211,29 @@ static bool radeon_sa_bo_try_alloc(struct 
> radeon_sa_manager *sa_manager,
>   return false;
>  }
>
> +static bool radeon_sa_event(struct radeon_sa_manager *sa_manager,
> + unsigned size, unsigned align)
> +{
> + unsigned soffset, eoffset, wasted;
> + int i;
> +
> + for (i = 0; i < RADEON_NUM_RINGS; ++i) {
> + if (!list_empty(&sa_manager->flist[i])) {
> + return true;
> + }
> + }
> +
> + soffset = radeon_sa_bo_hole_soffset(sa_manager);
> + eoffset = radeon_sa_bo_hole_eoffset(sa_manager);
> + wasted = (align - (soffset % align)) % align;
> +
> + if ((eoffset - soffset) >= (size + wasted)) {
> + return true;
> + }
> +
> + return false;
> +}
> +

This new function should come with a comment, per the new documentation
rules.

>  static bool radeon_sa_bo_next_hole(struct radeon_sa_manager *sa_manager,
>  struct radeon_fence **fences,
>  unsigned *tries)
> @@ -297,8 +320,8 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
>   INIT_LIST_HEAD(&(*sa_bo)->olist);
>   INIT_LIST_HEAD(&(*sa_bo)->flist);
>  
> - spin_lock(&sa_manager->lock);
> - do {
> + spin_lock(&sa_manager->wq.lock);
> + while(1) {
>   for (i = 0; i < RADEON_NUM_RINGS; ++i) {
>   fences[i] = NULL;
>   tries[i] = 0;
> @@ -309,30 +332,34 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
>  
>   if (radeon_sa_bo_try_alloc(sa_manager, *sa_bo,
>  size, align)) {
> - spin_unlock(&sa_manager->lock);
> + spin_unlock(&sa_manager->wq.lock);
>   return 0;
>   }
>  
>   /* see if we can skip over some allocations */
>   } while (radeon_sa_bo_next_hole(sa_manager, fences, tries));
>  
> - if (block) {
> - spin_unlock(&sa_manager->lock);
> - r = radeon_fence_wait_any(rdev, fences, false);
> - spin_lock(&sa_manager->lock);
> - if (r) {
> - /* if we have nothing to wait for we
> -are practically out of memory */
> - if (r == -ENOENT) {
> - r = -ENOMEM;
> - }
> - goto out_err;
> - }
> + if (!block) {
> + break;
> + }
> +
> + spin_unlock(&sa_manager->wq.lock);
> + r = radeon_fence_wait_any(rdev, fences, false);
> + spin_lock(&sa_manager->wq.lock);
> + /* if we have nothing to wait for block */
> + if (r == -ENOENT) {
> + r = wait_event_interruptible_locked(
> + sa_manager->wq, 
> + radeon_sa_event(sa_manager, size, align)
> + );
> + }
> + if (r) {
> + goto out_err;
>   }
> - } while (block);
> + };
>  
>  out_err:
> - spin_unlock(&sa_manager->lock);
> + spin_unlock(&sa_manager->wq.lock);
>   kfree(*sa_bo);
>   *sa_bo = NULL;
>   return r;
> @@ -348,7 +375,7 @@ void radeon_sa_bo_free(struct radeon_device *rdev, struct 
> radeon_sa_bo **sa_bo,
>   }
>  
>   sa_manager = (*sa_bo)->manager;
> - 

[PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3

2012-07-13 Thread Alex Deucher
On Fri, Jul 13, 2012 at 9:46 AM, Christian König
 wrote:
> On 13.07.2012 14:27, Alex Deucher wrote:
>>
>> On Fri, Jul 13, 2012 at 5:09 AM, Christian König
>>  wrote:
>>>
>>> On 12.07.2012 18:36, Alex Deucher wrote:

 On Thu, Jul 12, 2012 at 12:12 PM, Christian König
  wrote:
>
> Before emitting any indirect buffer, emit the offset of the next
> valid ring content, if any. This allows code that wants to resume
> the ring to do so right after the IB that caused the GPU lockup.
>
> v2: use scratch registers instead of storing it into memory
> v3: skip over the surface sync for ni and si as well
>
> Signed-off-by: Jerome Glisse 
> Signed-off-by: Christian König 
> ---
>drivers/gpu/drm/radeon/evergreen.c   |8 +++-
>drivers/gpu/drm/radeon/ni.c  |   11 ++-
>drivers/gpu/drm/radeon/r600.c|   18 --
>drivers/gpu/drm/radeon/radeon.h  |1 +
>drivers/gpu/drm/radeon/radeon_ring.c |4 
>drivers/gpu/drm/radeon/rv770.c   |4 +++-
>drivers/gpu/drm/radeon/si.c  |   22 +++---
>7 files changed, 60 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/evergreen.c
> b/drivers/gpu/drm/radeon/evergreen.c
> index f39b900..40de347 100644
> --- a/drivers/gpu/drm/radeon/evergreen.c
> +++ b/drivers/gpu/drm/radeon/evergreen.c
> @@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct
> radeon_device *rdev, struct radeon_ib *ib)
>   /* set to DX10/11 mode */
>   radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
>   radeon_ring_write(ring, 1);
> -   /* FIXME: implement */
> +
> +   if (ring->rptr_save_reg) {
> +   uint32_t next_rptr = ring->wptr + 2 + 4;
> +   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg,
> 0));
> +   radeon_ring_write(ring, next_rptr);
> +   }

 On r600 and newer please use SET_CONFIG_REG rather than Packet0.
>>>
>>> Why? Please note that it's on purpose that this doesn't interfere with
>>> the
>>> top/bottom of pipe handling and the draw commands, e.g. the register
>>> write
>>> isn't associated with drawing but instead just marks the beginning of
>>> parsing the IB.
>>
>> Packet0s have been semi-deprecated since r600.  They still work,
>> but the CP guys recommend using the appropriate packet3 whenever
>> possible.
>
> Ok, that makes sense.
>
> Any further comments on the patchset, or can I send that to Dave for merging
> now?

Other than that, it looks good to me.  For the series:

Reviewed-by: Alex Deucher 


[PATCH] drm/radeon: fix bo creation retry path

2012-07-13 Thread Christian König
On 13.07.2012 00:23, j.glisse at gmail.com wrote:
> From: Jerome Glisse 
>
> Retry label was at wrong place in function leading to memory
> leak.
>
> Cc: 
> Signed-off-by: Jerome Glisse 
Reviewed-by: Christian König 
> ---
>   drivers/gpu/drm/radeon/radeon_object.c |3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_object.c 
> b/drivers/gpu/drm/radeon/radeon_object.c
> index 6ecb200..f71e472 100644
> --- a/drivers/gpu/drm/radeon/radeon_object.c
> +++ b/drivers/gpu/drm/radeon/radeon_object.c
> @@ -138,7 +138,6 @@ int radeon_bo_create(struct radeon_device *rdev,
>   acc_size = ttm_bo_dma_acc_size(&rdev->mman.bdev, size,
>  sizeof(struct radeon_bo));
>   
> -retry:
>   bo = kzalloc(sizeof(struct radeon_bo), GFP_KERNEL);
>   if (bo == NULL)
>   return -ENOMEM;
> @@ -152,6 +151,8 @@ retry:
>   bo->surface_reg = -1;
>   INIT_LIST_HEAD(&bo->list);
>   INIT_LIST_HEAD(&bo->va);
> +
> +retry:
>   radeon_ttm_placement_from_domain(bo, domain);
>   /* Kernel allocation are uninterruptible */
>   down_read(&rdev->pm.mclk_lock);

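The bug being fixed reduces to a toy control-flow example: with the `retry` label above the allocation, every placement retry re-runs the `kzalloc` and leaks the previous buffer; moving the label below it retries only the placement step. The `alloc_count` / `try_place()` names below are hypothetical test stand-ins, not radeon code.

```c
/* Toy model of the fixed control flow: allocate once, then loop only
 * over the placement step on failure.  Not radeon code. */
static int alloc_count;          /* how many times the "kzalloc" ran */
static int place_failures_left;  /* force this many placement failures */

static int try_place(void)
{
    return place_failures_left-- > 0 ? -1 : 0;
}

static int create_bo(void)
{
    alloc_count++;               /* stands in for kzalloc(sizeof(*bo)) */
retry:
    if (try_place()) {
        /* fall back (e.g. to a less demanding domain) and retry
         * the placement only, not the allocation */
        goto retry;
    }
    return 0;
}
```

With the label above `alloc_count++`, the same scenario would have run the allocation once per retry, which is exactly the leak the patch removes.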



[PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3

2012-07-13 Thread Alex Deucher
On Fri, Jul 13, 2012 at 5:09 AM, Christian König
 wrote:
> On 12.07.2012 18:36, Alex Deucher wrote:
>>
>> On Thu, Jul 12, 2012 at 12:12 PM, Christian König
>>  wrote:
>>>
>>> Before emitting any indirect buffer, emit the offset of the next
>>> valid ring content, if any. This allows code that wants to resume
>>> the ring to do so right after the IB that caused the GPU lockup.
>>>
>>> v2: use scratch registers instead of storing it into memory
>>> v3: skip over the surface sync for ni and si as well
>>>
>>> Signed-off-by: Jerome Glisse 
>>> Signed-off-by: Christian König 
>>> ---
>>>   drivers/gpu/drm/radeon/evergreen.c   |8 +++-
>>>   drivers/gpu/drm/radeon/ni.c  |   11 ++-
>>>   drivers/gpu/drm/radeon/r600.c|   18 --
>>>   drivers/gpu/drm/radeon/radeon.h  |1 +
>>>   drivers/gpu/drm/radeon/radeon_ring.c |4 
>>>   drivers/gpu/drm/radeon/rv770.c   |4 +++-
>>>   drivers/gpu/drm/radeon/si.c  |   22 +++---
>>>   7 files changed, 60 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/radeon/evergreen.c
>>> b/drivers/gpu/drm/radeon/evergreen.c
>>> index f39b900..40de347 100644
>>> --- a/drivers/gpu/drm/radeon/evergreen.c
>>> +++ b/drivers/gpu/drm/radeon/evergreen.c
>>> @@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct
>>> radeon_device *rdev, struct radeon_ib *ib)
>>>  /* set to DX10/11 mode */
>>>  radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
>>>  radeon_ring_write(ring, 1);
>>> -   /* FIXME: implement */
>>> +
>>> +   if (ring->rptr_save_reg) {
>>> +   uint32_t next_rptr = ring->wptr + 2 + 4;
>>> +   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
>>> +   radeon_ring_write(ring, next_rptr);
>>> +   }
>>
>> On r600 and newer please use SET_CONFIG_REG rather than Packet0.
>
> Why? Please note that it's on purpose that this doesn't interfere with the
> top/bottom of pipe handling and the draw commands, e.g. the register write
> isn't associated with drawing but instead just marks the beginning of
> parsing the IB.

Packet0s have been semi-deprecated since r600.  They still work,
but the CP guys recommend using the appropriate packet3 whenever
possible.

Alex
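The difference between the two encodings under discussion can be sketched in a few lines of userspace C. The packet layouts and the `0x68` / `0x8000` constants follow the r600 family headers (`r600d.h`), but treat the whole thing as an illustrative assumption rather than authoritative hardware documentation:

```c
/* Sketch of the two register-write encodings: a type-0 packet writes a
 * register directly from its header, while the type-3 SET_CONFIG_REG
 * packet carries an opcode plus a register index relative to the config
 * register block.  Layouts/constants assumed from r600d.h. */
#include <stdint.h>

#define PACKET0_HDR(reg, n)  ((0u << 30) | (((reg) >> 2) & 0xFFFF) | (((uint32_t)(n) & 0x3FFF) << 16))
#define PACKET3_HDR(op, n)   ((3u << 30) | (((uint32_t)(op) & 0xFF) << 8) | (((uint32_t)(n) & 0x3FFF) << 16))
#define PACKET3_SET_CONFIG_REG        0x68
#define PACKET3_SET_CONFIG_REG_OFFSET 0x8000

/* type-0 variant: header + value (2 dwords) */
static int emit_pkt0(uint32_t *out, uint32_t reg, uint32_t val)
{
    out[0] = PACKET0_HDR(reg, 0);
    out[1] = val;
    return 2;
}

/* type-3 SET_CONFIG_REG variant: header + reg index + value (3 dwords) */
static int emit_set_config_reg(uint32_t *out, uint32_t reg, uint32_t val)
{
    out[0] = PACKET3_HDR(PACKET3_SET_CONFIG_REG, 1);
    out[1] = (reg - PACKET3_SET_CONFIG_REG_OFFSET) >> 2;
    out[2] = val;
    return 3;
}
```

The type-3 form costs one extra dword per write, which is why the patch's `next_rptr` calculation would change slightly if it switched encodings.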


[PATCH] drm/radeon: fix bo creation retry path

2012-07-13 Thread Michel Dänzer
On Don, 2012-07-12 at 18:23 -0400, j.glisse at gmail.com wrote: 
> From: Jerome Glisse 
> 
> Retry label was at wrong place in function leading to memory
> leak.
> 
> Cc: 
> Signed-off-by: Jerome Glisse 
> ---
>  drivers/gpu/drm/radeon/radeon_object.c |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_object.c 
> b/drivers/gpu/drm/radeon/radeon_object.c
> index 6ecb200..f71e472 100644
> --- a/drivers/gpu/drm/radeon/radeon_object.c
> +++ b/drivers/gpu/drm/radeon/radeon_object.c
> @@ -138,7 +138,6 @@ int radeon_bo_create(struct radeon_device *rdev,
>   acc_size = ttm_bo_dma_acc_size(&rdev->mman.bdev, size,
>  sizeof(struct radeon_bo));
>  
> -retry:
>   bo = kzalloc(sizeof(struct radeon_bo), GFP_KERNEL);
>   if (bo == NULL)
>   return -ENOMEM;
> @@ -152,6 +151,8 @@ retry:
>   bo->surface_reg = -1;
>   INIT_LIST_HEAD(&bo->list);
>   INIT_LIST_HEAD(&bo->va);
> +
> +retry:
>   radeon_ttm_placement_from_domain(bo, domain);
>   /* Kernel allocation are uninterruptible */
>   down_read(&rdev->pm.mclk_lock);

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer


[PATCH] Documentation: DocBook DRM framework documentation

2012-07-13 Thread Laurent Pinchart
Signed-off-by: Laurent Pinchart 
---
 Documentation/DocBook/drm.tmpl | 2835 +++-
 1 files changed, 2226 insertions(+), 609 deletions(-)

Hi everybody,

Here's the DRM kernel framework documentation previously posted to the
dri-devel mailing list. The documentation has been reworked, converted to
DocBook and merged with the existing DocBook DRM documentation stub. The
result doesn't cover the whole DRM API but should hopefully be good enough
for a start.

I've done my best to follow a natural flow starting at initialization and
covering the major DRM internal topics. As I'm not a native English speaker
I'm not totally happy with the result, so if anyone wants to edit the text
please feel free to do so. Review will as usual be appreciated, and acks will
be even more welcome (I've been working on this document for longer than I
feel comfortable with).

diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl
index 196b8b9..44a2c66 100644
--- a/Documentation/DocBook/drm.tmpl
+++ b/Documentation/DocBook/drm.tmpl
@@ -6,11 +6,36 @@
   
 Linux DRM Developer's Guide

+
+  
+   Jesse
+   Barnes
+   Initial version
+   
+ Intel Corporation
+ 
+   jesse.barnes at intel.com
+ 
+   
+  
+  
+   Laurent
+   Pinchart
+   Driver internals
+   
+ Ideas on board SPRL
+ 
+   laurent.pinchart at ideasonboard.com
+ 
+   
+  
+
+
 
   2008-2009
-  
-   Intel Corporation (Jesse Barnes jesse.barnes at intel.com)
-  
+  2012
+  Intel Corporation
+  Laurent Pinchart
 

 
@@ -20,6 +45,17 @@
the kernel source COPYING file.
   
 
+
+
+  
+  
+   1.0
+   2012-07-13
+   LP
+   Added extensive documentation about driver internals.
+   
+  
+
   

 
@@ -72,342 +108,361 @@
  submission & fencing, suspend/resume support, and DMA
   services.
 
-
-  The core of every DRM driver is struct drm_driver.  Drivers
-  typically statically initialize a drm_driver structure,
-  then pass it to drm_init() at load time.
-

   

   
-Driver initialization
-
-  Before calling the DRM initialization routines, the driver must
-  first create and fill out a struct drm_driver structure.
-
-
-  static struct drm_driver driver = {
-   /* Don't use MTRRs here; the Xserver or userspace app should
-* deal with them for Intel hardware.
-*/
-   .driver_features =
-   DRIVER_USE_AGP | DRIVER_REQUIRE_AGP |
-   DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | DRIVER_MODESET,
-   .load = i915_driver_load,
-   .unload = i915_driver_unload,
-   .firstopen = i915_driver_firstopen,
-   .lastclose = i915_driver_lastclose,
-   .preclose = i915_driver_preclose,
-   .save = i915_save,
-   .restore = i915_restore,
-   .device_is_agp = i915_driver_device_is_agp,
-   .get_vblank_counter = i915_get_vblank_counter,
-   .enable_vblank = i915_enable_vblank,
-   .disable_vblank = i915_disable_vblank,
-   .irq_preinstall = i915_driver_irq_preinstall,
-   .irq_postinstall = i915_driver_irq_postinstall,
-   .irq_uninstall = i915_driver_irq_uninstall,
-   .irq_handler = i915_driver_irq_handler,
-   .reclaim_buffers = drm_core_reclaim_buffers,
-   .get_map_ofs = drm_core_get_map_ofs,
-   .get_reg_ofs = drm_core_get_reg_ofs,
-   .fb_probe = intelfb_probe,
-   .fb_remove = intelfb_remove,
-   .fb_resize = intelfb_resize,
-   .master_create = i915_master_create,
-   .master_destroy = i915_master_destroy,
-#if defined(CONFIG_DEBUG_FS)
-   .debugfs_init = i915_debugfs_init,
-   .debugfs_cleanup = i915_debugfs_cleanup,
-#endif
-   .gem_init_object = i915_gem_init_object,
-   .gem_free_object = i915_gem_free_object,
-   .gem_vm_ops = i915_gem_vm_ops,
-   .ioctls = i915_ioctls,
-   .fops = {
-   .owner = THIS_MODULE,
-   .open = drm_open,
-   .release = drm_release,
-   .ioctl = drm_ioctl,
-   .mmap = drm_mmap,
-   .poll = drm_poll,
-   .fasync = drm_fasync,
-#ifdef CONFIG_COMPAT
-   .compat_ioctl = i915_compat_ioctl,
-#endif
-   .llseek = noop_llseek,
-   },
-   .pci_driver = {
-   .name = DRIVER_NAME,
-   .id_table = pciidlist,
-   .probe = probe,
-   .remove = __devexit_p(drm_cleanup_pci),
-   },
-   .name = DRIVER_NAME,
-   .desc = DRIVER_DESC,
-   .date = DRIVER_DATE,
-   .major = DRIVER_MAJOR,
-   .minor = DRIVER_MINOR,
-   .patchlevel = DRIVER_PATCHLEVEL,
-  };
-
-
-  In the example above, taken from the i915 DRM driver, the driver
-  sets several flags indicating what core features 



RE: [PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM

2012-07-13 Thread Inki Dae


 -Original Message-
 From: Prathyush K [mailto:prathyus...@samsung.com]
 Sent: Wednesday, July 11, 2012 6:40 PM
 To: dri-devel@lists.freedesktop.org
 Cc: prathy...@chromium.org; m.szyprow...@samsung.com;
inki@samsung.com;
 subash.ramasw...@linaro.org
 Subject: [PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM
 
 The dma-mapping framework needs a IOMMU mapping to be created for the
 device which allocates/maps/frees the non-contig buffer. In the DRM
 framework, a gem buffer is created by the DRM virtual device and not
 directly by any of the physical devices (FIMD, HDMI etc). Each gem object
 can be set as a framebuffer to one or many of the drm devices. So a gem
 object cannot be allocated for any one device. All the DRM devices should
 be able to access this buffer.


It's good to use a unified IOMMU table, so I agree with your opinion, but we
haven't decided yet whether to use the dma-mapping API or not. The dma-mapping
API currently has one issue: when using the IOMMU through the dma-mapping API,
we can't use a physically contiguous memory region with the IOMMU. But there
are cases where we need exactly that, because we may sometimes use the MFC
(HW video codec) with a secure zone such as ARM TrustZone, and that requires
a physically contiguous memory region.

Thanks,
Inki Dae

 The proposed method is to create a common IOMMU mapping during drm init.
 This
 mapping is then attached to all of the drm devices including the drm
 device.
 [PATCH 1/7] drm/exynos: create common IOMMU mapping for DRM
 
 During the probe of drm fimd, the driver retrieves a 'sysmmu' field
 in the device node for fimd. If such a field exists, the driver retrieves
 the
 platform device of the sysmmu device. This sysmmu is set as the sysmmu
 for fimd. The common mapping created is then attached to fimd.
 This needs to be done for all the other devices (hdmi, vidi etc).
 [PATCH 2/7] ARM: EXYNOS5: add sysmmu field to fimd device node
 [PATCH 3/7] drm/exynos: add IOMMU support to drm fimd
 
 During DRM's probe which happens last, the common mapping is set to its
 archdata
 and iommu ops are set as its dma ops. This requires a modification in the
 dma-mapping framework so that the iommu ops can be visible to all drivers.
 [PATCH 4/7] ARM: dma-mapping: rename and export iommu_ops
 [PATCH 5/7] drm/exynos: attach drm device with common drm mapping
 
 Currently allocation and free use the iommu framework by calling
 dma_alloc_writecombine and dma_free_writecombine respectively.
 For mapping the buffers to user space, the mmap functions assume that
 the buffer is contiguous. This is modified by calling
 dma_mmap_writecombine.
 [PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function
 [PATCH 7/7] Add IOMMU support for mapping gem object
 
 The device tree based patches are based on Leela's patch which was posted
 last week for adding DT support to DRM FIMD. The patch to add sysmmu
 field is for reference only and will be posted to the device tree
 mailing list. Same with the rename and export iommu_ops patch.
 
 These patches are tested on Exynos5250 SMDK board and tested with modetest
 from libdrm tests.
 
 Prathyush K (7):
   drm/exynos: create common IOMMU mapping for DRM
   ARM: EXYNOS5: add sysmmu field to fimd device node
   drm/exynos: add IOMMU support to drm fimd
   ARM: dma-mapping: rename and export iommu_ops
   drm/exynos: attach drm device with common drm mapping
   drm/exynos: Add exynos drm specific fb_mmap function
   drm/exynos: Add IOMMU support for mapping gem object
 
  arch/arm/boot/dts/exynos5250.dtsi |1 +
  arch/arm/include/asm/dma-mapping.h|1 +
  arch/arm/mm/dma-mapping.c |5 ++-
  drivers/gpu/drm/exynos/exynos_drm_core.c  |3 ++
  drivers/gpu/drm/exynos/exynos_drm_drv.c   |   30 
  drivers/gpu/drm/exynos/exynos_drm_drv.h   |   10 +
  drivers/gpu/drm/exynos/exynos_drm_fbdev.c |   16 
  drivers/gpu/drm/exynos/exynos_drm_fimd.c  |   54
 -
  drivers/gpu/drm/exynos/exynos_drm_gem.c   |   35 --
  9 files changed, 133 insertions(+), 22 deletions(-)



RE: [PATCH 3/7] drm/exynos: add IOMMU support to drm fimd

2012-07-13 Thread Inki Dae


 -Original Message-
 From: Prathyush K [mailto:prathyus...@samsung.com]
 Sent: Wednesday, July 11, 2012 6:40 PM
 To: dri-devel@lists.freedesktop.org
 Cc: prathy...@chromium.org; m.szyprow...@samsung.com;
inki@samsung.com;
 subash.ramasw...@linaro.org
 Subject: [PATCH 3/7] drm/exynos: add IOMMU support to drm fimd
 
 This patch adds device tree based IOMMU support to DRM FIMD. During
 probe, the driver searches for a 'sysmmu' field in the device node. The
 sysmmu field points to the corresponding sysmmu device of fimd.
 This sysmmu device is retrieved and set as fimd's sysmmu. The common
 IOMMU mapping created during DRM init is then attached to drm fimd.
 
 Signed-off-by: Prathyush K prathyus...@samsung.com
 ---
  drivers/gpu/drm/exynos/exynos_drm_fimd.c |   54
 +-
  1 files changed, 53 insertions(+), 1 deletions(-)
 
 diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
 b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
 index 15b5286..6d4048a 100644
 --- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
 +++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
 @@ -19,7 +19,7 @@
  #include <linux/clk.h>
  #include <linux/pm_runtime.h>
  #include <linux/of.h>
 -
 +#include <linux/of_platform.h>
  #include <drm/exynos_drm.h>
  #include <plat/regs-fb-v4.h>
 
 @@ -790,12 +790,56 @@ static int fimd_power_on(struct fimd_context *ctx,
 bool enable)
  }
 
  #ifdef CONFIG_OF
 +
 +#ifdef CONFIG_EXYNOS_IOMMU
 +static int iommu_init(struct device *dev)
 +{
 + struct platform_device *pds;
 + struct device_node *dn, *dns;
 + const __be32 *parp;
 + int ret;
 +
 + dn = dev->of_node;
 + parp = of_get_property(dn, "sysmmu", NULL);
 + if (parp == NULL) {
 + dev_err(dev, "failed to find sysmmu property\n");
 + return -EINVAL;
 + }
 + dns = of_find_node_by_phandle(be32_to_cpup(parp));
 + if (dns == NULL) {
 + dev_err(dev, "failed to find sysmmu node\n");
 + return -EINVAL;
 + }
 + pds = of_find_device_by_node(dns);
 + if (pds == NULL) {
 + dev_err(dev, "failed to find sysmmu platform device\n");
 + return -EINVAL;
 + }
 +
 + platform_set_sysmmu(&pds->dev, dev);
 + dev->dma_parms = kzalloc(sizeof(*dev->dma_parms), GFP_KERNEL);
 + if (!dev->dma_parms) {
 + dev_err(dev, "failed to allocate dma parms\n");
 + return -ENOMEM;
 + }
 + dma_set_max_seg_size(dev, 0xffffffffu);
 +
 + ret = arm_iommu_attach_device(dev, exynos_drm_common_mapping);

Where is exynos_drm_common_mapping declared? You could get this pointer
through the exynos_drm_private structure.


 + if (ret) {
 + dev_err(dev, "failed to attach device\n");
 + return ret;
 + }
 + return 0;
 +}
 +#endif
 +

With your patch, the IOMMU feature can be used only with device tree. I think
the IOMMU feature should be usable in the non-DT case as well.


  static struct exynos_drm_fimd_pdata *drm_fimd_dt_parse_pdata(struct
 device *dev)
  {
   struct device_node *np = dev->of_node;
   struct device_node *disp_np;
   struct exynos_drm_fimd_pdata *pd;
   u32 data[4];
 + int ret;
 
   pd = kzalloc(sizeof(*pd), GFP_KERNEL);
   if (!pd) {
 @@ -803,6 +847,14 @@ static struct exynos_drm_fimd_pdata
 *drm_fimd_dt_parse_pdata(struct device *dev)
   return ERR_PTR(-ENOMEM);
   }
 
 +#ifdef CONFIG_EXYNOS_IOMMU

And please avoid such #ifdefs in a device driver.

 + ret = iommu_init(dev);
 + if (ret) {
 + dev_err(dev, "failed to initialize iommu\n");
 + return ERR_PTR(ret);
 + }
 +#endif
 +
   if (of_get_property(np, "samsung,fimd-vidout-rgb", NULL))
   pd->vidcon0 |= VIDCON0_VIDOUT_RGB | VIDCON0_PNRMODE_RGB;
   if (of_get_property(np, "samsung,fimd-vidout-tv", NULL))
 --
 1.7.0.4



RE: [PATCH 5/7] drm/exynos: attach drm device with common drm mapping

2012-07-13 Thread Inki Dae


 -Original Message-
 From: Prathyush K [mailto:prathyus...@samsung.com]
 Sent: Wednesday, July 11, 2012 6:40 PM
 To: dri-devel@lists.freedesktop.org
 Cc: prathy...@chromium.org; m.szyprow...@samsung.com;
inki@samsung.com;
 subash.ramasw...@linaro.org
 Subject: [PATCH 5/7] drm/exynos: attach drm device with common drm mapping
 
 This patch sets the common mapping created during drm init to the
 drm device's archdata. The dma_ops of the drm device are set to arm_iommu_ops.
 The common mapping is shared across all the drm devices which ensures
 that any buffer allocated with drm is accessible by drm-fimd or drm-hdmi
 or both.
 
 Signed-off-by: Prathyush K prathyus...@samsung.com
 ---
  drivers/gpu/drm/exynos/exynos_drm_drv.c |9 +
  1 files changed, 9 insertions(+), 0 deletions(-)
 
 diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c
 b/drivers/gpu/drm/exynos/exynos_drm_drv.c
 index c3ad87e..2e40ca8 100644
 --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
 +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
 @@ -276,6 +276,15 @@ static struct drm_driver exynos_drm_driver = {
 
  static int exynos_drm_platform_probe(struct platform_device *pdev)
  {
 +#ifdef CONFIG_EXYNOS_IOMMU
 + struct device *dev = &pdev->dev;
 +
 + kref_get(&exynos_drm_common_mapping->kref);
 + dev->archdata.mapping = exynos_drm_common_mapping;

OK, so exynos_drm_common_mapping is shared with the drivers through
dev->archdata.mapping.

 + set_dma_ops(dev, &arm_iommu_ops);
 +
 + DRM_INFO("drm common mapping set to drm device.\n");
 +#endif
   DRM_DEBUG_DRIVER("%s\n", __FILE__);
 
   exynos_drm_driver.num_ioctls = DRM_ARRAY_SIZE(exynos_ioctls);
 --
 1.7.0.4



RE: [PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function

2012-07-13 Thread Inki Dae


 -Original Message-
 From: Prathyush K [mailto:prathyus...@samsung.com]
 Sent: Wednesday, July 11, 2012 6:40 PM
 To: dri-devel@lists.freedesktop.org
 Cc: prathy...@chromium.org; m.szyprow...@samsung.com;
inki@samsung.com;
 subash.ramasw...@linaro.org
 Subject: [PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function
 
 This patch adds an exynos drm specific implementation of fb_mmap
 which supports mapping a non-contiguous buffer to user space.
 This new function does not assume that the frame buffer is contiguous
 and calls dma_mmap_writecombine for mapping the buffer to user space.
 dma_mmap_writecombine will be able to map a contiguous buffer as well
 as non-contig buffer depending on whether an IOMMU mapping is created
 for drm or not.
 
 Signed-off-by: Prathyush K prathyus...@samsung.com
 ---
  drivers/gpu/drm/exynos/exynos_drm_fbdev.c |   16 
  1 files changed, 16 insertions(+), 0 deletions(-)
 
 diff --git a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
 b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
 index d5586cc..b53e638 100644
 --- a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
 +++ b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
 @@ -46,8 +46,24 @@ struct exynos_drm_fbdev {
   struct exynos_drm_gem_obj   *exynos_gem_obj;
  };
 
 +static int exynos_drm_fb_mmap(struct fb_info *info,
 +   struct vm_area_struct *vma)
 +{
 + if ((vma->vm_end - vma->vm_start) > info->fix.smem_len)
 + return -EINVAL;
 +
 + vma->vm_pgoff = 0;
 + vma->vm_flags |= VM_IO | VM_RESERVED;
 + if (dma_mmap_writecombine(info->device, vma, info->screen_base,
 + info->fix.smem_start, vma->vm_end - vma->vm_start))
 + return -EAGAIN;
 +
 + return 0;
 +}
 +

OK, it's a good feature. The physically non-contiguous gem buffer allocated
for the console framebuffer does indeed have to be mappable to user space.

Thanks.

  static struct fb_ops exynos_drm_fb_ops = {
   .owner  = THIS_MODULE,
 + .fb_mmap= exynos_drm_fb_mmap,
   .fb_fillrect= cfb_fillrect,
   .fb_copyarea= cfb_copyarea,
   .fb_imageblit   = cfb_imageblit,
 --
 1.7.0.4
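The first thing the fb_mmap hunk above does is validate the requested mapping size against the framebuffer. A tiny userspace model of just that bounds check (a sketch; the names are invented and -1 stands in for the kernel's -EINVAL):

```c
#include <stddef.h>

/* Model of the size check in the quoted exynos_drm_fb_mmap() hunk:
 * a request may not map more bytes than the framebuffer's smem_len. */
static int fb_mmap_check(size_t vm_start, size_t vm_end, size_t smem_len)
{
    if (vm_end - vm_start > smem_len)
        return -1;   /* kernel code returns -EINVAL here */
    return 0;        /* size is acceptable; mapping may proceed */
}
```

Only after this check does the real handler hand the whole range to dma_mmap_writecombine(), which works for both contiguous and IOMMU-backed buffers.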



RE: [PATCH 7/7] drm/exynos: Add IOMMU support for mapping gem object

2012-07-13 Thread Inki Dae


 -Original Message-
 From: Prathyush K [mailto:prathyus...@samsung.com]
 Sent: Wednesday, July 11, 2012 6:40 PM
 To: dri-devel@lists.freedesktop.org
 Cc: prathy...@chromium.org; m.szyprow...@samsung.com;
inki@samsung.com;
 subash.ramasw...@linaro.org
 Subject: [PATCH 7/7] drm/exynos: Add IOMMU support for mapping gem object
 
 A gem object is created using dma_alloc_writecombine. Currently, this
 buffer is assumed to be contiguous. If an IOMMU mapping is created for
 DRM, this buffer would be non-contig so the map functions are modified
 to call dma_mmap_writecombine. This works for both contig and non-contig
 buffers.
 
 Signed-off-by: Prathyush K prathyus...@samsung.com
 ---
  drivers/gpu/drm/exynos/exynos_drm_gem.c |   35
++-
 ---
  1 files changed, 16 insertions(+), 19 deletions(-)
 
 diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c
 b/drivers/gpu/drm/exynos/exynos_drm_gem.c
 index 5c8b683..59240f7 100644
 --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
 +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
 @@ -162,17 +162,22 @@ static int exynos_drm_gem_map_pages(struct
 drm_gem_object *obj,
  {
   struct exynos_drm_gem_obj *exynos_gem_obj = to_exynos_gem_obj(obj);
   struct exynos_drm_gem_buf *buf = exynos_gem_obj->buffer;
  - unsigned long pfn;
  
   if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG) {
  + unsigned long pfn;
   if (!buf->pages)
   return -EINTR;
  
   pfn = page_to_pfn(buf->pages[page_offset++]);
  - } else
  - pfn = (buf->dma_addr >> PAGE_SHIFT) + page_offset;
  -
  - return vm_insert_mixed(vma, f_vaddr, pfn);
  + return vm_insert_mixed(vma, f_vaddr, pfn);
 + } else {

That's not good. EXYNOS_BO_NONCONTIG means physically non-contiguous memory,
and otherwise the memory is physically contiguous. But with your patch, when
the IOMMU is used, the gem object's memory type loses its meaning: the type
can be EXYNOS_BO_CONTIG while the memory is actually physically non-contiguous.

 + int ret;
 + ret = dma_mmap_writecombine(obj->dev->dev, vma, buf->kvaddr,
 + buf->dma_addr, buf->size);
 + if (ret)
 + DRM_ERROR("dma_mmap_writecombine failed\n");
 + return ret;
 + }
  }
 
  static int exynos_drm_gem_get_pages(struct drm_gem_object *obj)
 @@ -503,7 +508,7 @@ static int exynos_drm_gem_mmap_buffer(struct file
 *filp,
   struct drm_gem_object *obj = filp->private_data;
   struct exynos_drm_gem_obj *exynos_gem_obj = to_exynos_gem_obj(obj);
   struct exynos_drm_gem_buf *buffer;
  - unsigned long pfn, vm_size, usize, uaddr = vma->vm_start;
  + unsigned long vm_size, usize, uaddr = vma->vm_start;
   int ret;
 
   DRM_DEBUG_KMS("%s\n", __FILE__);
 @@ -543,19 +548,11 @@ static int exynos_drm_gem_mmap_buffer(struct file
 *filp,
   usize -= PAGE_SIZE;
   } while (usize > 0);
   } else {
  - /*
  -  * get page frame number to physical memory to be mapped
  -  * to user space.
  -  */
  - pfn = ((unsigned long)exynos_gem_obj->buffer->dma_addr) >>
  - PAGE_SHIFT;
  -
  - DRM_DEBUG_KMS("pfn = 0x%lx\n", pfn);
  -
  - if (remap_pfn_range(vma, vma->vm_start, pfn, vm_size,
  - vma->vm_page_prot)) {
  - DRM_ERROR("failed to remap pfn range.\n");
  - return -EAGAIN;
  + ret = dma_mmap_writecombine(obj->dev->dev, vma, buffer->kvaddr,

What if we don't use the IOMMU and the memory type of this buffer is non-contiguous?

 + buffer->dma_addr, buffer->size);
 + if (ret) {
 + DRM_ERROR("dma_mmap_writecombine failed\n");
 + return ret;
   }
   }
 
 --
 1.7.0.4



Re: [PATCH] drm/radeon: fix bo creation retry path

2012-07-13 Thread Christian König

On 13.07.2012 00:23, j.gli...@gmail.com wrote:

From: Jerome Glisse jgli...@redhat.com

Retry label was at wrong place in function leading to memory
leak.

Cc: sta...@vger.kernel.org
Signed-off-by: Jerome Glisse jgli...@redhat.com

Reviewed-by: Christian König christian.koe...@amd.com

---
  drivers/gpu/drm/radeon/radeon_object.c |3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_object.c 
b/drivers/gpu/drm/radeon/radeon_object.c
index 6ecb200..f71e472 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -138,7 +138,6 @@ int radeon_bo_create(struct radeon_device *rdev,
 acc_size = ttm_bo_dma_acc_size(&rdev->mman.bdev, size,
    sizeof(struct radeon_bo));
  
-retry:

bo = kzalloc(sizeof(struct radeon_bo), GFP_KERNEL);
if (bo == NULL)
return -ENOMEM;
@@ -152,6 +151,8 @@ retry:
 bo->surface_reg = -1;
 INIT_LIST_HEAD(&bo->list);
 INIT_LIST_HEAD(&bo->va);
+
+retry:
radeon_ttm_placement_from_domain(bo, domain);
/* Kernel allocation are uninterruptible */
 down_read(&rdev->pm.mclk_lock);
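The leak the patch fixes is purely a control-flow issue: jumping back to a label placed before the allocation re-runs kzalloc() on every retry without freeing the previous object. A minimal userspace model of the before/after behavior (all names here are invented for illustration; this sketches the bug pattern, not the driver code):

```c
#include <stdlib.h>

/* Counters stand in for kernel memory accounting. */
static int allocs, frees;

static void *model_alloc(void) { allocs++; return calloc(1, 16); }
static void model_free(void *p) { frees++; free(p); }

/* Placement fails on the first attempt, forcing exactly one retry. */
static int try_place(int attempt) { return attempt == 0 ? -1 : 0; }

static int leaked_with_label_before_alloc(void)
{
    void *bo;
    int attempt = 0;
    allocs = frees = 0;
retry:
    bo = model_alloc();        /* re-run on every retry: previous bo leaks */
    if (try_place(attempt++))
        goto retry;
    model_free(bo);
    return allocs - frees;     /* number of objects never freed */
}

static int leaked_with_label_after_alloc(void)
{
    int attempt = 0;
    allocs = frees = 0;
    void *bo = model_alloc();  /* allocated exactly once, as in the patch */
retry:
    if (try_place(attempt++))
        goto retry;            /* only the placement step is retried */
    model_free(bo);
    return allocs - frees;
}
```

With the label before the allocation one object leaks per retry; with the label moved after it (as the patch does), allocation happens once and nothing is lost.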





Re: [PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3

2012-07-13 Thread Christian König

On 12.07.2012 18:36, Alex Deucher wrote:

On Thu, Jul 12, 2012 at 12:12 PM, Christian König
deathsim...@vodafone.de wrote:

Before emitting any indirect buffer, emit the offset of the next
valid ring content, if any. This allows code that wants to resume the
ring to do so right after the IB that caused the GPU lockup.

v2: use scratch registers instead of storing it into memory
v3: skip over the surface sync for ni and si as well

Signed-off-by: Jerome Glisse jgli...@redhat.com
Signed-off-by: Christian König deathsim...@vodafone.de
---
  drivers/gpu/drm/radeon/evergreen.c   |8 +++-
  drivers/gpu/drm/radeon/ni.c  |   11 ++-
  drivers/gpu/drm/radeon/r600.c|   18 --
  drivers/gpu/drm/radeon/radeon.h  |1 +
  drivers/gpu/drm/radeon/radeon_ring.c |4 
  drivers/gpu/drm/radeon/rv770.c   |4 +++-
  drivers/gpu/drm/radeon/si.c  |   22 +++---
  7 files changed, 60 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index f39b900..40de347 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct radeon_device 
*rdev, struct radeon_ib *ib)
 /* set to DX10/11 mode */
 radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
 radeon_ring_write(ring, 1);
-   /* FIXME: implement */
+
+   if (ring->rptr_save_reg) {
+   uint32_t next_rptr = ring->wptr + 2 + 4;
+   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
+   radeon_ring_write(ring, next_rptr);
+   }

On r600 and newer please use SET_CONFIG_REG rather than Packet0.
Why? Please note that it's on purpose that this doesn't interfere with 
the top/bottom of pipe handling and the draw commands, e.g. the register 
write isn't associated with drawing but instead just marks the beginning 
of parsing the IB.


Christian.


Alex


+
 radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
 radeon_ring_write(ring,
  #ifdef __BIG_ENDIAN
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index f2afefb..5b7ce2c 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -855,6 +855,13 @@ void cayman_ring_ib_execute(struct radeon_device *rdev, 
struct radeon_ib *ib)
 /* set to DX10/11 mode */
 radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
 radeon_ring_write(ring, 1);
+
+   if (ring->rptr_save_reg) {
+   uint32_t next_rptr = ring->wptr + 2 + 4 + 8;
+   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
+   radeon_ring_write(ring, next_rptr);
+   }
+
 radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
 radeon_ring_write(ring,
  #ifdef __BIG_ENDIAN
@@ -981,8 +988,10 @@ static int cayman_cp_start(struct radeon_device *rdev)

  static void cayman_cp_fini(struct radeon_device *rdev)
  {
+   struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
  cayman_cp_enable(rdev, false);
-   radeon_ring_fini(rdev, &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]);
+   radeon_ring_fini(rdev, ring);
+   radeon_scratch_free(rdev, ring->rptr_save_reg);
  }

  int cayman_cp_resume(struct radeon_device *rdev)
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index c808fa9..74fca15 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2155,18 +2155,27 @@ int r600_cp_resume(struct radeon_device *rdev)
  void r600_ring_init(struct radeon_device *rdev, struct radeon_ring *ring, 
unsigned ring_size)
  {
 u32 rb_bufsz;
+   int r;

 /* Align ring size */
 rb_bufsz = drm_order(ring_size / 8);
 ring_size = (1 << (rb_bufsz + 1)) * 4;
 ring->ring_size = ring_size;
 ring->align_mask = 16 - 1;
+
+   r = radeon_scratch_get(rdev, &ring->rptr_save_reg);
+   if (r) {
+   DRM_ERROR("failed to get scratch reg for rptr save (%d).\n", r);
+   ring->rptr_save_reg = 0;
+   }
  }

  void r600_cp_fini(struct radeon_device *rdev)
  {
+   struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
  r600_cp_stop(rdev);
-   radeon_ring_fini(rdev, &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]);
+   radeon_ring_fini(rdev, ring);
+   radeon_scratch_free(rdev, ring->rptr_save_reg);
  }


@@ -2568,7 +2577,12 @@ void r600_ring_ib_execute(struct radeon_device *rdev, 
struct radeon_ib *ib)
  {
 struct radeon_ring *ring = &rdev->ring[ib->ring];

-   /* FIXME: implement */
+   if (ring->rptr_save_reg) {
+   uint32_t next_rptr = ring->wptr + 2 + 4;
+   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
+   radeon_ring_write(ring, next_rptr);
+   }
+
 radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
 

Re: [PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM

2012-07-13 Thread Subash Patel

On 07/13/2012 12:09 PM, Inki Dae wrote:



-Original Message-
From: Prathyush K [mailto:prathyus...@samsung.com]
Sent: Wednesday, July 11, 2012 6:40 PM
To: dri-devel@lists.freedesktop.org
Cc: prathy...@chromium.org; m.szyprow...@samsung.com;

inki@samsung.com;

subash.ramasw...@linaro.org
Subject: [PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM

The dma-mapping framework needs an IOMMU mapping to be created for the
device which allocates/maps/frees the non-contig buffer. In the DRM
framework, a gem buffer is created by the DRM virtual device and not
directly by any of the physical devices (FIMD, HDMI etc). Each gem object
can be set as a framebuffer to one or many of the drm devices. So a gem
object cannot be allocated for any one device. All the DRM devices should
be able to access this buffer.


It's good to use a unified IOMMU table, so I agree with your opinion, but we
haven't decided yet whether to use the dma-mapping API or not. The dma-mapping
API currently has one issue: when using the IOMMU through the dma-mapping API,
we can't use a physically contiguous memory region with the IOMMU. But there
are cases where we need exactly that, because we may sometimes use the MFC
(HW video codec) with a secure zone such as ARM TrustZone, and that requires
a physically contiguous memory region.

Thanks,
Inki Dae
I agree. In the mainline code, as of now only arm_dma_ops supports allocating
from CMA. In the function arm_iommu_alloc_attrs(), there is no way to know
whether the device has declared a contiguous memory range, because we don't
store that cookie in the device during dma_declare_contiguous(). So would it
be advisable to store such information (like the mapping in the IOMMU
operations) in device.archdata?

Regards,
Subash



The proposed method is to create a common IOMMU mapping during drm init.
This
mapping is then attached to all of the drm devices including the drm
device.
[PATCH 1/7] drm/exynos: create common IOMMU mapping for DRM

During the probe of drm fimd, the driver retrieves a 'sysmmu' field
in the device node for fimd. If such a field exists, the driver retrieves
the
platform device of the sysmmu device. This sysmmu is set as the sysmmu
for fimd. The common mapping created is then attached to fimd.
This needs to be done for all the other devices (hdmi, vidi etc).
[PATCH 2/7] ARM: EXYNOS5: add sysmmu field to fimd device node
[PATCH 3/7] drm/exynos: add IOMMU support to drm fimd

During DRM's probe which happens last, the common mapping is set to its
archdata
and iommu ops are set as its dma ops. This requires a modification in the
dma-mapping framework so that the iommu ops can be visible to all drivers.
[PATCH 4/7] ARM: dma-mapping: rename and export iommu_ops
[PATCH 5/7] drm/exynos: attach drm device with common drm mapping

Currently allocation and free use the iommu framework by calling
dma_alloc_writecombine and dma_free_writecombine respectively.
For mapping the buffers to user space, the mmap functions assume that
the buffer is contiguous. This is modified by calling
dma_mmap_writecombine.
[PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function
[PATCH 7/7] Add IOMMU support for mapping gem object

The device tree based patches are based on Leela's patch which was posted
last week for adding DT support to DRM FIMD. The patch to add sysmmu
field is for reference only and will be posted to the device tree
mailing list. Same with the rename and export iommu_ops patch.

These patches are tested on Exynos5250 SMDK board and tested with modetest
from libdrm tests.

Prathyush K (7):
   drm/exynos: create common IOMMU mapping for DRM
   ARM: EXYNOS5: add sysmmu field to fimd device node
   drm/exynos: add IOMMU support to drm fimd
   ARM: dma-mapping: rename and export iommu_ops
   drm/exynos: attach drm device with common drm mapping
   drm/exynos: Add exynos drm specific fb_mmap function
   drm/exynos: Add IOMMU support for mapping gem object

  arch/arm/boot/dts/exynos5250.dtsi |1 +
  arch/arm/include/asm/dma-mapping.h|1 +
  arch/arm/mm/dma-mapping.c |5 ++-
  drivers/gpu/drm/exynos/exynos_drm_core.c  |3 ++
  drivers/gpu/drm/exynos/exynos_drm_drv.c   |   30 
  drivers/gpu/drm/exynos/exynos_drm_drv.h   |   10 +
  drivers/gpu/drm/exynos/exynos_drm_fbdev.c |   16 
  drivers/gpu/drm/exynos/exynos_drm_fimd.c  |   54
-
  drivers/gpu/drm/exynos/exynos_drm_gem.c   |   35 --
  9 files changed, 133 insertions(+), 22 deletions(-)




Re: [PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3

2012-07-13 Thread Alex Deucher
On Fri, Jul 13, 2012 at 5:09 AM, Christian König
deathsim...@vodafone.de wrote:
 On 12.07.2012 18:36, Alex Deucher wrote:

 On Thu, Jul 12, 2012 at 12:12 PM, Christian König
 deathsim...@vodafone.de wrote:

 Before emitting any indirect buffer, emit the offset of the next
 valid ring content, if any. This allows code that wants to resume the
 ring to do so right after the IB that caused the GPU lockup.

 v2: use scratch registers instead of storing it into memory
 v3: skip over the surface sync for ni and si as well

 Signed-off-by: Jerome Glisse jgli...@redhat.com
 Signed-off-by: Christian König deathsim...@vodafone.de
 ---
   drivers/gpu/drm/radeon/evergreen.c   |8 +++-
   drivers/gpu/drm/radeon/ni.c  |   11 ++-
   drivers/gpu/drm/radeon/r600.c|   18 --
   drivers/gpu/drm/radeon/radeon.h  |1 +
   drivers/gpu/drm/radeon/radeon_ring.c |4 
   drivers/gpu/drm/radeon/rv770.c   |4 +++-
   drivers/gpu/drm/radeon/si.c  |   22 +++---
   7 files changed, 60 insertions(+), 8 deletions(-)

 diff --git a/drivers/gpu/drm/radeon/evergreen.c
 b/drivers/gpu/drm/radeon/evergreen.c
 index f39b900..40de347 100644
 --- a/drivers/gpu/drm/radeon/evergreen.c
 +++ b/drivers/gpu/drm/radeon/evergreen.c
 @@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct
 radeon_device *rdev, struct radeon_ib *ib)
  /* set to DX10/11 mode */
  radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
  radeon_ring_write(ring, 1);
 -   /* FIXME: implement */
 +
 +   if (ring->rptr_save_reg) {
 +   uint32_t next_rptr = ring->wptr + 2 + 4;
 +   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
 +   radeon_ring_write(ring, next_rptr);
 +   }

 On r600 and newer please use SET_CONFIG_REG rather than Packet0.

 Why? Please note that it's on purpose that this doesn't interfere with the
 top/bottom of pipe handling and the draw commands, e.g. the register write
 isn't associated with drawing but instead just marks the beginning of
 parsing the IB.

Packet0s have been semi-deprecated since r600.  They still work,
but the CP guys recommend using the appropriate packet3 whenever
possible.

Alex
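The "+ 2 + 4" arithmetic being discussed can be spelled out in a tiny sketch (constants taken from the quoted evergreen hunk; the wrap behavior with a pointer mask is a simplifying assumption for illustration): the scratch-register write occupies 2 dwords (packet header plus value) and the INDIRECT_BUFFER packet 4 more, so the first dword after the IB dispatch sits 6 dwords past the current write pointer.

```c
#include <stdint.h>

/* Next valid read pointer after the two packets emitted in
 * evergreen_ring_ib_execute(): 2 dwords for the scratch-register
 * write + 4 dwords for PACKET3_INDIRECT_BUFFER, wrapped by the
 * ring's dword mask. */
static uint32_t next_rptr(uint32_t wptr, uint32_t ptr_mask)
{
    return (wptr + 2 + 4) & ptr_mask;
}
```

On cayman the quoted code adds a further 8 dwords (`+ 2 + 4 + 8`) to also skip the surface-sync packet mentioned in the v3 changelog.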


Re: [PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3

2012-07-13 Thread Christian König

On 13.07.2012 14:27, Alex Deucher wrote:

On Fri, Jul 13, 2012 at 5:09 AM, Christian König
deathsim...@vodafone.de wrote:

On 12.07.2012 18:36, Alex Deucher wrote:

On Thu, Jul 12, 2012 at 12:12 PM, Christian König
deathsim...@vodafone.de wrote:

Before emitting any indirect buffer, emit the offset of the next
valid ring content, if any. This allows code that wants to resume the
ring to do so right after the IB that caused the GPU lockup.

v2: use scratch registers instead of storing it into memory
v3: skip over the surface sync for ni and si as well

Signed-off-by: Jerome Glisse jgli...@redhat.com
Signed-off-by: Christian König deathsim...@vodafone.de
---
   drivers/gpu/drm/radeon/evergreen.c   |8 +++-
   drivers/gpu/drm/radeon/ni.c  |   11 ++-
   drivers/gpu/drm/radeon/r600.c|   18 --
   drivers/gpu/drm/radeon/radeon.h  |1 +
   drivers/gpu/drm/radeon/radeon_ring.c |4 
   drivers/gpu/drm/radeon/rv770.c   |4 +++-
   drivers/gpu/drm/radeon/si.c  |   22 +++---
   7 files changed, 60 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c
b/drivers/gpu/drm/radeon/evergreen.c
index f39b900..40de347 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct
radeon_device *rdev, struct radeon_ib *ib)
  /* set to DX10/11 mode */
  radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
  radeon_ring_write(ring, 1);
-   /* FIXME: implement */
+
+   if (ring->rptr_save_reg) {
+   uint32_t next_rptr = ring->wptr + 2 + 4;
+   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
+   radeon_ring_write(ring, next_rptr);
+   }

On r600 and newer please use SET_CONFIG_REG rather than Packet0.

Why? Please note that it's on purpose that this doesn't interfere with the
top/bottom of pipe handling and the draw commands, e.g. the register write
isn't associated with drawing but instead just marks the beginning of
parsing the IB.

Packet0s have been semi-deprecated since r600.  They still work,
but the CP guys recommend using the appropriate packet3 whenever
possible.

Ok, that makes sense.

Any further comments on the patchset, or can I send that to Dave for 
merging now?


Cheers,
Christian.



Re: [PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3

2012-07-13 Thread Alex Deucher
On Fri, Jul 13, 2012 at 9:46 AM, Christian König
deathsim...@vodafone.de wrote:
 On 13.07.2012 14:27, Alex Deucher wrote:

 On Fri, Jul 13, 2012 at 5:09 AM, Christian König
 deathsim...@vodafone.de wrote:

 On 12.07.2012 18:36, Alex Deucher wrote:

 On Thu, Jul 12, 2012 at 12:12 PM, Christian König
 deathsim...@vodafone.de wrote:

 Before emitting any indirect buffer, emit the offset of the next
 valid ring content, if any. This allows code that wants to resume the
 ring to do so right after the IB that caused the GPU lockup.

 v2: use scratch registers instead of storing it into memory
 v3: skip over the surface sync for ni and si as well

 Signed-off-by: Jerome Glisse jgli...@redhat.com
 Signed-off-by: Christian König deathsim...@vodafone.de
 ---
drivers/gpu/drm/radeon/evergreen.c   |8 +++-
drivers/gpu/drm/radeon/ni.c  |   11 ++-
drivers/gpu/drm/radeon/r600.c|   18 --
drivers/gpu/drm/radeon/radeon.h  |1 +
drivers/gpu/drm/radeon/radeon_ring.c |4 
drivers/gpu/drm/radeon/rv770.c   |4 +++-
drivers/gpu/drm/radeon/si.c  |   22 +++---
7 files changed, 60 insertions(+), 8 deletions(-)

 diff --git a/drivers/gpu/drm/radeon/evergreen.c
 b/drivers/gpu/drm/radeon/evergreen.c
 index f39b900..40de347 100644
 --- a/drivers/gpu/drm/radeon/evergreen.c
 +++ b/drivers/gpu/drm/radeon/evergreen.c
 @@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct
 radeon_device *rdev, struct radeon_ib *ib)
   /* set to DX10/11 mode */
   radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
   radeon_ring_write(ring, 1);
 -   /* FIXME: implement */
 +
 +   if (ring->rptr_save_reg) {
 +   uint32_t next_rptr = ring->wptr + 2 + 4;
 +   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
 +   radeon_ring_write(ring, next_rptr);
 +   }

 On r600 and newer please use SET_CONFIG_REG rather than Packet0.

 Why? Please note that it's on purpose that this doesn't interfere with
 the top/bottom of pipe handling and the draw commands, e.g. the register
 write isn't associated with drawing but instead just marks the beginning
 of parsing the IB.

 Packet0s have been semi-deprecated since r600.  They still work,
 but the CP guys recommend using the appropriate packet3 whenever
 possible.

 Ok, that makes sense.

 Any further comments on the patchset, or can I send that to Dave for merging
 now?

Other than that, it looks good to me.  For the series:

Reviewed-by: Alex Deucher alexander.deuc...@amd.com
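
As an aside for readers following the Packet0 vs. SET_CONFIG_REG exchange above, here is a minimal standalone sketch of the two encodings. The macro values follow the published r600 packet formats (simplified), but the register offset (0x8500) and the bare-array "ring" are made up for illustration; this is not the driver code.

```c
#include <stdint.h>
#include <stddef.h>

/* simplified r600-style packet headers */
#define PACKET0(reg, n)  ((0u << 30) | (((reg) >> 2) & 0xFFFF) | (((n) & 0x3FFF) << 16))
#define PACKET3(op, n)   ((3u << 30) | (((op) & 0xFF) << 8) | (((n) & 0x3FFF) << 16))
#define PACKET3_SET_CONFIG_REG        0x68
#define PACKET3_SET_CONFIG_REG_OFFSET 0x00008000

/* Packet0 variant: raw register write, semi-deprecated on r600+ */
static size_t emit_rptr_save_packet0(uint32_t *ring, uint32_t reg, uint32_t next_rptr)
{
    ring[0] = PACKET0(reg, 0);
    ring[1] = next_rptr;
    return 2;               /* dwords emitted */
}

/* Packet3 variant: same write via SET_CONFIG_REG, as recommended above */
static size_t emit_rptr_save_packet3(uint32_t *ring, uint32_t reg, uint32_t next_rptr)
{
    ring[0] = PACKET3(PACKET3_SET_CONFIG_REG, 1);
    ring[1] = (reg - PACKET3_SET_CONFIG_REG_OFFSET) >> 2;
    ring[2] = next_rptr;
    return 3;               /* one dword longer than the Packet0 form */
}
```

Note the SET_CONFIG_REG form costs one extra dword, so the `next_rptr = wptr + 2 + 4` bookkeeping in the patch would shift accordingly.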


[PATCH 1/3] drm/radeon: return an error if there is nothing to wait for

2012-07-13 Thread Christian König
Otherwise the SA manager's out-of-memory
handling doesn't work.

Signed-off-by: Christian König deathsim...@vodafone.de
---
 drivers/gpu/drm/radeon/radeon_fence.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_fence.c 
b/drivers/gpu/drm/radeon/radeon_fence.c
index 76c5b22..7a181c3 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -331,7 +331,7 @@ static int radeon_fence_wait_any_seq(struct radeon_device 
*rdev,
 
/* nothing to wait for ? */
if (ring == RADEON_NUM_RINGS) {
-   return 0;
+   return -ENOENT;
}
 
while (!radeon_fence_any_seq_signaled(rdev, target_seq)) {
-- 
1.7.9.5
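
A tiny sketch of why the caller cares about this return code (stub functions with hypothetical names; the real logic lives in the SA manager patch that follows): with -ENOENT the allocator can distinguish "no fence will ever free memory for us" from an ordinary wait error.

```c
#include <errno.h>

/* stand-in for radeon_fence_wait_any(): -ENOENT when there is no fence */
static int fence_wait_any_stub(int nfences)
{
    return nfences > 0 ? 0 : -ENOENT;
}

/* allocator retry step: "nothing to wait for" means we are truly out
 * of memory, so translate the error for the caller */
static int alloc_retry_stub(int nfences)
{
    int r = fence_wait_any_stub(nfences);
    if (r == -ENOENT)
        return -ENOMEM;
    return r;
}
```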



[PATCH 2/3] drm/radeon: let sa manager block for fences to wait for

2012-07-13 Thread Christian König
Otherwise we can encounter out of memory situations under extreme load.

Signed-off-by: Christian König deathsim...@vodafone.de
---
 drivers/gpu/drm/radeon/radeon.h|2 +-
 drivers/gpu/drm/radeon/radeon_sa.c |   72 +---
 2 files changed, 51 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 6715e4c..2cb355b 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -362,7 +362,7 @@ struct radeon_bo_list {
  * alignment).
  */
 struct radeon_sa_manager {
-   spinlock_t  lock;
+   wait_queue_head_t   wq;
struct radeon_bo*bo;
struct list_head*hole;
struct list_headflist[RADEON_NUM_RINGS];
diff --git a/drivers/gpu/drm/radeon/radeon_sa.c 
b/drivers/gpu/drm/radeon/radeon_sa.c
index 81dbb5b..b535fc4 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -54,7 +54,7 @@ int radeon_sa_bo_manager_init(struct radeon_device *rdev,
 {
int i, r;
 
-   spin_lock_init(&sa_manager->lock);
+   init_waitqueue_head(&sa_manager->wq);
sa_manager->bo = NULL;
sa_manager->size = size;
sa_manager->domain = domain;
@@ -211,6 +211,29 @@ static bool radeon_sa_bo_try_alloc(struct 
radeon_sa_manager *sa_manager,
return false;
 }
 
+static bool radeon_sa_event(struct radeon_sa_manager *sa_manager,
+   unsigned size, unsigned align)
+{
+   unsigned soffset, eoffset, wasted;
+   int i;
+
+   for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+   if (!list_empty(&sa_manager->flist[i])) {
+   return true;
+   }
+   }
+
+   soffset = radeon_sa_bo_hole_soffset(sa_manager);
+   eoffset = radeon_sa_bo_hole_eoffset(sa_manager);
+   wasted = (align - (soffset % align)) % align;
+
+   if ((eoffset - soffset) >= (size + wasted)) {
+   return true;
+   }
+
+   return false;
+}
+
 static bool radeon_sa_bo_next_hole(struct radeon_sa_manager *sa_manager,
   struct radeon_fence **fences,
   unsigned *tries)
@@ -297,8 +320,8 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
INIT_LIST_HEAD(&(*sa_bo)->olist);
INIT_LIST_HEAD(&(*sa_bo)->flist);
 
-   spin_lock(&sa_manager->lock);
-   do {
+   spin_lock(&sa_manager->wq.lock);
+   while(1) {
for (i = 0; i < RADEON_NUM_RINGS; ++i) {
fences[i] = NULL;
tries[i] = 0;
@@ -309,30 +332,34 @@ int radeon_sa_bo_new(struct radeon_device *rdev,

if (radeon_sa_bo_try_alloc(sa_manager, *sa_bo,
   size, align)) {
-   spin_unlock(&sa_manager->lock);
+   spin_unlock(&sa_manager->wq.lock);
return 0;
}

/* see if we can skip over some allocations */
} while (radeon_sa_bo_next_hole(sa_manager, fences, tries));
 
-   if (block) {
-   spin_unlock(&sa_manager->lock);
-   r = radeon_fence_wait_any(rdev, fences, false);
-   spin_lock(&sa_manager->lock);
-   if (r) {
-   /* if we have nothing to wait for we
-  are practically out of memory */
-   if (r == -ENOENT) {
-   r = -ENOMEM;
-   }
-   goto out_err;
-   }
+   if (!block) {
+   break;
+   }
+
+   spin_unlock(&sa_manager->wq.lock);
+   r = radeon_fence_wait_any(rdev, fences, false);
+   spin_lock(&sa_manager->wq.lock);
+   /* if we have nothing to wait for, block */
+   if (r == -ENOENT) {
+   r = wait_event_interruptible_locked(
+   sa_manager->wq,
+   radeon_sa_event(sa_manager, size, align)
+   );
+   }
+   if (r) {
+   goto out_err;
}
-   } while (block);
+   };
 
 out_err:
-   spin_unlock(&sa_manager->lock);
+   spin_unlock(&sa_manager->wq.lock);
kfree(*sa_bo);
*sa_bo = NULL;
return r;
@@ -348,7 +375,7 @@ void radeon_sa_bo_free(struct radeon_device *rdev, struct 
radeon_sa_bo **sa_bo,
}
 
sa_manager = (*sa_bo)->manager;
-   spin_lock(&sa_manager->lock);
+   spin_lock(&sa_manager->wq.lock);
if (fence && !radeon_fence_signaled(fence)) {
(*sa_bo)->fence = radeon_fence_ref(fence);
list_add_tail(&(*sa_bo)->flist,
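
The hole test in radeon_sa_event() above is plain alignment arithmetic. As a standalone sketch (stub function, assumes align >= 1; not the driver code):

```c
#include <stdbool.h>

/* Does a hole [soffset, eoffset) still fit `size` bytes once soffset is
 * rounded up to `align`? Mirrors the wasted-bytes computation above. */
static bool hole_fits(unsigned soffset, unsigned eoffset,
                      unsigned size, unsigned align)
{
    /* bytes skipped to reach the next aligned offset */
    unsigned wasted = (align - (soffset % align)) % align;
    return (eoffset - soffset) >= (size + wasted);
}
```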

[PATCH 3/3] drm/radeon: fix const IB handling

2012-07-13 Thread Christian König
Const IBs are executed on the CE, not the CP, so we can't
fence them in the normal way.

So submit them directly before the IB instead, just as
the documentation says.

Signed-off-by: Christian König deathsim...@vodafone.de
---
 drivers/gpu/drm/radeon/r100.c|2 +-
 drivers/gpu/drm/radeon/r600.c|2 +-
 drivers/gpu/drm/radeon/radeon.h  |3 ++-
 drivers/gpu/drm/radeon/radeon_cs.c   |   25 +++--
 drivers/gpu/drm/radeon/radeon_ring.c |   10 +-
 5 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index e0f5ae8..4ee5a74 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -3693,7 +3693,7 @@ int r100_ib_test(struct radeon_device *rdev, struct 
radeon_ring *ring)
ib.ptr[6] = PACKET2(0);
ib.ptr[7] = PACKET2(0);
ib.length_dw = 8;
-   r = radeon_ib_schedule(rdev, &ib);
+   r = radeon_ib_schedule(rdev, &ib, NULL);
if (r) {
radeon_scratch_free(rdev, scratch);
radeon_ib_free(rdev, ib);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 3156d25..c2e5069 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2619,7 +2619,7 @@ int r600_ib_test(struct radeon_device *rdev, struct 
radeon_ring *ring)
ib.ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) >> 2);
ib.ptr[2] = 0xDEADBEEF;
ib.length_dw = 3;
-   r = radeon_ib_schedule(rdev, &ib);
+   r = radeon_ib_schedule(rdev, &ib, NULL);
if (r) {
radeon_scratch_free(rdev, scratch);
radeon_ib_free(rdev, ib);
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 2cb355b..2d7f06c 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -751,7 +751,8 @@ struct si_rlc {
 int radeon_ib_get(struct radeon_device *rdev, int ring,
  struct radeon_ib *ib, unsigned size);
 void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib);
-int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib);
+int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib,
+  struct radeon_ib *const_ib);
 int radeon_ib_pool_init(struct radeon_device *rdev);
 void radeon_ib_pool_fini(struct radeon_device *rdev);
 int radeon_ib_ring_tests(struct radeon_device *rdev);
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
b/drivers/gpu/drm/radeon/radeon_cs.c
index 553da67..d0be5d5 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -354,7 +354,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
}
radeon_cs_sync_rings(parser);
parser->ib.vm_id = 0;
-   r = radeon_ib_schedule(rdev, &parser->ib);
+   r = radeon_ib_schedule(rdev, &parser->ib, NULL);
if (r) {
DRM_ERROR("Failed to schedule IB !\n");
}
@@ -452,25 +452,22 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device 
*rdev,
}
radeon_cs_sync_rings(parser);
 
+   parser->ib.vm_id = vm->id;
+   /* ib pool is bind at 0 in virtual address space,
+* so gpu_addr is the offset inside the pool bo
+*/
+   parser->ib.gpu_addr = parser->ib.sa_bo->soffset;
+
if ((rdev->family >= CHIP_TAHITI) &&
(parser->chunk_const_ib_idx != -1)) {
parser->const_ib.vm_id = vm->id;
-   /* ib pool is bind at 0 in virtual address space to gpu_addr is the
-* offset inside the pool bo
-*/
+   /* same reason as above */
parser->const_ib.gpu_addr = parser->const_ib.sa_bo->soffset;
-   r = radeon_ib_schedule(rdev, &parser->const_ib);
-   if (r)
-   goto out;
+   r = radeon_ib_schedule(rdev, &parser->ib, &parser->const_ib);
+   } else {
+   r = radeon_ib_schedule(rdev, &parser->ib, NULL);
}

-   parser->ib.vm_id = vm->id;
-   /* ib pool is bind at 0 in virtual address space to gpu_addr is the
-* offset inside the pool bo
-*/
-   parser->ib.gpu_addr = parser->ib.sa_bo->soffset;
-   parser->ib.is_const_ib = false;
-   r = radeon_ib_schedule(rdev, &parser->ib);
 out:
if (!r) {
if (vm-fence) {
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c 
b/drivers/gpu/drm/radeon/radeon_ring.c
index 75cbe46..c48c354 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -74,7 +74,8 @@ void radeon_ib_free(struct radeon_device *rdev, struct 
radeon_ib *ib)
radeon_fence_unref(&ib->fence);
 }
 
-int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib)
+int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib,
+  struct radeon_ib *const_ib)
 {
struct radeon_ring *ring = 

Re: [PATCH 2/3] drm/radeon: let sa manager block for fences to wait for

2012-07-13 Thread Tom Stellard
On Fri, Jul 13, 2012 at 04:08:14PM +0200, Christian König wrote:
 Otherwise we can encounter out of memory situations under extreme load.
 
 Signed-off-by: Christian König deathsim...@vodafone.de
 ---
  drivers/gpu/drm/radeon/radeon.h|2 +-
  drivers/gpu/drm/radeon/radeon_sa.c |   72 
 +---
  2 files changed, 51 insertions(+), 23 deletions(-)
 
 diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
 index 6715e4c..2cb355b 100644
 --- a/drivers/gpu/drm/radeon/radeon.h
 +++ b/drivers/gpu/drm/radeon/radeon.h
 @@ -362,7 +362,7 @@ struct radeon_bo_list {
   * alignment).
   */
  struct radeon_sa_manager {
 - spinlock_t  lock;
 + wait_queue_head_t   wq;
   struct radeon_bo*bo;
   struct list_head*hole;
   struct list_headflist[RADEON_NUM_RINGS];
 diff --git a/drivers/gpu/drm/radeon/radeon_sa.c 
 b/drivers/gpu/drm/radeon/radeon_sa.c
 index 81dbb5b..b535fc4 100644
 --- a/drivers/gpu/drm/radeon/radeon_sa.c
 +++ b/drivers/gpu/drm/radeon/radeon_sa.c
 @@ -54,7 +54,7 @@ int radeon_sa_bo_manager_init(struct radeon_device *rdev,
  {
   int i, r;
  
 - spin_lock_init(sa_manager-lock);
 + init_waitqueue_head(sa_manager-wq);
   sa_manager-bo = NULL;
   sa_manager-size = size;
   sa_manager-domain = domain;
 @@ -211,6 +211,29 @@ static bool radeon_sa_bo_try_alloc(struct 
 radeon_sa_manager *sa_manager,
   return false;
  }

 +static bool radeon_sa_event(struct radeon_sa_manager *sa_manager,
 + unsigned size, unsigned align)
 +{
 + unsigned soffset, eoffset, wasted;
 + int i;
 +
 + for (i = 0; i < RADEON_NUM_RINGS; ++i) {
 + if (!list_empty(&sa_manager->flist[i])) {
 + return true;
 + }
 + }
 +
 + soffset = radeon_sa_bo_hole_soffset(sa_manager);
 + eoffset = radeon_sa_bo_hole_eoffset(sa_manager);
 + wasted = (align - (soffset % align)) % align;
 +
 + if ((eoffset - soffset) >= (size + wasted)) {
 + return true;
 + }
 +
 + return false;
 +}
 +

This new function should come with a comment, per the new documentation
rules.

  static bool radeon_sa_bo_next_hole(struct radeon_sa_manager *sa_manager,
  struct radeon_fence **fences,
  unsigned *tries)
 @@ -297,8 +320,8 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
   INIT_LIST_HEAD((*sa_bo)-olist);
   INIT_LIST_HEAD((*sa_bo)-flist);
  
 - spin_lock(sa_manager-lock);
 - do {
 + spin_lock(sa_manager-wq.lock);
 + while(1) {
   for (i = 0; i  RADEON_NUM_RINGS; ++i) {
   fences[i] = NULL;
   tries[i] = 0;
 @@ -309,30 +332,34 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
  
   if (radeon_sa_bo_try_alloc(sa_manager, *sa_bo,
  size, align)) {
 - spin_unlock(sa_manager-lock);
 + spin_unlock(sa_manager-wq.lock);
   return 0;
   }
  
   /* see if we can skip over some allocations */
   } while (radeon_sa_bo_next_hole(sa_manager, fences, tries));
  
 - if (block) {
 - spin_unlock(sa_manager-lock);
 - r = radeon_fence_wait_any(rdev, fences, false);
 - spin_lock(sa_manager-lock);
 - if (r) {
 - /* if we have nothing to wait for we
 -are practically out of memory */
 - if (r == -ENOENT) {
 - r = -ENOMEM;
 - }
 - goto out_err;
 - }
 + if (!block) {
 + break;
 + }
 +
 + spin_unlock(sa_manager-wq.lock);
 + r = radeon_fence_wait_any(rdev, fences, false);
 + spin_lock(sa_manager-wq.lock);
 + /* if we have nothing to wait for block */
 + if (r == -ENOENT) {
 + r = wait_event_interruptible_locked(
 + sa_manager-wq, 
 + radeon_sa_event(sa_manager, size, align)
 + );
 + }
 + if (r) {
 + goto out_err;
   }
 - } while (block);
 + };
  
  out_err:
 - spin_unlock(sa_manager-lock);
 + spin_unlock(sa_manager-wq.lock);
   kfree(*sa_bo);
   *sa_bo = NULL;
   return r;
 @@ -348,7 +375,7 @@ void radeon_sa_bo_free(struct radeon_device *rdev, struct 
 radeon_sa_bo **sa_bo,
   }
  
   sa_manager = (*sa_bo)-manager;
 - spin_lock(sa_manager-lock);
 + spin_lock(sa_manager-wq.lock);
   if (fence  

Re: [PATCH 3/3] drm/radeon: fix const IB handling

2012-07-13 Thread Tom Stellard
On Fri, Jul 13, 2012 at 04:08:15PM +0200, Christian König wrote:
 Const IBs are executed on the CE not the CP, so we can't
 fence them in the normal way.
 
 So submit them directly before the IB instead, just as
 the documentation says.
 
 Signed-off-by: Christian König deathsim...@vodafone.de
 ---
  drivers/gpu/drm/radeon/r100.c|2 +-
  drivers/gpu/drm/radeon/r600.c|2 +-
  drivers/gpu/drm/radeon/radeon.h  |3 ++-
  drivers/gpu/drm/radeon/radeon_cs.c   |   25 +++--
  drivers/gpu/drm/radeon/radeon_ring.c |   10 +-
  5 files changed, 24 insertions(+), 18 deletions(-)
 
 diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
 index e0f5ae8..4ee5a74 100644
 --- a/drivers/gpu/drm/radeon/r100.c
 +++ b/drivers/gpu/drm/radeon/r100.c
 @@ -3693,7 +3693,7 @@ int r100_ib_test(struct radeon_device *rdev, struct 
 radeon_ring *ring)
   ib.ptr[6] = PACKET2(0);
   ib.ptr[7] = PACKET2(0);
   ib.length_dw = 8;
 - r = radeon_ib_schedule(rdev, ib);
 + r = radeon_ib_schedule(rdev, ib, NULL);
   if (r) {
   radeon_scratch_free(rdev, scratch);
   radeon_ib_free(rdev, ib);
 diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
 index 3156d25..c2e5069 100644
 --- a/drivers/gpu/drm/radeon/r600.c
 +++ b/drivers/gpu/drm/radeon/r600.c
 @@ -2619,7 +2619,7 @@ int r600_ib_test(struct radeon_device *rdev, struct 
 radeon_ring *ring)
   ib.ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET)  2);
   ib.ptr[2] = 0xDEADBEEF;
   ib.length_dw = 3;
 - r = radeon_ib_schedule(rdev, ib);
 + r = radeon_ib_schedule(rdev, ib, NULL);
   if (r) {
   radeon_scratch_free(rdev, scratch);
   radeon_ib_free(rdev, ib);
 diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
 index 2cb355b..2d7f06c 100644
 --- a/drivers/gpu/drm/radeon/radeon.h
 +++ b/drivers/gpu/drm/radeon/radeon.h
 @@ -751,7 +751,8 @@ struct si_rlc {
  int radeon_ib_get(struct radeon_device *rdev, int ring,
 struct radeon_ib *ib, unsigned size);
  void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib);
 -int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib);
 +int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib,
 +struct radeon_ib *const_ib);
  int radeon_ib_pool_init(struct radeon_device *rdev);
  void radeon_ib_pool_fini(struct radeon_device *rdev);
  int radeon_ib_ring_tests(struct radeon_device *rdev);
 diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
 b/drivers/gpu/drm/radeon/radeon_cs.c
 index 553da67..d0be5d5 100644
 --- a/drivers/gpu/drm/radeon/radeon_cs.c
 +++ b/drivers/gpu/drm/radeon/radeon_cs.c
 @@ -354,7 +354,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
   }
   radeon_cs_sync_rings(parser);
   parser-ib.vm_id = 0;
 - r = radeon_ib_schedule(rdev, parser-ib);
 + r = radeon_ib_schedule(rdev, parser-ib, NULL);
   if (r) {
   DRM_ERROR(Failed to schedule IB !\n);
   }
 @@ -452,25 +452,22 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device 
 *rdev,
   }
   radeon_cs_sync_rings(parser);
  
 + parser-ib.vm_id = vm-id;
 + /* ib pool is bind at 0 in virtual address space,
 +  * so gpu_addr is the offset inside the pool bo
 +  */
 + parser-ib.gpu_addr = parser-ib.sa_bo-soffset;
 +
   if ((rdev-family = CHIP_TAHITI) 
   (parser-chunk_const_ib_idx != -1)) {
   parser-const_ib.vm_id = vm-id;
 - /* ib pool is bind at 0 in virtual address space to gpu_addr is 
 the
 -  * offset inside the pool bo
 -  */
 + /* same reason as above */
   parser-const_ib.gpu_addr = parser-const_ib.sa_bo-soffset;
 - r = radeon_ib_schedule(rdev, parser-const_ib);
 - if (r)
 - goto out;
 + r = radeon_ib_schedule(rdev, parser-ib, parser-const_ib);
 + } else {
 + r = radeon_ib_schedule(rdev, parser-ib, NULL);
   }
  
 - parser-ib.vm_id = vm-id;
 - /* ib pool is bind at 0 in virtual address space to gpu_addr is the
 -  * offset inside the pool bo
 -  */
 - parser-ib.gpu_addr = parser-ib.sa_bo-soffset;
 - parser-ib.is_const_ib = false;
 - r = radeon_ib_schedule(rdev, parser-ib);
  out:
   if (!r) {
   if (vm-fence) {
 diff --git a/drivers/gpu/drm/radeon/radeon_ring.c 
 b/drivers/gpu/drm/radeon/radeon_ring.c
 index 75cbe46..c48c354 100644
 --- a/drivers/gpu/drm/radeon/radeon_ring.c
 +++ b/drivers/gpu/drm/radeon/radeon_ring.c
 @@ -74,7 +74,8 @@ void radeon_ib_free(struct radeon_device *rdev, struct 
 radeon_ib *ib)
   radeon_fence_unref(ib-fence);
  }


 -int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib)
 +int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib,
 +

Re: [PATCH 3/3] drm/radeon: fix const IB handling

2012-07-13 Thread Jerome Glisse
On Fri, Jul 13, 2012 at 10:08 AM, Christian König
deathsim...@vodafone.de wrote:
 Const IBs are executed on the CE not the CP, so we can't
 fence them in the normal way.

 So submit them directly before the IB instead, just as
 the documentation says.

 Signed-off-by: Christian König deathsim...@vodafone.de
 ---
  drivers/gpu/drm/radeon/r100.c|2 +-
  drivers/gpu/drm/radeon/r600.c|2 +-
  drivers/gpu/drm/radeon/radeon.h  |3 ++-
  drivers/gpu/drm/radeon/radeon_cs.c   |   25 +++--
  drivers/gpu/drm/radeon/radeon_ring.c |   10 +-
  5 files changed, 24 insertions(+), 18 deletions(-)

 diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
 index e0f5ae8..4ee5a74 100644
 --- a/drivers/gpu/drm/radeon/r100.c
 +++ b/drivers/gpu/drm/radeon/r100.c
 @@ -3693,7 +3693,7 @@ int r100_ib_test(struct radeon_device *rdev, struct 
 radeon_ring *ring)
 ib.ptr[6] = PACKET2(0);
 ib.ptr[7] = PACKET2(0);
 ib.length_dw = 8;
 -   r = radeon_ib_schedule(rdev, ib);
 +   r = radeon_ib_schedule(rdev, ib, NULL);
 if (r) {
 radeon_scratch_free(rdev, scratch);
 radeon_ib_free(rdev, ib);
 diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
 index 3156d25..c2e5069 100644
 --- a/drivers/gpu/drm/radeon/r600.c
 +++ b/drivers/gpu/drm/radeon/r600.c
 @@ -2619,7 +2619,7 @@ int r600_ib_test(struct radeon_device *rdev, struct 
 radeon_ring *ring)
 ib.ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET)  2);
 ib.ptr[2] = 0xDEADBEEF;
 ib.length_dw = 3;
 -   r = radeon_ib_schedule(rdev, ib);
 +   r = radeon_ib_schedule(rdev, ib, NULL);
 if (r) {
 radeon_scratch_free(rdev, scratch);
 radeon_ib_free(rdev, ib);
 diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
 index 2cb355b..2d7f06c 100644
 --- a/drivers/gpu/drm/radeon/radeon.h
 +++ b/drivers/gpu/drm/radeon/radeon.h
 @@ -751,7 +751,8 @@ struct si_rlc {
  int radeon_ib_get(struct radeon_device *rdev, int ring,
   struct radeon_ib *ib, unsigned size);
  void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib);
 -int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib);
 +int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib,
 +  struct radeon_ib *const_ib);
  int radeon_ib_pool_init(struct radeon_device *rdev);
  void radeon_ib_pool_fini(struct radeon_device *rdev);
  int radeon_ib_ring_tests(struct radeon_device *rdev);
 diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
 b/drivers/gpu/drm/radeon/radeon_cs.c
 index 553da67..d0be5d5 100644
 --- a/drivers/gpu/drm/radeon/radeon_cs.c
 +++ b/drivers/gpu/drm/radeon/radeon_cs.c
 @@ -354,7 +354,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
 }
 radeon_cs_sync_rings(parser);
 parser-ib.vm_id = 0;
 -   r = radeon_ib_schedule(rdev, parser-ib);
 +   r = radeon_ib_schedule(rdev, parser-ib, NULL);
 if (r) {
 DRM_ERROR(Failed to schedule IB !\n);
 }
 @@ -452,25 +452,22 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device 
 *rdev,
 }
 radeon_cs_sync_rings(parser);

 +   parser-ib.vm_id = vm-id;
 +   /* ib pool is bind at 0 in virtual address space,
 +* so gpu_addr is the offset inside the pool bo
 +*/
 +   parser-ib.gpu_addr = parser-ib.sa_bo-soffset;
 +
 if ((rdev->family >= CHIP_TAHITI) &&
 (parser->chunk_const_ib_idx != -1)) {
 parser->const_ib.vm_id = vm->id;
 -   /* ib pool is bind at 0 in virtual address space to gpu_addr is the
 -* offset inside the pool bo
 -*/
 +   /* same reason as above */

Don't remove the comment; the code might move and the comment above might
no longer apply. Better to duplicate the comment than to try to
cross-reference comments across the file.

 parser-const_ib.gpu_addr = parser-const_ib.sa_bo-soffset;
 -   r = radeon_ib_schedule(rdev, parser-const_ib);
 -   if (r)
 -   goto out;
 +   r = radeon_ib_schedule(rdev, parser-ib, parser-const_ib);
 +   } else {
 +   r = radeon_ib_schedule(rdev, parser-ib, NULL);
 }

 -   parser-ib.vm_id = vm-id;
 -   /* ib pool is bind at 0 in virtual address space to gpu_addr is the
 -* offset inside the pool bo
 -*/
 -   parser-ib.gpu_addr = parser-ib.sa_bo-soffset;
 -   parser-ib.is_const_ib = false;
 -   r = radeon_ib_schedule(rdev, parser-ib);
  out:
 if (!r) {
 if (vm-fence) {
 diff --git a/drivers/gpu/drm/radeon/radeon_ring.c 
 b/drivers/gpu/drm/radeon/radeon_ring.c
 index 75cbe46..c48c354 100644
 --- a/drivers/gpu/drm/radeon/radeon_ring.c
 +++ b/drivers/gpu/drm/radeon/radeon_ring.c
 @@ -74,7 +74,8 @@ void 

[Bug 52054] New: gallium/opencl doesnt support includes for opencl kernels

2012-07-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=52054

 Bug #: 52054
   Summary: gallium/opencl doesnt support includes for opencl
kernels
Classification: Unclassified
   Product: Mesa
   Version: git
  Platform: x86-64 (AMD64)
OS/Version: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Gallium/r600
AssignedTo: dri-devel@lists.freedesktop.org
ReportedBy: ale...@gentoo.org


When running tests for OpenCL-enabled JtR (http://www.openwall.com/john/)
I get the following error:

OpenCL platform 0: Default, 1 device(s).
Using device 0: AMD JUNIPER
1 error generated.
Compilation log: cl_input:17:10: fatal error: 'opencl_rar.h' file not found
#include "opencl_rar.h"
 ^

OpenCL error (CL_INVALID_PROGRAM_EXECUTABLE) in file (rar_fmt.c) at line (588)
- (Error creating kernel. Double-check kernel name?)

xeon ~ # ./clInfo 
Found 1 platform(s).
platform[(nil)]: profile: FULL_PROFILE
platform[(nil)]: version: OpenCL 1.1 MESA 
platform[(nil)]: name: Default
platform[(nil)]: vendor: Mesa
platform[(nil)]: extensions: 
platform[(nil)]: Found 1 device(s).
device[0xc82360]: NAME: AMD JUNIPER
device[0xc82360]: VENDOR: X.Org
device[0xc82360]: PROFILE: FULL_PROFILE
device[0xc82360]: VERSION: OpenCL 1.1 MESA 
device[0xc82360]: EXTENSIONS: 
device[0xc82360]: DRIVER_VERSION: 

device[0xc82360]: Type: GPU 
device[0xc82360]: EXECUTION_CAPABILITIES: Kernel 
device[0xc82360]: GLOBAL_MEM_CACHE_TYPE: None (0)
device[0xc82360]: CL_DEVICE_LOCAL_MEM_TYPE: Local (1)
device[0xc82360]: SINGLE_FP_CONFIG: 0x7
device[0xc82360]: QUEUE_PROPERTIES: 0x2

device[0xc82360]: VENDOR_ID: 4098
device[0xc82360]: MAX_COMPUTE_UNITS: 1
device[0xc82360]: MAX_WORK_ITEM_DIMENSIONS: 3
device[0xc82360]: MAX_WORK_GROUP_SIZE: 256
device[0xc82360]: PREFERRED_VECTOR_WIDTH_CHAR: 16
device[0xc82360]: PREFERRED_VECTOR_WIDTH_SHORT: 8
device[0xc82360]: PREFERRED_VECTOR_WIDTH_INT: 4
device[0xc82360]: PREFERRED_VECTOR_WIDTH_LONG: 2
device[0xc82360]: PREFERRED_VECTOR_WIDTH_FLOAT: 4
device[0xc82360]: PREFERRED_VECTOR_WIDTH_DOUBLE: 2
device[0xc82360]: MAX_CLOCK_FREQUENCY: 0
device[0xc82360]: ADDRESS_BITS: 32
device[0xc82360]: MAX_MEM_ALLOC_SIZE: 0
device[0xc82360]: IMAGE_SUPPORT: 1
device[0xc82360]: MAX_READ_IMAGE_ARGS: 32
device[0xc82360]: MAX_WRITE_IMAGE_ARGS: 32
device[0xc82360]: IMAGE2D_MAX_WIDTH: 32768
device[0xc82360]: IMAGE2D_MAX_HEIGHT: 32768
device[0xc82360]: IMAGE3D_MAX_WIDTH: 32768
device[0xc82360]: IMAGE3D_MAX_HEIGHT: 32768
device[0xc82360]: IMAGE3D_MAX_DEPTH: 32768
device[0xc82360]: MAX_SAMPLERS: 16
device[0xc82360]: MAX_PARAMETER_SIZE: 1024
device[0xc82360]: MEM_BASE_ADDR_ALIGN: 128
device[0xc82360]: MIN_DATA_TYPE_ALIGN_SIZE: 128
device[0xc82360]: GLOBAL_MEM_CACHELINE_SIZE: 0
device[0xc82360]: GLOBAL_MEM_CACHE_SIZE: 0
device[0xc82360]: GLOBAL_MEM_SIZE: 201326592
device[0xc82360]: MAX_CONSTANT_BUFFER_SIZE: 65536
device[0xc82360]: MAX_CONSTANT_ARGS: 1
device[0xc82360]: LOCAL_MEM_SIZE: 32768
device[0xc82360]: ERROR_CORRECTION_SUPPORT: 0
device[0xc82360]: PROFILING_TIMER_RESOLUTION: 0
device[0xc82360]: ENDIAN_LITTLE: 1
device[0xc82360]: AVAILABLE: 1
device[0xc82360]: COMPILER_AVAILABLE: 1

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[RFC] dma-fence: dma-buf synchronization (v2)

2012-07-13 Thread Rob Clark
From: Rob Clark r...@ti.com

A dma-fence can be attached to a buffer which is being filled or consumed
by hw, to allow userspace to pass the buffer without waiting to another
device.  For example, userspace can call page_flip ioctl to display the
next frame of graphics after kicking the GPU but while the GPU is still
rendering.  The display device sharing the buffer with the GPU would
attach a callback to get notified when the GPU's rendering-complete IRQ
fires, to update the scan-out address of the display, without having to
wake up userspace.

A dma-fence is a transient, one-shot deal.  It is allocated and attached
to a dma-buf's list of fences.  When the driver that attached it is done
with the pending operation, it can signal the fence, removing it from the
dma-buf's list of fences:

  + dma_buf_attach_fence()
  + dma_fence_signal()

Other drivers can access the current fence on the dma-buf (if any),
which increments the fence's refcnt:

  + dma_buf_get_fence()
  + dma_fence_put()

The one pending on the fence can add an async callback (and optionally
cancel it.. for example, to recover from GPU hangs):

  + dma_fence_add_callback()
  + dma_fence_cancel_callback()

Or wait synchronously (optionally with timeout or from atomic context):

  + dma_fence_wait()

A default software-only implementation is provided, which can be used
by drivers attaching a fence to a buffer when they have no other means
for hw sync.  But a memory backed fence is also envisioned, because it
is common that GPU's can write to, or poll on some memory location for
synchronization.  For example:

  fence = dma_buf_get_fence(dmabuf);
  if (fence->ops == &mem_dma_fence_ops) {
struct dma_buf *fence_buf;
mem_dma_fence_get_buf(fence, &fence_buf, &offset);
... tell the hw the memory location to wait on ...
  } else {
/* fall-back to sw sync */
dma_fence_add_callback(fence, my_cb);
  }

The memory location is itself backed by dma-buf, to simplify mapping
to the device's address space, an idea borrowed from Maarten Lankhorst.

NOTE: the memory location fence is not implemented yet, the above is
just for explaining how it would work.

On SoC platforms, if some other hw mechanism is provided for synchronizing
between IP blocks, it could be supported as an alternate implementation
with its own fence ops in a similar way.

The other non-sw implementations would wrap the add/cancel_callback and
wait fence ops, so that they can keep track if a device not supporting
hw sync is waiting on the fence, and in this case should arrange to
call dma_fence_signal() at some point after the condition has changed,
to notify other devices waiting on the fence.  If there are no sw
waiters, this can be skipped to avoid waking the CPU unnecessarily.

The intention is to provide a userspace interface (presumably via eventfd)
later, to be used in conjunction with dma-buf's mmap support for sw access
to buffers (or for userspace apps that would prefer to do their own
synchronization).
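
The sw-only callback machinery described above can be sketched in plain userspace C. This is only an illustration (one-shot signal, immediate dispatch if the fence is already signaled); all names here are made up and none of this is the proposed kernel API.

```c
#include <stdbool.h>
#include <stddef.h>

struct sw_fence;
typedef void (*fence_cb_t)(struct sw_fence *f, void *data);

struct sw_fence_cb {
    fence_cb_t func;
    void *data;
    struct sw_fence_cb *next;
};

struct sw_fence {
    bool signaled;
    struct sw_fence_cb *cbs;    /* singly linked list of waiters */
};

/* Add a callback; it fires immediately if the fence already signaled. */
static int sw_fence_add_callback(struct sw_fence *f, struct sw_fence_cb *cb,
                                 fence_cb_t func, void *data)
{
    if (f->signaled) {
        func(f, data);
        return 0;
    }
    cb->func = func;
    cb->data = data;
    cb->next = f->cbs;
    f->cbs = cb;
    return 0;
}

/* One-shot signal: mark signaled, then run and drop all pending callbacks. */
static void sw_fence_signal(struct sw_fence *f)
{
    struct sw_fence_cb *cb = f->cbs;
    f->signaled = true;
    f->cbs = NULL;
    while (cb) {
        struct sw_fence_cb *next = cb->next;
        cb->func(f, cb->data);
        cb = next;
    }
}

/* example callback: count invocations */
static void count_cb(struct sw_fence *f, void *data)
{
    (void)f;
    ++*(int *)data;
}
```

A non-sw implementation would wrap these entry points, as the RFC describes, so it can enable an irq only when a sw waiter actually exists.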

v1: original
v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided
that dma-fence didn't need to care about the sw-hw signaling path
(it can be handled same as sw-sw case), and therefore the fence-ops
can be simplified and more handled in the core.  So remove the signal,
add_callback, cancel_callback, and wait ops, and replace with a simple
enable_signaling() op which can be used to inform a fence supporting
hw-hw signaling that one or more devices which do not support hw
signaling are waiting (and therefore it should enable an irq or do
whatever is necessary in order that the CPU is notified when the
fence is passed).
---
 drivers/base/Makefile |2 +-
 drivers/base/dma-buf.c|3 +
 drivers/base/dma-fence.c  |  364 +
 include/linux/dma-buf.h   |2 +
 include/linux/dma-fence.h |  128 
 5 files changed, 498 insertions(+), 1 deletion(-)
 create mode 100644 drivers/base/dma-fence.c
 create mode 100644 include/linux/dma-fence.h

diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 5aa2d70..6e9f217 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -10,7 +10,7 @@ obj-$(CONFIG_CMA) += dma-contiguous.o
 obj-y  += power/
 obj-$(CONFIG_HAS_DMA)  += dma-mapping.o
 obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o
-obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o
+obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o dma-fence.o
 obj-$(CONFIG_ISA)  += isa.o
 obj-$(CONFIG_FW_LOADER)+= firmware_class.o
 obj-$(CONFIG_NUMA) += node.o
diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 24e88fe..b053236 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -39,6 +39,8 @@ static int dma_buf_release(struct inode *inode, struct file 
*file)
 
dmabuf = file->private_data;

+   WARN_ON(!list_empty(&dmabuf->fence_list));
+
dmabuf->ops->release(dmabuf);
kfree(dmabuf);
   

libdrm: Fix some warnings reported by clang's scan-build tool

2012-07-13 Thread Johannes Obermayr

Patches 1 to 4 were sent to mesa-dev.


[PATCH 2/5] libkms/nouveau.c: Fix a memory leak and cleanup code a bit.

2012-07-13 Thread Johannes Obermayr
---
 libkms/nouveau.c |   20 +++-
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/libkms/nouveau.c b/libkms/nouveau.c
index 0e24a15..4cbca96 100644
--- a/libkms/nouveau.c
+++ b/libkms/nouveau.c
@@ -94,14 +94,18 @@ nouveau_bo_create(struct kms_driver *kms,
if (!bo)
return -ENOMEM;
 
-   if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) {
+   switch (type) {
+   case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8:
pitch = 64 * 4;
size = 64 * 64 * 4;
-   } else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) {
+   break;
+   case KMS_BO_TYPE_SCANOUT_X8R8G8B8:
pitch = width * 4;
		pitch = (pitch + 512 - 1) & ~(512 - 1);
size = pitch * height;
-   } else {
+   break;
+   default:
+   free(bo);
return -EINVAL;
}
 
@@ -114,8 +118,10 @@ nouveau_bo_create(struct kms_driver *kms,
arg.channel_hint = 0;
 
	ret = drmCommandWriteRead(kms->fd, DRM_NOUVEAU_GEM_NEW, &arg, 
sizeof(arg));
-   if (ret)
-   goto err_free;
+   if (ret) {
+   free(bo);
+   return ret;
+   }
 
	bo->base.kms = kms;
	bo->base.handle = arg.info.handle;
@@ -126,10 +132,6 @@ nouveau_bo_create(struct kms_driver *kms,
	*out = &bo->base;
 
return 0;
-
-err_free:
-   free(bo);
-   return ret;
 }
 
 static int
-- 
1.7.7



[PATCH 1/5] libkms/intel.c: Fix a memory leak and a dead assignment as well as clean up code a bit.

2012-07-13 Thread Johannes Obermayr
---
 libkms/intel.c |   25 ++---
 1 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/libkms/intel.c b/libkms/intel.c
index 8b8249b..b8ac343 100644
--- a/libkms/intel.c
+++ b/libkms/intel.c
@@ -93,14 +93,18 @@ intel_bo_create(struct kms_driver *kms,
if (!bo)
return -ENOMEM;
 
-   if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) {
+   switch (type) {
+   case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8:
pitch = 64 * 4;
size = 64 * 64 * 4;
-   } else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) {
+   break;
+   case KMS_BO_TYPE_SCANOUT_X8R8G8B8:
pitch = width * 4;
		pitch = (pitch + 512 - 1) & ~(512 - 1);
		size = pitch * ((height + 4 - 1) & ~(4 - 1));
-   } else {
+   break;
+   default:
+   free(bo);
return -EINVAL;
}
 
@@ -108,8 +112,10 @@ intel_bo_create(struct kms_driver *kms,
arg.size = size;
 
	ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_CREATE, &arg, 
sizeof(arg));
-   if (ret)
-   goto err_free;
+   if (ret) {
+   free(bo);
+   return ret;
+   }
 
	bo->base.kms = kms;
	bo->base.handle = arg.handle;
@@ -124,21 +130,18 @@ intel_bo_create(struct kms_driver *kms,
	tile.handle = bo->base.handle;
	tile.tiling_mode = I915_TILING_X;
	tile.stride = bo->base.pitch;
-
-   ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, 
&tile, sizeof(tile));
 #if 0
+   ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, 
&tile, sizeof(tile));
if (ret) {
kms_bo_destroy(out);
return ret;
}
+#else
+   drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, 
sizeof(tile));
 #endif
}
 
return 0;
-
-err_free:
-   free(bo);
-   return ret;
 }
 
 static int
-- 
1.7.7



[PATCH 3/5] nouveau/nouveau.c: Fix two memory leaks.

2012-07-13 Thread Johannes Obermayr
---
 nouveau/nouveau.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/nouveau/nouveau.c b/nouveau/nouveau.c
index 5aa4107..e91287f 100644
--- a/nouveau/nouveau.c
+++ b/nouveau/nouveau.c
@@ -95,6 +95,7 @@ nouveau_device_wrap(int fd, int close, struct nouveau_device 
**pdev)
	    (dev->drm_version < 0x0100 ||
	     dev->drm_version >= 0x0200)) {
nouveau_device_del(dev);
+   free(nvdev);
return -EINVAL;
}
 
@@ -105,6 +106,7 @@ nouveau_device_wrap(int fd, int close, struct 
nouveau_device **pdev)
	ret = nouveau_getparam(dev, NOUVEAU_GETPARAM_AGP_SIZE, &gart);
if (ret) {
nouveau_device_del(dev);
+   free(nvdev);
return ret;
}
 
-- 
1.7.7



[PATCH 4/5] xf86drm.c: Make more code UDEV irrelevant and fix a memory leak.

2012-07-13 Thread Johannes Obermayr
---
 xf86drm.c |   12 +---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/xf86drm.c b/xf86drm.c
index 6ea068f..e3789c8 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -255,6 +255,7 @@ static int drmMatchBusID(const char *id1, const char *id2, 
int pci_domain_ok)
 return 0;
 }
 
+#if !defined(UDEV)
 /**
  * Handles error checking for chown call.
  *
@@ -284,6 +285,7 @@ static int chown_check_return(const char *path, uid_t 
owner, gid_t group)
path, errno, strerror(errno));
return -1;
 }
+#endif
 
 /**
  * Open the DRM device, creating it if necessary.
@@ -303,14 +305,17 @@ static int drmOpenDevice(long dev, int minor, int type)
 stat_t  st;
 charbuf[64];
 int fd;
+#if !defined(UDEV)
 mode_t  devmode = DRM_DEV_MODE, serv_mode;
 int isroot  = !geteuid();
 uid_t   user= DRM_DEV_UID;
 gid_t   group   = DRM_DEV_GID, serv_group;
-
+#endif
+
 sprintf(buf, type ? DRM_DEV_NAME : DRM_CONTROL_DEV_NAME, DRM_DIR_NAME, 
minor);
    drmMsg("drmOpenDevice: node name is %s\n", buf);
 
+#if !defined(UDEV)
 if (drm_server_info) {
	drm_server_info->get_perms(&serv_group, &serv_mode);
devmode  = serv_mode ? serv_mode : DRM_DEV_MODE;
@@ -318,7 +323,6 @@ static int drmOpenDevice(long dev, int minor, int type)
	group = (serv_group >= 0) ? serv_group : DRM_DEV_GID;
 }
 
-#if !defined(UDEV)
	if (stat(DRM_DIR_NAME, &st)) {
if (!isroot)
return DRM_ERR_NOT_ROOT;
@@ -1395,8 +1399,10 @@ drm_context_t *drmGetReservedContextList(int fd, int 
*count)
 }
 
 res.contexts = list;
-    if (drmIoctl(fd, DRM_IOCTL_RES_CTX, &res))
+    if (drmIoctl(fd, DRM_IOCTL_RES_CTX, &res)) {
+   drmFree(retval);
return NULL;
+}
 
    for (i = 0; i < res.count; i++)
retval[i] = list[i].handle;
-- 
1.7.7



[PATCH 5/5] modetest.c: Add return 0 in bit_name_fn(res) macro.

2012-07-13 Thread Johannes Obermayr
---
 tests/modetest/modetest.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/tests/modetest/modetest.c b/tests/modetest/modetest.c
index ec3121e..00129fa 100644
--- a/tests/modetest/modetest.c
+++ b/tests/modetest/modetest.c
@@ -128,6 +128,7 @@ char * res##_str(int type) {
\
	sep = ", "; \
}   \
}   \
+   return 0;   \
 }
 
 static const char *mode_type_names[] = {
-- 
1.7.7



Re: libdrm: Fix some warnings reported by clang's scan-build tool

2012-07-13 Thread Marcin Slusarz
On Fri, Jul 13, 2012 at 05:49:12PM +0200, Johannes Obermayr wrote:
 
 Patches 1 to 4 were sent to mesa-dev.

And you chose to ignore most of my comments.
Fine. Don't expect further reviews from me.

Marcin


RE: [RFC] dma-fence: dma-buf synchronization (v2)

2012-07-13 Thread Tom Cooksey
Hi Rob,

Yes, sorry we've been a bit slack progressing KDS publicly. Your
approach looks interesting and seems like it could enable both implicit
and explicit synchronization. A good compromise.


 From: Rob Clark r...@ti.com
 
 A dma-fence can be attached to a buffer which is being filled or
 consumed by hw, to allow userspace to pass the buffer to another
 device without waiting.  For example, userspace can call page_flip ioctl to
 display the next frame of graphics after kicking the GPU but while the
 GPU is still rendering.  The display device sharing the buffer with the
 GPU would attach a callback to get notified when the GPU's rendering-
 complete IRQ fires, to update the scan-out address of the display,
 without having to wake up userspace.
 
 A dma-fence is a transient, one-shot deal.  It is allocated and attached
 to the dma-buf's list of fences.  When the one that attached it is done
 with the pending operation, it can signal the fence, removing it from
 the dma-buf's list of fences:
 
   + dma_buf_attach_fence()
   + dma_fence_signal()

It would be useful to have two lists of fences, those around writes to
the buffer and those around reads. The idea being that if you only want
to read from a buffer, you don't need to wait for fences around other
read operations, you only need to wait for the last writer fence. If
you do want to write to the buffer however, you need to wait for all
the read fences and the last writer fence. The use-case is when EGL
swap behaviour is EGL_BUFFER_PRESERVED. You have the display controller
reading the buffer with its fence defined to be signalled when it is
no-longer scanning out that buffer. It can only stop scanning out that
buffer when it is given another buffer to scan-out. If that next buffer
must be rendered by copying the currently scanned-out buffer into it
(one possible option for implementing EGL_BUFFER_PRESERVED), then you
essentially deadlock if the scan-out job blocks the "render the next
frame" job. 

There's probably variations of this idea, perhaps you only need a flag
to indicate if a fence is around a read-only or rw access?


 The intention is to provide a userspace interface (presumably via
 eventfd) later, to be used in conjunction with dma-buf's mmap support
 for sw access to buffers (or for userspace apps that would prefer to
 do their own synchronization).

From our experience with our own KDS, we've come up with an interesting
approach to synchronizing userspace applications which have a buffer
mmap'd. We wanted to avoid userspace being able to block jobs running
on hardware while still allowing userspace to participate. Our original
idea was to have a lock/unlock ioctl interface on a dma_buf but have
a timeout whereby the application's lock would be broken if held for
too long. That at least bounded how long userspace could potentially
block hardware making progress, though was pretty harsh.

The approach we have now settled on is to instead only allow an
application to wait for all jobs currently pending for a buffer. So
there's no way userspace can prevent anything else from using a
buffer, other than not issuing jobs which will use that buffer.
Also, the interface we settled on was to add a poll handler to
dma_buf; that way userspace can select() on multiple dma_buf
buffers in one syscall. It can also choose whether it wants to wait
for only the last writer fence, i.e. wait until it can read (POLLIN),
or wait for all fences as it wants to write to the buffer (POLLOUT).
We kinda like this, but does restrict the utility a little. An idea
worth considering anyway.


My other thought is around atomicity. Could this be extended to
(safely) allow for hardware devices which might want to access
multiple buffers simultaneously? I think it probably can with
some tweaks to the interface? An atomic function which does
something like "give me all the fences for all these buffers
and add this fence to each" instead/as-well-as?


Cheers,

Tom






libdrm: Fix some warnings reported by clang's scan-build tool [try 2]

2012-07-13 Thread Johannes Obermayr

Am Freitag, 13. Juli 2012, 18:47:50 schrieb Marcin Slusarz:
 On Fri, Jul 13, 2012 at 05:49:12PM +0200, Johannes Obermayr wrote:
  
  Patches 1 to 4 were sent to mesa-dev.
 
 And you chose to ignore most of my comments.
 Fine. Don't expect further reviews from me.
 
 Marcin

Patch 1 and 2:
- Adapted
- I want to keep proposed easier to read switch case

Patch 3:
- Resend
- Waiting on your response: 
http://lists.freedesktop.org/archives/mesa-dev/2012-June/023456.html

Patch 4 and 5:
- Split
- http://llvm.org/bugs/show_bug.cgi?id=13358 (forgot to split and to add 
'drmFree(list);')
- The 'more if's case' seems better to me

Patch 6:
- Resend

Marcin, it's not that I ignore comments. But sometimes I also want to hear opinions 
from (some more) other people.
I hope I can calm the waves ...

Johannes


[PATCH 1/6] libkms/intel.c: Fix a memory leak and a dead assignment as well as make some code easier to read.

2012-07-13 Thread Johannes Obermayr
---
 libkms/intel.c |   32 +---
 1 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/libkms/intel.c b/libkms/intel.c
index 8b8249b..12175b0 100644
--- a/libkms/intel.c
+++ b/libkms/intel.c
@@ -89,27 +89,32 @@ intel_bo_create(struct kms_driver *kms,
}
}
 
-   bo = calloc(1, sizeof(*bo));
-   if (!bo)
-   return -ENOMEM;
-
-   if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) {
+   switch (type) {
+   case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8:
pitch = 64 * 4;
size = 64 * 64 * 4;
-   } else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) {
+   break;
+   case KMS_BO_TYPE_SCANOUT_X8R8G8B8:
pitch = width * 4;
		pitch = (pitch + 512 - 1) & ~(512 - 1);
		size = pitch * ((height + 4 - 1) & ~(4 - 1));
-   } else {
+   break;
+   default:
return -EINVAL;
}
 
+   bo = calloc(1, sizeof(*bo));
+   if (!bo)
+   return -ENOMEM;
+
memset(arg, 0, sizeof(arg));
arg.size = size;
 
	ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_CREATE, &arg, 
sizeof(arg));
-   if (ret)
-   goto err_free;
+   if (ret) {
+   free(bo);
+   return ret;
+   }
 
	bo->base.kms = kms;
	bo->base.handle = arg.handle;
@@ -124,21 +129,18 @@ intel_bo_create(struct kms_driver *kms,
	tile.handle = bo->base.handle;
	tile.tiling_mode = I915_TILING_X;
	tile.stride = bo->base.pitch;
-
-   ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, 
&tile, sizeof(tile));
 #if 0
+   ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, 
&tile, sizeof(tile));
if (ret) {
kms_bo_destroy(out);
return ret;
}
+#else
+   drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, 
sizeof(tile));
 #endif
}
 
return 0;
-
-err_free:
-   free(bo);
-   return ret;
 }
 
 static int
-- 
1.7.7



[PATCH 2/6] libkms/nouveau.c: Fix a memory leak and make some code easier to read.

2012-07-13 Thread Johannes Obermayr
---
 libkms/nouveau.c |   27 ++-
 1 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/libkms/nouveau.c b/libkms/nouveau.c
index 0e24a15..fbca6fe 100644
--- a/libkms/nouveau.c
+++ b/libkms/nouveau.c
@@ -90,21 +90,24 @@ nouveau_bo_create(struct kms_driver *kms,
}
}
 
-   bo = calloc(1, sizeof(*bo));
-   if (!bo)
-   return -ENOMEM;
-
-   if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) {
+   switch (type) {
+   case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8:
pitch = 64 * 4;
size = 64 * 64 * 4;
-   } else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) {
+   break;
+   case KMS_BO_TYPE_SCANOUT_X8R8G8B8:
pitch = width * 4;
pitch = (pitch + 512 - 1)  ~(512 - 1);
size = pitch * height;
-   } else {
+   break;
+   default:
return -EINVAL;
}
 
+   bo = calloc(1, sizeof(*bo));
+   if (!bo)
+   return -ENOMEM;
+
memset(arg, 0, sizeof(arg));
arg.info.size = size;
arg.info.domain = NOUVEAU_GEM_DOMAIN_MAPPABLE | NOUVEAU_GEM_DOMAIN_VRAM;
@@ -114,8 +117,10 @@ nouveau_bo_create(struct kms_driver *kms,
arg.channel_hint = 0;
 
	ret = drmCommandWriteRead(kms->fd, DRM_NOUVEAU_GEM_NEW, &arg, 
sizeof(arg));
-   if (ret)
-   goto err_free;
+   if (ret) {
+   free(bo);
+   return ret;
+   }
 
	bo->base.kms = kms;
	bo->base.handle = arg.info.handle;
@@ -126,10 +131,6 @@ nouveau_bo_create(struct kms_driver *kms,
	*out = &bo->base;
 
return 0;
-
-err_free:
-   free(bo);
-   return ret;
 }
 
 static int
-- 
1.7.7



[PATCH 3/6] nouveau/nouveau.c: Fix two memory leaks.

2012-07-13 Thread Johannes Obermayr
---
 nouveau/nouveau.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/nouveau/nouveau.c b/nouveau/nouveau.c
index 5aa4107..e91287f 100644
--- a/nouveau/nouveau.c
+++ b/nouveau/nouveau.c
@@ -95,6 +95,7 @@ nouveau_device_wrap(int fd, int close, struct nouveau_device 
**pdev)
	    (dev->drm_version < 0x0100 ||
	     dev->drm_version >= 0x0200)) {
nouveau_device_del(dev);
+   free(nvdev);
return -EINVAL;
}
 
@@ -105,6 +106,7 @@ nouveau_device_wrap(int fd, int close, struct 
nouveau_device **pdev)
	ret = nouveau_getparam(dev, NOUVEAU_GETPARAM_AGP_SIZE, &gart);
if (ret) {
nouveau_device_del(dev);
+   free(nvdev);
return ret;
}
 
-- 
1.7.7



[PATCH 4/6] xf86drm.c: Make more code UDEV irrelevant.

2012-07-13 Thread Johannes Obermayr
---
 xf86drm.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/xf86drm.c b/xf86drm.c
index 6ea068f..e652731 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -255,6 +255,7 @@ static int drmMatchBusID(const char *id1, const char *id2, 
int pci_domain_ok)
 return 0;
 }
 
+#if !defined(UDEV)
 /**
  * Handles error checking for chown call.
  *
@@ -284,6 +285,7 @@ static int chown_check_return(const char *path, uid_t 
owner, gid_t group)
path, errno, strerror(errno));
return -1;
 }
+#endif
 
 /**
  * Open the DRM device, creating it if necessary.
@@ -303,14 +305,17 @@ static int drmOpenDevice(long dev, int minor, int type)
 stat_t  st;
 charbuf[64];
 int fd;
+#if !defined(UDEV)
 mode_t  devmode = DRM_DEV_MODE, serv_mode;
 int isroot  = !geteuid();
 uid_t   user= DRM_DEV_UID;
 gid_t   group   = DRM_DEV_GID, serv_group;
-
+#endif
+
 sprintf(buf, type ? DRM_DEV_NAME : DRM_CONTROL_DEV_NAME, DRM_DIR_NAME, 
minor);
    drmMsg("drmOpenDevice: node name is %s\n", buf);
 
+#if !defined(UDEV)
 if (drm_server_info) {
	drm_server_info->get_perms(&serv_group, &serv_mode);
devmode  = serv_mode ? serv_mode : DRM_DEV_MODE;
@@ -318,7 +323,6 @@ static int drmOpenDevice(long dev, int minor, int type)
	group = (serv_group >= 0) ? serv_group : DRM_DEV_GID;
 }
 
-#if !defined(UDEV)
	if (stat(DRM_DIR_NAME, &st)) {
if (!isroot)
return DRM_ERR_NOT_ROOT;
-- 
1.7.7



[PATCH 5/6] xf86drm.c: Fix two memory leaks.

2012-07-13 Thread Johannes Obermayr
---
 xf86drm.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/xf86drm.c b/xf86drm.c
index e652731..c1cc170 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -1399,8 +1399,11 @@ drm_context_t *drmGetReservedContextList(int fd, int 
*count)
 }
 
 res.contexts = list;
-    if (drmIoctl(fd, DRM_IOCTL_RES_CTX, &res))
+    if (drmIoctl(fd, DRM_IOCTL_RES_CTX, &res)) {
+   drmFree(list);
+   drmFree(retval);
return NULL;
+}
 
    for (i = 0; i < res.count; i++)
retval[i] = list[i].handle;
-- 
1.7.7



[PATCH 6/6] modetest.c: Add return 0 in bit_name_fn(res) macro.

2012-07-13 Thread Johannes Obermayr
---
 tests/modetest/modetest.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/tests/modetest/modetest.c b/tests/modetest/modetest.c
index ec3121e..00129fa 100644
--- a/tests/modetest/modetest.c
+++ b/tests/modetest/modetest.c
@@ -128,6 +128,7 @@ char * res##_str(int type) {
\
	sep = ", "; \
}   \
}   \
+   return 0;   \
 }
 
 static const char *mode_type_names[] = {
-- 
1.7.7



Re: [RFC] dma-fence: dma-buf synchronization (v2)

2012-07-13 Thread Rob Clark
On Fri, Jul 13, 2012 at 12:35 PM, Tom Cooksey tom.cook...@arm.com wrote:
 My other thought is around atomicity. Could this be extended to
 (safely) allow for hardware devices which might want to access
 multiple buffers simultaneously? I think it probably can with
 some tweaks to the interface? An atomic function which does
 something like "give me all the fences for all these buffers
 and add this fence to each" instead/as-well-as?

fwiw, what I'm leaning towards right now is combining dma-fence w/
Maarten's idea of dma-buf-mgr (not sure if you saw his patches?).  And
let dmabufmgr handle the multi-buffer reservation stuff.  And possibly
the read vs write access, although this I'm not 100% sure on... the
other option being the concept of read vs write (or
exclusive/non-exclusive) fences.

In the current state, the fence is quite simple, and doesn't care
*what* it is fencing, which seems advantageous when you get into
trying to deal with combinations of devices sharing buffers, some of
whom can do hw sync, and some who can't.  So having a bit of
partitioning from the code dealing w/ sequencing who can access the
buffers when and for what purpose seems like it might not be a bad
idea.  Although I'm still working through the different alternatives.

BR,
-R


[pull] drm-intel-next

2012-07-13 Thread Daniel Vetter
Hi Dave,

New pull for -next. Highlights:
- rc6/turbo support for hsw (Eugeni)
- improve corner-case of the reset handling code - gpu reset handling
  should be rock-solid now
- support for fb offsets > 4096 pixels on gen4+ (yeah, you need some fairly
  big screens to hit that)
- the "Flush Me Harder" patch to fix the gen6+ fallout from disabling the
  flushing_list
- no more /dev/agpgart on gen6+!
- HAS_PCH_xxx improvements from Paulo
- a few minor bits and pieces all over, most of it in the hsw code

QA reported 2 regressions, one due to a bad cable (fixed by a walk to the
next Radioshack) and one due to the HPD v2 patch - I owe you one for refusing
to take v2 for -fixes after v1 blew up on Linus' machine, I guess ;-) The
latter has a confirmed fix already queued up in my tree.

Regressions from the last pull are all fixed and some really good news:
We've finally fixed the last DP regression from 3.2. Although I'm wary of
that blowing up elsewhere, hence I prefer that we soak it in 3.6 a bit
before submitting it to stable.

Otherwise Chris is hunting down an obscure bug that got recently
introduced due to a funny interaction between two seemingly unrelated
patches, one improving our gpu death handling, the other preparing the
removal of the flushing_list. But he has patches already, although I'm
still complaining a bit about the commit messages ...

Wrt further pulls for 3.6 I'll merge feature-y stuff only at the end of
the current drm-intel-next cycle so that if this will miss 3.6 I can just
send you a pull for the bugfixes that are currently merged (or in the case
of Chris' patches, hopefully merged soon).

Yours, Daniel

PS: This pull will make the already existing conflict with Linus' tree a
bit more fun, but I think it should be still doable (the important thing
is to keep the revert from -fixes, but don't kill any other changes from
-next).

The following changes since commit 7b0cfee1a24efdfe0235bac62e53f686fe8a8e24:

  Merge tag 'v3.5-rc4' into drm-intel-next-queued (2012-06-25 19:10:36 +0200)

are available in the git repository at:


  git://people.freedesktop.org/~danvet/drm-intel tags/drm-intel-next-2012-07-06

for you to fetch changes up to 4acf518626cdad5bbf7aac9869bd4accbbfb4ad3:

  drm/i915: program FDI_RX TP and FDI delays (2012-07-05 15:09:03 +0200)


Ben Widawsky (1):
  drm/i915: linuxify create_hw_context()

Chris Wilson (2):
  drm/i915: Group the GT routines together in both code and vtable
  drm/i915: Implement w/a for sporadic read failures on waking from rc6

Daniel Vetter (15):
  drm/i915: wrap up gt powersave enabling functions
  drm/i915: make enable/disable_gt_powersave locking consistent
  drm/i915: don't use dev->agp
  drm/i915: disable drm agp support for !gen3 with kms enabled
  agp/intel-agp: remove snb+ host bridge pciids
  drm/i915: Flush Me Harder required on gen6+
  drm/i915: fix up ilk rc6 disabling confusion
  drm/i915: don't trylock in the gpu reset code
  drm/i915: non-interruptible sleeps can't handle -EAGAIN
  drm/i915: don't hang userspace when the gpu reset is stuck
  drm/i915: properly SIGBUS on I/O errors
  drm/i915: don't return a spurious -EIO from intel_ring_begin
  drm/i915: introduce crtc->dspaddr_offset
  drm/i915: adjust framebuffer base address on gen4+
  drm/i915: introduce for_each_encoder_on_crtc

Eugeni Dodonov (11):
  drm/i915: support Haswell force waking
  drm/i915: add RPS configuration for Haswell
  drm/i915: slightly improve gt enable/disable routines
  drm/i915: enable RC6 by default on Haswell
  drm/i915: disable RC6 when disabling rps
  drm/i915: introduce haswell_init_clock_gating
  drm/i915: enable RC6 workaround on Haswell
  drm/i915: move force wake support into intel_pm
  drm/i915: re-initialize DDI buffer translations after resume
  drm/i915: prevent bogus intel_update_fbc notifications
  drm/i915: program FDI_RX TP and FDI delays

Jesper Juhl (1):
  drm/i915/sprite: Fix mem leak in intel_plane_init()

Jesse Barnes (3):
  drm/i915: mask tiled bit when updating IVB sprites
  drm/i915: correct IVB default sprite format
  drm/i915: prefer wide & slow to fast & narrow in DP configs

Paulo Zanoni (5):
  drm/i915: fix PIPE_WM_LINETIME definition
  drm/i915: add PCH_NONE to enum intel_pch
  drm/i915: get rid of dev_priv->info->has_pch_split
  drm/i915: don't ironlake_init_pch_refclk() on LPT
  drm/i915: fix PIPE_DDI_PORT_MASK

Ville Syrjälä (2):
  drm/i915: Zero initialize mode_cmd
  drm/i915: Reject page flips with changed format/offset/pitch

 drivers/char/agp/intel-agp.c|   11 -
 drivers/gpu/drm/i915/i915_dma.c |9 +-
 drivers/gpu/drm/i915/i915_drv.c |  172 ++
 drivers/gpu/drm/i915/i915_drv.h |   28 ++-
 drivers/gpu/drm/i915/i915_gem.c |   44 +++-
 

[PATCH] nouveau: Add irq waiting as alternative to busywait

2012-07-13 Thread Maarten Lankhorst
A way to trigger an irq will be needed for optimus support since
cpu-waiting isn't always viable there. This could also be nice for
power saving, since the cpu would no longer have to spin, and
performance might improve slightly on cpu-limited workloads.

Some way to quantify these effects would be nice, even if the
end result would be 'no performance regression'. An earlier
version always emitted an interrupt, resulting in glxgears going
from 8k fps to 7k. However this is no longer the case, as I'm
using the kernel submission channel for generating irqs as
needed now.

On nv84 I'm using NOTIFY_INTR, but that might have been
removed on fermi, so instead I'm using invalid command
0x0058 now as a way to signal completion.

Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com

---
 drivers/gpu/drm/nouveau/nouveau_drv.h   |2 +
 drivers/gpu/drm/nouveau/nouveau_fence.c |   49 ---
 drivers/gpu/drm/nouveau/nouveau_fifo.h  |1 +
 drivers/gpu/drm/nouveau/nouveau_state.c |1 +
 drivers/gpu/drm/nouveau/nv04_fifo.c |   25 
 drivers/gpu/drm/nouveau/nv84_fence.c|   18 +--
 drivers/gpu/drm/nouveau/nvc0_fence.c|   12 ++--
 drivers/gpu/drm/nouveau/nvc0_fifo.c |3 +-
 drivers/gpu/drm/nouveau/nve0_fifo.c |   15 +++--
 9 files changed, 110 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h 
b/drivers/gpu/drm/nouveau/nouveau_drv.h
index f97a1a7..d9d274d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -707,6 +707,7 @@ struct drm_nouveau_private {
struct drm_mm heap;
struct nouveau_bo *bo;
} fence;
+   wait_queue_head_t fence_wq;
 
struct {
spinlock_t lock;
@@ -1656,6 +1657,7 @@ nv44_graph_class(struct drm_device *dev)
 #define NV84_SUBCHAN_WRCACHE_FLUSH   0x0024
 #define NV10_SUBCHAN_REF_CNT 0x0050
 #define NVSW_SUBCHAN_PAGE_FLIP   0x0054
+#define NVSW_SUBCHAN_FENCE_WAKE  0x0058
 #define NV11_SUBCHAN_DMA_SEMAPHORE   0x0060
 #define NV11_SUBCHAN_SEMAPHORE_OFFSET0x0064
 #define NV11_SUBCHAN_SEMAPHORE_ACQUIRE   0x0068
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c 
b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 3c18049..3ba8dee 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -68,7 +68,7 @@ nouveau_fence_update(struct nouveau_channel *chan)
 
	spin_lock(&fctx->lock);
	list_for_each_entry_safe(fence, fnext, &fctx->pending, head) {
-   if (priv->read(chan) < fence->sequence)
+   if (priv->read(chan) - fence->sequence >= 0x8000U)
			break;
 
		if (fence->work)
@@ -111,11 +111,9 @@ nouveau_fence_done(struct nouveau_fence *fence)
	return !fence->channel;
 }
 
-int
-nouveau_fence_wait(struct nouveau_fence *fence, bool lazy, bool intr)
+static int nouveau_fence_wait_busy(struct nouveau_fence *fence, bool lazy, 
bool intr)
 {
unsigned long sleep_time = NSEC_PER_MSEC / 1000;
-   ktime_t t;
int ret = 0;
 
while (!nouveau_fence_done(fence)) {
@@ -127,7 +125,7 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool lazy, 
bool intr)
__set_current_state(intr ? TASK_INTERRUPTIBLE :
   TASK_UNINTERRUPTIBLE);
if (lazy) {
-   t = ktime_set(0, sleep_time);
+   ktime_t t = ktime_set(0, sleep_time);
			schedule_hrtimeout(&t, HRTIMER_MODE_REL);
sleep_time *= 2;
			if (sleep_time > NSEC_PER_MSEC)
@@ -144,6 +142,47 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool lazy, 
bool intr)
return ret;
 }
 
+static int nouveau_fence_wait_event(struct nouveau_fence *fence, bool intr)
+{
+   struct drm_nouveau_private *dev_priv = fence->channel->dev->dev_private;
+   unsigned long timeout = fence->timeout;
+   int ret = 0;
+   struct nouveau_channel *chan = dev_priv->channel;
+   struct nouveau_channel *prev = fence->channel;
+   struct nouveau_fence_priv *priv = nv_engine(chan->dev, 
NVOBJ_ENGINE_FENCE);
+
+   if (nouveau_fence_done(fence))
+   return 0;
+
+   if (!timeout)
+   timeout = jiffies + 3 * DRM_HZ;
+
+   if (prev != chan)
+   ret = priv->sync(fence, prev, chan);
+   if (ret)
+   goto busy;
+
+   if (intr)
+   ret = wait_event_interruptible_timeout(dev_priv->fence_wq, 
nouveau_fence_done(fence), timeout);
+   else
+   ret = wait_event_timeout(dev_priv->fence_wq, 
nouveau_fence_done(fence), 

Re: [RFC] dma-fence: dma-buf synchronization (v2)

2012-07-13 Thread Rob Clark
On Fri, Jul 13, 2012 at 4:44 PM, Maarten Lankhorst
maarten.lankho...@canonical.com wrote:
 Hey,

 Op 13-07-12 20:52, Rob Clark schreef:
 On Fri, Jul 13, 2012 at 12:35 PM, Tom Cooksey tom.cook...@arm.com wrote:
 My other thought is around atomicity. Could this be extended to
 (safely) allow for hardware devices which might want to access
 multiple buffers simultaneously? I think it probably can with
 some tweaks to the interface? An atomic function which does
  something like "give me all the fences for all these buffers
  and add this fence to each" instead/as-well-as?
 fwiw, what I'm leaning towards right now is combining dma-fence w/
 Maarten's idea of dma-buf-mgr (not sure if you saw his patches?).  And
 let dmabufmgr handle the multi-buffer reservation stuff.  And possibly
 the read vs write access, although this I'm not 100% sure on... the
 other option being the concept of read vs write (or
 exclusive/non-exclusive) fences.
 Agreed, dmabufmgr is meant for reserving multiple buffers without deadlocks.
 The underlying mechanism for synchronization can be dma-fences, it wouldn't
 really change dmabufmgr much.
 In the current state, the fence is quite simple, and doesn't care
 *what* it is fencing, which seems advantageous when you get into
 trying to deal with combinations of devices sharing buffers, some of
 whom can do hw sync, and some who can't.  So having a bit of
 partitioning from the code dealing w/ sequencing who can access the
 buffers when and for what purpose seems like it might not be a bad
 idea.  Although I'm still working through the different alternatives.

 Yeah, I managed to get nouveau hooked up with generating irqs on
 completion today using an invalid command. It's also no longer a
 performance regression, so software syncing is no longer a problem
 for nouveau. i915 already generates irqs and r600 presumably too.

 Monday I'll take a better look at your patch, end of day now. :)

let me send you a slightly updated version.. I fixed locally some
locking fail in attach_fence() and get_fence() that I managed to
introduce when converting from global spinlock to using the
waitqueue's spinlock.

BR,
-R

 ~Maarten
 --


Re: [PATCH] nouveau: Add irq waiting as alternative to busywait

2012-07-13 Thread Maarten Maathuis
On Fri, Jul 13, 2012 at 11:35 PM, Maarten Lankhorst
m.b.lankho...@gmail.com wrote:
 A way to trigger an irq will be needed for optimus support, since
 cpu-waiting isn't always viable there. This could also help power
 saving, since the cpu would no longer have to spin, and performance
 might improve slightly on cpu-limited workloads.
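
 For context, the busy-wait this replaces (the lazy branch of
 nouveau_fence_wait() in the patch further down) sleeps on an hrtimer
 whose duration doubles every poll, starting at NSEC_PER_MSEC / 1000
 (1 us) and capped at NSEC_PER_MSEC (1 ms). A pure-function sketch of
 that backoff, with invented names:

```c
/* Sketch of the exponential polling backoff used by the busy-wait
 * path: 1 us doubling up to a 1 ms cap. Names are illustrative. */
#include <assert.h>
#include <stdint.h>

#define DEMO_NSEC_PER_MSEC 1000000ULL

/* Sleep length in ns used on the n-th polling iteration, n >= 0. */
static uint64_t demo_backoff_ns(unsigned n)
{
	uint64_t sleep_time = DEMO_NSEC_PER_MSEC / 1000; /* 1 us */

	while (n--) {
		sleep_time *= 2;
		if (sleep_time > DEMO_NSEC_PER_MSEC)
			sleep_time = DEMO_NSEC_PER_MSEC;
	}
	return sleep_time;
}
```

 So even the "lazy" wait wakes the cpu roughly every millisecond once
 the backoff saturates, which is the spinning an irq-driven wait avoids.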

 Some way to quantify these effects would be nice, even if the
 end result is just 'no performance regression'. An earlier
 version always emitted an interrupt, which dropped glxgears
 from 8k fps to 7k. That is no longer the case, as I'm now
 using the kernel submission channel to generate irqs only as
 needed.

 On nv84 I'm using NOTIFY_INTR, but that might have been
 removed on fermi, so instead I'm now using the invalid
 command 0x0058 as a way to signal completion.

Out of curiosity, isn't this like a handcoded version of software
methods? If so, why handcoded? Or are software methods not supported
on NVC0?


 Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com

 ---
   drivers/gpu/drm/nouveau/nouveau_drv.h   |    2 +
   drivers/gpu/drm/nouveau/nouveau_fence.c |   49 ---
   drivers/gpu/drm/nouveau/nouveau_fifo.h  |    1 +
   drivers/gpu/drm/nouveau/nouveau_state.c |    1 +
   drivers/gpu/drm/nouveau/nv04_fifo.c     |   25 
   drivers/gpu/drm/nouveau/nv84_fence.c    |   18 +--
   drivers/gpu/drm/nouveau/nvc0_fence.c    |   12 ++--
   drivers/gpu/drm/nouveau/nvc0_fifo.c     |    3 +-
   drivers/gpu/drm/nouveau/nve0_fifo.c     |   15 +++--
  9 files changed, 110 insertions(+), 16 deletions(-)

 diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h b/drivers/gpu/drm/nouveau/nouveau_drv.h
 index f97a1a7..d9d274d 100644
 --- a/drivers/gpu/drm/nouveau/nouveau_drv.h
 +++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
 @@ -707,6 +707,7 @@ struct drm_nouveau_private {
 struct drm_mm heap;
 struct nouveau_bo *bo;
 } fence;
 +   wait_queue_head_t fence_wq;

 struct {
 spinlock_t lock;
 @@ -1656,6 +1657,7 @@ nv44_graph_class(struct drm_device *dev)
  #define NV84_SUBCHAN_WRCACHE_FLUSH                   0x0024
  #define NV10_SUBCHAN_REF_CNT                         0x0050
  #define NVSW_SUBCHAN_PAGE_FLIP                       0x0054
 +#define NVSW_SUBCHAN_FENCE_WAKE                      0x0058
  #define NV11_SUBCHAN_DMA_SEMAPHORE                   0x0060
  #define NV11_SUBCHAN_SEMAPHORE_OFFSET                0x0064
  #define NV11_SUBCHAN_SEMAPHORE_ACQUIRE               0x0068
 diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
 index 3c18049..3ba8dee 100644
 --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
 +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
 @@ -68,7 +68,7 @@ nouveau_fence_update(struct nouveau_channel *chan)
 
 	spin_lock(&fctx->lock);
 	list_for_each_entry_safe(fence, fnext, &fctx->pending, head) {
 -		if (priv->read(chan) < fence->sequence)
 +		if (priv->read(chan) - fence->sequence >= 0x80000000U)
 			break;
 
 		if (fence->work)
 @@ -111,11 +111,9 @@ nouveau_fence_done(struct nouveau_fence *fence)
 	return !fence->channel;
  }

 -int
 -nouveau_fence_wait(struct nouveau_fence *fence, bool lazy, bool intr)
 +static int nouveau_fence_wait_busy(struct nouveau_fence *fence, bool lazy, bool intr)
  {
 	unsigned long sleep_time = NSEC_PER_MSEC / 1000;
 -	ktime_t t;
 	int ret = 0;
 
 	while (!nouveau_fence_done(fence)) {
 @@ -127,7 +125,7 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool lazy, bool intr)
 		__set_current_state(intr ? TASK_INTERRUPTIBLE :
 				    TASK_UNINTERRUPTIBLE);
 		if (lazy) {
 -			t = ktime_set(0, sleep_time);
 +			ktime_t t = ktime_set(0, sleep_time);
 			schedule_hrtimeout(&t, HRTIMER_MODE_REL);
 			sleep_time *= 2;
 			if (sleep_time > NSEC_PER_MSEC)
 @@ -144,6 +142,47 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool lazy, bool intr)
 	return ret;
  }

 +static int nouveau_fence_wait_event(struct nouveau_fence *fence, bool intr)
 +{
 +	struct drm_nouveau_private *dev_priv = fence->channel->dev->dev_private;
 +	unsigned long timeout = fence->timeout;
 +	int ret = 0;
 +	struct nouveau_channel *chan = dev_priv->channel;
 +	struct nouveau_channel *prev = fence->channel;
 +	struct nouveau_fence_priv *priv = nv_engine(chan->dev, NVOBJ_ENGINE_FENCE);
 +
 +	if (nouveau_fence_done(fence))
 +		return 0;
 +
 +	if (!timeout)
 +		timeout = jiffies + 3 * DRM_HZ;
 +
 +	if (prev != 
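
The sequence test changed in the nouveau_fence_update() hunk above
replaces a direct `read < sequence` comparison with an
unsigned-difference check, which stays correct when the 32-bit
sequence counter wraps around. A minimal sketch (0x80000000U, half the
32-bit range, is the usual window for this idiom; the exact constant
in the quoted patch may differ):

```c
/* Wrap-safe "is this fence still pending?" check on a 32-bit
 * sequence counter. Names are illustrative, not the nouveau API. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the hardware read pointer has not yet reached sequence.
 * If sequence is "ahead" of read, the unsigned difference wraps
 * into the upper half of the 32-bit range. */
static bool demo_fence_pending(uint32_t read, uint32_t sequence)
{
	return read - sequence >= 0x80000000U;
}
```

With a plain `read < sequence` test, a fence emitted just after the
counter wraps (small sequence, huge read) would appear signaled
forever; the difference-based form handles that case.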

Re: general protection fault on ttm_init()

2012-07-13 Thread Dave Airlie
Can you try this patch on top of the previous one?

I think it should fix it.

Dave.


0001-drm-set-drm_class-to-NULL-after-removing-it.patch
Description: Binary data


Re: general protection fault on ttm_init()

2012-07-13 Thread Fengguang Wu
Hi Dave,

On Sat, Jul 14, 2012 at 01:33:45PM +1000, Dave Airlie wrote:
 Can you try this patch on top of the previous one?
 
 I think it should fix it.

You are right, it works!  Thank you very much! :-)

Thanks,
Fengguang