[Bug 52054] New: gallium/opencl doesn't support includes for opencl kernels
https://bugs.freedesktop.org/show_bug.cgi?id=52054

           Bug #: 52054
          Summary: gallium/opencl doesn't support includes for opencl kernels
   Classification: Unclassified
          Product: Mesa
          Version: git
         Platform: x86-64 (AMD64)
       OS/Version: All
           Status: NEW
         Severity: normal
         Priority: medium
        Component: Drivers/Gallium/r600
       AssignedTo: dri-devel at lists.freedesktop.org
       ReportedBy: alexxy at gentoo.org

When running tests for OpenCL-enabled jtr (http://www.openwall.com/john/)
I get the following error:

OpenCL platform 0: Default, 1 device(s).
Using device 0: AMD JUNIPER
1 error generated.
Compilation log:
cl_input:17:10: fatal error: 'opencl_rar.h' file not found
#include "opencl_rar.h"
         ^
OpenCL error (CL_INVALID_PROGRAM_EXECUTABLE) in file (rar_fmt.c) at line (588) -
(Error creating kernel. Double-check kernel name?)

xeon ~ # ./clInfo
Found 1 platform(s).
platform[(nil)]: profile: FULL_PROFILE
platform[(nil)]: version: OpenCL 1.1 MESA
platform[(nil)]: name: Default
platform[(nil)]: vendor: Mesa
platform[(nil)]: extensions:
platform[(nil)]: Found 1 device(s).
device[0xc82360]: NAME: AMD JUNIPER
device[0xc82360]: VENDOR: X.Org
device[0xc82360]: PROFILE: FULL_PROFILE
device[0xc82360]: VERSION: OpenCL 1.1 MESA
device[0xc82360]: EXTENSIONS:
device[0xc82360]: DRIVER_VERSION:
device[0xc82360]: Type: GPU
device[0xc82360]: EXECUTION_CAPABILITIES: Kernel
device[0xc82360]: GLOBAL_MEM_CACHE_TYPE: None (0)
device[0xc82360]: CL_DEVICE_LOCAL_MEM_TYPE: Local (1)
device[0xc82360]: SINGLE_FP_CONFIG: 0x7
device[0xc82360]: QUEUE_PROPERTIES: 0x2
device[0xc82360]: VENDOR_ID: 4098
device[0xc82360]: MAX_COMPUTE_UNITS: 1
device[0xc82360]: MAX_WORK_ITEM_DIMENSIONS: 3
device[0xc82360]: MAX_WORK_GROUP_SIZE: 256
device[0xc82360]: PREFERRED_VECTOR_WIDTH_CHAR: 16
device[0xc82360]: PREFERRED_VECTOR_WIDTH_SHORT: 8
device[0xc82360]: PREFERRED_VECTOR_WIDTH_INT: 4
device[0xc82360]: PREFERRED_VECTOR_WIDTH_LONG: 2
device[0xc82360]: PREFERRED_VECTOR_WIDTH_FLOAT: 4
device[0xc82360]: PREFERRED_VECTOR_WIDTH_DOUBLE: 2
device[0xc82360]: MAX_CLOCK_FREQUENCY: 0
device[0xc82360]: ADDRESS_BITS: 32
device[0xc82360]: MAX_MEM_ALLOC_SIZE: 0
device[0xc82360]: IMAGE_SUPPORT: 1
device[0xc82360]: MAX_READ_IMAGE_ARGS: 32
device[0xc82360]: MAX_WRITE_IMAGE_ARGS: 32
device[0xc82360]: IMAGE2D_MAX_WIDTH: 32768
device[0xc82360]: IMAGE2D_MAX_HEIGHT: 32768
device[0xc82360]: IMAGE3D_MAX_WIDTH: 32768
device[0xc82360]: IMAGE3D_MAX_HEIGHT: 32768
device[0xc82360]: IMAGE3D_MAX_DEPTH: 32768
device[0xc82360]: MAX_SAMPLERS: 16
device[0xc82360]: MAX_PARAMETER_SIZE: 1024
device[0xc82360]: MEM_BASE_ADDR_ALIGN: 128
device[0xc82360]: MIN_DATA_TYPE_ALIGN_SIZE: 128
device[0xc82360]: GLOBAL_MEM_CACHELINE_SIZE: 0
device[0xc82360]: GLOBAL_MEM_CACHE_SIZE: 0
device[0xc82360]: GLOBAL_MEM_SIZE: 201326592
device[0xc82360]: MAX_CONSTANT_BUFFER_SIZE: 65536
device[0xc82360]: MAX_CONSTANT_ARGS: 1
device[0xc82360]: LOCAL_MEM_SIZE: 32768
device[0xc82360]: ERROR_CORRECTION_SUPPORT: 0
device[0xc82360]: PROFILING_TIMER_RESOLUTION: 0
device[0xc82360]: ENDIAN_LITTLE: 1
device[0xc82360]: AVAILABLE: 1
device[0xc82360]: COMPILER_AVAILABLE: 1
[pull] drm-intel-next
Hi Dave,

New pull for -next. Highlights:

- rc6/turbo support for hsw (Eugeni)
- improved corner cases of the reset handling code - gpu reset handling
  should be rock-solid now
- support for fb offsets > 4096 pixels on gen4+ (yeah, you need some fairly
  big screens to hit that)
- the "Flush Me Harder" patch to fix the gen6+ fallout from disabling the
  flushing_list
- no more /dev/agpgart on gen6+!
- HAS_PCH_xxx improvements from Paulo
- a few minor bits all over, most of them in the hsw code

QA reported 2 regressions, one due to a bad cable (fixed by a walk to the
next RadioShack) and one due to the HPD v2 patch - I owe you one for
refusing to take v2 for -fixes after v1 blew up on Linus' machine, I guess
;-) The latter has a confirmed fix already queued up in my tree.

Regressions from the last pull are all fixed, and some really good news:
we've finally fixed the last DP regression from 3.2. Although I'm wary of
that blowing up elsewhere, hence I prefer that we soak it in 3.6 a bit
before submitting it to stable.

Otherwise Chris is hunting down an obscure bug that was recently introduced
by a funny interaction between two seemingly unrelated patches, one
improving our gpu death handling, the other preparing the removal of the
flushing_list. But he has patches already, although I'm still complaining a
bit about the commit messages ...

Wrt further pulls for 3.6 I'll merge feature-y stuff only at the end of the
current drm-intel-next cycle, so that if this misses 3.6 I can just send
you a pull for the bugfixes that are currently merged (or, in the case of
Chris' patches, hopefully merged soon).

Yours, Daniel

PS: This pull will make the already existing conflict with Linus' tree a
bit more fun, but I think it should still be doable (the important thing is
to keep the revert from -fixes, but don't kill any other changes from
-next).
The following changes since commit 7b0cfee1a24efdfe0235bac62e53f686fe8a8e24:

  Merge tag 'v3.5-rc4' into drm-intel-next-queued (2012-06-25 19:10:36 +0200)

are available in the git repository at:

  git://people.freedesktop.org/~danvet/drm-intel tags/drm-intel-next-2012-07-06

for you to fetch changes up to 4acf518626cdad5bbf7aac9869bd4accbbfb4ad3:

  drm/i915: program FDI_RX TP and FDI delays (2012-07-05 15:09:03 +0200)

Ben Widawsky (1):
      drm/i915: linuxify create_hw_context()

Chris Wilson (2):
      drm/i915: Group the GT routines together in both code and vtable
      drm/i915: Implement w/a for sporadic read failures on waking from rc6

Daniel Vetter (15):
      drm/i915: wrap up gt powersave enabling functions
      drm/i915: make enable/disable_gt_powersave locking consistent
      drm/i915: don't use dev->agp
      drm/i915: disable drm agp support for !gen3 with kms enabled
      agp/intel-agp: remove snb+ host bridge pciids
      drm/i915: "Flush Me Harder" required on gen6+
      drm/i915: fix up ilk rc6 disabling confusion
      drm/i915: don't trylock in the gpu reset code
      drm/i915: non-interruptible sleeps can't handle -EAGAIN
      drm/i915: don't hang userspace when the gpu reset is stuck
      drm/i915: properly SIGBUS on I/O errors
      drm/i915: don't return a spurious -EIO from intel_ring_begin
      drm/i915: introduce crtc->dspaddr_offset
      drm/i915: adjust framebuffer base address on gen4+
      drm/i915: introduce for_each_encoder_on_crtc

Eugeni Dodonov (11):
      drm/i915: support Haswell force waking
      drm/i915: add RPS configuration for Haswell
      drm/i915: slightly improve gt enable/disable routines
      drm/i915: enable RC6 by default on Haswell
      drm/i915: disable RC6 when disabling rps
      drm/i915: introduce haswell_init_clock_gating
      drm/i915: enable RC6 workaround on Haswell
      drm/i915: move force wake support into intel_pm
      drm/i915: re-initialize DDI buffer translations after resume
      drm/i915: prevent bogus intel_update_fbc notifications
      drm/i915: program FDI_RX TP and FDI delays

Jesper Juhl (1):
      drm/i915/sprite: Fix mem leak in intel_plane_init()

Jesse Barnes (3):
      drm/i915: mask tiled bit when updating IVB sprites
      drm/i915: correct IVB default sprite format
      drm/i915: prefer wide & slow to fast & narrow in DP configs

Paulo Zanoni (5):
      drm/i915: fix PIPE_WM_LINETIME definition
      drm/i915: add PCH_NONE to enum intel_pch
      drm/i915: get rid of dev_priv->info->has_pch_split
      drm/i915: don't ironlake_init_pch_refclk() on LPT
      drm/i915: fix PIPE_DDI_PORT_MASK

Ville Syrjälä (2):
      drm/i915: Zero initialize mode_cmd
      drm/i915: Reject page flips with changed format/offset/pitch

 drivers/char/agp/intel-agp.c    |  11 -
 drivers/gpu/drm/i915/i915_dma.c |   9 +-
 drivers/gpu/drm/i915/i915_drv.c | 172 ++
 drivers/gpu/drm/i915/i915_drv.h |  28 ++-
 drivers/gpu/drm/i915/i915_gem.c |  44 +++-
[PATCH 6/6] modetest.c: Add return 0 in bit_name_fn(res) macro.
---
 tests/modetest/modetest.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/tests/modetest/modetest.c b/tests/modetest/modetest.c
index ec3121e..00129fa 100644
--- a/tests/modetest/modetest.c
+++ b/tests/modetest/modetest.c
@@ -128,6 +128,7 @@ char * res##_str(int type) { \
 			sep = ", "; \
 		} \
 	} \
+	return 0; \
 }
 
 static const char *mode_type_names[] = {
-- 
1.7.7
[PATCH 5/6] xf86drm.c: Fix two memory leaks.
---
 xf86drm.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/xf86drm.c b/xf86drm.c
index e652731..c1cc170 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -1399,8 +1399,11 @@ drm_context_t *drmGetReservedContextList(int fd, int *count)
     }
     res.contexts = list;
-    if (drmIoctl(fd, DRM_IOCTL_RES_CTX, &res))
+    if (drmIoctl(fd, DRM_IOCTL_RES_CTX, &res)) {
+	drmFree(list);
+	drmFree(retval);
 	return NULL;
+    }
 
     for (i = 0; i < res.count; i++)
 	retval[i] = list[i].handle;
-- 
1.7.7
[PATCH 4/6] xf86drm.c: Make more code irrelevant under UDEV.
---
 xf86drm.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/xf86drm.c b/xf86drm.c
index 6ea068f..e652731 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -255,6 +255,7 @@ static int drmMatchBusID(const char *id1, const char *id2, int pci_domain_ok)
     return 0;
 }
 
+#if !defined(UDEV)
 /**
  * Handles error checking for chown call.
  *
@@ -284,6 +285,7 @@ static int chown_check_return(const char *path, uid_t owner, gid_t group)
 	   path, errno, strerror(errno));
     return -1;
 }
+#endif
 
 /**
  * Open the DRM device, creating it if necessary.
@@ -303,14 +305,17 @@ static int drmOpenDevice(long dev, int minor, int type)
     stat_t          st;
     char            buf[64];
     int             fd;
+#if !defined(UDEV)
     mode_t          devmode = DRM_DEV_MODE, serv_mode;
     int             isroot  = !geteuid();
     uid_t           user    = DRM_DEV_UID;
     gid_t           group   = DRM_DEV_GID, serv_group;
-
+#endif
+
     sprintf(buf, type ? DRM_DEV_NAME : DRM_CONTROL_DEV_NAME, DRM_DIR_NAME, minor);
     drmMsg("drmOpenDevice: node name is %s\n", buf);
 
+#if !defined(UDEV)
     if (drm_server_info) {
 	drm_server_info->get_perms(&serv_group, &serv_mode);
 	devmode  = serv_mode ? serv_mode : DRM_DEV_MODE;
@@ -318,7 +323,6 @@ static int drmOpenDevice(long dev, int minor, int type)
 	group = (serv_group >= 0) ? serv_group : DRM_DEV_GID;
     }
 
-#if !defined(UDEV)
     if (stat(DRM_DIR_NAME, &st)) {
 	if (!isroot)
 	    return DRM_ERR_NOT_ROOT;
-- 
1.7.7
[PATCH 3/6] nouveau/nouveau.c: Fix two memory leaks.
---
 nouveau/nouveau.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/nouveau/nouveau.c b/nouveau/nouveau.c
index 5aa4107..e91287f 100644
--- a/nouveau/nouveau.c
+++ b/nouveau/nouveau.c
@@ -95,6 +95,7 @@ nouveau_device_wrap(int fd, int close, struct nouveau_device **pdev)
 	    (dev->drm_version < 0x0100 || dev->drm_version >= 0x0200)) {
 		nouveau_device_del(&dev);
+		free(nvdev);
 		return -EINVAL;
 	}
 
@@ -105,6 +106,7 @@ nouveau_device_wrap(int fd, int close, struct nouveau_device **pdev)
 	ret = nouveau_getparam(dev, NOUVEAU_GETPARAM_AGP_SIZE, );
 	if (ret) {
 		nouveau_device_del(&dev);
+		free(nvdev);
 		return ret;
 	}
-- 
1.7.7
[PATCH 2/6] libkms/nouveau.c: Fix a memory leak and make some code easier to read.
---
 libkms/nouveau.c |   27 ++++++++++++++-------------
 1 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/libkms/nouveau.c b/libkms/nouveau.c
index 0e24a15..fbca6fe 100644
--- a/libkms/nouveau.c
+++ b/libkms/nouveau.c
@@ -90,21 +90,24 @@ nouveau_bo_create(struct kms_driver *kms,
 		}
 	}
 
-	bo = calloc(1, sizeof(*bo));
-	if (!bo)
-		return -ENOMEM;
-
-	if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) {
+	switch (type) {
+	case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8:
 		pitch = 64 * 4;
 		size = 64 * 64 * 4;
-	} else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) {
+		break;
+	case KMS_BO_TYPE_SCANOUT_X8R8G8B8:
 		pitch = width * 4;
 		pitch = (pitch + 512 - 1) & ~(512 - 1);
 		size = pitch * height;
-	} else {
+		break;
+	default:
 		return -EINVAL;
 	}
 
+	bo = calloc(1, sizeof(*bo));
+	if (!bo)
+		return -ENOMEM;
+
 	memset(&arg, 0, sizeof(arg));
 	arg.info.size = size;
 	arg.info.domain = NOUVEAU_GEM_DOMAIN_MAPPABLE | NOUVEAU_GEM_DOMAIN_VRAM;
@@ -114,8 +117,10 @@ nouveau_bo_create(struct kms_driver *kms,
 	arg.channel_hint = 0;
 
 	ret = drmCommandWriteRead(kms->fd, DRM_NOUVEAU_GEM_NEW, &arg, sizeof(arg));
-	if (ret)
-		goto err_free;
+	if (ret) {
+		free(bo);
+		return ret;
+	}
 
 	bo->base.kms = kms;
 	bo->base.handle = arg.info.handle;
@@ -126,10 +131,6 @@ nouveau_bo_create(struct kms_driver *kms,
 	*out = &bo->base;
 
 	return 0;
-
-err_free:
-	free(bo);
-	return ret;
 }
 
 static int
-- 
1.7.7
[PATCH 1/6] libkms/intel.c: Fix a memory leak and a dead assignment as well as make some code easier to read.
---
 libkms/intel.c |   32 +++++++++++++++++---------------
 1 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/libkms/intel.c b/libkms/intel.c
index 8b8249b..12175b0 100644
--- a/libkms/intel.c
+++ b/libkms/intel.c
@@ -89,27 +89,32 @@ intel_bo_create(struct kms_driver *kms,
 		}
 	}
 
-	bo = calloc(1, sizeof(*bo));
-	if (!bo)
-		return -ENOMEM;
-
-	if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) {
+	switch (type) {
+	case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8:
 		pitch = 64 * 4;
 		size = 64 * 64 * 4;
-	} else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) {
+		break;
+	case KMS_BO_TYPE_SCANOUT_X8R8G8B8:
 		pitch = width * 4;
 		pitch = (pitch + 512 - 1) & ~(512 - 1);
 		size = pitch * ((height + 4 - 1) & ~(4 - 1));
-	} else {
+		break;
+	default:
 		return -EINVAL;
 	}
 
+	bo = calloc(1, sizeof(*bo));
+	if (!bo)
+		return -ENOMEM;
+
 	memset(&arg, 0, sizeof(arg));
 	arg.size = size;
 
 	ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_CREATE, &arg, sizeof(arg));
-	if (ret)
-		goto err_free;
+	if (ret) {
+		free(bo);
+		return ret;
+	}
 
 	bo->base.kms = kms;
 	bo->base.handle = arg.handle;
@@ -124,21 +129,18 @@ intel_bo_create(struct kms_driver *kms,
 		tile.handle = bo->base.handle;
 		tile.tiling_mode = I915_TILING_X;
 		tile.stride = bo->base.pitch;
-
-		ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, sizeof(tile));
 #if 0
+		ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, sizeof(tile));
 		if (ret) {
 			kms_bo_destroy(out);
 			return ret;
 		}
+#else
+		drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, sizeof(tile));
 #endif
 	}
 
 	return 0;
-
-err_free:
-	free(bo);
-	return ret;
 }
 
 static int
-- 
1.7.7
libdrm: Fix some warnings reported by clang's scan-build tool [try 2]
On Friday, 13 July 2012 at 18:47:50, Marcin Slusarz wrote:
> On Fri, Jul 13, 2012 at 05:49:12PM +0200, Johannes Obermayr wrote:
> > Patches 1 to 4 were sent to mesa-dev.
>
> And you chose to ignore most of my comments.
> Fine. Don't expect further reviews from me.
>
> Marcin

Patch 1 and 2:
- Adapted.
- I want to keep the proposed, easier-to-read "switch" case.

Patch 3:
- Resent.
- Waiting on your response:
  http://lists.freedesktop.org/archives/mesa-dev/2012-June/023456.html

Patch 4 and 5:
- Split: http://llvm.org/bugs/show_bug.cgi?id=13358 (forgot to split and to
  add 'drmFree(list);')
- The 'more ifs' case seems better to me.

Patch 6:
- Resent.

Marcin, it's not that I ignore comments. But sometimes I also want to hear
opinions from (some more) other people. I hope I can calm the waves ...

Johannes
libdrm: Fix some warnings reported by clang's scan-build tool
On Fri, Jul 13, 2012 at 05:49:12PM +0200, Johannes Obermayr wrote:
> Patches 1 to 4 were sent to mesa-dev.

And you chose to ignore most of my comments.
Fine. Don't expect further reviews from me.

Marcin
[RFC] dma-fence: dma-buf synchronization (v2)
Hi Rob,

Yes, sorry we've been a bit slack progressing KDS publicly. Your approach
looks interesting and seems like it could enable both implicit and explicit
synchronization. A good compromise.

> From: Rob Clark
>
> A dma-fence can be attached to a buffer which is being filled or
> consumed by hw, to allow userspace to pass the buffer without waiting
> to another device. For example, userspace can call page_flip ioctl to
> display the next frame of graphics after kicking the GPU but while the
> GPU is still rendering. The display device sharing the buffer with the
> GPU would attach a callback to get notified when the GPU's rendering-
> complete IRQ fires, to update the scan-out address of the display,
> without having to wake up userspace.
>
> A dma-fence is transient, one-shot deal. It is allocated and attached
> to dma-buf's list of fences. When the one that attached it is done,
> with the pending operation, it can signal the fence removing it from
> the dma-buf's list of fences:
>
> + dma_buf_attach_fence()
> + dma_fence_signal()

It would be useful to have two lists of fences: those around writes to the
buffer and those around reads. The idea is that if you only want to read
from a buffer, you don't need to wait for fences around other read
operations; you only need to wait for the "last" writer fence. If you do
want to write to the buffer, however, you need to wait for all the read
fences and the last writer fence.

The use case is when EGL swap behaviour is EGL_BUFFER_PRESERVED. You have
the display controller reading the buffer, with its fence defined to be
signalled when it is no longer scanning out that buffer. It can only stop
scanning out that buffer when it is given another buffer to scan out. If
that next buffer must be rendered by copying the currently scanned-out
buffer into it (one possible option for implementing EGL_BUFFER_PRESERVED),
then you essentially deadlock if the scan-out job blocks the "render the
next frame" job.
There are probably variations of this idea; perhaps you only need a flag to
indicate whether a fence is around a read-only or rw access?

> The intention is to provide a userspace interface (presumably via
> eventfd) later, to be used in conjunction with dma-buf's mmap support
> for sw access to buffers (or for userspace apps that would prefer to
> do their own synchronization).

From our experience with our own KDS, we've come up with an interesting
approach to synchronizing userspace applications which have a buffer
mmap'd. We wanted to avoid userspace being able to block jobs running on
hardware while still allowing userspace to participate.

Our original idea was to have a lock/unlock ioctl interface on a dma_buf,
but with a timeout whereby the application's lock would be broken if held
for too long. That at least bounded how long userspace could potentially
block hardware from making progress, though it was pretty "harsh".

The approach we have now settled on is to instead only allow an application
to wait for all jobs currently pending for a buffer. So there's no way
userspace can prevent anything else from using a buffer, other than by not
issuing jobs which will use that buffer.

Also, the interface we settled on was to add a poll handler to dma_buf;
that way userspace can select() on multiple dma_buf buffers in one syscall.
It can also choose whether it wants to wait for only the last writer fence,
i.e. wait until it can read (POLLIN), or wait for all fences because it
wants to write to the buffer (POLLOUT). We kinda like this, but it does
restrict the utility a little. An idea worth considering anyway.

My other thought is around atomicity. Could this be extended to (safely)
allow for hardware devices which might want to access multiple buffers
simultaneously? I think it probably can with some tweaks to the interface?
An atomic function which does something like "give me all the fences for
all these buffers and add this fence to each instead/as-well-as"?

Cheers,

Tom
[PATCH 5/5] modetest.c: Add return 0 in bit_name_fn(res) macro.
---
 tests/modetest/modetest.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/tests/modetest/modetest.c b/tests/modetest/modetest.c
index ec3121e..00129fa 100644
--- a/tests/modetest/modetest.c
+++ b/tests/modetest/modetest.c
@@ -128,6 +128,7 @@ char * res##_str(int type) { \
 			sep = ", "; \
 		} \
 	} \
+	return 0; \
 }
 
 static const char *mode_type_names[] = {
-- 
1.7.7
[PATCH 4/5] xf86drm.c: Make more code irrelevant under UDEV and fix a memory leak.
---
 xf86drm.c |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/xf86drm.c b/xf86drm.c
index 6ea068f..e3789c8 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -255,6 +255,7 @@ static int drmMatchBusID(const char *id1, const char *id2, int pci_domain_ok)
     return 0;
 }
 
+#if !defined(UDEV)
/**
 * Handles error checking for chown call.
 *
@@ -284,6 +285,7 @@ static int chown_check_return(const char *path, uid_t owner, gid_t group)
 	   path, errno, strerror(errno));
     return -1;
 }
+#endif
 
 /**
  * Open the DRM device, creating it if necessary.
@@ -303,14 +305,17 @@ static int drmOpenDevice(long dev, int minor, int type)
     stat_t          st;
     char            buf[64];
     int             fd;
+#if !defined(UDEV)
     mode_t          devmode = DRM_DEV_MODE, serv_mode;
     int             isroot  = !geteuid();
     uid_t           user    = DRM_DEV_UID;
     gid_t           group   = DRM_DEV_GID, serv_group;
-
+#endif
+
     sprintf(buf, type ? DRM_DEV_NAME : DRM_CONTROL_DEV_NAME, DRM_DIR_NAME, minor);
     drmMsg("drmOpenDevice: node name is %s\n", buf);
 
+#if !defined(UDEV)
     if (drm_server_info) {
 	drm_server_info->get_perms(&serv_group, &serv_mode);
 	devmode  = serv_mode ? serv_mode : DRM_DEV_MODE;
@@ -318,7 +323,6 @@ static int drmOpenDevice(long dev, int minor, int type)
 	group = (serv_group >= 0) ? serv_group : DRM_DEV_GID;
     }
 
-#if !defined(UDEV)
     if (stat(DRM_DIR_NAME, &st)) {
 	if (!isroot)
 	    return DRM_ERR_NOT_ROOT;
@@ -1395,8 +1399,10 @@ drm_context_t *drmGetReservedContextList(int fd, int *count)
     }
     res.contexts = list;
-    if (drmIoctl(fd, DRM_IOCTL_RES_CTX, &res))
+    if (drmIoctl(fd, DRM_IOCTL_RES_CTX, &res)) {
+	drmFree(retval);
 	return NULL;
+    }
 
     for (i = 0; i < res.count; i++)
 	retval[i] = list[i].handle;
-- 
1.7.7
[PATCH 3/5] nouveau/nouveau.c: Fix two memory leaks.
---
 nouveau/nouveau.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/nouveau/nouveau.c b/nouveau/nouveau.c
index 5aa4107..e91287f 100644
--- a/nouveau/nouveau.c
+++ b/nouveau/nouveau.c
@@ -95,6 +95,7 @@ nouveau_device_wrap(int fd, int close, struct nouveau_device **pdev)
 	    (dev->drm_version < 0x0100 || dev->drm_version >= 0x0200)) {
 		nouveau_device_del(&dev);
+		free(nvdev);
 		return -EINVAL;
 	}
 
@@ -105,6 +106,7 @@ nouveau_device_wrap(int fd, int close, struct nouveau_device **pdev)
 	ret = nouveau_getparam(dev, NOUVEAU_GETPARAM_AGP_SIZE, );
 	if (ret) {
 		nouveau_device_del(&dev);
+		free(nvdev);
 		return ret;
 	}
-- 
1.7.7
[PATCH 2/5] libkms/nouveau.c: Fix a memory leak and cleanup code a bit.
---
 libkms/nouveau.c |   20 ++++++++++----------
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/libkms/nouveau.c b/libkms/nouveau.c
index 0e24a15..4cbca96 100644
--- a/libkms/nouveau.c
+++ b/libkms/nouveau.c
@@ -94,14 +94,18 @@ nouveau_bo_create(struct kms_driver *kms,
 	if (!bo)
 		return -ENOMEM;
 
-	if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) {
+	switch (type) {
+	case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8:
 		pitch = 64 * 4;
 		size = 64 * 64 * 4;
-	} else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) {
+		break;
+	case KMS_BO_TYPE_SCANOUT_X8R8G8B8:
 		pitch = width * 4;
 		pitch = (pitch + 512 - 1) & ~(512 - 1);
 		size = pitch * height;
-	} else {
+		break;
+	default:
+		free(bo);
 		return -EINVAL;
 	}
 
@@ -114,8 +118,10 @@ nouveau_bo_create(struct kms_driver *kms,
 	arg.channel_hint = 0;
 
 	ret = drmCommandWriteRead(kms->fd, DRM_NOUVEAU_GEM_NEW, &arg, sizeof(arg));
-	if (ret)
-		goto err_free;
+	if (ret) {
+		free(bo);
+		return ret;
+	}
 
 	bo->base.kms = kms;
 	bo->base.handle = arg.info.handle;
@@ -126,10 +132,6 @@ nouveau_bo_create(struct kms_driver *kms,
 	*out = &bo->base;
 
 	return 0;
-
-err_free:
-	free(bo);
-	return ret;
 }
 
 static int
-- 
1.7.7
[PATCH 1/5] libkms/intel.c: Fix a memory leak and a dead assignment as well as cleanup code a bit.
---
 libkms/intel.c |   25 +++++++++++++-----------
 1 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/libkms/intel.c b/libkms/intel.c
index 8b8249b..b8ac343 100644
--- a/libkms/intel.c
+++ b/libkms/intel.c
@@ -93,14 +93,18 @@ intel_bo_create(struct kms_driver *kms,
 	if (!bo)
 		return -ENOMEM;
 
-	if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) {
+	switch (type) {
+	case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8:
 		pitch = 64 * 4;
 		size = 64 * 64 * 4;
-	} else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) {
+		break;
+	case KMS_BO_TYPE_SCANOUT_X8R8G8B8:
 		pitch = width * 4;
 		pitch = (pitch + 512 - 1) & ~(512 - 1);
 		size = pitch * ((height + 4 - 1) & ~(4 - 1));
-	} else {
+		break;
+	default:
+		free(bo);
 		return -EINVAL;
 	}
 
@@ -108,8 +112,10 @@ intel_bo_create(struct kms_driver *kms,
 	arg.size = size;
 
 	ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_CREATE, &arg, sizeof(arg));
-	if (ret)
-		goto err_free;
+	if (ret) {
+		free(bo);
+		return ret;
+	}
 
 	bo->base.kms = kms;
 	bo->base.handle = arg.handle;
@@ -124,21 +130,18 @@ intel_bo_create(struct kms_driver *kms,
 		tile.handle = bo->base.handle;
 		tile.tiling_mode = I915_TILING_X;
 		tile.stride = bo->base.pitch;
-
-		ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, sizeof(tile));
 #if 0
+		ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, sizeof(tile));
 		if (ret) {
 			kms_bo_destroy(out);
 			return ret;
 		}
+#else
+		drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, sizeof(tile));
 #endif
 	}
 
 	return 0;
-
-err_free:
-	free(bo);
-	return ret;
 }
 
 static int
-- 
1.7.7
libdrm: Fix some warnings reported by clang's scan-build tool
Patches 1 to 4 were sent to mesa-dev.
[RFC] dma-fence: dma-buf synchronization (v2)
On Fri, Jul 13, 2012 at 4:44 PM, Maarten Lankhorst wrote:
> Hey,
>
> On 13-07-12 20:52, Rob Clark wrote:
>> On Fri, Jul 13, 2012 at 12:35 PM, Tom Cooksey wrote:
>>> My other thought is around atomicity. Could this be extended to
>>> (safely) allow for hardware devices which might want to access
>>> multiple buffers simultaneously? I think it probably can with
>>> some tweaks to the interface? An atomic function which does
>>> something like "give me all the fences for all these buffers
>>> and add this fence to each instead/as-well-as"?
>>
>> fwiw, what I'm leaning towards right now is combining dma-fence w/
>> Maarten's idea of dma-buf-mgr (not sure if you saw his patches?). And
>> let dmabufmgr handle the multi-buffer reservation stuff. And possibly
>> the read vs write access, although this I'm not 100% sure on... the
>> other option being the concept of read vs write (or
>> exclusive/non-exclusive) fences.
>
> Agreed, dmabufmgr is meant for reserving multiple buffers without deadlocks.
> The underlying mechanism for synchronization can be dma-fences, it wouldn't
> really change dmabufmgr much.
>
>> In the current state, the fence is quite simple, and doesn't care
>> *what* it is fencing, which seems advantageous when you get into
>> trying to deal with combinations of devices sharing buffers, some of
>> whom can do hw sync, and some who can't. So having a bit of
>> partitioning from the code dealing w/ sequencing who can access the
>> buffers when and for what purpose seems like it might not be a bad
>> idea. Although I'm still working through the different alternatives.
>
> Yeah, I managed to get nouveau hooked up with generating irqs on
> completion today using an invalid command. It's also no longer a
> performance regression, so software syncing is no longer a problem
> for nouveau. i915 already generates irqs and r600 presumably too.
>
> Monday I'll take a better look at your patch, end of day now. :)

Let me send you a slightly updated version.. I fixed locally some locking
fail in attach_fence() and get_fence() that I managed to introduce when
converting from the global spinlock to using the waitqueue's spinlock.

BR,
-R

> ~Maarten
[PATCH 7/7] drm/exynos: Add IOMMU support for mapping gem object
> -----Original Message-----
> From: Prathyush K [mailto:prathyush.k at samsung.com]
> Sent: Wednesday, July 11, 2012 6:40 PM
> To: dri-devel at lists.freedesktop.org
> Cc: prathyush at chromium.org; m.szyprowski at samsung.com;
> inki.dae at samsung.com; subash.ramaswamy at linaro.org
> Subject: [PATCH 7/7] drm/exynos: Add IOMMU support for mapping gem object
>
> A gem object is created using dma_alloc_writecombine. Currently, this
> buffer is assumed to be contiguous. If an IOMMU mapping is created for
> DRM, this buffer would be non-contig, so the map functions are modified
> to call dma_mmap_writecombine. This works for both contig and non-contig
> buffers.
>
> Signed-off-by: Prathyush K
> ---
>  drivers/gpu/drm/exynos/exynos_drm_gem.c |   35 ++++++++++--------------
>  1 files changed, 16 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c
> b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> index 5c8b683..59240f7 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> @@ -162,17 +162,22 @@ static int exynos_drm_gem_map_pages(struct
> drm_gem_object *obj,
>  {
>  	struct exynos_drm_gem_obj *exynos_gem_obj = to_exynos_gem_obj(obj);
>  	struct exynos_drm_gem_buf *buf = exynos_gem_obj->buffer;
> -	unsigned long pfn;
>
>  	if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG) {
> +		unsigned long pfn;
>  		if (!buf->pages)
>  			return -EINTR;
>
>  		pfn = page_to_pfn(buf->pages[page_offset++]);
> -	} else
> -		pfn = (buf->dma_addr >> PAGE_SHIFT) + page_offset;
> -
> -	return vm_insert_mixed(vma, f_vaddr, pfn);
> +		return vm_insert_mixed(vma, f_vaddr, pfn);
> +	} else {

This is not good. EXYNOS_BO_NONCONTIG means physically non-contiguous
memory, and otherwise the memory is physically contiguous. But with your
patch, in the case of using the IOMMU, the memory type of the gem object
may lose its meaning: the type can be EXYNOS_BO_CONTIG while the memory is
actually physically non-contiguous.
> +		int ret;
> +		ret = dma_mmap_writecombine(obj->dev->dev, vma, buf->kvaddr,
> +					buf->dma_addr, buf->size);
> +		if (ret)
> +			DRM_ERROR("dma_mmap_writecombine failed\n");
> +		return ret;
> +	}
>  }
>
>  static int exynos_drm_gem_get_pages(struct drm_gem_object *obj)
> @@ -503,7 +508,7 @@ static int exynos_drm_gem_mmap_buffer(struct file
> *filp,
>  	struct drm_gem_object *obj = filp->private_data;
>  	struct exynos_drm_gem_obj *exynos_gem_obj = to_exynos_gem_obj(obj);
>  	struct exynos_drm_gem_buf *buffer;
> -	unsigned long pfn, vm_size, usize, uaddr = vma->vm_start;
> +	unsigned long vm_size, usize, uaddr = vma->vm_start;
>  	int ret;
>
>  	DRM_DEBUG_KMS("%s\n", __FILE__);
> @@ -543,19 +548,11 @@ static int exynos_drm_gem_mmap_buffer(struct file
> *filp,
>  			usize -= PAGE_SIZE;
>  		} while (usize > 0);
>  	} else {
> -		/*
> -		 * get page frame number to physical memory to be mapped
> -		 * to user space.
> -		 */
> -		pfn = ((unsigned long)exynos_gem_obj->buffer->dma_addr) >>
> -			PAGE_SHIFT;
> -
> -		DRM_DEBUG_KMS("pfn = 0x%lx\n", pfn);
> -
> -		if (remap_pfn_range(vma, vma->vm_start, pfn, vm_size,
> -				    vma->vm_page_prot)) {
> -			DRM_ERROR("failed to remap pfn range.\n");
> -			return -EAGAIN;
> +		ret = dma_mmap_writecombine(obj->dev->dev, vma, buffer->kvaddr,

What if we don't use the IOMMU and the memory type of this buffer is
non-contiguous?

> +					buffer->dma_addr, buffer->size);
> +		if (ret) {
> +			DRM_ERROR("dma_mmap_writecombine failed\n");
> +			return ret;
> +		}
>  	}
>
> --
> 1.7.0.4
[PATCH 3/3] drm/radeon: fix const IB handling
Const IBs are executed on the CE not the CP, so we can't
fence them in the normal way. So submit them directly before
the IB instead, just as the documentation says.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/r100.c        |    2 +-
 drivers/gpu/drm/radeon/r600.c        |    2 +-
 drivers/gpu/drm/radeon/radeon.h      |    3 ++-
 drivers/gpu/drm/radeon/radeon_cs.c   |   25 +++--
 drivers/gpu/drm/radeon/radeon_ring.c |   10 +-
 5 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index e0f5ae8..4ee5a74 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -3693,7 +3693,7 @@ int r100_ib_test(struct radeon_device *rdev, struct radeon_ring *ring)
 	ib.ptr[6] = PACKET2(0);
 	ib.ptr[7] = PACKET2(0);
 	ib.length_dw = 8;
-	r = radeon_ib_schedule(rdev, &ib);
+	r = radeon_ib_schedule(rdev, &ib, NULL);
 	if (r) {
 		radeon_scratch_free(rdev, scratch);
 		radeon_ib_free(rdev, &ib);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 3156d25..c2e5069 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2619,7 +2619,7 @@ int r600_ib_test(struct radeon_device *rdev, struct radeon_ring *ring)
 	ib.ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) >> 2);
 	ib.ptr[2] = 0xDEADBEEF;
 	ib.length_dw = 3;
-	r = radeon_ib_schedule(rdev, &ib);
+	r = radeon_ib_schedule(rdev, &ib, NULL);
 	if (r) {
 		radeon_scratch_free(rdev, scratch);
 		radeon_ib_free(rdev, &ib);
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 2cb355b..2d7f06c 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -751,7 +751,8 @@ struct si_rlc {
 int radeon_ib_get(struct radeon_device *rdev, int ring,
 		  struct radeon_ib *ib, unsigned size);
 void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib);
-int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib);
+int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib,
+		       struct radeon_ib *const_ib);
 int radeon_ib_pool_init(struct radeon_device *rdev);
 void radeon_ib_pool_fini(struct radeon_device *rdev);
 int radeon_ib_ring_tests(struct radeon_device *rdev);
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index 553da67..d0be5d5 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -354,7 +354,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
 	}
 	radeon_cs_sync_rings(parser);
 	parser->ib.vm_id = 0;
-	r = radeon_ib_schedule(rdev, &parser->ib);
+	r = radeon_ib_schedule(rdev, &parser->ib, NULL);
 	if (r) {
 		DRM_ERROR("Failed to schedule IB !\n");
 	}
@@ -452,25 +452,22 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev,
 	}
 	radeon_cs_sync_rings(parser);

+	parser->ib.vm_id = vm->id;
+	/* ib pool is bind at 0 in virtual address space,
+	 * so gpu_addr is the offset inside the pool bo
+	 */
+	parser->ib.gpu_addr = parser->ib.sa_bo->soffset;
+
 	if ((rdev->family >= CHIP_TAHITI) &&
 	    (parser->chunk_const_ib_idx != -1)) {
 		parser->const_ib.vm_id = vm->id;
-		/* ib pool is bind at 0 in virtual address space to gpu_addr is the
-		 * offset inside the pool bo
-		 */
+		/* same reason as above */
 		parser->const_ib.gpu_addr = parser->const_ib.sa_bo->soffset;
-		r = radeon_ib_schedule(rdev, &parser->const_ib);
-		if (r)
-			goto out;
+		r = radeon_ib_schedule(rdev, &parser->ib, &parser->const_ib);
+	} else {
+		r = radeon_ib_schedule(rdev, &parser->ib, NULL);
 	}

-	parser->ib.vm_id = vm->id;
-	/* ib pool is bind at 0 in virtual address space to gpu_addr is the
-	 * offset inside the pool bo
-	 */
-	parser->ib.gpu_addr = parser->ib.sa_bo->soffset;
-	parser->ib.is_const_ib = false;
-	r = radeon_ib_schedule(rdev, &parser->ib);
 out:
 	if (!r) {
 		if (vm->fence) {
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 75cbe46..c48c354 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -74,7 +74,8 @@ void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib)
 	radeon_fence_unref(&ib->fence);
 }

-int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib)
+int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib,
+		       struct radeon_ib *const_ib)
 {
 	struct radeon_ring *ring = &rdev->ring[ib->ring];
 	bool need_sync = false;
@@ -105,6
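The ordering contract the patch establishes can be illustrated with a small host-side sketch (this is not driver code; the ring and IB types are stand-ins): when a const IB is present it must be written to the ring immediately before the main IB, and only the main IB is followed by a fence.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a driver IB; only an id is needed for the illustration. */
struct ib { int id; };

/* Sketch of the new radeon_ib_schedule() contract: the optional const IB
 * (CE work) is emitted right before the main IB (CP work), and only the
 * main IB gets a fence (modeled here as the sentinel -1). */
static int schedule_ib(int *ring, size_t *wptr,
                       const struct ib *main_ib, const struct ib *const_ib)
{
    if (const_ib)
        ring[(*wptr)++] = const_ib->id;  /* CE packets go first */
    ring[(*wptr)++] = main_ib->id;       /* then the CP packets */
    ring[(*wptr)++] = -1;                /* fence follows the main IB only */
    return 0;
}
```

Passing NULL for the const IB reproduces the old single-IB behavior, which is why all other callers simply gained a trailing NULL argument.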
[PATCH 2/3] drm/radeon: let sa manager block for fences to wait for
Otherwise we can encounter out of memory situations under extreme load.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon.h    |    2 +-
 drivers/gpu/drm/radeon/radeon_sa.c |   72 +---
 2 files changed, 51 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 6715e4c..2cb355b 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -362,7 +362,7 @@ struct radeon_bo_list {
  * alignment).
  */
 struct radeon_sa_manager {
-	spinlock_t		lock;
+	wait_queue_head_t	wq;
 	struct radeon_bo	*bo;
 	struct list_head	*hole;
 	struct list_head	flist[RADEON_NUM_RINGS];
diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c
index 81dbb5b..b535fc4 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -54,7 +54,7 @@ int radeon_sa_bo_manager_init(struct radeon_device *rdev,
 {
 	int i, r;

-	spin_lock_init(&sa_manager->lock);
+	init_waitqueue_head(&sa_manager->wq);
 	sa_manager->bo = NULL;
 	sa_manager->size = size;
 	sa_manager->domain = domain;
@@ -211,6 +211,29 @@ static bool radeon_sa_bo_try_alloc(struct radeon_sa_manager *sa_manager,
 	return false;
 }

+static bool radeon_sa_event(struct radeon_sa_manager *sa_manager,
+			    unsigned size, unsigned align)
+{
+	unsigned soffset, eoffset, wasted;
+	int i;
+
+	for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+		if (!list_empty(&sa_manager->flist[i])) {
+			return true;
+		}
+	}
+
+	soffset = radeon_sa_bo_hole_soffset(sa_manager);
+	eoffset = radeon_sa_bo_hole_eoffset(sa_manager);
+	wasted = (align - (soffset % align)) % align;
+
+	if ((eoffset - soffset) >= (size + wasted)) {
+		return true;
+	}
+
+	return false;
+}
+
 static bool radeon_sa_bo_next_hole(struct radeon_sa_manager *sa_manager,
 				   struct radeon_fence **fences,
 				   unsigned *tries)
@@ -297,8 +320,8 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
 	INIT_LIST_HEAD(&(*sa_bo)->olist);
 	INIT_LIST_HEAD(&(*sa_bo)->flist);

-	spin_lock(&sa_manager->lock);
-	do {
+	spin_lock(&sa_manager->wq.lock);
+	while(1) {
 		for (i = 0; i < RADEON_NUM_RINGS; ++i) {
 			fences[i] = NULL;
 			tries[i] = 0;
@@ -309,30 +332,34 @@ int radeon_sa_bo_new(struct radeon_device *rdev,

 			if (radeon_sa_bo_try_alloc(sa_manager, *sa_bo,
 						   size, align)) {
-				spin_unlock(&sa_manager->lock);
+				spin_unlock(&sa_manager->wq.lock);
 				return 0;
 			}

 			/* see if we can skip over some allocations */
 		} while (radeon_sa_bo_next_hole(sa_manager, fences, tries));

-		if (block) {
-			spin_unlock(&sa_manager->lock);
-			r = radeon_fence_wait_any(rdev, fences, false);
-			spin_lock(&sa_manager->lock);
-			if (r) {
-				/* if we have nothing to wait for we
-				   are practically out of memory */
-				if (r == -ENOENT) {
-					r = -ENOMEM;
-				}
-				goto out_err;
-			}
+		if (!block) {
+			break;
+		}
+
+		spin_unlock(&sa_manager->wq.lock);
+		r = radeon_fence_wait_any(rdev, fences, false);
+		spin_lock(&sa_manager->wq.lock);
+		/* if we have nothing to wait for block */
+		if (r == -ENOENT) {
+			r = wait_event_interruptible_locked(
+				sa_manager->wq,
+				radeon_sa_event(sa_manager, size, align)
+			);
+		}
+		if (r) {
+			goto out_err;
 		}
-	} while (block);
+	};

 out_err:
-	spin_unlock(&sa_manager->lock);
+	spin_unlock(&sa_manager->wq.lock);
 	kfree(*sa_bo);
 	*sa_bo = NULL;
 	return r;
@@ -348,7 +375,7 @@ void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo **sa_bo,
 	}

 	sa_manager = (*sa_bo)->manager;
-	spin_lock(&sa_manager->lock);
+	spin_lock(&sa_manager->wq.lock);
 	if (fence && !radeon_fence_signaled(fence)) {
 		(*sa_bo)->fence = radeon_fence_ref(fence);
 		list_add_tail(&(*sa_bo)->flist,
@@ -356,7 +383,8 @@ void
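The wake-up condition added above (radeon_sa_event) is, apart from the fence-list check, a pure size/alignment test on the current hole. With the manager state reduced to plain offsets, a self-contained sketch of that test looks like:

```c
#include <stdbool.h>

/* Sketch of radeon_sa_event()'s hole check: does the hole
 * [soffset, eoffset) fit `size` bytes at `align` alignment?
 * (The real function also returns true when any fence list
 * is non-empty; that part is omitted here.) */
static bool hole_fits(unsigned soffset, unsigned eoffset,
                      unsigned size, unsigned align)
{
    /* bytes wasted bumping soffset up to the next aligned address */
    unsigned wasted = (align - (soffset % align)) % align;

    return (eoffset - soffset) >= (size + wasted);
}
```

The double modulo makes `wasted` zero when soffset is already aligned, which is why an exact-size hole at an aligned offset still satisfies a waiter.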
[PATCH 1/3] drm/radeon: return an error if there is nothing to wait for
Otherwise the sa manager's out of memory handling doesn't work.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon_fence.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c
index 76c5b22..7a181c3 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -331,7 +331,7 @@ static int radeon_fence_wait_any_seq(struct radeon_device *rdev,

 	/* nothing to wait for ? */
 	if (ring == RADEON_NUM_RINGS) {
-		return 0;
+		return -ENOENT;
 	}

 	while (!radeon_fence_any_seq_signaled(rdev, target_seq)) {
--
1.7.9.5
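The point of this one-liner is that "nothing to wait for" must be distinguishable from a successful wait, so the caller can translate it into -ENOMEM (or, after patch 2/3, block on the wait queue). A toy model of that control flow, with the fence machinery reduced to a boolean:

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model of radeon_fence_wait_any(): -ENOENT when no fence exists.
 * Before the patch this case returned 0, hiding the out-of-memory case. */
static int wait_any(bool have_fences)
{
    if (!have_fences)
        return -ENOENT;
    return 0; /* pretend the wait succeeded */
}

/* Caller in the sa manager: "no fence will ever free memory for us"
 * becomes an out-of-memory error instead of a silent success. */
static int sa_alloc_retry(bool have_fences)
{
    int r = wait_any(have_fences);
    if (r == -ENOENT)
        return -ENOMEM;
    return r;
}
```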
[PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function
> -----Original Message-----
> From: Prathyush K [mailto:prathyush.k at samsung.com]
> Sent: Wednesday, July 11, 2012 6:40 PM
> To: dri-devel at lists.freedesktop.org
> Cc: prathyush at chromium.org; m.szyprowski at samsung.com; inki.dae at samsung.com;
> subash.ramaswamy at linaro.org
> Subject: [PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function
>
> This patch adds an exynos drm specific implementation of fb_mmap
> which supports mapping a non-contiguous buffer to user space.
> This new function does not assume that the frame buffer is contiguous
> and calls dma_mmap_writecombine for mapping the buffer to user space.
> dma_mmap_writecombine will be able to map a contiguous buffer as well
> as a non-contig buffer depending on whether an IOMMU mapping is created
> for drm or not.
>
> Signed-off-by: Prathyush K
> ---
>  drivers/gpu/drm/exynos/exynos_drm_fbdev.c |   16 
>  1 files changed, 16 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
> b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
> index d5586cc..b53e638 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
> @@ -46,8 +46,24 @@ struct exynos_drm_fbdev {
> 	struct exynos_drm_gem_obj *exynos_gem_obj;
> };
>
> +static int exynos_drm_fb_mmap(struct fb_info *info,
> +			struct vm_area_struct *vma)
> +{
> +	if ((vma->vm_end - vma->vm_start) > info->fix.smem_len)
> +		return -EINVAL;
> +
> +	vma->vm_pgoff = 0;
> +	vma->vm_flags |= VM_IO | VM_RESERVED;
> +	if (dma_mmap_writecombine(info->device, vma, info->screen_base,
> +		info->fix.smem_start, vma->vm_end - vma->vm_start))
> +		return -EAGAIN;
> +
> +	return 0;
> +}
> +

Ok, it's a good feature. Actually, the physically non-contiguous gem buffer allocated for the console framebuffer has to be mapped to user space.

Thanks.

> static struct fb_ops exynos_drm_fb_ops = {
> 	.owner		= THIS_MODULE,
> +	.fb_mmap	= exynos_drm_fb_mmap,
> 	.fb_fillrect	= cfb_fillrect,
> 	.fb_copyarea	= cfb_copyarea,
> 	.fb_imageblit	= cfb_imageblit,
> --
> 1.7.0.4
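The size check at the top of exynos_drm_fb_mmap() is the part worth noting: a mapping request larger than the framebuffer must be rejected before anything else. As a stand-alone sketch, with the vma reduced to its start/end addresses:

```c
#include <stdbool.h>

/* Sketch of the fb_mmap bounds check: a userspace mmap request may not
 * exceed the framebuffer size (fix.smem_len in fbdev terms). */
static bool mmap_request_ok(unsigned long vm_start, unsigned long vm_end,
                            unsigned long smem_len)
{
    return (vm_end - vm_start) <= smem_len;
}
```

Everything after the check is delegated to dma_mmap_writecombine(), which is what makes the implementation independent of whether the backing pages are physically contiguous.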
[PATCH 5/7] drm/exynos: attach drm device with common drm mapping
> -----Original Message-----
> From: Prathyush K [mailto:prathyush.k at samsung.com]
> Sent: Wednesday, July 11, 2012 6:40 PM
> To: dri-devel at lists.freedesktop.org
> Cc: prathyush at chromium.org; m.szyprowski at samsung.com; inki.dae at samsung.com;
> subash.ramaswamy at linaro.org
> Subject: [PATCH 5/7] drm/exynos: attach drm device with common drm mapping
>
> This patch sets the common mapping created during drm init, to the
> drm device's archdata. The dma_ops of the drm device are set to arm_iommu_ops.
> The common mapping is shared across all the drm devices, which ensures
> that any buffer allocated with drm is accessible by drm-fimd or drm-hdmi
> or both.
>
> Signed-off-by: Prathyush K
> ---
>  drivers/gpu/drm/exynos/exynos_drm_drv.c |    9 +
>  1 files changed, 9 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c
> b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> index c3ad87e..2e40ca8 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> @@ -276,6 +276,15 @@ static struct drm_driver exynos_drm_driver = {
>
> static int exynos_drm_platform_probe(struct platform_device *pdev)
> {
> +#ifdef CONFIG_EXYNOS_IOMMU
> +	struct device *dev = &pdev->dev;
> +
> +	kref_get(&exynos_drm_common_mapping->kref);
> +	dev->archdata.mapping = exynos_drm_common_mapping;

Ok, so exynos_drm_common_mapping is shared with the other drivers through dev->archdata.mapping.

> +	set_dma_ops(dev, &arm_iommu_ops);
> +
> +	DRM_INFO("drm common mapping set to drm device.\n");
> +#endif
> 	DRM_DEBUG_DRIVER("%s\n", __FILE__);
>
> 	exynos_drm_driver.num_ioctls = DRM_ARRAY_SIZE(exynos_ioctls);
> --
> 1.7.0.4
[PATCH 3/7] drm/exynos: add IOMMU support to drm fimd
> -----Original Message-----
> From: Prathyush K [mailto:prathyush.k at samsung.com]
> Sent: Wednesday, July 11, 2012 6:40 PM
> To: dri-devel at lists.freedesktop.org
> Cc: prathyush at chromium.org; m.szyprowski at samsung.com; inki.dae at samsung.com;
> subash.ramaswamy at linaro.org
> Subject: [PATCH 3/7] drm/exynos: add IOMMU support to drm fimd
>
> This patch adds device tree based IOMMU support to DRM FIMD. During
> probe, the driver searches for a 'sysmmu' field in the device node. The
> sysmmu field points to the corresponding sysmmu device of fimd.
> This sysmmu device is retrieved and set as fimd's sysmmu. The common
> IOMMU mapping created during DRM init is then attached to drm fimd.
>
> Signed-off-by: Prathyush K
> ---
>  drivers/gpu/drm/exynos/exynos_drm_fimd.c |   54 +-
>  1 files changed, 53 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> index 15b5286..6d4048a 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> @@ -19,7 +19,7 @@
> #include
> #include
> #include
> -
> +#include
> #include
> #include
>
> @@ -790,12 +790,56 @@ static int fimd_power_on(struct fimd_context *ctx,
> bool enable)
> }
>
> #ifdef CONFIG_OF
> +
> +#ifdef CONFIG_EXYNOS_IOMMU
> +static int iommu_init(struct device *dev)
> +{
> +	struct platform_device *pds;
> +	struct device_node *dn, *dns;
> +	const __be32 *parp;
> +	int ret;
> +
> +	dn = dev->of_node;
> +	parp = of_get_property(dn, "sysmmu", NULL);
> +	if (parp == NULL) {
> +		dev_err(dev, "failed to find sysmmu property\n");
> +		return -EINVAL;
> +	}
> +	dns = of_find_node_by_phandle(be32_to_cpup(parp));
> +	if (dns == NULL) {
> +		dev_err(dev, "failed to find sysmmu node\n");
> +		return -EINVAL;
> +	}
> +	pds = of_find_device_by_node(dns);
> +	if (pds == NULL) {
> +		dev_err(dev, "failed to find sysmmu platform device\n");
> +		return -EINVAL;
> +	}
> +
> +	platform_set_sysmmu(&pds->dev, dev);
> +
> +	dev->dma_parms = kzalloc(sizeof(*dev->dma_parms), GFP_KERNEL);
> +	if (!dev->dma_parms) {
> +		dev_err(dev, "failed to allocate dma parms\n");
> +		return -ENOMEM;
> +	}
> +	dma_set_max_seg_size(dev, 0xffffffffu);
> +
> +	ret = arm_iommu_attach_device(dev, exynos_drm_common_mapping);

Where is exynos_drm_common_mapping declared? You could get this pointer through the exynos_drm_private structure.

> +	if (ret) {
> +		dev_err(dev, "failed to attach device\n");
> +		return ret;
> +	}
> +	return 0;
> +}
> +#endif
> +

With your patch, we can use the iommu feature only with device tree. I think the iommu feature should be usable commonly.

> static struct exynos_drm_fimd_pdata *drm_fimd_dt_parse_pdata(struct
> device *dev)
> {
> 	struct device_node *np = dev->of_node;
> 	struct device_node *disp_np;
> 	struct exynos_drm_fimd_pdata *pd;
> 	u32 data[4];
> +	int ret;
>
> 	pd = kzalloc(sizeof(*pd), GFP_KERNEL);
> 	if (!pd) {
> @@ -803,6 +847,14 @@ static struct exynos_drm_fimd_pdata
> *drm_fimd_dt_parse_pdata(struct device *dev)
> 		return ERR_PTR(-ENOMEM);
> 	}
>
> +#ifdef CONFIG_EXYNOS_IOMMU

And please avoid such #ifdefs in device drivers.

> +	ret = iommu_init(dev);
> +	if (ret) {
> +		dev_err(dev, "failed to initialize iommu\n");
> +		return ERR_PTR(ret);
> +	}
> +#endif
> +
> 	if (of_get_property(np, "samsung,fimd-vidout-rgb", NULL))
> 		pd->vidcon0 |= VIDCON0_VIDOUT_RGB | VIDCON0_PNRMODE_RGB;
> 	if (of_get_property(np, "samsung,fimd-vidout-tv", NULL))
> --
> 1.7.0.4
[PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3
On 13.07.2012 14:27, Alex Deucher wrote:
> On Fri, Jul 13, 2012 at 5:09 AM, Christian König wrote:
>> On 12.07.2012 18:36, Alex Deucher wrote:
>>> On Thu, Jul 12, 2012 at 12:12 PM, Christian König wrote:
>>>> Before emitting any indirect buffer, emit the offset of the next
>>>> valid ring content if any. This allows code that wants to resume
>>>> a ring to resume it right after the ib that caused the GPU lockup.
>>>>
>>>> v2: use scratch registers instead of storing it into memory
>>>> v3: skip over the surface sync for ni and si as well
>>>>
>>>> Signed-off-by: Jerome Glisse
>>>> Signed-off-by: Christian König
>>>> ---
>>>>  drivers/gpu/drm/radeon/evergreen.c   |    8 +++-
>>>>  drivers/gpu/drm/radeon/ni.c          |   11 ++-
>>>>  drivers/gpu/drm/radeon/r600.c        |   18 --
>>>>  drivers/gpu/drm/radeon/radeon.h      |    1 +
>>>>  drivers/gpu/drm/radeon/radeon_ring.c |    4 
>>>>  drivers/gpu/drm/radeon/rv770.c       |    4 +++-
>>>>  drivers/gpu/drm/radeon/si.c          |   22 +++---
>>>>  7 files changed, 60 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/radeon/evergreen.c
>>>> b/drivers/gpu/drm/radeon/evergreen.c
>>>> index f39b900..40de347 100644
>>>> --- a/drivers/gpu/drm/radeon/evergreen.c
>>>> +++ b/drivers/gpu/drm/radeon/evergreen.c
>>>> @@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct radeon_device
>>>> *rdev, struct radeon_ib *ib)
>>>> 	/* set to DX10/11 mode */
>>>> 	radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
>>>> 	radeon_ring_write(ring, 1);
>>>> -	/* FIXME: implement */
>>>> +
>>>> +	if (ring->rptr_save_reg) {
>>>> +		uint32_t next_rptr = ring->wptr + 2 + 4;
>>>> +		radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
>>>> +		radeon_ring_write(ring, next_rptr);
>>>> +	}
>>> On r600 and newer please use SET_CONFIG_REG rather than Packet0.
>> Why? Please note that it's on purpose that this doesn't interfere with the
>> top/bottom of pipe handling and the draw commands, e.g. the register write
>> isn't associated with drawing but instead just marks the beginning of
>> parsing the IB.
> Packet0s have been semi-deprecated since r600. They still work,
> but the CP guys recommend using the appropriate packet3 whenever
> possible.
Ok, that makes sense. Any further comments on the patchset, or can I send that to Dave for merging now? Cheers, Christian.
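The next_rptr value being discussed is just the current wptr plus the size of the packets that follow it; the dword counts below are assumptions taken from the quoted hunks (2 dwords for the scratch-register write, 4 for the INDIRECT_BUFFER packet, plus 8 more where the ni/si surface sync is skipped over):

```c
#include <stdbool.h>

/* Sketch of the next_rptr arithmetic from the quoted hunks (assumption:
 * 2 dwords for the scratch write + 4 for INDIRECT_BUFFER; cayman/si add
 * 8 more for the surface sync that a resume must skip over). */
static unsigned next_rptr(unsigned wptr, bool has_surface_sync)
{
    unsigned n = wptr + 2 + 4;
    if (has_surface_sync)
        n += 8;
    return n;
}
```

The point of writing this value to a scratch register before each IB is that, after a lockup, the resume code can restart the ring at the first packet that was not yet known-good, i.e. right after the IB that hung.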
[PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM
> -----Original Message-----
> From: Prathyush K [mailto:prathyush.k at samsung.com]
> Sent: Wednesday, July 11, 2012 6:40 PM
> To: dri-devel at lists.freedesktop.org
> Cc: prathyush at chromium.org; m.szyprowski at samsung.com; inki.dae at samsung.com;
> subash.ramaswamy at linaro.org
> Subject: [PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM
>
> The dma-mapping framework needs a IOMMU mapping to be created for the
> device which allocates/maps/frees the non-contig buffer. In the DRM
> framework, a gem buffer is created by the DRM virtual device and not
> directly by any of the physical devices (FIMD, HDMI etc). Each gem object
> can be set as a framebuffer to one or many of the drm devices. So a gem
> object cannot be allocated for any one device. All the DRM devices should
> be able to access this buffer.
>

It's good to use a unified iommu table, so I agree with your opinion, but we haven't decided whether to use the dma-mapping api or not. The dma-mapping api currently has one issue: when using iommu through the dma-mapping api, we can't use a physically contiguous memory region with iommu. And there are cases where we need a physically contiguous memory region together with iommu, because we may sometimes use mfc (hw video codec) with a secure zone such as ARM TrustZone, which requires a physically contiguous memory region.

Thanks,
Inki Dae

> The proposed method is to create a common IOMMU mapping during drm init.
> This
> mapping is then attached to all of the drm devices including the drm
> device.
> [PATCH 1/7] drm/exynos: create common IOMMU mapping for DRM
>
> During the probe of drm fimd, the driver retrieves a 'sysmmu' field
> in the device node for fimd. If such a field exists, the driver retrieves
> the
> platform device of the sysmmu device. This sysmmu is set as the sysmmu
> for fimd. The common mapping created is then attached to fimd.
> This needs to be done for all the other devices (hdmi, vidi etc).
> [PATCH 2/7] ARM: EXYNOS5: add sysmmu field to fimd device node
> [PATCH 3/7] drm/exynos: add IOMMU support to drm fimd
>
> During DRM's probe which happens last, the common mapping is set to its
> archdata
> and iommu ops are set as its dma ops. This requires a modification in the
> dma-mapping framework so that the iommu ops can be visible to all drivers.
> [PATCH 4/7] ARM: dma-mapping: rename and export iommu_ops
> [PATCH 5/7] drm/exynos: attach drm device with common drm mapping
>
> Currently allocation and free use the iommu framework by calling
> dma_alloc_writecombine and dma_free_writecombine respectively.
> For mapping the buffers to user space, the mmap functions assume that
> the buffer is contiguous. This is modified by calling
> dma_mmap_writecombine.
> [PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function
> [PATCH 7/7] Add IOMMU support for mapping gem object
>
> The device tree based patches are based on Leela's patch which was posted
> last week for adding DT support to DRM FIMD. The patch to add sysmmu
> field is for reference only and will be posted to the device tree
> mailing list. Same with the rename and export iommu_ops patch.
>
> These patches are tested on Exynos5250 SMDK board and tested with modetest
> from libdrm tests.
>
> Prathyush K (7):
>   drm/exynos: create common IOMMU mapping for DRM
>   ARM: EXYNOS5: add sysmmu field to fimd device node
>   drm/exynos: add IOMMU support to drm fimd
>   ARM: dma-mapping: rename and export iommu_ops
>   drm/exynos: attach drm device with common drm mapping
>   drm/exynos: Add exynos drm specific fb_mmap function
>   drm/exynos: Add IOMMU support for mapping gem object
>
>  arch/arm/boot/dts/exynos5250.dtsi         |    1 +
>  arch/arm/include/asm/dma-mapping.h        |    1 +
>  arch/arm/mm/dma-mapping.c                 |    5 ++-
>  drivers/gpu/drm/exynos/exynos_drm_core.c  |    3 ++
>  drivers/gpu/drm/exynos/exynos_drm_drv.c   |   30 
>  drivers/gpu/drm/exynos/exynos_drm_drv.h   |   10 +
>  drivers/gpu/drm/exynos/exynos_drm_fbdev.c |   16 
>  drivers/gpu/drm/exynos/exynos_drm_fimd.c  |   54 +-
>  drivers/gpu/drm/exynos/exynos_drm_gem.c   |   35 --
>  9 files changed, 133 insertions(+), 22 deletions(-)
[PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM
On 07/13/2012 12:09 PM, Inki Dae wrote:
>
>> -----Original Message-----
>> From: Prathyush K [mailto:prathyush.k at samsung.com]
>> Sent: Wednesday, July 11, 2012 6:40 PM
>> To: dri-devel at lists.freedesktop.org
>> Cc: prathyush at chromium.org; m.szyprowski at samsung.com;
>> inki.dae at samsung.com; subash.ramaswamy at linaro.org
>> Subject: [PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM
>>
>> The dma-mapping framework needs a IOMMU mapping to be created for the
>> device which allocates/maps/frees the non-contig buffer. In the DRM
>> framework, a gem buffer is created by the DRM virtual device and not
>> directly by any of the physical devices (FIMD, HDMI etc). Each gem object
>> can be set as a framebuffer to one or many of the drm devices. So a gem
>> object cannot be allocated for any one device. All the DRM devices should
>> be able to access this buffer.
>>
> It's good to use unified iommu table so I agree to your opinion but we don't
> decide whether we use dma mapping api or not. now dma mapping api has one
> issue.
> in case of using iommu with dma mapping api, we couldn't use physically
> contiguous memory region with iommu. for this, there is a case that we
> should use physically contiguous memory region with iommu. it is because we
> sometime may use mfc(hw video codec) with secure zone such as ARM TrustZone.
> Then, it needs physically contiguous memory region.
>
> Thanks,
> Inki Dae

I agree. In the mainline code, as of now only arm_dma_ops supports allocating from CMA. But in arm_iommu_alloc_attrs() there is no way to know whether the device has declared a contiguous memory range, because we don't store that cookie in the device during dma_declare_contiguous(). So is it advisable to store such information, like the mapping (in the iommu operations), in the device's archdata?

Regards,
Subash

>
>> The proposed method is to create a common IOMMU mapping during drm init.
>> This
>> mapping is then attached to all of the drm devices including the drm
>> device.
>> [PATCH 1/7] drm/exynos: create common IOMMU mapping for DRM
>>
>> During the probe of drm fimd, the driver retrieves a 'sysmmu' field
>> in the device node for fimd. If such a field exists, the driver retrieves
>> the
>> platform device of the sysmmu device. This sysmmu is set as the sysmmu
>> for fimd. The common mapping created is then attached to fimd.
>> This needs to be done for all the other devices (hdmi, vidi etc).
>> [PATCH 2/7] ARM: EXYNOS5: add sysmmu field to fimd device node
>> [PATCH 3/7] drm/exynos: add IOMMU support to drm fimd
>>
>> During DRM's probe which happens last, the common mapping is set to its
>> archdata
>> and iommu ops are set as its dma ops. This requires a modification in the
>> dma-mapping framework so that the iommu ops can be visible to all drivers.
>> [PATCH 4/7] ARM: dma-mapping: rename and export iommu_ops
>> [PATCH 5/7] drm/exynos: attach drm device with common drm mapping
>>
>> Currently allocation and free use the iommu framework by calling
>> dma_alloc_writecombine and dma_free_writecombine respectively.
>> For mapping the buffers to user space, the mmap functions assume that
>> the buffer is contiguous. This is modified by calling
>> dma_mmap_writecombine.
>> [PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function
>> [PATCH 7/7] Add IOMMU support for mapping gem object
>>
>> The device tree based patches are based on Leela's patch which was posted
>> last week for adding DT support to DRM FIMD. The patch to add sysmmu
>> field is for reference only and will be posted to the device tree
>> mailing list. Same with the rename and export iommu_ops patch.
>>
>> These patches are tested on Exynos5250 SMDK board and tested with modetest
>> from libdrm tests.
>>
>> Prathyush K (7):
>>   drm/exynos: create common IOMMU mapping for DRM
>>   ARM: EXYNOS5: add sysmmu field to fimd device node
>>   drm/exynos: add IOMMU support to drm fimd
>>   ARM: dma-mapping: rename and export iommu_ops
>>   drm/exynos: attach drm device with common drm mapping
>>   drm/exynos: Add exynos drm specific fb_mmap function
>>   drm/exynos: Add IOMMU support for mapping gem object
>>
>>  arch/arm/boot/dts/exynos5250.dtsi         |    1 +
>>  arch/arm/include/asm/dma-mapping.h        |    1 +
>>  arch/arm/mm/dma-mapping.c                 |    5 ++-
>>  drivers/gpu/drm/exynos/exynos_drm_core.c  |    3 ++
>>  drivers/gpu/drm/exynos/exynos_drm_drv.c   |   30 
>>  drivers/gpu/drm/exynos/exynos_drm_drv.h   |   10 +
>>  drivers/gpu/drm/exynos/exynos_drm_fbdev.c |   16 
>>  drivers/gpu/drm/exynos/exynos_drm_fimd.c  |   54 +-
>>  drivers/gpu/drm/exynos/exynos_drm_gem.c   |   35 --
>>  9 files changed, 133 insertions(+), 22 deletions(-)
[PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3
On 12.07.2012 18:36, Alex Deucher wrote:
> On Thu, Jul 12, 2012 at 12:12 PM, Christian König wrote:
>> Before emitting any indirect buffer, emit the offset of the next
>> valid ring content if any. This allows code that wants to resume
>> a ring to resume it right after the ib that caused the GPU lockup.
>>
>> v2: use scratch registers instead of storing it into memory
>> v3: skip over the surface sync for ni and si as well
>>
>> Signed-off-by: Jerome Glisse
>> Signed-off-by: Christian König
>> ---
>>  drivers/gpu/drm/radeon/evergreen.c   |    8 +++-
>>  drivers/gpu/drm/radeon/ni.c          |   11 ++-
>>  drivers/gpu/drm/radeon/r600.c        |   18 --
>>  drivers/gpu/drm/radeon/radeon.h      |    1 +
>>  drivers/gpu/drm/radeon/radeon_ring.c |    4 
>>  drivers/gpu/drm/radeon/rv770.c       |    4 +++-
>>  drivers/gpu/drm/radeon/si.c          |   22 +++---
>>  7 files changed, 60 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/evergreen.c
>> b/drivers/gpu/drm/radeon/evergreen.c
>> index f39b900..40de347 100644
>> --- a/drivers/gpu/drm/radeon/evergreen.c
>> +++ b/drivers/gpu/drm/radeon/evergreen.c
>> @@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct radeon_device
>> *rdev, struct radeon_ib *ib)
>> 	/* set to DX10/11 mode */
>> 	radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
>> 	radeon_ring_write(ring, 1);
>> -	/* FIXME: implement */
>> +
>> +	if (ring->rptr_save_reg) {
>> +		uint32_t next_rptr = ring->wptr + 2 + 4;
>> +		radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
>> +		radeon_ring_write(ring, next_rptr);
>> +	}
> On r600 and newer please use SET_CONFIG_REG rather than Packet0.

Why? Please note that it's on purpose that this doesn't interfere with the top/bottom of pipe handling and the draw commands, e.g. the register write isn't associated with drawing but instead just marks the beginning of parsing the IB.

Christian.
>
> Alex
>
>> +
>> 	radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
>> 	radeon_ring_write(ring,
>> #ifdef __BIG_ENDIAN
>> diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
>> index f2afefb..5b7ce2c 100644
>> --- a/drivers/gpu/drm/radeon/ni.c
>> +++ b/drivers/gpu/drm/radeon/ni.c
>> @@ -855,6 +855,13 @@ void cayman_ring_ib_execute(struct radeon_device *rdev,
>> struct radeon_ib *ib)
>> 	/* set to DX10/11 mode */
>> 	radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
>> 	radeon_ring_write(ring, 1);
>> +
>> +	if (ring->rptr_save_reg) {
>> +		uint32_t next_rptr = ring->wptr + 2 + 4 + 8;
>> +		radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
>> +		radeon_ring_write(ring, next_rptr);
>> +	}
>> +
>> 	radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
>> 	radeon_ring_write(ring,
>> #ifdef __BIG_ENDIAN
>> @@ -981,8 +988,10 @@ static int cayman_cp_start(struct radeon_device *rdev)
>>
>> static void cayman_cp_fini(struct radeon_device *rdev)
>> {
>> +	struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
>> 	cayman_cp_enable(rdev, false);
>> -	radeon_ring_fini(rdev, &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]);
>> +	radeon_ring_fini(rdev, ring);
>> +	radeon_scratch_free(rdev, ring->rptr_save_reg);
>> }
>>
>> int cayman_cp_resume(struct radeon_device *rdev)
>> diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
>> index c808fa9..74fca15 100644
>> --- a/drivers/gpu/drm/radeon/r600.c
>> +++ b/drivers/gpu/drm/radeon/r600.c
>> @@ -2155,18 +2155,27 @@ int r600_cp_resume(struct radeon_device *rdev)
>> void r600_ring_init(struct radeon_device *rdev, struct radeon_ring *ring,
>> unsigned ring_size)
>> {
>> 	u32 rb_bufsz;
>> +	int r;
>>
>> 	/* Align ring size */
>> 	rb_bufsz = drm_order(ring_size / 8);
>> 	ring_size = (1 << (rb_bufsz + 1)) * 4;
>> 	ring->ring_size = ring_size;
>> 	ring->align_mask = 16 - 1;
>> +
>> +	r = radeon_scratch_get(rdev, &ring->rptr_save_reg);
>> +	if (r) {
>> +		DRM_ERROR("failed to get scratch reg for rptr save (%d).\n", r);
>> +		ring->rptr_save_reg = 0;
>> +	}
>> }
>>
>> void r600_cp_fini(struct radeon_device *rdev)
>> {
>> +	struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
>> 	r600_cp_stop(rdev);
>> -	radeon_ring_fini(rdev, &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]);
>> +	radeon_ring_fini(rdev, ring);
>> +	radeon_scratch_free(rdev, ring->rptr_save_reg);
>> }
>>
>>
>> @@ -2568,7 +2577,12 @@ void r600_ring_ib_execute(struct radeon_device *rdev,
>> struct radeon_ib *ib)
>> {
>> 	struct radeon_ring *ring = &rdev->ring[ib->ring];
>>
>> -	/* FIXME: implement */
>> +	if (ring->rptr_save_reg) {
>> +
[PATCH 3/3] drm/radeon: fix const IB handling
On Fri, Jul 13, 2012 at 10:08 AM, Christian König wrote:
> Const IBs are executed on the CE not the CP, so we can't
> fence them in the normal way.
>
> So submit them directly before the IB instead, just as
> the documentation says.
>
> Signed-off-by: Christian König
> ---
>  drivers/gpu/drm/radeon/r100.c        |    2 +-
>  drivers/gpu/drm/radeon/r600.c        |    2 +-
>  drivers/gpu/drm/radeon/radeon.h      |    3 ++-
>  drivers/gpu/drm/radeon/radeon_cs.c   |   25 +++--
>  drivers/gpu/drm/radeon/radeon_ring.c |   10 +-
>  5 files changed, 24 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
> index e0f5ae8..4ee5a74 100644
> --- a/drivers/gpu/drm/radeon/r100.c
> +++ b/drivers/gpu/drm/radeon/r100.c
> @@ -3693,7 +3693,7 @@ int r100_ib_test(struct radeon_device *rdev, struct
> radeon_ring *ring)
> 	ib.ptr[6] = PACKET2(0);
> 	ib.ptr[7] = PACKET2(0);
> 	ib.length_dw = 8;
> -	r = radeon_ib_schedule(rdev, &ib);
> +	r = radeon_ib_schedule(rdev, &ib, NULL);
> 	if (r) {
> 		radeon_scratch_free(rdev, scratch);
> 		radeon_ib_free(rdev, &ib);
> diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
> index 3156d25..c2e5069 100644
> --- a/drivers/gpu/drm/radeon/r600.c
> +++ b/drivers/gpu/drm/radeon/r600.c
> @@ -2619,7 +2619,7 @@ int r600_ib_test(struct radeon_device *rdev, struct
> radeon_ring *ring)
> 	ib.ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) >> 2);
> 	ib.ptr[2] = 0xDEADBEEF;
> 	ib.length_dw = 3;
> -	r = radeon_ib_schedule(rdev, &ib);
> +	r = radeon_ib_schedule(rdev, &ib, NULL);
> 	if (r) {
> 		radeon_scratch_free(rdev, scratch);
> 		radeon_ib_free(rdev, &ib);
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index 2cb355b..2d7f06c 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -751,7 +751,8 @@ struct si_rlc {
> int radeon_ib_get(struct radeon_device *rdev, int ring,
> 		  struct radeon_ib *ib, unsigned size);
> void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib);
> -int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib);
> +int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib,
> +		       struct radeon_ib *const_ib);
> int radeon_ib_pool_init(struct radeon_device *rdev);
> void radeon_ib_pool_fini(struct radeon_device *rdev);
> int radeon_ib_ring_tests(struct radeon_device *rdev);
> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c
> b/drivers/gpu/drm/radeon/radeon_cs.c
> index 553da67..d0be5d5 100644
> --- a/drivers/gpu/drm/radeon/radeon_cs.c
> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
> @@ -354,7 +354,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
> 	}
> 	radeon_cs_sync_rings(parser);
> 	parser->ib.vm_id = 0;
> -	r = radeon_ib_schedule(rdev, &parser->ib);
> +	r = radeon_ib_schedule(rdev, &parser->ib, NULL);
> 	if (r) {
> 		DRM_ERROR("Failed to schedule IB !\n");
> 	}
> @@ -452,25 +452,22 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device
> *rdev,
> 	}
> 	radeon_cs_sync_rings(parser);
>
> +	parser->ib.vm_id = vm->id;
> +	/* ib pool is bind at 0 in virtual address space,
> +	 * so gpu_addr is the offset inside the pool bo
> +	 */
> +	parser->ib.gpu_addr = parser->ib.sa_bo->soffset;
> +
> 	if ((rdev->family >= CHIP_TAHITI) &&
> 	    (parser->chunk_const_ib_idx != -1)) {
> 		parser->const_ib.vm_id = vm->id;
> -		/* ib pool is bind at 0 in virtual address space to gpu_addr
> is the
> -		 * offset inside the pool bo
> -		 */
> +		/* same reason as above */

Don't remove the comment. The code might move and the comment above might not stay the same; better to duplicate the comment than to try to cross-reference comments across the file.

> 		parser->const_ib.gpu_addr = parser->const_ib.sa_bo->soffset;
> -		r = radeon_ib_schedule(rdev, &parser->const_ib);
> -		if (r)
> -			goto out;
> +		r = radeon_ib_schedule(rdev, &parser->ib, &parser->const_ib);
> +	} else {
> +		r = radeon_ib_schedule(rdev, &parser->ib, NULL);
> 	}
>
> -	parser->ib.vm_id = vm->id;
> -	/* ib pool is bind at 0 in virtual address space to gpu_addr is the
> -	 * offset inside the pool bo
> -	 */
> -	parser->ib.gpu_addr = parser->ib.sa_bo->soffset;
> -	parser->ib.is_const_ib = false;
> -	r = radeon_ib_schedule(rdev, &parser->ib);
> out:
> 	if (!r) {
> 		if (vm->fence) {
> diff --git a/drivers/gpu/drm/radeon/radeon_ring.c
> b/drivers/gpu/drm/radeon/radeon_ring.c
> index 75cbe46..c48c354 100644
> --- a/drivers/gpu/drm/radeon/radeon_ring.c
> +++
[RFC] dma-fence: dma-buf synchronization (v2)
From: Rob Clark
A dma-fence can be attached to a buffer which is being filled or consumed by hw, to allow userspace to pass the buffer without waiting to another device. For example, userspace can call page_flip ioctl to display the next frame of graphics after kicking the GPU but while the GPU is still rendering. The display device sharing the buffer with the GPU would attach a callback to get notified when the GPU's rendering-complete IRQ fires, to update the scan-out address of the display, without having to wake up userspace. A dma-fence is a transient, one-shot deal. It is allocated and attached to a dma-buf's list of fences. When the one that attached it is done with the pending operation, it can signal the fence, removing it from the dma-buf's list of fences: + dma_buf_attach_fence() + dma_fence_signal() Other drivers can access the current fence on the dma-buf (if any), which increments the fence's refcnt: + dma_buf_get_fence() + dma_fence_put() The one pending on the fence can add an async callback (and optionally cancel it.. for example, to recover from GPU hangs): + dma_fence_add_callback() + dma_fence_cancel_callback() Or wait synchronously (optionally with timeout or from atomic context): + dma_fence_wait() A default software-only implementation is provided, which can be used by drivers attaching a fence to a buffer when they have no other means for hw sync. But a memory backed fence is also envisioned, because it is common that GPUs can write to, or poll on some memory location for synchronization. For example: fence = dma_buf_get_fence(dmabuf); if (fence->ops == &mem_dma_fence_ops) { dma_buf *fence_buf; mem_dma_fence_get_buf(fence, &fence_buf, ); ... tell the hw the memory location to wait on ... } else { /* fall-back to sw sync */ dma_fence_add_callback(fence, my_cb); } The memory location is itself backed by dma-buf, to simplify mapping to the device's address space, an idea borrowed from Maarten Lankhorst.
NOTE: the memory location fence is not implemented yet, the above is just for explaining how it would work. On SoC platforms, if some other hw mechanism is provided for synchronizing between IP blocks, it could be supported as an alternate implementation with its own fence ops in a similar way. The other non-sw implementations would wrap the add/cancel_callback and wait fence ops, so that they can keep track of whether a device not supporting hw sync is waiting on the fence, and in this case should arrange to call dma_fence_signal() at some point after the condition has changed, to notify other devices waiting on the fence. If there are no sw waiters, this can be skipped to avoid waking the CPU unnecessarily. The intention is to provide a userspace interface (presumably via eventfd) later, to be used in conjunction with dma-buf's mmap support for sw access to buffers (or for userspace apps that would prefer to do their own synchronization). v1: original v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided that dma-fence didn't need to care about the sw->hw signaling path (it can be handled same as sw->sw case), and therefore the fence->ops can be simplified and more handled in the core. So remove the signal, add_callback, cancel_callback, and wait ops, and replace with a simple enable_signaling() op which can be used to inform a fence supporting hw->hw signaling that one or more devices which do not support hw signaling are waiting (and therefore it should enable an irq or do whatever is necessary in order that the CPU is notified when the fence is passed).
--- drivers/base/Makefile |2 +- drivers/base/dma-buf.c|3 + drivers/base/dma-fence.c | 364 + include/linux/dma-buf.h |2 + include/linux/dma-fence.h | 128 5 files changed, 498 insertions(+), 1 deletion(-) create mode 100644 drivers/base/dma-fence.c create mode 100644 include/linux/dma-fence.h diff --git a/drivers/base/Makefile b/drivers/base/Makefile index 5aa2d70..6e9f217 100644 --- a/drivers/base/Makefile +++ b/drivers/base/Makefile @@ -10,7 +10,7 @@ obj-$(CONFIG_CMA) += dma-contiguous.o obj-y += power/ obj-$(CONFIG_HAS_DMA) += dma-mapping.o obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o -obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o +obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o dma-fence.o obj-$(CONFIG_ISA) += isa.o obj-$(CONFIG_FW_LOADER)+= firmware_class.o obj-$(CONFIG_NUMA) += node.o diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c index 24e88fe..b053236 100644 --- a/drivers/base/dma-buf.c +++ b/drivers/base/dma-buf.c @@ -39,6 +39,8 @@ static int dma_buf_release(struct inode *inode, struct file *file) dmabuf = file->private_data; + WARN_ON(!list_empty(&dmabuf->fence_list)); + dmabuf->ops->release(dmabuf); kfree(dmabuf); return
[PATCH 3/3] drm/radeon: fix const IB handling
On Fri, Jul 13, 2012 at 04:08:15PM +0200, Christian König wrote: > Const IBs are executed on the CE not the CP, so we can't > fence them in the normal way. > > So submit them directly before the IB instead, just as > the documentation says. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/radeon/r100.c|2 +- > drivers/gpu/drm/radeon/r600.c|2 +- > drivers/gpu/drm/radeon/radeon.h |3 ++- > drivers/gpu/drm/radeon/radeon_cs.c | 25 +++-- > drivers/gpu/drm/radeon/radeon_ring.c | 10 +- > 5 files changed, 24 insertions(+), 18 deletions(-) > > diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c > index e0f5ae8..4ee5a74 100644 > --- a/drivers/gpu/drm/radeon/r100.c > +++ b/drivers/gpu/drm/radeon/r100.c > @@ -3693,7 +3693,7 @@ int r100_ib_test(struct radeon_device *rdev, struct > radeon_ring *ring) > ib.ptr[6] = PACKET2(0); > ib.ptr[7] = PACKET2(0); > ib.length_dw = 8; > - r = radeon_ib_schedule(rdev, &ib); > + r = radeon_ib_schedule(rdev, &ib, NULL); > if (r) { > radeon_scratch_free(rdev, scratch); > radeon_ib_free(rdev, &ib); > diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c > index 3156d25..c2e5069 100644 > --- a/drivers/gpu/drm/radeon/r600.c > +++ b/drivers/gpu/drm/radeon/r600.c > @@ -2619,7 +2619,7 @@ int r600_ib_test(struct radeon_device *rdev, struct > radeon_ring *ring) > ib.ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) >> 2); > ib.ptr[2] = 0xDEADBEEF; > ib.length_dw = 3; > - r = radeon_ib_schedule(rdev, &ib); > + r = radeon_ib_schedule(rdev, &ib, NULL); > if (r) { > radeon_scratch_free(rdev, scratch); > radeon_ib_free(rdev, &ib); > diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h > index 2cb355b..2d7f06c 100644 > --- a/drivers/gpu/drm/radeon/radeon.h > +++ b/drivers/gpu/drm/radeon/radeon.h > @@ -751,7 +751,8 @@ struct si_rlc { > int radeon_ib_get(struct radeon_device *rdev, int ring, > struct radeon_ib *ib, unsigned size); > void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib
*ib); > -int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib); > +int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib, > +struct radeon_ib *const_ib); > int radeon_ib_pool_init(struct radeon_device *rdev); > void radeon_ib_pool_fini(struct radeon_device *rdev); > int radeon_ib_ring_tests(struct radeon_device *rdev); > diff --git a/drivers/gpu/drm/radeon/radeon_cs.c > b/drivers/gpu/drm/radeon/radeon_cs.c > index 553da67..d0be5d5 100644 > --- a/drivers/gpu/drm/radeon/radeon_cs.c > +++ b/drivers/gpu/drm/radeon/radeon_cs.c > @@ -354,7 +354,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev, > } > radeon_cs_sync_rings(parser); > parser->ib.vm_id = 0; > - r = radeon_ib_schedule(rdev, &parser->ib); > + r = radeon_ib_schedule(rdev, &parser->ib, NULL); > if (r) { > DRM_ERROR("Failed to schedule IB !\n"); > } > @@ -452,25 +452,22 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device > *rdev, > } > radeon_cs_sync_rings(parser); > > + parser->ib.vm_id = vm->id; > + /* ib pool is bind at 0 in virtual address space, > + * so gpu_addr is the offset inside the pool bo > + */ > + parser->ib.gpu_addr = parser->ib.sa_bo->soffset; > + > if ((rdev->family >= CHIP_TAHITI) && > (parser->chunk_const_ib_idx != -1)) { > parser->const_ib.vm_id = vm->id; > - /* ib pool is bind at 0 in virtual address space to gpu_addr is > the > - * offset inside the pool bo > - */ > + /* same reason as above */ > parser->const_ib.gpu_addr = parser->const_ib.sa_bo->soffset; > - r = radeon_ib_schedule(rdev, &parser->const_ib); > - if (r) > - goto out; > + r = radeon_ib_schedule(rdev, &parser->ib, &parser->const_ib); > + } else { > + r = radeon_ib_schedule(rdev, &parser->ib, NULL); > } > > - parser->ib.vm_id = vm->id; > - /* ib pool is bind at 0 in virtual address space to gpu_addr is the > - * offset inside the pool bo > - */ > - parser->ib.gpu_addr = parser->ib.sa_bo->soffset; > - parser->ib.is_const_ib = false; > - r = radeon_ib_schedule(rdev, &parser->ib); > out: > if (!r) { > if (vm->fence) { > diff --git
a/drivers/gpu/drm/radeon/radeon_ring.c > b/drivers/gpu/drm/radeon/radeon_ring.c > index 75cbe46..c48c354 100644 > --- a/drivers/gpu/drm/radeon/radeon_ring.c > +++ b/drivers/gpu/drm/radeon/radeon_ring.c > @@ -74,7 +74,8 @@ void radeon_ib_free(struct radeon_device *rdev, struct > radeon_ib *ib) > radeon_fence_unref(&ib->fence); > } > > -int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib) > +int radeon_ib_schedule(struct
[PATCH 2/3] drm/radeon: let sa manager block for fences to wait for
On Fri, Jul 13, 2012 at 04:08:14PM +0200, Christian König wrote: > Otherwise we can encounter out of memory situations under extreme load. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/radeon/radeon.h|2 +- > drivers/gpu/drm/radeon/radeon_sa.c | 72 > +--- > 2 files changed, 51 insertions(+), 23 deletions(-) > > diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h > index 6715e4c..2cb355b 100644 > --- a/drivers/gpu/drm/radeon/radeon.h > +++ b/drivers/gpu/drm/radeon/radeon.h > @@ -362,7 +362,7 @@ struct radeon_bo_list { > * alignment). > */ > struct radeon_sa_manager { > - spinlock_t lock; > + wait_queue_head_t wq; > struct radeon_bo*bo; > struct list_head*hole; > struct list_headflist[RADEON_NUM_RINGS]; > diff --git a/drivers/gpu/drm/radeon/radeon_sa.c > b/drivers/gpu/drm/radeon/radeon_sa.c > index 81dbb5b..b535fc4 100644 > --- a/drivers/gpu/drm/radeon/radeon_sa.c > +++ b/drivers/gpu/drm/radeon/radeon_sa.c > @@ -54,7 +54,7 @@ int radeon_sa_bo_manager_init(struct radeon_device *rdev, > { > int i, r; > > - spin_lock_init(&sa_manager->lock); > + init_waitqueue_head(&sa_manager->wq); > sa_manager->bo = NULL; > sa_manager->size = size; > sa_manager->domain = domain; > @@ -211,6 +211,29 @@ static bool radeon_sa_bo_try_alloc(struct > radeon_sa_manager *sa_manager, > return false; > } > > +static bool radeon_sa_event(struct radeon_sa_manager *sa_manager, > + unsigned size, unsigned align) > +{ > + unsigned soffset, eoffset, wasted; > + int i; > + > + for (i = 0; i < RADEON_NUM_RINGS; ++i) { > + if (!list_empty(&sa_manager->flist[i])) { > + return true; > + } > + } > + > + soffset = radeon_sa_bo_hole_soffset(sa_manager); > + eoffset = radeon_sa_bo_hole_eoffset(sa_manager); > + wasted = (align - (soffset % align)) % align; > + > + if ((eoffset - soffset) >= (size + wasted)) { > + return true; > + } > + > + return false; > +} > + This new function should come with a comment, per the new documentation rules.
> static bool radeon_sa_bo_next_hole(struct radeon_sa_manager *sa_manager, > struct radeon_fence **fences, > unsigned *tries) > @@ -297,8 +320,8 @@ int radeon_sa_bo_new(struct radeon_device *rdev, > INIT_LIST_HEAD(&(*sa_bo)->olist); > INIT_LIST_HEAD(&(*sa_bo)->flist); > > - spin_lock(&sa_manager->lock); > - do { > + spin_lock(&sa_manager->wq.lock); > + while (1) { > for (i = 0; i < RADEON_NUM_RINGS; ++i) { > fences[i] = NULL; > tries[i] = 0; > @@ -309,30 +332,34 @@ int radeon_sa_bo_new(struct radeon_device *rdev, > > if (radeon_sa_bo_try_alloc(sa_manager, *sa_bo, > size, align)) { > - spin_unlock(&sa_manager->lock); > + spin_unlock(&sa_manager->wq.lock); > return 0; > } > > /* see if we can skip over some allocations */ > } while (radeon_sa_bo_next_hole(sa_manager, fences, tries)); > > - if (block) { > - spin_unlock(&sa_manager->lock); > - r = radeon_fence_wait_any(rdev, fences, false); > - spin_lock(&sa_manager->lock); > - if (r) { > - /* if we have nothing to wait for we > -are practically out of memory */ > - if (r == -ENOENT) { > - r = -ENOMEM; > - } > - goto out_err; > - } > + if (!block) { > + break; > + } > + > + spin_unlock(&sa_manager->wq.lock); > + r = radeon_fence_wait_any(rdev, fences, false); > + spin_lock(&sa_manager->wq.lock); > + /* if we have nothing to wait for block */ > + if (r == -ENOENT) { > + r = wait_event_interruptible_locked( > + sa_manager->wq, > + radeon_sa_event(sa_manager, size, align) > + ); > + } > + if (r) { > + goto out_err; > } > - } while (block); > + }; > > out_err: > - spin_unlock(&sa_manager->wq.lock); > + spin_unlock(&sa_manager->wq.lock); > kfree(*sa_bo); > *sa_bo = NULL; > return r; > @@ -348,7 +375,7 @@ void radeon_sa_bo_free(struct radeon_device *rdev, struct > radeon_sa_bo **sa_bo, > } > > sa_manager = (*sa_bo)->manager; > -
[PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3
On Fri, Jul 13, 2012 at 9:46 AM, Christian König wrote: > On 13.07.2012 14:27, Alex Deucher wrote: >> >> On Fri, Jul 13, 2012 at 5:09 AM, Christian König >> wrote: >>> >>> On 12.07.2012 18:36, Alex Deucher wrote: On Thu, Jul 12, 2012 at 12:12 PM, Christian König wrote: > > Before emitting any indirect buffer, emit the offset of the next > valid ring content if any. This allow code that want to resume > ring to resume ring right after ib that caused GPU lockup. > > v2: use scratch registers instead of storing it into memory > v3: skip over the surface sync for ni and si as well > > Signed-off-by: Jerome Glisse > Signed-off-by: Christian König > --- >drivers/gpu/drm/radeon/evergreen.c |8 +++- >drivers/gpu/drm/radeon/ni.c | 11 ++- >drivers/gpu/drm/radeon/r600.c| 18 -- >drivers/gpu/drm/radeon/radeon.h |1 + >drivers/gpu/drm/radeon/radeon_ring.c |4 >drivers/gpu/drm/radeon/rv770.c |4 +++- >drivers/gpu/drm/radeon/si.c | 22 +++--- >7 files changed, 60 insertions(+), 8 deletions(-) > > diff --git a/drivers/gpu/drm/radeon/evergreen.c > b/drivers/gpu/drm/radeon/evergreen.c > index f39b900..40de347 100644 > --- a/drivers/gpu/drm/radeon/evergreen.c > +++ b/drivers/gpu/drm/radeon/evergreen.c > @@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct > radeon_device *rdev, struct radeon_ib *ib) > /* set to DX10/11 mode */ > radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0)); > radeon_ring_write(ring, 1); > - /* FIXME: implement */ > + > + if (ring->rptr_save_reg) { > + uint32_t next_rptr = ring->wptr + 2 + 4; > + radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, > 0)); > + radeon_ring_write(ring, next_rptr); > + } On r600 and newer please use SET_CONFIG_REG rather than Packet0. >>> >>> Why? Please note that it's on purpose that this doesn't interfere with >>> the >>> top/bottom of pipe handling and the draw commands, e.g. the register >>> write >>> isn't associated with drawing but instead just marks the beginning of >>> parsing the IB.
>> >> Packet0s have been semi-deprecated since r600. They still work, >> but the CP guys recommend using the appropriate packet3 whenever >> possible. > > Ok, that makes sense. > > Any further comments on the patchset, or can I send that to Dave for merging > now? Other than that, it looks good to me. For the series: Reviewed-by: Alex Deucher
[PATCH] drm/radeon: fix bo creation retry path
On 13.07.2012 00:23, j.glisse at gmail.com wrote: > From: Jerome Glisse > > Retry label was at wrong place in function leading to memory > leak. > > Cc: > Signed-off-by: Jerome Glisse Reviewed-by: Christian König > --- > drivers/gpu/drm/radeon/radeon_object.c |3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/radeon/radeon_object.c > b/drivers/gpu/drm/radeon/radeon_object.c > index 6ecb200..f71e472 100644 > --- a/drivers/gpu/drm/radeon/radeon_object.c > +++ b/drivers/gpu/drm/radeon/radeon_object.c > @@ -138,7 +138,6 @@ int radeon_bo_create(struct radeon_device *rdev, > acc_size = ttm_bo_dma_acc_size(&rdev->mman.bdev, size, > sizeof(struct radeon_bo)); > > -retry: > bo = kzalloc(sizeof(struct radeon_bo), GFP_KERNEL); > if (bo == NULL) > return -ENOMEM; > @@ -152,6 +151,8 @@ retry: > bo->surface_reg = -1; > INIT_LIST_HEAD(&bo->list); > INIT_LIST_HEAD(&bo->va); > + > +retry: > radeon_ttm_placement_from_domain(bo, domain); > /* Kernel allocation are uninterruptible */ > down_read(&rdev->pm.mclk_lock);
[PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3
On Fri, Jul 13, 2012 at 5:09 AM, Christian König wrote: > On 12.07.2012 18:36, Alex Deucher wrote: >> >> On Thu, Jul 12, 2012 at 12:12 PM, Christian König >> wrote: >>> >>> Before emitting any indirect buffer, emit the offset of the next >>> valid ring content if any. This allow code that want to resume >>> ring to resume ring right after ib that caused GPU lockup. >>> >>> v2: use scratch registers instead of storing it into memory >>> v3: skip over the surface sync for ni and si as well >>> >>> Signed-off-by: Jerome Glisse >>> Signed-off-by: Christian König >>> --- >>> drivers/gpu/drm/radeon/evergreen.c |8 +++- >>> drivers/gpu/drm/radeon/ni.c | 11 ++- >>> drivers/gpu/drm/radeon/r600.c| 18 -- >>> drivers/gpu/drm/radeon/radeon.h |1 + >>> drivers/gpu/drm/radeon/radeon_ring.c |4 >>> drivers/gpu/drm/radeon/rv770.c |4 +++- >>> drivers/gpu/drm/radeon/si.c | 22 +++--- >>> 7 files changed, 60 insertions(+), 8 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/radeon/evergreen.c >>> b/drivers/gpu/drm/radeon/evergreen.c >>> index f39b900..40de347 100644 >>> --- a/drivers/gpu/drm/radeon/evergreen.c >>> +++ b/drivers/gpu/drm/radeon/evergreen.c >>> @@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct >>> radeon_device *rdev, struct radeon_ib *ib) >>> /* set to DX10/11 mode */ >>> radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0)); >>> radeon_ring_write(ring, 1); >>> - /* FIXME: implement */ >>> + >>> + if (ring->rptr_save_reg) { >>> + uint32_t next_rptr = ring->wptr + 2 + 4; >>> + radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0)); >>> + radeon_ring_write(ring, next_rptr); >>> + } >> >> On r600 and newer please use SET_CONFIG_REG rather than Packet0. > > Why? Please note that it's on purpose that this doesn't interfere with the > top/bottom of pipe handling and the draw commands, e.g. the register write > isn't associated with drawing but instead just marks the beginning of > parsing the IB. Packet0s have been semi-deprecated since r600.
They still work, but the CP guys recommend using the appropriate packet3 whenever possible. Alex
[PATCH] drm/radeon: fix bo creation retry path
On Thu, 2012-07-12 at 18:23 -0400, j.glisse at gmail.com wrote: > From: Jerome Glisse > > Retry label was at wrong place in function leading to memory > leak. > > Cc: > Signed-off-by: Jerome Glisse > --- > drivers/gpu/drm/radeon/radeon_object.c |3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/radeon/radeon_object.c > b/drivers/gpu/drm/radeon/radeon_object.c > index 6ecb200..f71e472 100644 > --- a/drivers/gpu/drm/radeon/radeon_object.c > +++ b/drivers/gpu/drm/radeon/radeon_object.c > @@ -138,7 +138,6 @@ int radeon_bo_create(struct radeon_device *rdev, > acc_size = ttm_bo_dma_acc_size(&rdev->mman.bdev, size, > sizeof(struct radeon_bo)); > > -retry: > bo = kzalloc(sizeof(struct radeon_bo), GFP_KERNEL); > if (bo == NULL) > return -ENOMEM; > @@ -152,6 +151,8 @@ retry: > bo->surface_reg = -1; > INIT_LIST_HEAD(&bo->list); > INIT_LIST_HEAD(&bo->va); > + > +retry: > radeon_ttm_placement_from_domain(bo, domain); > /* Kernel allocation are uninterruptible */ > down_read(&rdev->pm.mclk_lock); Reviewed-by: Michel Dänzer -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer
[PATCH] Documentation: DocBook DRM framework documentation
Signed-off-by: Laurent Pinchart --- Documentation/DocBook/drm.tmpl | 2835 +++- 1 files changed, 2226 insertions(+), 609 deletions(-) Hi everybody, Here's the DRM kernel framework documentation previously posted to the dri-devel mailing list. The documentation has been reworked, converted to DocBook and merged with the existing DocBook DRM documentation stub. The result doesn't cover the whole DRM API but should hopefully be good enough for a start. I've done my best to follow a natural flow starting at initialization and covering the major DRM internal topics. As I'm not a native English speaker I'm not totally happy with the result, so if anyone wants to edit the text please feel free to do so. Review will as usual be appreciated, and acks will be even more welcome (I've been working on this document for longer than I feel comfortable with). diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl index 196b8b9..44a2c66 100644 --- a/Documentation/DocBook/drm.tmpl +++ b/Documentation/DocBook/drm.tmpl @@ -6,11 +6,36 @@ Linux DRM Developer's Guide + + + Jesse + Barnes + Initial version + + Intel Corporation + + jesse.barnes at intel.com + + + + + Laurent + Pinchart + Driver internals + + Ideas on board SPRL + + laurent.pinchart at ideasonboard.com + + + + + 2008-2009 - - Intel Corporation (Jesse Barnes jesse.barnes at intel.com) - + 2012 + Intel Corporation + Laurent Pinchart @@ -20,6 +45,17 @@ the kernel source COPYING file. + + + + + 1.0 + 2012-07-13 + LP + Added extensive documentation about driver internals. + + + @@ -72,342 +108,361 @@ submission fencing, suspend/resume support, and DMA services. - - The core of every DRM driver is struct drm_driver. Drivers - typically statically initialize a drm_driver structure, - then pass it to drm_init() at load time. - -Driver initialization - - Before calling the DRM initialization routines, the driver must - first create and fill out a struct drm_driver structure. 
- - - static struct drm_driver driver = { - /* Don't use MTRRs here; the Xserver or userspace app should -* deal with them for Intel hardware. -*/ - .driver_features = - DRIVER_USE_AGP | DRIVER_REQUIRE_AGP | - DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | DRIVER_MODESET, - .load = i915_driver_load, - .unload = i915_driver_unload, - .firstopen = i915_driver_firstopen, - .lastclose = i915_driver_lastclose, - .preclose = i915_driver_preclose, - .save = i915_save, - .restore = i915_restore, - .device_is_agp = i915_driver_device_is_agp, - .get_vblank_counter = i915_get_vblank_counter, - .enable_vblank = i915_enable_vblank, - .disable_vblank = i915_disable_vblank, - .irq_preinstall = i915_driver_irq_preinstall, - .irq_postinstall = i915_driver_irq_postinstall, - .irq_uninstall = i915_driver_irq_uninstall, - .irq_handler = i915_driver_irq_handler, - .reclaim_buffers = drm_core_reclaim_buffers, - .get_map_ofs = drm_core_get_map_ofs, - .get_reg_ofs = drm_core_get_reg_ofs, - .fb_probe = intelfb_probe, - .fb_remove = intelfb_remove, - .fb_resize = intelfb_resize, - .master_create = i915_master_create, - .master_destroy = i915_master_destroy, -#if defined(CONFIG_DEBUG_FS) - .debugfs_init = i915_debugfs_init, - .debugfs_cleanup = i915_debugfs_cleanup, -#endif - .gem_init_object = i915_gem_init_object, - .gem_free_object = i915_gem_free_object, - .gem_vm_ops = i915_gem_vm_ops, - .ioctls = i915_ioctls, - .fops = { - .owner = THIS_MODULE, - .open = drm_open, - .release = drm_release, - .ioctl = drm_ioctl, - .mmap = drm_mmap, - .poll = drm_poll, - .fasync = drm_fasync, -#ifdef CONFIG_COMPAT - .compat_ioctl = i915_compat_ioctl, -#endif - .llseek = noop_llseek, - }, - .pci_driver = { - .name = DRIVER_NAME, - .id_table = pciidlist, - .probe = probe, - .remove = __devexit_p(drm_cleanup_pci), - }, - .name = DRIVER_NAME, - .desc = DRIVER_DESC, - .date = DRIVER_DATE, - .major = DRIVER_MAJOR, - .minor = DRIVER_MINOR, - .patchlevel = DRIVER_PATCHLEVEL, - }; - - - In the example above, taken from 
the i915 DRM driver, the driver - sets several flags indicating what core features
RE: [PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM
-Original Message- From: Prathyush K [mailto:prathyus...@samsung.com] Sent: Wednesday, July 11, 2012 6:40 PM To: dri-devel@lists.freedesktop.org Cc: prathy...@chromium.org; m.szyprow...@samsung.com; inki@samsung.com; subash.ramasw...@linaro.org Subject: [PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM The dma-mapping framework needs an IOMMU mapping to be created for the device which allocates/maps/frees the non-contig buffer. In the DRM framework, a gem buffer is created by the DRM virtual device and not directly by any of the physical devices (FIMD, HDMI etc). Each gem object can be set as a framebuffer to one or many of the drm devices. So a gem object cannot be allocated for any one device. All the DRM devices should be able to access this buffer. Using a unified IOMMU mapping table is good, so I agree with you, but we haven't decided yet whether to use the dma-mapping API or not. The dma-mapping API currently has one issue: when using an IOMMU through it, we can't get a physically contiguous memory region together with the IOMMU. There are cases where we need a physically contiguous region with the IOMMU, because we may sometimes use the MFC (hw video codec) with a secure zone such as ARM TrustZone, which requires a physically contiguous memory region. Thanks, Inki Dae The proposed method is to create a common IOMMU mapping during drm init. This mapping is then attached to all of the drm devices including the drm device. [PATCH 1/7] drm/exynos: create common IOMMU mapping for DRM During the probe of drm fimd, the driver retrieves a 'sysmmu' field in the device node for fimd. If such a field exists, the driver retrieves the platform device of the sysmmu device. This sysmmu is set as the sysmmu for fimd. The common mapping created is then attached to fimd. This needs to be done for all the other devices (hdmi, vidi etc).
[PATCH 2/7] ARM: EXYNOS5: add sysmmu field to fimd device node [PATCH 3/7] drm/exynos: add IOMMU support to drm fimd During DRM's probe which happens last, the common mapping is set to its archdata and iommu ops are set as its dma ops. This requires a modification in the dma-mapping framework so that the iommu ops can be visible to all drivers. [PATCH 4/7] ARM: dma-mapping: rename and export iommu_ops [PATCH 5/7] drm/exynos: attach drm device with common drm mapping Currently allocation and free use the iommu framework by calling dma_alloc_writecombine and dma_free_writecombine respectively. For mapping the buffers to user space, the mmap functions assume that the buffer is contiguous. This is modified by calling dma_mmap_writecombine. [PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function [PATCH 7/7] Add IOMMU support for mapping gem object The device tree based patches are based on Leela's patch which was posted last week for adding DT support to DRM FIMD. The patch to add sysmmu field is for reference only and will be posted to the device tree mailing list. Same with the rename and export iommu_ops patch. These patches are tested on Exynos5250 SMDK board and tested with modetest from libdrm tests. 
Prathyush K (7): drm/exynos: create common IOMMU mapping for DRM ARM: EXYNOS5: add sysmmu field to fimd device node drm/exynos: add IOMMU support to drm fimd ARM: dma-mapping: rename and export iommu_ops drm/exynos: attach drm device with common drm mapping drm/exynos: Add exynos drm specific fb_mmap function drm/exynos: Add IOMMU support for mapping gem object arch/arm/boot/dts/exynos5250.dtsi |1 + arch/arm/include/asm/dma-mapping.h|1 + arch/arm/mm/dma-mapping.c |5 ++- drivers/gpu/drm/exynos/exynos_drm_core.c |3 ++ drivers/gpu/drm/exynos/exynos_drm_drv.c | 30 drivers/gpu/drm/exynos/exynos_drm_drv.h | 10 + drivers/gpu/drm/exynos/exynos_drm_fbdev.c | 16 drivers/gpu/drm/exynos/exynos_drm_fimd.c | 54 - drivers/gpu/drm/exynos/exynos_drm_gem.c | 35 -- 9 files changed, 133 insertions(+), 22 deletions(-) ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH 3/7] drm/exynos: add IOMMU support to drm fimd
-Original Message- From: Prathyush K [mailto:prathyus...@samsung.com] Sent: Wednesday, July 11, 2012 6:40 PM To: dri-devel@lists.freedesktop.org Cc: prathy...@chromium.org; m.szyprow...@samsung.com; inki@samsung.com; subash.ramasw...@linaro.org Subject: [PATCH 3/7] drm/exynos: add IOMMU support to drm fimd This patch adds device tree based IOMMU support to DRM FIMD. During probe, the driver searches for a 'sysmmu' field in the device node. The sysmmu field points to the corresponding sysmmu device of fimd. This sysmmu device is retrieved and set as fimd's sysmmu. The common IOMMU mapping created during DRM init is then attached to drm fimd. Signed-off-by: Prathyush K prathyus...@samsung.com --- drivers/gpu/drm/exynos/exynos_drm_fimd.c | 54 +- 1 files changed, 53 insertions(+), 1 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c b/drivers/gpu/drm/exynos/exynos_drm_fimd.c index 15b5286..6d4048a 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c +++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c @@ -19,7 +19,7 @@ #include <linux/clk.h> #include <linux/pm_runtime.h> #include <linux/of.h> - +#include <linux/of_platform.h> #include <drm/exynos_drm.h> #include <plat/regs-fb-v4.h> @@ -790,12 +790,56 @@ static int fimd_power_on(struct fimd_context *ctx, bool enable) } #ifdef CONFIG_OF + +#ifdef CONFIG_EXYNOS_IOMMU +static int iommu_init(struct device *dev) +{ + struct platform_device *pds; + struct device_node *dn, *dns; + const __be32 *parp; + int ret; + + dn = dev->of_node; + parp = of_get_property(dn, "sysmmu", NULL); + if (parp == NULL) { + dev_err(dev, "failed to find sysmmu property\n"); + return -EINVAL; + } + dns = of_find_node_by_phandle(be32_to_cpup(parp)); + if (dns == NULL) { + dev_err(dev, "failed to find sysmmu node\n"); + return -EINVAL; + } + pds = of_find_device_by_node(dns); + if (pds == NULL) { + dev_err(dev, "failed to find sysmmu platform device\n"); + return -EINVAL; + } + + platform_set_sysmmu(&pds->dev, dev); + dev->dma_parms = kzalloc(sizeof(*dev->dma_parms), GFP_KERNEL); + if (!dev->dma_parms) { + dev_err(dev, "failed to allocate dma parms\n"); + return -ENOMEM; + } + dma_set_max_seg_size(dev, 0xu); + + ret = arm_iommu_attach_device(dev, exynos_drm_common_mapping); Where is exynos_drm_common_mapping declared? You can get this pointer using the exynos_drm_private structure. + if (ret) { + dev_err(dev, "failed to attach device\n"); + return ret; + } + return 0; +} +#endif + With your patch, we can use the IOMMU feature only with device tree. I think the IOMMU feature should be usable in all configurations. static struct exynos_drm_fimd_pdata *drm_fimd_dt_parse_pdata(struct device *dev) { struct device_node *np = dev->of_node; struct device_node *disp_np; struct exynos_drm_fimd_pdata *pd; u32 data[4]; + int ret; pd = kzalloc(sizeof(*pd), GFP_KERNEL); if (!pd) { @@ -803,6 +847,14 @@ static struct exynos_drm_fimd_pdata *drm_fimd_dt_parse_pdata(struct device *dev) return ERR_PTR(-ENOMEM); } +#ifdef CONFIG_EXYNOS_IOMMU And please avoid such #ifdefs in device drivers. + ret = iommu_init(dev); + if (ret) { + dev_err(dev, "failed to initialize iommu\n"); + return ERR_PTR(ret); + } +#endif + if (of_get_property(np, "samsung,fimd-vidout-rgb", NULL)) pd->vidcon0 |= VIDCON0_VIDOUT_RGB | VIDCON0_PNRMODE_RGB; if (of_get_property(np, "samsung,fimd-vidout-tv", NULL)) -- 1.7.0.4
RE: [PATCH 5/7] drm/exynos: attach drm device with common drm mapping
-----Original Message-----
From: Prathyush K [mailto:prathyus...@samsung.com]
Sent: Wednesday, July 11, 2012 6:40 PM
To: dri-devel@lists.freedesktop.org
Cc: prathy...@chromium.org; m.szyprow...@samsung.com; inki@samsung.com; subash.ramasw...@linaro.org
Subject: [PATCH 5/7] drm/exynos: attach drm device with common drm mapping

This patch sets the common mapping created during drm init to the drm device's archdata. The dma_ops of the drm device are set to arm_iommu_ops. The common mapping is shared across all the drm devices, which ensures that any buffer allocated with drm is accessible by drm-fimd or drm-hdmi or both.

Signed-off-by: Prathyush K <prathyus...@samsung.com>
---
 drivers/gpu/drm/exynos/exynos_drm_drv.c | 9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
index c3ad87e..2e40ca8 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
@@ -276,6 +276,15 @@ static struct drm_driver exynos_drm_driver = {

 static int exynos_drm_platform_probe(struct platform_device *pdev)
 {
+#ifdef CONFIG_EXYNOS_IOMMU
+	struct device *dev = &pdev->dev;
+
+	kref_get(&exynos_drm_common_mapping->kref);
+	dev->archdata.mapping = exynos_drm_common_mapping;

Ok, exynos_drm_common_mapping is shared with drivers using dev->archdata.mapping.

+	set_dma_ops(dev, &arm_iommu_ops);
+
+	DRM_INFO("drm common mapping set to drm device.\n");
+#endif
 	DRM_DEBUG_DRIVER("%s\n", __FILE__);

 	exynos_drm_driver.num_ioctls = DRM_ARRAY_SIZE(exynos_ioctls);
-- 
1.7.0.4
RE: [PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function
-----Original Message-----
From: Prathyush K [mailto:prathyus...@samsung.com]
Sent: Wednesday, July 11, 2012 6:40 PM
To: dri-devel@lists.freedesktop.org
Cc: prathy...@chromium.org; m.szyprow...@samsung.com; inki@samsung.com; subash.ramasw...@linaro.org
Subject: [PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function

This patch adds an exynos drm specific implementation of fb_mmap which supports mapping a non-contiguous buffer to user space. This new function does not assume that the frame buffer is contiguous and calls dma_mmap_writecombine for mapping the buffer to user space. dma_mmap_writecombine is able to map both a contiguous buffer and a non-contiguous buffer, depending on whether an IOMMU mapping was created for drm or not.

Signed-off-by: Prathyush K <prathyus...@samsung.com>
---
 drivers/gpu/drm/exynos/exynos_drm_fbdev.c | 16
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
index d5586cc..b53e638 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
@@ -46,8 +46,24 @@ struct exynos_drm_fbdev {
 	struct exynos_drm_gem_obj	*exynos_gem_obj;
 };

+static int exynos_drm_fb_mmap(struct fb_info *info,
+			struct vm_area_struct *vma)
+{
+	if ((vma->vm_end - vma->vm_start) > info->fix.smem_len)
+		return -EINVAL;
+
+	vma->vm_pgoff = 0;
+	vma->vm_flags |= VM_IO | VM_RESERVED;
+	if (dma_mmap_writecombine(info->device, vma, info->screen_base,
+			info->fix.smem_start, vma->vm_end - vma->vm_start))
+		return -EAGAIN;
+
+	return 0;
+}
+

Ok, it's a good feature. Actually, the physically non-contiguous gem buffer allocated for the console framebuffer has to be mapped to user space. Thanks.

 static struct fb_ops exynos_drm_fb_ops = {
 	.owner		= THIS_MODULE,
+	.fb_mmap	= exynos_drm_fb_mmap,
 	.fb_fillrect	= cfb_fillrect,
 	.fb_copyarea	= cfb_copyarea,
 	.fb_imageblit	= cfb_imageblit,
-- 
1.7.0.4
RE: [PATCH 7/7] drm/exynos: Add IOMMU support for mapping gem object
-Original Message- From: Prathyush K [mailto:prathyus...@samsung.com] Sent: Wednesday, July 11, 2012 6:40 PM To: dri-devel@lists.freedesktop.org Cc: prathy...@chromium.org; m.szyprow...@samsung.com; inki@samsung.com; subash.ramasw...@linaro.org Subject: [PATCH 7/7] drm/exynos: Add IOMMU support for mapping gem object A gem object is created using dma_alloc_writecombine. Currently, this buffer is assumed to be contiguous. If a IOMMU mapping is created for DRM, this buffer would be non-contig so the map functions are modified to call dma_mmap_writecombine. This works for both contig and non-contig buffers. Signed-off-by: Prathyush K prathyus...@samsung.com --- drivers/gpu/drm/exynos/exynos_drm_gem.c | 35 ++- --- 1 files changed, 16 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c index 5c8b683..59240f7 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c @@ -162,17 +162,22 @@ static int exynos_drm_gem_map_pages(struct drm_gem_object *obj, { struct exynos_drm_gem_obj *exynos_gem_obj = to_exynos_gem_obj(obj); struct exynos_drm_gem_buf *buf = exynos_gem_obj-buffer; - unsigned long pfn; if (exynos_gem_obj-flags EXYNOS_BO_NONCONTIG) { + unsigned long pfn; if (!buf-pages) return -EINTR; pfn = page_to_pfn(buf-pages[page_offset++]); - } else - pfn = (buf-dma_addr PAGE_SHIFT) + page_offset; - - return vm_insert_mixed(vma, f_vaddr, pfn); + return vm_insert_mixed(vma, f_vaddr, pfn); + } else { It's not good. EXYNOS_BO_NONCONTIG means physically non-contiguous otherwise physically contiguous memory but with your patch, in case of using iommu, memory type of the gem object may have no any meaning. in this case, the memory type is EXYNOS_BO_CONTIG and has physically non-contiguous memory. 
+ int ret; + ret = dma_mmap_writecombine(obj-dev-dev, vma, buf-kvaddr, + buf-dma_addr, buf-size); + if (ret) + DRM_ERROR(dma_mmap_writecombine failed\n); + return ret; + } } static int exynos_drm_gem_get_pages(struct drm_gem_object *obj) @@ -503,7 +508,7 @@ static int exynos_drm_gem_mmap_buffer(struct file *filp, struct drm_gem_object *obj = filp-private_data; struct exynos_drm_gem_obj *exynos_gem_obj = to_exynos_gem_obj(obj); struct exynos_drm_gem_buf *buffer; - unsigned long pfn, vm_size, usize, uaddr = vma-vm_start; + unsigned long vm_size, usize, uaddr = vma-vm_start; int ret; DRM_DEBUG_KMS(%s\n, __FILE__); @@ -543,19 +548,11 @@ static int exynos_drm_gem_mmap_buffer(struct file *filp, usize -= PAGE_SIZE; } while (usize 0); } else { - /* - * get page frame number to physical memory to be mapped - * to user space. - */ - pfn = ((unsigned long)exynos_gem_obj-buffer-dma_addr) - PAGE_SHIFT; - - DRM_DEBUG_KMS(pfn = 0x%lx\n, pfn); - - if (remap_pfn_range(vma, vma-vm_start, pfn, vm_size, - vma-vm_page_prot)) { - DRM_ERROR(failed to remap pfn range.\n); - return -EAGAIN; + ret = dma_mmap_writecombine(obj-dev-dev, vma, buffer- kvaddr, What if we don't use iommu and memory type of this buffer is non-contiguous? + buffer-dma_addr, buffer-size); + if (ret) { + DRM_ERROR(dma_mmap_writecombine failed\n); + return ret; } } -- 1.7.0.4 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/radeon: fix bo creation retry path
On 13.07.2012 00:23, j.gli...@gmail.com wrote:
From: Jerome Glisse <jgli...@redhat.com>

Retry label was at wrong place in function leading to memory leak.

Cc: sta...@vger.kernel.org
Signed-off-by: Jerome Glisse <jgli...@redhat.com>

Reviewed-by: Christian König <christian.koe...@amd.com>

---
 drivers/gpu/drm/radeon/radeon_object.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 6ecb200..f71e472 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -138,7 +138,6 @@ int radeon_bo_create(struct radeon_device *rdev,

 	acc_size = ttm_bo_dma_acc_size(rdev->mman.bdev, size,
 				       sizeof(struct radeon_bo));
-retry:
 	bo = kzalloc(sizeof(struct radeon_bo), GFP_KERNEL);
 	if (bo == NULL)
 		return -ENOMEM;
@@ -152,6 +151,8 @@ retry:
 	bo->surface_reg = -1;
 	INIT_LIST_HEAD(&bo->list);
 	INIT_LIST_HEAD(&bo->va);
+
+retry:
 	radeon_ttm_placement_from_domain(bo, domain);
 	/* Kernel allocation are uninterruptible */
 	down_read(&rdev->pm.mclk_lock);
Re: [PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3
On 12.07.2012 18:36, Alex Deucher wrote: On Thu, Jul 12, 2012 at 12:12 PM, Christian König deathsim...@vodafone.de wrote: Before emitting any indirect buffer, emit the offset of the next valid ring content if any. This allow code that want to resume ring to resume ring right after ib that caused GPU lockup. v2: use scratch registers instead of storing it into memory v3: skip over the surface sync for ni and si as well Signed-off-by: Jerome Glisse jgli...@redhat.com Signed-off-by: Christian König deathsim...@vodafone.de --- drivers/gpu/drm/radeon/evergreen.c |8 +++- drivers/gpu/drm/radeon/ni.c | 11 ++- drivers/gpu/drm/radeon/r600.c| 18 -- drivers/gpu/drm/radeon/radeon.h |1 + drivers/gpu/drm/radeon/radeon_ring.c |4 drivers/gpu/drm/radeon/rv770.c |4 +++- drivers/gpu/drm/radeon/si.c | 22 +++--- 7 files changed, 60 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index f39b900..40de347 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct radeon_device *rdev, struct radeon_ib *ib) /* set to DX10/11 mode */ radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0)); radeon_ring_write(ring, 1); - /* FIXME: implement */ + + if (ring-rptr_save_reg) { + uint32_t next_rptr = ring-wptr + 2 + 4; + radeon_ring_write(ring, PACKET0(ring-rptr_save_reg, 0)); + radeon_ring_write(ring, next_rptr); + } On r600 and newer please use SET_CONFIG_REG rather than Packet0. Why? Please note that it's on purpose that this doesn't interfere with the top/bottom of pipe handling and the draw commands, e.g. the register write isn't associated with drawing but instead just marks the beginning of parsing the IB. Christian. 
Alex + radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2)); radeon_ring_write(ring, #ifdef __BIG_ENDIAN diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c index f2afefb..5b7ce2c 100644 --- a/drivers/gpu/drm/radeon/ni.c +++ b/drivers/gpu/drm/radeon/ni.c @@ -855,6 +855,13 @@ void cayman_ring_ib_execute(struct radeon_device *rdev, struct radeon_ib *ib) /* set to DX10/11 mode */ radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0)); radeon_ring_write(ring, 1); + + if (ring-rptr_save_reg) { + uint32_t next_rptr = ring-wptr + 2 + 4 + 8; + radeon_ring_write(ring, PACKET0(ring-rptr_save_reg, 0)); + radeon_ring_write(ring, next_rptr); + } + radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2)); radeon_ring_write(ring, #ifdef __BIG_ENDIAN @@ -981,8 +988,10 @@ static int cayman_cp_start(struct radeon_device *rdev) static void cayman_cp_fini(struct radeon_device *rdev) { + struct radeon_ring *ring = rdev-ring[RADEON_RING_TYPE_GFX_INDEX]; cayman_cp_enable(rdev, false); - radeon_ring_fini(rdev, rdev-ring[RADEON_RING_TYPE_GFX_INDEX]); + radeon_ring_fini(rdev, ring); + radeon_scratch_free(rdev, ring-rptr_save_reg); } int cayman_cp_resume(struct radeon_device *rdev) diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index c808fa9..74fca15 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2155,18 +2155,27 @@ int r600_cp_resume(struct radeon_device *rdev) void r600_ring_init(struct radeon_device *rdev, struct radeon_ring *ring, unsigned ring_size) { u32 rb_bufsz; + int r; /* Align ring size */ rb_bufsz = drm_order(ring_size / 8); ring_size = (1 (rb_bufsz + 1)) * 4; ring-ring_size = ring_size; ring-align_mask = 16 - 1; + + r = radeon_scratch_get(rdev, ring-rptr_save_reg); + if (r) { + DRM_ERROR(failed to get scratch reg for rptr save (%d).\n, r); + ring-rptr_save_reg = 0; + } } void r600_cp_fini(struct radeon_device *rdev) { + struct radeon_ring *ring = 
rdev-ring[RADEON_RING_TYPE_GFX_INDEX]; r600_cp_stop(rdev); - radeon_ring_fini(rdev, rdev-ring[RADEON_RING_TYPE_GFX_INDEX]); + radeon_ring_fini(rdev, ring); + radeon_scratch_free(rdev, ring-rptr_save_reg); } @@ -2568,7 +2577,12 @@ void r600_ring_ib_execute(struct radeon_device *rdev, struct radeon_ib *ib) { struct radeon_ring *ring = rdev-ring[ib-ring]; - /* FIXME: implement */ + if (ring-rptr_save_reg) { + uint32_t next_rptr = ring-wptr + 2 + 4; + radeon_ring_write(ring, PACKET0(ring-rptr_save_reg, 0)); + radeon_ring_write(ring, next_rptr); + } + radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
Re: [PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM
On 07/13/2012 12:09 PM, Inki Dae wrote: -----Original Message----- From: Prathyush K [mailto:prathyus...@samsung.com] Sent: Wednesday, July 11, 2012 6:40 PM To: dri-devel@lists.freedesktop.org Cc: prathy...@chromium.org; m.szyprow...@samsung.com; inki@samsung.com; subash.ramasw...@linaro.org Subject: [PATCH 0/7] [RFC] drm/exynos: Add IOMMU support to DRM

The dma-mapping framework needs an IOMMU mapping to be created for the device which allocates/maps/frees the non-contig buffer. In the DRM framework, a gem buffer is created by the DRM virtual device and not directly by any of the physical devices (FIMD, HDMI etc). Each gem object can be set as a framebuffer for one or many of the drm devices. So a gem object cannot be allocated for any one device; all the DRM devices should be able to access this buffer.

It's good to use a unified iommu table, so I agree with your opinion, but we haven't decided whether to use the dma-mapping api or not. The dma-mapping api currently has one issue: when using an iommu through the dma-mapping api, we can't use a physically contiguous memory region with the iommu. But there are cases where we need a physically contiguous memory region together with the iommu, because we may sometimes use the mfc (hw video codec) with a secure zone such as ARM TrustZone, and that requires a physically contiguous memory region. Thanks, Inki Dae

I agree. In the mainline code, as of now only arm_dma_ops supports allocating from CMA. But in the function arm_iommu_alloc_attrs(), there is no way to know whether the device has declared a contiguous memory range, because we don't store that cookie in the device during dma_declare_contiguous(). So is it advisable to store such information, like the mapping (in the iommu operations), in device.archdata? Regards, Subash

The proposed method is to create a common IOMMU mapping during drm init. This mapping is then attached to all of the drm devices including the drm device.
[PATCH 1/7] drm/exynos: create common IOMMU mapping for DRM During the probe of drm fimd, the driver retrieves a 'sysmmu' field in the device node for fimd. If such a field exists, the driver retrieves the platform device of the sysmmu device. This sysmmu is set as the sysmmu for fimd. The common mapping created is then attached to fimd. This needs to be done for all the other devices (hdmi, vidi etc). [PATCH 2/7] ARM: EXYNOS5: add sysmmu field to fimd device node [PATCH 3/7] drm/exynos: add IOMMU support to drm fimd During DRM's probe which happens last, the common mapping is set to its archdata and iommu ops are set as its dma ops. This requires a modification in the dma-mapping framework so that the iommu ops can be visible to all drivers. [PATCH 4/7] ARM: dma-mapping: rename and export iommu_ops [PATCH 5/7] drm/exynos: attach drm device with common drm mapping Currently allocation and free use the iommu framework by calling dma_alloc_writecombine and dma_free_writecombine respectively. For mapping the buffers to user space, the mmap functions assume that the buffer is contiguous. This is modified by calling dma_mmap_writecombine. [PATCH 6/7] drm/exynos: Add exynos drm specific fb_mmap function [PATCH 7/7] Add IOMMU support for mapping gem object The device tree based patches are based on Leela's patch which was posted last week for adding DT support to DRM FIMD. The patch to add sysmmu field is for reference only and will be posted to the device tree mailing list. Same with the rename and export iommu_ops patch. These patches are tested on Exynos5250 SMDK board and tested with modetest from libdrm tests. 
Prathyush K (7): drm/exynos: create common IOMMU mapping for DRM ARM: EXYNOS5: add sysmmu field to fimd device node drm/exynos: add IOMMU support to drm fimd ARM: dma-mapping: rename and export iommu_ops drm/exynos: attach drm device with common drm mapping drm/exynos: Add exynos drm specific fb_mmap function drm/exynos: Add IOMMU support for mapping gem object arch/arm/boot/dts/exynos5250.dtsi |1 + arch/arm/include/asm/dma-mapping.h|1 + arch/arm/mm/dma-mapping.c |5 ++- drivers/gpu/drm/exynos/exynos_drm_core.c |3 ++ drivers/gpu/drm/exynos/exynos_drm_drv.c | 30 drivers/gpu/drm/exynos/exynos_drm_drv.h | 10 + drivers/gpu/drm/exynos/exynos_drm_fbdev.c | 16 drivers/gpu/drm/exynos/exynos_drm_fimd.c | 54 - drivers/gpu/drm/exynos/exynos_drm_gem.c | 35 -- 9 files changed, 133 insertions(+), 22 deletions(-) ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3
On Fri, Jul 13, 2012 at 5:09 AM, Christian König deathsim...@vodafone.de wrote: On 12.07.2012 18:36, Alex Deucher wrote: On Thu, Jul 12, 2012 at 12:12 PM, Christian König deathsim...@vodafone.de wrote: Before emitting any indirect buffer, emit the offset of the next valid ring content if any. This allow code that want to resume ring to resume ring right after ib that caused GPU lockup. v2: use scratch registers instead of storing it into memory v3: skip over the surface sync for ni and si as well Signed-off-by: Jerome Glisse jgli...@redhat.com Signed-off-by: Christian König deathsim...@vodafone.de --- drivers/gpu/drm/radeon/evergreen.c |8 +++- drivers/gpu/drm/radeon/ni.c | 11 ++- drivers/gpu/drm/radeon/r600.c| 18 -- drivers/gpu/drm/radeon/radeon.h |1 + drivers/gpu/drm/radeon/radeon_ring.c |4 drivers/gpu/drm/radeon/rv770.c |4 +++- drivers/gpu/drm/radeon/si.c | 22 +++--- 7 files changed, 60 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index f39b900..40de347 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct radeon_device *rdev, struct radeon_ib *ib) /* set to DX10/11 mode */ radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0)); radeon_ring_write(ring, 1); - /* FIXME: implement */ + + if (ring-rptr_save_reg) { + uint32_t next_rptr = ring-wptr + 2 + 4; + radeon_ring_write(ring, PACKET0(ring-rptr_save_reg, 0)); + radeon_ring_write(ring, next_rptr); + } On r600 and newer please use SET_CONFIG_REG rather than Packet0. Why? Please note that it's on purpose that this doesn't interfere with the top/bottom of pipe handling and the draw commands, e.g. the register write isn't associated with drawing but instead just marks the beginning of parsing the IB. Packet0's are have been semi-deprecated since r600. They still work, but the CP guys recommend using the appropriate packet3 whenever possible. 
Alex
Re: [PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3
On 13.07.2012 14:27, Alex Deucher wrote: On Fri, Jul 13, 2012 at 5:09 AM, Christian König deathsim...@vodafone.de wrote: On 12.07.2012 18:36, Alex Deucher wrote: On Thu, Jul 12, 2012 at 12:12 PM, Christian König deathsim...@vodafone.de wrote: Before emitting any indirect buffer, emit the offset of the next valid ring content if any. This allow code that want to resume ring to resume ring right after ib that caused GPU lockup. v2: use scratch registers instead of storing it into memory v3: skip over the surface sync for ni and si as well Signed-off-by: Jerome Glisse jgli...@redhat.com Signed-off-by: Christian König deathsim...@vodafone.de --- drivers/gpu/drm/radeon/evergreen.c |8 +++- drivers/gpu/drm/radeon/ni.c | 11 ++- drivers/gpu/drm/radeon/r600.c| 18 -- drivers/gpu/drm/radeon/radeon.h |1 + drivers/gpu/drm/radeon/radeon_ring.c |4 drivers/gpu/drm/radeon/rv770.c |4 +++- drivers/gpu/drm/radeon/si.c | 22 +++--- 7 files changed, 60 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index f39b900..40de347 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct radeon_device *rdev, struct radeon_ib *ib) /* set to DX10/11 mode */ radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0)); radeon_ring_write(ring, 1); - /* FIXME: implement */ + + if (ring-rptr_save_reg) { + uint32_t next_rptr = ring-wptr + 2 + 4; + radeon_ring_write(ring, PACKET0(ring-rptr_save_reg, 0)); + radeon_ring_write(ring, next_rptr); + } On r600 and newer please use SET_CONFIG_REG rather than Packet0. Why? Please note that it's on purpose that this doesn't interfere with the top/bottom of pipe handling and the draw commands, e.g. the register write isn't associated with drawing but instead just marks the beginning of parsing the IB. Packet0's are have been semi-deprecated since r600. 
They still work, but the CP guys recommend using the appropriate packet3 whenever possible. Ok, that makes sense. Any further comments on the patchset, or can I send that to Dave for merging now? Cheers, Christian.
Re: [PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v3
On Fri, Jul 13, 2012 at 9:46 AM, Christian König deathsim...@vodafone.de wrote: On 13.07.2012 14:27, Alex Deucher wrote: On Fri, Jul 13, 2012 at 5:09 AM, Christian König deathsim...@vodafone.de wrote: On 12.07.2012 18:36, Alex Deucher wrote: On Thu, Jul 12, 2012 at 12:12 PM, Christian König deathsim...@vodafone.de wrote: Before emitting any indirect buffer, emit the offset of the next valid ring content if any. This allow code that want to resume ring to resume ring right after ib that caused GPU lockup. v2: use scratch registers instead of storing it into memory v3: skip over the surface sync for ni and si as well Signed-off-by: Jerome Glisse jgli...@redhat.com Signed-off-by: Christian König deathsim...@vodafone.de --- drivers/gpu/drm/radeon/evergreen.c |8 +++- drivers/gpu/drm/radeon/ni.c | 11 ++- drivers/gpu/drm/radeon/r600.c| 18 -- drivers/gpu/drm/radeon/radeon.h |1 + drivers/gpu/drm/radeon/radeon_ring.c |4 drivers/gpu/drm/radeon/rv770.c |4 +++- drivers/gpu/drm/radeon/si.c | 22 +++--- 7 files changed, 60 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index f39b900..40de347 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct radeon_device *rdev, struct radeon_ib *ib) /* set to DX10/11 mode */ radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0)); radeon_ring_write(ring, 1); - /* FIXME: implement */ + + if (ring-rptr_save_reg) { + uint32_t next_rptr = ring-wptr + 2 + 4; + radeon_ring_write(ring, PACKET0(ring-rptr_save_reg, 0)); + radeon_ring_write(ring, next_rptr); + } On r600 and newer please use SET_CONFIG_REG rather than Packet0. Why? Please note that it's on purpose that this doesn't interfere with the top/bottom of pipe handling and the draw commands, e.g. the register write isn't associated with drawing but instead just marks the beginning of parsing the IB. 
Packet0s have been semi-deprecated since r600. They still work, but the CP guys recommend using the appropriate packet3 whenever possible. Ok, that makes sense. Any further comments on the patchset, or can I send that to Dave for merging now? Other than that, it looks good to me. For the series: Reviewed-by: Alex Deucher <alexander.deuc...@amd.com>
[PATCH 1/3] drm/radeon: return an error if there is nothing to wait for
Otherwise the sa manager's out-of-memory handling doesn't work.

Signed-off-by: Christian König <deathsim...@vodafone.de>
---
 drivers/gpu/drm/radeon/radeon_fence.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c
index 76c5b22..7a181c3 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -331,7 +331,7 @@ static int radeon_fence_wait_any_seq(struct radeon_device *rdev,

 	/* nothing to wait for ? */
 	if (ring == RADEON_NUM_RINGS) {
-		return 0;
+		return -ENOENT;
 	}

 	while (!radeon_fence_any_seq_signaled(rdev, target_seq)) {
-- 
1.7.9.5
[PATCH 2/3] drm/radeon: let sa manager block for fences to wait for
Otherwise we can encounter out of memory situations under extreme load. Signed-off-by: Christian König deathsim...@vodafone.de --- drivers/gpu/drm/radeon/radeon.h|2 +- drivers/gpu/drm/radeon/radeon_sa.c | 72 +--- 2 files changed, 51 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 6715e4c..2cb355b 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -362,7 +362,7 @@ struct radeon_bo_list { * alignment). */ struct radeon_sa_manager { - spinlock_t lock; + wait_queue_head_t wq; struct radeon_bo*bo; struct list_head*hole; struct list_headflist[RADEON_NUM_RINGS]; diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c index 81dbb5b..b535fc4 100644 --- a/drivers/gpu/drm/radeon/radeon_sa.c +++ b/drivers/gpu/drm/radeon/radeon_sa.c @@ -54,7 +54,7 @@ int radeon_sa_bo_manager_init(struct radeon_device *rdev, { int i, r; - spin_lock_init(sa_manager-lock); + init_waitqueue_head(sa_manager-wq); sa_manager-bo = NULL; sa_manager-size = size; sa_manager-domain = domain; @@ -211,6 +211,29 @@ static bool radeon_sa_bo_try_alloc(struct radeon_sa_manager *sa_manager, return false; } +static bool radeon_sa_event(struct radeon_sa_manager *sa_manager, + unsigned size, unsigned align) +{ + unsigned soffset, eoffset, wasted; + int i; + + for (i = 0; i RADEON_NUM_RINGS; ++i) { + if (!list_empty(sa_manager-flist[i])) { + return true; + } + } + + soffset = radeon_sa_bo_hole_soffset(sa_manager); + eoffset = radeon_sa_bo_hole_eoffset(sa_manager); + wasted = (align - (soffset % align)) % align; + + if ((eoffset - soffset) = (size + wasted)) { + return true; + } + + return false; +} + static bool radeon_sa_bo_next_hole(struct radeon_sa_manager *sa_manager, struct radeon_fence **fences, unsigned *tries) @@ -297,8 +320,8 @@ int radeon_sa_bo_new(struct radeon_device *rdev, INIT_LIST_HEAD((*sa_bo)-olist); INIT_LIST_HEAD((*sa_bo)-flist); - spin_lock(sa_manager-lock); - do { + 
spin_lock(sa_manager-wq.lock); + while(1) { for (i = 0; i RADEON_NUM_RINGS; ++i) { fences[i] = NULL; tries[i] = 0; @@ -309,30 +332,34 @@ int radeon_sa_bo_new(struct radeon_device *rdev, if (radeon_sa_bo_try_alloc(sa_manager, *sa_bo, size, align)) { - spin_unlock(sa_manager-lock); + spin_unlock(sa_manager-wq.lock); return 0; } /* see if we can skip over some allocations */ } while (radeon_sa_bo_next_hole(sa_manager, fences, tries)); - if (block) { - spin_unlock(sa_manager-lock); - r = radeon_fence_wait_any(rdev, fences, false); - spin_lock(sa_manager-lock); - if (r) { - /* if we have nothing to wait for we - are practically out of memory */ - if (r == -ENOENT) { - r = -ENOMEM; - } - goto out_err; - } + if (!block) { + break; + } + + spin_unlock(sa_manager-wq.lock); + r = radeon_fence_wait_any(rdev, fences, false); + spin_lock(sa_manager-wq.lock); + /* if we have nothing to wait for block */ + if (r == -ENOENT) { + r = wait_event_interruptible_locked( + sa_manager-wq, + radeon_sa_event(sa_manager, size, align) + ); + } + if (r) { + goto out_err; } - } while (block); + }; out_err: - spin_unlock(sa_manager-lock); + spin_unlock(sa_manager-wq.lock); kfree(*sa_bo); *sa_bo = NULL; return r; @@ -348,7 +375,7 @@ void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo **sa_bo, } sa_manager = (*sa_bo)-manager; - spin_lock(sa_manager-lock); + spin_lock(sa_manager-wq.lock); if (fence !radeon_fence_signaled(fence)) { (*sa_bo)-fence = radeon_fence_ref(fence); list_add_tail((*sa_bo)-flist,
[PATCH 3/3] drm/radeon: fix const IB handling
Const IBs are executed on the CE not the CP, so we can't fence them in the normal way. So submit them directly before the IB instead, just as the documentation says. Signed-off-by: Christian König deathsim...@vodafone.de --- drivers/gpu/drm/radeon/r100.c|2 +- drivers/gpu/drm/radeon/r600.c|2 +- drivers/gpu/drm/radeon/radeon.h |3 ++- drivers/gpu/drm/radeon/radeon_cs.c | 25 +++-- drivers/gpu/drm/radeon/radeon_ring.c | 10 +- 5 files changed, 24 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c index e0f5ae8..4ee5a74 100644 --- a/drivers/gpu/drm/radeon/r100.c +++ b/drivers/gpu/drm/radeon/r100.c @@ -3693,7 +3693,7 @@ int r100_ib_test(struct radeon_device *rdev, struct radeon_ring *ring) ib.ptr[6] = PACKET2(0); ib.ptr[7] = PACKET2(0); ib.length_dw = 8; - r = radeon_ib_schedule(rdev, ib); + r = radeon_ib_schedule(rdev, ib, NULL); if (r) { radeon_scratch_free(rdev, scratch); radeon_ib_free(rdev, ib); diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 3156d25..c2e5069 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2619,7 +2619,7 @@ int r600_ib_test(struct radeon_device *rdev, struct radeon_ring *ring) ib.ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) 2); ib.ptr[2] = 0xDEADBEEF; ib.length_dw = 3; - r = radeon_ib_schedule(rdev, ib); + r = radeon_ib_schedule(rdev, ib, NULL); if (r) { radeon_scratch_free(rdev, scratch); radeon_ib_free(rdev, ib); diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 2cb355b..2d7f06c 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -751,7 +751,8 @@ struct si_rlc { int radeon_ib_get(struct radeon_device *rdev, int ring, struct radeon_ib *ib, unsigned size); void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib); -int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib); +int radeon_ib_schedule(struct radeon_device *rdev, 
struct radeon_ib *ib, + struct radeon_ib *const_ib); int radeon_ib_pool_init(struct radeon_device *rdev); void radeon_ib_pool_fini(struct radeon_device *rdev); int radeon_ib_ring_tests(struct radeon_device *rdev); diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c index 553da67..d0be5d5 100644 --- a/drivers/gpu/drm/radeon/radeon_cs.c +++ b/drivers/gpu/drm/radeon/radeon_cs.c @@ -354,7 +354,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev, } radeon_cs_sync_rings(parser); parser-ib.vm_id = 0; - r = radeon_ib_schedule(rdev, parser-ib); + r = radeon_ib_schedule(rdev, parser-ib, NULL); if (r) { DRM_ERROR(Failed to schedule IB !\n); } @@ -452,25 +452,22 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev, } radeon_cs_sync_rings(parser); + parser-ib.vm_id = vm-id; + /* ib pool is bind at 0 in virtual address space, +* so gpu_addr is the offset inside the pool bo +*/ + parser-ib.gpu_addr = parser-ib.sa_bo-soffset; + if ((rdev-family = CHIP_TAHITI) (parser-chunk_const_ib_idx != -1)) { parser-const_ib.vm_id = vm-id; - /* ib pool is bind at 0 in virtual address space to gpu_addr is the -* offset inside the pool bo -*/ + /* same reason as above */ parser-const_ib.gpu_addr = parser-const_ib.sa_bo-soffset; - r = radeon_ib_schedule(rdev, parser-const_ib); - if (r) - goto out; + r = radeon_ib_schedule(rdev, parser-ib, parser-const_ib); + } else { + r = radeon_ib_schedule(rdev, parser-ib, NULL); } - parser-ib.vm_id = vm-id; - /* ib pool is bind at 0 in virtual address space to gpu_addr is the -* offset inside the pool bo -*/ - parser-ib.gpu_addr = parser-ib.sa_bo-soffset; - parser-ib.is_const_ib = false; - r = radeon_ib_schedule(rdev, parser-ib); out: if (!r) { if (vm-fence) { diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index 75cbe46..c48c354 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -74,7 +74,8 @@ void radeon_ib_free(struct 
radeon_device *rdev, struct radeon_ib *ib) radeon_fence_unref(ib-fence); } -int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib) +int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib, + struct radeon_ib *const_ib) { struct radeon_ring *ring =
Re: [PATCH 2/3] drm/radeon: let sa manager block for fences to wait for
On Fri, Jul 13, 2012 at 04:08:14PM +0200, Christian König wrote: Otherwise we can encounter out of memory situations under extreme load. Signed-off-by: Christian König deathsim...@vodafone.de --- drivers/gpu/drm/radeon/radeon.h|2 +- drivers/gpu/drm/radeon/radeon_sa.c | 72 +--- 2 files changed, 51 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 6715e4c..2cb355b 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -362,7 +362,7 @@ struct radeon_bo_list { * alignment). */ struct radeon_sa_manager { - spinlock_t lock; + wait_queue_head_t wq; struct radeon_bo*bo; struct list_head*hole; struct list_headflist[RADEON_NUM_RINGS]; diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c index 81dbb5b..b535fc4 100644 --- a/drivers/gpu/drm/radeon/radeon_sa.c +++ b/drivers/gpu/drm/radeon/radeon_sa.c @@ -54,7 +54,7 @@ int radeon_sa_bo_manager_init(struct radeon_device *rdev, { int i, r; - spin_lock_init(sa_manager-lock); + init_waitqueue_head(sa_manager-wq); sa_manager-bo = NULL; sa_manager-size = size; sa_manager-domain = domain; @@ -211,6 +211,29 @@ static bool radeon_sa_bo_try_alloc(struct radeon_sa_manager *sa_manager, return false; } +static bool radeon_sa_event(struct radeon_sa_manager *sa_manager, + unsigned size, unsigned align) +{ + unsigned soffset, eoffset, wasted; + int i; + + for (i = 0; i RADEON_NUM_RINGS; ++i) { + if (!list_empty(sa_manager-flist[i])) { + return true; + } + } + + soffset = radeon_sa_bo_hole_soffset(sa_manager); + eoffset = radeon_sa_bo_hole_eoffset(sa_manager); + wasted = (align - (soffset % align)) % align; + + if ((eoffset - soffset) = (size + wasted)) { + return true; + } + + return false; +} + This new function should come with a comment, per the new documentation rules. 
static bool radeon_sa_bo_next_hole(struct radeon_sa_manager *sa_manager, struct radeon_fence **fences, unsigned *tries) @@ -297,8 +320,8 @@ int radeon_sa_bo_new(struct radeon_device *rdev, INIT_LIST_HEAD((*sa_bo)-olist); INIT_LIST_HEAD((*sa_bo)-flist); - spin_lock(sa_manager-lock); - do { + spin_lock(sa_manager-wq.lock); + while(1) { for (i = 0; i RADEON_NUM_RINGS; ++i) { fences[i] = NULL; tries[i] = 0; @@ -309,30 +332,34 @@ int radeon_sa_bo_new(struct radeon_device *rdev, if (radeon_sa_bo_try_alloc(sa_manager, *sa_bo, size, align)) { - spin_unlock(sa_manager-lock); + spin_unlock(sa_manager-wq.lock); return 0; } /* see if we can skip over some allocations */ } while (radeon_sa_bo_next_hole(sa_manager, fences, tries)); - if (block) { - spin_unlock(sa_manager-lock); - r = radeon_fence_wait_any(rdev, fences, false); - spin_lock(sa_manager-lock); - if (r) { - /* if we have nothing to wait for we -are practically out of memory */ - if (r == -ENOENT) { - r = -ENOMEM; - } - goto out_err; - } + if (!block) { + break; + } + + spin_unlock(sa_manager-wq.lock); + r = radeon_fence_wait_any(rdev, fences, false); + spin_lock(sa_manager-wq.lock); + /* if we have nothing to wait for block */ + if (r == -ENOENT) { + r = wait_event_interruptible_locked( + sa_manager-wq, + radeon_sa_event(sa_manager, size, align) + ); + } + if (r) { + goto out_err; } - } while (block); + }; out_err: - spin_unlock(sa_manager-lock); + spin_unlock(sa_manager-wq.lock); kfree(*sa_bo); *sa_bo = NULL; return r; @@ -348,7 +375,7 @@ void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo **sa_bo, } sa_manager = (*sa_bo)-manager; - spin_lock(sa_manager-lock); + spin_lock(sa_manager-wq.lock); if (fence
Re: [PATCH 3/3] drm/radeon: fix const IB handling
On Fri, Jul 13, 2012 at 04:08:15PM +0200, Christian König wrote: Const IBs are executed on the CE not the CP, so we can't fence them in the normal way. So submit them directly before the IB instead, just as the documentation says. Signed-off-by: Christian König deathsim...@vodafone.de --- drivers/gpu/drm/radeon/r100.c|2 +- drivers/gpu/drm/radeon/r600.c|2 +- drivers/gpu/drm/radeon/radeon.h |3 ++- drivers/gpu/drm/radeon/radeon_cs.c | 25 +++-- drivers/gpu/drm/radeon/radeon_ring.c | 10 +- 5 files changed, 24 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c index e0f5ae8..4ee5a74 100644 --- a/drivers/gpu/drm/radeon/r100.c +++ b/drivers/gpu/drm/radeon/r100.c @@ -3693,7 +3693,7 @@ int r100_ib_test(struct radeon_device *rdev, struct radeon_ring *ring) ib.ptr[6] = PACKET2(0); ib.ptr[7] = PACKET2(0); ib.length_dw = 8; - r = radeon_ib_schedule(rdev, ib); + r = radeon_ib_schedule(rdev, ib, NULL); if (r) { radeon_scratch_free(rdev, scratch); radeon_ib_free(rdev, ib); diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 3156d25..c2e5069 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2619,7 +2619,7 @@ int r600_ib_test(struct radeon_device *rdev, struct radeon_ring *ring) ib.ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) 2); ib.ptr[2] = 0xDEADBEEF; ib.length_dw = 3; - r = radeon_ib_schedule(rdev, ib); + r = radeon_ib_schedule(rdev, ib, NULL); if (r) { radeon_scratch_free(rdev, scratch); radeon_ib_free(rdev, ib); diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 2cb355b..2d7f06c 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -751,7 +751,8 @@ struct si_rlc { int radeon_ib_get(struct radeon_device *rdev, int ring, struct radeon_ib *ib, unsigned size); void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib); -int radeon_ib_schedule(struct radeon_device *rdev, struct 
radeon_ib *ib); +int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib, +struct radeon_ib *const_ib); int radeon_ib_pool_init(struct radeon_device *rdev); void radeon_ib_pool_fini(struct radeon_device *rdev); int radeon_ib_ring_tests(struct radeon_device *rdev); diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c index 553da67..d0be5d5 100644 --- a/drivers/gpu/drm/radeon/radeon_cs.c +++ b/drivers/gpu/drm/radeon/radeon_cs.c @@ -354,7 +354,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev, } radeon_cs_sync_rings(parser); parser-ib.vm_id = 0; - r = radeon_ib_schedule(rdev, parser-ib); + r = radeon_ib_schedule(rdev, parser-ib, NULL); if (r) { DRM_ERROR(Failed to schedule IB !\n); } @@ -452,25 +452,22 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev, } radeon_cs_sync_rings(parser); + parser-ib.vm_id = vm-id; + /* ib pool is bind at 0 in virtual address space, + * so gpu_addr is the offset inside the pool bo + */ + parser-ib.gpu_addr = parser-ib.sa_bo-soffset; + if ((rdev-family = CHIP_TAHITI) (parser-chunk_const_ib_idx != -1)) { parser-const_ib.vm_id = vm-id; - /* ib pool is bind at 0 in virtual address space to gpu_addr is the - * offset inside the pool bo - */ + /* same reason as above */ parser-const_ib.gpu_addr = parser-const_ib.sa_bo-soffset; - r = radeon_ib_schedule(rdev, parser-const_ib); - if (r) - goto out; + r = radeon_ib_schedule(rdev, parser-ib, parser-const_ib); + } else { + r = radeon_ib_schedule(rdev, parser-ib, NULL); } - parser-ib.vm_id = vm-id; - /* ib pool is bind at 0 in virtual address space to gpu_addr is the - * offset inside the pool bo - */ - parser-ib.gpu_addr = parser-ib.sa_bo-soffset; - parser-ib.is_const_ib = false; - r = radeon_ib_schedule(rdev, parser-ib); out: if (!r) { if (vm-fence) { diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index 75cbe46..c48c354 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ 
b/drivers/gpu/drm/radeon/radeon_ring.c @@ -74,7 +74,8 @@ void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib) radeon_fence_unref(ib-fence); } -int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib) +int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib, +
Re: [PATCH 3/3] drm/radeon: fix const IB handling
On Fri, Jul 13, 2012 at 10:08 AM, Christian König deathsim...@vodafone.de wrote: Const IBs are executed on the CE not the CP, so we can't fence them in the normal way. So submit them directly before the IB instead, just as the documentation says. Signed-off-by: Christian König deathsim...@vodafone.de --- drivers/gpu/drm/radeon/r100.c|2 +- drivers/gpu/drm/radeon/r600.c|2 +- drivers/gpu/drm/radeon/radeon.h |3 ++- drivers/gpu/drm/radeon/radeon_cs.c | 25 +++-- drivers/gpu/drm/radeon/radeon_ring.c | 10 +- 5 files changed, 24 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c index e0f5ae8..4ee5a74 100644 --- a/drivers/gpu/drm/radeon/r100.c +++ b/drivers/gpu/drm/radeon/r100.c @@ -3693,7 +3693,7 @@ int r100_ib_test(struct radeon_device *rdev, struct radeon_ring *ring) ib.ptr[6] = PACKET2(0); ib.ptr[7] = PACKET2(0); ib.length_dw = 8; - r = radeon_ib_schedule(rdev, ib); + r = radeon_ib_schedule(rdev, ib, NULL); if (r) { radeon_scratch_free(rdev, scratch); radeon_ib_free(rdev, ib); diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 3156d25..c2e5069 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2619,7 +2619,7 @@ int r600_ib_test(struct radeon_device *rdev, struct radeon_ring *ring) ib.ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) 2); ib.ptr[2] = 0xDEADBEEF; ib.length_dw = 3; - r = radeon_ib_schedule(rdev, ib); + r = radeon_ib_schedule(rdev, ib, NULL); if (r) { radeon_scratch_free(rdev, scratch); radeon_ib_free(rdev, ib); diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 2cb355b..2d7f06c 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -751,7 +751,8 @@ struct si_rlc { int radeon_ib_get(struct radeon_device *rdev, int ring, struct radeon_ib *ib, unsigned size); void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib); -int radeon_ib_schedule(struct radeon_device *rdev, 
struct radeon_ib *ib); +int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib, + struct radeon_ib *const_ib); int radeon_ib_pool_init(struct radeon_device *rdev); void radeon_ib_pool_fini(struct radeon_device *rdev); int radeon_ib_ring_tests(struct radeon_device *rdev); diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c index 553da67..d0be5d5 100644 --- a/drivers/gpu/drm/radeon/radeon_cs.c +++ b/drivers/gpu/drm/radeon/radeon_cs.c @@ -354,7 +354,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev, } radeon_cs_sync_rings(parser); parser-ib.vm_id = 0; - r = radeon_ib_schedule(rdev, parser-ib); + r = radeon_ib_schedule(rdev, parser-ib, NULL); if (r) { DRM_ERROR(Failed to schedule IB !\n); } @@ -452,25 +452,22 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev, } radeon_cs_sync_rings(parser); + parser-ib.vm_id = vm-id; + /* ib pool is bind at 0 in virtual address space, +* so gpu_addr is the offset inside the pool bo +*/ + parser-ib.gpu_addr = parser-ib.sa_bo-soffset; + if ((rdev-family = CHIP_TAHITI) (parser-chunk_const_ib_idx != -1)) { parser-const_ib.vm_id = vm-id; - /* ib pool is bind at 0 in virtual address space to gpu_addr is the -* offset inside the pool bo -*/ + /* same reason as above */ Don't remove comment, code might move and the above comment might not be the same better to duplicate comment then trying to cross reference comment across file. 
parser-const_ib.gpu_addr = parser-const_ib.sa_bo-soffset; - r = radeon_ib_schedule(rdev, parser-const_ib); - if (r) - goto out; + r = radeon_ib_schedule(rdev, parser-ib, parser-const_ib); + } else { + r = radeon_ib_schedule(rdev, parser-ib, NULL); } - parser-ib.vm_id = vm-id; - /* ib pool is bind at 0 in virtual address space to gpu_addr is the -* offset inside the pool bo -*/ - parser-ib.gpu_addr = parser-ib.sa_bo-soffset; - parser-ib.is_const_ib = false; - r = radeon_ib_schedule(rdev, parser-ib); out: if (!r) { if (vm-fence) { diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index 75cbe46..c48c354 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -74,7 +74,8 @@ void
[Bug 52054] New: gallium/opencl doesnt support includes for opencl kernels
https://bugs.freedesktop.org/show_bug.cgi?id=52054

Bug #: 52054
Summary: gallium/opencl doesnt support includes for opencl kernels
Classification: Unclassified
Product: Mesa
Version: git
Platform: x86-64 (AMD64)
OS/Version: All
Status: NEW
Severity: normal
Priority: medium
Component: Drivers/Gallium/r600
AssignedTo: dri-devel@lists.freedesktop.org
ReportedBy: ale...@gentoo.org

When running tests for OpenCL-enabled jtr (http://www.openwall.com/john/) I get the following error:

OpenCL platform 0: Default, 1 device(s).
Using device 0: AMD JUNIPER
1 error generated.
Compilation log: cl_input:17:10: fatal error: 'opencl_rar.h' file not found
#include "opencl_rar.h"
         ^
OpenCL error (CL_INVALID_PROGRAM_EXECUTABLE) in file (rar_fmt.c) at line (588) - (Error creating kernel. Double-check kernel name?)

xeon ~ # ./clInfo
Found 1 platform(s).
platform[(nil)]: profile: FULL_PROFILE
platform[(nil)]: version: OpenCL 1.1 MESA
platform[(nil)]: name: Default
platform[(nil)]: vendor: Mesa
platform[(nil)]: extensions:
platform[(nil)]:
Found 1 device(s).
device[0xc82360]: NAME: AMD JUNIPER
device[0xc82360]: VENDOR: X.Org
device[0xc82360]: PROFILE: FULL_PROFILE
device[0xc82360]: VERSION: OpenCL 1.1 MESA
device[0xc82360]: EXTENSIONS:
device[0xc82360]: DRIVER_VERSION:
device[0xc82360]: Type: GPU
device[0xc82360]: EXECUTION_CAPABILITIES: Kernel
device[0xc82360]: GLOBAL_MEM_CACHE_TYPE: None (0)
device[0xc82360]: CL_DEVICE_LOCAL_MEM_TYPE: Local (1)
device[0xc82360]: SINGLE_FP_CONFIG: 0x7
device[0xc82360]: QUEUE_PROPERTIES: 0x2
device[0xc82360]: VENDOR_ID: 4098
device[0xc82360]: MAX_COMPUTE_UNITS: 1
device[0xc82360]: MAX_WORK_ITEM_DIMENSIONS: 3
device[0xc82360]: MAX_WORK_GROUP_SIZE: 256
device[0xc82360]: PREFERRED_VECTOR_WIDTH_CHAR: 16
device[0xc82360]: PREFERRED_VECTOR_WIDTH_SHORT: 8
device[0xc82360]: PREFERRED_VECTOR_WIDTH_INT: 4
device[0xc82360]: PREFERRED_VECTOR_WIDTH_LONG: 2
device[0xc82360]: PREFERRED_VECTOR_WIDTH_FLOAT: 4
device[0xc82360]: PREFERRED_VECTOR_WIDTH_DOUBLE: 2
device[0xc82360]: MAX_CLOCK_FREQUENCY: 0
device[0xc82360]: ADDRESS_BITS: 32
device[0xc82360]: MAX_MEM_ALLOC_SIZE: 0
device[0xc82360]: IMAGE_SUPPORT: 1
device[0xc82360]: MAX_READ_IMAGE_ARGS: 32
device[0xc82360]: MAX_WRITE_IMAGE_ARGS: 32
device[0xc82360]: IMAGE2D_MAX_WIDTH: 32768
device[0xc82360]: IMAGE2D_MAX_HEIGHT: 32768
device[0xc82360]: IMAGE3D_MAX_WIDTH: 32768
device[0xc82360]: IMAGE3D_MAX_HEIGHT: 32768
device[0xc82360]: IMAGE3D_MAX_DEPTH: 32768
device[0xc82360]: MAX_SAMPLERS: 16
device[0xc82360]: MAX_PARAMETER_SIZE: 1024
device[0xc82360]: MEM_BASE_ADDR_ALIGN: 128
device[0xc82360]: MIN_DATA_TYPE_ALIGN_SIZE: 128
device[0xc82360]: GLOBAL_MEM_CACHELINE_SIZE: 0
device[0xc82360]: GLOBAL_MEM_CACHE_SIZE: 0
device[0xc82360]: GLOBAL_MEM_SIZE: 201326592
device[0xc82360]: MAX_CONSTANT_BUFFER_SIZE: 65536
device[0xc82360]: MAX_CONSTANT_ARGS: 1
device[0xc82360]: LOCAL_MEM_SIZE: 32768
device[0xc82360]: ERROR_CORRECTION_SUPPORT: 0
device[0xc82360]: PROFILING_TIMER_RESOLUTION: 0
device[0xc82360]: ENDIAN_LITTLE: 1
device[0xc82360]: AVAILABLE: 1
device[0xc82360]: COMPILER_AVAILABLE: 1

-- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: you are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel
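For context on the failure mode: OpenCL already lets the host application hand include paths to the kernel compiler through the build-options string, so once Clover plumbs those options into its Clang invocation, `#include "opencl_rar.h"` should resolve. A hedged sketch of the host side (the directory path is a placeholder, not jtr's real layout):

```c
/* Sketch: pass include directories to the OpenCL C compiler via the
 * clBuildProgram options string ("-I dir" per the OpenCL 1.1 spec).
 * Error handling trimmed; ctx, dev, and src are set up elsewhere. */
cl_int err;
cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, &err);

err = clBuildProgram(prog, 1, &dev, "-I /usr/share/john/kernels", NULL, NULL);
if (err != CL_SUCCESS) {
        size_t log_size;

        clGetProgramBuildInfo(prog, dev, CL_PROGRAM_BUILD_LOG,
                              0, NULL, &log_size);
        /* ... allocate log_size bytes and print the build log ... */
}
```

The report suggests Clover either ignores these options or never forwards them to its compiler, so the header lookup fails regardless of what the application passes.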
[RFC] dma-fence: dma-buf synchronization (v2)
From: Rob Clark r...@ti.com

A dma-fence can be attached to a buffer which is being filled or consumed by hw, to allow userspace to pass the buffer without waiting to another device. For example, userspace can call the page_flip ioctl to display the next frame of graphics after kicking the GPU, but while the GPU is still rendering. The display device sharing the buffer with the GPU would attach a callback to get notified when the GPU's rendering-complete IRQ fires, to update the scan-out address of the display without having to wake up userspace.

A dma-fence is a transient, one-shot deal. It is allocated and attached to a dma-buf's list of fences. When the one that attached it is done with the pending operation, it can signal the fence, removing it from the dma-buf's list of fences:

+ dma_buf_attach_fence()
+ dma_fence_signal()

Other drivers can access the current fence on the dma-buf (if any), which increments the fence's refcnt:

+ dma_buf_get_fence()
+ dma_fence_put()

The one pending on the fence can add an async callback (and optionally cancel it, for example to recover from GPU hangs):

+ dma_fence_add_callback()
+ dma_fence_cancel_callback()

Or wait synchronously (optionally with a timeout or from atomic context):

+ dma_fence_wait()

A default software-only implementation is provided, which can be used by drivers attaching a fence to a buffer when they have no other means for hw sync. But a memory-backed fence is also envisioned, because it is common that GPUs can write to, or poll on, some memory location for synchronization. For example:

	fence = dma_buf_get_fence(dmabuf);
	if (fence->ops == &mem_dma_fence_ops) {
		dma_buf *fence_buf;
		mem_dma_fence_get_buf(fence, &fence_buf, &offset);
		... tell the hw the memory location to wait on ...
	} else {
		/* fall back to sw sync */
		dma_fence_add_callback(fence, my_cb);
	}

The memory location is itself backed by dma-buf, to simplify mapping to the device's address space, an idea borrowed from Maarten Lankhorst.
NOTE: the memory location fence is not implemented yet; the above is just for explaining how it would work.

On SoC platforms, if some other hw mechanism is provided for synchronizing between IP blocks, it could be supported as an alternate implementation with its own fence ops in a similar way.

The other non-sw implementations would wrap the add/cancel_callback and wait fence ops, so that they can keep track of whether a device not supporting hw sync is waiting on the fence, and in that case should arrange to call dma_fence_signal() at some point after the condition has changed, to notify other devices waiting on the fence. If there are no sw waiters, this can be skipped to avoid waking the CPU unnecessarily.

The intention is to provide a userspace interface (presumably via eventfd) later, to be used in conjunction with dma-buf's mmap support for sw access to buffers (or for userspace apps that would prefer to do their own synchronization).

v1: original
v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided that dma-fence didn't need to care about the sw->hw signaling path (it can be handled the same as the sw->sw case), and therefore the fence ops can be simplified, with more handled in the core. So remove the signal, add_callback, cancel_callback, and wait ops, and replace them with a simple enable_signaling() op which can be used to inform a fence supporting hw->hw signaling that one or more devices which do not support hw signaling are waiting (and therefore it should enable an irq or do whatever is necessary in order that the CPU is notified when the fence is passed).
--- drivers/base/Makefile |2 +- drivers/base/dma-buf.c|3 + drivers/base/dma-fence.c | 364 + include/linux/dma-buf.h |2 + include/linux/dma-fence.h | 128 5 files changed, 498 insertions(+), 1 deletion(-) create mode 100644 drivers/base/dma-fence.c create mode 100644 include/linux/dma-fence.h diff --git a/drivers/base/Makefile b/drivers/base/Makefile index 5aa2d70..6e9f217 100644 --- a/drivers/base/Makefile +++ b/drivers/base/Makefile @@ -10,7 +10,7 @@ obj-$(CONFIG_CMA) += dma-contiguous.o obj-y += power/ obj-$(CONFIG_HAS_DMA) += dma-mapping.o obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o -obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o +obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o dma-fence.o obj-$(CONFIG_ISA) += isa.o obj-$(CONFIG_FW_LOADER)+= firmware_class.o obj-$(CONFIG_NUMA) += node.o diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c index 24e88fe..b053236 100644 --- a/drivers/base/dma-buf.c +++ b/drivers/base/dma-buf.c @@ -39,6 +39,8 @@ static int dma_buf_release(struct inode *inode, struct file *file) dmabuf = file-private_data; + WARN_ON(!list_empty(dmabuf-fence_list)); + dmabuf-ops-release(dmabuf); kfree(dmabuf);
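The producer/consumer lifecycle described in the RFC can be sketched as follows. Illustrative only: these symbols follow the RFC's proposed interface, which is not part of any released kernel, and my_flip_cb is a hypothetical callback the display driver would define:

```c
/* Producer (e.g. the GPU driver) attaches a fence before handing the
 * buffer off, and signals it from its rendering-complete IRQ: */
static void producer_submit(struct dma_buf *buf, struct dma_fence *fence)
{
	dma_buf_attach_fence(buf, fence);
	/* ... kick the hw job that writes into buf ... */
}

static irqreturn_t render_done_irq(int irq, void *data)
{
	struct dma_fence *fence = data;

	dma_fence_signal(fence);	/* removes it from the buf's fence list */
	return IRQ_HANDLED;
}

/* Consumer (e.g. the display driver) flips without waking userspace: */
static void consumer_flip(struct dma_buf *buf)
{
	struct dma_fence *fence = dma_buf_get_fence(buf); /* takes a ref */

	if (fence) {
		/* async: my_flip_cb updates the scan-out address later */
		dma_fence_add_callback(fence, &my_flip_cb);
		dma_fence_put(fence);
	} else {
		/* no pending fence: flip immediately */
	}
}
```

The one-shot nature shows up here: once render_done_irq() signals the fence, it is gone from the buffer, and the next submission attaches a fresh one.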
libdrm: Fix some warnings reported by clang's scan-build tool
Patches 1 to 4 were sent to mesa-dev.
[PATCH 2/5] libkms/nouveau.c: Fix a memory leak and cleanup code a bit.
--- libkms/nouveau.c | 20 +++- 1 files changed, 11 insertions(+), 9 deletions(-) diff --git a/libkms/nouveau.c b/libkms/nouveau.c index 0e24a15..4cbca96 100644 --- a/libkms/nouveau.c +++ b/libkms/nouveau.c @@ -94,14 +94,18 @@ nouveau_bo_create(struct kms_driver *kms, if (!bo) return -ENOMEM; - if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) { + switch (type) { + case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8: pitch = 64 * 4; size = 64 * 64 * 4; - } else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) { + break; + case KMS_BO_TYPE_SCANOUT_X8R8G8B8: pitch = width * 4; pitch = (pitch + 512 - 1) ~(512 - 1); size = pitch * height; - } else { + break; + default: + free(bo); return -EINVAL; } @@ -114,8 +118,10 @@ nouveau_bo_create(struct kms_driver *kms, arg.channel_hint = 0; ret = drmCommandWriteRead(kms-fd, DRM_NOUVEAU_GEM_NEW, arg, sizeof(arg)); - if (ret) - goto err_free; + if (ret) { + free(bo); + return ret; + } bo-base.kms = kms; bo-base.handle = arg.info.handle; @@ -126,10 +132,6 @@ nouveau_bo_create(struct kms_driver *kms, *out = bo-base; return 0; - -err_free: - free(bo); - return ret; } static int -- 1.7.7 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 1/5] libkms/intel.c: Fix a memory leak and a dead assignment as well as cleanup code a bit.
--- libkms/intel.c | 25 ++--- 1 files changed, 14 insertions(+), 11 deletions(-) diff --git a/libkms/intel.c b/libkms/intel.c index 8b8249b..b8ac343 100644 --- a/libkms/intel.c +++ b/libkms/intel.c @@ -93,14 +93,18 @@ intel_bo_create(struct kms_driver *kms, if (!bo) return -ENOMEM; - if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) { + switch (type) { + case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8: pitch = 64 * 4; size = 64 * 64 * 4; - } else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) { + break; + case KMS_BO_TYPE_SCANOUT_X8R8G8B8: pitch = width * 4; pitch = (pitch + 512 - 1) ~(512 - 1); size = pitch * ((height + 4 - 1) ~(4 - 1)); - } else { + break; + default: + free(bo); return -EINVAL; } @@ -108,8 +112,10 @@ intel_bo_create(struct kms_driver *kms, arg.size = size; ret = drmCommandWriteRead(kms-fd, DRM_I915_GEM_CREATE, arg, sizeof(arg)); - if (ret) - goto err_free; + if (ret) { + free(bo); + return ret; + } bo-base.kms = kms; bo-base.handle = arg.handle; @@ -124,21 +130,18 @@ intel_bo_create(struct kms_driver *kms, tile.handle = bo-base.handle; tile.tiling_mode = I915_TILING_X; tile.stride = bo-base.pitch; - - ret = drmCommandWriteRead(kms-fd, DRM_I915_GEM_SET_TILING, tile, sizeof(tile)); #if 0 + ret = drmCommandWriteRead(kms-fd, DRM_I915_GEM_SET_TILING, tile, sizeof(tile)); if (ret) { kms_bo_destroy(out); return ret; } +#else + drmCommandWriteRead(kms-fd, DRM_I915_GEM_SET_TILING, tile, sizeof(tile)); #endif } return 0; - -err_free: - free(bo); - return ret; } static int -- 1.7.7 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 3/5] nouveau/nouveau.c: Fix two memory leaks.
--- nouveau/nouveau.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/nouveau/nouveau.c b/nouveau/nouveau.c index 5aa4107..e91287f 100644 --- a/nouveau/nouveau.c +++ b/nouveau/nouveau.c @@ -95,6 +95,7 @@ nouveau_device_wrap(int fd, int close, struct nouveau_device **pdev) (dev-drm_version 0x0100 || dev-drm_version = 0x0200)) { nouveau_device_del(dev); + free(nvdev); return -EINVAL; } @@ -105,6 +106,7 @@ nouveau_device_wrap(int fd, int close, struct nouveau_device **pdev) ret = nouveau_getparam(dev, NOUVEAU_GETPARAM_AGP_SIZE, gart); if (ret) { nouveau_device_del(dev); + free(nvdev); return ret; } -- 1.7.7 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 4/5] xf86drm.c: Make more code UDEV unrelevant and fix a memory leak.
--- xf86drm.c | 12 +--- 1 files changed, 9 insertions(+), 3 deletions(-) diff --git a/xf86drm.c b/xf86drm.c index 6ea068f..e3789c8 100644 --- a/xf86drm.c +++ b/xf86drm.c @@ -255,6 +255,7 @@ static int drmMatchBusID(const char *id1, const char *id2, int pci_domain_ok) return 0; } +#if !defined(UDEV) /** * Handles error checking for chown call. * @@ -284,6 +285,7 @@ static int chown_check_return(const char *path, uid_t owner, gid_t group) path, errno, strerror(errno)); return -1; } +#endif /** * Open the DRM device, creating it if necessary. @@ -303,14 +305,17 @@ static int drmOpenDevice(long dev, int minor, int type) stat_t st; charbuf[64]; int fd; +#if !defined(UDEV) mode_t devmode = DRM_DEV_MODE, serv_mode; int isroot = !geteuid(); uid_t user= DRM_DEV_UID; gid_t group = DRM_DEV_GID, serv_group; - +#endif + sprintf(buf, type ? DRM_DEV_NAME : DRM_CONTROL_DEV_NAME, DRM_DIR_NAME, minor); drmMsg(drmOpenDevice: node name is %s\n, buf); +#if !defined(UDEV) if (drm_server_info) { drm_server_info-get_perms(serv_group, serv_mode); devmode = serv_mode ? serv_mode : DRM_DEV_MODE; @@ -318,7 +323,6 @@ static int drmOpenDevice(long dev, int minor, int type) group = (serv_group = 0) ? serv_group : DRM_DEV_GID; } -#if !defined(UDEV) if (stat(DRM_DIR_NAME, st)) { if (!isroot) return DRM_ERR_NOT_ROOT; @@ -1395,8 +1399,10 @@ drm_context_t *drmGetReservedContextList(int fd, int *count) } res.contexts = list; -if (drmIoctl(fd, DRM_IOCTL_RES_CTX, res)) +if (drmIoctl(fd, DRM_IOCTL_RES_CTX, res)) { + drmFree(retval); return NULL; +} for (i = 0; i res.count; i++) retval[i] = list[i].handle; -- 1.7.7 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 5/5] modetest.c: Add return 0 in bit_name_fn(res) macro.
--- tests/modetest/modetest.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/tests/modetest/modetest.c b/tests/modetest/modetest.c index ec3121e..00129fa 100644 --- a/tests/modetest/modetest.c +++ b/tests/modetest/modetest.c @@ -128,6 +128,7 @@ char * res##_str(int type) { \ sep = , ; \ } \ } \ + return 0; \ } static const char *mode_type_names[] = { -- 1.7.7 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: libdrm: Fix some warnings reported by clang's scan-build tool
On Fri, Jul 13, 2012 at 05:49:12PM +0200, Johannes Obermayr wrote: Patches 1 to 4 were sent to mesa-dev. And you chose to ignore most of my comments. Fine. Don't expect further reviews from me. Marcin
RE: [RFC] dma-fence: dma-buf synchronization (v2)
Hi Rob,

Yes, sorry we've been a bit slack progressing KDS publicly. Your approach looks interesting and seems like it could enable both implicit and explicit synchronization. A good compromise.

From: Rob Clark r...@ti.com

A dma-fence can be attached to a buffer which is being filled or consumed by hw, to allow userspace to pass the buffer without waiting to another device. For example, userspace can call the page_flip ioctl to display the next frame of graphics after kicking the GPU but while the GPU is still rendering. The display device sharing the buffer with the GPU would attach a callback to get notified when the GPU's rendering-complete IRQ fires, to update the scan-out address of the display, without having to wake up userspace.

A dma-fence is a transient, one-shot deal. It is allocated and attached to a dma-buf's list of fences. When the one that attached it is done with the pending operation, it can signal the fence, removing it from the dma-buf's list of fences:

+ dma_buf_attach_fence()
+ dma_fence_signal()

It would be useful to have two lists of fences: those around writes to the buffer and those around reads. The idea being that if you only want to read from a buffer, you don't need to wait for fences around other read operations; you only need to wait for the last writer fence. If you do want to write to the buffer, however, you need to wait for all the read fences and the last writer fence.

The use-case is when EGL swap behaviour is EGL_BUFFER_PRESERVED. You have the display controller reading the buffer, with its fence defined to be signalled when it is no longer scanning out that buffer. It can only stop scanning out that buffer when it is given another buffer to scan out. If that next buffer must be rendered by copying the currently scanned-out buffer into it (one possible option for implementing EGL_BUFFER_PRESERVED), then you essentially deadlock if the scan-out job blocks the render-the-next-frame job.
There are probably variations of this idea; perhaps you only need a flag to indicate whether a fence is around a read-only or rw access?

The intention is to provide a userspace interface (presumably via eventfd) later, to be used in conjunction with dma-buf's mmap support for sw access to buffers (or for userspace apps that would prefer to do their own synchronization).

From our experience with our own KDS, we've come up with an interesting approach to synchronizing userspace applications which have a buffer mmap'd. We wanted to avoid userspace being able to block jobs running on hardware while still allowing userspace to participate. Our original idea was to have a lock/unlock ioctl interface on a dma_buf but have a timeout whereby the application's lock would be broken if held for too long. That at least bounded how long userspace could potentially block hardware from making progress, though it was pretty harsh.

The approach we have now settled on is to instead only allow an application to wait for all jobs currently pending for a buffer. So there's no way userspace can prevent anything else from using a buffer, other than not issuing jobs which will use that buffer. Also, the interface we settled on was to add a poll handler to dma_buf; that way userspace can select() on multiple dma_buf buffers in one syscall. It can also choose if it wants to wait for only the last writer fence, i.e. wait until it can read (POLLIN), or wait for all fences because it wants to write to the buffer (POLLOUT). We kinda like this, but it does restrict the utility a little. An idea worth considering anyway.

My other thought is around atomicity. Could this be extended to (safely) allow for hardware devices which might want to access multiple buffers simultaneously? I think it probably can with some tweaks to the interface? An atomic function which does something like "give me all the fences for all these buffers and add this fence to each" instead/as-well-as?
Cheers, Tom ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
libdrm: Fix some warnings reported by clang's scan-build tool [try 2]
On Friday, 13 July 2012 at 18:47:50, Marcin Slusarz wrote:
On Fri, Jul 13, 2012 at 05:49:12PM +0200, Johannes Obermayr wrote: Patches 1 to 4 were sent to mesa-dev.
And you chose to ignore most of my comments. Fine. Don't expect further reviews from me. Marcin

Patch 1 and 2: - Adapted - I want to keep the proposed, easier to read switch case
Patch 3: - Resent - Waiting on your response: http://lists.freedesktop.org/archives/mesa-dev/2012-June/023456.html
Patch 4 and 5: - Split - http://llvm.org/bugs/show_bug.cgi?id=13358 (forgot to split and to add 'drmFree(list);') - The 'more if's' case seems better to me
Patch 6: - Resent

Marcin, it's not that I ignore comments. But sometimes I also want to hear opinions from (some more) other people. I hope I can calm the waves ...

Johannes
[PATCH 1/6] libkms/intel.c: Fix a memory leak and a dead assignment as well as make some code easier to read.
---
 libkms/intel.c | 32 +++++++++++++++++---------------
 1 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/libkms/intel.c b/libkms/intel.c
index 8b8249b..12175b0 100644
--- a/libkms/intel.c
+++ b/libkms/intel.c
@@ -89,27 +89,32 @@ intel_bo_create(struct kms_driver *kms,
 		}
 	}
 
-	bo = calloc(1, sizeof(*bo));
-	if (!bo)
-		return -ENOMEM;
-
-	if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) {
+	switch (type) {
+	case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8:
 		pitch = 64 * 4;
 		size = 64 * 64 * 4;
-	} else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) {
+		break;
+	case KMS_BO_TYPE_SCANOUT_X8R8G8B8:
 		pitch = width * 4;
 		pitch = (pitch + 512 - 1) & ~(512 - 1);
 		size = pitch * ((height + 4 - 1) & ~(4 - 1));
-	} else {
+		break;
+	default:
 		return -EINVAL;
 	}
 
+	bo = calloc(1, sizeof(*bo));
+	if (!bo)
+		return -ENOMEM;
+
 	memset(&arg, 0, sizeof(arg));
 	arg.size = size;
 
 	ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_CREATE, &arg, sizeof(arg));
-	if (ret)
-		goto err_free;
+	if (ret) {
+		free(bo);
+		return ret;
+	}
 
 	bo->base.kms = kms;
 	bo->base.handle = arg.handle;
@@ -124,21 +129,18 @@ intel_bo_create(struct kms_driver *kms,
 		tile.handle = bo->base.handle;
 		tile.tiling_mode = I915_TILING_X;
 		tile.stride = bo->base.pitch;
-
-		ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, sizeof(tile));
 #if 0
+		ret = drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, sizeof(tile));
 		if (ret) {
 			kms_bo_destroy(out);
 			return ret;
 		}
+#else
+		drmCommandWriteRead(kms->fd, DRM_I915_GEM_SET_TILING, &tile, sizeof(tile));
 #endif
 	}
 
 	return 0;
-
-err_free:
-	free(bo);
-	return ret;
 }
 
 static int
-- 
1.7.7
[PATCH 2/6] libkms/nouveau.c: Fix a memory leak and make some code easier to read.
---
 libkms/nouveau.c | 27 +++++++++++++-------------
 1 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/libkms/nouveau.c b/libkms/nouveau.c
index 0e24a15..fbca6fe 100644
--- a/libkms/nouveau.c
+++ b/libkms/nouveau.c
@@ -90,21 +90,24 @@ nouveau_bo_create(struct kms_driver *kms,
 		}
 	}
 
-	bo = calloc(1, sizeof(*bo));
-	if (!bo)
-		return -ENOMEM;
-
-	if (type == KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8) {
+	switch (type) {
+	case KMS_BO_TYPE_CURSOR_64X64_A8R8G8B8:
 		pitch = 64 * 4;
 		size = 64 * 64 * 4;
-	} else if (type == KMS_BO_TYPE_SCANOUT_X8R8G8B8) {
+		break;
+	case KMS_BO_TYPE_SCANOUT_X8R8G8B8:
 		pitch = width * 4;
 		pitch = (pitch + 512 - 1) & ~(512 - 1);
 		size = pitch * height;
-	} else {
+		break;
+	default:
 		return -EINVAL;
 	}
 
+	bo = calloc(1, sizeof(*bo));
+	if (!bo)
+		return -ENOMEM;
+
 	memset(&arg, 0, sizeof(arg));
 	arg.info.size = size;
 	arg.info.domain = NOUVEAU_GEM_DOMAIN_MAPPABLE | NOUVEAU_GEM_DOMAIN_VRAM;
@@ -114,8 +117,10 @@ nouveau_bo_create(struct kms_driver *kms,
 	arg.channel_hint = 0;
 
 	ret = drmCommandWriteRead(kms->fd, DRM_NOUVEAU_GEM_NEW, &arg, sizeof(arg));
-	if (ret)
-		goto err_free;
+	if (ret) {
+		free(bo);
+		return ret;
+	}
 
 	bo->base.kms = kms;
 	bo->base.handle = arg.info.handle;
@@ -126,10 +131,6 @@ nouveau_bo_create(struct kms_driver *kms,
 	*out = &bo->base;
 
 	return 0;
-
-err_free:
-	free(bo);
-	return ret;
 }
 
 static int
-- 
1.7.7
[PATCH 3/6] nouveau/nouveau.c: Fix two memory leaks.
---
 nouveau/nouveau.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/nouveau/nouveau.c b/nouveau/nouveau.c
index 5aa4107..e91287f 100644
--- a/nouveau/nouveau.c
+++ b/nouveau/nouveau.c
@@ -95,6 +95,7 @@ nouveau_device_wrap(int fd, int close, struct nouveau_device **pdev)
 	    (dev->drm_version < 0x0100 || dev->drm_version >= 0x0200)) {
 		nouveau_device_del(&dev);
+		free(nvdev);
 		return -EINVAL;
 	}
 
@@ -105,6 +106,7 @@ nouveau_device_wrap(int fd, int close, struct nouveau_device **pdev)
 	ret = nouveau_getparam(dev, NOUVEAU_GETPARAM_AGP_SIZE, &gart);
 	if (ret) {
 		nouveau_device_del(&dev);
+		free(nvdev);
 		return ret;
 	}
-- 
1.7.7
[PATCH 4/6] xf86drm.c: Make more code irrelevant for UDEV.
---
 xf86drm.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/xf86drm.c b/xf86drm.c
index 6ea068f..e652731 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -255,6 +255,7 @@ static int drmMatchBusID(const char *id1, const char *id2, int pci_domain_ok)
     return 0;
 }
 
+#if !defined(UDEV)
 /**
  * Handles error checking for chown call.
  *
@@ -284,6 +285,7 @@ static int chown_check_return(const char *path, uid_t owner, gid_t group)
 	  path, errno, strerror(errno));
     return -1;
 }
+#endif
 
 /**
  * Open the DRM device, creating it if necessary.
@@ -303,14 +305,17 @@ static int drmOpenDevice(long dev, int minor, int type)
     stat_t          st;
     char            buf[64];
     int             fd;
+#if !defined(UDEV)
     mode_t          devmode = DRM_DEV_MODE, serv_mode;
     int             isroot  = !geteuid();
     uid_t           user    = DRM_DEV_UID;
     gid_t           group   = DRM_DEV_GID, serv_group;
-
+#endif
+
     sprintf(buf, type ? DRM_DEV_NAME : DRM_CONTROL_DEV_NAME, DRM_DIR_NAME, minor);
     drmMsg("drmOpenDevice: node name is %s\n", buf);
 
+#if !defined(UDEV)
     if (drm_server_info) {
 	drm_server_info->get_perms(&serv_group, &serv_mode);
 	devmode  = serv_mode ? serv_mode : DRM_DEV_MODE;
@@ -318,7 +323,6 @@ static int drmOpenDevice(long dev, int minor, int type)
 	group = (serv_group >= 0) ? serv_group : DRM_DEV_GID;
     }
 
-#if !defined(UDEV)
     if (stat(DRM_DIR_NAME, &st)) {
 	if (!isroot)
 	    return DRM_ERR_NOT_ROOT;
-- 
1.7.7
[PATCH 5/6] xf86drm.c: Fix two memory leaks.
---
 xf86drm.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/xf86drm.c b/xf86drm.c
index e652731..c1cc170 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -1399,8 +1399,11 @@ drm_context_t *drmGetReservedContextList(int fd, int *count)
     }
 
     res.contexts = list;
-    if (drmIoctl(fd, DRM_IOCTL_RES_CTX, &res))
+    if (drmIoctl(fd, DRM_IOCTL_RES_CTX, &res)) {
+	drmFree(list);
+	drmFree(retval);
 	return NULL;
+    }
 
     for (i = 0; i < res.count; i++)
 	retval[i] = list[i].handle;
-- 
1.7.7
[PATCH 6/6] modetest.c: Add return 0 in bit_name_fn(res) macro.
---
 tests/modetest/modetest.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/tests/modetest/modetest.c b/tests/modetest/modetest.c
index ec3121e..00129fa 100644
--- a/tests/modetest/modetest.c
+++ b/tests/modetest/modetest.c
@@ -128,6 +128,7 @@ char * res##_str(int type) { \
 			sep = ", ";		\
 		}				\
 	}					\
+	return 0;				\
 }
 
 static const char *mode_type_names[] = {
-- 
1.7.7
Re: [RFC] dma-fence: dma-buf synchronization (v2)
On Fri, Jul 13, 2012 at 12:35 PM, Tom Cooksey tom.cook...@arm.com wrote: My other thought is around atomicity. Could this be extended to (safely) allow for hardware devices which might want to access multiple buffers simultaneously? I think it probably can with some tweaks to the interface? An atomic function which does something like give me all the fences for all these buffers and add this fence to each instead/as-well-as?

fwiw, what I'm leaning towards right now is combining dma-fence w/ Maarten's idea of dma-buf-mgr (not sure if you saw his patches?). And let dmabufmgr handle the multi-buffer reservation stuff. And possibly the read vs write access, although this I'm not 100% sure on... the other option being the concept of read vs write (or exclusive/non-exclusive) fences.

In the current state, the fence is quite simple, and doesn't care *what* it is fencing, which seems advantageous when you get into trying to deal with combinations of devices sharing buffers, some of whom can do hw sync, and some who can't. So having a bit of partitioning from the code dealing w/ sequencing who can access the buffers when and for what purpose seems like it might not be a bad idea. Although I'm still working through the different alternatives.

BR,
-R
[pull] drm-intel-next
Hi Dave,

New pull for -next. Highlights:
- rc6/turbo support for hsw (Eugeni)
- improve corner-case of the reset handling code - gpu reset handling should be rock-solid now
- support for fb offsets > 4096 pixels on gen4+ (yeah, you need some fairly big screens to hit that)
- the "Flush Me Harder" patch to fix the gen6+ fallout from disabling the flushing_list
- no more /dev/agpgart on gen6+!
- HAS_PCH_xxx improvements from Paulo
- a few minor bits&pieces all over, most of it in the hsw code

QA reported 2 regressions, one due to a bad cable (fixed by a walk to the next radioshack) and one due to the HPD v2 patch - I owe you one for refusing to take v2 for -fixes after v1 blew up on Linus' machine, I guess ;-) The latter has a confirmed fix already queued up in my tree.

Regressions from the last pull are all fixed, and some really good news: we've finally fixed the last DP regression from 3.2. Although I'm wary of that blowing up elsewhere, hence I prefer that we soak it in 3.6 a bit before submitting it to stable.

Otherwise Chris is hunting down an obscure bug that got recently introduced due to a funny interaction between two seemingly unrelated patches, one improving our gpu death handling, the other preparing the removal of the flushing_list. But he has patches already, although I'm still complaining a bit about the commit messages ...

Wrt further pulls for 3.6 I'll merge feature-y stuff only at the end of the current drm-intel-next cycle, so that if this misses 3.6 I can just send you a pull for the bugfixes that are currently merged (or, in the case of Chris' patches, hopefully merged soon).

Yours, Daniel

PS: This pull will make the already existing conflict with Linus' tree a bit more fun, but I think it should still be doable (the important thing is to keep the revert from -fixes, but don't kill any other changes from -next).
The following changes since commit 7b0cfee1a24efdfe0235bac62e53f686fe8a8e24:

  Merge tag 'v3.5-rc4' into drm-intel-next-queued (2012-06-25 19:10:36 +0200)

are available in the git repository at:

  git://people.freedesktop.org/~danvet/drm-intel tags/drm-intel-next-2012-07-06

for you to fetch changes up to 4acf518626cdad5bbf7aac9869bd4accbbfb4ad3:

  drm/i915: program FDI_RX TP and FDI delays (2012-07-05 15:09:03 +0200)

Ben Widawsky (1):
      drm/i915: linuxify create_hw_context()

Chris Wilson (2):
      drm/i915: Group the GT routines together in both code and vtable
      drm/i915: Implement w/a for sporadic read failures on waking from rc6

Daniel Vetter (15):
      drm/i915: wrap up gt powersave enabling functions
      drm/i915: make enable/disable_gt_powersave locking consistent
      drm/i915: don't use dev->agp
      drm/i915: disable drm agp support for !gen3 with kms enabled
      agp/intel-agp: remove snb+ host bridge pciids
      drm/i915: Flush Me Harder required on gen6+
      drm/i915: fix up ilk rc6 disabling confusion
      drm/i915: don't trylock in the gpu reset code
      drm/i915: non-interruptible sleeps can't handle -EAGAIN
      drm/i915: don't hang userspace when the gpu reset is stuck
      drm/i915: properly SIGBUS on I/O errors
      drm/i915: don't return a spurious -EIO from intel_ring_begin
      drm/i915: introduce crtc->dspaddr_offset
      drm/i915: adjust framebuffer base address on gen4+
      drm/i915: introduce for_each_encoder_on_crtc

Eugeni Dodonov (11):
      drm/i915: support Haswell force waking
      drm/i915: add RPS configuration for Haswell
      drm/i915: slightly improve gt enable/disable routines
      drm/i915: enable RC6 by default on Haswell
      drm/i915: disable RC6 when disabling rps
      drm/i915: introduce haswell_init_clock_gating
      drm/i915: enable RC6 workaround on Haswell
      drm/i915: move force wake support into intel_pm
      drm/i915: re-initialize DDI buffer translations after resume
      drm/i915: prevent bogus intel_update_fbc notifications
      drm/i915: program FDI_RX TP and FDI delays

Jesper Juhl (1):
      drm/i915/sprite: Fix mem leak in intel_plane_init()

Jesse Barnes (3):
      drm/i915: mask tiled bit when updating IVB sprites
      drm/i915: correct IVB default sprite format
      drm/i915: prefer wide & slow to fast & narrow in DP configs

Paulo Zanoni (5):
      drm/i915: fix PIPE_WM_LINETIME definition
      drm/i915: add PCH_NONE to enum intel_pch
      drm/i915: get rid of dev_priv->info->has_pch_split
      drm/i915: don't ironlake_init_pch_refclk() on LPT
      drm/i915: fix PIPE_DDI_PORT_MASK

Ville Syrjälä (2):
      drm/i915: Zero initialize mode_cmd
      drm/i915: Reject page flips with changed format/offset/pitch

 drivers/char/agp/intel-agp.c    |  11 -
 drivers/gpu/drm/i915/i915_dma.c |   9 +-
 drivers/gpu/drm/i915/i915_drv.c | 172 ++
 drivers/gpu/drm/i915/i915_drv.h |  28 ++-
 drivers/gpu/drm/i915/i915_gem.c |  44 +++--
[PATCH] nouveau: Add irq waiting as alternative to busywait
A way to trigger an irq will be needed for optimus support, since cpu-waiting isn't always viable there. This could also be nice for power saving, since the cpu would no longer have to spin, and performance might improve slightly on cpu-limited workloads. Some way to quantify these effects would be nice, even if the end result would be 'no performance regression'.

An earlier version always emitted an interrupt, resulting in glxgears going from 8k fps to 7k. However this is no longer the case, as I'm using the kernel submission channel for generating irqs as needed now. On nv84 I'm using NOTIFY_INTR, but that might have been removed on fermi, so instead I'm using invalid command 0x0058 now as a way to signal completion.

Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
---
 drivers/gpu/drm/nouveau/nouveau_drv.h   |  2 +
 drivers/gpu/drm/nouveau/nouveau_fence.c | 49 +++++++++++++++++++---
 drivers/gpu/drm/nouveau/nouveau_fifo.h  |  1 +
 drivers/gpu/drm/nouveau/nouveau_state.c |  1 +
 drivers/gpu/drm/nouveau/nv04_fifo.c     | 25 ++++++++++
 drivers/gpu/drm/nouveau/nv84_fence.c    | 18 +++++--
 drivers/gpu/drm/nouveau/nvc0_fence.c    | 12 ++--
 drivers/gpu/drm/nouveau/nvc0_fifo.c     |  3 +-
 drivers/gpu/drm/nouveau/nve0_fifo.c     | 15 +++--
 9 files changed, 110 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h b/drivers/gpu/drm/nouveau/nouveau_drv.h
index f97a1a7..d9d274d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -707,6 +707,7 @@ struct drm_nouveau_private {
 		struct drm_mm heap;
 		struct nouveau_bo *bo;
 	} fence;
+	wait_queue_head_t fence_wq;
 
 	struct {
 		spinlock_t lock;
@@ -1656,6 +1657,7 @@ nv44_graph_class(struct drm_device *dev)
 #define NV84_SUBCHAN_WRCACHE_FLUSH     0x0024
 #define NV10_SUBCHAN_REF_CNT           0x0050
 #define NVSW_SUBCHAN_PAGE_FLIP         0x0054
+#define NVSW_SUBCHAN_FENCE_WAKE        0x0058
 #define NV11_SUBCHAN_DMA_SEMAPHORE     0x0060
 #define NV11_SUBCHAN_SEMAPHORE_OFFSET  0x0064
 #define NV11_SUBCHAN_SEMAPHORE_ACQUIRE 0x0068
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 3c18049..3ba8dee 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -68,7 +68,7 @@ nouveau_fence_update(struct nouveau_channel *chan)
 	spin_lock(&fctx->lock);
 	list_for_each_entry_safe(fence, fnext, &fctx->pending, head) {
-		if (priv->read(chan) < fence->sequence)
+		if (priv->read(chan) - fence->sequence >= 0x8000U)
 			break;
 
 		if (fence->work)
@@ -111,11 +111,9 @@ nouveau_fence_done(struct nouveau_fence *fence)
 	return !fence->channel;
 }
 
-int
-nouveau_fence_wait(struct nouveau_fence *fence, bool lazy, bool intr)
+static int nouveau_fence_wait_busy(struct nouveau_fence *fence, bool lazy, bool intr)
 {
 	unsigned long sleep_time = NSEC_PER_MSEC / 1000;
-	ktime_t t;
 	int ret = 0;
 
 	while (!nouveau_fence_done(fence)) {
@@ -127,7 +125,7 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool lazy, bool intr)
 		__set_current_state(intr ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE);
 		if (lazy) {
-			t = ktime_set(0, sleep_time);
+			ktime_t t = ktime_set(0, sleep_time);
 			schedule_hrtimeout(&t, HRTIMER_MODE_REL);
 			sleep_time *= 2;
 			if (sleep_time > NSEC_PER_MSEC)
@@ -144,6 +142,47 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool lazy, bool intr)
 	return ret;
 }
 
+static int nouveau_fence_wait_event(struct nouveau_fence *fence, bool intr)
+{
+	struct drm_nouveau_private *dev_priv = fence->channel->dev->dev_private;
+	unsigned long timeout = fence->timeout;
+	int ret = 0;
+	struct nouveau_channel *chan = dev_priv->channel;
+	struct nouveau_channel *prev = fence->channel;
+	struct nouveau_fence_priv *priv = nv_engine(chan->dev, NVOBJ_ENGINE_FENCE);
+
+	if (nouveau_fence_done(fence))
+		return 0;
+
+	if (!timeout)
+		timeout = jiffies + 3 * DRM_HZ;
+
+	if (prev != chan)
+		ret = priv->sync(fence, prev, chan);
+	if (ret)
+		goto busy;
+
+	if (intr)
+		ret = wait_event_interruptible_timeout(dev_priv->fence_wq, nouveau_fence_done(fence), timeout);
+	else
+		ret = wait_event_timeout(dev_priv->fence_wq, nouveau_fence_done(fence),
Re: [RFC] dma-fence: dma-buf synchronization (v2)
On Fri, Jul 13, 2012 at 4:44 PM, Maarten Lankhorst maarten.lankho...@canonical.com wrote: Hey,

On 13-07-12 20:52, Rob Clark wrote: On Fri, Jul 13, 2012 at 12:35 PM, Tom Cooksey tom.cook...@arm.com wrote: My other thought is around atomicity. Could this be extended to (safely) allow for hardware devices which might want to access multiple buffers simultaneously? I think it probably can with some tweaks to the interface? An atomic function which does something like give me all the fences for all these buffers and add this fence to each instead/as-well-as?

fwiw, what I'm leaning towards right now is combining dma-fence w/ Maarten's idea of dma-buf-mgr (not sure if you saw his patches?). And let dmabufmgr handle the multi-buffer reservation stuff. And possibly the read vs write access, although this I'm not 100% sure on... the other option being the concept of read vs write (or exclusive/non-exclusive) fences.

Agreed, dmabufmgr is meant for reserving multiple buffers without deadlocks. The underlying mechanism for synchronization can be dma-fences; it wouldn't really change dmabufmgr much.

In the current state, the fence is quite simple, and doesn't care *what* it is fencing, which seems advantageous when you get into trying to deal with combinations of devices sharing buffers, some of whom can do hw sync, and some who can't. So having a bit of partitioning from the code dealing w/ sequencing who can access the buffers when and for what purpose seems like it might not be a bad idea. Although I'm still working through the different alternatives.

Yeah, I managed to get nouveau hooked up with generating irqs on completion today using an invalid command. It's also no longer a performance regression, so software syncing is no longer a problem for nouveau. i915 already generates irqs and r600 presumably too. Monday I'll take a better look at your patch, end of day now. :)

let me send you a slightly updated version..
I fixed locally some locking fail in attach_fence() and get_fence() that I managed to introduce when converting from the global spinlock to using the waitqueue's spinlock.

BR,
-R

~Maarten
Re: [PATCH] nouveau: Add irq waiting as alternative to busywait
On Fri, Jul 13, 2012 at 11:35 PM, Maarten Lankhorst m.b.lankho...@gmail.com wrote:

A way to trigger an irq will be needed for optimus support since cpu-waiting isn't always viable there. This could also be nice for power saving since cpu would no longer have to spin, and performance might improve slightly on cpu-limited workloads. Some way to quantify these effects would be nice, even if the end result would be 'no performance regression'. An earlier version always emitted an interrupt, resulting in glxgears going from 8k fps to 7k. However this is no longer the case, as I'm using the kernel submission channel for generating irqs as needed now. On nv84 I'm using NOTIFY_INTR, but that might have been removed on fermi, so instead I'm using invalid command 0x0058 now as a way to signal completion.

Out of curiosity, isn't this like a handcoded version of software methods? If so, why handcoded? Or are software methods not supported on NVC0?
Re: general protection fault on ttm_init()
Can you try this patch on top of the previous one? I think it should fix it.

Dave.

0001-drm-set-drm_class-to-NULL-after-removing-it.patch
Description: Binary data
Re: general protection fault on ttm_init()
Hi Dave,

On Sat, Jul 14, 2012 at 01:33:45PM +1000, Dave Airlie wrote: Can you try this patch on top of the previous one? I think it should fix it.

You are right, it works! Thank you very much! :-)

Thanks, Fengguang