Re: [Intel-gfx] [PATCH] drm/i915: More vma fixups around unbind/destroy
On Fri, Aug 23, 2013 at 02:08:11AM +0200, Daniel Vetter wrote: The important bugfix here is that we must not unlink the vma when we keep it around as a placeholder for the execbuf code. Since then we won't find it again when execbuf gets interrupt and restarted and create a 2nd vma. And since the code as-is isn't fit yet to deal with more than one vma, hilarity ensues. Specifically the dma map/unmap of the sg table isn't adjusted for multiple vmas yet and will blow up like this: BUG: unable to handle kernel NULL pointer dereference at 0008 IP: [a008fb37] i915_gem_gtt_finish_object+0x73/0xc8 [i915] PGD 56bb5067 PUD ad3dd067 PMD 0 Oops: [#1] SMP Modules linked in: tcp_lp ppdev parport_pc lp parport ipv6 dm_mod dcdbas snd_hda_codec_hdmi pcspkr snd_hda_codec_realtek serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_codec lpc_ich snd_hwdep mfd_core snd_pcm snd_page_alloc snd_timer snd soundcore acpi_cpufreq i915 video button drm_kms_helper drm mperf freq_table CPU: 1 PID: 16650 Comm: fbo-maxsize Not tainted 3.11.0-rc4_nightlytop_d93f59_debug_20130814_+ #6957 Hardware name: Dell Inc. OptiPlex 9010/03JR84, BIOS A01 05/04/2012 task: 8800563b3f00 ti: 88004bdf4000 task.ti: 88004bdf4000 RIP: 0010:[a008fb37] [a008fb37] i915_gem_gtt_finish_object+0x73/0xc8 [i915] RSP: 0018:88004bdf5958 EFLAGS: 00010246 RAX: RBX: 8801135e RCX: 8800ad3bf8e0 RDX: 8800ad3bf8e0 RSI: RDI: 8801007ee780 RBP: 88004bdf5978 R08: 8800ad3bf8e0 R09: R10: 86ca1810 R11: 880036a17101 R12: 8801007ee780 R13: 00018001 R14: 880118c4e000 R15: 8801007ee780 FS: 7f401a0ce740() GS:88011e28() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 0008 CR3: 5635c000 CR4: 001407e0 Stack: 8801007ee780 88005c253180 00018000 8801135e 88004bdf59a8 a0088e55 0011 8801007eec00 00018000 880036a17101 88004bdf5a08 a0089026 Call Trace: [a0088e55] i915_vma_unbind+0xdf/0x1ab [i915] [a0089026] __i915_gem_shrink+0x105/0x177 [i915] [a0089452] i915_gem_object_get_pages_gtt+0x108/0x309 [i915] [a0085ba9] i915_gem_object_get_pages+0x61/0x90 [i915] [a008f22b] ? gen6_ppgtt_insert_entries+0x103/0x125 [i915] [a008a113] i915_gem_object_pin+0x1fa/0x5df [i915] [a008cdfe] i915_gem_execbuffer_reserve_object.isra.6+0x8d/0x1bc [i915] [a008d156] i915_gem_execbuffer_reserve+0x229/0x367 [i915] [a008dbf6] i915_gem_do_execbuffer.isra.12+0x4dc/0xf3a [i915] [810fc823] ? might_fault+0x40/0x90 [a008eb89] i915_gem_execbuffer2+0x187/0x222 [i915] [a000971c] drm_ioctl+0x308/0x442 [drm] [a008ea02] ? i915_gem_execbuffer+0x3ae/0x3ae [i915] [817db156] ? __do_page_fault+0x3dd/0x481 [8112fdba] vfs_ioctl+0x26/0x39 [811306a2] do_vfs_ioctl+0x40e/0x451 [817deda7] ? sysret_check+0x1b/0x56 [8113073c] SyS_ioctl+0x57/0x87 [8135bbfe] ? trace_hardirqs_on_thunk+0x3a/0x3f [817ded82] system_call_fastpath+0x16/0x1b Code: 48 c7 c6 84 30 0e a0 31 c0 e8 d0 e9 f7 ff bf c6 a7 00 00 e8 07 af 2c e1 41 f6 84 24 03 01 00 00 10 75 44 49 8b 84 24 08 01 00 00 8b 50 08 48 8b 30 49 8b 86 b0 04 00 00 48 89 c7 48 81 c7 98 00 RIP [a008fb37] i915_gem_gtt_finish_object+0x73/0xc8 [i915] RSP 88004bdf5958 CR2: 0008 As a consequence we need to change the only one vma for now check in vma_unbind - since vma_destroy isn't always called the obj-vma_list might not be empty. Instead check that the vma list is singular at the beginning of vma_unbind. This is also more symmetric with bind_to_vm. v2: - Add a paranoid WARN to mark_free in the eviction code to make sure we never try to evict a vma used by the execbuf code right now. - Move the check for a temporary execbuf vma into vma_destroy - otherwise the failure path cleanup in bind_to_vm will blow up. Our first attempting at fixing this was commit 1be81a2f2cfd8789a627401d470423358fba2d76 Author: Chris Wilson ch...@chris-wilson.co.uk Date: Tue Aug 20 12:56:40 2013 +0100 drm/i915: Don't destroy the vma placeholder during execbuffer reservation Squash with this when merging! Cc: Chris Wilson ch...@chris-wilson.co.uk Cc: Ben Widawsky b...@bwidawsk.net Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68298 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68171 Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch Tested-by: lu hua huax...@intel.com -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list
Re: [Intel-gfx] [PATCH] Add second DRI driver name (DRI2DriverVDPAU)
Ping. Пятница, 16 августа 2013, 14:31 +04:00 от Ибрагимов Ринатibragimovri...@mail.ru: libvdpau uses second DRI driver name to determine which VDPAU driver to use. This patch will allow libvdpau choose libvdpau_i965.so on systems with Intel GPUs, libvdpau_nvidia.so on those with nVidia ones, and so on. I'm experimenting now with generic vdpau driver using OpenGL/VA-API, it would be convenient to have this driver selection working without manual driver selection. Signed-off-by: Rinat ibragimovri...@mail.ru --- src/sna/sna_dri.c |5 +++-- src/uxa/intel_dri.c |5 +++-- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/src/sna/sna_dri.c b/src/sna/sna_dri.c index 1569251..0ba373d 100644 --- a/src/sna/sna_dri.c +++ b/src/sna/sna_dri.c @@ -2299,7 +2299,7 @@ bool sna_dri_open(struct sna *sna, ScreenPtr screen) DRI2InfoRec info; int major = 1, minor = 0; #if DRI2INFOREC_VERSION = 4 - const char *driverNames[1]; + const char *driverNames[2]; #endif DBG((%s()\n, __FUNCTION__)); @@ -2336,9 +2336,10 @@ bool sna_dri_open(struct sna *sna, ScreenPtr screen) info.ScheduleSwap = sna_dri_schedule_swap; info.GetMSC = sna_dri_get_msc; info.ScheduleWaitMSC = sna_dri_schedule_wait_msc; - info.numDrivers = 1; + info.numDrivers = 2; info.driverNames = driverNames; driverNames[0] = info.driverName; + driverNames[1] = info.driverName; #endif #if DRI2INFOREC_VERSION = 6 diff --git a/src/uxa/intel_dri.c b/src/uxa/intel_dri.c index 0370034..2d33380 100644 --- a/src/uxa/intel_dri.c +++ b/src/uxa/intel_dri.c @@ -1553,7 +1553,7 @@ Bool I830DRI2ScreenInit(ScreenPtr screen) int dri2_major = 1; int dri2_minor = 0; #if DRI2INFOREC_VERSION = 4 - const char *driverNames[1]; + const char *driverNames[2]; #endif if (intel-force_fallback) { @@ -1620,9 +1620,10 @@ Bool I830DRI2ScreenInit(ScreenPtr screen) info.ScheduleSwap = I830DRI2ScheduleSwap; info.GetMSC = I830DRI2GetMSC; info.ScheduleWaitMSC = I830DRI2ScheduleWaitMSC; - info.numDrivers = 1; + info.numDrivers = 2; info.driverNames = driverNames; driverNames[0] = info.driverName; + driverNames[1] = info.driverName; #endif return DRI2ScreenInit(screen, info); -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx --- Rinat ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 6/6] DRAFT: drm/i915: do adapter power state notification on PC8+ enable/disable
On Fri, Aug 23, 2013 at 05:14:00PM -0300, Paulo Zanoni wrote: 2013/8/23 Ville Syrjälä ville.syrj...@linux.intel.com: On Fri, Aug 23, 2013 at 04:41:57PM -0300, Paulo Zanoni wrote: 2013/8/23 Kristen Carlson Accardi kristen.c.acca...@intel.com: On Fri, 23 Aug 2013 13:44:17 -0300 Paulo Zanoni przan...@gmail.com wrote: 2013/8/23 Jani Nikula jani.nik...@intel.com: /* Please insert explanation on why we need this and what changes if we do this. */ I applied your patches and booted them. I got into PC8, did the PC8 test suite and nothing changed. I really don't know what to expect from this series and/or how to check what's improving. Also, see below: So this is one of these things that will have no visible impact on i915, but will impact other parts of the system. So I think the only way to test it is by throwing it on the SIP board and checking the power level of the components this impacts (Audio, thermal, KBC/EC, LPT). And without the code which does the actual PCI D3 request from i915, nothing will happen. I was told we could try to verify this by reading PCI config register 0xd4, but for me it's always 0x0, which means we're probably not really getting into D3. You have to write 0x3 into the PCI PM register to make it enter D3. Exactly like you do when you suspend the whole machine. Not sure 0xd4 is the correct offset in this case, but you can look it up from lspci output (remember +4), or in kernel code just use pci_dev.pm_cap+4. Hmm, it's 0xd4 in my SNB at least. Maybe it's always the same for the GPU. Check the description of 0xd4 on BSpec. I actually wrote the get into D3 value to it (using setpci), and then when I got out of PC8 I saw tons and tons of error messages on dmesg due to the fact that we were failing to write registers. Which means it probably works, but we may have more work to do. My quick and dirty idea for runtime PM would be something like this: - for ioctls just slap rpm_get_sync/put() around drm_ioctl, not optimal but it's very easy for getting started - all sysfs/debugfs stuff would need to be handled individually - to deal w/ gtt mmaps just call put_fence or something during suspend, grab one ref in fault and probably release it from a delayed work, or mabe record a timestamp at last fault time and check it in the idle callback - grab one ref per request (or per active ring maybe?) - grab one ref per active pipe - maybe some odd delayed work would need another reference With that, I think everything of importance would hold a reference, so the runtime pm idle callback shouldn't really need to do much. -- Ville Syrjälä Intel OTC ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Patches for performance improvement
On Sun, Aug 25, 2013 at 04:35:52AM +, Kannan, Vandana wrote: Hi, For sprite performance improvement, instead of the patch drm/i915: Don't wait for vblank for sprite plane flips, we made use of Chris Wilson's patches from RFC asynchronous vblank tasks and verified that it gives the expected performance boost. Chris's patches have not been merged yet, specifically, drm/i915: Introduce vblank work function drm/i915: Asynchronously unpin the old framebuffer after the next vblank drm/i915/sprite: Make plane switching asynchronous Will the updated patches for these be coming any time soon ? We require these patches as it gives us a very good performance improvement. Please let me know, I can verify with the updated patches. I don't seem to have said patches in my mailbox, so I can't comment on the details, but... My main concern is again the contents of the old fb after the setplane ioctl returns. Due to the current synchronous nature of the ioctl, userspace knows that when the ioctl returns the flip has completed, and so can write the old fb again. If the ioctl is async, userspace has to guess whether the old fb can be safely written. We could solve this with a new event that gets sent from setplane. We do have plenty of flags left in setplane so it's certainly doable. Also with the async ioctl userspace can issue another setplane before the previous flip has completed. I personally think that's a nice feature that allows no-fps-limits triple buffering, but it does make it a bit more difficult to know which fb is actually being used at a given time. I have it solved in my atomic page flip code though, as the drm_flip_helper contraption knows how to track this stuff. Maybe someone should just start pulling in the drm_flip_helper and associated code from the atomic code to make this feasible. The new event type I added there would also be suitable for this purpose as it carries the plane id and information on which fb is freed by the operation. The other option is to just add a new event, and disallow another setplane until the previous has completed (maybe try to generalize the current page flip code a bit to handle sprites as well as primary planes?) -- Ville Syrjälä Intel OTC ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Patches for performance improvement
On Mon, Aug 26, 2013 at 10:16 AM, Ville Syrjälä ville.syrj...@linux.intel.com wrote: Maybe someone should just start pulling in the drm_flip_helper and associated code from the atomic code to make this feasible. The new event type I added there would also be suitable for this purpose as it carries the plane id and information on which fb is freed by the operation. The other option is to just add a new event, and disallow another setplane until the previous has completed (maybe try to generalize the current page flip code a bit to handle sprites as well as primary planes?) I think extending the current pageflip ioctl to planes would be a suitable stop-gap measure until we have nuclear pageflips for real. It's only a 90% solution though since we can't update anything else than the base address with the current pageflip ioctl. Actually I think doing this would be a neat way to start pulling in parts of the nuclear pageflip code and getting all the flip complete/event signalling straigthened out (and properly tested with igt like the primary plane flips). -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 6/6] DRAFT: drm/i915: do adapter power state notification on PC8+ enable/disable
On Mon, Aug 26, 2013 at 9:43 AM, Ville Syrjälä ville.syrj...@linux.intel.com wrote: My quick and dirty idea for runtime PM would be something like this: - for ioctls just slap rpm_get_sync/put() around drm_ioctl, not optimal but it's very easy for getting started - all sysfs/debugfs stuff would need to be handled individually - to deal w/ gtt mmaps just call put_fence or something during suspend, grab one ref in fault and probably release it from a delayed work, or mabe record a timestamp at last fault time and check it in the idle callback - grab one ref per request (or per active ring maybe?) - grab one ref per active pipe - maybe some odd delayed work would need another reference With that, I think everything of importance would hold a reference, so the runtime pm idle callback shouldn't really need to do much. I think we can do this cheaper: - Grab a ref everywhere we want to disallow pc8+ currently (by hooking into the same get/put calls Paulo's patches sprinkled all over). That should cover 100% of all the runtime stuff - Paulo's pc8+ test is rather extensive. That leaves us with the debugfs/sysfs crap doing register access. Cheap and dirty (and likely to make Chris rage) is to add get/put calls with a suitable timeout to the register i/o functions. Less cheap (but I'm unsure whether that's better) would be to sprinkle get/put calls around all relevant sysfs/debugfs files. Pure s/w stuff like the error state wouldn't need it though (and we should try hard not to require it in case runtime pm hangs). -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: More vma fixups around unbind/destroy
The important bugfix here is that we must not unlink the vma when we keep it around as a placeholder for the execbuf code. Since then we won't find it again when execbuf gets interrupt and restarted and create a 2nd vma. And since the code as-is isn't fit yet to deal with more than one vma, hilarity ensues. Specifically the dma map/unmap of the sg table isn't adjusted for multiple vmas yet and will blow up like this: BUG: unable to handle kernel NULL pointer dereference at 0008 IP: [a008fb37] i915_gem_gtt_finish_object+0x73/0xc8 [i915] PGD 56bb5067 PUD ad3dd067 PMD 0 Oops: [#1] SMP Modules linked in: tcp_lp ppdev parport_pc lp parport ipv6 dm_mod dcdbas snd_hda_codec_hdmi pcspkr snd_hda_codec_realtek serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_codec lpc_ich snd_hwdep mfd_core snd_pcm snd_page_alloc snd_timer snd soundcore acpi_cpufreq i915 video button drm_kms_helper drm mperf freq_table CPU: 1 PID: 16650 Comm: fbo-maxsize Not tainted 3.11.0-rc4_nightlytop_d93f59_debug_20130814_+ #6957 Hardware name: Dell Inc. OptiPlex 9010/03JR84, BIOS A01 05/04/2012 task: 8800563b3f00 ti: 88004bdf4000 task.ti: 88004bdf4000 RIP: 0010:[a008fb37] [a008fb37] i915_gem_gtt_finish_object+0x73/0xc8 [i915] RSP: 0018:88004bdf5958 EFLAGS: 00010246 RAX: RBX: 8801135e RCX: 8800ad3bf8e0 RDX: 8800ad3bf8e0 RSI: RDI: 8801007ee780 RBP: 88004bdf5978 R08: 8800ad3bf8e0 R09: R10: 86ca1810 R11: 880036a17101 R12: 8801007ee780 R13: 00018001 R14: 880118c4e000 R15: 8801007ee780 FS: 7f401a0ce740() GS:88011e28() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 0008 CR3: 5635c000 CR4: 001407e0 Stack: 8801007ee780 88005c253180 00018000 8801135e 88004bdf59a8 a0088e55 0011 8801007eec00 00018000 880036a17101 88004bdf5a08 a0089026 Call Trace: [a0088e55] i915_vma_unbind+0xdf/0x1ab [i915] [a0089026] __i915_gem_shrink+0x105/0x177 [i915] [a0089452] i915_gem_object_get_pages_gtt+0x108/0x309 [i915] [a0085ba9] i915_gem_object_get_pages+0x61/0x90 [i915] [a008f22b] ? gen6_ppgtt_insert_entries+0x103/0x125 [i915] [a008a113] i915_gem_object_pin+0x1fa/0x5df [i915] [a008cdfe] i915_gem_execbuffer_reserve_object.isra.6+0x8d/0x1bc [i915] [a008d156] i915_gem_execbuffer_reserve+0x229/0x367 [i915] [a008dbf6] i915_gem_do_execbuffer.isra.12+0x4dc/0xf3a [i915] [810fc823] ? might_fault+0x40/0x90 [a008eb89] i915_gem_execbuffer2+0x187/0x222 [i915] [a000971c] drm_ioctl+0x308/0x442 [drm] [a008ea02] ? i915_gem_execbuffer+0x3ae/0x3ae [i915] [817db156] ? __do_page_fault+0x3dd/0x481 [8112fdba] vfs_ioctl+0x26/0x39 [811306a2] do_vfs_ioctl+0x40e/0x451 [817deda7] ? sysret_check+0x1b/0x56 [8113073c] SyS_ioctl+0x57/0x87 [8135bbfe] ? trace_hardirqs_on_thunk+0x3a/0x3f [817ded82] system_call_fastpath+0x16/0x1b Code: 48 c7 c6 84 30 0e a0 31 c0 e8 d0 e9 f7 ff bf c6 a7 00 00 e8 07 af 2c e1 41 f6 84 24 03 01 00 00 10 75 44 49 8b 84 24 08 01 00 00 8b 50 08 48 8b 30 49 8b 86 b0 04 00 00 48 89 c7 48 81 c7 98 00 RIP [a008fb37] i915_gem_gtt_finish_object+0x73/0xc8 [i915] RSP 88004bdf5958 CR2: 0008 As a consequence we need to change the only one vma for now check in vma_unbind - since vma_destroy isn't always called the obj-vma_list might not be empty. Instead check that the vma list is singular at the beginning of vma_unbind. This is also more symmetric with bind_to_vm. v2: - Add a paranoid WARN to mark_free in the eviction code to make sure we never try to evict a vma used by the execbuf code right now. - Move the check for a temporary execbuf vma into vma_destroy - otherwise the failure path cleanup in bind_to_vm will blow up. Our first attempting at fixing this was commit 1be81a2f2cfd8789a627401d470423358fba2d76 Author: Chris Wilson ch...@chris-wilson.co.uk Date: Tue Aug 20 12:56:40 2013 +0100 drm/i915: Don't destroy the vma placeholder during execbuffer reservation Squash with this when merging! v3: Improvements suggested in Chris' review: - Move the WARN_ON in vma_destroy that checks for vmas with an drm_mm allocation before the early return. - Bail out if we hit the WARN in mark_free to hopefully make the kernel survive for long enough to capture it. Cc: Chris Wilson ch...@chris-wilson.co.uk Cc: Ben Widawsky b...@bwidawsk.net Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68298 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68171 Tested-by: lu hua huax...@intel.com (v2) Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/i915_gem.c | 10 ++
Re: [Intel-gfx] [PATCH] drm/i915: More vma fixups around unbind/destroy
On Mon, Aug 26, 2013 at 11:23:47AM +0200, Daniel Vetter wrote: The important bugfix here is that we must not unlink the vma when we keep it around as a placeholder for the execbuf code. Since then we won't find it again when execbuf gets interrupt and restarted and create a 2nd vma. And since the code as-is isn't fit yet to deal with more than one vma, hilarity ensues. Specifically the dma map/unmap of the sg table isn't adjusted for multiple vmas yet and will blow up like this: BUG: unable to handle kernel NULL pointer dereference at 0008 IP: [a008fb37] i915_gem_gtt_finish_object+0x73/0xc8 [i915] PGD 56bb5067 PUD ad3dd067 PMD 0 Oops: [#1] SMP Modules linked in: tcp_lp ppdev parport_pc lp parport ipv6 dm_mod dcdbas snd_hda_codec_hdmi pcspkr snd_hda_codec_realtek serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_codec lpc_ich snd_hwdep mfd_core snd_pcm snd_page_alloc snd_timer snd soundcore acpi_cpufreq i915 video button drm_kms_helper drm mperf freq_table CPU: 1 PID: 16650 Comm: fbo-maxsize Not tainted 3.11.0-rc4_nightlytop_d93f59_debug_20130814_+ #6957 Hardware name: Dell Inc. OptiPlex 9010/03JR84, BIOS A01 05/04/2012 task: 8800563b3f00 ti: 88004bdf4000 task.ti: 88004bdf4000 RIP: 0010:[a008fb37] [a008fb37] i915_gem_gtt_finish_object+0x73/0xc8 [i915] RSP: 0018:88004bdf5958 EFLAGS: 00010246 RAX: RBX: 8801135e RCX: 8800ad3bf8e0 RDX: 8800ad3bf8e0 RSI: RDI: 8801007ee780 RBP: 88004bdf5978 R08: 8800ad3bf8e0 R09: R10: 86ca1810 R11: 880036a17101 R12: 8801007ee780 R13: 00018001 R14: 880118c4e000 R15: 8801007ee780 FS: 7f401a0ce740() GS:88011e28() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 0008 CR3: 5635c000 CR4: 001407e0 Stack: 8801007ee780 88005c253180 00018000 8801135e 88004bdf59a8 a0088e55 0011 8801007eec00 00018000 880036a17101 88004bdf5a08 a0089026 Call Trace: [a0088e55] i915_vma_unbind+0xdf/0x1ab [i915] [a0089026] __i915_gem_shrink+0x105/0x177 [i915] [a0089452] i915_gem_object_get_pages_gtt+0x108/0x309 [i915] [a0085ba9] i915_gem_object_get_pages+0x61/0x90 [i915] [a008f22b] ? gen6_ppgtt_insert_entries+0x103/0x125 [i915] [a008a113] i915_gem_object_pin+0x1fa/0x5df [i915] [a008cdfe] i915_gem_execbuffer_reserve_object.isra.6+0x8d/0x1bc [i915] [a008d156] i915_gem_execbuffer_reserve+0x229/0x367 [i915] [a008dbf6] i915_gem_do_execbuffer.isra.12+0x4dc/0xf3a [i915] [810fc823] ? might_fault+0x40/0x90 [a008eb89] i915_gem_execbuffer2+0x187/0x222 [i915] [a000971c] drm_ioctl+0x308/0x442 [drm] [a008ea02] ? i915_gem_execbuffer+0x3ae/0x3ae [i915] [817db156] ? __do_page_fault+0x3dd/0x481 [8112fdba] vfs_ioctl+0x26/0x39 [811306a2] do_vfs_ioctl+0x40e/0x451 [817deda7] ? sysret_check+0x1b/0x56 [8113073c] SyS_ioctl+0x57/0x87 [8135bbfe] ? trace_hardirqs_on_thunk+0x3a/0x3f [817ded82] system_call_fastpath+0x16/0x1b Code: 48 c7 c6 84 30 0e a0 31 c0 e8 d0 e9 f7 ff bf c6 a7 00 00 e8 07 af 2c e1 41 f6 84 24 03 01 00 00 10 75 44 49 8b 84 24 08 01 00 00 8b 50 08 48 8b 30 49 8b 86 b0 04 00 00 48 89 c7 48 81 c7 98 00 RIP [a008fb37] i915_gem_gtt_finish_object+0x73/0xc8 [i915] RSP 88004bdf5958 CR2: 0008 As a consequence we need to change the only one vma for now check in vma_unbind - since vma_destroy isn't always called the obj-vma_list might not be empty. Instead check that the vma list is singular at the beginning of vma_unbind. This is also more symmetric with bind_to_vm. v2: - Add a paranoid WARN to mark_free in the eviction code to make sure we never try to evict a vma used by the execbuf code right now. - Move the check for a temporary execbuf vma into vma_destroy - otherwise the failure path cleanup in bind_to_vm will blow up. Our first attempting at fixing this was commit 1be81a2f2cfd8789a627401d470423358fba2d76 Author: Chris Wilson ch...@chris-wilson.co.uk Date: Tue Aug 20 12:56:40 2013 +0100 drm/i915: Don't destroy the vma placeholder during execbuffer reservation Squash with this when merging! v3: Improvements suggested in Chris' review: - Move the WARN_ON in vma_destroy that checks for vmas with an drm_mm allocation before the early return. - Bail out if we hit the WARN in mark_free to hopefully make the kernel survive for long enough to capture it. Cc: Chris Wilson ch...@chris-wilson.co.uk Cc: Ben Widawsky b...@bwidawsk.net Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68298 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68171 Tested-by: lu
Re: [Intel-gfx] [PATCH] drm/i915: More vma fixups around unbind/destroy
On Mon, Aug 26, 2013 at 11:23:47AM +0200, Daniel Vetter wrote: The important bugfix here is that we must not unlink the vma when we keep it around as a placeholder for the execbuf code. Since then we won't find it again when execbuf gets interrupt and restarted and create a 2nd vma. And since the code as-is isn't fit yet to deal with more than one vma, hilarity ensues. Specifically the dma map/unmap of the sg table isn't adjusted for multiple vmas yet and will blow up like this: BUG: unable to handle kernel NULL pointer dereference at 0008 IP: [a008fb37] i915_gem_gtt_finish_object+0x73/0xc8 [i915] PGD 56bb5067 PUD ad3dd067 PMD 0 Oops: [#1] SMP Modules linked in: tcp_lp ppdev parport_pc lp parport ipv6 dm_mod dcdbas snd_hda_codec_hdmi pcspkr snd_hda_codec_realtek serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_codec lpc_ich snd_hwdep mfd_core snd_pcm snd_page_alloc snd_timer snd soundcore acpi_cpufreq i915 video button drm_kms_helper drm mperf freq_table CPU: 1 PID: 16650 Comm: fbo-maxsize Not tainted 3.11.0-rc4_nightlytop_d93f59_debug_20130814_+ #6957 Hardware name: Dell Inc. OptiPlex 9010/03JR84, BIOS A01 05/04/2012 task: 8800563b3f00 ti: 88004bdf4000 task.ti: 88004bdf4000 RIP: 0010:[a008fb37] [a008fb37] i915_gem_gtt_finish_object+0x73/0xc8 [i915] RSP: 0018:88004bdf5958 EFLAGS: 00010246 RAX: RBX: 8801135e RCX: 8800ad3bf8e0 RDX: 8800ad3bf8e0 RSI: RDI: 8801007ee780 RBP: 88004bdf5978 R08: 8800ad3bf8e0 R09: R10: 86ca1810 R11: 880036a17101 R12: 8801007ee780 R13: 00018001 R14: 880118c4e000 R15: 8801007ee780 FS: 7f401a0ce740() GS:88011e28() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 0008 CR3: 5635c000 CR4: 001407e0 Stack: 8801007ee780 88005c253180 00018000 8801135e 88004bdf59a8 a0088e55 0011 8801007eec00 00018000 880036a17101 88004bdf5a08 a0089026 Call Trace: [a0088e55] i915_vma_unbind+0xdf/0x1ab [i915] [a0089026] __i915_gem_shrink+0x105/0x177 [i915] [a0089452] i915_gem_object_get_pages_gtt+0x108/0x309 [i915] [a0085ba9] i915_gem_object_get_pages+0x61/0x90 [i915] [a008f22b] ? gen6_ppgtt_insert_entries+0x103/0x125 [i915] [a008a113] i915_gem_object_pin+0x1fa/0x5df [i915] [a008cdfe] i915_gem_execbuffer_reserve_object.isra.6+0x8d/0x1bc [i915] [a008d156] i915_gem_execbuffer_reserve+0x229/0x367 [i915] [a008dbf6] i915_gem_do_execbuffer.isra.12+0x4dc/0xf3a [i915] [810fc823] ? might_fault+0x40/0x90 [a008eb89] i915_gem_execbuffer2+0x187/0x222 [i915] [a000971c] drm_ioctl+0x308/0x442 [drm] [a008ea02] ? i915_gem_execbuffer+0x3ae/0x3ae [i915] [817db156] ? __do_page_fault+0x3dd/0x481 [8112fdba] vfs_ioctl+0x26/0x39 [811306a2] do_vfs_ioctl+0x40e/0x451 [817deda7] ? sysret_check+0x1b/0x56 [8113073c] SyS_ioctl+0x57/0x87 [8135bbfe] ? trace_hardirqs_on_thunk+0x3a/0x3f [817ded82] system_call_fastpath+0x16/0x1b Code: 48 c7 c6 84 30 0e a0 31 c0 e8 d0 e9 f7 ff bf c6 a7 00 00 e8 07 af 2c e1 41 f6 84 24 03 01 00 00 10 75 44 49 8b 84 24 08 01 00 00 8b 50 08 48 8b 30 49 8b 86 b0 04 00 00 48 89 c7 48 81 c7 98 00 RIP [a008fb37] i915_gem_gtt_finish_object+0x73/0xc8 [i915] RSP 88004bdf5958 CR2: 0008 As a consequence we need to change the only one vma for now check in vma_unbind - since vma_destroy isn't always called the obj-vma_list might not be empty. Instead check that the vma list is singular at the beginning of vma_unbind. This is also more symmetric with bind_to_vm. v2: - Add a paranoid WARN to mark_free in the eviction code to make sure we never try to evict a vma used by the execbuf code right now. - Move the check for a temporary execbuf vma into vma_destroy - otherwise the failure path cleanup in bind_to_vm will blow up. Our first attempting at fixing this was commit 1be81a2f2cfd8789a627401d470423358fba2d76 Author: Chris Wilson ch...@chris-wilson.co.uk Date: Tue Aug 20 12:56:40 2013 +0100 drm/i915: Don't destroy the vma placeholder during execbuffer reservation Squash with this when merging! v3: Improvements suggested in Chris' review: - Move the WARN_ON in vma_destroy that checks for vmas with an drm_mm allocation before the early return. - Bail out if we hit the WARN in mark_free to hopefully make the kernel survive for long enough to capture it. Cc: Chris Wilson ch...@chris-wilson.co.uk Cc: Ben Widawsky b...@bwidawsk.net Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68298 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68171 Tested-by: lu
[Intel-gfx] sysfs: i915_cur_delayinfo is always twice as gt_cur_freq_mhz in 3.8.13.4?
Hi all, I'm using haswell with linux-3.8.13.4: http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=commitdiff;h=d1baa1360260b5a01938674fc518109a4e5a148d and I find the CAGF showed by i915_cur_delayinfo is always twice as gt_cur_freq_mhz, e.g., when the system is idle, I can get the below log: - Log begins # cat /sys/kernel/debug/dri/0/i915_cur_delayinfo; echo; cat /sys/devices/pci:00/:00:02.0/drm/card0/gt_cur_freq_mhz; echo ; GT_PERF_STATUS: 0x0800 RPSTAT1: 0x00040400 Render p-state ratio: 8 Render p-state VID: 0 Render p-state limit: 255 CAGF: 400MHz RP CUR UP EI: 55890us RP CUR UP: 276us RP PREV UP: 0us RP CUR DOWN EI: 94295us RP CUR DOWN: 389us RP PREV DOWN: 0us Lowest (RPN) frequency: 200MHz Nominal (RP1) frequency: 400MHz Max non-overclocked (RP0) frequency: 1200MHz 200 - Log ends. But with the latest 3.10.0, both values are 200, as expected. I guess there is a fix in 3.10.0 I can pick for my 3.8.13.4? Thanks, -- Dexuan ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Delay the relase of the forcewake by a jiffie
On Sun, Aug 25, 2013 at 08:45:34PM +0100, Chris Wilson wrote: Obtaining the forcwake requires expensive and time consuming serialisation. And we often try to obtain the forcewake multiple times in very quick succession. We can reduce the overhead of these sequences by delaying the forcewake release, and so not hammer the hw quite so hard. I was hoping this would help with the spurious [drm:__gen6_gt_force_wake_mt_get] *ERROR* Timed out waiting for forcewake old ack to clear. found on Haswell. Alas not. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_dma.c | 2 ++ drivers/gpu/drm/i915/i915_drv.h | 3 +++ drivers/gpu/drm/i915/intel_uncore.c | 33 +++-- 3 files changed, 36 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 883990f..c0fb23e 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1800,6 +1800,8 @@ int i915_driver_unload(struct drm_device *dev) dev_priv-gtt.base.cleanup(dev_priv-gtt.base); + intel_uncore_fini(dev); + We've already unmapped the registers. if (dev_priv-slab) kmem_cache_destroy(dev_priv-slab); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a6354c3..8c93d93 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -406,6 +406,8 @@ struct intel_uncore { unsigned fifo_count; unsigned forcewake_count; + + struct delayed_work force_wake_work; }; #define DEV_INFO_FOR_EACH_FLAG(func, sep) \ @@ -1792,6 +1794,7 @@ extern void intel_uncore_early_sanitize(struct drm_device *dev); extern void intel_uncore_init(struct drm_device *dev); extern void intel_uncore_clear_errors(struct drm_device *dev); extern void intel_uncore_check_errors(struct drm_device *dev); +extern void intel_uncore_fini(struct drm_device *dev); void i915_enable_pipestat(drm_i915_private_t *dev_priv, int pipe, u32 mask); diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 8f5bc86..50fdad7 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -204,6 +204,18 @@ static void vlv_force_wake_put(struct drm_i915_private *dev_priv) gen6_gt_check_fifodbg(dev_priv); } +static void gen6_force_wake_work(struct work_struct *work) +{ + struct drm_i915_private *dev_priv = + container_of(work, typeof(*dev_priv), uncore.force_wake_work.work); + unsigned long irqflags; + + spin_lock_irqsave(dev_priv-uncore.lock, irqflags); + if (--dev_priv-uncore.forcewake_count == 0) + dev_priv-uncore.funcs.force_wake_put(dev_priv); + spin_unlock_irqrestore(dev_priv-uncore.lock, irqflags); +} + void intel_uncore_early_sanitize(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; @@ -216,6 +228,9 @@ void intel_uncore_init(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; + INIT_DELAYED_WORK(dev_priv-uncore.force_wake_work, + gen6_force_wake_work); + if (IS_VALLEYVIEW(dev)) { dev_priv-uncore.funcs.force_wake_get = vlv_force_wake_get; dev_priv-uncore.funcs.force_wake_put = vlv_force_wake_put; @@ -261,6 +276,13 @@ void intel_uncore_init(struct drm_device *dev) } } +void intel_uncore_fini(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev-dev_private; + + cancel_delayed_work_sync(dev_priv-uncore.force_wake_work); Aren't we potentially leaking a forcewake here? Or do we clear it forcefully somewhere else? +} + void intel_uncore_sanitize(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; @@ -301,8 +323,15 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv) unsigned long irqflags; spin_lock_irqsave(dev_priv-uncore.lock, irqflags); - if (--dev_priv-uncore.forcewake_count == 0) - dev_priv-uncore.funcs.force_wake_put(dev_priv); + if (--dev_priv-uncore.forcewake_count == 0) { + if (dev_priv-uncore.force_wake_work.work.func) { + dev_priv-uncore.forcewake_count++; + mod_delayed_work(dev_priv-wq, + dev_priv-uncore.force_wake_work, + 1); + } else + dev_priv-uncore.funcs.force_wake_put(dev_priv); + } spin_unlock_irqrestore(dev_priv-uncore.lock, irqflags); } -- 1.8.4.rc3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Ville Syrjälä Intel OTC
[Intel-gfx] [PATCH 2/2] drm/i915: Tweak RPS thresholds to more aggressively downclock
After applying wait-boost we often find ourselves stuck at higher clocks than required. The current threshold value requires the GPU to be continuously and completely idle for 313ms before it is dropped by one bin. Conversely, we require the GPU to be busy for an average of 90% over a 84ms period before we upclock. So the current thresholds almost never downclock the GPU, and respond very slowly to sudden demands for more power. It is easy to observe that we currently lock into the wrong bin and both underperform in benchmarks and consume more power than optimal (just by repeating the task and measuring the different results). An alternative approach, as discussed in the bspec, is to use a continuous threshold for upclocking, and an average value for downclocking. This is good for quickly detecting and reacting to state changes within a frame, however it fails with the common throttling method of waiting upon the outstanding frame - at least it is difficult to choose a threshold that works well at 15,000fps and at 60fps. So continue to use average busy/idle loads to determine frequency change. v2: Use 3 power zones to keep frequencies low in steady-state mostly idle (e.g. scrolling, interactive 2D drawing), and frequencies high for demanding games. In between those end-states, we use a fast-reclocking algorithm to converge more quickly on the desired bin. v3: Bug fixes - make sure we reset adj after switching power zones. v4: Tune - drop the continuous busy thresholds as it prevents us from choosing the right frequency for glxgears style swap benchmarks. Instead the goal is to be able to find the right clocks irrespective of the wait-boost. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Kenneth Graunke kenn...@whitecape.org Cc: Stéphane Marchesin stephane.marche...@gmail.com Cc: Owen Taylor otay...@redhat.com Cc: Meng, Mengmeng mengmeng.m...@intel.com Cc: Zhuang, Lena lena.zhu...@intel.com --- drivers/gpu/drm/i915/i915_drv.h | 5 ++ drivers/gpu/drm/i915/i915_irq.c | 46 ++ drivers/gpu/drm/i915/i915_reg.h | 2 +- drivers/gpu/drm/i915/intel_pm.c | 135 ++-- 4 files changed, 141 insertions(+), 47 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 8993a0b..a6354c3 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -835,8 +835,13 @@ struct intel_gen6_power_mgmt { u8 min_delay; u8 max_delay; u8 rpe_delay; + u8 rp1_delay; + u8 rp0_delay; u8 hw_max; + int last_adj; + enum { LOW_POWER, BETWEEN, HIGH_POWER } power; + struct delayed_work delayed_resume_work; /* diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 875125f..5072a61 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -812,7 +812,7 @@ static void gen6_pm_rps_work(struct work_struct *work) drm_i915_private_t *dev_priv = container_of(work, drm_i915_private_t, rps.work); u32 pm_iir; - u8 new_delay; + int new_delay, adj; spin_lock_irq(dev_priv-irq_lock); pm_iir = dev_priv-rps.pm_iir; @@ -829,29 +829,49 @@ static void gen6_pm_rps_work(struct work_struct *work) mutex_lock(dev_priv-rps.hw_lock); + adj = dev_priv-rps.last_adj; if (pm_iir GEN6_PM_RP_UP_THRESHOLD) { - new_delay = dev_priv-rps.cur_delay + 1; + if (adj 0) + adj *= 2; + else + adj = 1; + new_delay = dev_priv-rps.cur_delay + adj; /* * For better performance, jump directly * to RPe if we're below it. */ - if (IS_VALLEYVIEW(dev_priv-dev) - dev_priv-rps.cur_delay dev_priv-rps.rpe_delay) + if (new_delay dev_priv-rps.rpe_delay) + new_delay = dev_priv-rps.rpe_delay; + } else if (pm_iir GEN6_PM_RP_DOWN_TIMEOUT) { + if (dev_priv-rps.cur_delay dev_priv-rps.rpe_delay) new_delay = dev_priv-rps.rpe_delay; - } else - new_delay = dev_priv-rps.cur_delay - 1; + else + new_delay = dev_priv-rps.min_delay; + adj = 0; + } else if (pm_iir GEN6_PM_RP_DOWN_THRESHOLD) { + if (adj 0) + adj *= 2; + else + adj = -1; + new_delay = dev_priv-rps.cur_delay + adj; + } else { /* unknown event */ + new_delay = dev_priv-rps.cur_delay; + } /* sysfs frequency interfaces may have snuck in while servicing the * interrupt */ - if (new_delay = dev_priv-rps.min_delay - new_delay = dev_priv-rps.max_delay) { -
[Intel-gfx] [PATCH 1/2] drm/i915: Boost RPS frequency for CPU stalls
If we encounter a situation where the CPU blocks waiting for results from the GPU, give the GPU a kick to boost its the frequency. This should work to reduce user interface stalls and to quickly promote mesa to high frequencies - but the cost is that our requested frequency stalls high (as we do not idle for long enough before rc6 to start reducing frequencies, nor are we aggressive at down clocking an underused GPU). However, this should be mitigated by rc6 itself powering off the GPU when idle, and that energy use is dependent upon the workload of the GPU in addition to its frequency (e.g. the math or sampler functions only consume power when used). Still, this is likely to adversely affect light workloads. In particular, this nearly eliminates the highly noticeable wake-up lag in animations from idle. For example, expose or workspace transitions. (However, given the situation where we fail to downclock, our requested frequency is almost always the maximum, except for Baytrail where we manually downclock upon idling. This often masks the latency of upclocking after being idle, so animations are typically smooth - at the cost of increased power consumption.) Stéphane raised the concern that this will punish good applications and reward bad applications - but due to the nature of how mesa performs its client throttling, I believe all mesa applications will be roughly equally affected. To address this concern, and to prevent applications like compositors from regularly boosting the RPS state, we only allow each client to receive one boost in each period of activity. Unfortunately, this techinique is ineffective with Ironlake - which also has dynamic render power states and suffers just as dramatically. For Ironlake, the thermal/power headroom is shared with the CPU through Intelligent Power Sharing and the intel-ips module. This leaves us with no GPU boost frequencies available when coming out of idle, and due to hardware limitations we cannot change the arbitration between the CPU and GPU quickly enough to be effective. v2: Limit each client to receiving a single boost for each active period. v3: Tidy up. Allow the device to always boost if it waits outside of client context. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Kenneth Graunke kenn...@whitecape.org Cc: Stéphane Marchesin stephane.marche...@gmail.com Cc: Owen Taylor otay...@redhat.com Cc: Meng, Mengmeng mengmeng.m...@intel.com Cc: Zhuang, Lena lena.zhu...@intel.com --- drivers/gpu/drm/i915/i915_dma.c | 16 ++--- drivers/gpu/drm/i915/i915_drv.h | 19 +++-- drivers/gpu/drm/i915/i915_gem.c | 136 --- drivers/gpu/drm/i915/i915_irq.c | 11 --- drivers/gpu/drm/i915/intel_display.c | 3 + drivers/gpu/drm/i915/intel_drv.h | 3 + drivers/gpu/drm/i915/intel_pm.c | 42 ++- 7 files changed, 139 insertions(+), 91 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 99159c9..883990f 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1811,19 +1811,11 @@ int i915_driver_unload(struct drm_device *dev) int i915_driver_open(struct drm_device *dev, struct drm_file *file) { - struct drm_i915_file_private *file_priv; - - DRM_DEBUG_DRIVER(\n); - file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL); - if (!file_priv) - return -ENOMEM; - - file-driver_priv = file_priv; - - spin_lock_init(file_priv-mm.lock); - INIT_LIST_HEAD(file_priv-mm.request_list); + int ret; - idr_init(file_priv-context_idr); + ret = i915_gem_open(dev, file); + if (ret) + return ret; return 0; } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index f626271..8993a0b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -829,9 +829,6 @@ struct intel_gen6_power_mgmt { struct work_struct work; u32 pm_iir; - /* On vlv we need to manually drop to Vmin with a delayed work. */ - struct delayed_work vlv_work; - /* The below variables an all the rps hw state are protected by * dev-struct mutext. */ u8 cur_delay; @@ -947,6 +944,15 @@ struct i915_gem_mm { struct delayed_work retire_work; /** +* When we detect an idle GPU, we want to turn on +* powersaving features. So once we see that there +* are no more requests outstanding and no more +* arrive within a small period of time, we fire +* off the idle_work. +*/ + struct delayed_work idle_work; + + /** * Are we in a non-interruptible section of code like * modesetting? */ @@ -1570,13 +1576,17 @@ struct drm_i915_gem_request { }; struct drm_i915_file_private { + struct drm_i915_private *dev_priv; + struct { spinlock_t lock;
[Intel-gfx] [PATCH] quick_dump/hsw: support haswell in device auto-detection
From: Mengdong Lin mengdong@intel.com This patch exposes is_haswell() to python, to be used by device auto-detection. Signed-off-by: Mengdong Lin mengdong@intel.com diff --git a/tools/quick_dump/chipset.i b/tools/quick_dump/chipset.i index 69dc8df..4b13487 100644 --- a/tools/quick_dump/chipset.i +++ b/tools/quick_dump/chipset.i @@ -7,6 +7,7 @@ extern int is_sandybridge(unsigned short pciid); extern int is_ivybridge(unsigned short pciid); extern int is_valleyview(unsigned short pciid); +extern int is_haswell(unsigned short pciid); extern struct pci_device *intel_get_pci_device(); extern int intel_register_access_init(struct pci_device *pci_dev, int safe); extern uint32_t intel_register_read(uint32_t reg); @@ -19,6 +20,7 @@ extern uint32_t intel_dpio_reg_read(uint32_t reg); extern int is_sandybridge(unsigned short pciid); extern int is_ivybridge(unsigned short pciid); +extern int is_haswell(unsigned short pciid); extern int is_valleyview(unsigned short pciid); extern struct pci_device *intel_get_pci_device(); extern int intel_register_access_init(struct pci_device *pci_dev, int safe); -- 1.8.1.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Patches for performance improvement
On Mon, Aug 26, 2013 at 11:16:40AM +0300, Ville Syrjälä wrote: On Sun, Aug 25, 2013 at 04:35:52AM +, Kannan, Vandana wrote: Hi, For sprite performance improvement, instead of the patch drm/i915: Don't wait for vblank for sprite plane flips, we made use of Chris Wilson's patches from RFC asynchronous vblank tasks and verified that it gives the expected performance boost. Chris's patches have not been merged yet, specifically, drm/i915: Introduce vblank work function drm/i915: Asynchronously unpin the old framebuffer after the next vblank drm/i915/sprite: Make plane switching asynchronous Will the updated patches for these be coming any time soon ? We require these patches as it gives us a very good performance improvement. Please let me know, I can verify with the updated patches. I don't seem to have said patches in my mailbox, so I can't comment on the details, but... My main concern is again the contents of the old fb after the setplane ioctl returns. Due to the current synchronous nature of the ioctl, userspace knows that when the ioctl returns the flip has completed, and so can write the old fb again. If the ioctl is async, userspace has to guess whether the old fb can be safely written. We could solve this with a new event that gets sent from setplane. We do have plenty of flags left in setplane so it's certainly doable. The patches were originally motivated because the setplane ioctl was doing the opposite of the userspace expectations. Its user is quite happy to have the ioctl pretend it completed immediately without an event. (Okay, not happy, but the whole setup just ignores the complexity by have a small ringbuffer of frames and hoping for the best [mplayer].) The issue and discussion is as old as the original patch for setplane. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Delay the relase of the forcewake by a jiffie
Obtaining the forcwake requires expensive and time consuming serialisation. And we often try to obtain the forcewake multiple times in very quick succession. We can reduce the overhead of these sequences by delaying the forcewake release, and so not hammer the hw quite so hard. I was hoping this would help with the spurious [drm:__gen6_gt_force_wake_mt_get] *ERROR* Timed out waiting for forcewake old ack to clear. found on Haswell. Alas not. v2: Fix teardown ordering - unmap the regs after turning off forcewake, and make sure we do turn off forcewake - both found by Ville. Note: I have no claims for improved performance, stablity or power comsumption for this patch. We should not be hitting the registers often enough for this to improve benchmarks, but given the nature of our hw it is likely to improve long term stability. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_dma.c | 6 -- drivers/gpu/drm/i915/i915_drv.h | 3 +++ drivers/gpu/drm/i915/intel_uncore.c | 30 -- 3 files changed, 35 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 883990f..97a6e22 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1787,8 +1787,6 @@ int i915_driver_unload(struct drm_device *dev) list_del(dev_priv-gtt.base.global_link); WARN_ON(!list_empty(dev_priv-vm_list)); drm_mm_takedown(dev_priv-gtt.base.mm); - if (dev_priv-regs != NULL) - pci_iounmap(dev-pdev, dev_priv-regs); drm_vblank_cleanup(dev); @@ -1800,6 +1798,10 @@ int i915_driver_unload(struct drm_device *dev) dev_priv-gtt.base.cleanup(dev_priv-gtt.base); + intel_uncore_fini(dev); + if (dev_priv-regs != NULL) + pci_iounmap(dev-pdev, dev_priv-regs); + if (dev_priv-slab) kmem_cache_destroy(dev_priv-slab); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a6354c3..8c93d93 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -406,6 +406,8 @@ struct intel_uncore { unsigned fifo_count; unsigned forcewake_count; + + struct delayed_work force_wake_work; }; #define DEV_INFO_FOR_EACH_FLAG(func, sep) \ @@ -1792,6 +1794,7 @@ extern void intel_uncore_early_sanitize(struct drm_device *dev); extern void intel_uncore_init(struct drm_device *dev); extern void intel_uncore_clear_errors(struct drm_device *dev); extern void intel_uncore_check_errors(struct drm_device *dev); +extern void intel_uncore_fini(struct drm_device *dev); void i915_enable_pipestat(drm_i915_private_t *dev_priv, int pipe, u32 mask); diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 8649f1c..462cc7f 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -204,6 +204,18 @@ static void vlv_force_wake_put(struct drm_i915_private *dev_priv) gen6_gt_check_fifodbg(dev_priv); } +static void gen6_force_wake_work(struct work_struct *work) +{ + struct drm_i915_private *dev_priv = + container_of(work, typeof(*dev_priv), uncore.force_wake_work.work); + unsigned long irqflags; + + spin_lock_irqsave(dev_priv-uncore.lock, irqflags); + if (--dev_priv-uncore.forcewake_count == 0) + dev_priv-uncore.funcs.force_wake_put(dev_priv); + spin_unlock_irqrestore(dev_priv-uncore.lock, irqflags); +} + void intel_uncore_early_sanitize(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; @@ -216,6 +228,9 @@ void intel_uncore_init(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; + INIT_DELAYED_WORK(dev_priv-uncore.force_wake_work, + gen6_force_wake_work); + if (IS_VALLEYVIEW(dev)) { dev_priv-uncore.funcs.force_wake_get = vlv_force_wake_get; dev_priv-uncore.funcs.force_wake_put = vlv_force_wake_put; @@ -261,6 +276,13 @@ void intel_uncore_init(struct drm_device *dev) } } +void intel_uncore_fini(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev-dev_private; + + flush_delayed_work(dev_priv-uncore.force_wake_work); +} + static void intel_uncore_forcewake_reset(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; @@ -306,8 +328,12 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv) unsigned long irqflags; spin_lock_irqsave(dev_priv-uncore.lock, irqflags); - if (--dev_priv-uncore.forcewake_count == 0) - dev_priv-uncore.funcs.force_wake_put(dev_priv); + if (--dev_priv-uncore.forcewake_count == 0) { + dev_priv-uncore.forcewake_count++; + mod_delayed_work(dev_priv-wq, +
Re: [Intel-gfx] [PATCH] drm/i915: Delay the relase of the forcewake by a jiffie
On Mon, Aug 26, 2013 at 12:06:43PM +0100, Chris Wilson wrote: Obtaining the forcwake requires expensive and time consuming serialisation. And we often try to obtain the forcewake multiple times in very quick succession. We can reduce the overhead of these sequences by delaying the forcewake release, and so not hammer the hw quite so hard. I was hoping this would help with the spurious [drm:__gen6_gt_force_wake_mt_get] *ERROR* Timed out waiting for forcewake old ack to clear. found on Haswell. Alas not. v2: Fix teardown ordering - unmap the regs after turning off forcewake, and make sure we do turn off forcewake - both found by Ville. Note: I have no claims for improved performance, stablity or power comsumption for this patch. We should not be hitting the registers often enough for this to improve benchmarks, but given the nature of our hw it is likely to improve long term stability. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_dma.c | 6 -- drivers/gpu/drm/i915/i915_drv.h | 3 +++ drivers/gpu/drm/i915/intel_uncore.c | 30 -- 3 files changed, 35 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 883990f..97a6e22 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1787,8 +1787,6 @@ int i915_driver_unload(struct drm_device *dev) list_del(dev_priv-gtt.base.global_link); WARN_ON(!list_empty(dev_priv-vm_list)); drm_mm_takedown(dev_priv-gtt.base.mm); - if (dev_priv-regs != NULL) - pci_iounmap(dev-pdev, dev_priv-regs); drm_vblank_cleanup(dev); @@ -1800,6 +1798,10 @@ int i915_driver_unload(struct drm_device *dev) dev_priv-gtt.base.cleanup(dev_priv-gtt.base); + intel_uncore_fini(dev); + if (dev_priv-regs != NULL) + pci_iounmap(dev-pdev, dev_priv-regs); + if (dev_priv-slab) kmem_cache_destroy(dev_priv-slab); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a6354c3..8c93d93 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -406,6 +406,8 @@ struct intel_uncore { unsigned fifo_count; unsigned forcewake_count; + + struct delayed_work force_wake_work; }; #define DEV_INFO_FOR_EACH_FLAG(func, sep) \ @@ -1792,6 +1794,7 @@ extern void intel_uncore_early_sanitize(struct drm_device *dev); extern void intel_uncore_init(struct drm_device *dev); extern void intel_uncore_clear_errors(struct drm_device *dev); extern void intel_uncore_check_errors(struct drm_device *dev); +extern void intel_uncore_fini(struct drm_device *dev); void i915_enable_pipestat(drm_i915_private_t *dev_priv, int pipe, u32 mask); diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 8649f1c..462cc7f 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -204,6 +204,18 @@ static void vlv_force_wake_put(struct drm_i915_private *dev_priv) gen6_gt_check_fifodbg(dev_priv); } +static void gen6_force_wake_work(struct work_struct *work) +{ + struct drm_i915_private *dev_priv = + container_of(work, typeof(*dev_priv), uncore.force_wake_work.work); + unsigned long irqflags; + + spin_lock_irqsave(dev_priv-uncore.lock, irqflags); + if (--dev_priv-uncore.forcewake_count == 0) + dev_priv-uncore.funcs.force_wake_put(dev_priv); + spin_unlock_irqrestore(dev_priv-uncore.lock, irqflags); +} + void intel_uncore_early_sanitize(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; @@ -216,6 +228,9 @@ void intel_uncore_init(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; + INIT_DELAYED_WORK(dev_priv-uncore.force_wake_work, + gen6_force_wake_work); + if (IS_VALLEYVIEW(dev)) { dev_priv-uncore.funcs.force_wake_get = vlv_force_wake_get; dev_priv-uncore.funcs.force_wake_put = vlv_force_wake_put; @@ -261,6 +276,13 @@ void intel_uncore_init(struct drm_device *dev) } } +void intel_uncore_fini(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev-dev_private; + + flush_delayed_work(dev_priv-uncore.force_wake_work); A bit orthogonal, but maybe we should also add 'WARN_ON(dev_priv-uncore.forcewake_count)' just to make sure we don't leave forcewake enabled by accident? Or if we're really paranoid, maybe even 'if (WARN_ON(...)) force_wake_reset()'. But anyways, the patch looks like it should do what it claims, and I can't see other bugs, so: Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com +} + static void intel_uncore_forcewake_reset(struct drm_device *dev) { struct drm_i915_private
Re: [Intel-gfx] [PATCH v2] i915: Update VGA arbiter support for newer devices
On Sat, Aug 24, 2013 at 08:53:40AM -0600, Alex Williamson wrote: This is intended to add VGA arbiter support for Intel HD graphics on Core processors. The old GMCH registers no longer exist, so even though it appears that i915 participates in VGA arbitration, it doesn't work. On Intel HD graphics we already attempt to disable VGA regions of the device. This makes registering as a VGA client unnecessary since we don't intend to operate differently depending on how many VGA devices are present. We can disable VGA memory regions by clearing the memory enable bit in the VGA MSR. That only leaves VGA IO, which we update the VGA arbiter to know that we don't participate in VGA memory arbitration. We also add a hook on unload to re-enable memory and reinstate VGA memory arbitration. Signed-off-by: Alex Williamson alex.william...@redhat.com --- v2: I915_READ/WRITE accessors don't work in i915_disable_vga, use inb/outb directly. Also, on the driver unbind VGA enable path, acquire legacy IO to re-enable VGA memory. Correct comment. As with v1, this depends on vgaarb: Fixes for partial VGA opt-out. With all patches I'm able to assign a discrete PEG VGA device to a guest and execute the VBIOS w/o interference from IGD or corruption of the IGD framebuffer. I glanced through those before, but I'll try to do a more thorough review soon. drivers/gpu/drm/i915/i915_dma.c |9 ++--- drivers/gpu/drm/i915/intel_display.c | 24 2 files changed, 30 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index f466980..d9cf216 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1287,9 +1287,12 @@ static int i915_load_modeset_init(struct drm_device *dev) * then we do not take part in VGA arbitration and the * vga_client_register() fails with -ENODEV. */ - ret = vga_client_register(dev-pdev, dev, NULL, i915_vga_set_decode); - if (ret ret != -ENODEV) - goto out; + if (!HAS_PCH_SPLIT(dev)) { + ret = vga_client_register(dev-pdev, dev, NULL, + i915_vga_set_decode); + if (ret ret != -ENODEV) + goto out; + } I think we should do something to fix i915_redisable_vga() for !pch platforms too. The MSR method should work for everything, so I'm leaning towards using it all the time. The other option would be add an mmio codepath into intel_disable_vga(), but then we have several codepaths to worry about. intel_register_dsm_handler(); diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 5fb3058..2bb805c 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -9519,6 +9519,15 @@ static void i915_disable_vga(struct drm_device *dev) outb(SR01, VGA_SR_INDEX); sr1 = inb(VGA_SR_DATA); outb(sr1 | 15, VGA_SR_DATA); + + /* Disable VGA memory on Intel HD */ + if (HAS_PCH_SPLIT(dev)) { + outb(inb(VGA_MSR_READ) ~VGA_MSR_MEM_EN, VGA_MSR_WRITE); + vga_set_legacy_decoding(dev-pdev, VGA_RSRC_LEGACY_IO | +VGA_RSRC_NORMAL_IO | +VGA_RSRC_NORMAL_MEM); + } + vga_put(dev-pdev, VGA_RSRC_LEGACY_IO); udelay(300); @@ -9526,6 +9535,19 @@ static void i915_disable_vga(struct drm_device *dev) POSTING_READ(vga_reg); } +static void i915_enable_vga(struct drm_device *dev) +{ Not sure we need this actually. We don't re-enable the VGA display on unload, so you'll never see anything on the display anyway. But then again, I don't see any real harm in doing this either. + /* Enable VGA memory on Intel HD */ + if (HAS_PCH_SPLIT(dev)) { + vga_get_uninterruptible(dev-pdev, VGA_RSRC_LEGACY_IO); + outb(inb(VGA_MSR_READ) | VGA_MSR_MEM_EN, VGA_MSR_WRITE); I was thinking we might save the MSR state during driver load and restore to that same state, but it could be a bit overkill. + vga_set_legacy_decoding(dev-pdev, VGA_RSRC_LEGACY_MASK | Maybe spell out IO and MEM for legacy too. Otherwise it looks a bit cryptic. +VGA_RSRC_NORMAL_IO | +VGA_RSRC_NORMAL_MEM); + vga_put(dev-pdev, VGA_RSRC_LEGACY_IO); + } +} + void intel_modeset_init_hw(struct drm_device *dev) { intel_init_power_well(dev); @@ -9983,6 +10005,8 @@ void intel_modeset_cleanup(struct drm_device *dev) intel_disable_fbc(dev); + i915_enable_vga(dev); + intel_disable_gt_powersave(dev); ironlake_teardown_rc6(dev); -- Ville Syrjälä Intel OTC ___ Intel-gfx
[Intel-gfx] [PATCH] drm/i915: Initialise min/max frequencies before updating RPS registers
The RPS register writing routines use the current value of min/max to set certain limits and interrupt gating. If we set those afterwards, we risk setting up the hw incorrectly and losing power management events, and worse, trigger some internal assertions. Reorder the calling sequences to be correct, and remove the then unrequired clamping from inside set_rps(). And for a bonus, fix the bug of calling gen6_set_rps() from Valleyview. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_debugfs.c | 2 +- drivers/gpu/drm/i915/i915_sysfs.c | 16 drivers/gpu/drm/i915/intel_pm.c | 19 +-- 3 files changed, 14 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 2a276c8..b2b1730 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2113,7 +2113,7 @@ i915_max_freq_set(void *data, u64 val) if (IS_VALLEYVIEW(dev)) { val = vlv_freq_opcode(dev_priv-mem_freq, val); dev_priv-rps.max_delay = val; - gen6_set_rps(dev, val); + valleyview_set_rps(dev, val); } else { do_div(val, GT_FREQUENCY_MULTIPLIER); dev_priv-rps.max_delay = val; diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c index c8c4112..05195c0 100644 --- a/drivers/gpu/drm/i915/i915_sysfs.c +++ b/drivers/gpu/drm/i915/i915_sysfs.c @@ -294,15 +294,15 @@ static ssize_t gt_max_freq_mhz_store(struct device *kdev, DRM_DEBUG(User requested overclocking to %d\n, val * GT_FREQUENCY_MULTIPLIER); + dev_priv-rps.max_delay = val; + if (dev_priv-rps.cur_delay val) { - if (IS_VALLEYVIEW(dev_priv-dev)) - valleyview_set_rps(dev_priv-dev, val); + if (IS_VALLEYVIEW(dev)) + valleyview_set_rps(dev, val); else - gen6_set_rps(dev_priv-dev, val); + gen6_set_rps(dev, val); } - dev_priv-rps.max_delay = val; - mutex_unlock(dev_priv-rps.hw_lock); return count; @@ -359,15 +359,15 @@ static ssize_t gt_min_freq_mhz_store(struct device *kdev, return -EINVAL; } + dev_priv-rps.min_delay = val; + if (dev_priv-rps.cur_delay val) { if (IS_VALLEYVIEW(dev)) valleyview_set_rps(dev, val); else - gen6_set_rps(dev_priv-dev, val); + gen6_set_rps(dev, val); } - dev_priv-rps.min_delay = val; - mutex_unlock(dev_priv-rps.hw_lock); return count; diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 6767e2b..f995dda 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -3263,26 +3263,19 @@ static void ironlake_disable_drps(struct drm_device *dev) * ourselves, instead of doing a rmw cycle (which might result in us clearing * all limits and the gpu stuck at whatever frequency it is at atm). */ -static u32 gen6_rps_limits(struct drm_i915_private *dev_priv, u8 *val) +static u32 gen6_rps_limits(struct drm_i915_private *dev_priv, u8 val) { u32 limits; - limits = 0; - - if (*val = dev_priv-rps.max_delay) - *val = dev_priv-rps.max_delay; - limits |= dev_priv-rps.max_delay 24; - /* Only set the down limit when we've reached the lowest level to avoid * getting more interrupts, otherwise leave this clear. This prevents a * race in the hw when coming out of rc6: There's a tiny window where * the hw runs at the minimal clock before selecting the desired * frequency, if the down threshold expires in that window we will not * receive a down interrupt. */ - if (*val = dev_priv-rps.min_delay) { - *val = dev_priv-rps.min_delay; + limits = dev_priv-rps.max_delay 24; + if (val = dev_priv-rps.min_delay) limits |= dev_priv-rps.min_delay 16; - } return limits; } @@ -3382,7 +3375,6 @@ static void gen6_set_rps_thresholds(struct drm_i915_private *dev_priv, u8 val) void gen6_set_rps(struct drm_device *dev, u8 val) { struct drm_i915_private *dev_priv = dev-dev_private; - u32 limits = gen6_rps_limits(dev_priv, val); WARN_ON(!mutex_is_locked(dev_priv-rps.hw_lock)); WARN_ON(val dev_priv-rps.max_delay); @@ -3405,7 +3397,8 @@ void gen6_set_rps(struct drm_device *dev, u8 val) /* Make sure we continue to get interrupts * until we hit the minimum or maximum frequencies. */ - I915_WRITE(GEN6_RP_INTERRUPT_LIMITS, limits); + I915_WRITE(GEN6_RP_INTERRUPT_LIMITS, + gen6_rps_limits(dev_priv, val));
[Intel-gfx] [PATCH] drm/i915: Delay the relase of the forcewake by a jiffie
Obtaining the forcwake requires expensive and time consuming serialisation. And we often try to obtain the forcewake multiple times in very quick succession. We can reduce the overhead of these sequences by delaying the forcewake release, and so not hammer the hw quite so hard. I was hoping this would help with the spurious [drm:__gen6_gt_force_wake_mt_get] *ERROR* Timed out waiting for forcewake old ack to clear. found on Haswell. Alas not. v2: Fix teardown ordering - unmap the regs after turning off forcewake, and make sure we do turn off forcewake - both found by Ville. v3: As we introduce intel_uncore_fini(), use it to make sure everything is disabled before we hand back to the BIOS. Note: I have no claims for improved performance, stablity or power comsumption for this patch. We should not be hitting the registers often enough for this to improve benchmarks, but given the nature of our hw it is likely to improve long term stability. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com --- drivers/gpu/drm/i915/i915_dma.c | 6 -- drivers/gpu/drm/i915/i915_drv.h | 3 +++ drivers/gpu/drm/i915/intel_uncore.c | 33 +++-- 3 files changed, 38 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 883990f..97a6e22 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1787,8 +1787,6 @@ int i915_driver_unload(struct drm_device *dev) list_del(dev_priv-gtt.base.global_link); WARN_ON(!list_empty(dev_priv-vm_list)); drm_mm_takedown(dev_priv-gtt.base.mm); - if (dev_priv-regs != NULL) - pci_iounmap(dev-pdev, dev_priv-regs); drm_vblank_cleanup(dev); @@ -1800,6 +1798,10 @@ int i915_driver_unload(struct drm_device *dev) dev_priv-gtt.base.cleanup(dev_priv-gtt.base); + intel_uncore_fini(dev); + if (dev_priv-regs != NULL) + pci_iounmap(dev-pdev, dev_priv-regs); + if (dev_priv-slab) kmem_cache_destroy(dev_priv-slab); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a6354c3..8c93d93 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -406,6 +406,8 @@ struct intel_uncore { unsigned fifo_count; unsigned forcewake_count; + + struct delayed_work force_wake_work; }; #define DEV_INFO_FOR_EACH_FLAG(func, sep) \ @@ -1792,6 +1794,7 @@ extern void intel_uncore_early_sanitize(struct drm_device *dev); extern void intel_uncore_init(struct drm_device *dev); extern void intel_uncore_clear_errors(struct drm_device *dev); extern void intel_uncore_check_errors(struct drm_device *dev); +extern void intel_uncore_fini(struct drm_device *dev); void i915_enable_pipestat(drm_i915_private_t *dev_priv, int pipe, u32 mask); diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 8649f1c..f2753d9 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -204,6 +204,18 @@ static void vlv_force_wake_put(struct drm_i915_private *dev_priv) gen6_gt_check_fifodbg(dev_priv); } +static void gen6_force_wake_work(struct work_struct *work) +{ + struct drm_i915_private *dev_priv = + container_of(work, typeof(*dev_priv), uncore.force_wake_work.work); + unsigned long irqflags; + + spin_lock_irqsave(dev_priv-uncore.lock, irqflags); + if (--dev_priv-uncore.forcewake_count == 0) + dev_priv-uncore.funcs.force_wake_put(dev_priv); + spin_unlock_irqrestore(dev_priv-uncore.lock, irqflags); +} + void intel_uncore_early_sanitize(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; @@ -216,6 +228,9 @@ void intel_uncore_init(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; + INIT_DELAYED_WORK(dev_priv-uncore.force_wake_work, + gen6_force_wake_work); + if (IS_VALLEYVIEW(dev)) { dev_priv-uncore.funcs.force_wake_get = vlv_force_wake_get; dev_priv-uncore.funcs.force_wake_put = vlv_force_wake_put; @@ -261,6 +276,16 @@ void intel_uncore_init(struct drm_device *dev) } } +void intel_uncore_fini(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev-dev_private; + + flush_delayed_work(dev_priv-uncore.force_wake_work); + + /* Paranoia: make sure we have disabled everything before we exit. */ + intel_uncore_sanitize(dev); +} + static void intel_uncore_forcewake_reset(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; @@ -306,8 +331,12 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv) unsigned long irqflags; spin_lock_irqsave(dev_priv-uncore.lock, irqflags); -
[Intel-gfx] [PATCH] drm/i915: Add a tracepoint for using a semaphore
So that we can find the callers who introduce a ring stall. A single ring stall is not too unwelcome, the right issue becomes when they start to interlock and prevent any concurrent work. That, however, is a little tricker to detect with a mere tracepoint! Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_gem.c | 2 ++ drivers/gpu/drm/i915/i915_trace.h | 19 +++ 2 files changed, 21 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 85537d1..0122278 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2698,6 +2698,8 @@ i915_gem_object_sync(struct drm_i915_gem_object *obj, if (ret) return ret; + trace_i915_gem_object_sync(obj, to); + ret = to-sync_to(to, from, seqno); if (!ret) /* We use last_read_seqno because sync_to() diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h index feaacbb..728c43b 100644 --- a/drivers/gpu/drm/i915/i915_trace.h +++ b/drivers/gpu/drm/i915/i915_trace.h @@ -33,6 +33,25 @@ TRACE_EVENT(i915_gem_object_create, TP_printk(obj=%p, size=%u, __entry-obj, __entry-size) ); +TRACE_EVENT(i915_gem_object_sync, + TP_PROTO(struct drm_i915_gem_object *obj, struct intel_ring_buffer *to), + TP_ARGS(obj, to), + + TP_STRUCT__entry( +__field(struct drm_i915_gem_object *, obj) +__field(u32, sync_from) +__field(u32, sync_to) +), + + TP_fast_assign( + __entry-obj = obj; + __entry-sync_from = obj-ring-id; + __entry-sync_to = to-id; + ), + + TP_printk(obj=%p, sync-from=%u, sync-to=%u, __entry-obj, __entry-sync_from, __entry-sync_to) +); + TRACE_EVENT(i915_vma_bind, TP_PROTO(struct i915_vma *vma, bool mappable), TP_ARGS(vma, mappable), -- 1.8.4.rc3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] tests: Simulate missed breadcrumb irqs
Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- tests/Makefile.am | 1 + tests/ZZ_missed_irq | 70 + 2 files changed, 71 insertions(+) create mode 100755 tests/ZZ_missed_irq diff --git a/tests/Makefile.am b/tests/Makefile.am index 9878a25..e258b1f 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -130,6 +130,7 @@ TESTS_scripts = \ test_rte_check \ tools_test \ ZZ_hangman \ + ZZ_missed_irq \ $(NULL) # This target contains testcases which support automagic subtest enumeration diff --git a/tests/ZZ_missed_irq b/tests/ZZ_missed_irq new file mode 100755 index 000..86c0123 --- /dev/null +++ b/tests/ZZ_missed_irq @@ -0,0 +1,70 @@ +#!/bin/bash +# +# Testcase: Simulate missed breadcrumb interrupts +# + +SOURCE_DIR=$( dirname ${BASH_SOURCE[0]} ) +test $SOURCE_DIR = . SOURCE_DIR=`pwd` +. $SOURCE_DIR/drm_lib.sh +cd $i915_dfs_path + +echo $SOURCE_DIR + +function check_for_missed_irq { + if test `cat i915_ring_missed_irq` = 0x; then + echo missed interrupts undetected + exit 1 + fi +} + +function check_for_hang { + if cat i915_error_state | grep -v no error state collected /dev/null ; then + echo gpu hang reported + exit 2 + fi +} + +if [ ! -f i915_ring_missed_irq ] ; then + echo kernel doesn't support interrupt masking + exit 77 +fi + +# clear error state first +echo i915_error_state +check_for_hang + +echo 0xf i915_ring_test_irq +echo Interrupts masked +if test `cat i915_ring_test_irq` != 0x000f; then + echo Failed to set interrupt mask + exit 3 +fi + +$SOURCE_DIR/gem_exec_big /dev/null + +check_for_missed_irq +check_for_hang + +$SOURCE_DIR/gem_exec_big /dev/null + +check_for_hang + +echo 0 i915_ring_test_irq +echo Interrupts unmasked +if test `cat i915_ring_test_irq` != 0x; then + echo Failed to clear interrupt mask + exit 3 +fi + +$SOURCE_DIR/gem_exec_big /dev/null + +check_for_hang + +echo 0 i915_ring_missed_irq +echo Cleared missed interrupts +if test `cat i915_ring_missed_irq` != 0x; then + echo Failed to clear missed interrupts + exit 3 +fi + +exit 0 -- 1.8.4.rc3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Add a tracepoint for using a semaphore
On Mon, Aug 26, 2013 at 01:46:21PM +0100, Chris Wilson wrote: So that we can find the callers who introduce a ring stall. A single ring stall is not too unwelcome, the right issue becomes when they start to interlock and prevent any concurrent work. That, however, is a little tricker to detect with a mere tracepoint! I was staring at the semaphore code a bit recently, and I was wondering if it would make sense to do a manual seqno check (against the ring's current seqno, not just the last sync_seqno) before we decide to add the semaphore into the ring? Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_gem.c | 2 ++ drivers/gpu/drm/i915/i915_trace.h | 19 +++ 2 files changed, 21 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 85537d1..0122278 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2698,6 +2698,8 @@ i915_gem_object_sync(struct drm_i915_gem_object *obj, if (ret) return ret; + trace_i915_gem_object_sync(obj, to); + ret = to-sync_to(to, from, seqno); if (!ret) /* We use last_read_seqno because sync_to() diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h index feaacbb..728c43b 100644 --- a/drivers/gpu/drm/i915/i915_trace.h +++ b/drivers/gpu/drm/i915/i915_trace.h @@ -33,6 +33,25 @@ TRACE_EVENT(i915_gem_object_create, TP_printk(obj=%p, size=%u, __entry-obj, __entry-size) ); +TRACE_EVENT(i915_gem_object_sync, + TP_PROTO(struct drm_i915_gem_object *obj, struct intel_ring_buffer *to), + TP_ARGS(obj, to), + + TP_STRUCT__entry( + __field(struct drm_i915_gem_object *, obj) + __field(u32, sync_from) + __field(u32, sync_to) + ), + + TP_fast_assign( +__entry-obj = obj; +__entry-sync_from = obj-ring-id; +__entry-sync_to = to-id; +), + + TP_printk(obj=%p, sync-from=%u, sync-to=%u, __entry-obj, __entry-sync_from, __entry-sync_to) +); + TRACE_EVENT(i915_vma_bind, TP_PROTO(struct i915_vma *vma, bool mappable), TP_ARGS(vma, mappable), -- 1.8.4.rc3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Ville Syrjälä Intel OTC ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Add a tracepoint for using a semaphore
On Mon, Aug 26, 2013 at 04:20:06PM +0300, Ville Syrjälä wrote: On Mon, Aug 26, 2013 at 01:46:21PM +0100, Chris Wilson wrote: So that we can find the callers who introduce a ring stall. A single ring stall is not too unwelcome, the right issue becomes when they start to interlock and prevent any concurrent work. That, however, is a little tricker to detect with a mere tracepoint! I was staring at the semaphore code a bit recently, and I was wondering if it would make sense to do a manual seqno check (against the ring's current seqno, not just the last sync_seqno) before we decide to add the semaphore into the ring? If we use the cheap check, yeah. We can also store the cross-sync seqno as well. Both are going to be unlikely paths though. Anyway, I'm moving the tracepoint over to the ring rather than the object for consistency. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] Review me!
Daniel is pushing for something like this upstream so that we can catch missed interrupts again. This contains the compromise for Ben and Jesse, and should work even better than before with absent interrupts. I'm not thrilled by the test_/missed_irq debugfs interface, it is a little racy, but I didn't want to impact the hotpath. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/2] drm/i915: Call io_schedule() whilst whilsting for the GPU
Since we are waiting upon IO completion, inform the kernel through use of the io_schedule() call rather than the regular schedule(). This should allow the kernel to make better decisions regarding scheduling and power management. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_gem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 4ed782f..408578b 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1063,7 +1063,7 @@ static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno, mod_timer(timer, expire); } - schedule(); + io_schedule(); if (timeout) timeout_jiffies = expire - jiffies; -- 1.8.4.rc3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/2] drm/i915: Fix __wait_seqno to use true infinite timeouts
When we switched to always using a timeout in conjunction with wait_seqno, we lost the ability to detect missed interrupts. Since, we have had issues with interrupts on a number of generations, and they are required to be delivered in a timely fashion for a smooth UX, it is important that we do log errors found in the wild and prevent the display stalling for upwards of 1s every time the seqno interrupt is missed. Rather than continue to fix up the timeouts to work around the interface impedence in wait_event_*(), open code the combination of wait_event[_interruptible][_timeout], and use the exposed timer to poll for seqno should we detect a lost interrupt. v2: In order to satisfy the debug requirement of logging missed interrupts with the real world requirments of making machines work even if interrupts are hosed, we revert to polling after detecting a missed interrupt. v3: Throw in a debugfs interface to simulate broken hw not reporting interrupts. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_debugfs.c | 68 drivers/gpu/drm/i915/i915_drv.h | 5 ++ drivers/gpu/drm/i915/i915_gem.c | 114 -- drivers/gpu/drm/i915/i915_gpu_error.c | 1 + drivers/gpu/drm/i915/i915_irq.c | 11 ++-- 5 files changed, 148 insertions(+), 51 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 55ab924..e823169 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -1876,6 +1876,72 @@ DEFINE_SIMPLE_ATTRIBUTE(i915_ring_stop_fops, i915_ring_stop_get, i915_ring_stop_set, 0x%08llx\n); +static int +i915_ring_missed_irq_get(void *data, u64 *val) +{ + struct drm_device *dev = data; + drm_i915_private_t *dev_priv = dev-dev_private; + + *val = dev_priv-gpu_error.missed_irq_rings; + return 0; +} + +static int +i915_ring_missed_irq_set(void *data, u64 val) +{ + struct drm_device *dev = data; + struct drm_i915_private *dev_priv = dev-dev_private; + int ret; + + /* Lock against concurrent debugfs callers */ + ret = mutex_lock_interruptible(dev-struct_mutex); + if (ret) + return ret; + dev_priv-gpu_error.missed_irq_rings = val; + mutex_unlock(dev-struct_mutex); + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_ring_missed_irq_fops, + i915_ring_missed_irq_get, i915_ring_missed_irq_set, + 0x%08llx\n); + +static int +i915_ring_test_irq_get(void *data, u64 *val) +{ + struct drm_device *dev = data; + drm_i915_private_t *dev_priv = dev-dev_private; + + *val = dev_priv-gpu_error.test_irq_rings; + + return 0; +} + +static int +i915_ring_test_irq_set(void *data, u64 val) +{ + struct drm_device *dev = data; + struct drm_i915_private *dev_priv = dev-dev_private; + int ret; + + DRM_DEBUG_DRIVER(Masking interrupts on rings 0x%08llx\n, val); + + /* Lock against concurrent debugfs callers */ + ret = mutex_lock_interruptible(dev-struct_mutex); + if (ret) + return ret; + + dev_priv-gpu_error.test_irq_rings = val; + mutex_unlock(dev-struct_mutex); + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_ring_test_irq_fops, + i915_ring_test_irq_get, i915_ring_test_irq_set, + 0x%08llx\n); + #define DROP_UNBOUND 0x1 #define DROP_BOUND 0x2 #define DROP_RETIRE 0x4 @@ -2269,6 +2335,8 @@ static struct i915_debugfs_files { {i915_min_freq, i915_min_freq_fops}, {i915_cache_sharing, i915_cache_sharing_fops}, {i915_ring_stop, i915_ring_stop_fops}, + {i915_ring_missed_irq, i915_ring_missed_irq_fops}, + {i915_ring_test_irq, i915_ring_test_irq_fops}, {i915_gem_drop_caches, i915_drop_caches_fops}, {i915_error_state, i915_error_state_fops}, {i915_next_seqno, i915_next_seqno_fops}, diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 84b95b1..ac6db6e 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -993,6 +993,8 @@ struct i915_gpu_error { unsigned long last_reset; + unsigned long missed_irq_rings; + /** * State variable and reset counter controlling the reset flow * @@ -1031,6 +1033,9 @@ struct i915_gpu_error { /* For gpu hang simulation. */ unsigned int stop_rings; + + /* For missed irq/seqno simulation. */ + unsigned int test_irq_rings; }; enum modeset_restore { diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index ccfebf5..4ed782f 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -970,6 +970,17 @@ i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno) return ret; }
[Intel-gfx] [PATCH] drm/i915: Adjust available RPS information through sysfs for vlv
Valleyview has its own render power state implementation with different capability knobs - it has no RP0,RP1,RPn but rather RPe. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67734 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Tested-by: kobe@intel.com Reviewed-by: Jani Nikula jani.nik...@intel.com --- drivers/gpu/drm/i915/i915_sysfs.c | 36 +++- 1 file changed, 31 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c index a777e7f..c8c4112 100644 --- a/drivers/gpu/drm/i915/i915_sysfs.c +++ b/drivers/gpu/drm/i915/i915_sysfs.c @@ -224,6 +224,18 @@ static ssize_t gt_cur_freq_mhz_show(struct device *kdev, return snprintf(buf, PAGE_SIZE, %d\n, ret); } +static ssize_t vlv_rpe_freq_mhz_show(struct device *kdev, +struct device_attribute *attr, char *buf) +{ + struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev); + struct drm_device *dev = minor-dev; + struct drm_i915_private *dev_priv = dev-dev_private; + + return snprintf(buf, PAGE_SIZE, %d\n, + vlv_gpu_freq(dev_priv-mem_freq, +dev_priv-rps.rpe_delay)); +} + static ssize_t gt_max_freq_mhz_show(struct device *kdev, struct device_attribute *attr, char *buf) { struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev); @@ -366,6 +378,7 @@ static DEVICE_ATTR(gt_cur_freq_mhz, S_IRUGO, gt_cur_freq_mhz_show, NULL); static DEVICE_ATTR(gt_max_freq_mhz, S_IRUGO | S_IWUSR, gt_max_freq_mhz_show, gt_max_freq_mhz_store); static DEVICE_ATTR(gt_min_freq_mhz, S_IRUGO | S_IWUSR, gt_min_freq_mhz_show, gt_min_freq_mhz_store); +static DEVICE_ATTR(vlv_rpe_freq_mhz, S_IRUGO, vlv_rpe_freq_mhz_show, NULL); static ssize_t gt_rp_mhz_show(struct device *kdev, struct device_attribute *attr, char *buf); static DEVICE_ATTR(gt_RP0_freq_mhz, S_IRUGO, gt_rp_mhz_show, NULL); @@ -409,6 +422,14 @@ static const struct attribute *gen6_attrs[] = { NULL, }; +static const struct attribute *vlv_attrs[] = { + dev_attr_gt_cur_freq_mhz.attr, + dev_attr_gt_max_freq_mhz.attr, + dev_attr_gt_min_freq_mhz.attr, + dev_attr_vlv_rpe_freq_mhz.attr, + NULL, +}; + static ssize_t error_state_read(struct file *filp, struct kobject *kobj, struct bin_attribute *attr, char *buf, loff_t off, size_t count) @@ -492,11 +513,13 @@ void i915_setup_sysfs(struct drm_device *dev) DRM_ERROR(l3 parity sysfs setup failed\n); } - if (INTEL_INFO(dev)-gen = 6) { + ret = 0; + if (IS_VALLEYVIEW(dev)) + ret = sysfs_create_files(dev-primary-kdev.kobj, vlv_attrs); + else if (INTEL_INFO(dev)-gen = 6) ret = sysfs_create_files(dev-primary-kdev.kobj, gen6_attrs); - if (ret) - DRM_ERROR(gen6 sysfs setup failed\n); - } + if (ret) + DRM_ERROR(RPS sysfs setup failed\n); ret = sysfs_create_bin_file(dev-primary-kdev.kobj, error_state_attr); @@ -507,7 +530,10 @@ void i915_setup_sysfs(struct drm_device *dev) void i915_teardown_sysfs(struct drm_device *dev) { sysfs_remove_bin_file(dev-primary-kdev.kobj, error_state_attr); - sysfs_remove_files(dev-primary-kdev.kobj, gen6_attrs); + if (IS_VALLEYVIEW(dev)) + sysfs_remove_files(dev-primary-kdev.kobj, vlv_attrs); + else + sysfs_remove_files(dev-primary-kdev.kobj, gen6_attrs); device_remove_bin_file(dev-primary-kdev, dpf_attrs); #ifdef CONFIG_PM sysfs_unmerge_group(dev-primary-kdev.kobj, rc6_attr_group); -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2] i915: Update VGA arbiter support for newer devices
On Mon, 2013-08-26 at 15:23 +0300, Ville Syrjälä wrote: On Sat, Aug 24, 2013 at 08:53:40AM -0600, Alex Williamson wrote: This is intended to add VGA arbiter support for Intel HD graphics on Core processors. The old GMCH registers no longer exist, so even though it appears that i915 participates in VGA arbitration, it doesn't work. On Intel HD graphics we already attempt to disable VGA regions of the device. This makes registering as a VGA client unnecessary since we don't intend to operate differently depending on how many VGA devices are present. We can disable VGA memory regions by clearing the memory enable bit in the VGA MSR. That only leaves VGA IO, which we update the VGA arbiter to know that we don't participate in VGA memory arbitration. We also add a hook on unload to re-enable memory and reinstate VGA memory arbitration. Signed-off-by: Alex Williamson alex.william...@redhat.com --- v2: I915_READ/WRITE accessors don't work in i915_disable_vga, use inb/outb directly. Also, on the driver unbind VGA enable path, acquire legacy IO to re-enable VGA memory. Correct comment. As with v1, this depends on vgaarb: Fixes for partial VGA opt-out. With all patches I'm able to assign a discrete PEG VGA device to a guest and execute the VBIOS w/o interference from IGD or corruption of the IGD framebuffer. I glanced through those before, but I'll try to do a more thorough review soon. Thanks! drivers/gpu/drm/i915/i915_dma.c |9 ++--- drivers/gpu/drm/i915/intel_display.c | 24 2 files changed, 30 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index f466980..d9cf216 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1287,9 +1287,12 @@ static int i915_load_modeset_init(struct drm_device *dev) * then we do not take part in VGA arbitration and the * vga_client_register() fails with -ENODEV. */ - ret = vga_client_register(dev-pdev, dev, NULL, i915_vga_set_decode); - if (ret ret != -ENODEV) - goto out; + if (!HAS_PCH_SPLIT(dev)) { + ret = vga_client_register(dev-pdev, dev, NULL, + i915_vga_set_decode); + if (ret ret != -ENODEV) + goto out; + } I think we should do something to fix i915_redisable_vga() for !pch platforms too. The MSR method should work for everything, so I'm leaning towards using it all the time. The other option would be add an mmio codepath into intel_disable_vga(), but then we have several codepaths to worry about. Do you have time, systems, and interest to be able to test !pch? I assume that at some point in the past VGA arbiter did work on i915 with the right hardware, so I didn't want to go breaking code that I can't test. intel_register_dsm_handler(); diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 5fb3058..2bb805c 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -9519,6 +9519,15 @@ static void i915_disable_vga(struct drm_device *dev) outb(SR01, VGA_SR_INDEX); sr1 = inb(VGA_SR_DATA); outb(sr1 | 15, VGA_SR_DATA); + + /* Disable VGA memory on Intel HD */ + if (HAS_PCH_SPLIT(dev)) { + outb(inb(VGA_MSR_READ) ~VGA_MSR_MEM_EN, VGA_MSR_WRITE); + vga_set_legacy_decoding(dev-pdev, VGA_RSRC_LEGACY_IO | + VGA_RSRC_NORMAL_IO | + VGA_RSRC_NORMAL_MEM); + } + vga_put(dev-pdev, VGA_RSRC_LEGACY_IO); udelay(300); @@ -9526,6 +9535,19 @@ static void i915_disable_vga(struct drm_device *dev) POSTING_READ(vga_reg); } +static void i915_enable_vga(struct drm_device *dev) +{ Not sure we need this actually. We don't re-enable the VGA display on unload, so you'll never see anything on the display anyway. But then again, I don't see any real harm in doing this either. Right, I actually get a flood of DMAR errors from VT-d when I unbind the device from i915, but that happens regardless of this patch. I did confirm that VGA memory is re-added to VGA arbitration though, which is important because even if i915 doesn't leave the device in a usable state we might still be able to pass it to a guest, which will rely on the VGA arbiter control being re-enabled. + /* Enable VGA memory on Intel HD */ + if (HAS_PCH_SPLIT(dev)) { + vga_get_uninterruptible(dev-pdev, VGA_RSRC_LEGACY_IO); + outb(inb(VGA_MSR_READ) | VGA_MSR_MEM_EN, VGA_MSR_WRITE); I was thinking we might save the MSR state during driver load and restore to that same state, but it could be a bit overkill. It seems unnecessary for just this bit, but if
[Intel-gfx] (no subject)
On Mon, Aug 26, 2013 at 04:05:11PM +0300, Michael S. Tsirkin wrote: On Wed, Aug 21, 2013 at 11:22:58AM +0200, Sedat Dilek wrote: [ Re: [Intel-gfx] i915 producing warnings with kernel 3.11-rc5 ] Hi, saw your posting in [1]... can you try the patches below? Not sure if they apply. Did you try v3.11-rc6(+)... or drm-intel-nightly? Regards, - Sedat - [1] http://lists.freedesktop.org/archives/intel-gfx/2013-August/032154.html Same thing observed with v3.11-rc7. Looks like when this happens, external monitor does not work. It shows this message: not optimum mode: recommended mode 1280x1024 60Hz while this is exactly what I configured in xrandr. -- Forwarded message -- From: Sedat Dilek sedat.di...@gmail.com Date: Tue, Jul 2, 2013 at 7:31 AM Subject: Re: linux-next: Tree for Jul 1 [ drm-intel-next: Several call-traces ] To: Daniel Vetter daniel.vet...@ffwll.ch Cc: Barnes, Jesse jbar...@virtuousgeek.org, Stephen Rothwell s...@canb.auug.org.au, linux-next linux-n...@vger.kernel.org, Linux Kernel Mailing List linux-ker...@vger.kernel.org, intel-gfx intel-gfx@lists.freedesktop.org, DRI dri-de...@lists.freedesktop.org, Dave Airlie airl...@gmail.com On Mon, Jul 1, 2013 at 11:03 AM, Sedat Dilek sedat.di...@gmail.com wrote: On Mon, Jul 1, 2013 at 10:52 AM, Daniel Vetter daniel.vet...@ffwll.ch wrote: On Mon, Jul 1, 2013 at 10:49 AM, Sedat Dilek sedat.di...@gmail.com wrote: On Mon, Jul 1, 2013 at 9:59 AM, Stephen Rothwell s...@canb.auug.org.au wrote: Hi all, Changes since 20130628: The regulator tree gained a build failure so I used the version from next-20130628. The trivial tree gained a conflict against the fbdev tree. The arm-soc tree gained a conflict against the net-next tree. The akpm tree lost a few patches that turned up elsewhere and I removed 2 that were causing run time problems. [ CC drm and drm-intel folks ] [ Did not check any relevant MLs ] Please, see attached dmesg output. Clock mismatch, one for Jesse to figure out. Note that this patch is for 3.12, I simply haven't yet gotten around to properly split my patch queue so a few spilled into -next. I'll do that now. I like lightspeed-fast replies :-). Guess drm/i915: get mode clock when reading the pipe config v9 [1] is the cause. Problem solved by applying these patches to next-20130701 from intel-gfx patchwork-service [0]: [1/2] drm/i915: fixup messages in pipe_config_compare [2/2] drm/i915: get clock config when checking CRTC state too AFAICS 2/2 was folded into updated drm/i915: get mode clock when reading the pipe config v9 [3]. It would be kind to be CCed on the patches and get also some credits. Also a CC to the report in linux-next should IMHO be done. - Sedat - [0] https://patchwork.kernel.org/project/intel-gfx/list/ [1] https://patchwork.kernel.org/patch/2809031/ [2] https://patchwork.kernel.org/patch/2809021/ [3] http://cgit.freedesktop.org/~danvet/drm-intel/commit/?h=drm-intel-nightlyid=f1f644dc66cbaf5a4c7dcde683361536b41885b9 - Sedat - [1] http://cgit.freedesktop.org/~danvet/drm-intel/commit/?h=drm-intel-next-queuedid=d325d8b4f351f9d45e7c8baabf581fd21f343133 -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2] i915: Update VGA arbiter support for newer devices
On Mon, Aug 26, 2013 at 10:05:36AM -0600, Alex Williamson wrote: On Mon, 2013-08-26 at 15:23 +0300, Ville Syrjälä wrote: On Sat, Aug 24, 2013 at 08:53:40AM -0600, Alex Williamson wrote: This is intended to add VGA arbiter support for Intel HD graphics on Core processors. The old GMCH registers no longer exist, so even though it appears that i915 participates in VGA arbitration, it doesn't work. On Intel HD graphics we already attempt to disable VGA regions of the device. This makes registering as a VGA client unnecessary since we don't intend to operate differently depending on how many VGA devices are present. We can disable VGA memory regions by clearing the memory enable bit in the VGA MSR. That only leaves VGA IO, which we update the VGA arbiter to know that we don't participate in VGA memory arbitration. We also add a hook on unload to re-enable memory and reinstate VGA memory arbitration. Signed-off-by: Alex Williamson alex.william...@redhat.com --- v2: I915_READ/WRITE accessors don't work in i915_disable_vga, use inb/outb directly. Also, on the driver unbind VGA enable path, acquire legacy IO to re-enable VGA memory. Correct comment. As with v1, this depends on vgaarb: Fixes for partial VGA opt-out. With all patches I'm able to assign a discrete PEG VGA device to a guest and execute the VBIOS w/o interference from IGD or corruption of the IGD framebuffer. I glanced through those before, but I'll try to do a more thorough review soon. Thanks! drivers/gpu/drm/i915/i915_dma.c |9 ++--- drivers/gpu/drm/i915/intel_display.c | 24 2 files changed, 30 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index f466980..d9cf216 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1287,9 +1287,12 @@ static int i915_load_modeset_init(struct drm_device *dev) * then we do not take part in VGA arbitration and the * vga_client_register() fails with -ENODEV. */ - ret = vga_client_register(dev-pdev, dev, NULL, i915_vga_set_decode); - if (ret ret != -ENODEV) - goto out; + if (!HAS_PCH_SPLIT(dev)) { + ret = vga_client_register(dev-pdev, dev, NULL, + i915_vga_set_decode); + if (ret ret != -ENODEV) + goto out; + } I think we should do something to fix i915_redisable_vga() for !pch platforms too. The MSR method should work for everything, so I'm leaning towards using it all the time. The other option would be add an mmio codepath into intel_disable_vga(), but then we have several codepaths to worry about. Do you have time, systems, and interest to be able to test !pch? I assume that at some point in the past VGA arbiter did work on i915 with the right hardware, so I didn't want to go breaking code that I can't test. I do have a few gen4ish machines sitting around. I guess I could try it on them at some point when I have a bit of time. But you're right that maybe it's better to not touch them before that. i915_disable_vga() is already potentially broken on them, so your patch is not introducing any new regressions, just maintaining the status quo. intel_register_dsm_handler(); diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 5fb3058..2bb805c 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -9519,6 +9519,15 @@ static void i915_disable_vga(struct drm_device *dev) outb(SR01, VGA_SR_INDEX); sr1 = inb(VGA_SR_DATA); outb(sr1 | 15, VGA_SR_DATA); + + /* Disable VGA memory on Intel HD */ + if (HAS_PCH_SPLIT(dev)) { + outb(inb(VGA_MSR_READ) ~VGA_MSR_MEM_EN, VGA_MSR_WRITE); + vga_set_legacy_decoding(dev-pdev, VGA_RSRC_LEGACY_IO | +VGA_RSRC_NORMAL_IO | +VGA_RSRC_NORMAL_MEM); + } + vga_put(dev-pdev, VGA_RSRC_LEGACY_IO); udelay(300); @@ -9526,6 +9535,19 @@ static void i915_disable_vga(struct drm_device *dev) POSTING_READ(vga_reg); } +static void i915_enable_vga(struct drm_device *dev) +{ Not sure we need this actually. We don't re-enable the VGA display on unload, so you'll never see anything on the display anyway. But then again, I don't see any real harm in doing this either. Right, I actually get a flood of DMAR errors from VT-d when I unbind the device from i915, but that happens regardless of this patch. I did confirm that VGA memory is re-added to VGA arbitration though, which is important because even if i915 doesn't leave the device in a usable state we might still be able to pass it to
Re: [Intel-gfx] [PATCH] drm/i915: More vma fixups around unbind/destroy
On Mon, Aug 26, 2013 at 10:36:35AM +0100, Chris Wilson wrote: On Mon, Aug 26, 2013 at 11:23:47AM +0200, Daniel Vetter wrote: The important bugfix here is that we must not unlink the vma when we keep it around as a placeholder for the execbuf code. Since then we won't find it again when execbuf gets interrupt and restarted and create a 2nd vma. And since the code as-is isn't fit yet to deal with more than one vma, hilarity ensues. Specifically the dma map/unmap of the sg table isn't adjusted for multiple vmas yet and will blow up like this: BUG: unable to handle kernel NULL pointer dereference at 0008 IP: [a008fb37] i915_gem_gtt_finish_object+0x73/0xc8 [i915] PGD 56bb5067 PUD ad3dd067 PMD 0 Oops: [#1] SMP Modules linked in: tcp_lp ppdev parport_pc lp parport ipv6 dm_mod dcdbas snd_hda_codec_hdmi pcspkr snd_hda_codec_realtek serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_codec lpc_ich snd_hwdep mfd_core snd_pcm snd_page_alloc snd_timer snd soundcore acpi_cpufreq i915 video button drm_kms_helper drm mperf freq_table CPU: 1 PID: 16650 Comm: fbo-maxsize Not tainted 3.11.0-rc4_nightlytop_d93f59_debug_20130814_+ #6957 Hardware name: Dell Inc. OptiPlex 9010/03JR84, BIOS A01 05/04/2012 task: 8800563b3f00 ti: 88004bdf4000 task.ti: 88004bdf4000 RIP: 0010:[a008fb37] [a008fb37] i915_gem_gtt_finish_object+0x73/0xc8 [i915] RSP: 0018:88004bdf5958 EFLAGS: 00010246 RAX: RBX: 8801135e RCX: 8800ad3bf8e0 RDX: 8800ad3bf8e0 RSI: RDI: 8801007ee780 RBP: 88004bdf5978 R08: 8800ad3bf8e0 R09: R10: 86ca1810 R11: 880036a17101 R12: 8801007ee780 R13: 00018001 R14: 880118c4e000 R15: 8801007ee780 FS: 7f401a0ce740() GS:88011e28() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 0008 CR3: 5635c000 CR4: 001407e0 Stack: 8801007ee780 88005c253180 00018000 8801135e 88004bdf59a8 a0088e55 0011 8801007eec00 00018000 880036a17101 88004bdf5a08 a0089026 Call Trace: [a0088e55] i915_vma_unbind+0xdf/0x1ab [i915] [a0089026] __i915_gem_shrink+0x105/0x177 [i915] [a0089452] i915_gem_object_get_pages_gtt+0x108/0x309 [i915] [a0085ba9] i915_gem_object_get_pages+0x61/0x90 [i915] [a008f22b] ? gen6_ppgtt_insert_entries+0x103/0x125 [i915] [a008a113] i915_gem_object_pin+0x1fa/0x5df [i915] [a008cdfe] i915_gem_execbuffer_reserve_object.isra.6+0x8d/0x1bc [i915] [a008d156] i915_gem_execbuffer_reserve+0x229/0x367 [i915] [a008dbf6] i915_gem_do_execbuffer.isra.12+0x4dc/0xf3a [i915] [810fc823] ? might_fault+0x40/0x90 [a008eb89] i915_gem_execbuffer2+0x187/0x222 [i915] [a000971c] drm_ioctl+0x308/0x442 [drm] [a008ea02] ? i915_gem_execbuffer+0x3ae/0x3ae [i915] [817db156] ? __do_page_fault+0x3dd/0x481 [8112fdba] vfs_ioctl+0x26/0x39 [811306a2] do_vfs_ioctl+0x40e/0x451 [817deda7] ? sysret_check+0x1b/0x56 [8113073c] SyS_ioctl+0x57/0x87 [8135bbfe] ? trace_hardirqs_on_thunk+0x3a/0x3f [817ded82] system_call_fastpath+0x16/0x1b Code: 48 c7 c6 84 30 0e a0 31 c0 e8 d0 e9 f7 ff bf c6 a7 00 00 e8 07 af 2c e1 41 f6 84 24 03 01 00 00 10 75 44 49 8b 84 24 08 01 00 00 8b 50 08 48 8b 30 49 8b 86 b0 04 00 00 48 89 c7 48 81 c7 98 00 RIP [a008fb37] i915_gem_gtt_finish_object+0x73/0xc8 [i915] RSP 88004bdf5958 CR2: 0008 As a consequence we need to change the only one vma for now check in vma_unbind - since vma_destroy isn't always called the obj-vma_list might not be empty. Instead check that the vma list is singular at the beginning of vma_unbind. This is also more symmetric with bind_to_vm. v2: - Add a paranoid WARN to mark_free in the eviction code to make sure we never try to evict a vma used by the execbuf code right now. - Move the check for a temporary execbuf vma into vma_destroy - otherwise the failure path cleanup in bind_to_vm will blow up. Our first attempting at fixing this was commit 1be81a2f2cfd8789a627401d470423358fba2d76 Author: Chris Wilson ch...@chris-wilson.co.uk Date: Tue Aug 20 12:56:40 2013 +0100 drm/i915: Don't destroy the vma placeholder during execbuffer reservation Squash with this when merging! v3: Improvements suggested in Chris' review: - Move the WARN_ON in vma_destroy that checks for vmas with an drm_mm allocation before the early return. - Bail out if we hit the WARN in mark_free to hopefully make the kernel survive for long enough to capture it. Cc: Chris Wilson ch...@chris-wilson.co.uk Cc: Ben
Re: [Intel-gfx] [PATCH] [VPG HSW-A] drm/i915: BUN vol4g[DevHSW]
2013/8/26 Koushik Biswas koushikx.bis...@intel.com: From: koushik koushikx.bis...@intel.com WW43 2012 - DDI buffer translation corrections WW36 2012 - Added HDMI voltage swing (not implemented for HDMI) Added comments with voltage swing, pre-emphasis, transition and non-transition values in form of table for reference. This values are applicable only for HSW platform. But why exactly do we need this comment? We already have the intel_hsw_signal_levels() function (inside intel_dp.c) which is basically the implementation of your comment. Signed-off-by: koushik koushikx.bis...@intel.com Change-Id: I0cff220c7d047f41b2a96b3e3880b4029550d458 --- drivers/gpu/drm/i915/intel_ddi.c | 31 +++ drivers/gpu/drm/i915/intel_dp.c | 32 2 files changed, 63 insertions(+) diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c index 5131517..0de236e 100644 --- a/drivers/gpu/drm/i915/intel_ddi.c +++ b/drivers/gpu/drm/i915/intel_ddi.c @@ -128,6 +128,37 @@ void intel_prepare_ddi(struct drm_device *dev) intel_prepare_ddi_buffers(dev, PORT_E, true); } + +/* Updating the new table in comments as it doesn’t cause any logic change */ + +/* For HSW Voltage swing levels ***/ +/**/ +/**/ +/*|Entry|Voltage|Pre-emphasis|Non-Transition|Transition |Pre-emphasis|*/ +/*| |Swing |level |mV diff p-p |mV diff p-p|db |*/ +/*|--|*/ +/*|0|0 |0 |400 |400|0 |*/ +/*|--|*/ +/*|1|0 |1 |400 |600|3.5 |*/ +/*|--|*/ +/*|2|0 |2 |400 |800|6 |*/ +/*|--|*/ +/*|3|0 |3 |400 |1000 |8 |*/ +/*|--|*/ +/*|4|1 |0 |600 |600|0 |*/ +/*|--|*/ +/*|5|1 |1 |600 |900|3.5 |*/ +/*|--|*/ +/*|6|1 |2 |600 |1000 |4.5 |*/ +/*|--|*/ +/*|7|2 |0 |800 |800|0 |*/ +/*|--|*/ +/*|8|2 |1 |1000 |1000 |2 |*/ +/*|--|*/ +/*|9| Entry 9 is only used for HDMI and DVI|*/ +/*|--|*/ +/**/ + static const long hsw_ddi_buf_ctl_values[] = { DDI_BUF_EMP_400MV_0DB_HSW, DDI_BUF_EMP_400MV_3_5DB_HSW, diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index 9fd7f90..fa73fb1 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -1809,6 +1809,38 @@ intel_gen7_edp_signal_levels(uint8_t train_set) } } + +/* Updating the new table in comments as it doesn’t cause any logic change */ + +/* For HSW Voltage swing levels ***/ +/**/ +/**/ +/*|Entry|Voltage|Pre-emphasis|Non-Transition|Transition |Pre-emphasis|*/ +/*| |Swing |level |mV diff p-p |mV diff p-p|db |*/ +/*|--|*/ +/*|0|0 |0 |400 |400|0 |*/ +/*|--|*/ +/*|1|0 |1 |400 |600|3.5 |*/ +/*|--|*/ +/*|2|0 |2 |400 |800|6 |*/ +/*|--|*/ +/*|3|0 |3 |400 |1000 |8 |*/ +/*|--|*/ +/*|4|1 |0 |600 |600|0 |*/ +/*|--|*/ +/*|5
[Intel-gfx] [PATCH 1/2] drm/i915: Embed the ring-private within the struct intel_ring_buffer
We now have more devices using ring-private than not, and they all want the same structure. Worse, I would like to use a scratch page from outside of intel_ringbuffer.c and so for convenience would like to reuse ring-private. Embed the object into the struct intel_ringbuffer so that we can keep the code clean. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_gpu_error.c | 2 +- drivers/gpu/drm/i915/intel_ringbuffer.c | 99 ++--- drivers/gpu/drm/i915/intel_ringbuffer.h | 6 +- 3 files changed, 35 insertions(+), 72 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index e8955e7..3b003af 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -644,7 +644,7 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv, if (WARN_ON(ring-id != RCS)) return NULL; - obj = ring-private; + obj = ring-scratch.obj; if (acthd = i915_gem_obj_ggtt_offset(obj) acthd i915_gem_obj_ggtt_offset(obj) + obj-base.size) return i915_error_object_create(dev_priv, obj); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index b0fb5ce..7fa52bd 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -33,16 +33,6 @@ #include i915_trace.h #include intel_drv.h -/* - * 965+ support PIPE_CONTROL commands, which provide finer grained control - * over cache flushing. - */ -struct pipe_control { - struct drm_i915_gem_object *obj; - volatile u32 *cpu_page; - u32 gtt_offset; -}; - static inline int ring_space(struct intel_ring_buffer *ring) { int space = (ring-head HEAD_ADDR) - (ring-tail + I915_RING_FREE_SPACE); @@ -185,8 +175,7 @@ gen4_render_ring_flush(struct intel_ring_buffer *ring, static int intel_emit_post_sync_nonzero_flush(struct intel_ring_buffer *ring) { - struct pipe_control *pc = ring-private; - u32 scratch_addr = pc-gtt_offset + 128; + u32 scratch_addr = ring-scratch.gtt_offset + 128; int ret; @@ -223,8 +212,7 @@ gen6_render_ring_flush(struct intel_ring_buffer *ring, u32 invalidate_domains, u32 flush_domains) { u32 flags = 0; - struct pipe_control *pc = ring-private; - u32 scratch_addr = pc-gtt_offset + 128; + u32 scratch_addr = ring-scratch.gtt_offset + 128; int ret; /* Force SNB workarounds for PIPE_CONTROL flushes */ @@ -316,8 +304,7 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring, u32 invalidate_domains, u32 flush_domains) { u32 flags = 0; - struct pipe_control *pc = ring-private; - u32 scratch_addr = pc-gtt_offset + 128; + u32 scratch_addr = ring-scratch.gtt_offset + 128; int ret; /* @@ -491,68 +478,43 @@ out: static int init_pipe_control(struct intel_ring_buffer *ring) { - struct pipe_control *pc; - struct drm_i915_gem_object *obj; int ret; - if (ring-private) + if (ring-scratch.obj) return 0; - pc = kmalloc(sizeof(*pc), GFP_KERNEL); - if (!pc) - return -ENOMEM; - - obj = i915_gem_alloc_object(ring-dev, 4096); - if (obj == NULL) { + ring-scratch.obj = i915_gem_alloc_object(ring-dev, 4096); + if (ring-scratch.obj == NULL) { DRM_ERROR(Failed to allocate seqno page\n); ret = -ENOMEM; goto err; } - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); + i915_gem_object_set_cache_level(ring-scratch.obj, I915_CACHE_LLC); - ret = i915_gem_obj_ggtt_pin(obj, 4096, true, false); + ret = i915_gem_obj_ggtt_pin(ring-scratch.obj, 4096, true, false); if (ret) goto err_unref; - pc-gtt_offset = i915_gem_obj_ggtt_offset(obj); - pc-cpu_page = kmap(sg_page(obj-pages-sgl)); - if (pc-cpu_page == NULL) { + ring-scratch.gtt_offset = i915_gem_obj_ggtt_offset(ring-scratch.obj); + ring-scratch.cpu_page = kmap(sg_page(ring-scratch.obj-pages-sgl)); + if (ring-scratch.cpu_page == NULL) { ret = -ENOMEM; goto err_unpin; } DRM_DEBUG_DRIVER(%s pipe control offset: 0x%08x\n, -ring-name, pc-gtt_offset); - - pc-obj = obj; - ring-private = pc; +ring-name, ring-scratch.gtt_offset); return 0; err_unpin: - i915_gem_object_unpin(obj); + i915_gem_object_unpin(ring-scratch.obj); err_unref: - drm_gem_object_unreference(obj-base); + drm_gem_object_unreference(ring-scratch.obj-base); err: - kfree(pc); return ret; } -static void -cleanup_pipe_control(struct
[Intel-gfx] [PATCH 2/2] drm/i915: Use RCS flips on Ivybridge+
RCS flips do work on Iybridge+ so long as we can unmask the messages through DERRMR. However, there are quite a few workarounds mentioned regarding unmasking more than one event or triggering more than one message through DERRMR. Those workarounds in principle prevent us from performing pipelined flips (and asynchronous flips across multiple planes) and equally apply to the known good BCS ring. Given that it already appears to work, and also appears to work with unmasking all 3 planes at once (and queuing flips across multiple planes), be brave. Bugzlla: https://bugs.freedesktop.org/show_bug.cgi?id=67600 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Lightly-tested-by: Stephane Marchesin marche...@icps.u-strasbg.fr Cc: Stephane Marchesin marche...@icps.u-strasbg.fr Cc: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_reg.h | 18 drivers/gpu/drm/i915/intel_display.c | 40 2 files changed, 49 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index c6f5009..df168f4 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -230,6 +230,7 @@ * address/value pairs. Don't overdue it, though, x = 2^4 must hold! */ #define MI_LOAD_REGISTER_IMM(x)MI_INSTR(0x22, 2*x-1) +#define MI_STORE_REGISTER_MEM(x) MI_INSTR(0x24, 2*x-1) #define MI_FLUSH_DWMI_INSTR(0x26, 1) /* for GEN6 */ #define MI_FLUSH_DW_STORE_INDEX (121) #define MI_INVALIDATE_TLB(118) @@ -678,6 +679,23 @@ #define FPGA_DBG_RM_NOCLAIM (131) #define DERRMR 0x44050 +#define DERRMR_PIPEA_SCANLINE(10) +#define DERRMR_PIPEA_PRI_FLIP_DONE (11) +#define DERRMR_PIPEA_SPR_FLIP_DONE (12) +#define DERRMR_PIPEA_VBLANK (13) +#define DERRMR_PIPEA_HBLANK (15) +#define DERRMR_PIPEB_SCANLINE(18) +#define DERRMR_PIPEB_PRI_FLIP_DONE (19) +#define DERRMR_PIPEB_SPR_FLIP_DONE (110) +#define DERRMR_PIPEB_VBLANK (111) +#define DERRMR_PIPEB_HBLANK (113) +/* Note that PIPEC is not a simple translation of PIPEA/PIPEB */ +#define DERRMR_PIPEC_SCANLINE(114) +#define DERRMR_PIPEC_PRI_FLIP_DONE (115) +#define DERRMR_PIPEC_SPR_FLIP_DONE (120) +#define DERRMR_PIPEC_VBLANK (121) +#define DERRMR_PIPEC_HBLANK (122) + /* GM45+ chicken bits -- debug workaround bits that may be required * for various sorts of correct behavior. The top 16 bits of each are diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 9748dce..ffbcbd1 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -7826,12 +7826,6 @@ err: return ret; } -/* - * On gen7 we currently use the blit ring because (in early silicon at least) - * the render ring doesn't give us interrpts for page flip completion, which - * means clients will hang after the first flip is queued. Fortunately the - * blit ring generates interrupts properly, so use it instead. - */ static int intel_gen7_queue_flip(struct drm_device *dev, struct drm_crtc *crtc, struct drm_framebuffer *fb, @@ -7839,9 +7833,13 @@ static int intel_gen7_queue_flip(struct drm_device *dev, { struct drm_i915_private *dev_priv = dev-dev_private; struct intel_crtc *intel_crtc = to_intel_crtc(crtc); - struct intel_ring_buffer *ring = dev_priv-ring[BCS]; + struct intel_ring_buffer *ring; uint32_t plane_bit = 0; - int ret; + int len, ret; + + ring = obj-ring; + if (ring == NULL || ring-id != RCS) + ring = dev_priv-ring[BCS]; ret = intel_pin_and_fence_fb_obj(dev, obj, ring); if (ret) @@ -7863,10 +7861,34 @@ static int intel_gen7_queue_flip(struct drm_device *dev, goto err_unpin; } - ret = intel_ring_begin(ring, 4); + len = 4; + if (ring-id == RCS) + len += 6; + + ret = intel_ring_begin(ring, len); if (ret) goto err_unpin; + /* Unmask the flip-done completion message. Note that the bspec says that +* we should do this for both the BCS and RCS, and that we must not unmask +* more than one flip event at any time (or ensure that one flip message +* can be sent by waiting for flip-done prior to queueing new flips). +* Experimentation says that BCS works despite DERRMR masking all +* flip-done completion events and that unmasking all planes at once +* for the RCS also doesn't appear to drop events. Setting the DERRMR +* to zero does lead to lockups within MI_DISPLAY_FLIP. +*/ + if (ring-id == RCS) { + intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1)); + intel_ring_emit(ring, DERRMR); +
Re: [Intel-gfx] [PATCH] Add second DRI driver name (DRI2DriverVDPAU)
Ибрагимов Ринат ibragimovri...@mail.ru writes: libvdpau uses second DRI driver name to determine which VDPAU driver to use. This patch will allow libvdpau choose libvdpau_i965.so on systems with Intel GPUs, libvdpau_nvidia.so on those with nVidia ones, and so on. I'm experimenting now with generic vdpau driver using OpenGL/VA-API, it would be convenient to have this driver selection working without manual driver selection. If it's a generic driver, why would it need a i965 string passed over the DRI2 protocol to find it? One of the things I'm planning on doing is removing use of the protocol-provided DRI2 driver name from Mesa -- Mesa knows what drivers it has built, and it knows how to detect those devices (in the EGL code), so why would we pay attention to what the X Server thinks our driver's name is? Seems like vdpau could probably do the same. pgpR0lyDH__GV.pgp Description: PGP signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Use RCS flips on Ivybridge+
On Mon, Aug 26, 2013 at 12:58 PM, Chris Wilson ch...@chris-wilson.co.ukwrote: RCS flips do work on Iybridge+ so long as we can unmask the messages through DERRMR. However, there are quite a few workarounds mentioned regarding unmasking more than one event or triggering more than one message through DERRMR. Those workarounds in principle prevent us from performing pipelined flips (and asynchronous flips across multiple planes) and equally apply to the known good BCS ring. Given that it already appears to work, and also appears to work with unmasking all 3 planes at once (and queuing flips across multiple planes), be brave. Bugzlla: https://bugs.freedesktop.org/show_bug.cgi?id=67600 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Lightly-tested-by: Stephane Marchesin marche...@icps.u-strasbg.fr You can go ahead and say Tested-by, I ran my usual tests for 3 hours and it didn't crash/show an issue. It would crash in ~10-30 minutes with the other patches. Stéphane Cc: Stephane Marchesin marche...@icps.u-strasbg.fr Cc: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_reg.h | 18 drivers/gpu/drm/i915/intel_display.c | 40 2 files changed, 49 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index c6f5009..df168f4 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -230,6 +230,7 @@ * address/value pairs. Don't overdue it, though, x = 2^4 must hold! */ #define MI_LOAD_REGISTER_IMM(x)MI_INSTR(0x22, 2*x-1) +#define MI_STORE_REGISTER_MEM(x) MI_INSTR(0x24, 2*x-1) #define MI_FLUSH_DWMI_INSTR(0x26, 1) /* for GEN6 */ #define MI_FLUSH_DW_STORE_INDEX (121) #define MI_INVALIDATE_TLB(118) @@ -678,6 +679,23 @@ #define FPGA_DBG_RM_NOCLAIM (131) #define DERRMR 0x44050 +#define DERRMR_PIPEA_SCANLINE(10) +#define DERRMR_PIPEA_PRI_FLIP_DONE (11) +#define DERRMR_PIPEA_SPR_FLIP_DONE (12) +#define DERRMR_PIPEA_VBLANK (13) +#define DERRMR_PIPEA_HBLANK (15) +#define DERRMR_PIPEB_SCANLINE(18) +#define DERRMR_PIPEB_PRI_FLIP_DONE (19) +#define DERRMR_PIPEB_SPR_FLIP_DONE (110) +#define DERRMR_PIPEB_VBLANK (111) +#define DERRMR_PIPEB_HBLANK (113) +/* Note that PIPEC is not a simple translation of PIPEA/PIPEB */ +#define DERRMR_PIPEC_SCANLINE(114) +#define DERRMR_PIPEC_PRI_FLIP_DONE (115) +#define DERRMR_PIPEC_SPR_FLIP_DONE (120) +#define DERRMR_PIPEC_VBLANK (121) +#define DERRMR_PIPEC_HBLANK (122) + /* GM45+ chicken bits -- debug workaround bits that may be required * for various sorts of correct behavior. The top 16 bits of each are diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 9748dce..ffbcbd1 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -7826,12 +7826,6 @@ err: return ret; } -/* - * On gen7 we currently use the blit ring because (in early silicon at least) - * the render ring doesn't give us interrpts for page flip completion, which - * means clients will hang after the first flip is queued. Fortunately the - * blit ring generates interrupts properly, so use it instead. - */ static int intel_gen7_queue_flip(struct drm_device *dev, struct drm_crtc *crtc, struct drm_framebuffer *fb, @@ -7839,9 +7833,13 @@ static int intel_gen7_queue_flip(struct drm_device *dev, { struct drm_i915_private *dev_priv = dev-dev_private; struct intel_crtc *intel_crtc = to_intel_crtc(crtc); - struct intel_ring_buffer *ring = dev_priv-ring[BCS]; + struct intel_ring_buffer *ring; uint32_t plane_bit = 0; - int ret; + int len, ret; + + ring = obj-ring; + if (ring == NULL || ring-id != RCS) + ring = dev_priv-ring[BCS]; ret = intel_pin_and_fence_fb_obj(dev, obj, ring); if (ret) @@ -7863,10 +7861,34 @@ static int intel_gen7_queue_flip(struct drm_device *dev, goto err_unpin; } - ret = intel_ring_begin(ring, 4); + len = 4; + if (ring-id == RCS) + len += 6; + + ret = intel_ring_begin(ring, len); if (ret) goto err_unpin; + /* Unmask the flip-done completion message. Note that the bspec says that +* we should do this for both the BCS and RCS, and that we must not unmask +* more than one flip event at any time (or ensure that one flip message +* can be sent by waiting for flip-done prior to queueing new flips). +* Experimentation says that BCS works despite DERRMR masking all +* flip-done
Re: [Intel-gfx] [PATCH] drm/i915: Write RING_TAIL once per-request
On Sat, Aug 10, 2013 at 10:16:32PM +0100, Chris Wilson wrote: Ignoring the legacy DRI1 code, and a couple of special cases (to be discussed later), all access to the ring is mediated through requests. The first write to a ring will grab a seqno and mark the ring as having an outstanding_lazy_request. Either through explicitly adding a request after an execbuffer or through an implicit wait (either by the CPU or by a semaphore), that sequence of writes will be terminated with a request. So we can ellide all the intervening writes to the tail register and send the entire command stream to the GPU at once. This will reduce the number of *serialising* writes to the tail register by a factor or 3-5 times (depending upon architecture and number of workarounds, context switches, etc involved). This becomes even more noticeable when the register write is overloaded with a number of debugging tools. The astute reader will wonder if it is then possible to overflow the ring with a single command. It is not. When we start a command sequence to the ring, we check for available space and issue a wait in case we have not. The ring wait will in this case be forced to flush the outstanding register write and then poll the ACTHD for sufficient space to continue. The exception to the rule where everything is inside a request are a few initialisation cases where we may want to write GPU commands via the CS before userspace wakes up and page flips. I'm not a huge fan of having the second intel_ring_advance() that does something other than it sounds like. I'd *much* prefer to not intel_ring_advance() if you are certain more emits will be coming like in the case you mention above. We can add a paranoia check whenever we're about to return to userspace that tail == RING_TAIL Also, without performance data, it's hard to say this indirection is worth it. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_dma.c | 2 +- drivers/gpu/drm/i915/i915_gem_exec.c| 2 +- drivers/gpu/drm/i915/intel_display.c| 10 +- drivers/gpu/drm/i915/intel_ringbuffer.c | 30 -- drivers/gpu/drm/i915/intel_ringbuffer.h | 7 ++- 5 files changed, 29 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index de0b86a..a7a0eb86 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -52,7 +52,7 @@ intel_ring_emit(LP_RING(dev_priv), x) #define ADVANCE_LP_RING() \ - intel_ring_advance(LP_RING(dev_priv)) + __intel_ring_advance(LP_RING(dev_priv)) /** * Lock test for when it's just for synchronization of ring access. diff --git a/drivers/gpu/drm/i915/i915_gem_exec.c b/drivers/gpu/drm/i915/i915_gem_exec.c index a2c6dbf..4da3704 100644 --- a/drivers/gpu/drm/i915/i915_gem_exec.c +++ b/drivers/gpu/drm/i915/i915_gem_exec.c @@ -111,7 +111,7 @@ int i915_gem_exec_clear_object(struct drm_i915_gem_object *obj) intel_ring_emit(ring, 0); intel_ring_emit(ring, MI_NOOP); - intel_ring_advance(ring); + __intel_ring_advance(ring); i915_gem_exec_dirty_object(obj, ring); unpin: diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index b26b50b..6b7ce06 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -7547,7 +7547,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev, intel_ring_emit(ring, 0); /* aux display base address, unused */ intel_mark_page_flip_active(intel_crtc); - intel_ring_advance(ring); + __intel_ring_advance(ring); return 0; err_unpin: @@ -7588,7 +7588,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev, intel_ring_emit(ring, MI_NOOP); intel_mark_page_flip_active(intel_crtc); - intel_ring_advance(ring); + __intel_ring_advance(ring); return 0; err_unpin: @@ -7636,7 +7636,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev, intel_ring_emit(ring, pf | pipesrc); intel_mark_page_flip_active(intel_crtc); - intel_ring_advance(ring); + __intel_ring_advance(ring); return 0; err_unpin: @@ -7680,7 +7680,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev, intel_ring_emit(ring, pf | pipesrc); intel_mark_page_flip_active(intel_crtc); - intel_ring_advance(ring); + __intel_ring_advance(ring); return 0; err_unpin: @@ -7736,7 +7736,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev, intel_ring_emit(ring, (MI_NOOP)); intel_mark_page_flip_active(intel_crtc); - intel_ring_advance(ring); + __intel_ring_advance(ring); return 0; err_unpin: diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 7de95da..8d5dd65 100644 ---
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Use RCS flips on Ivybridge+
On Mon, Aug 26, 2013 at 1:42 PM, Stéphane Marchesin stephane.marche...@gmail.com wrote: On Mon, Aug 26, 2013 at 12:58 PM, Chris Wilson ch...@chris-wilson.co.ukwrote: RCS flips do work on Iybridge+ so long as we can unmask the messages through DERRMR. However, there are quite a few workarounds mentioned regarding unmasking more than one event or triggering more than one message through DERRMR. Those workarounds in principle prevent us from performing pipelined flips (and asynchronous flips across multiple planes) and equally apply to the known good BCS ring. Given that it already appears to work, and also appears to work with unmasking all 3 planes at once (and queuing flips across multiple planes), be brave. Bugzlla: https://bugs.freedesktop.org/show_bug.cgi?id=67600 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Lightly-tested-by: Stephane Marchesin marche...@icps.u-strasbg.fr You can go ahead and say Tested-by, I ran my usual tests for 3 hours and it didn't crash/show an issue. It would crash in ~10-30 minutes with the other patches. Oh actually this one is a bit different... Give me 3 hours :) Stéphane Stéphane Cc: Stephane Marchesin marche...@icps.u-strasbg.fr Cc: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_reg.h | 18 drivers/gpu/drm/i915/intel_display.c | 40 2 files changed, 49 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index c6f5009..df168f4 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -230,6 +230,7 @@ * address/value pairs. Don't overdue it, though, x = 2^4 must hold! */ #define MI_LOAD_REGISTER_IMM(x)MI_INSTR(0x22, 2*x-1) +#define MI_STORE_REGISTER_MEM(x) MI_INSTR(0x24, 2*x-1) #define MI_FLUSH_DWMI_INSTR(0x26, 1) /* for GEN6 */ #define MI_FLUSH_DW_STORE_INDEX (121) #define MI_INVALIDATE_TLB(118) @@ -678,6 +679,23 @@ #define FPGA_DBG_RM_NOCLAIM (131) #define DERRMR 0x44050 +#define DERRMR_PIPEA_SCANLINE(10) +#define DERRMR_PIPEA_PRI_FLIP_DONE (11) +#define DERRMR_PIPEA_SPR_FLIP_DONE (12) +#define DERRMR_PIPEA_VBLANK (13) +#define DERRMR_PIPEA_HBLANK (15) +#define DERRMR_PIPEB_SCANLINE(18) +#define DERRMR_PIPEB_PRI_FLIP_DONE (19) +#define DERRMR_PIPEB_SPR_FLIP_DONE (110) +#define DERRMR_PIPEB_VBLANK (111) +#define DERRMR_PIPEB_HBLANK (113) +/* Note that PIPEC is not a simple translation of PIPEA/PIPEB */ +#define DERRMR_PIPEC_SCANLINE(114) +#define DERRMR_PIPEC_PRI_FLIP_DONE (115) +#define DERRMR_PIPEC_SPR_FLIP_DONE (120) +#define DERRMR_PIPEC_VBLANK (121) +#define DERRMR_PIPEC_HBLANK (122) + /* GM45+ chicken bits -- debug workaround bits that may be required * for various sorts of correct behavior. The top 16 bits of each are diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 9748dce..ffbcbd1 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -7826,12 +7826,6 @@ err: return ret; } -/* - * On gen7 we currently use the blit ring because (in early silicon at least) - * the render ring doesn't give us interrpts for page flip completion, which - * means clients will hang after the first flip is queued. Fortunately the - * blit ring generates interrupts properly, so use it instead. - */ static int intel_gen7_queue_flip(struct drm_device *dev, struct drm_crtc *crtc, struct drm_framebuffer *fb, @@ -7839,9 +7833,13 @@ static int intel_gen7_queue_flip(struct drm_device *dev, { struct drm_i915_private *dev_priv = dev-dev_private; struct intel_crtc *intel_crtc = to_intel_crtc(crtc); - struct intel_ring_buffer *ring = dev_priv-ring[BCS]; + struct intel_ring_buffer *ring; uint32_t plane_bit = 0; - int ret; + int len, ret; + + ring = obj-ring; + if (ring == NULL || ring-id != RCS) + ring = dev_priv-ring[BCS]; ret = intel_pin_and_fence_fb_obj(dev, obj, ring); if (ret) @@ -7863,10 +7861,34 @@ static int intel_gen7_queue_flip(struct drm_device *dev, goto err_unpin; } - ret = intel_ring_begin(ring, 4); + len = 4; + if (ring-id == RCS) + len += 6; + + ret = intel_ring_begin(ring, len); if (ret) goto err_unpin; + /* Unmask the flip-done completion message. Note that the bspec says that +* we should do this for both the BCS and RCS, and that we must not unmask +* more than one flip event at any time (or ensure that one flip message +
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Use RCS flips on Ivybridge+
On Mon, Aug 26, 2013 at 01:43:25PM -0700, Stéphane Marchesin wrote: On Mon, Aug 26, 2013 at 1:42 PM, St�phane Marchesin [1]stephane.marche...@gmail.com wrote: On Mon, Aug 26, 2013 at 12:58 PM, Chris Wilson [2]ch...@chris-wilson.co.uk wrote: RCS flips do work on Iybridge+ so long as we can unmask the messages through DERRMR. However, there are quite a few workarounds mentioned regarding unmasking more than one event or triggering more than one message through DERRMR. Those workarounds in principle prevent us from performing pipelined flips (and asynchronous flips across multiple planes) and equally apply to the known good BCS ring. Given that it already appears to work, and also appears to work with unmasking all 3 planes at once (and queuing flips across multiple planes), be brave. Bugzlla: [3]https://bugs.freedesktop.org/show_bug.cgi?id=67600 Signed-off-by: Chris Wilson [4]ch...@chris-wilson.co.uk Lightly-tested-by: Stephane Marchesin [5]marche...@icps.u-strasbg.fr You can go ahead and say Tested-by, I ran my usual tests for 3 hours and it didn't crash/show an issue. It would crash in ~10-30 minutes with the other patches. Oh actually this one is a bit different... Give me 3 hours :) St�phane I should have mentionted this was v2, and your testing was for v1. Hopefully, I've moved the code around correctly. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] Add second DRI driver name (DRI2DriverVDPAU)
Понедельник, 26 августа 2013, 13:40 -07:00 от Eric Anholt e...@anholt.net: Ибрагимов Ринат ibragimovri...@mail.ru writes: libvdpau uses second DRI driver name to determine which VDPAU driver to use. This patch will allow libvdpau choose libvdpau_i965.so on systems with Intel GPUs, libvdpau_nvidia.so on those with nVidia ones, and so on. I'm experimenting now with generic vdpau driver using OpenGL/VA-API, it would be convenient to have this driver selection working without manual driver selection. If it's a generic driver, why would it need a i965 string passed over the DRI2 protocol to find it? It doesn't actually. It's just to simplify distribution package management. If one names driver libvdpau_nvidia.so.1, it will break VDPAU on nVidia equipped machines. With this patch one can name it libvdpau_i965.so.1 and driver will be loaded on intel equipped machines only. One of the things I'm planning on doing is removing use of the protocol-provided DRI2 driver name from Mesa -- Mesa knows what drivers it has built, and it knows how to detect those devices (in the EGL code), so why would we pay attention to what the X Server thinks our driver's name is? Seems like vdpau could probably do the same. VDPAU uses entry point library (libvdpau) which loads backend drivers. libvdpau uses second DRI2 driver name amongst other methods to determine which driver to load. It can not rely on Mesa's internal knowledge, because it must work with proprietary nVidia driver too. --- Rinat ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Rename __intel_ring_advance() to intel_ring_commit()
Ben complained in his review of write RING_TAIL once per request that intel_ring_advance() did nothing and was confusing with __intel_ring_advance() doing the hard work. This patch drops the intel_ring_advance(), and renames __intel_ring_advance() to intel_ring_commit(). We go one step further and rename intel_ring_begin() to intel_ring_get_space() so that the asymmetry looks less startling. The advantage of intel_ring_begin()/intel_ring_advance() is that that demarcated a known number of dword writes to the ring, with the intention of making it easier for the reviewer to check the count. The advantage of intel_ring_get_space(), ..., intel_ring_commit() is that the relaxed rules on the TAIL write are more obvious. There is no implicit nop (intel_ring_advance) to befuddle the reader. If the commands need to be emitter immediate, the reader knows to look for an explicit intel_ring_commit(). Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_dma.c|4 +- drivers/gpu/drm/i915/i915_gem_context.c|7 +-- drivers/gpu/drm/i915/i915_gem_exec.c |4 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c |9 +-- drivers/gpu/drm/i915/intel_display.c | 20 +++ drivers/gpu/drm/i915/intel_overlay.c | 12 ++-- drivers/gpu/drm/i915/intel_ringbuffer.c| 82 +++- drivers/gpu/drm/i915/intel_ringbuffer.h|8 +-- 8 files changed, 58 insertions(+), 88 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index c0fb23e..fb50467 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -46,13 +46,13 @@ #define LP_RING(d) (((struct drm_i915_private *)(d))-ring[RCS]) #define BEGIN_LP_RING(n) \ - intel_ring_begin(LP_RING(dev_priv), (n)) + intel_ring_get_space(LP_RING(dev_priv), (n)) #define OUT_RING(x) \ intel_ring_emit(LP_RING(dev_priv), x) #define ADVANCE_LP_RING() \ - __intel_ring_advance(LP_RING(dev_priv)) + intel_ring_commit(LP_RING(dev_priv)) /** * Lock test for when it's just for synchronization of ring access. diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index fae2f4d..729b1bb 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -370,7 +370,7 @@ mi_set_context(struct intel_ring_buffer *ring, if (!new_context-is_initialized) len += 2; - ret = intel_ring_begin(ring, len); + ret = intel_ring_get_space(ring, len); if (ret) return ret; @@ -419,10 +419,7 @@ mi_set_context(struct intel_ring_buffer *ring, break; } - intel_ring_advance(ring); - - return ret; - + return 0; } static int do_switch(struct i915_hw_context *to) diff --git a/drivers/gpu/drm/i915/i915_gem_exec.c b/drivers/gpu/drm/i915/i915_gem_exec.c index 4da3704..2d434b2 100644 --- a/drivers/gpu/drm/i915/i915_gem_exec.c +++ b/drivers/gpu/drm/i915/i915_gem_exec.c @@ -100,7 +100,7 @@ int i915_gem_exec_clear_object(struct drm_i915_gem_object *obj) if (ret) goto unpin; - ret = intel_ring_begin(ring, 6); + ret = intel_ring_get_space(ring, 6); if (ret) goto unpin; @@ -111,7 +111,7 @@ int i915_gem_exec_clear_object(struct drm_i915_gem_object *obj) intel_ring_emit(ring, 0); intel_ring_emit(ring, MI_NOOP); - __intel_ring_advance(ring); + intel_ring_commit(ring); i915_gem_exec_dirty_object(obj, ring); unpin: diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 9b7785d..ca0a2f5 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -899,7 +899,7 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev, if (!IS_GEN7(dev) || ring != dev_priv-ring[RCS]) return 0; - ret = intel_ring_begin(ring, 4 * 3); + ret = intel_ring_get_space(ring, 4 * 3); if (ret) return ret; @@ -909,8 +909,6 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev, intel_ring_emit(ring, 0); } - intel_ring_advance(ring); - return 0; } @@ -1121,15 +1119,14 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, if (ring == dev_priv-ring[RCS] mode != dev_priv-relative_constants_mode) { - ret = intel_ring_begin(ring, 4); + ret = intel_ring_get_space(ring, 4); if (ret) - goto err; + goto err; intel_ring_emit(ring, MI_NOOP); intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1)); intel_ring_emit(ring, INSTPM); intel_ring_emit(ring, mask 16 | mode); - intel_ring_advance(ring);
[Intel-gfx] [PATCH 01/17] drm/i915: Do not add an interrupt for a context switch
From: Chris Wilson ch...@chris-wilson.co.uk We use the request to ensure we hold a reference to the context for the duration that it remains in use by the ring. Each request only holds a reference to the current context, hence we emit a request after switching contexts with the final reference to the old context. However, the extra interrupt caused by that request is not useful (no timing critical function will wait for the context object), instead the overhead of servicing the IRQ shows up in some (lightweight) benchmarks. In order to keep the useful property of using the request to manage the context lifetime, we want to add a dummy request that is associated with the interrupt from the subsequent real request following the batch. The extra interrupt was added as a side-effect of using i915_add_request() in commit 112522f6789581824903f6f72082b5b841a7f0f9 Author: Chris Wilson ch...@chris-wilson.co.uk Date: Thu May 2 16:48:07 2013 +0300 drm/i915: put context upon switching v2: Daniel convinced me that the request here was solely for context lifetime tracking and that we have the active ref to keep the object alive whilst the MI_SET_CONTEXT. So the only concern then is which context should get the blame for MI_SET_CONTEXT failing. The old scheme added a request for the old context so that any hang upto and including the switch away would mark the old context as guilty. Now any hang here implicates the new context. However since we have already gone through a complete flush with the last context in its last request, and all that lies in no-man's-land is an invalidate flush and the MI_SET_CONTEXT, we should be safe in not unduly placing blame on the new context. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Ben Widawsky b...@bwidawsk.net Cc: Paulo Zanoni paulo.r.zan...@intel.com --- drivers/gpu/drm/i915/i915_gem.c | 1 - drivers/gpu/drm/i915/i915_gem_context.c | 12 +--- 2 files changed, 1 insertion(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 2d1cb10..56c9104 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2046,7 +2046,6 @@ int __i915_add_request(struct intel_ring_buffer *ring, if (request == NULL) return -ENOMEM; - /* Record the position of the start of the request so that * should we detect the updated seqno part-way through the * GPU processing the request, we never over-estimate the diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 403309c..b6da70b 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -451,17 +451,7 @@ static int do_switch(struct i915_hw_context *to) from-obj-dirty = 1; BUG_ON(from-obj-ring != ring); - ret = i915_add_request(ring, NULL); - if (ret) { - /* Too late, we've already scheduled a context switch. -* Try to undo the change so that the hw state is -* consistent with out tracking. In case of emergency, -* scream. -*/ - WARN_ON(mi_set_context(ring, from, MI_RESTORE_INHIBIT)); - return ret; - } - + /* obj is kept alive until the next request by its active ref */ i915_gem_object_unpin(from-obj); i915_gem_context_unreference(from); } -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 00/17] drm-intel-collector WW34 - Simple patches as series for review
Hi all, Let me introduce drm-intel-collector branch: http://cgit.freedesktop.org/~vivijim/drm-intel/log/?h=drm-intel-collector To describe drm-intel-collector I'll quote Daniel: The overall idea is to make sure that simple patches don't get lost. Bigger patch series or feature work tends to not get lost, and really trivial patches I tend to merge right away. But 1-2 patch stuff in between is occasionally lost Process: 1. Daniel pushs drm-intel-testing 2. I rebase drm-intel-collector onto drm-intel-testing 3. I collect all simple (1-2) patches that wasn't yet reviewed and not queued by Daniel 4. Request automated QA's PRTS automated i-g-t tests comparing drm-intel-testing x drm-intel-collector 5. If tests are ok I send the patches as a series to intel-gfx mailing list for better tracking and to be reviewed. There are some reasons that some patches can be left behind: 1. It was send so long time ago. I started with patches from Jul 26th. 2. Your patch didn't applied cleanly and I couldn't easily solve the conflicts. 3. Kernel didn't compiled with your patch. 4. I simply missed it. If you believe this is the case please warn me. Please help me to get these patches reviewed and queued by Daniel. Also, please let me know if you have further ideas how to improve this process. Thanks in advance, Rodrigo. Chris Wilson (13): drm/i915: Do not add an interrupt for a context switch drm/i915: Rearrange the comments in i915_add_request() drm/i915: Pin pages whilst mapping the dma-buf drm/i915: Cancel outstanding modeset workers before suspend drm/i915: Always prefer CPU relocations with LLC drm/i915: Report requested frequency alongside current frequency in debugfs drm/i915: Move the conditional seqno query into the tracepoint drm/i915: Add some missing steps to i915_driver_load error path drm/i915: Asynchronously perform the set-base for a simple modeset drm/i915: Align tiled scanouts from stolen memory to 256k in the GTT drm/i915: Apply the force-detect VGA w/a to Valleyview drm/i915: Pair seqno completion tracepoint with its dispatch RFM drm/i915: Boost RPS frequency for CPU stalls Daniel Vetter (1): drm/i915: check that the i965g/gm 4G limit is really obeyed Jesse Barnes (2): drm/i915: split PCI IDs out into i915_drm.h v4 x86: add early quirk for reserving Intel graphics stolen memory v5 Rodrigo Vivi (1): drm/i915: Enable Lower Slice on Haswell GT3. arch/x86/kernel/early-quirks.c | 154 + drivers/gpu/drm/i915/i915_debugfs.c| 11 +- drivers/gpu/drm/i915/i915_dma.c| 19 ++- drivers/gpu/drm/i915/i915_drv.c| 164 +- drivers/gpu/drm/i915/i915_drv.h| 3 + drivers/gpu/drm/i915/i915_gem.c| 25 +++- drivers/gpu/drm/i915/i915_gem_context.c| 12 +- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 41 +++--- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 7 +- drivers/gpu/drm/i915/i915_irq.c| 2 +- drivers/gpu/drm/i915/i915_reg.h| 20 +-- drivers/gpu/drm/i915/i915_trace.h | 33 +++-- drivers/gpu/drm/i915/intel_crt.c | 2 +- drivers/gpu/drm/i915/intel_display.c | 34 +++-- include/drm/i915_drm.h | 34 + include/drm/i915_pciids.h | 211 + 16 files changed, 566 insertions(+), 206 deletions(-) create mode 100644 include/drm/i915_pciids.h -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 02/17] drm/i915: Rearrange the comments in i915_add_request()
From: Chris Wilson ch...@chris-wilson.co.uk The comments were a little out-of-sequence with the code, forcing the reader to jump around whilst reading. Whilst moving the comments around, add one to explain the context reference. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_gem.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 56c9104..d5c8a02 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2063,8 +2063,6 @@ int __i915_add_request(struct intel_ring_buffer *ring, request-ring = ring; request-head = request_start; request-tail = request_ring_position; - request-ctx = ring-last_context; - request-batch_obj = obj; /* Whilst this request exists, batch_obj will be on the * active_list, and so will hold the active reference. Only when this @@ -2072,7 +2070,12 @@ int __i915_add_request(struct intel_ring_buffer *ring, * inactive_list and lose its active reference. Hence we do not need * to explicitly hold another reference here. */ + request-batch_obj = obj; + /* Hold a reference to the current context so that we can inspect +* it later in case a hangcheck error event fires. +*/ + request-ctx = ring-last_context; if (request-ctx) i915_gem_context_reference(request-ctx); -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 03/17] drm/i915: Pin pages whilst mapping the dma-buf
From: Chris Wilson ch...@chris-wilson.co.uk As we attempt to kmalloc after calling get_pages, there is a possibility that the shrinker may reap the pages we just acquired. To prevent this we need to increment the pages_pin_count early, so rearrange the code and error paths to make it so. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 41 ++ 1 file changed, 22 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c index e918b05..7d5752f 100644 --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c @@ -42,27 +42,24 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme ret = i915_mutex_lock_interruptible(obj-base.dev); if (ret) - return ERR_PTR(ret); + goto err; ret = i915_gem_object_get_pages(obj); - if (ret) { - st = ERR_PTR(ret); - goto out; - } + if (ret) + goto err_unlock; + + i915_gem_object_pin_pages(obj); /* Copy sg so that we make an independent mapping */ st = kmalloc(sizeof(struct sg_table), GFP_KERNEL); if (st == NULL) { - st = ERR_PTR(-ENOMEM); - goto out; + ret = -ENOMEM; + goto err_unpin; } ret = sg_alloc_table(st, obj-pages-nents, GFP_KERNEL); - if (ret) { - kfree(st); - st = ERR_PTR(ret); - goto out; - } + if (ret) + goto err_free; src = obj-pages-sgl; dst = st-sgl; @@ -73,17 +70,23 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme } if (!dma_map_sg(attachment-dev, st-sgl, st-nents, dir)) { - sg_free_table(st); - kfree(st); - st = ERR_PTR(-ENOMEM); - goto out; + ret =-ENOMEM; + goto err_free_sg; } - i915_gem_object_pin_pages(obj); - -out: mutex_unlock(obj-base.dev-struct_mutex); return st; + +err_free_sg: + sg_free_table(st); +err_free: + kfree(st); +err_unpin: + i915_gem_object_unpin_pages(obj); +err_unlock: + mutex_unlock(obj-base.dev-struct_mutex); +err: + return ERR_PTR(ret); } static void i915_gem_unmap_dma_buf(struct dma_buf_attachment *attachment, -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 05/17] drm/i915: Cancel outstanding modeset workers before suspend
From: Chris Wilson ch...@chris-wilson.co.uk Upon resume we will do a complete restoration of the mode and so reset all tasks, but during suspend (and unload) we need to make sure that no workers run concurrently with our suspend code. Or worse just after. The issue was first raised whilst tackling a suspend issue with Takashi Iwai (http://lists.freedesktop.org/archives/intel-gfx/2012-April/016738.html) and then it was independently rediscovered by Chuanshen Lui. v2: Rebase for the lost year. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Acked-by: Takashi Iwai ti...@suse.de Cc: Liu, Chuansheng chuansheng@intel.com --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/intel_display.c | 21 - 2 files changed, 17 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 8ee15b4..7ac5ca5 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2199,6 +2199,7 @@ extern void intel_modeset_init_hw(struct drm_device *dev); extern void intel_modeset_suspend_hw(struct drm_device *dev); extern void intel_modeset_init(struct drm_device *dev); extern void intel_modeset_gem_init(struct drm_device *dev); +extern void intel_modeset_quiesce(struct drm_device *dev); extern void intel_modeset_cleanup(struct drm_device *dev); extern int intel_modeset_vga_set_state(struct drm_device *dev, bool state); extern void intel_modeset_setup_hw_state(struct drm_device *dev, diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index bcb62fe..74dbc68 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -10030,6 +10030,7 @@ void intel_modeset_init_hw(struct drm_device *dev) void intel_modeset_suspend_hw(struct drm_device *dev) { + intel_modeset_quiesce(dev); intel_suspend_hw(dev); } @@ -10474,9 +10475,19 @@ void intel_modeset_gem_init(struct drm_device *dev) intel_modeset_setup_hw_state(dev, false); } -void intel_modeset_cleanup(struct drm_device *dev) +void intel_modeset_quiesce(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; + + cancel_work_sync(dev_priv-hotplug_work); + cancel_work_sync(dev_priv-rps.work); + + /* catch all required for dev_priv-wq */ + flush_scheduled_work(); +} + +void intel_modeset_cleanup(struct drm_device *dev) +{ struct drm_crtc *crtc; /* @@ -10485,7 +10496,10 @@ void intel_modeset_cleanup(struct drm_device *dev) * experience fancy races otherwise. */ drm_irq_uninstall(dev); - cancel_work_sync(dev_priv-hotplug_work); + + /* flush any delayed tasks or pending work */ + intel_modeset_quiesce(dev); + /* * Due to the hpd irq storm handling the hotplug work can re-arm the * poll handlers. Hence disable polling after hpd handling is shut down. @@ -10512,9 +10526,6 @@ void intel_modeset_cleanup(struct drm_device *dev) mutex_unlock(dev-struct_mutex); - /* flush any delayed tasks or pending work */ - flush_scheduled_work(); - /* destroy backlight, if any, before the connectors */ intel_panel_destroy_backlight(dev); -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 04/17] drm/i915: check that the i965g/gm 4G limit is really obeyed
From: Daniel Vetter daniel.vet...@ffwll.ch In truly crazy circumstances shmem might give us the wrong type of page. So be a bit paranoid and double check this. Cc: Rob Clark robdcl...@gmail.com References: http://lkml.org/lkml/2011/7/11/238 Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/i915_gem.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index d5c8a02..7c92923 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1829,6 +1829,9 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) sg-length += PAGE_SIZE; } last_pfn = page_to_pfn(page); + + /* Check that the i965g/gm workaround works. */ + WARN_ON((gfp __GFP_DMA32) (last_pfn = 0x0010UL)); } #ifdef CONFIG_SWIOTLB if (!swiotlb_nr_tbl()) -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 06/17] drm/i915: split PCI IDs out into i915_drm.h v4
From: Jesse Barnes jbar...@virtuousgeek.org For use by userspace (at some point in the future) and other kernel code. v2: move PCI IDs to uabi (Chris) move PCI IDs to drm/ (Dave) v3: fixup Quanta detection - needs to come first (Daniel) v4: fix up PCI match structure init for easier use by userspace (Chris) Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org --- drivers/gpu/drm/i915/i915_drv.c | 164 +++ include/drm/i915_drm.h | 2 + include/drm/i915_pciids.h | 211 3 files changed, 247 insertions(+), 130 deletions(-) create mode 100644 include/drm/i915_pciids.h diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 735dd56..3872f66 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -157,25 +157,6 @@ MODULE_PARM_DESC(prefault_disable, static struct drm_driver driver; extern int intel_agp_enabled; -#define INTEL_VGA_DEVICE(id, info) { \ - .class = PCI_BASE_CLASS_DISPLAY 16, \ - .class_mask = 0xff, \ - .vendor = 0x8086, \ - .device = id, \ - .subvendor = PCI_ANY_ID,\ - .subdevice = PCI_ANY_ID,\ - .driver_data = (unsigned long) info } - -#define INTEL_QUANTA_VGA_DEVICE(info) {\ - .class = PCI_BASE_CLASS_DISPLAY 16, \ - .class_mask = 0xff, \ - .vendor = 0x8086, \ - .device = 0x16a,\ - .subvendor = 0x152d,\ - .subdevice = 0x8990,\ - .driver_data = (unsigned long) info } - - static const struct intel_device_info intel_i830_info = { .gen = 2, .is_mobile = 1, .cursor_needs_physical = 1, .num_pipes = 2, .has_overlay = 1, .overlay_needs_physical = 1, @@ -350,118 +331,41 @@ static const struct intel_device_info intel_haswell_m_info = { .has_vebox_ring = 1, }; +/* + * Make sure any device matches here are from most specific to most + * general. For example, since the Quanta match is based on the subsystem + * and subvendor IDs, we need it to come before the more general IVB + * PCI ID matches, otherwise we'll use the wrong info struct above. + */ +#define INTEL_PCI_IDS \ + INTEL_I830_IDS(intel_i830_info), \ + INTEL_I845G_IDS(intel_845g_info), \ + INTEL_I85X_IDS(intel_i85x_info), \ + INTEL_I865G_IDS(intel_i865g_info), \ + INTEL_I915G_IDS(intel_i915g_info), \ + INTEL_I915GM_IDS(intel_i915gm_info), \ + INTEL_I945G_IDS(intel_i945g_info), \ + INTEL_I945GM_IDS(intel_i945gm_info), \ + INTEL_I965G_IDS(intel_i965g_info), \ + INTEL_G33_IDS(intel_g33_info), \ + INTEL_I965GM_IDS(intel_i965gm_info), \ + INTEL_GM45_IDS(intel_gm45_info), \ + INTEL_G45_IDS(intel_g45_info), \ + INTEL_PINEVIEW_IDS(intel_pineview_info), \ + INTEL_IRONLAKE_D_IDS(intel_ironlake_d_info), \ + INTEL_IRONLAKE_M_IDS(intel_ironlake_m_info), \ + INTEL_SNB_D_IDS(intel_sandybridge_d_info), \ + INTEL_SNB_M_IDS(intel_sandybridge_m_info), \ + INTEL_IVB_Q_IDS(intel_ivybridge_q_info), /* must be first IVB */ \ + INTEL_IVB_M_IDS(intel_ivybridge_m_info), \ + INTEL_IVB_D_IDS(intel_ivybridge_d_info), \ + INTEL_HSW_D_IDS(intel_haswell_d_info), \ + INTEL_HSW_M_IDS(intel_haswell_m_info), \ + INTEL_VLV_M_IDS(intel_valleyview_m_info), \ + INTEL_VLV_D_IDS(intel_valleyview_d_info) + static const struct pci_device_id pciidlist[] = { /* aka */ - INTEL_VGA_DEVICE(0x3577, intel_i830_info), /* I830_M */ - INTEL_VGA_DEVICE(0x2562, intel_845g_info), /* 845_G */ - INTEL_VGA_DEVICE(0x3582, intel_i85x_info), /* I855_GM */ - INTEL_VGA_DEVICE(0x358e, intel_i85x_info), - INTEL_VGA_DEVICE(0x2572, intel_i865g_info),/* I865_G */ - INTEL_VGA_DEVICE(0x2582, intel_i915g_info),/* I915_G */ - INTEL_VGA_DEVICE(0x258a, intel_i915g_info),/* E7221_G */ - INTEL_VGA_DEVICE(0x2592, intel_i915gm_info), /* I915_GM */ - INTEL_VGA_DEVICE(0x2772, intel_i945g_info),/* I945_G */ - INTEL_VGA_DEVICE(0x27a2, intel_i945gm_info), /* I945_GM */ - INTEL_VGA_DEVICE(0x27ae, intel_i945gm_info), /* I945_GME */ - INTEL_VGA_DEVICE(0x2972, intel_i965g_info),/* I946_GZ */ - INTEL_VGA_DEVICE(0x2982, intel_i965g_info),/* G35_G */ - INTEL_VGA_DEVICE(0x2992, intel_i965g_info),/* I965_Q */ - INTEL_VGA_DEVICE(0x29a2, intel_i965g_info),/* I965_G */ - INTEL_VGA_DEVICE(0x29b2, intel_g33_info), /* Q35_G */ -
[Intel-gfx] [PATCH 07/17] x86: add early quirk for reserving Intel graphics stolen memory v5
From: Jesse Barnes jbar...@virtuousgeek.org Systems with Intel graphics controllers set aside memory exclusively for gfx driver use. This memory is not always marked in the E820 as reserved or as RAM, and so is subject to overlap from E820 manipulation later in the boot process. On some systems, MMIO space is allocated on top, despite the efforts of the RAM buffer approach, which simply rounds memory boundaries up to 64M to try to catch space that may decode as RAM and so is not suitable for MMIO. v2: use read_pci_config for 32 bit reads instead of adding a new one (Chris) add gen6 stolen size function (Chris) v3: use a function pointer (Chris) drop gen2 bits (Daniel) v4: call e820_sanitize_map after adding the region v5: fixup comments (Peter) simplify loop (Chris) Acked-by: Ingo Molnar mi...@kernel.org Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org --- arch/x86/kernel/early-quirks.c | 154 drivers/gpu/drm/i915/i915_reg.h | 15 include/drm/i915_drm.h | 32 + 3 files changed, 186 insertions(+), 15 deletions(-) diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c index 63bdb29..b3cd3eb 100644 --- a/arch/x86/kernel/early-quirks.c +++ b/arch/x86/kernel/early-quirks.c @@ -12,6 +12,7 @@ #include linux/pci.h #include linux/acpi.h #include linux/pci_ids.h +#include drm/i915_drm.h #include asm/pci-direct.h #include asm/dma.h #include asm/io_apic.h @@ -216,6 +217,157 @@ static void __init intel_remapping_check(int num, int slot, int func) } +/* + * Systems with Intel graphics controllers set aside memory exclusively + * for gfx driver use. This memory is not marked in the E820 as reserved + * or as RAM, and so is subject to overlap from E820 manipulation later + * in the boot process. On some systems, MMIO space is allocated on top, + * despite the efforts of the RAM buffer approach, which simply rounds + * memory boundaries up to 64M to try to catch space that may decode + * as RAM and so is not suitable for MMIO. + * + * And yes, so far on current devices the base addr is always under 4G. + */ +static u32 __init intel_stolen_base(int num, int slot, int func) +{ + u32 base; + + /* +* For the PCI IDs in this quirk, the stolen base is always +* in 0x5c, aka the BDSM register (yes that's really what +* it's called). +*/ + base = read_pci_config(num, slot, func, 0x5c); + base = ~((120) - 1); + + return base; +} + +#define KB(x) ((x) * 1024) +#define MB(x) (KB (KB (x))) +#define GB(x) (MB (KB (x))) + +static size_t __init gen3_stolen_size(int num, int slot, int func) +{ + size_t stolen_size; + u16 gmch_ctrl; + + gmch_ctrl = read_pci_config_16(0, 0, 0, I830_GMCH_CTRL); + + switch (gmch_ctrl I855_GMCH_GMS_MASK) { + case I855_GMCH_GMS_STOLEN_1M: + stolen_size = MB(1); + break; + case I855_GMCH_GMS_STOLEN_4M: + stolen_size = MB(4); + break; + case I855_GMCH_GMS_STOLEN_8M: + stolen_size = MB(8); + break; + case I855_GMCH_GMS_STOLEN_16M: + stolen_size = MB(16); + break; + case I855_GMCH_GMS_STOLEN_32M: + stolen_size = MB(32); + break; + case I915_GMCH_GMS_STOLEN_48M: + stolen_size = MB(48); + break; + case I915_GMCH_GMS_STOLEN_64M: + stolen_size = MB(64); + break; + case G33_GMCH_GMS_STOLEN_128M: + stolen_size = MB(128); + break; + case G33_GMCH_GMS_STOLEN_256M: + stolen_size = MB(256); + break; + case INTEL_GMCH_GMS_STOLEN_96M: + stolen_size = MB(96); + break; + case INTEL_GMCH_GMS_STOLEN_160M: + stolen_size = MB(160); + break; + case INTEL_GMCH_GMS_STOLEN_224M: + stolen_size = MB(224); + break; + case INTEL_GMCH_GMS_STOLEN_352M: + stolen_size = MB(352); + break; + default: + stolen_size = 0; + break; + } + + return stolen_size; +} + +static size_t __init gen6_stolen_size(int num, int slot, int func) +{ + u16 gmch_ctrl; + + gmch_ctrl = read_pci_config_16(num, slot, func, SNB_GMCH_CTRL); + gmch_ctrl = SNB_GMCH_GMS_SHIFT; + gmch_ctrl = SNB_GMCH_GMS_MASK; + + return gmch_ctrl 25; /* 32 MB units */ +} + +typedef size_t (*stolen_size_fn)(int num, int slot, int func); + +static struct pci_device_id intel_stolen_ids[] __initdata = { + INTEL_I915G_IDS(gen3_stolen_size), + INTEL_I915GM_IDS(gen3_stolen_size), + INTEL_I945G_IDS(gen3_stolen_size), + INTEL_I945GM_IDS(gen3_stolen_size), + INTEL_VLV_M_IDS(gen3_stolen_size), + INTEL_VLV_D_IDS(gen3_stolen_size), +
[Intel-gfx] [PATCH 08/17] drm/i915: Always prefer CPU relocations with LLC
From: Chris Wilson ch...@chris-wilson.co.uk A follow-on to the update of the LLC coherency logic is that we can rely on the LLC being coherent with the CS for rewriting batchbuffers irrespective of their cache domain. (This should have no effect currently as all the batch buffers are expected to be I915_CACHE_LLC and so using the cpu relocation path anyway.) Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 792c52a..3b64b9f 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -166,7 +166,8 @@ eb_destroy(struct eb_objects *eb) static inline int use_cpu_reloc(struct drm_i915_gem_object *obj) { - return (obj-base.write_domain == I915_GEM_DOMAIN_CPU || + return (HAS_LLC(obj-base.dev) || + obj-base.write_domain == I915_GEM_DOMAIN_CPU || !obj-map_and_fenceable || obj-cache_level != I915_CACHE_NONE); } @@ -179,7 +180,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj, char *vaddr; int ret = -EINVAL; - ret = i915_gem_object_set_to_cpu_domain(obj, 1); + ret = i915_gem_object_set_to_cpu_domain(obj, true); if (ret) return ret; -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 10/17] drm/i915: Move the conditional seqno query into the tracepoint
From: Chris Wilson ch...@chris-wilson.co.uk We only wish to know the value of seqno when emitting the tracepoint, so move the query from a parameter to the macro to inside the conditional macro body so that the query is only evaluated when required. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_irq.c | 2 +- drivers/gpu/drm/i915/i915_trace.h | 21 ++--- 2 files changed, 19 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index a03b445..efac65b 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -801,7 +801,7 @@ static void notify_ring(struct drm_device *dev, if (ring-obj == NULL) return; - trace_i915_gem_request_complete(ring, ring-get_seqno(ring, false)); + trace_i915_gem_request_complete(ring); wake_up_all(ring-irq_queue); i915_queue_hangcheck(dev); diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h index e2c5ee6..a1797f6 100644 --- a/drivers/gpu/drm/i915/i915_trace.h +++ b/drivers/gpu/drm/i915/i915_trace.h @@ -304,9 +304,24 @@ DEFINE_EVENT(i915_gem_request, i915_gem_request_add, TP_ARGS(ring, seqno) ); -DEFINE_EVENT(i915_gem_request, i915_gem_request_complete, - TP_PROTO(struct intel_ring_buffer *ring, u32 seqno), - TP_ARGS(ring, seqno) +TRACE_EVENT(i915_gem_request_complete, + TP_PROTO(struct intel_ring_buffer *ring), + TP_ARGS(ring), + + TP_STRUCT__entry( +__field(u32, dev) +__field(u32, ring) +__field(u32, seqno) +), + + TP_fast_assign( + __entry-dev = ring-dev-primary-index; + __entry-ring = ring-id; + __entry-seqno = ring-get_seqno(ring, false); + ), + + TP_printk(dev=%u, ring=%u, seqno=%u, + __entry-dev, __entry-ring, __entry-seqno) ); DEFINE_EVENT(i915_gem_request, i915_gem_request_retire, -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 09/17] drm/i915: Report requested frequency alongside current frequency in debugfs
From: Chris Wilson ch...@chris-wilson.co.uk It can be useful to compare at times the current vs requested frequency of the GPU, so provide the contents of RPNSWREQ alonside CAGF. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_debugfs.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 55ab924..a6f4cb5 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -857,7 +857,7 @@ static int i915_cur_delayinfo(struct seq_file *m, void *unused) u32 gt_perf_status = I915_READ(GEN6_GT_PERF_STATUS); u32 rp_state_limits = I915_READ(GEN6_RP_STATE_LIMITS); u32 rp_state_cap = I915_READ(GEN6_RP_STATE_CAP); - u32 rpstat, cagf; + u32 rpstat, cagf, reqf; u32 rpupei, rpcurup, rpprevup; u32 rpdownei, rpcurdown, rpprevdown; int max_freq; @@ -869,6 +869,14 @@ static int i915_cur_delayinfo(struct seq_file *m, void *unused) gen6_gt_force_wake_get(dev_priv); + reqf = I915_READ(GEN6_RPNSWREQ); + reqf = ~GEN6_TURBO_DISABLE; + if (IS_HASWELL(dev)) + reqf = 24; + else + reqf = 25; + reqf *= GT_FREQUENCY_MULTIPLIER; + rpstat = I915_READ(GEN6_RPSTAT1); rpupei = I915_READ(GEN6_RP_CUR_UP_EI); rpcurup = I915_READ(GEN6_RP_CUR_UP); @@ -893,6 +901,7 @@ static int i915_cur_delayinfo(struct seq_file *m, void *unused) gt_perf_status 0xff); seq_printf(m, Render p-state limit: %d\n, rp_state_limits 0xff); + seq_printf(m, RPNSWREQ: %dMHz\n, reqf); seq_printf(m, CAGF: %dMHz\n, cagf); seq_printf(m, RP CUR UP EI: %dus\n, rpupei GEN6_CURICONT_MASK); -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 11/17] drm/i915: Add some missing steps to i915_driver_load error path
From: Chris Wilson ch...@chris-wilson.co.uk We missed adding a few cleanup steps for recent additions. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_dma.c | 19 +++ 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 4c7669f..236e5e3 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1543,7 +1543,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) ret = i915_gem_gtt_init(dev); if (ret) - goto put_bridge; + goto out_regs; if (drm_core_check_feature(dev, DRIVER_MODESET)) i915_kick_out_firmware_fb(dev_priv); @@ -1572,7 +1572,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) aperture_size); if (dev_priv-gtt.mappable == NULL) { ret = -EIO; - goto out_rmmap; + goto out_gtt; } dev_priv-gtt.mtrr = arch_phys_wc_add(dev_priv-gtt.mappable_base, @@ -1646,7 +1646,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) ret = i915_load_modeset_init(dev); if (ret 0) { DRM_ERROR(failed to init modeset\n); - goto out_gem_unload; + goto out_power_well; } } else { /* Start out suspended in ums mode. */ @@ -1666,6 +1666,10 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) return 0; +out_power_well: + if (HAS_POWER_WELL(dev)) + i915_remove_power_well(dev); + drm_vblank_cleanup(dev); out_gem_unload: if (dev_priv-mm.inactive_shrinker.shrink) unregister_shrinker(dev_priv-mm.inactive_shrinker); @@ -1679,12 +1683,17 @@ out_gem_unload: out_mtrrfree: arch_phys_wc_del(dev_priv-gtt.mtrr); io_mapping_free(dev_priv-gtt.mappable); +out_gtt: + list_del(dev_priv-gtt.base.global_link); + drm_mm_takedown(dev_priv-gtt.base.mm); dev_priv-gtt.base.cleanup(dev_priv-gtt.base); -out_rmmap: +out_regs: pci_iounmap(dev-pdev, dev_priv-regs); put_bridge: pci_dev_put(dev_priv-bridge_dev); free_priv: + if (dev_priv-slab) + kmem_cache_destroy(dev_priv-slab); kfree(dev_priv); return ret; } @@ -1777,6 +1786,8 @@ int i915_driver_unload(struct drm_device *dev) if (dev_priv-regs != NULL) pci_iounmap(dev-pdev, dev_priv-regs); + drm_vblank_cleanup(dev); + intel_teardown_gmbus(dev); intel_teardown_mchbar(dev); -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 13/17] drm/i915: Align tiled scanouts from stolen memory to 256k in the GTT
From: Chris Wilson ch...@chris-wilson.co.uk For unfathomable reasons this alignment appears to be required for tiled scanouts being read from stolen memory. I can find no reference in the w/a db to support this requirement, but the evidence of my own eyes says this prevents many headaches. Note that I have not tricked anything older than Sandybridge into using stolen tiled scanouts, so the extra alignment may be required there as well. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/intel_display.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 0da0ead1..cee9c0d 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -1837,6 +1837,8 @@ intel_pin_and_fence_fb_obj(struct drm_device *dev, case I915_TILING_X: /* pin() will align the object as required by fence */ alignment = 0; + if (obj-stolen INTEL_INFO(dev)-gen = 6) + alignment = 256 * 1024; break; case I915_TILING_Y: /* Despite that we check this in framebuffer_init userspace can -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 14/17] drm/i915: Apply the force-detect VGA w/a to Valleyview
From: Chris Wilson ch...@chris-wilson.co.uk It appears that Valleyview shares its VGA encoder with more recent siblings and requires the same forced detection cycle after a hardware reset before we can rely on hotplugging. Reported-and-tested-by: kobeqin kobe@intel.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67733 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/intel_crt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/intel_crt.c b/drivers/gpu/drm/i915/intel_crt.c index b5a3875..7475200 100644 --- a/drivers/gpu/drm/i915/intel_crt.c +++ b/drivers/gpu/drm/i915/intel_crt.c @@ -688,7 +688,7 @@ static void intel_crt_reset(struct drm_connector *connector) struct drm_i915_private *dev_priv = dev-dev_private; struct intel_crt *crt = intel_attached_crt(connector); - if (HAS_PCH_SPLIT(dev)) { + if (HAS_PCH_SPLIT(dev) || IS_VALLEYVIEW(dev)) { u32 adpa; adpa = I915_READ(crt-adpa_reg); -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 12/17] drm/i915: Asynchronously perform the set-base for a simple modeset
From: Chris Wilson ch...@chris-wilson.co.uk A simple modeset, where we only wish to switch over to a new framebuffer such as the transition from fbcon to X, takes around 30-60ms. This is due to three factors: 1. We need to make sure the fb-obj is in the display domain, which incurs a cache flush to ensure no dirt is left on the scanout. 2. We need to flush any pending rendering before performing the mmio so that the frame is complete before it is shown. 3. We currently wait for the vblank after the mmio to be sure that the old fb is no longer being shown before releasing it. (1) can only be eliminated by userspace preparing the fb-obj in advance to already be in the display domain. This can be done through use of the create2 ioctl, or by reusing an existing fb-obj. However, (2) and (3) are already solved by the existing page flip mechanism, and it is surprisingly trivial to wire them up for use in the set-base fast path. Though it can be argued that this represents a subtle ABI break in that the set_config ioctl now returns before the old framebuffer is unpinned. The danger is that userspace will start to modify it before it is no longer being shown, however we should be able to prevent that through proper domain tracking. By combining all of the above, we can achieve an instaneous set_config: [ 6.601] (II) intel(0): switch to mode 2560x1440@60.0 on pipe 0 using DP2, position (0, 0), rotation normal [ 6.601] (II) intel(0): Setting screen physical size to 677 x 381 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/intel_display.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 74dbc68..0da0ead1 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -9213,10 +9213,13 @@ static int intel_crtc_set_config(struct drm_mode_set *set) ret = intel_set_mode(set-crtc, set-mode, set-x, set-y, set-fb); } else if (config-fb_changed) { - intel_crtc_wait_for_pending_flips(set-crtc); - - ret = intel_pipe_set_base(set-crtc, - set-x, set-y, set-fb); + if (to_intel_framebuffer(set-fb)-obj-ring == NULL || + save_set.x != set-x || save_set.y != set-y || + intel_crtc_page_flip(set-crtc, set-fb, NULL)) { + intel_crtc_wait_for_pending_flips(set-crtc); + ret = intel_pipe_set_base(set-crtc, + set-x, set-y, set-fb); + } } if (ret) { -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 15/17] drm/i915: Pair seqno completion tracepoint with its dispatch
From: Chris Wilson ch...@chris-wilson.co.uk In order to time how long a seqno is executed by a ring, we need to measure both its insertion and its completion. (Using the completion of the previous seqno as an estimate for when the GPU starts, if busy.) In order to get an exact completion timestamp, we need irqs. This is enabled by trace_i915_gem_ring_dispatch, so it makes more sens to pair the completion event with that rather than the request. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 +- drivers/gpu/drm/i915/i915_irq.c| 2 +- drivers/gpu/drm/i915/i915_trace.h | 48 +++--- 3 files changed, 26 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 3b64b9f..4f90cb3 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1127,7 +1127,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, goto err; } - trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); + trace_i915_gem_ring_dispatch(ring, flags); i915_gem_execbuffer_move_to_active(eb-objects, vm, ring); i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index efac65b..d61bd5b 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -801,7 +801,7 @@ static void notify_ring(struct drm_device *dev, if (ring-obj == NULL) return; - trace_i915_gem_request_complete(ring); + trace_i915_gem_ring_complete(ring); wake_up_all(ring-irq_queue); i915_queue_hangcheck(dev); diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h index a1797f6..5c8e36a 100644 --- a/drivers/gpu/drm/i915/i915_trace.h +++ b/drivers/gpu/drm/i915/i915_trace.h @@ -234,8 +234,8 @@ TRACE_EVENT(i915_gem_evict_everything, ); TRACE_EVENT(i915_gem_ring_dispatch, - TP_PROTO(struct intel_ring_buffer *ring, u32 seqno, u32 flags), - TP_ARGS(ring, seqno, flags), + TP_PROTO(struct intel_ring_buffer *ring, u32 flags), + TP_ARGS(ring, flags), TP_STRUCT__entry( __field(u32, dev) @@ -247,15 +247,35 @@ TRACE_EVENT(i915_gem_ring_dispatch, TP_fast_assign( __entry-dev = ring-dev-primary-index; __entry-ring = ring-id; - __entry-seqno = seqno; + __entry-seqno = intel_ring_get_seqno(ring), __entry-flags = flags; - i915_trace_irq_get(ring, seqno); + i915_trace_irq_get(ring, __entry-seqno); ), TP_printk(dev=%u, ring=%u, seqno=%u, flags=%x, __entry-dev, __entry-ring, __entry-seqno, __entry-flags) ); +TRACE_EVENT(i915_gem_ring_complete, + TP_PROTO(struct intel_ring_buffer *ring), + TP_ARGS(ring), + + TP_STRUCT__entry( +__field(u32, dev) +__field(u32, ring) +__field(u32, seqno) +), + + TP_fast_assign( + __entry-dev = ring-dev-primary-index; + __entry-ring = ring-id; + __entry-seqno = ring-get_seqno(ring, false); + ), + + TP_printk(dev=%u, ring=%u, seqno=%u, + __entry-dev, __entry-ring, __entry-seqno) +); + TRACE_EVENT(i915_gem_ring_flush, TP_PROTO(struct intel_ring_buffer *ring, u32 invalidate, u32 flush), TP_ARGS(ring, invalidate, flush), @@ -304,26 +324,6 @@ DEFINE_EVENT(i915_gem_request, i915_gem_request_add, TP_ARGS(ring, seqno) ); -TRACE_EVENT(i915_gem_request_complete, - TP_PROTO(struct intel_ring_buffer *ring), - TP_ARGS(ring), - - TP_STRUCT__entry( -__field(u32, dev) -__field(u32, ring) -__field(u32, seqno) -), - - TP_fast_assign( - __entry-dev = ring-dev-primary-index; - __entry-ring = ring-id; - __entry-seqno = ring-get_seqno(ring, false); - ), - - TP_printk(dev=%u, ring=%u, seqno=%u, - __entry-dev, __entry-ring, __entry-seqno) -); - DEFINE_EVENT(i915_gem_request, i915_gem_request_retire, TP_PROTO(struct intel_ring_buffer *ring, u32 seqno), TP_ARGS(ring, seqno) -- 1.8.1.4 ___
[Intel-gfx] [PATCH 16/17] RFM drm/i915: Boost RPS frequency for CPU stalls
From: Chris Wilson ch...@chris-wilson.co.uk If we encounter a situation where the CPU blocks waiting for results from the GPU, give the GPU a kick to boost its the frequency. This should work to reduce user interface stalls and to quickly promote mesa to high frequencies - but the cost is that our requested frequency stalls high (as we do not idle for long enough before rc6 to start reducing frequencies, nor are we aggressive at down clocking an underused GPU). However, this should be mitigated by rc6 itself powering off the GPU when idle, and that energy use is dependent upon the workload of the GPU in addition to its frequency (e.g. the math or sampler functions only consume power when used). Still, this is likely to adversely affect light workloads. Stéphane raised the concern that this will punish good applications and reward bad applications - but due to the nature of how mesa performs its client throttling, I believe all mesa applications will be roughly equally affected. RFM - request for measurement! Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Kenneth Graunke kenn...@whitecape.org Cc: Stéphane Marchesin stephane.marche...@gmail.com Cc: Meng, Mengmeng mengmeng.m...@intel.com Cc: Zhuang, Lena lena.zhu...@intel.com --- drivers/gpu/drm/i915/i915_gem.c | 9 + 1 file changed, 9 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 7c92923..50a70ee 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1016,6 +1016,15 @@ static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno, if (WARN_ON(!ring-irq_get(ring))) return -ENODEV; + if (dev_priv-info-gen = 6) { + mutex_lock(dev_priv-rps.hw_lock); + if (dev_priv-info-is_valleyview) + valleyview_set_rps(dev_priv-dev, dev_priv-rps.max_delay); + else + gen6_set_rps(dev_priv-dev, dev_priv-rps.max_delay); + mutex_unlock(dev_priv-rps.hw_lock); + } + /* Record current time in case interrupted by signal, or wedged * */ getrawmonotonic(before); -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 17/17] drm/i915: Enable Lower Slice on Haswell GT3.
Full HSW GT3 power can only be achieved with all Execution Units turned on. This patch enables all EUs present on HSW GT3 by enabling lower slice. Credits-by: Yejun Guo yejun@intel.com Signed-off-by: Rodrigo Vivi rodrigo.v...@gmail.com --- drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_gem.c | 5 + drivers/gpu/drm/i915/i915_reg.h | 5 + 3 files changed, 12 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 7ac5ca5..e9b2f86 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1592,6 +1592,8 @@ struct drm_i915_file_private { ((dev)-pci_device 0xFF00) == 0x0C00) #define IS_ULT(dev)(IS_HASWELL(dev) \ ((dev)-pci_device 0xFF00) == 0x0A00) +#define IS_HSW_GT3(dev)(IS_HASWELL(dev) \ +((dev)-pci_device 0x00F0) == 0x0020) /* * The genX designation typically refers to the render engine, so render diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 50a70ee..1c72229 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4316,6 +4316,11 @@ i915_gem_init_hw(struct drm_device *dev) if (dev_priv-ellc_size) I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf)); + if (IS_HSW_GT3(dev)) + I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED); + else + I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED); + if (HAS_PCH_NOP(dev)) { u32 temp = I915_READ(GEN7_MSG_CTL); temp = ~(WAIT_FOR_PCH_FLR_ACK | WAIT_FOR_PCH_RESET_ACK); diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 4c3cf22..e122b57 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -263,6 +263,11 @@ #define MI_SEMAPHORE_SYNC_VVE (116) /* VECS wait for VCS (VEVSYNC) */ #define MI_SEMAPHORE_SYNC_RVE (216) /* VECS wait for RCS (VERSYNC) */ #define MI_SEMAPHORE_SYNC_INVALID (316) + +#define MI_PREDICATE_RESULT_2 (0x2214) +#define LOWER_SLICE_ENABLED (10) +#define LOWER_SLICE_DISABLED (00) + /* * 3D instructions used by the kernel */ -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 16/17] RFM drm/i915: Boost RPS frequency for CPU stalls
On Mon, Aug 26, 2013 at 07:51:08PM -0300, Rodrigo Vivi wrote: From: Chris Wilson ch...@chris-wilson.co.uk If we encounter a situation where the CPU blocks waiting for results from the GPU, give the GPU a kick to boost its the frequency. This should work to reduce user interface stalls and to quickly promote mesa to high frequencies - but the cost is that our requested frequency stalls high (as we do not idle for long enough before rc6 to start reducing frequencies, nor are we aggressive at down clocking an underused GPU). However, this should be mitigated by rc6 itself powering off the GPU when idle, and that energy use is dependent upon the workload of the GPU in addition to its frequency (e.g. the math or sampler functions only consume power when used). Still, this is likely to adversely affect light workloads. Stéphane raised the concern that this will punish good applications and reward bad applications - but due to the nature of how mesa performs its client throttling, I believe all mesa applications will be roughly equally affected. RFM - request for measurement! Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Kenneth Graunke kenn...@whitecape.org Cc: Stéphane Marchesin stephane.marche...@gmail.com Cc: Meng, Mengmeng mengmeng.m...@intel.com Cc: Zhuang, Lena lena.zhu...@intel.com This patch is superseded by later versions. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] Add second DRI driver name (DRI2DriverVDPAU)
Rinat Ibragimov ibragimovri...@mail.ru writes: Понедельник, 26 августа 2013, 13:40 -07:00 от Eric Anholt e...@anholt.net: Ибрагимов Ринат ibragimovri...@mail.ru writes: libvdpau uses second DRI driver name to determine which VDPAU driver to use. This patch will allow libvdpau choose libvdpau_i965.so on systems with Intel GPUs, libvdpau_nvidia.so on those with nVidia ones, and so on. I'm experimenting now with generic vdpau driver using OpenGL/VA-API, it would be convenient to have this driver selection working without manual driver selection. If it's a generic driver, why would it need a i965 string passed over the DRI2 protocol to find it? It doesn't actually. It's just to simplify distribution package management. If one names driver libvdpau_nvidia.so.1, it will break VDPAU on nVidia equipped machines. With this patch one can name it libvdpau_i965.so.1 and driver will be loaded on intel equipped machines only. One of the things I'm planning on doing is removing use of the protocol-provided DRI2 driver name from Mesa -- Mesa knows what drivers it has built, and it knows how to detect those devices (in the EGL code), so why would we pay attention to what the X Server thinks our driver's name is? Seems like vdpau could probably do the same. VDPAU uses entry point library (libvdpau) which loads backend drivers. libvdpau uses second DRI2 driver name amongst other methods to determine which driver to load. It can not rely on Mesa's internal knowledge, because it must work with proprietary nVidia driver too. Right, I meant couldn't libvdpau have a method to determine which driver to load based on looking at your hardware, like how Mesa's EGL does?, since only libvdpau is likely to know how pieces of hardware map to driver binaries. pgpS9AZxXjcB4.pgp Description: PGP signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] [VPG HSW-A] drm/i915: BUN vol4g[DevHSW]
Hi Paulo, There are some changes in voltage swing levels for Haswell platform. In current implementation, macro name are given after the swing values. But the values are changed for Haswell. This comment is added to avoid the confusion during signal level debug or other similar kind of activity. Thanks and regards, Koushik -Original Message- From: Paulo Zanoni [mailto:przan...@gmail.com] Sent: Tuesday, August 27, 2013 12:46 AM To: Biswas, KoushikX Cc: Intel Graphics Development Subject: Re: [Intel-gfx] [PATCH] [VPG HSW-A] drm/i915: BUN vol4g[DevHSW] 2013/8/26 Koushik Biswas koushikx.bis...@intel.com: From: koushik koushikx.bis...@intel.com WW43 2012 - DDI buffer translation corrections WW36 2012 - Added HDMI voltage swing (not implemented for HDMI) Added comments with voltage swing, pre-emphasis, transition and non-transition values in form of table for reference. This values are applicable only for HSW platform. But why exactly do we need this comment? We already have the intel_hsw_signal_levels() function (inside intel_dp.c) which is basically the implementation of your comment. Signed-off-by: koushik koushikx.bis...@intel.com Change-Id: I0cff220c7d047f41b2a96b3e3880b4029550d458 --- drivers/gpu/drm/i915/intel_ddi.c | 31 +++ drivers/gpu/drm/i915/intel_dp.c | 32 2 files changed, 63 insertions(+) diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c index 5131517..0de236e 100644 --- a/drivers/gpu/drm/i915/intel_ddi.c +++ b/drivers/gpu/drm/i915/intel_ddi.c @@ -128,6 +128,37 @@ void intel_prepare_ddi(struct drm_device *dev) intel_prepare_ddi_buffers(dev, PORT_E, true); } + +/* Updating the new table in comments as it doesn't cause any logic +change */ + +/* For HSW Voltage swing levels +***/ +/ +**/ +/*___ +_*/ /*|Entry|Voltage|Pre-emphasis|Non-Transition|Transition +|Pre-emphasis|*/ +/*| |Swing |level |mV diff p-p |mV diff p-p|db |*/ +/*|--|*/ +/*|0|0 |0 |400 |400|0 |*/ +/*|--|*/ +/*|1|0 |1 |400 |600|3.5 |*/ +/*|--|*/ +/*|2|0 |2 |400 |800|6 |*/ +/*|--|*/ +/*|3|0 |3 |400 |1000 |8 |*/ +/*|--|*/ +/*|4|1 |0 |600 |600|0 |*/ +/*|--|*/ +/*|5|1 |1 |600 |900|3.5 |*/ +/*|--|*/ +/*|6|1 |2 |600 |1000 |4.5 |*/ +/*|--|*/ +/*|7|2 |0 |800 |800|0 |*/ +/*|--|*/ +/*|8|2 |1 |1000 |1000 |2 |*/ +/*|--|*/ +/*|9| Entry 9 is only used for HDMI and DVI|*/ +/*|-- +|*/ +/ +**/ + static const long hsw_ddi_buf_ctl_values[] = { DDI_BUF_EMP_400MV_0DB_HSW, DDI_BUF_EMP_400MV_3_5DB_HSW, diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index 9fd7f90..fa73fb1 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -1809,6 +1809,38 @@ intel_gen7_edp_signal_levels(uint8_t train_set) } } + +/* Updating the new table in comments as it doesn't cause any logic +change */ + +/* For HSW Voltage swing levels +***/ +/ +**/ +/*___ +_*/ /*|Entry|Voltage|Pre-emphasis|Non-Transition|Transition +|Pre-emphasis|*/ +/*| |Swing |level |mV diff p-p |mV diff p-p|db |*/ +/*|--|*/ +/*|0|0 |0 |400 |400|0 |*/