Re: [Intel-gfx] [PATCH 1/2] shmem: Support for registration of driver/file owner specific ops
On 11/10/2016 11:06 AM, Hugh Dickins wrote:
> On Fri, 4 Nov 2016, akash.g...@intel.com wrote:
>> From: Chris Wilson
>>
>> This provides support for the drivers or shmem file owners to register
>> a set of callbacks, which can be invoked from the address space
>> operations methods implemented by shmem. This allows the file owners
>> to hook into the shmem address space operations to do some extra/custom
>> operations in addition to the default ones.
>>
>> The private_data field of address_space struct is used to store the
>> pointer to driver specific ops. Currently only one ops field is
>> defined, which is migratepage, but can be extended on an as-needed
>> basis.
>>
>> The need for driver specific operations arises since some of the
>> operations (like migratepage) may not be handled completely within
>> shmem, so as to be effective, and would need some driver specific
>> handling also. Specifically, i915.ko would like to participate in
>> migratepage(). i915.ko uses shmemfs to provide swappable backing
>> storage for its user objects, but when those objects are in use by the
>> GPU it must pin the entire object until the GPU is idle. As a result,
>> large chunks of memory can be arbitrarily withdrawn from page
>> migration, resulting in premature out-of-memory due to fragmentation.
>> However, if i915.ko can receive the migratepage() request, it can then
>> flush the object from the GPU, remove its pin and thus enable the
>> migration.
>>
>> Since gfx allocations are one of the major consumers of system memory,
>> it's imperative to have such a mechanism to effectively deal with
>> fragmentation. And therefore the need for such a provision for
>> initiating driver specific actions during address space operations.
>
> Thank you for persisting with this, and sorry for all my delay.
>
>> v2:
>> - Drop dev_ prefix from the members of shmem_dev_info structure.
>>   (Joonas)
>> - Change the return type of shmem_set_device_op() to void and remove
>>   the check for pre-existing data. (Joonas)
>> - Rename shmem_set_device_op() to shmem_set_dev_info() to be
>>   consistent with shmem_dev_info structure. (Joonas)
>>
>> Cc: Hugh Dickins
>> Cc: linux...@kvack.org
>> Cc: linux-ker...@vger.linux.org
>> Signed-off-by: Sourab Gupta
>> Signed-off-by: Akash Goel
>> Reviewed-by: Chris Wilson
>
> That doesn't seem quite right: the From line above implies that Chris
> wrote it, and should be first Signer; but perhaps the From line is
> wrong.

Chris only wrote this patch initially, will do the required correction.

>> ---
>>  include/linux/shmem_fs.h | 13 +
>>  mm/shmem.c               | 17 -
>>  2 files changed, 29 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
>> index ff078e7..454c3ba 100644
>> --- a/include/linux/shmem_fs.h
>> +++ b/include/linux/shmem_fs.h
>> @@ -39,11 +39,24 @@ struct shmem_sb_info {
>>  	unsigned long shrinklist_len; /* Length of shrinklist */
>>  };
>>
>> +struct shmem_dev_info {
>> +	void *private_data;
>> +	int (*migratepage)(struct address_space *mapping,
>> +			   struct page *newpage, struct page *page,
>> +			   enum migrate_mode mode, void *dev_priv_data);
>
> Aren't the private_data field and dev_priv_data arg a little bit
> confusing and redundant? Can't the migratepage() deduce dev_priv for
> itself from mapping->private_data (perhaps wrapped by a
> shmem_get_dev_info()), by using container_of()?

Yes, looks like migratepage() can deduce dev_priv from
mapping->private_data. Can we keep the private_data as a placeholder?
Will s/dev_priv_data/private_data/.

As per your suggestion, in the other patch, the object pointer can be
derived from mapping->private_data (container_of) and dev_priv in turn
can be derived from the object pointer.

>> +};
>> +
>>  static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
>>  {
>>  	return container_of(inode, struct shmem_inode_info, vfs_inode);
>>  }
>>
>> +static inline void shmem_set_dev_info(struct address_space *mapping,
>> +				      struct shmem_dev_info *info)
>> +{
>> +	mapping->private_data = info;
>
> Nit: if this stays as is, I'd prefer dev_info there and above, since
> shmem.c uses info all over for its shmem_inode_info pointer. But in
> second patch I suggest obj_info may be better than dev_info.

Fine, will s/info/dev_info.

Best regards
Akash

>> +}
>> +
>>  /*
>>   * Functions in mm/shmem.c called directly from elsewhere:
>>   */
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index ad7813d..fce8de3 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -1290,6 +1290,21 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
>>  	return 0;
>>  }
>>
>> +#ifdef CONFIG_MIGRATION
>> +static int shmem_migratepage(struct address_space *mapping,
>> +			     struct page *newpage, struct page *page,
>> +			     enum migrate_mode mode)
>> +{
>> +	struct shmem_dev_info *dev_info = mapping->private_data;
>> +
>> +	if (dev_info &&
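For readers following the container_of() suggestion in the review above, here is a minimal userspace sketch of the idea, not the code from either patch. It assumes the ops structure is embedded in a per-object structure; `struct demo_object`, `demo_object_from_mapping()` and `demo_migratepage()` are hypothetical names, the kernel types are reduced to stand-ins, and the callback signature is simplified to a single argument.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced stand-in for the kernel's struct address_space. */
struct address_space {
	void *private_data;
};

/* Ops structure as discussed in the thread, with a simplified callback. */
struct shmem_dev_info {
	int (*migratepage)(struct address_space *mapping);
};

/* Hypothetical driver object that embeds the ops structure. */
struct demo_object {
	int id;
	struct shmem_dev_info dev_info;
};

/* Mirrors the helper from the patch: stash the ops pointer in the mapping. */
static void shmem_set_dev_info(struct address_space *mapping,
			       struct shmem_dev_info *dev_info)
{
	mapping->private_data = dev_info;
}

/* Recover the embedding object from the mapping alone, so the callback
 * needs no separate private-data argument. */
static struct demo_object *demo_object_from_mapping(struct address_space *mapping)
{
	struct shmem_dev_info *dev_info = mapping->private_data;

	assert(dev_info != NULL);
	return container_of(dev_info, struct demo_object, dev_info);
}

/* Hypothetical callback: returns the object id to show the lookup worked. */
static int demo_migratepage(struct address_space *mapping)
{
	return demo_object_from_mapping(mapping)->id;
}
```

With this layout the extra `void *dev_priv_data` argument becomes unnecessary, which appears to be the redundancy the review points at: everything the callback needs is reachable from `mapping->private_data`.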
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable
On 11/10/2016 12:09 PM, Hugh Dickins wrote:
> On Fri, 4 Nov 2016, akash.g...@intel.com wrote:
>> From: Chris Wilson
>>
>> On a long run of more than 2-3 days, physical memory tends to get
>> fragmented severely, which considerably slows down the system. In such
>> a scenario, the shrinker is also unable to help as lack of memory is
>> not the actual problem, since it has been observed that there are
>> enough free pages of 0 order. This also manifests itself when an
>> individual zone in the mm runs out of pages and if we cannot migrate
>> pages between zones, the kernel hits an out-of-memory even though
>> there are free pages (and often all of swap) available.
>>
>> To address the issue of external fragmentation, kernel does a
>> compaction (which involves migration of pages) but its efficacy
>> depends upon how many pages are marked as MOVABLE, as only those pages
>> can be migrated.
>>
>> Currently the backing pages for GPU buffers are allocated from shmemfs
>> with GFP_RECLAIMABLE flag, in units of 4KB pages. In the case of
>> limited swap space, it may not be possible always to reclaim or
>> swap-out pages of all the inactive objects, to make way for free space
>> allowing formation of higher order groups of physically-contiguous
>> pages on compaction.
>>
>> Just marking the GPU pages as MOVABLE will not suffice, as i915.ko has
>> to pin the pages if they are in use by GPU, which will prevent their
>> migration. So the migratepage callback in shmem is also hooked up to
>> get a notification when kernel initiates the page migration. On the
>> notification, i915.ko appropriately unpins the pages. With this we can
>> effectively mark the GPU pages as MOVABLE and hence mitigate the
>> fragmentation problem.
>>
>> v2:
>> - Rename the migration routine to gem_shrink_migratepage, move it to
>>   the shrinker file, and use the existing constructs (Chris)
>> - To cleanup, add a new helper function to encapsulate all page
>>   migration skip conditions (Chris)
>> - Add a new local helper function in shrinker file, for dropping the
>>   backing pages, and call the same from gem_shrink() also (Chris)
>>
>> v3:
>> - Fix/invert the check on the return value of unsafe_drop_pages (Chris)
>>
>> v4:
>> - Minor tidy
>>
>> v5:
>> - Fix unsafe usage of unsafe_drop_pages()
>> - Rebase onto vmap-notifier
>>
>> v6:
>> - Remove i915_gem_object_get/put across unsafe_drop_pages() as with
>>   struct_mutex protection object can't disappear. (Chris)
>>
>> Testcase: igt/gem_shrink
>> Bugzilla: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=90254
>> Cc: Hugh Dickins
>> Cc: linux...@kvack.org
>> Signed-off-by: Sourab Gupta
>> Signed-off-by: Akash Goel
>> Signed-off-by: Chris Wilson
>> Reviewed-by: Joonas Lahtinen
>> Reviewed-by: Chris Wilson
>
> I'm confused! But perhaps it's gone around and around between you all,
> I'm not sure what the rules are then. I think this sequence implies
> that Sourab wrote it originally, then Akash and Chris passed it on
> with refinements - but then Chris wouldn't add Reviewed-by.

Thank you very much for the review and sorry for all the needless
confusion.

Chris actually conceived the patches and prepared an initial version of
them (hence he is the Author). Sourab and I did the further refinements
and fixed issues (all those page_private stuff). Chris then reviewed the
final patch and also recently did a rebase for it.

>> ---
>>  drivers/gpu/drm/i915/i915_drv.h          |   2 +
>>  drivers/gpu/drm/i915/i915_gem.c          |   9 ++-
>>  drivers/gpu/drm/i915/i915_gem_shrinker.c | 132 +++
>>  3 files changed, 142 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 4735b417..7f2717b 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1357,6 +1357,8 @@ struct intel_l3_parity {
>>  };
>>
>>  struct i915_gem_mm {
>> +	struct shmem_dev_info shmem_info;
>> +
>>  	/** Memory allocator for GTT stolen memory */
>>  	struct drm_mm stolen;
>>  	/** Protects the usage of the GTT stolen memory allocator. This is
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 1f995ce..f0d4ce7 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2164,6 +2164,7 @@ void __i915_gem_object_invalidate(struct drm_i915_gem_object *obj)
>>  		if (obj->mm.madv == I915_MADV_WILLNEED)
>>  			mark_page_accessed(page);
>>
>> +		set_page_private(page, 0);
>>  		put_page(page);
>>  	}
>>  	obj->mm.dirty = false;
>> @@ -2310,6 +2311,7 @@ static unsigned int swiotlb_max_size(void)
>>  			sg->length += PAGE_SIZE;
>>  		}
>>  		last_pfn = page_to_pfn(page);
>> +		set_page_private(page, (unsigned long)obj);
>>
>>  		/* Check that the i965g/gm workaround works. */
>>  		WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x0010UL));
>> @@
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable
On 11/4/2016 7:07 PM, Chris Wilson wrote:
> Best if we send these as a new series to unconfuse CI.

Okay will send as a new series.

> On Fri, Nov 04, 2016 at 06:18:26PM +0530, akash.g...@intel.com wrote:
>> +static int do_migrate_page(struct drm_i915_gem_object *obj)
>> +{
>> +	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
>> +	int ret = 0;
>> +
>> +	if (!can_migrate_page(obj))
>> +		return -EBUSY;
>> +
>> +	/* HW access would be required for a GGTT bound object, for which
>> +	 * device has to be kept awake. But a deadlock scenario can arise if
>> +	 * the attempt is made to resume the device, when either a suspend
>> +	 * or a resume operation is already happening concurrently from some
>> +	 * other path and that only also triggers compaction. So only unbind
>> +	 * if the device is currently awake.
>> +	 */
>> +	if (!intel_runtime_pm_get_if_in_use(dev_priv))
>> +		return -EBUSY;
>> +
>> +	i915_gem_object_get(obj);
>> +	if (!unsafe_drop_pages(obj))
>> +		ret = -EBUSY;
>> +	i915_gem_object_put(obj);
>
> Since the object release changes, we can now do this without the
> i915_gem_object_get / i915_gem_object_put (as we are guarded by the BKL
> struct_mutex).

Fine, will remove object_get/put as with struct_mutex protection the
object can't disappear across unsafe_drop_pages().

Best regards
Akash

> -Chris

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v4 2/2] drm/i915: Make GPU pages movable
On 10/18/2016 5:35 PM, Joonas Lahtinen wrote:
> On ma, 2016-04-04 at 14:18 +0100, Chris Wilson wrote:
>> From: Akash Goel
>>
>> On a long run of more than 2-3 days, physical memory tends to get
>> fragmented severely, which considerably slows down the system. In such
>> a scenario, the shrinker is also unable to help as lack of memory is
>> not the actual problem, since it has been observed that there are
>> enough free pages of 0 order. This also manifests itself when an
>> individual zone in the mm runs out of pages and if we cannot migrate
>> pages between zones, the kernel hits an out-of-memory even though
>> there are free pages (and often all of swap) available.
>>
>> To address the issue of external fragmentation, kernel does a
>> compaction (which involves migration of pages) but its efficacy
>> depends upon how many pages are marked as MOVABLE, as only those pages
>> can be migrated.
>>
>> Currently the backing pages for GFX buffers are allocated from shmemfs
>> with GFP_RECLAIMABLE flag, in units of 4KB pages. In the case of
>> limited swap space, it may not be possible always to reclaim or
>> swap-out pages of all the inactive objects, to make way for free space
>> allowing formation of higher order groups of physically-contiguous
>> pages on compaction.
>>
>> Just marking the GPU pages as MOVABLE will not suffice, as i915.ko has
>> to pin the pages if they are in use by GPU, which will prevent their
>> migration. So the migratepage callback in shmem is also hooked up to
>> get a notification when kernel initiates the page migration. On the
>> notification, i915.ko appropriately unpins the pages. With this we can
>> effectively mark the GPU pages as MOVABLE and hence mitigate the
>> fragmentation problem.
>>
>> v2:
>> - Rename the migration routine to gem_shrink_migratepage, move it to
>>   the shrinker file, and use the existing constructs (Chris)
>> - To cleanup, add a new helper function to encapsulate all page
>>   migration skip conditions (Chris)
>> - Add a new local helper function in shrinker file, for dropping the
>>   backing pages, and call the same from gem_shrink() also (Chris)
>>
>> v3:
>> - Fix/invert the check on the return value of unsafe_drop_pages (Chris)
>>
>> v4:
>> - Minor tidy
>>
>> Testcase: igt/gem_shrink
>> Bugzilla: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=90254
>> Cc: Hugh Dickins
>> Cc: linux...@kvack.org
>> Signed-off-by: Sourab Gupta
>> Signed-off-by: Akash Goel
>> Reviewed-by: Chris Wilson
>
> Could this patch be re-spun on top of current nightly?

Sure, will rebase it on top of nightly.

> After removing;
>
> WARN(page_count(newpage) != 1, "Unexpected ref count for newpage\n")
>
> and
>
> if (ret)
> 	DRM_DEBUG_DRIVER("page=%p migration returned %d\n", page, ret);
>
> This is;
>
> Reviewed-by: Joonas Lahtinen

Thanks much for the review. But there is also a precursor patch, and
there has been no traction on that:

[1/2] shmem: Support for registration of Driver/file owner specific ops
https://patchwork.freedesktop.org/patch/77935/

Best regards
Akash

> Regards, Joonas
Re: [Intel-gfx] [PATCH] drm/i915/guc: WA to address the Ringbuffer coherency issue
On 10/14/2016 11:45 PM, Chris Wilson wrote:
> On Fri, Oct 14, 2016 at 11:53:44PM +0530, akash.g...@intel.com wrote:
>> From: Akash Goel
>>
>> Driver accesses the ringbuffer pages, via GMADR BAR, if the pages are
>> pinned in mappable aperture portion of GGTT and for ringbuffer pages
>> allocated from Stolen memory, access can only be done through GMADR
>> BAR. In case of GuC based submission, updates done in ringbuffer via
>> GMADR may not get committed to memory by the time the Command streamer
>> starts reading them, resulting in fetching of stale data.
>
> Please leave a blank line between paragraphs, or try to not leave so
> much whitespace at the end of a sentence.

I am sorry. Will be mindful of this from now.

>> For Host based submission, such problem is not there as the write to
>> Ring Tail or ELSP register happens from the Host side prior to
>> submission. Access to any GFX register from CPU side goes to GTTMMADR
>> BAR and Hw already enforces the ordering between outstanding GMADR
>> writes & new GTTMMADR access. MMIO writes from GuC side do not go to
>> GTTMMADR BAR as GuC communication to registers within GT is contained
>> within GT, so ordering is not enforced resulting in a race, which can
>> manifest in form of a hang. To ensure the flush of in flight GMADR
>> writes, a POSTING READ is done to GuC register prior to doorbell ring.
>> There is already a similar WA in
>> i915_gem_object_flush_gtt_write_domain(), which takes care of GMADR
>> writes from User space to GEM buffers, but not the ringbuffer writes
>> from KMD. This WA is needed on all recent HW.
>>
>> Cc: Chris Wilson
>> Signed-off-by: Sagar Arun Kamble
>> Signed-off-by: Akash Goel
>> ---
>>  drivers/gpu/drm/i915/i915_guc_submission.c | 6 ++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index a1f76c8..43c8a72 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -601,6 +601,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
>>   */
>>  static void i915_guc_submit(struct drm_i915_gem_request *rq)
>>  {
>> +	struct drm_i915_private *dev_priv = rq->i915;
>>  	unsigned int engine_id = rq->engine->id;
>>  	struct intel_guc *guc = &rq->i915->guc;
>>  	struct i915_guc_client *client = guc->execbuf_client;
>> @@ -608,6 +609,11 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
>>  	spin_lock(&client->wq_lock);
>>  	guc_wq_item_append(client, rq);
>> +
>> +	/* WA to flush out the pending GMADR writes to ring buffer. */
>> +	if (i915_vma_is_map_and_fenceable(rq->ring->vma))
>> +		POSTING_READ(GUC_STATUS);
>
> Did you test POSTING_READ_FW() ?

Sorry, we haven't explicitly tried POSTING_READ_FW(), but it should
work since, as per the __gen9_fw_ranges[] table, GuC registers
(C000-Cxxx) do not lie in any Forcewake domain range.

> Otherwise it makes an unfortunate amount of sense, and I feel justified
> in what I had to do in flush_gtt_write_domain! :)

Yes, your hunch, expectedly, was spot on :).

Best regards
Akash

> -Chris
Re: [Intel-gfx] ✗ Fi.CI.BAT: warning for drm/i915: Allocate intel_engine_cs structure only for the enabled engines (rev4)
On 10/13/2016 10:50 PM, Patchwork wrote:
> == Series Details ==
>
> Series: drm/i915: Allocate intel_engine_cs structure only for the enabled engines (rev4)
> URL   : https://patchwork.freedesktop.org/series/13435/
> State : warning
>
> == Summary ==
>
> Series 13435v4 drm/i915: Allocate intel_engine_cs structure only for the enabled engines
> https://patchwork.freedesktop.org/api/1.0/series/13435/revisions/4/mbox/
>
> Test kms_pipe_crc_basic:
>         Subgroup nonblocking-crc-pipe-b:
>                 pass       -> DMESG-WARN (fi-ilk-650)

Have filed a new BZ: https://bugs.freedesktop.org/show_bug.cgi?id=98251
Most likely the above failure is unrelated to the patch concerned.

Best regards
Akash

>         Subgroup read-crc-pipe-b-frame-sequence:
>                 dmesg-warn -> PASS       (fi-ilk-650)
> Test vgem_basic:
>         Subgroup unload:
>                 skip       -> PASS       (fi-skl-6770hq)
>
> fi-bdw-5557u     total:246  pass:231  dwarn:0   dfail:0   fail:0   skip:15
> fi-bsw-n3050     total:246  pass:204  dwarn:0   dfail:0   fail:0   skip:42
> fi-bxt-t5700     total:246  pass:216  dwarn:0   dfail:0   fail:0   skip:30
> fi-byt-j1900     total:246  pass:212  dwarn:2   dfail:0   fail:1   skip:31
> fi-byt-n2820     total:246  pass:210  dwarn:0   dfail:0   fail:1   skip:35
> fi-hsw-4770      total:246  pass:223  dwarn:0   dfail:0   fail:0   skip:23
> fi-hsw-4770r     total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22
> fi-ilk-650       total:246  pass:183  dwarn:1   dfail:0   fail:2   skip:60
> fi-ivb-3520m     total:246  pass:221  dwarn:0   dfail:0   fail:0   skip:25
> fi-ivb-3770      total:246  pass:221  dwarn:0   dfail:0   fail:0   skip:25
> fi-kbl-7200u     total:246  pass:222  dwarn:0   dfail:0   fail:0   skip:24
> fi-skl-6260u     total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14
> fi-skl-6700hq    total:246  pass:223  dwarn:0   dfail:0   fail:0   skip:23
> fi-skl-6700k     total:246  pass:221  dwarn:1   dfail:0   fail:0   skip:24
> fi-skl-6770hq    total:246  pass:229  dwarn:1   dfail:0   fail:1   skip:15
> fi-snb-2520m     total:246  pass:210  dwarn:0   dfail:0   fail:0   skip:36
> fi-snb-2600      total:246  pass:209  dwarn:0   dfail:0   fail:0   skip:37
>
> Results at /archive/results/CI_IGT_test/Patchwork_2708/
>
> dbcf6fbb541e70fac7db669631958eab2e4e0d9c drm-intel-nightly: 2016y-10m-13d-15h-31m-19s UTC integration manifest
> 391ff6c drm/i915: Allocate intel_engine_cs structure only for the enabled engines
Re: [Intel-gfx] ✗ Fi.CI.BAT: warning for drm/i915: Allocate intel_engine_cs structure only for the enabled engines (rev3)
On 10/10/2016 6:03 PM, Patchwork wrote:
> == Series Details ==
>
> Series: drm/i915: Allocate intel_engine_cs structure only for the enabled engines (rev3)
> URL   : https://patchwork.freedesktop.org/series/13435/
> State : warning
>
> == Summary ==
>
> Series 13435v3 drm/i915: Allocate intel_engine_cs structure only for the enabled engines
> https://patchwork.freedesktop.org/api/1.0/series/13435/revisions/3/mbox/
>
> Test vgem_basic:
>         Subgroup unload:
>                 pass       -> SKIP       (fi-skl-6260u)
>                 pass       -> SKIP       (fi-skl-6700hq)
>                 skip       -> PASS       (fi-skl-6700k)

Checked with Chris about the above failure. He said that the unload
failure for the vgem module can't be attributed to the patch; most
likely it is a CI framework issue.

Best regards
Akash

> fi-bdw-5557u     total:248  pass:231  dwarn:0   dfail:0   fail:0   skip:17
> fi-bsw-n3050     total:248  pass:204  dwarn:0   dfail:0   fail:0   skip:44
> fi-bxt-t5700     total:248  pass:217  dwarn:0   dfail:0   fail:0   skip:31
> fi-byt-j1900     total:248  pass:214  dwarn:1   dfail:0   fail:1   skip:32
> fi-byt-n2820     total:248  pass:210  dwarn:0   dfail:0   fail:1   skip:37
> fi-hsw-4770      total:248  pass:224  dwarn:0   dfail:0   fail:0   skip:24
> fi-hsw-4770r     total:248  pass:224  dwarn:0   dfail:0   fail:0   skip:24
> fi-ilk-650       total:248  pass:185  dwarn:0   dfail:0   fail:2   skip:61
> fi-ivb-3520m     total:248  pass:221  dwarn:0   dfail:0   fail:0   skip:27
> fi-ivb-3770      total:248  pass:207  dwarn:0   dfail:0   fail:0   skip:41
> fi-kbl-7200u     total:248  pass:222  dwarn:0   dfail:0   fail:0   skip:26
> fi-skl-6260u     total:248  pass:232  dwarn:0   dfail:0   fail:0   skip:16
> fi-skl-6700hq    total:248  pass:223  dwarn:1   dfail:0   fail:0   skip:24
> fi-skl-6700k     total:248  pass:222  dwarn:1   dfail:0   fail:0   skip:25
> fi-skl-6770hq    total:248  pass:231  dwarn:1   dfail:0   fail:1   skip:15
> fi-snb-2520m     total:248  pass:211  dwarn:0   dfail:0   fail:0   skip:37
> fi-snb-2600      total:248  pass:209  dwarn:0   dfail:0   fail:0   skip:39
>
> Results at /archive/results/CI_IGT_test/Patchwork_2652/
>
> f35ed31aea66b3230c366fcba5f3456ae2cb956e drm-intel-nightly: 2016y-10m-10d-11h-28m-51s UTC integration manifest
> 401facf drm/i915: Allocate intel_engine_cs structure only for the enabled engines
Re: [Intel-gfx] ✗ Fi.CI.BAT: warning for Support for sustained capturing of GuC firmware logs (rev11)
On 10/13/2016 1:18 PM, Tvrtko Ursulin wrote: On 12/10/2016 19:36, Saarinen, Jani wrote: == Series Details == Series: Support for sustained capturing of GuC firmware logs (rev11) URL : https://patchwork.freedesktop.org/series/7910/ State : warning == Summary == Series 7910v11 Support for sustained capturing of GuC firmware logs https://patchwork.freedesktop.org/api/1.0/series/7910/revisions/11/mbox/ Test drv_module_reload_basic: skip -> PASS (fi-skl-6770hq) Test kms_flip: Subgroup basic-flip-vs-modeset: dmesg-warn -> PASS (fi-skl-6770hq) Test kms_pipe_crc_basic: Subgroup nonblocking-crc-pipe-c: pass -> DMESG-WARN (fi-ivb-3770) [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 215 [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 215 Either we have a BAT BZ for this or a new one should be raised. Have filed a new BZ: https://bugs.freedesktop.org/show_bug.cgi?id=98225 Most likely the above failure isn't related with the GuC logging patch set. Moreover GuC based submission (& logging) are anyways disabled by default. Best regards Akash That and resolving the question on how to merge it given the relayfs change, but otherwise is ready. 
Regards, Tvrtko Test kms_psr_sink_crc: Subgroup psr_basic: dmesg-warn -> PASS (fi-skl-6700hq) Test vgem_basic: Subgroup unload: skip -> PASS (fi-kbl-7200u) skip -> PASS (fi-hsw-4770) fi-bdw-5557u total:248 pass:232 dwarn:0 dfail:0 fail:0 skip:16 fi-bsw-n3050 total:248 pass:205 dwarn:0 dfail:0 fail:0 skip:43 fi-bxt-t5700 total:248 pass:217 dwarn:0 dfail:0 fail:0 skip:31 fi-byt-j1900 total:248 pass:213 dwarn:2 dfail:0 fail:1 skip:32 fi-byt-n2820 total:248 pass:211 dwarn:0 dfail:0 fail:1 skip:36 fi-hsw-4770 total:248 pass:225 dwarn:0 dfail:0 fail:0 skip:23 fi-hsw-4770r total:248 pass:225 dwarn:0 dfail:0 fail:0 skip:23 fi-ivb-3520m total:248 pass:222 dwarn:0 dfail:0 fail:0 skip:26 fi-ivb-3770 total:248 pass:221 dwarn:1 dfail:0 fail:0 skip:26 fi-kbl-7200u total:248 pass:223 dwarn:0 dfail:0 fail:0 skip:25 fi-skl-6260u total:248 pass:233 dwarn:0 dfail:0 fail:0 skip:15 fi-skl-6700hqtotal:248 pass:225 dwarn:0 dfail:0 fail:0 skip:23 fi-skl-6700k total:248 pass:222 dwarn:1 dfail:0 fail:0 skip:25 fi-skl-6770hqtotal:248 pass:231 dwarn:1 dfail:0 fail:1 skip:15 fi-snb-2520m total:248 pass:211 dwarn:0 dfail:0 fail:0 skip:37 fi-snb-2600 total:248 pass:210 dwarn:0 dfail:0 fail:0 skip:38 Results at /archive/results/CI_IGT_test/Patchwork_2691/ 14740bb25ec36fe4ce8042af3eb48aeb45e5bc13 drm-intel-nightly: 2016y-10m- 12d-16h-18m-24s UTC integration manifest a590f8c drm/i915: Mark the GuC log buffer flush interrupts handling WQ as freezable a001c3d drm/i915: Early creation of relay channel for capturing boot time logs af3ee1c drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer fbbd457 drm/i915: Debugfs support for GuC logging control 656513f drm/i915: Support for forceful flush of GuC log buffer a68d17f drm/i915: Augment i915 error state to include the dump of GuC log buffer da8274a drm/i915: Increase GuC log buffer size to reduce flush interrupts 4f24c12 drm/i915: Optimization to reduce the sampling time of GuC log buffer 4739ad8 drm/i915: Add stats for GuC log buffer 
flush interrupts
2e8c052 drm/i915: New lock to serialize the Host2GuC actions
954e48b drm/i915: Add a relay backed debugfs interface for capturing GuC logs
23a81bb relay: Use per CPU constructs for the relay channel buffer pointers
8fd01d3 drm/i915: Handle log buffer flush interrupt event from GuC
44610d4 drm/i915: Support for GuC interrupts
05ede72 drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set
ffbd48f drm/i915: New structure to contain GuC logging related fields
317ba9e drm/i915: Add GuC ukernel logging related fields to fw interface file
4832507 drm/i915: Decouple GuC log setup from verbosity parameter

Jani Saarinen
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo
Re: [Intel-gfx] [PATCH v4] tools/intel_guc_logger: Utility for capturing GuC firmware logs in a file
On 10/10/2016 7:22 PM, Tvrtko Ursulin wrote:
> On 10/10/2016 11:59, akash.g...@intel.com wrote:
>> From: Akash Goel
>>
>> This patch provides a test utility which helps capture GuC firmware
>> logs and then dump them to file. The logs are pulled from a debugfs
>> file '/sys/kernel/debug/dri/guc_log' and by default stored into a file
>> 'guc_log_dump.dat'. The name, including the location, of the output
>> file can be changed through a command line argument.
>>
>> The utility goes into an infinite loop where it waits for the arrival
>> of new logs and as soon as a new set of logs is produced it captures
>> them in its local buffer, which is then flushed out to the file on
>> disk. Any time when logging needs to be ended, User can stop this
>> utility (CTRL+C).
>>
>> Before entering into a loop, it first discards whatever logs are
>> present in the debugfs file. This way User can first launch this
>> utility and then start a workload/activity for which GuC firmware logs
>> are to be actually captured and keep running the utility for as long
>> as it's needed, like once the workload is over this utility can be
>> forcefully stopped.
>>
>> If the logging wasn't enabled on GuC side by the Driver at boot time,
>> utility will first enable the logging and later on when it is stopped
>> (CTRL+C) it will also pause the logging on GuC side.
>>
>> v2:
>> - Use combination of alarm system call & SIGALRM signal to run the
>>   utility for required duration. (Tvrtko)
>> - Fix inconsistencies, do minor cleanup and refactoring. (Tvrtko)
>>
>> v3:
>> - Fix discrepancy for the output file command line option and update
>>   the Usage/help string.
>>
>> v4:
>> - Update the exit condition for flusher thread, now will exit only
>>   after the capture loop is over and not when the flag to stop logging
>>   is set. This handles a corner case, due to which the dump of last
>>   captured buffer was getting missed.
>> - Add a newline character at the end of assert messages.
>> - Avoid the assert for the case, which occurs very rarely, when there
>>   are no bytes read from the relay file.
>>
>> Cc: Tvrtko Ursulin
>> Signed-off-by: Akash Goel
>> Reviewed-by: Tvrtko Ursulin (v3)
>> ---
>>  tools/Makefile.sources   |   1 +
>>  tools/intel_guc_logger.c | 438 +++
>>  2 files changed, 439 insertions(+)
>>  create mode 100644 tools/intel_guc_logger.c
>>
>> diff --git a/tools/Makefile.sources b/tools/Makefile.sources
>> index 2bb6c8e..be58871 100644
>> --- a/tools/Makefile.sources
>> +++ b/tools/Makefile.sources
>> @@ -19,6 +19,7 @@ tools_prog_lists =		\
>>  	intel_gpu_time		\
>>  	intel_gpu_top		\
>>  	intel_gtt		\
>> +	intel_guc_logger	\
>>  	intel_infoframes	\
>>  	intel_l3_parity		\
>>  	intel_lid		\
>> diff --git a/tools/intel_guc_logger.c b/tools/intel_guc_logger.c
>> new file mode 100644
>> index 000..159a54e
>> --- /dev/null
>> +++ b/tools/intel_guc_logger.c
>> @@ -0,0 +1,438 @@
>> +
>> +#define _GNU_SOURCE /* For using O_DIRECT */
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +
>> +#include "igt.h"
>> +
>> +#define MB(x) ((uint64_t)(x) * 1024 * 1024)
>> +#ifndef PAGE_SIZE
>> +  #define PAGE_SIZE 4096
>> +#endif
>> +/* Currently the size of GuC log buffer is 19 pages & so is the size of relay
>> + * subbuffer. If the size changes in future, then this define also needs to be
>> + * updated accordingly.
>> + */
>> +#define SUBBUF_SIZE (19*PAGE_SIZE)
>> +/* Need large buffering from logger side to hide the DISK IO latency, Driver
>> + * can only store 8 snapshots of GuC log buffer in relay.
>> + */
>> +#define NUM_SUBBUFS 100
>> +
>> +#define RELAY_FILE_NAME  "guc_log"
>> +#define DEFAULT_OUTPUT_FILE_NAME  "guc_log_dump.dat"
>> +#define CONTROL_FILE_NAME "i915_guc_log_control"
>> +
>> +char *read_buffer;
>> +char *out_filename;
>> +int poll_timeout = 2; /* by default 2ms timeout */
>> +pthread_mutex_t mutex;
>> +pthread_t flush_thread;
>> +int verbosity_level = 3; /* by default capture logs at max verbosity */
>> +uint32_t produced, consumed;
>> +uint64_t total_bytes_written;
>> +int num_buffers = NUM_SUBBUFS;
>> +int relay_fd, outfile_fd = -1;
>> +uint32_t test_duration, max_filesize;
>> +pthread_cond_t underflow_cond, overflow_cond;
>> +bool stop_logging, discard_oldlogs, capturing_stopped;
>> +
>> +static void guc_log_control(bool enable_logging)
>> +{
>> +	int control_fd;
>> +	char data[19];
>> +	uint64_t val;
>> +	int ret;
>> +
>> +	control_fd = igt_debugfs_open(CONTROL_FILE_NAME, O_WRONLY);
>> +	igt_assert_f(control_fd >= 0, "couldn't open the guc log control file\n");
>> +
>> +	val = enable_logging ? ((verbosity_level << 4) | 0x1) : 0;
>> +
>> +	ret = snprintf(data, sizeof(data), "0x%" PRIx64, val);
>> +	igt_assert(ret > 2 && ret < sizeof(data));
>> +
>> +	ret = write(control_fd, data, ret);
>> +	igt_assert_f(ret > 0, "couldn't write to the log control file\n");
>> +
>> +	close(control_fd);
>> +}
>> +
>> +static void int_sig_handler(int sig)
>> +{
>> +	igt_info("received signal %d\n",
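The control-word encoding used by guc_log_control() in the quoted patch can be tried in isolation. The following is a small standalone sketch, not part of the patch: the helper names `guc_log_control_value()` and `format_guc_log_control()` are hypothetical, and the bit layout (enable flag in bit 0, verbosity shifted left by 4) is taken only from the expression shown in the quoted code, not from any interface documentation.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Same expression as in the quoted patch: bit 0 enables logging and the
 * verbosity level is placed above it, shifted left by 4. */
static uint64_t guc_log_control_value(bool enable_logging, int verbosity_level)
{
	return enable_logging ? (((uint64_t)verbosity_level << 4) | 0x1) : 0;
}

/* Format the control word as the hex string that would be written to the
 * control file. Returns the string length, excluding the terminator. */
static int format_guc_log_control(char *buf, size_t len,
				  bool enable_logging, int verbosity_level)
{
	int ret = snprintf(buf, len, "0x%" PRIx64,
			   guc_log_control_value(enable_logging, verbosity_level));

	/* Same sanity check as the patch: at least "0x" plus one digit,
	 * and no truncation. */
	assert(ret > 2 && (size_t)ret < len);
	return ret;
}
```

With the patch's default verbosity of 3, enabling logging yields the string "0x31", and disabling it yields "0x0". The 19-byte buffer in the patch is enough for "0x" plus 16 hex digits and the terminator.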
Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Allocate intel_engine_cs structure only for the enabled engines
On 10/7/2016 5:14 PM, Chris Wilson wrote:
> On Fri, Oct 07, 2016 at 09:58:07AM -, Patchwork wrote:
>> == Series Details ==
>>
>> Series: drm/i915: Allocate intel_engine_cs structure only for the enabled engines
>> URL   : https://patchwork.freedesktop.org/series/13435/
>> State : failure
>>
>> == Summary ==
>>
>> Series 13435v1 drm/i915: Allocate intel_engine_cs structure only for the enabled engines
>> https://patchwork.freedesktop.org/api/1.0/series/13435/revisions/1/mbox/
>>
>> Test drv_module_reload_basic:
>>                 dmesg-warn -> PASS       (fi-ilk-650)
>> Test gem_exec_parallel:
>>         Subgroup basic:
>>                 pass       -> INCOMPLETE (fi-snb-2600)
>> Test gem_sync:
>>         Subgroup basic-store-all:
>>                 pass       -> INCOMPLETE (fi-bxt-t5700)
>>                 pass       -> INCOMPLETE (fi-byt-j1900)
>>                 pass       -> INCOMPLETE (fi-bsw-n3050)
>>                 pass       -> INCOMPLETE (fi-hsw-4770)
>>                 pass       -> INCOMPLETE (fi-skl-6700k)
>>                 pass       -> INCOMPLETE (fi-skl-6770hq)
>>                 pass       -> INCOMPLETE (fi-hsw-4770r)
>>                 pass       -> INCOMPLETE (fi-snb-2520m)
>>                 pass       -> INCOMPLETE (fi-kbl-7200u)
>>                 pass       -> INCOMPLETE (fi-skl-6700hq)
>>                 pass       -> INCOMPLETE (fi-ivb-3520m)
>>                 pass       -> INCOMPLETE (fi-ivb-3770)
>>                 pass       -> INCOMPLETE (fi-bdw-5557u)
>>                 pass       -> INCOMPLETE (fi-skl-6260u)
>
> This is due to missing:
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 8c08ced..44ef6b5 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -415,7 +415,7 @@ intel_engine_sync_index(struct intel_engine_cs *engine,
>  	 * vcs2 -> 0 = rcs, 1 = vcs, 2 = bcs, 3 = vecs;
>  	 */
>
> -	idx = (other - engine) - 1;
> +	idx = (other->id - engine->id) - 1;
>  	if (idx < 0)
>  		idx += I915_NUM_ENGINES;
>
> I believe that's the only case where we compare elements of the array,
> and even scheduled for removal.

Thank you very much for finding this anomaly.
So the cross-engine synchronization was going for a toss, causing the
above tests to get stuck or execute slowly?

Best regards
Akash

> -Chris
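The fix quoted above can be illustrated outside the driver. The expression `other - engine` is only meaningful in C when both pointers refer to elements of the same array; once each engine structure is allocated on its own (which is presumably what the series under test does), the pointer difference no longer corresponds to an index distance, and the id fields have to be used instead. A minimal sketch, assuming hypothetical names `struct demo_engine`, `DEMO_NUM_ENGINES` and `demo_sync_index()`, and an assumed engine count of 5:

```c
#include <assert.h>

/* Assumed number of engines, for illustration only. */
#define DEMO_NUM_ENGINES 5

/* Hypothetical stand-in for struct intel_engine_cs; only the id matters. */
struct demo_engine {
	int id;
};

/* Index of 'other' in the ring of engines that follow 'engine', computed
 * from the ids as in the quoted fix rather than from the addresses. */
static int demo_sync_index(const struct demo_engine *engine,
			   const struct demo_engine *other)
{
	int idx;

	assert(engine->id != other->id);
	assert(engine->id >= 0 && engine->id < DEMO_NUM_ENGINES);
	assert(other->id >= 0 && other->id < DEMO_NUM_ENGINES);

	idx = (other->id - engine->id) - 1;
	if (idx < 0)
		idx += DEMO_NUM_ENGINES;

	return idx;
}
```

Because the result depends only on the ids, it stays the same whether the engine structures live in one array or in separate allocations, which is the property the original pointer arithmetic silently relied on.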
Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for Support for sustained capturing of GuC firmware logs (rev10)
On 9/12/2016 1:53 PM, Patchwork wrote:

== Series Details ==

Series: Support for sustained capturing of GuC firmware logs (rev10)
URL : https://patchwork.freedesktop.org/series/7910/
State : failure

== Summary ==

Series 7910v10 Support for sustained capturing of GuC firmware logs
http://patchwork.freedesktop.org/api/1.0/series/7910/revisions/10/mbox/

Test drv_module_reload_basic:
        skip -> PASS (fi-skl-6260u)
Test kms_cursor_legacy:
        Subgroup basic-cursor-vs-flip-varying-size:
                pass -> FAIL (fi-bsw-n3050)

This subtest seems to have a history of sporadic failures, as per the link
http://benchsrv.fi.intel.com/archive/results/CI_IGT_test/igt@kms_cursor_leg...@basic-cursor-vs-flip-varying-size.html

Filed a new BZ: https://bugs.freedesktop.org/show_bug.cgi?id=97775

The failure is sporadic, and is most likely pre-existing and unrelated to the GuC logging patch set.

Best regards
Akash

fi-bdw-5557u  total:252 pass:236 dwarn:0 dfail:0 fail:1 skip:15
fi-bsw-n3050  total:252 pass:204 dwarn:0 dfail:0 fail:2 skip:46
fi-hsw-4770k  total:252 pass:229 dwarn:0 dfail:0 fail:1 skip:22
fi-hsw-4770r  total:252 pass:225 dwarn:0 dfail:0 fail:1 skip:26
fi-ilk-650    total:252 pass:182 dwarn:0 dfail:0 fail:3 skip:67
fi-ivb-3520m  total:252 pass:220 dwarn:0 dfail:0 fail:1 skip:31
fi-ivb-3770   total:252 pass:220 dwarn:0 dfail:0 fail:1 skip:31
fi-skl-6260u  total:252 pass:237 dwarn:0 dfail:0 fail:1 skip:14
fi-skl-6700k  total:252 pass:222 dwarn:1 dfail:0 fail:1 skip:28
fi-snb-2520m  total:252 pass:206 dwarn:0 dfail:0 fail:1 skip:45
fi-snb-2600   total:252 pass:206 dwarn:0 dfail:0 fail:1 skip:45

Results at /archive/results/CI_IGT_test/Patchwork_2491/

5986f290e25f42d3d5df390411cc43683deb1301 drm-intel-nightly: 2016y-09m-08d-09h-11m-50s UTC integration manifest
b66ec09 drm/i915: Mark the GuC log buffer flush interrupts handling WQ as freezable
f0170a8 drm/i915: Early creation of relay channel for capturing boot time logs
28365d9 drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer
65eafef drm/i915: Debugfs support for GuC logging control
0db5fb8 drm/i915: Support for forceful flush of GuC log buffer
0a7b34a drm/i915: Augment i915 error state to include the dump of GuC log buffer
671b49b drm/i915: Increase GuC log buffer size to reduce flush interrupts
270f061 drm/i915: Optimization to reduce the sampling time of GuC log buffer
a2df951 drm/i915: Add stats for GuC log buffer flush interrupts
4147500 drm/i915: New lock to serialize the Host2GuC actions
e101194 drm/i915: Add a relay backed debugfs interface for capturing GuC logs
eabdd2a relay: Use per CPU constructs for the relay channel buffer pointers
b77518d drm/i915: Handle log buffer flush interrupt event from GuC
de54755 drm/i915: Support for GuC interrupts
c3228bb drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set
1c4e929 drm/i915: New structure to contain GuC logging related fields
b073561 drm/i915: Add GuC ukernel logging related fields to fw interface file
6ed3738 drm/i915: Decouple GuC log setup from verbosity parameter
Re: [Intel-gfx] [PATCH] tools/intel_guc_logger: Utility for capturing GuC firmware logs in a file
On 9/7/2016 3:07 PM, Tvrtko Ursulin wrote:
On 07/09/16 09:44, Chris Wilson wrote:
On Wed, Sep 07, 2016 at 01:40:27PM +0530, Goel, Akash wrote:
On 9/6/2016 9:22 PM, Tvrtko Ursulin wrote:

[snip]

+while (!stop_logging)
+{
+	if (test_duration && (igt_seconds_elapsed() > test_duration)) {

If you agree to allow no poll period then this would not work, right? In that case you would need to use alarm(2) or something.

Can calculate the timeout value for the poll call as:

if (poll_timeout < 0) {
	timeout = test_duration - igt_seconds_elapsed();
}

My point was that with an indefinite poll the loop will not run if there is no log data, so the timeout will not work if implemented like this.

I understood your concern in the first place but probably didn't put forth my point clearly. For more clarity, this is how I think it can be addressed.

--- a/tools/intel_guc_logger.c
+++ b/tools/intel_guc_logger.c
@@ -370,6 +370,8 @@ int main(int argc, char **argv)
 {
 	struct pollfd relay_poll_fd;
 	struct timespec start={};
+	uint32_t time_elapsed;
+	int timeout;
 	int nfds;
 	int ret;
@@ -395,10 +397,17 @@ int main(int argc, char **argv)
 	while (!stop_logging)
 	{
-		if (test_duration && (igt_seconds_elapsed() > test_duration)) {
-			igt_debug("Ran for stipulated %d seconds, exit now\n", test_duration);
-			stop_logging = true;
-			break;
+		timeout = poll_timeout;
+		if (test_duration) {
+			time_elapsed = igt_seconds_elapsed();
+			if (time_elapsed >= test_duration) {
+				igt_debug("Ran for stipulated %d seconds, exit now\n", test_duration);
+				stop_logging = true;
+				break;
+			}
+			if (poll_timeout < 0)
+				timeout = (test_duration - time_elapsed) * 1000;
 		}
 		/* Wait/poll for the new data to be available, relay doesn't
@@ -412,7 +421,7 @@ int main(int argc, char **argv)
 		 * than a jiffy gap between 2 flush interrupts) and relay runs
 		 * out of sub buffers to store the new logs.
 		 */
-		ret = poll(&relay_poll_fd, nfds, poll_timeout);
+		ret = poll(&relay_poll_fd, nfds, timeout);
 		if (ret < 0) {
 			if (errno == EINTR)
 				break;

So we will not poll with an indefinite timeout, and will instead adjust the timeout value as per the test's duration. Does it look ok?

Since the comment still refers to a kernel bug that you've fixed, it can just go. The timeout calculation is indeed more simply expressed as alarm(timeout).

Yes, I wrote privately that's especially true since there is already a handler for SIGINT which would do the right thing for SIGALRM as well. I don't feel so strongly about this, but now that we both think the same, maybe go for the simpler implementation if you don't mind, Akash?

Thanks much for the suggestion. Will use 'alarm(timeout)', it's definitely much simpler.

And fixing the blocking read() is about 10 lines in the kernel...

Haven't checked, but if that is the case, since we are already fixing relayfs issues, it would be good to do that one as well since it would simplify the logger. Because if we do it straight away then we know the logger can use it, and if we leave it for later then it gets uglier for the logger. But if we cannot make the fix go in the same kernel version as (or earlier than) the GuC logging then I think we don't need to block on that.

Sorry, I am not sure whether we would gain much by trying to add support for blocking read in relay. For a regular disk file, which is of a fixed size, it makes sense to have a provision to block the reader until the file's data is paged in from the disk into RAM. But for relay, the data to be read is invariably generated dynamically, which can stop at any time, and thus the reader could get blocked forever. I think the current relay semantics are fine: if there is no data left to be read in the channel buffers, zero will be returned, and clients can get to know about the generation of new data through poll (using a timeout).

Best regards
Akash

Regards,
Tvrtko
Re: [Intel-gfx] [PATCH] tools/intel_guc_logger: Utility for capturing GuC firmware logs in a file
On 9/6/2016 9:22 PM, Tvrtko Ursulin wrote: On 06/09/16 16:33, Goel, Akash wrote: On 9/6/2016 6:47 PM, Tvrtko Ursulin wrote: Hi, On 06/09/16 11:43, akash.g...@intel.com wrote: From: Akash Goel <akash.g...@intel.com> This patch provides a test utility which helps capture GuC firmware logs and then dump them to file. The logs are pulled from a debugfs file '/sys/kernel/debug/dri/guc_log' and stored into a file '/tmp/guc_log_dump.dat', the name of the output file can be changed through a command line argument. The utility goes into an infinite loop where it waits for the arrival of new logs and as soon as new set of logs are produced it captures them in its local buffer which is then flushed out to the file on disk. Any time when logging needs to be ended, User can stop this utility (CTRL+C). Before entering into a loop, it first discards whatever logs are present in the debugfs file. This way User can first launch this utility and then start a workload/activity for which GuC firmware logs are to be actually captured and keep running the utility for as long as its needed, like once the workload is over this utility can be forcefully stopped. If the logging wasn't enabled on GuC side by the Driver at boot time, utility will first enable the logging and later on when it is stopped (CTRL+C) it will also pause the logging on GuC side. 
Signed-off-by: Akash Goel <akash.g...@intel.com> --- tools/Makefile.sources | 1 + tools/intel_guc_logger.c | 441 +++ 2 files changed, 442 insertions(+) create mode 100644 tools/intel_guc_logger.c diff --git a/tools/Makefile.sources b/tools/Makefile.sources index 2bb6c8e..be58871 100644 --- a/tools/Makefile.sources +++ b/tools/Makefile.sources @@ -19,6 +19,7 @@ tools_prog_lists =\ intel_gpu_time\ intel_gpu_top\ intel_gtt\ +intel_guc_logger\ intel_infoframes\ intel_l3_parity\ intel_lid\ diff --git a/tools/intel_guc_logger.c b/tools/intel_guc_logger.c new file mode 100644 index 000..92172fa --- /dev/null +++ b/tools/intel_guc_logger.c @@ -0,0 +1,441 @@ + +#define _GNU_SOURCE /* For using O_DIRECT */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "igt.h" + +#define MB(x) ((uint64_t)(x) * 1024 * 1024) +#ifndef PAGE_SIZE + #define PAGE_SIZE 4096 +#endif +#define SUBBUF_SIZE (19*PAGE_SIZE) +/* Need large buffering from logger side to hide the DISK IO latency, Driver + * can only store 8 snapshots of GuC log buffer in relay. 
+ */ +#define NUM_SUBBUFS 100 + +#define RELAY_FILE_NAME "guc_log" +#define CONTROL_FILE_NAME "i915_guc_log_control" + +char *read_buffer; +char *out_filename; +int poll_timeout = 2; /* by default 2ms timeout */ +pthread_mutex_t mutex; +pthread_t flush_thread; +int verbosity_level = 3; /* by default capture logs at max verbosity */ +uint32_t produced, consumed; +uint64_t total_bytes_written; +int num_buffers = NUM_SUBBUFS; +int relay_fd, outfile_fd = -1; +bool stop_logging, discard_oldlogs; +uint32_t test_duration, max_filesize; +pthread_cond_t underflow_cond, overflow_cond; + +static void guc_log_control(bool enable_logging) +{ +int control_fd; +char data[19]; +uint64_t val; +int ret; + +control_fd = igt_debugfs_open(CONTROL_FILE_NAME, O_WRONLY); +if (control_fd < 0) +igt_assert_f(0, "Couldn't open the guc log control file"); + +val = enable_logging ? ((verbosity_level << 4) | 0x1) : 0; + +snprintf(data, sizeof(data), "0x%" PRIx64, val); +ret = write(control_fd, data, strlen(data) + 1); Minor: It looks safe like it is but something like below would maybe be more robust? ret = snprintf(data, sizeof(data), "0x%" PRIx64, val); igt_assert(ret > 2 && ret < sizeof(data)); ok will add, but possibility of failure will be really remote here. but igt_assert(ret > 0) should suffice. Yes there is no possibility for failure as it stands, just more robust implementation should someone change something in the future. That's why I said you could also decide to keep it as is. My version also avoided the strlen since snprintf already tells you that. fine, will use your version then. ret = write(control_fd, data, ret); igt_assert(ret > 0); // assuming short writes can't happen Up to you. 
+if (ret < 0) +igt_assert_f(0, "Couldn't write to the log control file"); + +close(control_fd); +} + +static void int_sig_handler(int sig) +{ +igt_info("Received signal %d\n", sig); + +stop_logging = true; +} + +static void pull_leftover_data(void) +{ +unsigned int bytes_read = 0; +int ret; + +while (1) { +/* Read the logs from relay buffer
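The more robust snprintf() handling suggested in the review could look something like the sketch below. The `format_log_control()` helper is hypothetical and only models the formatting step; the value encoding (verbosity shifted left by 4, bit 0 as the enable flag) is copied from the quoted patch. The length returned by snprintf() is validated once and can then be passed straight to write(), so no strlen() is needed.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the log control value as hex text.
 * Returns the string length (excluding the NUL), or -1 if the output
 * is implausibly short or did not fit in the buffer.
 */
static int format_log_control(char *buf, size_t size, int enable,
			      unsigned int verbosity)
{
	uint64_t val = enable ? (((uint64_t)verbosity << 4) | 0x1) : 0;
	int len;

	assert(buf != NULL && size > 0);

	len = snprintf(buf, size, "0x%" PRIx64, val);

	/* "0x" plus at least one digit, and no truncation. */
	if (len <= 2 || (size_t)len >= size)
		return -1;

	return len;
}
```

For verbosity 3 with logging enabled this produces "0x31" and returns 4; with a 3-byte buffer the same call returns -1 instead of silently truncating.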
Re: [Intel-gfx] [PATCH] tools/intel_guc_logger: Utility for capturing GuC firmware logs in a file
On 9/6/2016 6:47 PM, Tvrtko Ursulin wrote: Hi, On 06/09/16 11:43, akash.g...@intel.com wrote: From: Akash GoelThis patch provides a test utility which helps capture GuC firmware logs and then dump them to file. The logs are pulled from a debugfs file '/sys/kernel/debug/dri/guc_log' and stored into a file '/tmp/guc_log_dump.dat', the name of the output file can be changed through a command line argument. The utility goes into an infinite loop where it waits for the arrival of new logs and as soon as new set of logs are produced it captures them in its local buffer which is then flushed out to the file on disk. Any time when logging needs to be ended, User can stop this utility (CTRL+C). Before entering into a loop, it first discards whatever logs are present in the debugfs file. This way User can first launch this utility and then start a workload/activity for which GuC firmware logs are to be actually captured and keep running the utility for as long as its needed, like once the workload is over this utility can be forcefully stopped. If the logging wasn't enabled on GuC side by the Driver at boot time, utility will first enable the logging and later on when it is stopped (CTRL+C) it will also pause the logging on GuC side. 
Signed-off-by: Akash Goel --- tools/Makefile.sources | 1 + tools/intel_guc_logger.c | 441 +++ 2 files changed, 442 insertions(+) create mode 100644 tools/intel_guc_logger.c diff --git a/tools/Makefile.sources b/tools/Makefile.sources index 2bb6c8e..be58871 100644 --- a/tools/Makefile.sources +++ b/tools/Makefile.sources @@ -19,6 +19,7 @@ tools_prog_lists =\ intel_gpu_time\ intel_gpu_top\ intel_gtt\ +intel_guc_logger\ intel_infoframes\ intel_l3_parity\ intel_lid\ diff --git a/tools/intel_guc_logger.c b/tools/intel_guc_logger.c new file mode 100644 index 000..92172fa --- /dev/null +++ b/tools/intel_guc_logger.c @@ -0,0 +1,441 @@ + +#define _GNU_SOURCE /* For using O_DIRECT */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "igt.h" + +#define MB(x) ((uint64_t)(x) * 1024 * 1024) +#ifndef PAGE_SIZE + #define PAGE_SIZE 4096 +#endif +#define SUBBUF_SIZE (19*PAGE_SIZE) +/* Need large buffering from logger side to hide the DISK IO latency, Driver + * can only store 8 snapshots of GuC log buffer in relay. 
+ */ +#define NUM_SUBBUFS 100 + +#define RELAY_FILE_NAME "guc_log" +#define CONTROL_FILE_NAME "i915_guc_log_control" + +char *read_buffer; +char *out_filename; +int poll_timeout = 2; /* by default 2ms timeout */ +pthread_mutex_t mutex; +pthread_t flush_thread; +int verbosity_level = 3; /* by default capture logs at max verbosity */ +uint32_t produced, consumed; +uint64_t total_bytes_written; +int num_buffers = NUM_SUBBUFS; +int relay_fd, outfile_fd = -1; +bool stop_logging, discard_oldlogs; +uint32_t test_duration, max_filesize; +pthread_cond_t underflow_cond, overflow_cond; + +static void guc_log_control(bool enable_logging) +{ +int control_fd; +char data[19]; +uint64_t val; +int ret; + +control_fd = igt_debugfs_open(CONTROL_FILE_NAME, O_WRONLY); +if (control_fd < 0) +igt_assert_f(0, "Couldn't open the guc log control file"); + +val = enable_logging ? ((verbosity_level << 4) | 0x1) : 0; + +snprintf(data, sizeof(data), "0x%" PRIx64, val); +ret = write(control_fd, data, strlen(data) + 1); Minor: It looks safe like it is but something like below would maybe be more robust? ret = snprintf(data, sizeof(data), "0x%" PRIx64, val); igt_assert(ret > 2 && ret < sizeof(data)); ok will add, but possibility of failure will be really remote here. but igt_assert(ret > 0) should suffice. ret = write(control_fd, data, ret); igt_assert(ret > 0); // assuming short writes can't happen Up to you. 
+if (ret < 0) +igt_assert_f(0, "Couldn't write to the log control file"); + +close(control_fd); +} + +static void int_sig_handler(int sig) +{ +igt_info("Received signal %d\n", sig); + +stop_logging = true; +} + +static void pull_leftover_data(void) +{ +unsigned int bytes_read = 0; +int ret; + +while (1) { +/* Read the logs from relay buffer */ +ret = read(relay_fd, read_buffer, SUBBUF_SIZE); +if (!ret) +break; +else if (ret < 0) +igt_assert_f(0, "Failed to read from the guc log file"); +else if (ret < SUBBUF_SIZE) +igt_assert_f(0, "invalid read from relay file"); + +bytes_read += ret; + +if (outfile_fd > 0) { = 0 I think. Or is it even needed since open_output_file asserts if it fails to open? Actually pull_leftover_data() will be called twice, once before opening the
Re: [Intel-gfx] [PATCH 15/19] drm/i915: Debugfs support for GuC logging control
On 8/19/2016 11:48 PM, Chris Wilson wrote:
On Fri, Aug 19, 2016 at 02:13:14PM +0530, akash.g...@intel.com wrote:

+static int i915_guc_log_control_get(void *data, u64 *val)
+{
+	struct drm_device *dev = data;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+
+	if (!dev_priv->guc.log.vma)
+		return -EINVAL;

return -ENODEV;

Fine, will change the return code.

+
+	*val = i915.guc_log_level;
+
+	return 0;
+}
+
+static int i915_guc_log_control_set(void *data, u64 val)
+{
+	struct drm_device *dev = data;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	int ret;

if (!dev_priv->guc.log.vma)
	return -ENODEV;

you don't need struct_mutex to check for its existence.

Fine, will lock the struct_mutex after the NULL vma check.
Re: [Intel-gfx] [PATCH 14/19] drm/i915: Forcefully flush GuC log buffer on reset
On 8/19/2016 11:40 PM, Chris Wilson wrote:
On Fri, Aug 19, 2016 at 02:13:13PM +0530, akash.g...@intel.com wrote:

From: Sagar Arun Kamble

Before capturing the GuC logs as a part of error state, there should be a force log buffer flush action sent to GuC before proceeding with GPU reset and re-initializing GuC. There could be some data in the log buffer which is yet to be captured, and those logs would be particularly useful to understand why the GPU reset was initiated.

There's no point if we can't wait for any writes to complete, so just take the snapshot of the log at the time of the hang.

+void i915_guc_flush_logs(struct drm_i915_private *dev_priv, bool can_wait)
+{
+	if (!i915.enable_guc_submission || (i915.guc_log_level < 0))
+		return;
+
+	/* First disable the interrupts, will be renabled afterwards */
+	gen9_disable_guc_interrupts(dev_priv);

calls synchronize_irq() which is also illegal from the atomic context of error capture.

Fine, will not call gen9_disable_guc_interrupts, just like flush_work, from the error state capture path. But I feel it could still be useful to invoke host2guc_force_logbuffer_flush().

Best regards
Akash

-Chris
Re: [Intel-gfx] [PATCH 17/19] drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer
On 8/19/2016 11:49 PM, Chris Wilson wrote:
On Fri, Aug 19, 2016 at 02:13:16PM +0530, akash.g...@intel.com wrote:

From: Akash Goel

In order to have fast reads from the GuC log buffer, used SSE4.1 movntdqa based memcpy function i915_memcpy_from_wc. GuC log buffer has a WC type vmalloc mapping and copying using movntdqa from WC type memory is almost as fast as reading from WB memory. This will further reduce the log buffer sampling time, so is needed dearly to deal with the flush interrupt storm when GuC is generating logs at a very high rate. Ideally SSE 4.1 should be present on all chipsets supporting GuC based submissions, but if not then logging will not be enabled.

v2: Rebase.

Suggested-by: Chris Wilson
Signed-off-by: Akash Goel
Reviewed-by: Tvrtko Ursulin

Should be squashed with patch 16 (use MAP_WC).

Fine, will squash, but could you please tell what issue there could be with the 2 patches being separate? Either both will be merged or none of them will be merged.

Best regards
Akash

-Chris
Re: [Intel-gfx] [PATCH 08/19] drm/i915: Add a relay backed debugfs interface for capturing GuC logs
On 8/19/2016 11:33 PM, Chris Wilson wrote: On Fri, Aug 19, 2016 at 02:13:07PM +0530, akash.g...@intel.com wrote: static void *guc_get_write_buffer(struct intel_guc *guc) { - return NULL; + /* FIXME: Cover the check under a lock ? */ + if (!guc->log.relay_chan) + return NULL; + + /* Just get the base address of a new sub buffer and copy data into it +* ourselves. NULL will be returned in no-overwrite mode, if all sub +* buffers are full. Could have used the relay_write() to indirectly +* copy the data, but that would have been bit convoluted, as we need to +* write to only certain locations inside a sub buffer which cannot be +* done without using relay_reserve() along with relay_write(). So its +* better to use relay_reserve() alone. +*/ + return relay_reserve(guc->log.relay_chan, 0); } You have to chase through the code a long way to check whether or not the allocation is correct. Please do consider adding a check such as GEM_BUG_ON(guc->log.relay_chan->size < guc->log.vma->size); (near the allocation) Fine, will add a check after the allocation, but not sure how useful it will be, as we shall trust relay to do the memory allocation for the sub-buffers as per the requested 'subbuf_size'. subbuf_size = guc->log.vma->obj->base.size; n_subbufs = 8; guc_log_relay_chan = relay_open(NULL, NULL, subbuf_size, n_subbufs, _callbacks, dev_priv); GEM_BUG_ON(guc->log.relay_chan->subbuf_size < guc->log.vma->obj->base.size); GEM_BUG_ON(write_offset + buffer_size > guc->log.relay_chan->size); (before the memcpy, or whatever is appropriate). There is a check already for read_offset/write_offset before the memcpy. I think it would be better to add this check GEM_BUG_ON(guc->log.relay_chan->subbuf_size < guc->log.vma->obj->base.size); just before return relay_reserve(guc->log.relay_chan, 0); Best regards Akash Just to leave some clues to the reader as to what is going on. 
-Chris
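As a rough userspace model of the two points discussed above (a reserve call that hands back the base of the next sub buffer or NULL when all are full in no-overwrite mode, and a size sanity check placed right before it), consider the sketch below. It is not the kernel relay API: `struct subbuf_chan`, `chan_reserve()` and `chan_consume()` are hypothetical names, and assert() stands in for GEM_BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a relay channel made of fixed-size sub buffers. */
struct subbuf_chan {
	char *base;          /* n_subbufs * subbuf_size bytes of storage */
	size_t subbuf_size;
	size_t n_subbufs;
	size_t produced;     /* sub buffers handed out to the writer */
	size_t consumed;     /* sub buffers released by the reader */
};

/*
 * Return the base address of the next free sub buffer, or NULL if every
 * sub buffer is still unread (no-overwrite behaviour). The assert mirrors
 * the proposed check that one snapshot always fits in one sub buffer.
 */
static void *chan_reserve(struct subbuf_chan *chan, size_t snapshot_size)
{
	size_t slot;

	assert(chan->subbuf_size >= snapshot_size);

	if (chan->produced - chan->consumed >= chan->n_subbufs)
		return NULL;

	slot = chan->produced++ % chan->n_subbufs;
	return chan->base + slot * chan->subbuf_size;
}

/* Reader side: mark the oldest sub buffer as consumed, if there is one. */
static void chan_consume(struct subbuf_chan *chan)
{
	if (chan->consumed < chan->produced)
		chan->consumed++;
}
```

Keeping the size check next to the reserve call leaves the reader with the clue Chris asked for: the caller copies a whole snapshot into the returned sub buffer, so the two sizes must agree.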
Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend
On 8/18/2016 8:25 PM, Imre Deak wrote:
On Thu, 2016-08-18 at 20:05 +0530, Goel, Akash wrote:
On 8/18/2016 7:48 PM, Imre Deak wrote:
On Thu, 2016-08-18 at 19:17 +0530, Goel, Akash wrote:

[...]

Thanks for the inputs. Sorry, not familiar with freezable WQ semantics. But after looking at the code, this is what I understood:

1. Freezable workqueues will be frozen before the system suspend callbacks are invoked for the devices.

Yes.

2. Any work item queued after the WQ is marked frozen will be scheduled later, on resume.

Yes.

3. But if a work item was already present in the freezable workqueue before it was frozen, and it did not complete, then system suspend itself will be aborted.

System suspend will be aborted only if any kernel thread didn't complete within a reasonable amount of time (freeze_timeout_msecs, 20 sec by default). Otherwise already queued items will be properly waited upon and suspend will proceed.

Sorry for getting this wrong. What I understood is that even if there are pending work items on freezable WQ after freeze_timeout_msecs, then also system suspend would be performed.

In case of timeout suspend_prepare()->suspend_freeze_processes()->freeze_kernel_threads()->try_to_freeze_tasks() will return -EBUSY and suspend will fail.

So sorry, there was a typo in my last mail: instead of writing 'system suspend would be aborted', I wrote 'system suspend would be performed'.

Sorry, couldn't find an explicit/synchronous wait in kernel for the pending work items for freezable WQs, but it doesn't matter.

The above try_to_freeze_tasks() will wait until freeze_workqueues_busy() indicates that there are no work items active on any freezable queues.

Thanks much for clarifying. I will go through that function again.

Best regards
Akash

--Imre

4. So if the log.flush_wq is marked as freezable, then flush of work item will not be required for the system suspend case. And runtime suspend case is already covered with rpm get/put around register access in work item function.

Yes.

It seems there are 2 config options, CONFIG_SUSPEND_FREEZER

This is set whenever system suspend is enabled.

and CONFIG_FREEZER

This is set except for one platform (powerpc), where I assume freezing of the tasks is achieved in a different way. In any case it doesn't matter for us.

Many thanks for providing all this info. Will then mark the log.flush_wq as freezable.

Best regards
Akash

--Imre

which have to be enabled for all the above to happen. If these config options will always be enabled then probably marking log.flush_wq would work. Please kindly confirm whether I understood correctly or not, accordingly will proceed further.

Best regards
Akash

--Imre
Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend
On 8/18/2016 7:48 PM, Imre Deak wrote:
On Thu, 2016-08-18 at 19:17 +0530, Goel, Akash wrote:

[...]

Thanks for the inputs. Sorry, not familiar with freezable WQ semantics. But after looking at the code, this is what I understood:

1. Freezable workqueues will be frozen before the system suspend callbacks are invoked for the devices.

Yes.

2. Any work item queued after the WQ is marked frozen will be scheduled later, on resume.

Yes.

3. But if a work item was already present in the freezable workqueue before it was frozen, and it did not complete, then system suspend itself will be aborted.

System suspend will be aborted only if any kernel thread didn't complete within a reasonable amount of time (freeze_timeout_msecs, 20 sec by default). Otherwise already queued items will be properly waited upon and suspend will proceed.

Sorry for getting this wrong. What I understood is that even if there are pending work items on freezable WQ after freeze_timeout_msecs, then also system suspend would be performed. Sorry, couldn't find an explicit/synchronous wait in kernel for the pending work items for freezable WQs, but it doesn't matter.

4. So if the log.flush_wq is marked as freezable, then flush of work item will not be required for the system suspend case. And runtime suspend case is already covered with rpm get/put around register access in work item function.

Yes.

It seems there are 2 config options, CONFIG_SUSPEND_FREEZER

This is set whenever system suspend is enabled.

and CONFIG_FREEZER

This is set except for one platform (powerpc), where I assume freezing of the tasks is achieved in a different way. In any case it doesn't matter for us.

Many thanks for providing all this info. Will then mark the log.flush_wq as freezable.

Best regards
Akash

--Imre

which have to be enabled for all the above to happen. If these config options will always be enabled then probably marking log.flush_wq would work. Please kindly confirm whether I understood correctly or not, accordingly will proceed further.

Best regards
Akash

--Imre
Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend
On 8/18/2016 6:29 PM, Imre Deak wrote:
On Thu, 2016-08-18 at 16:54 +0530, Goel, Akash wrote:
On 8/18/2016 4:25 PM, Imre Deak wrote:
On Thu, 2016-08-18 at 09:15 +0530, Goel, Akash wrote:
On 8/17/2016 9:07 PM, Goel, Akash wrote:
On 8/17/2016 6:41 PM, Imre Deak wrote:
On Wed, 2016-08-17 at 18:15 +0530, Goel, Akash wrote:
On 8/17/2016 5:11 PM, Chris Wilson wrote:
On Wed, Aug 17, 2016 at 12:27:30PM +0100, Tvrtko Ursulin wrote:

+int intel_guc_suspend(struct drm_device *dev, bool rpm_suspend)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_guc *guc = &dev_priv->guc;
@@ -1530,6 +1530,12 @@ int intel_guc_suspend(struct drm_device *dev)
 		return 0;
 	gen9_disable_guc_interrupts(dev_priv);
+	/* Sync is needed only for the system suspend case, runtime suspend
+	 * case is covered due to rpm get/put calls used around Hw access in
+	 * the work item function.
+	 */
+	if (!rpm_suspend && (i915.guc_log_level >= 0))
+		flush_work(&dev_priv->guc.log.flush_work);

In which case (rpm suspend) the flush_work is idle and this is a noop. That you have to pass around such state suggests that you are papering over a bug?

In case of rpm suspend the flush_work may not be a NOOP. Can use the flush_work for runtime suspend also, but in spite of that can't prevent the 'RPM wakelock' asserts, as the work item can get executed after the rpm ref count drops to zero and before runtime suspend kicks in (after autosuspend delay). For that you had earlier suggested to use rpm get/put in the work item function, around the register access, but with that had to remove the flush_work from the suspend hook, otherwise a deadlock can happen. So doing the flush_work conditionally for system suspend case, as rpm get/put won't cause the resume of device in that case. Actually I had discussed this with Imre and prepared this patch as per his inputs.

There would be this alternative:

Thanks much for suggesting the alternate approach. Just to confirm whether I understood everything correctly,

in gen9_guc_irq_handler():
	WARN_ON(!intel_runtime_pm_get_if_in_use());

Used WARN, as we don't expect the device to be suspended at this juncture, so intel_runtime_pm_get_if_in_use() should return true.

	if (!queue_work(log.flush_work))

If queue_work returns 0, then the work item is already pending, so it won't be queued, hence can release the rpm ref count right away.

		intel_runtime_pm_put();

and dropping the reference at the end of the work item.

This will be just like the __intel_autoenable_gt_powersave

This would make the flush_work() a nop in case of runtime_suspend().

So can call the flush_work unconditionally. Hope I understood it correctly.

Yes, the above is correct except for my mistake in handling intel_runtime_pm_get_if_in_use() returning false as discussed below.

Hi Imre,

You had suggested to use the below code from irq handler, suspecting that intel_runtime_pm_get_if_in_use() can return false, if interrupt gets handled just after device goes out of use.

if (intel_runtime_pm_get_if_in_use()) {
	if (!queue_work(log.flush_work))
		intel_runtime_pm_put();
}

Do you mean to say that interrupt can come when rpm suspend has already started but before the interrupt is disabled from the suspend hook? Like if interrupt comes between 1) & 4), then runtime_pm_get_if_in_use() will return false.

1) Autosuspend delay elapses (device is marked as suspending)
2) intel_runtime_suspend
3) intel_guc_suspend
4) gen9_disable_guc_interrupts(dev_priv);

No, it can return false anytime the last RPM reference is dropped, that is even before the autosuspend delay elapses.

Sorry, I missed that pm_runtime_get_if_in_use() will return 0 if RPM ref count has dropped to 0, even if device is still in runtime active state (as autosuspend delay has not elapsed).
> But that still makes the likelihood for a missed work item scheduling small, because 1) we want to reduce the autosuspend delay considerably from the current 10 sec and 2) because what you say below about the GPU actually idling before the RPM refcount going to 0. If the above hypothesis is correct, then it implies that interrupt has to come after autosuspend delay has elapsed for the above scenario to arise. I think it would be unlikely for the interrupt to come so late because device would have gone idle just before the autosuspend period started and so no GuC submissions would have been done after that. Right. So the probability of missing a work item could be very less and we can bear that. I haven't looked into what is the consequence of missing a work item, you know this better. In any case - since it is still a possibility - if it's a problem you could still make sure in intel_guc_suspend() that any pending work is completed by calling gu
Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend
On 8/18/2016 4:25 PM, Imre Deak wrote: On Thu, 2016-08-18 at 09:15 +0530, Goel, Akash wrote: On 8/17/2016 9:07 PM, Goel, Akash wrote: On 8/17/2016 6:41 PM, Imre Deak wrote: On Wed, 2016-08-17 at 18:15 +0530, Goel, Akash wrote: On 8/17/2016 5:11 PM, Chris Wilson wrote: On Wed, Aug 17, 2016 at 12:27:30PM +0100, Tvrtko Ursulin wrote: +int intel_guc_suspend(struct drm_device *dev, bool rpm_suspend) { struct drm_i915_private *dev_priv = to_i915(dev); struct intel_guc *guc = &dev_priv->guc; @@ -1530,6 +1530,12 @@ int intel_guc_suspend(struct drm_device *dev) return 0; gen9_disable_guc_interrupts(dev_priv); +/* Sync is needed only for the system suspend case, runtime suspend + * case is covered due to rpm get/put calls used around Hw access in + * the work item function. + */ +if (!rpm_suspend && (i915.guc_log_level >= 0)) +flush_work(&dev_priv->guc.log.flush_work); In which case (rpm suspend) the flush_work is idle and this a noop. That you have to pass around such state suggests that you are papering over a bug? In case of rpm suspend the flush_work may not be a NOOP. Can use the flush_work for runtime suspend also but in spite of that can't prevent the 'RPM wakelock' asserts, as the work item can get executed after the rpm ref count drops to zero and before runtime suspend kicks in (after autosuspend delay). For that you had earlier suggested to use rpm get/put in the work item function, around the register access, but with that had to remove the flush_work from the suspend hook, otherwise a deadlock can happen. So doing the flush_work conditionally for system suspend case, as rpm get/put won't cause the resume of device in that case. Actually I had discussed about this with Imre and as per his inputs prepared this patch. There would be this alternative: Thanks much for suggesting the alternate approach.
Just to confirm whether I understood everything correctly, in gen9_guc_irq_handler(): WARN_ON(!intel_runtime_pm_get_if_in_use()); Used WARN, as we don't expect the device to be suspended at this juncture, so intel_runtime_pm_get_if_in_use() should return true. if (!queue_work(log.flush_work)) If queue_work returns 0, then work item is already pending, so it won't be queued hence can release the rpm ref count now only. intel_runtime_pm_put(); and dropping the reference at the end of the work item. This will be just like the __intel_autoenable_gt_powersave This would make the flush_work() a nop in case of runtime_suspend(). So can call the flush_work unconditionally. Hope I understood it correctly. Yes, the above is correct except for my mistake in handling intel_runtime_pm_get_if_in_use() returning false as discussed below. Hi Imre, You had suggested to use the below code from irq handler, suspecting that intel_runtime_pm_get_if_in_use() can return false, if interrupt gets handled just after device goes out of use. if (intel_runtime_pm_get_if_in_use()) { if (!queue_work(log.flush_work)) intel_runtime_pm_put(); } Do you mean to say that interrupt can come when rpm suspend has already started but before the interrupt is disabled from the suspend hook? Like if interrupt comes b/w 1) & 4), then runtime_pm_get_if_in_use() will return false. 1) Autosuspend delay elapses (device is marked as suspending) 2) intel_runtime_suspend 3) intel_guc_suspend 4) gen9_disable_guc_interrupts(dev_priv); No, it can return false anytime the last RPM reference is dropped, that is even before the autosuspend delay elapses. Sorry I missed that pm_runtime_get_if_in_use() will return 0 if RPM ref count has dropped to 0, even if device is still in runtime active state (as autosuspend delay has not elapsed).
> But that still makes the likelihood for a missed work item scheduling small, because 1) we want to reduce the autosuspend delay considerably from the current 10 sec and 2) because what you say below about the GPU actually idling before the RPM refcount going to 0. If the above hypothesis is correct, then it implies that interrupt has to come after autosuspend delay has elapsed for the above scenario to arise. I think it would be unlikely for the interrupt to come so late because device would have gone idle just before the autosuspend period started and so no GuC submissions would have been done after that. Right. So the probability of missing a work item could be very less and we can bear that. I haven't looked into what is the consequence of missing a work item, you know this better. In any case - since it is still a possibility - if it's a problem you could still make sure in intel_guc_suspend() that any pending work is completed by calling guc_read_update_log_buffer(), host2guc_logbuffer_flush_complete() if necessary after disabling interrupts in intel
Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend
On 8/17/2016 9:07 PM, Goel, Akash wrote: On 8/17/2016 6:41 PM, Imre Deak wrote: On Wed, 2016-08-17 at 18:15 +0530, Goel, Akash wrote: On 8/17/2016 5:11 PM, Chris Wilson wrote: On Wed, Aug 17, 2016 at 12:27:30PM +0100, Tvrtko Ursulin wrote: +int intel_guc_suspend(struct drm_device *dev, bool rpm_suspend) { struct drm_i915_private *dev_priv = to_i915(dev); struct intel_guc *guc = &dev_priv->guc; @@ -1530,6 +1530,12 @@ int intel_guc_suspend(struct drm_device *dev) return 0; gen9_disable_guc_interrupts(dev_priv); +/* Sync is needed only for the system suspend case, runtime suspend + * case is covered due to rpm get/put calls used around Hw access in + * the work item function. + */ +if (!rpm_suspend && (i915.guc_log_level >= 0)) +flush_work(&dev_priv->guc.log.flush_work); In which case (rpm suspend) the flush_work is idle and this a noop. That you have to pass around such state suggests that you are papering over a bug? In case of rpm suspend the flush_work may not be a NOOP. Can use the flush_work for runtime suspend also but in spite of that can't prevent the 'RPM wakelock' asserts, as the work item can get executed after the rpm ref count drops to zero and before runtime suspend kicks in (after autosuspend delay). For that you had earlier suggested to use rpm get/put in the work item function, around the register access, but with that had to remove the flush_work from the suspend hook, otherwise a deadlock can happen. So doing the flush_work conditionally for system suspend case, as rpm get/put won't cause the resume of device in that case. Actually I had discussed about this with Imre and as per his inputs prepared this patch. There would be this alternative: Thanks much for suggesting the alternate approach. Just to confirm whether I understood everything correctly, in gen9_guc_irq_handler(): WARN_ON(!intel_runtime_pm_get_if_in_use()); Used WARN, as we don't expect the device to be suspended at this juncture, so intel_runtime_pm_get_if_in_use() should return true.
if (!queue_work(log.flush_work)) If queue_work returns 0, then work item is already pending, so it won't be queued hence can release the rpm ref count now only. intel_runtime_pm_put(); and dropping the reference at the end of the work item. This will be just like the __intel_autoenable_gt_powersave This would make the flush_work() a nop in case of runtime_suspend(). So can call the flush_work unconditionally. Hope I understood it correctly. Hi Imre, You had suggested to use the below code from irq handler, suspecting that intel_runtime_pm_get_if_in_use() can return false, if interrupt gets handled just after device goes out of use. if (intel_runtime_pm_get_if_in_use()) { if (!queue_work(log.flush_work)) intel_runtime_pm_put(); } Do you mean to say that interrupt can come when rpm suspend has already started but before the interrupt is disabled from the suspend hook ? Like if interrupt comes b/w 1) & 4), then runtime_pm_get_if_in_use() will return false. 1) Autosuspend delay elapses (device is marked as suspending) 2) intel_runtime_suspend 3) intel_guc_suspend 4) gen9_disable_guc_interrupts(dev_priv); If the above hypothesis is correct, then it implies that interrupt has to come after autosuspend delay has elapsed for the above scenario to arise. I think it would be unlikely for the interrupt to come so late because device would have gone idle just before the autosuspend period started and so no GuC submissions would have been done after that. So the probability of missing a work item could be very less and we can bear that. Best regards Akash Best regards Akash --Imre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
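The reference hand-off discussed above (interrupt handler takes a runtime-PM reference only if the device is in use, the work item drops it) can be sketched as a small userspace model. This is only an illustration under stated assumptions: every mock_* name is hypothetical, and the semantics of the get-if-in-use helper follow what is said in the thread (no reference is taken once the usage count has reached zero), not the real kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device's runtime-PM and workqueue state. */
struct mock_dev {
	int rpm_usage;      /* runtime-PM reference count */
	bool work_pending;  /* flush work item already queued? */
};

/* Assumed semantics: take a reference only if one is already held. */
static bool mock_rpm_get_if_in_use(struct mock_dev *d)
{
	if (d->rpm_usage == 0)
		return false;
	d->rpm_usage++;
	return true;
}

static void mock_rpm_put(struct mock_dev *d)
{
	assert(d->rpm_usage > 0);
	d->rpm_usage--;
}

/* Like queue_work(): returns false if the item is already pending. */
static bool mock_queue_work(struct mock_dev *d)
{
	if (d->work_pending)
		return false;
	d->work_pending = true;
	return true;
}

/* Interrupt handler side of the pattern. */
static void mock_irq_handler(struct mock_dev *d)
{
	if (!mock_rpm_get_if_in_use(d))
		return;            /* device not in use: this flush is skipped */
	if (!mock_queue_work(d))
		mock_rpm_put(d);   /* already queued, that instance owns a ref */
}

/* Work item: hardware access would happen here, then the ref is dropped. */
static void mock_flush_work_fn(struct mock_dev *d)
{
	d->work_pending = false;
	mock_rpm_put(d);
}
```

With this hand-off each queued work item holds exactly one reference, so the device cannot runtime suspend while the item is outstanding. The cost, as noted in the thread, is that an interrupt arriving after the last reference was dropped is silently skipped.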
Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend
On 8/17/2016 6:41 PM, Imre Deak wrote: On ke, 2016-08-17 at 18:15 +0530, Goel, Akash wrote: On 8/17/2016 5:11 PM, Chris Wilson wrote: On Wed, Aug 17, 2016 at 12:27:30PM +0100, Tvrtko Ursulin wrote: On 17/08/16 11:14, akash.g...@intel.com wrote: From: Akash Goel <akash.g...@intel.com> The GuC log buffer flush work item does a register access to send the ack to GuC and this work item, if not synced before suspend, can potentially get executed after the GFX device is suspended. The work item function uses rpm_get/rpm_put calls around the Hw access, this covers the runtime suspend case but for system suspend case (which can be done asychronously/forcefully) sync would be required as kernel can potentially schedule the work items even after some devices, including GFX, have been put to suspend. Also sync has to be done conditionally i.e. only for the system suspend case, as sync along with rpm_get/rpm_put calls can cause a deadlock for rpm suspend path. Cc: Imre Deak <imre.d...@intel.com> Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_drv.c| 4 ++-- drivers/gpu/drm/i915/i915_guc_submission.c | 8 +++- drivers/gpu/drm/i915/intel_guc.h | 2 +- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index cdee60b..2ae0ad4 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1427,7 +1427,7 @@ static int i915_drm_suspend(struct drm_device *dev) goto out; } - intel_guc_suspend(dev); + intel_guc_suspend(dev, false); intel_display_suspend(dev); @@ -2321,7 +2321,7 @@ static int intel_runtime_suspend(struct device *device) i915_gem_release_all_mmaps(dev_priv); mutex_unlock(>struct_mutex); - intel_guc_suspend(dev); + intel_guc_suspend(dev, true); intel_runtime_pm_disable_interrupts(dev_priv); diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index ef0c116..1af8a8b 100644 --- 
a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1519,7 +1519,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv) * intel_guc_suspend() - notify GuC entering suspend state * @dev: drm device */ -int intel_guc_suspend(struct drm_device *dev) +int intel_guc_suspend(struct drm_device *dev, bool rpm_suspend) { struct drm_i915_private *dev_priv = to_i915(dev); struct intel_guc *guc = &dev_priv->guc; @@ -1530,6 +1530,12 @@ int intel_guc_suspend(struct drm_device *dev) return 0; gen9_disable_guc_interrupts(dev_priv); + /* Sync is needed only for the system suspend case, runtime suspend +* case is covered due to rpm get/put calls used around Hw access in +* the work item function. +*/ + if (!rpm_suspend && (i915.guc_log_level >= 0)) + flush_work(&dev_priv->guc.log.flush_work); In which case (rpm suspend) the flush_work is idle and this a noop. That you have to pass around such state suggests that you are papering over a bug? In case of rpm suspend the flush_work may not be a NOOP. Can use the flush_work for runtime suspend also but in spite of that can't prevent the 'RPM wakelock' asserts, as the work item can get executed after the rpm ref count drops to zero and before runtime suspend kicks in (after autosuspend delay). For that you had earlier suggested to use rpm get/put in the work item function, around the register access, but with that had to remove the flush_work from the suspend hook, otherwise a deadlock can happen. So doing the flush_work conditionally for system suspend case, as rpm get/put won't cause the resume of device in that case. Actually I had discussed about this with Imre and as per his inputs prepared this patch. There would be this alternative: Thanks much for suggesting the alternate approach.
Just to confirm whether I understood everything correctly, in gen9_guc_irq_handler(): WARN_ON(!intel_runtime_pm_get_if_in_use()); Used WARN, as we don't expect the device to be suspended at this juncture, so intel_runtime_pm_get_if_in_use() should return true. if (!queue_work(log.flush_work)) If queue_work returns 0, then work item is already pending, so it won't be queued hence can release the rpm ref count now only. intel_runtime_pm_put(); and dropping the reference at the end of the work item. This will be just like the __intel_autoenable_gt_powersave This would make the flush_work() a nop in case of runtime_suspend(). So can call the flush_work unconditionally. Hope I understood it correctly. Best regards Akash --Imre
Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend
On 8/17/2016 5:11 PM, Chris Wilson wrote: On Wed, Aug 17, 2016 at 12:27:30PM +0100, Tvrtko Ursulin wrote: On 17/08/16 11:14, akash.g...@intel.com wrote: From: Akash GoelThe GuC log buffer flush work item does a register access to send the ack to GuC and this work item, if not synced before suspend, can potentially get executed after the GFX device is suspended. The work item function uses rpm_get/rpm_put calls around the Hw access, this covers the runtime suspend case but for system suspend case (which can be done asychronously/forcefully) sync would be required as kernel can potentially schedule the work items even after some devices, including GFX, have been put to suspend. Also sync has to be done conditionally i.e. only for the system suspend case, as sync along with rpm_get/rpm_put calls can cause a deadlock for rpm suspend path. Cc: Imre Deak Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.c| 4 ++-- drivers/gpu/drm/i915/i915_guc_submission.c | 8 +++- drivers/gpu/drm/i915/intel_guc.h | 2 +- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index cdee60b..2ae0ad4 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1427,7 +1427,7 @@ static int i915_drm_suspend(struct drm_device *dev) goto out; } - intel_guc_suspend(dev); + intel_guc_suspend(dev, false); intel_display_suspend(dev); @@ -2321,7 +2321,7 @@ static int intel_runtime_suspend(struct device *device) i915_gem_release_all_mmaps(dev_priv); mutex_unlock(>struct_mutex); - intel_guc_suspend(dev); + intel_guc_suspend(dev, true); intel_runtime_pm_disable_interrupts(dev_priv); diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index ef0c116..1af8a8b 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1519,7 +1519,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv) * 
intel_guc_suspend() - notify GuC entering suspend state * @dev: drm device */ -int intel_guc_suspend(struct drm_device *dev) +int intel_guc_suspend(struct drm_device *dev, bool rpm_suspend) { struct drm_i915_private *dev_priv = to_i915(dev); struct intel_guc *guc = &dev_priv->guc; @@ -1530,6 +1530,12 @@ int intel_guc_suspend(struct drm_device *dev) return 0; gen9_disable_guc_interrupts(dev_priv); + /* Sync is needed only for the system suspend case, runtime suspend +* case is covered due to rpm get/put calls used around Hw access in +* the work item function. +*/ + if (!rpm_suspend && (i915.guc_log_level >= 0)) + flush_work(&dev_priv->guc.log.flush_work); In which case (rpm suspend) the flush_work is idle and this a noop. That you have to pass around such state suggests that you are papering over a bug? In case of rpm suspend the flush_work may not be a NOOP. Can use the flush_work for runtime suspend also but in spite of that can't prevent the 'RPM wakelock' asserts, as the work item can get executed after the rpm ref count drops to zero and before runtime suspend kicks in (after autosuspend delay). For that you had earlier suggested to use rpm get/put in the work item function, around the register access, but with that had to remove the flush_work from the suspend hook, otherwise a deadlock can happen. So doing the flush_work conditionally for system suspend case, as rpm get/put won't cause the resume of device in that case. Actually I had discussed about this with Imre and as per his inputs prepared this patch. Best regards Akash -Chris
Re: [Intel-gfx] [PATCH 06/19] drm/i915: Handle log buffer flush interrupt event from GuC
On 8/17/2016 4:37 PM, Tvrtko Ursulin wrote: On 17/08/16 11:14, akash.g...@intel.com wrote: From: Sagar Arun KambleGuC ukernel sends an interrupt to Host to flush the log buffer and expects Host to correspondingly update the read pointer information in the state structure, once it has consumed the log buffer contents by copying them to a file or buffer. Even if Host couldn't copy the contents, it can still update the read pointer so that logging state is not disturbed on GuC side. v2: - Use a dedicated workqueue for handling flush interrupt. (Tvrtko) - Reduce the overall log buffer copying time by skipping the copy of crash buffer area for regular cases and copying only the state structure data in first page. v3: - Create a vmalloc mapping of log buffer. (Chris) - Cover the flush acknowledgment under rpm get & put.(Chris) - Revert the change of skipping the copy of crash dump area, as not really needed, will be covered by subsequent patch. v4: - Destroy the wq under the same condition in which it was created, pass dev_piv pointer instead of dev to newly added GuC function, add more comments & rename variable for clarity. (Tvrtko) v5: - Allocate & destroy the dedicated wq, for handling flush interrupt, from the setup/teardown routines of GuC logging. (Chris) - Validate the log buffer size value retrieved from state structure and do some minor cleanup. (Tvrtko) - Fix error/warnings reported by checkpatch. (Tvrtko) - Rebase. v6: - Remove the interrupts_enabled check from guc_capture_logs_work, need to process that last work item also, queued just before disabling the interrupt as log buffer flush interrupt handling is a bit different case where GuC is actually expecting an ACK from host, which should be provided to keep the logging going. Sync against the work will be done by caller disabling the interrupt. 
- Don't sample the log buffer size value from state structure, directly use the expected value to move the pointer & do the copy and that cannot go wrong (out of bounds) as Driver only allocated the log buffer and the relay buffers. Driver should refrain from interpreting the log packet, as much possible and let Userspace parser detect the anomaly. (Chris) Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 186 + drivers/gpu/drm/i915/i915_irq.c| 28 - drivers/gpu/drm/i915/intel_guc.h | 4 + 3 files changed, 217 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index b062da6..ade51cb 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct intel_guc *guc, return host2guc_action(guc, data, ARRAY_SIZE(data)); } +static int host2guc_logbuffer_flush_complete(struct intel_guc *guc) +{ +u32 data[1]; + +data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE; + +return host2guc_action(guc, data, 1); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -828,6 +837,163 @@ err: return NULL; } +static void guc_move_to_next_buf(struct intel_guc *guc) +{ +} + +static void *guc_get_write_buffer(struct intel_guc *guc) +{ +return NULL; +} + +static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type) +{ +if (type == GUC_ISR_LOG_BUFFER) +return (GUC_LOG_ISR_PAGES + 1) * PAGE_SIZE; +else if (type == GUC_DPC_LOG_BUFFER) +return (GUC_LOG_DPC_PAGES + 1) * PAGE_SIZE; +else +return (GUC_LOG_CRASH_PAGES + 1) * PAGE_SIZE; +} Could do it with a switch statement to get automatic reminder of size not being handled if some day a new log buffer type gets added. It would probably more in the style of the rest of the driver as well. Fine will use switch statement here. 
Should I use BUG_ON for the default/unhandled case ? case GUC_ISR_LOG_BUFFER case GUC_DPC_LOG_BUFFER case GUC_LOG_CRASH_PAGES default BUG_ON(1) + +static void guc_read_update_log_buffer(struct intel_guc *guc) +{ +struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state; +struct guc_log_buffer_state log_buffer_state_local; +void *src_data_ptr, *dst_data_ptr; +unsigned int buffer_size; +enum guc_log_buffer_type type; + +if (WARN_ON(!guc->log.buf_addr)) +return; + +/* Get the pointer to shared GuC log buffer */ +log_buffer_state = src_data_ptr = guc->log.buf_addr; + +/* Get the pointer to local buffer to store the logs */ +dst_data_ptr = log_buffer_snapshot_state = guc_get_write_buffer(guc); + +/* Actual logs are present from the 2nd page */ +src_data_ptr
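The switch-based variant suggested in the review could look roughly like the sketch below. It is a standalone illustration, not the driver code: the page counts and page size are placeholder values, and GUC_CRASH_LOG_BUFFER is a hypothetical name for the crash dump buffer type, since the thread only names the ISR and DPC types and the GUC_MAX_LOG_BUFFER sentinel. The switch deliberately has no default label, so a compiler with -Wswitch enabled can flag a newly added buffer type that is not handled, which is the reminder the review asks for.

```c
/* Placeholder values for illustration only, not the driver's constants. */
#define SKETCH_PAGE_SIZE    4096u
#define SKETCH_ISR_PAGES    3u
#define SKETCH_DPC_PAGES    7u
#define SKETCH_CRASH_PAGES  1u

enum guc_log_buffer_type {
	GUC_ISR_LOG_BUFFER,
	GUC_DPC_LOG_BUFFER,
	GUC_CRASH_LOG_BUFFER,   /* hypothetical name for the crash dump buffer */
	GUC_MAX_LOG_BUFFER      /* sentinel, not a real buffer */
};

/* Size of one log buffer region: its data pages plus one extra page. */
static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
{
	switch (type) {
	case GUC_ISR_LOG_BUFFER:
		return (SKETCH_ISR_PAGES + 1) * SKETCH_PAGE_SIZE;
	case GUC_DPC_LOG_BUFFER:
		return (SKETCH_DPC_PAGES + 1) * SKETCH_PAGE_SIZE;
	case GUC_CRASH_LOG_BUFFER:
		return (SKETCH_CRASH_PAGES + 1) * SKETCH_PAGE_SIZE;
	case GUC_MAX_LOG_BUFFER:
		break;   /* sentinel falls through to the zero return */
	}

	/* Unknown type: report zero bytes so the caller copies nothing. */
	return 0;
}
```

Returning 0 for an unhandled type, rather than stopping the machine, is one possible answer to the BUG_ON question raised above; it is shown here as a design option, not as what was eventually merged.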
Re: [Intel-gfx] [PATCH 14/18] drm/i915: Forcefully flush GuC log buffer on reset
On 8/16/2016 4:57 PM, Tvrtko Ursulin wrote: On 15/08/16 15:49, akash.g...@intel.com wrote: From: Sagar Arun KambleBefore capturing the GuC logs as a part of error state, there should be a force log buffer flush action sent to GuC before proceeding with GPU reset and re-initializing GUC. There could be some data in the log buffer which is yet to be captured and those logs would be particularly useful to understand that why the GPU reset was initiated. v2: - Avoid the wait via flush_work, to serialize against an ongoing log buffer flush, from the error state capture path. (Chris) - Rebase. Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_gpu_error.c | 2 ++ drivers/gpu/drm/i915/i915_guc_submission.c | 30 ++ drivers/gpu/drm/i915/intel_guc.h | 1 + 3 files changed, 33 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 94297aa..b73c671 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1301,6 +1301,8 @@ static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv, if (!dev_priv->guc.log.vma || (i915.guc_log_level < 0)) return; +i915_guc_flush_logs(dev_priv, false); + error->guc_log = i915_error_object_create(dev_priv, dev_priv->guc.log.vma); } diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index b8d6313..85df2f3 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -185,6 +185,16 @@ static int host2guc_logbuffer_flush_complete(struct intel_guc *guc) return host2guc_action(guc, data, 1); } +static int host2guc_force_logbuffer_flush(struct intel_guc *guc) +{ +u32 data[2]; + +data[0] = HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH; +data[1] = 0; + +return host2guc_action(guc, data, 2); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -1536,6 +1546,26 @@ void 
i915_guc_capture_logs(struct drm_i915_private *dev_priv) intel_runtime_pm_put(dev_priv); } +void i915_guc_flush_logs(struct drm_i915_private *dev_priv, bool can_wait) +{ +if (!i915.enable_guc_submission || (i915.guc_log_level < 0)) +return; + +/* First disable the interrupts, will be renabled afterwards */ +gen9_disable_guc_interrupts(dev_priv); + +/* Before initiating the forceful flush, wait for any pending/ongoing + * flush to complete otherwise forceful flush may not happen, but wait + * can't be done for some paths like error state capture in which case + * take a chance & directly attempt the forceful flush. + */ +if (can_wait) +flush_work(&dev_priv->guc.log.flush_work); + +/* Ask GuC to update the log buffer state */ +host2guc_force_logbuffer_flush(&dev_priv->guc); Should you just call i915_guc_capture_logs from here? Error capture could also potentially benefit from it and you could remove it from the debugfs patch then. Actually earlier it was done like that only, but now after adding the patch, [PATCH 13/18] drm/i915: Augment i915 error state to include the dump of GuC log buffer, Contents of GuC log buffer is anyways made part of the error state, so thought it may not be of any real use to capture the left over logs in the relay sub buffer also. For the analysis purpose, GuC logs part of the error state dump would be good enough.
best regards Akash +} + void i915_guc_unregister(struct drm_i915_private *dev_priv) { if (!i915.enable_guc_submission) diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h index 8598f38..d7eda42 100644 --- a/drivers/gpu/drm/i915/intel_guc.h +++ b/drivers/gpu/drm/i915/intel_guc.h @@ -182,6 +182,7 @@ int i915_guc_wq_check_space(struct drm_i915_gem_request *rq); void i915_guc_submission_disable(struct drm_i915_private *dev_priv); void i915_guc_submission_fini(struct drm_i915_private *dev_priv); void i915_guc_capture_logs(struct drm_i915_private *dev_priv); +void i915_guc_flush_logs(struct drm_i915_private *dev_priv, bool can_wait); void i915_guc_register(struct drm_i915_private *dev_priv); void i915_guc_unregister(struct drm_i915_private *dev_priv); Regards, Tvrtko
Re: [Intel-gfx] [PATCH 14/18] drm/i915: Forcefully flush GuC log buffer on reset
On 8/16/2016 2:55 PM, Tvrtko Ursulin wrote: On 16/08/16 06:25, Goel, Akash wrote: On 8/15/2016 9:18 PM, Tvrtko Ursulin wrote: On 15/08/16 15:49, akash.g...@intel.com wrote: From: Sagar Arun Kamble <sagar.a.kam...@intel.com> Before capturing the GuC logs as a part of error state, there should be a force log buffer flush action sent to GuC before proceeding with GPU reset and re-initializing GUC. There could be some data in the log buffer which is yet to be captured and those logs would be particularly useful to understand that why the GPU reset was initiated. v2: - Avoid the wait via flush_work, to serialize against an ongoing log buffer flush, from the error state capture path. (Chris) Could you explain if the patch does anything now that the flush has been removed? flush_work for the regular log buffer flush work item has been removed but the forceful command is still sent to GuC. In fact I don't even understand what it was doing before. :) I am sorry for that. If the idea is to send a flush command to GuC so it can raise an interrupt for a partially full buffer, Yes exactly this is the idea. But then isn't the order wrong? Should it first send the flush command to the GuC and then wait for something maybe gets flushed? As I tried to clarify in my last email that GuC firmware just ignores the forceful flush command received from Host, if it sees there is a pending request for regular log buffer flush, for which it hasn't received the ack. So from the Host side, we need to first wait for the regular log buffer flush work item to finish execution, if any, and then send the forceful flush command to GuC. I can see that it could be tricky since the timing is undefined, but I don't Yes it is deinitely tricky with respect to the timing. understand where it currently actually processes that potential extra packets. The extra left over logs are captured manually just after sending the forceful flush command to GuC. 
i915_guc_flush_logs(dev_priv, true); /* GuC would have updated log buffer by now, so capture it */ i915_guc_capture_logs(dev_priv); Especially since it disabled interrupts before hand. Had disabled the interrupt, out of paranoia, to avoid a situation of work item getting scheduled again (for a different buffer type) while we manually collect the extra logs. Best regards Akash Regards, Tvrtko
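The ordering argued for in this exchange (disable the interrupt, let any outstanding regular flush be acknowledged, only then send the forced flush, and finally capture manually) can be illustrated with a toy model. Everything here is hypothetical: the mock_* names and the byte counters are invented for the sketch, and the firmware behaviour is modelled only from the description in the thread, namely that a forced flush is ignored while an acknowledgement for a regular flush is outstanding.

```c
#include <stdbool.h>

/* Toy model of host and firmware state; all names are hypothetical. */
struct mock_guc {
	bool irq_enabled;        /* flush interrupt delivery to the host */
	bool flush_ack_pending;  /* regular flush raised, host ack outstanding */
	unsigned int unflushed;  /* log bytes the firmware has not handed over */
	unsigned int ready;      /* log bytes handed over, not yet copied */
	unsigned int captured;   /* log bytes the host has copied out */
};

/* Host: copy out whatever the firmware has handed over so far. */
static void mock_capture(struct mock_guc *g)
{
	g->captured += g->ready;
	g->ready = 0;
}

/* Host work item: capture, then acknowledge the regular flush. */
static void mock_flush_work(struct mock_guc *g)
{
	mock_capture(g);
	g->flush_ack_pending = false;
}

/* Firmware: a forced flush is ignored while an ack is outstanding. */
static bool mock_fw_force_flush(struct mock_guc *g)
{
	if (g->flush_ack_pending)
		return false;
	g->ready += g->unflushed;
	g->unflushed = 0;
	return true;
}

/* Host: returns true if the firmware accepted the forced flush. */
static bool mock_flush_logs(struct mock_guc *g, bool can_wait)
{
	g->irq_enabled = false;          /* keep the work item from requeueing */
	if (can_wait && g->flush_ack_pending)
		mock_flush_work(g);      /* stands in for flush_work() */
	return mock_fw_force_flush(g);
}
```

In this model the can_wait == false path, used for error state capture, may leave the leftover logs behind when an acknowledgement is still outstanding, which matches the "take a chance" wording of the patch comment.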
Re: [Intel-gfx] [PATCH 06/18] drm/i915: Handle log buffer flush interrupt event from GuC
On 8/15/2016 10:26 PM, Chris Wilson wrote: On Mon, Aug 15, 2016 at 10:16:56PM +0530, Goel, Akash wrote: On 8/15/2016 9:36 PM, Chris Wilson wrote: On Mon, Aug 15, 2016 at 08:19:47PM +0530, akash.g...@intel.com wrote: +static void guc_read_update_log_buffer(struct intel_guc *guc) +{ + struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state; + struct guc_log_buffer_state log_buffer_state_local; + void *src_data_ptr, *dst_data_ptr; + unsigned int buffer_size, expected_size; + enum guc_log_buffer_type type; + + if (WARN_ON(!guc->log.buf_addr)) + return; + + /* Get the pointer to shared GuC log buffer */ + log_buffer_state = src_data_ptr = guc->log.buf_addr; + + /* Get the pointer to local buffer to store the logs */ + dst_data_ptr = log_buffer_snapshot_state = guc_get_write_buffer(guc); + + /* Actual logs are present from the 2nd page */ + src_data_ptr += PAGE_SIZE; + dst_data_ptr += PAGE_SIZE; + + for (type = GUC_ISR_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) { + /* Make a copy of the state structure in GuC log buffer (which +* is uncached mapped) on the stack to avoid reading from it +* multiple times. +*/ + memcpy(&log_buffer_state_local, log_buffer_state, + sizeof(struct guc_log_buffer_state)); + buffer_size = log_buffer_state_local.size; + + if (log_buffer_snapshot_state) { + /* First copy the state structure in snapshot buffer */ + memcpy(log_buffer_snapshot_state, &log_buffer_state_local, + sizeof(struct guc_log_buffer_state)); + + /* The write pointer could have been updated by the GuC +* firmware, after sending the flush interrupt to Host, +* for consistency set the write pointer value to same +* value of sampled_write_ptr in the snapshot buffer. +*/ + log_buffer_snapshot_state->write_ptr = + log_buffer_snapshot_state->sampled_write_ptr; + + log_buffer_snapshot_state++; + + /* Now copy the actual logs, but before that validate +* the buffer size value retrieved from state structure.
+*/ + if (type == GUC_ISR_LOG_BUFFER) + expected_size = (GUC_LOG_ISR_PAGES+1)*PAGE_SIZE; + else if (type == GUC_DPC_LOG_BUFFER) + expected_size = (GUC_LOG_DPC_PAGES+1)*PAGE_SIZE; + else + expected_size = (GUC_LOG_CRASH_PAGES+1)*PAGE_SIZE; + + if (unlikely(buffer_size != expected_size)) { + DRM_ERROR("unexpected log buffer size\n"); + /* Continue with further copying, already state +* structure has been copied which is enough to +* let Userspace know about the anomaly. +*/ + buffer_size = expected_size; Urm, no. You tell userspace one thing and then do another. This code should just be a conduit and not apply its own outdated interpretation. Userspace parser would get to know from the state structure about the anomalous buffer size. It will, but it won't be told what the kernel did. So if believes the GuC (as it should since it is a packet that should be unadulterated) the entire stream is then corrupt. Please suggest that what should be done here ideally. Should the further copying (for this snapshot) be skipped ? The kernel should be avoiding interpretting the log packets as much as possible - I would prefer it if we just moved the byte stream without trying to interpret it as datagrams. But there is probably some merit to at least using the log packets (datagrams). It would have been ideal if log packets can be dumped without any interpretation. We copy the payload without any interpretation, only some bits of header we parse. We also have to interpret the header (in subsequent patch) to copy only the updated payload data, for better performance. + } + + memcpy(dst_data_ptr, src_data_ptr, buffer_size); Where do you validate that buffer_size is sane before copying? Sorry didn't get you, the check for buffer_size is being done right before this memcpy. There is no explicit check for valid src_data_ptr + buffer_size or dst_data_ptr + buffer_size, and a quick glance at the code suggested no reason to believe they must be valid. 
Actually if buffer_size has been validated & corrected, then bot
Re: [Intel-gfx] [PATCH 14/18] drm/i915: Forcefully flush GuC log buffer on reset
On 8/15/2016 9:18 PM, Tvrtko Ursulin wrote: On 15/08/16 15:49, akash.g...@intel.com wrote:

From: Sagar Arun Kamble

Before capturing the GuC logs as a part of error state, there should be a force log buffer flush action sent to GuC before proceeding with GPU reset and re-initializing GuC. There could be some data in the log buffer which is yet to be captured, and those logs would be particularly useful to understand why the GPU reset was initiated.

v2:
- Avoid the wait via flush_work, to serialize against an ongoing log buffer flush, from the error state capture path. (Chris)

Could you explain if the patch does anything now that the flush has been removed?

flush_work for the regular log buffer flush work item has been removed, but the forceful flush command is still sent to GuC.

In fact I don't even understand what it was doing before. :)

I am sorry for that.

If the idea is to send a flush command to GuC so it can raise an interrupt for a partially full buffer,

Yes, exactly, this is the idea.

then i915_guc_flush_logs should send the flush command and wait for that interrupt/work. But the function is first waiting for the work item to complete and then sending the flush command to the GuC. So I am confused.

Actually the GuC firmware just ignores the forceful flush command received from the Host if it sees there is a pending request for a regular log buffer flush for which it hasn't received the ack. That is why the Host, before sending the forceful flush command to GuC, first had to wait for the regular log buffer flush work item to finish execution.

Best regards
Akash

Regards,
Tvrtko

- Rebase.
Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_gpu_error.c | 2 ++ drivers/gpu/drm/i915/i915_guc_submission.c | 30 ++ drivers/gpu/drm/i915/intel_guc.h | 1 + 3 files changed, 33 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 94297aa..b73c671 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1301,6 +1301,8 @@ static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv, if (!dev_priv->guc.log.vma || (i915.guc_log_level < 0)) return; +i915_guc_flush_logs(dev_priv, false); + error->guc_log = i915_error_object_create(dev_priv, dev_priv->guc.log.vma); } diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index b8d6313..85df2f3 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -185,6 +185,16 @@ static int host2guc_logbuffer_flush_complete(struct intel_guc *guc) return host2guc_action(guc, data, 1); } +static int host2guc_force_logbuffer_flush(struct intel_guc *guc) +{ +u32 data[2]; + +data[0] = HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH; +data[1] = 0; + +return host2guc_action(guc, data, 2); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -1536,6 +1546,26 @@ void i915_guc_capture_logs(struct drm_i915_private *dev_priv) intel_runtime_pm_put(dev_priv); } +void i915_guc_flush_logs(struct drm_i915_private *dev_priv, bool can_wait) +{ +if (!i915.enable_guc_submission || (i915.guc_log_level < 0)) +return; + +/* First disable the interrupts, will be renabled afterwards */ +gen9_disable_guc_interrupts(dev_priv); + +/* Before initiating the forceful flush, wait for any pending/ongoing + * flush to complete otherwise forceful flush may not happen, but wait + * can't be done for some paths like error state capture in which case + * take a chance & directly attempt the forceful flush. 
+ */ +if (can_wait) +flush_work(&dev_priv->guc.log.flush_work); + +/* Ask GuC to update the log buffer state */ +host2guc_force_logbuffer_flush(&dev_priv->guc); +} + void i915_guc_unregister(struct drm_i915_private *dev_priv) { if (!i915.enable_guc_submission) diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h index 8598f38..d7eda42 100644 --- a/drivers/gpu/drm/i915/intel_guc.h +++ b/drivers/gpu/drm/i915/intel_guc.h @@ -182,6 +182,7 @@ int i915_guc_wq_check_space(struct drm_i915_gem_request *rq); void i915_guc_submission_disable(struct drm_i915_private *dev_priv); void i915_guc_submission_fini(struct drm_i915_private *dev_priv); void i915_guc_capture_logs(struct drm_i915_private *dev_priv); +void i915_guc_flush_logs(struct drm_i915_private *dev_priv, bool can_wait); void i915_guc_register(struct drm_i915_private *dev_priv); void i915_guc_unregister(struct drm_i915_private *dev_priv); ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org
Re: [Intel-gfx] [PATCH 06/18] drm/i915: Handle log buffer flush interrupt event from GuC
On 8/15/2016 9:36 PM, Chris Wilson wrote: On Mon, Aug 15, 2016 at 08:19:47PM +0530, akash.g...@intel.com wrote: +static void guc_read_update_log_buffer(struct intel_guc *guc) +{ + struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state; + struct guc_log_buffer_state log_buffer_state_local; + void *src_data_ptr, *dst_data_ptr; + unsigned int buffer_size, expected_size; + enum guc_log_buffer_type type; + + if (WARN_ON(!guc->log.buf_addr)) + return; + + /* Get the pointer to shared GuC log buffer */ + log_buffer_state = src_data_ptr = guc->log.buf_addr; + + /* Get the pointer to local buffer to store the logs */ + dst_data_ptr = log_buffer_snapshot_state = guc_get_write_buffer(guc); + + /* Actual logs are present from the 2nd page */ + src_data_ptr += PAGE_SIZE; + dst_data_ptr += PAGE_SIZE; + + for (type = GUC_ISR_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) { + /* Make a copy of the state structure in GuC log buffer (which +* is uncached mapped) on the stack to avoid reading from it +* multiple times. +*/ + memcpy(_buffer_state_local, log_buffer_state, + sizeof(struct guc_log_buffer_state)); + buffer_size = log_buffer_state_local.size; + + if (log_buffer_snapshot_state) { + /* First copy the state structure in snapshot buffer */ + memcpy(log_buffer_snapshot_state, _buffer_state_local, + sizeof(struct guc_log_buffer_state)); + + /* The write pointer could have been updated by the GuC +* firmware, after sending the flush interrupt to Host, +* for consistency set the write pointer value to same +* value of sampled_write_ptr in the snapshot buffer. +*/ + log_buffer_snapshot_state->write_ptr = + log_buffer_snapshot_state->sampled_write_ptr; + + log_buffer_snapshot_state++; + + /* Now copy the actual logs, but before that validate +* the buffer size value retrieved from state structure. 
+*/ + if (type == GUC_ISR_LOG_BUFFER) + expected_size = (GUC_LOG_ISR_PAGES+1)*PAGE_SIZE; + else if (type == GUC_DPC_LOG_BUFFER) + expected_size = (GUC_LOG_DPC_PAGES+1)*PAGE_SIZE; + else + expected_size = (GUC_LOG_CRASH_PAGES+1)*PAGE_SIZE; + + if (unlikely(buffer_size != expected_size)) { + DRM_ERROR("unexpected log buffer size\n"); + /* Continue with further copying, already state +* structure has been copied which is enough to +* let Userspace know about the anomaly. +*/ + buffer_size = expected_size; Urm, no. You tell userspace one thing and then do another. This code should just be a conduit and not apply its own outdated interpretation. Userspace parser would get to know from the state structure about the anomalous buffer size. Please suggest that what should be done here ideally. Should the further copying (for this snapshot) be skipped ? + } + + memcpy(dst_data_ptr, src_data_ptr, buffer_size); Where do you validate that buffer_size is sane before copying? Sorry didn't get you, the check for buffer_size is being done right before this memcpy. Best regards Akash -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
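The review point in this thread is that the state structure is passed to userspace unmodified while buffer_size is then silently replaced, and that nothing checks the copy against the extent of the source and destination mappings. A minimal userspace sketch of one alternative follows: skip the payload copy when the reported size does not fit, instead of substituting another size. The names log_region and copy_log_region are hypothetical and do not come from the patch; this is an illustration under those assumptions, not the driver's code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one log region; not a structure from the patch. */
struct log_region {
	size_t offset;        /* start of the region inside the mapping */
	size_t reported_size; /* size taken from the firmware state structure */
};

/*
 * Copy one region from src to dst only if it lies entirely inside both
 * mappings. Returns the number of bytes copied, or 0 if the reported size
 * is not usable. The size is never replaced by a different value.
 */
static size_t copy_log_region(void *dst, size_t dst_len,
			      const void *src, size_t src_len,
			      const struct log_region *r)
{
	size_t limit = dst_len < src_len ? dst_len : src_len;

	/* Written to avoid overflow in offset + reported_size. */
	if (r->reported_size == 0 || r->offset > limit ||
	    r->reported_size > limit - r->offset)
		return 0;

	memcpy((char *)dst + r->offset, (const char *)src + r->offset,
	       r->reported_size);
	return r->reported_size;
}
```

With this shape the caller can log the anomaly and move on to the next buffer type, and the snapshot seen by userspace stays consistent with the state structure it was handed.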
Re: [Intel-gfx] [PATCH 08/18] drm/i915: Add a relay backed debugfs interface for capturing GuC logs
On 8/15/2016 9:42 PM, Chris Wilson wrote: On Mon, Aug 15, 2016 at 05:09:45PM +0100, Chris Wilson wrote: On Mon, Aug 15, 2016 at 08:19:49PM +0530, akash.g...@intel.com wrote:

+void i915_guc_register(struct drm_i915_private *dev_priv)
+{
+	if (!i915.enable_guc_submission)
+		return;

The final state of i915.enable_guc_submission is not known at this time.

As per the sequence below, i915.enable_guc_submission would have been set to its final value by this time:

i915_driver_load
i915_load_modeset_init
i915_gem_init_hw
intel_guc_setup
i915_guc_submission_init
i915_guc_submission_enable
i915_driver_register
i915_debugfs_register
i915_guc_register

Does it matter if you set up the log even though guc is not used?

I think it would be better to do the setup only if GuC submission is enabled.

Would this not be better driven from guc_submission_enable and guc_submission_disable?

With the caveat that you probably need both, i.e. you have to wait for both the GuC to be enabled and for sysfs to be available.

Sorry, I am really confused. Isn't this the right location, creating the relay file after the debugfs registration has been done? Other logging related setup is being done in i915_guc_submission_init().

Best regards
Akash

-Chris
Re: [Intel-gfx] [PATCH 11/18] drm/i915: Optimization to reduce the sampling time of GuC log buffer
On 8/15/2016 9:06 PM, Tvrtko Ursulin wrote: On 15/08/16 15:49, akash.g...@intel.com wrote: From: Akash GoelGuC firmware sends an interrupt to flush the log buffer when it becomes half full, so Driver doesn't really need to sample the complete buffer and can just copy only the newly written data by GuC into the local buffer, i.e. as per the read & write pointer values. Moreover the flush interrupt would generally come for one type of log buffer, when it becomes half full, so at that time the other 2 types of log buffer would comparatively have much lesser unread data in them. In case of overflow reported by GuC, Driver do need to copy the entire buffer as the whole buffer would contain the unread data. v2: Rebase. v3: Fix the blooper of doing the copy twice. (Tvrtko) Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 40 +- 1 file changed, 34 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index c7b4a57..b8d6313 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1003,6 +1003,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) void *src_data_ptr, *dst_data_ptr; unsigned int buffer_size, expected_size; enum guc_log_buffer_type type; +unsigned int read_offset, write_offset, bytes_to_copy; +bool new_overflow; if (WARN_ON(!guc->log.buf_addr)) return; @@ -1025,11 +1027,14 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) memcpy(_buffer_state_local, log_buffer_state, sizeof(struct guc_log_buffer_state)); buffer_size = log_buffer_state_local.size; +read_offset = log_buffer_state_local.read_ptr; +write_offset = log_buffer_state_local.sampled_write_ptr; /* Bookkeeping stuff */ guc->log.flush_count[type] += log_buffer_state_local.flush_to_file; if (log_buffer_state_local.buffer_full_cnt != guc->log.prev_overflow_count[type]) { +new_overflow = 1; guc->log.total_overflow_count[type] += 
(log_buffer_state_local.buffer_full_cnt - guc->log.prev_overflow_count[type]); @@ -1043,7 +1048,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) guc->log.prev_overflow_count[type] = log_buffer_state_local.buffer_full_cnt; DRM_ERROR_RATELIMITED("GuC log buffer overflow\n"); -} +} else +new_overflow = 0; Nitpick: normally the rule is if one branch has curlies all of them have to. Checkpatch I think warns about that, or maybe only in strict mode. Did ran checkpatch with strict option. Probably overlooked the warning. Will check again if (log_buffer_snapshot_state) { /* First copy the state structure in snapshot buffer */ @@ -1055,8 +1061,7 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) * for consistency set the write pointer value to same * value of sampled_write_ptr in the snapshot buffer. */ -log_buffer_snapshot_state->write_ptr = -log_buffer_snapshot_state->sampled_write_ptr; +log_buffer_snapshot_state->write_ptr = write_offset; log_buffer_snapshot_state++; @@ -1079,7 +1084,31 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) buffer_size = expected_size; } -memcpy(dst_data_ptr, src_data_ptr, buffer_size); +if (unlikely(new_overflow)) { +/* copy the whole buffer in case of overflow */ +read_offset = 0; +write_offset = buffer_size; +} else if (unlikely((read_offset > buffer_size) || +(write_offset > buffer_size))) { Could also check for read_offset == write_offset for even more safety? That is already handled implicitly, in this case we don't do any copy. As per the below code bytes_to_copy will come as 0. 
if (read_offset <= write_offset) { bytes_to_copy = write_offset - read_offset; Best regards Akash +DRM_ERROR("invalid log buffer state\n"); +/* copy whole buffer as offsets are unreliable */ +read_offset = 0; +write_offset = buffer_size; +} + +/* Just copy the newly written data */ +if (read_offset <= write_offset) { +bytes_to_copy = write_offset - read_offset; +memcpy(dst_data_ptr + read_offset, + src_data_ptr + read_offset, bytes_to_copy); +} else { +bytes_to_copy = buffer_size - read_offset; +memcpy(dst_data_ptr + read_offset, +
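The copy logic discussed above (copy only the span between the read and write offsets, fall back to a full copy on overflow or bad offsets, and copy nothing when the offsets are equal) can be sketched in userspace C as follows. The function name copy_new_log_data is hypothetical. The quoted patch is cut off before the wrapped case is fully visible, so the second half of the wrapped branch here (copying the head of the buffer up to write_offset) is an assumption about how it completes, not a quote.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy only the bytes between read_offset and write_offset from src to the
 * same offsets in dst. Both buffers are buffer_size bytes long.
 * Returns the number of bytes copied.
 * Assumption: the wrapped case copies the tail first and then the head.
 */
static size_t copy_new_log_data(char *dst, const char *src, size_t buffer_size,
				size_t read_offset, size_t write_offset,
				int overflow)
{
	size_t copied;

	/* On overflow or inconsistent offsets, fall back to a full copy. */
	if (overflow || read_offset > buffer_size || write_offset > buffer_size) {
		read_offset = 0;
		write_offset = buffer_size;
	}

	if (read_offset <= write_offset) {
		/* Contiguous span; zero bytes when the offsets are equal. */
		copied = write_offset - read_offset;
		memcpy(dst + read_offset, src + read_offset, copied);
	} else {
		/* Wrapped span: tail of the buffer, then its head. */
		copied = buffer_size - read_offset;
		memcpy(dst + read_offset, src + read_offset, copied);
		memcpy(dst, src, write_offset);
		copied += write_offset;
	}
	return copied;
}
```

The equal-offsets case needs no separate check, which matches the reply above: the contiguous branch computes a length of zero and the memcpy becomes a no-op.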
Re: [Intel-gfx] [PATCH 08/18] drm/i915: Add a relay backed debugfs interface for capturing GuC logs
On 8/15/2016 8:59 PM, Tvrtko Ursulin wrote: On 15/08/16 15:49, akash.g...@intel.com wrote:

From: Akash Goel

Added a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the User to capture GuC firmware logs. Used the relay framework to implement the interface, where the Driver just has to use a relay API to store snapshots of the GuC log buffer in the buffer managed by relay. The snapshot will be taken when GuC firmware sends a log buffer flush interrupt, and up to four snapshots can be stored in the relay buffer. The relay buffer will be operated in a mode where it will overwrite the data not yet collected by User. Besides the mmap method, through which User can directly access the relay buffer contents, relay also supports the 'poll' method. Through the 'poll' call on the log file, User can come to know whenever a new snapshot of the log buffer is taken by the Driver, so it can run in tandem with the Driver and capture the logs in a sustained/streaming manner, without any loss of data.

v2: Defer the creation of relay channel & associated debugfs file, as debugfs setup is now done at the end of i915 Driver load. (Chris)

v3:
- Switch to no-overwrite mode for relay.
- Fix the relay sub buffer switching sequence.

v4:
- Update i915 Kconfig to select RELAY config. (Tvrtko)
- Log a message when there is no sub buffer available to capture the GuC log buffer. (Tvrtko)
- Increase the number of relay sub buffers to 8 from 4, to have sufficient buffering for boot time logs.

v5:
- Fix the alignment, indentation issues and some minor cleanup. (Tvrtko)
- Update the comment to elaborate on why a relay channel has to be associated with the debugfs file.
(Tvrtko) Suggested-by: Chris Wilson Signed-off-by: Sourab Gupta Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/Kconfig | 1 + drivers/gpu/drm/i915/i915_drv.c| 2 + drivers/gpu/drm/i915/i915_guc_submission.c | 211 - drivers/gpu/drm/i915/intel_guc.h | 3 + 4 files changed, 215 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 7769e46..fc900d2 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -11,6 +11,7 @@ config DRM_I915 select DRM_KMS_HELPER select DRM_PANEL select DRM_MIPI_DSI +select RELAY # i915 depends on ACPI_VIDEO when ACPI is enabled # but for select to work, need to select ACPI_VIDEO's dependencies, ick select BACKLIGHT_LCD_SUPPORT if ACPI diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 13ae340..cdee60b 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1133,6 +1133,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv) /* Reveal our presence to userspace */ if (drm_dev_register(dev, 0) == 0) { i915_debugfs_register(dev_priv); +i915_guc_register(dev_priv); i915_setup_sysfs(dev); } else DRM_ERROR("Failed to register driver for userspace access!\n"); @@ -1171,6 +1172,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv) intel_opregion_unregister(dev_priv); i915_teardown_sysfs(_priv->drm); +i915_guc_unregister(dev_priv); i915_debugfs_unregister(dev_priv); drm_dev_unregister(_priv->drm); diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 2b27b87..9b1054c 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -23,6 +23,8 @@ */ #include #include +#include +#include #include "i915_drv.h" #include "intel_guc.h" @@ -837,13 +839,159 @@ err: return NULL; } +/* + * Sub buffer switch callback. 
Called whenever relay has to switch to a new + * sub buffer, relay stays on the same sub buffer if 0 is returned. + */ +static int subbuf_start_callback(struct rchan_buf *buf, + void *subbuf, + void *prev_subbuf, + size_t prev_padding) +{ +/* Use no-overwrite mode by default, where relay will stop accepting + * new data if there are no empty sub buffers left. + * There is no strict synchronization enforced by relay between Consumer + * and Producer. In overwrite mode, there is a possibility of getting + * inconsistent/garbled data, the producer could be writing on to the + * same sub buffer from which Consumer is reading. This can't be avoided + * unless Consumer is fast enough and can always run in tandem with + * Producer. + */ +if (relay_buf_full(buf)) +return 0; + +return 1; +} + +/* + * file_create() callback. Creates relay file in debugfs. + */ +static struct dentry *create_buf_file_callback(const char *filename, +
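The no-overwrite policy described in the comment above can be illustrated with a small userspace model. Everything here is hypothetical (struct model_buf and the model_* helpers are not kernel API); it only assumes that "full" means the producer is a whole channel's worth of sub buffers ahead of the consumer, and it ignores how relay actually fills a sub buffer before switching.

```c
#include <stddef.h>

/* Userspace model of a relay buffer's bookkeeping; field names are
 * illustrative and only loosely follow the kernel's struct rchan_buf. */
struct model_buf {
	size_t n_subbufs;        /* total sub buffers in the channel */
	size_t subbufs_produced; /* sub buffers filled by the producer */
	size_t subbufs_consumed; /* sub buffers released by the consumer */
};

/* Full when every sub buffer holds data the consumer has not released. */
static int model_buf_full(const struct model_buf *buf)
{
	return buf->subbufs_produced - buf->subbufs_consumed >= buf->n_subbufs;
}

/* No-overwrite policy: return 0 to refuse the switch, 1 to allow it. */
static int model_subbuf_start(const struct model_buf *buf)
{
	return model_buf_full(buf) ? 0 : 1;
}

/* Producer side: try to store one snapshot; returns 1 if stored, 0 if dropped. */
static int model_store_snapshot(struct model_buf *buf)
{
	if (!model_subbuf_start(buf))
		return 0;
	buf->subbufs_produced++;
	return 1;
}
```

With 8 sub buffers and no consumer, the ninth snapshot is dropped rather than overwriting unread data, which is the trade-off the comment argues for: lost new data instead of garbled old data.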
Re: [Intel-gfx] [PATCH 06/18] drm/i915: Handle log buffer flush interrupt event from GuC
On 8/15/2016 8:50 PM, Tvrtko Ursulin wrote: On 15/08/16 15:49, akash.g...@intel.com wrote:

From: Sagar Arun Kamble

GuC ukernel sends an interrupt to Host to flush the log buffer and expects Host to correspondingly update the read pointer information in the state structure, once it has consumed the log buffer contents by copying them to a file or buffer. Even if Host couldn't copy the contents, it can still update the read pointer so that logging state is not disturbed on GuC side.

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of crash buffer area for regular cases and copying only the state structure data in first page.

v3:
- Create a vmalloc mapping of log buffer. (Chris)
- Cover the flush acknowledgment under rpm get & put. (Chris)
- Revert the change of skipping the copy of crash dump area, as not really needed, will be covered by subsequent patch.

v4:
- Destroy the wq under the same condition in which it was created, pass dev_priv pointer instead of dev to newly added GuC function, add more comments & rename variable for clarity. (Tvrtko)

v5:
- Allocate & destroy the dedicated wq, for handling flush interrupt, from the setup/teardown routines of GuC logging. (Chris)
- Validate the log buffer size value retrieved from state structure and do some minor cleanup. (Tvrtko)
- Fix error/warnings reported by checkpatch. (Tvrtko)
- Rebase.
Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 202 + drivers/gpu/drm/i915/i915_irq.c| 29 - drivers/gpu/drm/i915/intel_guc.h | 4 + 3 files changed, 234 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index b062da6..2b27b87 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct intel_guc *guc, return host2guc_action(guc, data, ARRAY_SIZE(data)); } +static int host2guc_logbuffer_flush_complete(struct intel_guc *guc) +{ +u32 data[1]; + +data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE; + +return host2guc_action(guc, data, 1); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -828,6 +837,179 @@ err: return NULL; } +static void guc_move_to_next_buf(struct intel_guc *guc) +{ +} + +static void *guc_get_write_buffer(struct intel_guc *guc) +{ +return NULL; +} + +static void guc_read_update_log_buffer(struct intel_guc *guc) +{ +struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state; +struct guc_log_buffer_state log_buffer_state_local; +void *src_data_ptr, *dst_data_ptr; +unsigned int buffer_size, expected_size; +enum guc_log_buffer_type type; + +if (WARN_ON(!guc->log.buf_addr)) +return; + +/* Get the pointer to shared GuC log buffer */ +log_buffer_state = src_data_ptr = guc->log.buf_addr; + +/* Get the pointer to local buffer to store the logs */ +dst_data_ptr = log_buffer_snapshot_state = guc_get_write_buffer(guc); + +/* Actual logs are present from the 2nd page */ +src_data_ptr += PAGE_SIZE; +dst_data_ptr += PAGE_SIZE; + +for (type = GUC_ISR_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) { +/* Make a copy of the state structure in GuC log buffer (which + * is uncached mapped) on the stack to avoid reading from it + * multiple times. 
+ */ +memcpy(_buffer_state_local, log_buffer_state, + sizeof(struct guc_log_buffer_state)); +buffer_size = log_buffer_state_local.size; + +if (log_buffer_snapshot_state) { +/* First copy the state structure in snapshot buffer */ +memcpy(log_buffer_snapshot_state, _buffer_state_local, + sizeof(struct guc_log_buffer_state)); + +/* The write pointer could have been updated by the GuC + * firmware, after sending the flush interrupt to Host, + * for consistency set the write pointer value to same + * value of sampled_write_ptr in the snapshot buffer. + */ +log_buffer_snapshot_state->write_ptr = +log_buffer_snapshot_state->sampled_write_ptr; + +log_buffer_snapshot_state++; + +/* Now copy the actual logs, but before that validate + * the buffer size value retrieved from state structure. + */ +if (type == GUC_ISR_LOG_BUFFER) +expected_size = (GUC_LOG_ISR_PAGES+1)*PAGE_SIZE; +else if (type == GUC_DPC_LOG_BUFFER) +expected_size = (GUC_LOG_DPC_PAGES+1)*PAGE_SIZE; +else +
Re: [Intel-gfx] [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs
On 8/15/2016 2:50 PM, Tvrtko Ursulin wrote: On 12/08/16 17:31, Goel, Akash wrote: On 8/12/2016 9:52 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel <akash.g...@intel.com> As per the current i915 Driver load sequence, debugfs registration is done at the end and so the relay channel debugfs file is also created after that but the GuC firmware is loaded much earlier in the sequence. As a result Driver could miss capturing the boot-time logs of GuC firmware if there are flush interrupts from the GuC side. Relay has a provision to support early logging where initially only relay channel can be created, to have buffers for storing logs, and later on channel can be associated with a debugfs file at appropriate time. Have availed that, which allows Driver to capture boot time logs also, which can be collected once Userspace comes up. Suggested-by: Chris Wilson <ch...@chris-wilson.co.uk> Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_guc_submission.c | 61 +- 1 file changed, 44 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index af48f62..1c287d7 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct intel_guc *guc) relay_close(guc->log.relay_chan); } -static int guc_create_log_relay_file(struct intel_guc *guc) +static int guc_create_relay_channel(struct intel_guc *guc) { struct drm_i915_private *dev_priv = guc_to_i915(guc); struct rchan *guc_log_relay_chan; -struct dentry *log_dir; size_t n_subbufs, subbuf_size; -/* For now create the log file in /sys/kernel/debug/dri/0 dir */ -log_dir = dev_priv->drm.primary->debugfs_root; - -/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is - * not mounted and so can't create the relay file. - * The relay API seems to fit well with debugfs only. 
It only needs a dentry, I don't see that it has to be a debugfs one.

Besides the dentry, there are other requirements for using relay, which can be met only for a debugfs file. debugfs wasn't the preferred choice for placing the log file, but there was no other option, as the relay API is compatible with debugfs only.

What are those and

To use relay there are 3 requirements:
a) The 'dentry' pointer of the file is needed while opening the relay channel.
b) It should be possible to use the 'relay_file_operations' fops for the file.
c) The 'i_private' field of the file's inode has to be set to the pointer of the relay channel buffer.

All 3 requirements can be met for a debugfs file in a straightforward manner. But not all of them can be met for a file created inside sysfs, or for a file created inside /dev as a character device file.

should they be mentioned in the comment above?

Or should I mention them in the cover letter or commit message?

Best regards
Akash

Regards,
Tvrtko
Re: [Intel-gfx] [PATCH 15/20] drm/i915: Debugfs support for GuC logging control
On 8/12/2016 9:27 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Sagar Arun KambleThis patch provides debugfs interface i915_guc_output_control for on the fly enabling/disabling of logging in GuC firmware and controlling the verbosity level of logs. The value written to the file, should have bit 0 set to enable logging and bits 4-7 should contain the verbosity info. v2: Add a forceful flush, to collect left over logs, on disabling logging. Useful for Validation. v3: Besides minor cleanup, implement read method for the debugfs file and set the guc_log_level to -1 when logging is disabled. (Tvrtko) Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_debugfs.c| 44 - drivers/gpu/drm/i915/i915_guc_submission.c | 63 ++ drivers/gpu/drm/i915/intel_guc.h | 1 + 3 files changed, 107 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 14e0dcf..f472fbcd3 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2674,6 +2674,47 @@ static int i915_guc_log_dump(struct seq_file *m, void *data) return 0; } +static int i915_guc_log_control_get(void *data, u64 *val) +{ +struct drm_device *dev = data; +struct drm_i915_private *dev_priv = to_i915(dev); + +if (!dev_priv->guc.log.obj) +return -EINVAL; + +*val = i915.guc_log_level; + +return 0; +} + +static int i915_guc_log_control_set(void *data, u64 val) +{ +struct drm_device *dev = data; +struct drm_i915_private *dev_priv = to_i915(dev); +int ret; + +ret = mutex_lock_interruptible(>struct_mutex); +if (ret) +return ret; + +if (!dev_priv->guc.log.obj) { +ret = -EINVAL; +goto end; +} + +intel_runtime_pm_get(dev_priv); +ret = i915_guc_log_control(dev_priv, val); +intel_runtime_pm_put(dev_priv); + +end: +mutex_unlock(>struct_mutex); +return ret; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops, +i915_guc_log_control_get, i915_guc_log_control_set, +"%lld\n"); + 
static int i915_edp_psr_status(struct seq_file *m, void *data) { struct drm_info_node *node = m->private; @@ -5477,7 +5518,8 @@ static const struct i915_debugfs_files { {"i915_fbc_false_color", _fbc_fc_fops}, {"i915_dp_test_data", _displayport_test_data_fops}, {"i915_dp_test_type", _displayport_test_type_fops}, -{"i915_dp_test_active", _displayport_test_active_fops} +{"i915_dp_test_active", _displayport_test_active_fops}, +{"i915_guc_log_control", _guc_log_control_fops} }; void intel_display_crc_init(struct drm_device *dev) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 4a75c16..041cf68 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -195,6 +195,16 @@ static int host2guc_force_logbuffer_flush(struct intel_guc *guc) return host2guc_action(guc, data, 2); } +static int host2guc_logging_control(struct intel_guc *guc, u32 control_val) +{ +u32 data[2]; + +data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING; +data[1] = control_val; + +return host2guc_action(guc, data, 2); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -1538,3 +1548,56 @@ void i915_guc_register(struct drm_i915_private *dev_priv) guc_log_late_setup(_priv->guc); mutex_unlock(_priv->drm.struct_mutex); } + +int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val) +{ +union guc_log_control log_param; +int ret; + +log_param.logging_enabled = control_val & 0x1; +log_param.verbosity = (control_val >> 4) & 0xF; Maybe "log_param.value = control_val" would also work since guc_log_control is conveniently defined as an union. Doesn't matter though. 
+ +if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN || +log_param.verbosity > GUC_LOG_VERBOSITY_MAX) +return -EINVAL; + +/* This combination doesn't make sense & won't have any effect */ +if (!log_param.logging_enabled && (i915.guc_log_level < 0)) +return 0; I wonder if it would work and maybe look nicer to generalize as: int guc_log_level; guc_log_level = log_param.logging_enabled ? log_param.verbosity : -1; if (i915.guc_log_level == guc_log_level) return 0; Fine, will try to refactor the code as per your suggestions. Thanks for the suggestions. + +ret = host2guc_logging_control(_priv->guc, log_param.value); +if (ret < 0) { +DRM_DEBUG_DRIVER("host2guc action failed %d\n", ret); +return ret; +} + +i915.guc_log_level = log_param.verbosity; This would then become i915.guc_log_level =
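The control value layout described in the commit message (bit 0 enables logging, bits 4-7 carry the verbosity) and the suggested generalization (derive a single level, -1 when disabled, and compare it with the current one) can be sketched in userspace C. The helper names are hypothetical, and the verbosity range 0..3 is an assumption, since the values of GUC_LOG_VERBOSITY_MIN and GUC_LOG_VERBOSITY_MAX are not shown in this thread.

```c
#include <stdint.h>

/* Assumed verbosity range; the real GUC_LOG_VERBOSITY_MIN/MAX values are
 * not shown in the thread. */
#define SKETCH_VERBOSITY_MIN 0
#define SKETCH_VERBOSITY_MAX 3

/*
 * Decode the value written to the control file: bit 0 enables logging,
 * bits 4-7 carry the verbosity. Stores the resulting level (-1 when
 * logging is disabled) in *level. Returns 0 on success, -1 if the
 * verbosity is out of range, in which case *level is left untouched.
 */
static int decode_log_control(uint64_t control_val, int *level)
{
	int enabled = (int)(control_val & 0x1);
	int verbosity = (int)((control_val >> 4) & 0xF);

	if (verbosity < SKETCH_VERBOSITY_MIN || verbosity > SKETCH_VERBOSITY_MAX)
		return -1;

	*level = enabled ? verbosity : -1;
	return 0;
}

/* Returns 1 if the requested level differs from the current one. */
static int log_level_changed(int current_level, int requested_level)
{
	return current_level != requested_level;
}
```

Collapsing the enable bit and the verbosity into one level makes the "nothing to do" check a single comparison, which covers both "already disabled" and "already at this verbosity".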
Re: [Intel-gfx] [PATCH 16/20] drm/i915: Support to create write combined type vmaps
On 8/12/2016 8:46 PM, Chris Wilson wrote: On Fri, Aug 12, 2016 at 08:43:58PM +0530, Goel, Akash wrote: On 8/12/2016 4:19 PM, Tvrtko Ursulin wrote:

Unrelated and unmentioned change to no guard page. Best to remove IMHO. Can keep the RB in that case.

Though it's not called out, sorry for that, isn't it better to avoid using the guard page? That will save 4KB of vmalloc virtual space (which is scarce) for every mapping created by the Driver. Would it be fine to update the commit message to mention this?

Too late, already applied without the new flag.

Oh, the patch is already queued for merge?

Yes, that's why I dropped the guard page when I found out it was being added. Send a patch to add the flag and we can discuss whether we think our code is adequate to not require the protection.

Fine, will prepare a separate patch to avoid using the guard page.

Best regards
Akash

-Chris
Re: [Intel-gfx] [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs
On 8/12/2016 9:52 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash GoelAs per the current i915 Driver load sequence, debugfs registration is done at the end and so the relay channel debugfs file is also created after that but the GuC firmware is loaded much earlier in the sequence. As a result Driver could miss capturing the boot-time logs of GuC firmware if there are flush interrupts from the GuC side. Relay has a provision to support early logging where initially only relay channel can be created, to have buffers for storing logs, and later on channel can be associated with a debugfs file at appropriate time. Have availed that, which allows Driver to capture boot time logs also, which can be collected once Userspace comes up. Suggested-by: Chris Wilson Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 61 +- 1 file changed, 44 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index af48f62..1c287d7 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct intel_guc *guc) relay_close(guc->log.relay_chan); } -static int guc_create_log_relay_file(struct intel_guc *guc) +static int guc_create_relay_channel(struct intel_guc *guc) { struct drm_i915_private *dev_priv = guc_to_i915(guc); struct rchan *guc_log_relay_chan; -struct dentry *log_dir; size_t n_subbufs, subbuf_size; -/* For now create the log file in /sys/kernel/debug/dri/0 dir */ -log_dir = dev_priv->drm.primary->debugfs_root; - -/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is - * not mounted and so can't create the relay file. - * The relay API seems to fit well with debugfs only. It only needs a dentry, I don't see that it has to be a debugfs one. 
Besides dentry, there are other requirements for using relay, which can be met only for a debugfs file. debugfs wasn't the preferred choice to place the log file, but had no other option, as relay API is compatible with debugfs only. Also retrieving dentry of a file is not so straight forward, as it might seem (spent considerable time on this initially). - */ -if (!log_dir) { -DRM_DEBUG_DRIVER("Parent debugfs directory not available yet\n"); -return -ENODEV; -} - /* Keep the size of sub buffers same as shared log buffer */ subbuf_size = guc->log.obj->base.size; /* Store up to 8 snaphosts, which is large enough to buffer sufficient @@ -1127,7 +1114,7 @@ static int guc_create_log_relay_file(struct intel_guc *guc) */ n_subbufs = 8; -guc_log_relay_chan = relay_open("guc_log", log_dir, +guc_log_relay_chan = relay_open(NULL, NULL, subbuf_size, n_subbufs, _callbacks, dev_priv); if (!guc_log_relay_chan) { @@ -1140,6 +1127,33 @@ static int guc_create_log_relay_file(struct intel_guc *guc) return 0; } +static int guc_create_log_relay_file(struct intel_guc *guc) +{ +struct drm_i915_private *dev_priv = guc_to_i915(guc); +struct dentry *log_dir; +int ret; + +/* For now create the log file in /sys/kernel/debug/dri/0 dir */ +log_dir = dev_priv->drm.primary->debugfs_root; + +/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is + * not mounted and so can't create the relay file. + * The relay API seems to fit well with debugfs only. 
+ */ +if (!log_dir) { +DRM_DEBUG_DRIVER("Parent debugfs directory not available yet\n"); +return -ENODEV; +} + +ret = relay_late_setup_files(guc->log.relay_chan, "guc_log", log_dir); +if (ret) { +DRM_DEBUG_DRIVER("Couldn't associate the channel with file %d\n", ret); +return ret; +} + +return 0; +} + static void guc_log_cleanup(struct intel_guc *guc) { struct drm_i915_private *dev_priv = guc_to_i915(guc); @@ -1167,7 +1181,7 @@ static int guc_create_log_extras(struct intel_guc *guc) { struct drm_i915_private *dev_priv = guc_to_i915(guc); void *vaddr; -int ret; +int ret = 0; lockdep_assert_held(_priv->drm.struct_mutex); @@ -1190,7 +1204,15 @@ static int guc_create_log_extras(struct intel_guc *guc) guc->log.buf_addr = vaddr; } -return 0; +if (!guc->log.relay_chan) { +/* Create a relay channel, so that we have buffers for storing + * the GuC firmware logs, the channel will be linked with a file + * later on when debugfs is registered. + */ +ret = guc_create_relay_channel(guc); +} + +return ret; } static void guc_create_log(struct intel_guc *guc) @@ -1231,6 +1253,7 @@ static void guc_create_log(struct intel_guc *guc)
Re: [Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
On 8/12/2016 7:37 PM, Tvrtko Ursulin wrote: On 12/08/16 14:45, Goel, Akash wrote: On 8/12/2016 6:47 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Sagar Arun Kamble <sagar.a.kam...@intel.com> GuC ukernel sends an interrupt to Host to flush the log buffer and expects Host to correspondingly update the read pointer information in the state structure, once it has consumed the log buffer contents by copying them to a file or buffer. Even if Host couldn't copy the contents, it can still update the read pointer so that logging state is not disturbed on GuC side. v2: - Use a dedicated workqueue for handling flush interrupt. (Tvrtko) - Reduce the overall log buffer copying time by skipping the copy of crash buffer area for regular cases and copying only the state structure data in first page. v3: - Create a vmalloc mapping of log buffer. (Chris) - Cover the flush acknowledgment under rpm get & put.(Chris) - Revert the change of skipping the copy of crash dump area, as not really needed, will be covered by subsequent patch. v4: - Destroy the wq under the same condition in which it was created, pass dev_piv pointer instead of dev to newly added GuC function, add more comments & rename variable for clarity. 
(Tvrtko) Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com> Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_drv.c| 14 +++ drivers/gpu/drm/i915/i915_guc_submission.c | 150 + drivers/gpu/drm/i915/i915_irq.c| 5 +- drivers/gpu/drm/i915/intel_guc.h | 3 + 4 files changed, 170 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 0fcd1c0..fc2da32 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv) if (dev_priv->hotplug.dp_wq == NULL) goto out_free_wq; +if (HAS_GUC_SCHED(dev_priv)) { This just reminded me that a previous patch had: +if (HAS_GUC_UCODE(dev)) +dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT; In the interrupt setup. I don't think there is a bug right now, but there is a disagreement between the two which would be good to resolve. This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED for correctness. I think. Sorry for inconsistency, Will use HAS_GUC_SCHED in the previous patch. As per Chris's comments will move the wq init/destroy to the GuC logging setup/teardown routines (guc_create_log_extras, guc_log_cleanup) You are fine with that ?. Yes thats OK I think. +/* Need a dedicated wq to process log buffer flush interrupts + * from GuC without much delay so as to avoid any loss of logs. 
+ */ +dev_priv->guc.log.wq = +alloc_ordered_workqueue("i915-guc_log", 0); +if (dev_priv->guc.log.wq == NULL) +goto out_free_hotplug_dp_wq; +} + return 0; +out_free_hotplug_dp_wq: +destroy_workqueue(dev_priv->hotplug.dp_wq); out_free_wq: destroy_workqueue(dev_priv->wq); out_err: @@ -782,6 +794,8 @@ out_err: static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv) { +if (HAS_GUC_SCHED(dev_priv)) +destroy_workqueue(dev_priv->guc.log.wq); destroy_workqueue(dev_priv->hotplug.dp_wq); destroy_workqueue(dev_priv->wq); } diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index c7c679f..2635b67 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct intel_guc *guc, return host2guc_action(guc, data, ARRAY_SIZE(data)); } +static int host2guc_logbuffer_flush_complete(struct intel_guc *guc) +{ +u32 data[1]; + +data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE; + +return host2guc_action(guc, data, 1); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -840,6 +849,127 @@ err: return NULL; } +static void guc_move_to_next_buf(struct intel_guc *guc) +{ +return; +} + +static void* guc_get_write_buffer(struct intel_guc *guc) +{ +return NULL; +} + +static void guc_read_update_log_buffer(struct intel_guc *guc) +{ +struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state; +struct guc_log_buffer_state log_buffer_state_local; +void *src_data_ptr, *dst_data_ptr; +u32 i, buffer_size; unsigned int i if you can be bothered. Fine will do that for both i & buffer_size. buffer_size can match the type of log_buffer_state_local.size or use something else if more appropriate. But I remember earlier in one of the patch, you suggested to use u32 as a ty
Re: [Intel-gfx] [PATCH 08/20] drm/i915: Add a relay backed debugfs interface for capturing GuC logs
On 8/12/2016 7:23 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash GoelAdded a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the User to capture GuC firmware logs. Availed relay framework to implement the interface, where Driver will have to just use a relay API to store snapshots of the GuC log buffer in the buffer managed by relay. The snapshot will be taken when GuC firmware sends a log buffer flush interrupt and up to four snaphots could be stored in the relay buffer. snapshots The relay buffer will be operated in a mode where it will overwrite the data not yet collected by User. Besides mmap method, through which User can directly access the relay buffer contents, relay also supports the 'poll' method. Through the 'poll' call on log file, User can come to know whenever a new snapshot of the log buffer is taken by Driver, so can run in tandem with the Driver and capture the logs in a sustained/streaming manner, without any loss of data. v2: Defer the creation of relay channel & associated debugfs file, as debugfs setup is now done at the end of i915 Driver load. (Chris) v3: - Switch to no-overwrite mode for relay. - Fix the relay sub buffer switching sequence. v4: - Update i915 Kconfig to select RELAY config. (TvrtKo) - Log a message when there is no sub buffer available to capture the GuC log buffer. 
(Tvrtko) - Increase the number of relay sub buffers to 8 from 4, to have sufficient buffering for boot time logs Suggested-by: Chris Wilson Signed-off-by: Sourab Gupta Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/Kconfig | 1 + drivers/gpu/drm/i915/i915_drv.c| 2 + drivers/gpu/drm/i915/i915_guc_submission.c | 206 - drivers/gpu/drm/i915/intel_guc.h | 3 + 4 files changed, 209 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 7769e46..fc900d2 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -11,6 +11,7 @@ config DRM_I915 select DRM_KMS_HELPER select DRM_PANEL select DRM_MIPI_DSI +select RELAY # i915 depends on ACPI_VIDEO when ACPI is enabled # but for select to work, need to select ACPI_VIDEO's dependencies, ick select BACKLIGHT_LCD_SUPPORT if ACPI diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index fc2da32..cb8c943 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1145,6 +1145,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv) /* Reveal our presence to userspace */ if (drm_dev_register(dev, 0) == 0) { i915_debugfs_register(dev_priv); +i915_guc_register(dev_priv); i915_setup_sysfs(dev); } else DRM_ERROR("Failed to register driver for userspace access!\n"); @@ -1183,6 +1184,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv) intel_opregion_unregister(dev_priv); i915_teardown_sysfs(_priv->drm); +i915_guc_unregister(dev_priv); i915_debugfs_unregister(dev_priv); drm_dev_unregister(_priv->drm); diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 2635b67..1a2d648 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -23,6 +23,8 @@ */ #include #include +#include +#include #include "i915_drv.h" #include "intel_guc.h" @@ -851,12 +853,33 @@ err: static void 
guc_move_to_next_buf(struct intel_guc *guc) { -return; +/* Make sure the updates made in the sub buffer are visible when + * Consumer sees the following update to offset inside the sub buffer. + */ +smp_wmb(); + +/* All data has been written, so now move the offset of sub buffer. */ +relay_reserve(guc->log.relay_chan, guc->log.obj->base.size); + +/* Switch to the next sub buffer */ +relay_flush(guc->log.relay_chan); } static void* guc_get_write_buffer(struct intel_guc *guc) { -return NULL; +/* FIXME: Cover the check under a lock ? */ Need to resolve before r-b in any case. After the last patch in this series, where relay channel will be created before enabling the GuC interrupts, the need of lock will not be there so will remove these comments in that patch. +if (!guc->log.relay_chan) +return NULL; + +/* Just get the base address of a new sub buffer and copy data into it + * ourselves. NULL will be returned in no-overwrite mode, if all sub + * buffers are full. Could have used the relay_write() to indirectly + * copy the data, but that would have been bit convoluted, as we need to + * write to only certain locations inside a sub buffer which cannot be + * done
Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer
On 8/12/2016 9:22 PM, Chris Wilson wrote:
> On Fri, Aug 12, 2016 at 09:16:03PM +0530, Goel, Akash wrote:
>> On 8/12/2016 9:02 PM, Chris Wilson wrote:
>>> There's (or will be) a function to dump the error object in a uniform
>>> manner. This patch is obsolete.
>>
>> There is a print_error_obj() function, but that prints one dword per line.
>
> It used to. It will shortly be a compressed stream. Pretty printing is
> left to userspace.

But invariably, we only will be interpreting the error state or GuC log buffer dump, and it will be really convenient if we can have 4 dwords per line matching the log sample size.

Best regards
Akash

> -Chris

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer
On 8/12/2016 9:02 PM, Chris Wilson wrote:
> On Fri, Aug 12, 2016 at 04:20:03PM +0100, Tvrtko Ursulin wrote:
>> On 12/08/16 07:25, akash.g...@intel.com wrote:
>>> From: Akash Goel
>>>
>>> Added the dump of GuC log buffer to i915 error state, as the contents of
>>> GuC log buffer would also be useful to determine why the GPU reset was
>>> triggered.
>>>
>>> Suggested-by: Chris Wilson
>>> Signed-off-by: Akash Goel
>>> ---
>>>  drivers/gpu/drm/i915/i915_drv.h       |  1 +
>>>  drivers/gpu/drm/i915/i915_gpu_error.c | 27 +++
>>>  2 files changed, 28 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index 28ffac5..4bd3790 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -509,6 +509,7 @@ struct drm_i915_error_state {
>>>  	struct intel_overlay_error_state *overlay;
>>>  	struct intel_display_error_state *display;
>>>  	struct drm_i915_error_object *semaphore_obj;
>>> +	struct drm_i915_error_object *guc_log_obj;
>>>  	struct drm_i915_error_engine {
>>>  		int engine_id;
>>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>>> index eecb870..561b523 100644
>>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>>> @@ -546,6 +546,21 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>>>  		}
>>>  	}
>>> +	if ((obj = error->guc_log_obj)) {
>>> +		err_printf(m, "GuC log buffer = 0x%08x\n",
>>> +			   lower_32_bits(obj->gtt_offset));
>>> +		for (i = 0; i < obj->page_count; i++) {
>>> +			for (elt = 0; elt < PAGE_SIZE/4; elt += 4) {
>>
>> Should the condition be PAGE_SIZE / 16 ? I am not sure, looks like it is
>> counting in u32 * 4 chunks so it might be. Or I might be confused..

It will be PAGE_SIZE / 4 only. It took me some iterations to get it right. PAGE_SIZE/4 is the number of dwords in a page, and elt += 4 covers 4 dwords in every iteration.

> There's (or will be) a function to dump the error object in a uniform
> manner. This patch is obsolete.

There is a print_error_obj() function, but that prints one dword per line. For the GuC log buffer it's better (for ease of interpretation) to print 4 dwords per line, as each sample is of 4 dwords and the headers are of 8 dwords. The other benefit is that it reduces the line count of the error state file (compared to other captured buffers like the ring buffer, batch buffers and status page, the log buffer is larger, at 76 KB).

Best regards
Akash

> -Chris
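To make the loop bound discussed above concrete, here is a minimal user-space sketch of the "4 dwords per line" dump. It is not the i915 code: `SKETCH_PAGE_SIZE` and `format_page_rows()` are hypothetical names, a 4096-byte page is assumed, and `snprintf()` into a caller-supplied buffer stands in for `err_printf()`. The point is only that a bound of PAGE_SIZE/4 (the dword count) combined with a step of 4 yields PAGE_SIZE/16 rows.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed page size for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper: format one page as rows of four dwords each.
 * Returns the number of rows written into buf.
 */
static size_t format_page_rows(char *buf, size_t buf_len, const uint32_t *page)
{
	size_t rows = 0, used = 0, elt;

	/* SKETCH_PAGE_SIZE / 4 is the number of dwords in the page. */
	for (elt = 0; elt < SKETCH_PAGE_SIZE / 4; elt += 4) {
		int n = snprintf(buf + used, buf_len - used,
				 "%08" PRIx32 " %08" PRIx32
				 " %08" PRIx32 " %08" PRIx32 "\n",
				 page[elt], page[elt + 1],
				 page[elt + 2], page[elt + 3]);

		/* Out of space: stop rather than emit a truncated row. */
		if (n < 0 || (size_t)n >= buf_len - used)
			break;

		used += (size_t)n;
		rows++;
	}

	return rows;
}
```

With a 4096-byte page this walks 1024 dwords in steps of 4, so it emits 256 rows of 36 characters each.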
Re: [Intel-gfx] [PATCH 05/20] drm/i915: Support for GuC interrupts
On 8/12/2016 8:35 PM, Tvrtko Ursulin wrote: On 12/08/16 15:31, Goel, Akash wrote: On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote: +static void gen9_guc2host_events_work(struct work_struct *work) +{ +struct drm_i915_private *dev_priv = +container_of(work, struct drm_i915_private, guc.events_work); + +spin_lock_irq(_priv->irq_lock); +/* Speed up work cancellation during disabling guc interrupts. */ +if (!dev_priv->guc.interrupts_enabled) { +spin_unlock_irq(_priv->irq_lock); +return; I suppose locking for early exit is something about ensuring the worker sees the update to dev_priv->guc.interrupts_enabled done on another CPU? Yes locking (providing implicit barrier) will ensure that update made from another CPU is immediately visible to the worker. What if the disable happens after the unlock above? It would wait in disable until the irq handler exits. Most probably it will not have to wait, as irq handler would have completed if work item began the execution. Irq handler just queues the work item, which gets scheduled later on. Using the lock is beneficial for the case where the execution of work item and interrupt disabling is done around the same time. Ok maybe I am missing something. When can the interrupt disabling happen? Will it be controlled by the debugfs file or is it driver load/unload and suspend/resume? yes disabling will happen for all the above 3 scenarios. +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir) +{ +bool interrupts_enabled; + +if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) { +spin_lock(_priv->irq_lock); +interrupts_enabled = dev_priv->guc.interrupts_enabled; +spin_unlock(_priv->irq_lock); Not sure that taking a lock around only this read is needed. Again same reason as above, to make sure an update made on another CPU is immediately visible to the irq handler. I don't get it, see above. 
:) Here also If interrupt disabling & ISR execution happens around the same time then ISR might miss the reset of 'interrupts_enabled' flag and queue the new work. What if reset of interrupts_enabled happens just as the ISR releases the lock? Then ISR will proceed ahead and queue the work item. Lock is useful if reset of interrupts_enabled flag just happens before the ISR inspects the value of that flag. Also lock will help when interrupts_enabled flag is set again, next ISR will definitely see it as set. And same applies to the case when interrupt is re-enabled, ISR might still see the 'interrupts_enabled' flag as false. It will eventually see the update though. +if (interrupts_enabled) { +/* Sample the log buffer flush related bits & clear them + * out now itself from the message identity register to + * minimize the probability of losing a flush interrupt, + * when there are back to back flush interrupts. + * There can be a new flush interrupt, for different log + * buffer type (like for ISR), whilst Host is handling + * one (for DPC). Since same bit is used in message + * register for ISR & DPC, it could happen that GuC + * sets the bit for 2nd interrupt but Host clears out + * the bit on handling the 1st interrupt. + */ +u32 msg = I915_READ(SOFT_SCRATCH(15)) & +(GUC2HOST_MSG_CRASH_DUMP_POSTED | + GUC2HOST_MSG_FLUSH_LOG_BUFFER); +if (msg) { +/* Clear the message bits that are handled */ +I915_WRITE(SOFT_SCRATCH(15), +I915_READ(SOFT_SCRATCH(15)) & ~msg); Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it twice? Thought reading it again (just before the update) is bit safer compared to reading it once, as there is a potential race problem here. GuC could also write to the SOFT_SCRATCH(15) register, set new events bit, while Host clears off the bit of handled events. Don't get it. If there is a race between read and write there still is, don't see how a second read makes it safer. 
Yes can't avoid the race completely by double reads, but can reduce the race window size. There was only one thing between the two reads, and that was "if (msg)": +u32 msg = I915_READ(SOFT_SCRATCH(15)) & +(GUC2HOST_MSG_CRASH_DUMP_POSTED | + GUC2HOST_MSG_FLUSH_LOG_BUFFER); +if (msg) { +/* Clear the message bits that are handled */ +I915_WRITE(SOFT_SCRATCH(15), +I915_READ(SOFT_SCRATCH(15)) & ~msg); Also I felt code looked better in current form, as macros GUC2HOST_MSG_CRASH_DUMP_POSTED & GUC2HOST_MSG_FLUSH_LOG_BUFFER were used only once. Will change as per the initial implementation. u32 msg = I915_REA
Re: [Intel-gfx] [PATCH 16/20] drm/i915: Support to create write combined type vmaps
On 8/12/2016 4:19 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Chris Wilsonvmaps has a provision for controlling the page protection bits, with which we can use to control the mapping type, e.g. WB, WC, UC or even WT. To allow the caller to choose their mapping type, we add a parameter to i915_gem_object_pin_map - but we still only allow one vmap to be cached per object. If the object is currently not pinned, then we recreate the previous vmap with the new access type, but if it was pinned we report an error. This effectively limits the access via i915_gem_object_pin_map to a single mapping type for the lifetime of the object. Not usually a problem, but something to be aware of when setting up the object's vmap. We will want to vary the access type to enable WC mappings of ringbuffer and context objects on !llc platforms, as well as other objects where we need coherent access to the GPU's pages without going through the GTT v2: Remove the redundant braces around pin count check and fix the marker in documentation (Chris) v3: - Add a new enum for the vmalloc mapping type & pass that as an argument to i915_object_pin_map. (Tvrtko) - Use PAGE_MASK to extract or filter the mapping type info and remove a superfluous BUG_ON.(Tvrtko) v4: - Rename the enums and clean up the pin_map function. 
(Chris) Signed-off-by: Chris Wilson Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.h| 9 - drivers/gpu/drm/i915/i915_gem.c| 58 +++--- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 2 +- drivers/gpu/drm/i915/i915_guc_submission.c | 2 +- drivers/gpu/drm/i915/intel_lrc.c | 8 ++--- drivers/gpu/drm/i915/intel_ringbuffer.c| 2 +- 6 files changed, 60 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 4bd3790..6603812 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -834,6 +834,11 @@ enum i915_cache_level { I915_CACHE_WT, /* hsw:gt3e WriteThrough for scanouts */ }; +enum i915_map_type { +I915_MAP_WB = 0, +I915_MAP_WC, +}; + struct i915_ctx_hang_stats { /* This context had batch pending when hang was declared */ unsigned batch_pending; @@ -3150,6 +3155,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) /** * i915_gem_object_pin_map - return a contiguous mapping of the entire object * @obj - the object to map into kernel address space + * @map_type - whether the vmalloc mapping should be using WC or WB pgprot_t * * Calls i915_gem_object_pin_pages() to prevent reaping of the object's * pages and then returns a contiguous mapping of the backing storage into @@ -3161,7 +3167,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) * Returns the pointer through which to access the mapped object, or an * ERR_PTR() on error. 
*/ -void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj); +void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj, +enum i915_map_type map_type); /** * i915_gem_object_unpin_map - releases an earlier mapping diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 03548db..7dabbc3f 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2077,10 +2077,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) list_del(>global_list); if (obj->mapping) { -if (is_vmalloc_addr(obj->mapping)) -vunmap(obj->mapping); +void *ptr = (void *)((uintptr_t)obj->mapping & PAGE_MASK); +if (is_vmalloc_addr(ptr)) +vunmap(ptr); else -kunmap(kmap_to_page(obj->mapping)); +kunmap(kmap_to_page(ptr)); obj->mapping = NULL; } @@ -2253,7 +2254,8 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) } /* The 'mapping' part of i915_gem_object_pin_map() below */ -static void *i915_gem_object_map(const struct drm_i915_gem_object *obj) +static void *i915_gem_object_map(const struct drm_i915_gem_object *obj, + enum i915_map_type type) { unsigned long n_pages = obj->base.size >> PAGE_SHIFT; struct sg_table *sgt = obj->pages; @@ -2263,9 +2265,10 @@ static void *i915_gem_object_map(const struct drm_i915_gem_object *obj) struct page **pages = stack_pages; unsigned long i = 0; void *addr; +bool use_wc = (type == I915_MAP_WC); /* A single page can always be kmapped */ -if (n_pages == 1) +if (n_pages == 1 && !use_wc) return kmap(sg_page(sgt->sgl)); if (n_pages > ARRAY_SIZE(stack_pages)) { @@ -2281,7 +2284,8 @@ static void *i915_gem_object_map(const struct drm_i915_gem_object *obj)
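The changelog mentions using PAGE_MASK "to extract or filter the mapping type info", and the put_pages hunk masks `obj->mapping` with PAGE_MASK before unmapping. The excerpt does not show how the type is stored, so the following is only a sketch under the assumption that the map type is kept in the low bits of a page-aligned pointer. All names (`SKETCH_PAGE_MASK`, `pack_mapping()`, `mapping_addr()`, `mapping_type()`) are hypothetical and a 4096-byte page is assumed.

```c
#include <stdint.h>

/* Assumed page size for this sketch; the kernel uses PAGE_SIZE/PAGE_MASK. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_PAGE_MASK (~(uintptr_t)(SKETCH_PAGE_SIZE - 1))

/* Mirrors the two values of enum i915_map_type in the patch. */
enum sketch_map_type {
	SKETCH_MAP_WB = 0,
	SKETCH_MAP_WC,
};

/* Store the type in the low bits; addr must be page aligned. */
static void *pack_mapping(void *addr, enum sketch_map_type type)
{
	return (void *)((uintptr_t)addr | (uintptr_t)type);
}

/* Recover the page-aligned address by filtering out the type bits. */
static void *mapping_addr(void *packed)
{
	return (void *)((uintptr_t)packed & SKETCH_PAGE_MASK);
}

/* Recover the type from the bits below the page boundary. */
static enum sketch_map_type mapping_type(void *packed)
{
	return (enum sketch_map_type)((uintptr_t)packed & ~SKETCH_PAGE_MASK);
}
```

This works only because a vmap or kmap address is page aligned, which leaves the low bits free. It also fits the one-cached-mapping-per-object limit described in the commit message, since a single pointer carries a single type.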
Re: [Intel-gfx] [PATCH 09/20] drm/i915: New lock to serialize the Host2GuC actions
On 8/12/2016 7:25 PM, Tvrtko Ursulin wrote:
> On 12/08/16 07:25, akash.g...@intel.com wrote:
>> From: Akash Goel
>>
>> With the addition of new Host2GuC actions related to GuC logging, there
>> is a need of a lock to serialize them, as they can execute concurrently
>> with each other and also with other existing actions.
>>
>> v2: Use mutex in place of spinlock to serialize, as sleep can happen
>>     while waiting for the action's response from GuC. (Tvrtko)
>>
>> Signed-off-by: Akash Goel
>> ---
>>  drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
>>  drivers/gpu/drm/i915/intel_guc.h           | 3 +++
>>  2 files changed, 6 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index 1a2d648..cb9672b 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -88,6 +88,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
>>  		return -EINVAL;
>>  	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
>> +	mutex_lock(&guc->action_lock);
>
> I would probably take the mutex before grabbing forcewake as a general
> rule. Not that I think it matters in this case since we don't expect any
> contention on this one.

Yes, I did not expect contention for this mutex, hence thought to use it just around the code where it is actually needed. Will move it before the forcewake, as you suggested, to conform to the rules.

Best regards
Akash

>>  	dev_priv->guc.action_count += 1;
>>  	dev_priv->guc.action_cmd = data[0];
>> @@ -126,6 +127,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
>>  	}
>>  	dev_priv->guc.action_status = status;
>> +	mutex_unlock(&guc->action_lock);
>>  	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>>  	return ret;
>> @@ -1312,6 +1314,7 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
>>  		return -ENOMEM;
>>  	ida_init(&guc->ctx_ids);
>> +	mutex_init(&guc->action_lock);
>>  	guc_create_log(guc);
>>  	guc_create_ads(guc);
>> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
>> index 96ef7dc..e4ec8d8 100644
>> --- a/drivers/gpu/drm/i915/intel_guc.h
>> +++ b/drivers/gpu/drm/i915/intel_guc.h
>> @@ -156,6 +156,9 @@ struct intel_guc {
>>  	uint64_t submissions[I915_NUM_ENGINES];
>>  	uint32_t last_seqno[I915_NUM_ENGINES];
>> +
>> +	/* To serialize the Host2GuC actions */
>> +	struct mutex action_lock;
>>  };
>>  /* intel_guc_loader.c */
>
> With or without the mutex vs forcewake ordering change:
>
> Reviewed-by: Tvrtko Ursulin
>
> Regards,
> Tvrtko
Re: [Intel-gfx] [PATCH 10/20] drm/i915: Add stats for GuC log buffer flush interrupts
On 8/12/2016 7:56 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash GoelGuC firmware sends an interrupt to flush the log buffer when it becomes half full. GuC firmware also tracks how many times the buffer overflowed. It would be useful to maintain a statistics of how many flush interrupts were received and for which type of log buffer, along with the overflow count of each buffer type. Augmented i915_log_info debugfs to report back these statistics. v2: - Update the logic to detect multiple overflows between the 2 flush interrupts and also log a message for overflow (Tvrtko) - Track the number of times there was no free sub buffer to capture the GuC log buffer. (Tvrtko) Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_debugfs.c| 28 drivers/gpu/drm/i915/i915_guc_submission.c | 19 +++ drivers/gpu/drm/i915/i915_irq.c| 2 ++ drivers/gpu/drm/i915/intel_guc.h | 7 +++ 4 files changed, 56 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 51b59d5..14e0dcf 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2539,6 +2539,32 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data) return 0; } +static void i915_guc_log_info(struct seq_file *m, + struct drm_i915_private *dev_priv) +{ +struct intel_guc *guc = _priv->guc; + +seq_printf(m, "\nGuC logging stats:\n"); + +seq_printf(m, "\tISR: flush count %10u, overflow count %8u\n", +guc->log.flush_count[GUC_ISR_LOG_BUFFER], +guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]); + +seq_printf(m, "\tDPC: flush count %10u, overflow count %8u\n", +guc->log.flush_count[GUC_DPC_LOG_BUFFER], +guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]); + +seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n", +guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER], +guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]); Why is the width for overflow only 8 chars and not 10 like for flush 
since both are u32? Looks to be a discrepancy. I will check. Both should be 10 as per the max value of u32, which takes 10 digits in decimal form. + +seq_printf(m, "\tTotal flush interrupt count: %u\n", + guc->log.flush_interrupt_count); + +seq_printf(m, "\tCapture miss count: %u\n", + guc->log.capture_miss_count); +} + static void i915_guc_client_info(struct seq_file *m, struct drm_i915_private *dev_priv, struct i915_guc_client *client) @@ -2613,6 +2639,8 @@ static int i915_guc_info(struct seq_file *m, void *data) seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client); i915_guc_client_info(m, dev_priv, ); +i915_guc_log_info(m, dev_priv); + /* Add more as required ... */ return 0; diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index cb9672b..1ca1866 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -913,6 +913,24 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) sizeof(struct guc_log_buffer_state)); buffer_size = log_buffer_state_local.size; +guc->log.flush_count[i] += log_buffer_state_local.flush_to_file; +if (log_buffer_state_local.buffer_full_cnt != +guc->log.prev_overflow_count[i]) { +guc->log.total_overflow_count[i] += +(log_buffer_state_local.buffer_full_cnt - + guc->log.prev_overflow_count[i]); + +if (log_buffer_state_local.buffer_full_cnt < +guc->log.prev_overflow_count[i]) { +/* buffer_full_cnt is a 4 bit counter */ +guc->log.total_overflow_count[i] += 16; +} + +guc->log.prev_overflow_count[i] = +log_buffer_state_local.buffer_full_cnt; +DRM_ERROR_RATELIMITED("GuC log buffer overflow\n"); +} + if (log_buffer_snapshot_state) { /* First copy the state structure in local buffer */ memcpy(log_buffer_snapshot_state, _buffer_state_local, @@ -953,6 +971,7 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) * getting consumed by User at a slow rate. 
*/ DRM_ERROR_RATELIMITED("no sub-buffer to capture log buffer\n"); +guc->log.capture_miss_count++; } } diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index d4d6f0a..b08d1d2 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -1705,6 +1705,8 @@ static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
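The overflow accounting in the hunk above handles the wrap of the 4-bit buffer_full_cnt by adding 16 when the new value is smaller than the previous one. An equivalent way to express the same arithmetic, shown here only as a stand-alone sketch and not as the patch's code, is to mask the unsigned difference to the counter width. The names `SKETCH_FULL_CNT_BITS` and `overflow_delta()` are hypothetical.

```c
#include <stdint.h>

/* buffer_full_cnt is described in the patch as a 4 bit counter. */
#define SKETCH_FULL_CNT_BITS 4u
#define SKETCH_FULL_CNT_MASK ((1u << SKETCH_FULL_CNT_BITS) - 1u)

/*
 * Number of overflows reported since the previous sample.
 * Unsigned subtraction wraps modulo 2^32, and masking to 4 bits
 * reduces that to modulo 16, matching the "+= 16" correction.
 */
static uint32_t overflow_delta(uint32_t prev_cnt, uint32_t cur_cnt)
{
	return (cur_cnt - prev_cnt) & SKETCH_FULL_CNT_MASK;
}
```

For example, going from 14 to 1 gives 3 overflows, the same result as (1 - 14) + 16. Either form can only count up to 15 overflows between two samples; a full lap of the 4-bit counter is indistinguishable from no change.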
Re: [Intel-gfx] [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer
On 8/12/2016 8:12 PM, Tvrtko Ursulin wrote:
> On 12/08/16 07:25, akash.g...@intel.com wrote:
>> From: Akash Goel
>>
>> GuC firmware sends an interrupt to flush the log buffer when it becomes
>> half full, so Driver doesn't really need to sample the complete buffer
>> and can just copy only the newly written data by GuC into the local
>> buffer, i.e. as per the read & write pointer values.
>> Moreover the flush interrupt would generally come for one type of log
>> buffer, when it becomes half full, so at that time the other 2 types of
>> log buffer would comparatively have much lesser unread data in them.
>> In case of overflow reported by GuC, Driver do need to copy the entire
>> buffer as the whole buffer would contain the unread data.
>>
>> v2: Rebase.
>>
>> Signed-off-by: Akash Goel
>> ---
>>  drivers/gpu/drm/i915/i915_guc_submission.c | 40 +-
>>  1 file changed, 34 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index 1ca1866..8e0f360 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -889,7 +889,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
>>      struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state;
>>      struct guc_log_buffer_state log_buffer_state_local;
>>      void *src_data_ptr, *dst_data_ptr;
>> -    u32 i, buffer_size;
>> +    bool new_overflow;
>> +    u32 i, buffer_size, read_offset, write_offset, bytes_to_copy;
>>
>>      if (!guc->log.buf_addr)
>>          return;
>> @@ -912,10 +913,13 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
>>          memcpy(&log_buffer_state_local, log_buffer_state,
>>                 sizeof(struct guc_log_buffer_state));
>>          buffer_size = log_buffer_state_local.size;
>> +        read_offset = log_buffer_state_local.read_ptr;
>> +        write_offset = log_buffer_state_local.sampled_write_ptr;
>>
>>          guc->log.flush_count[i] += log_buffer_state_local.flush_to_file;
>>          if (log_buffer_state_local.buffer_full_cnt !=
>>                  guc->log.prev_overflow_count[i]) {
>
> Wrong alignment. You can try checkpatch.pl for all of those.

Sorry for all the alignment & indentation issues.
Should the above condition be written like this ?

    if (log_buffer_state_local.buffer_full_cnt !=
        guc->log.prev_overflow_count[i]) {

>> +            new_overflow = 1;
>
> true/false since it is a bool

fine will do that.

>>              guc->log.total_overflow_count[i] +=
>>                  (log_buffer_state_local.buffer_full_cnt -
>>                   guc->log.prev_overflow_count[i]);
>> @@ -929,7 +933,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
>>              guc->log.prev_overflow_count[i] =
>>                  log_buffer_state_local.buffer_full_cnt;
>>              DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
>> -        }
>> +        } else
>> +            new_overflow = 0;
>>
>>          if (log_buffer_snapshot_state) {
>>              /* First copy the state structure in local buffer */
>> @@ -941,13 +946,37 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
>>               * for consistency set the write pointer value to same
>>               * value of sampled_write_ptr in the snapshot buffer.
>>               */
>> -            log_buffer_snapshot_state->write_ptr =
>> -                log_buffer_snapshot_state->sampled_write_ptr;
>> +            log_buffer_snapshot_state->write_ptr = write_offset;
>>
>>              log_buffer_snapshot_state++;
>>
>>              /* Now copy the actual logs */
>>              memcpy(dst_data_ptr, src_data_ptr, buffer_size);
>
> The confusing bit - the memcpy above still copies the whole buffer, no?

Really very sorry for this blooper.

Best regards
Akash

>> +            if (unlikely(new_overflow)) {
>> +                /* copy the whole buffer in case of overflow */
>> +                read_offset = 0;
>> +                write_offset = buffer_size;
>> +            } else if (unlikely((read_offset > buffer_size) ||
>> +                    (write_offset > buffer_size))) {
>> +                DRM_ERROR("invalid log buffer state\n");
>> +                /* copy whole buffer as offsets are unreliable */
>> +                read_offset = 0;
>> +                write_offset = buffer_size;
>> +            }
>> +
>> +            /* Just copy the newly written data */
>> +            if (read_offset <= write_offset) {
>> +                bytes_to_copy = write_offset - read_offset;
>> +                memcpy(dst_data_ptr + read_offset,
>> +                       src_data_ptr + read_offset, bytes_to_copy);
>> +            } else {
>> +                bytes_to_copy = buffer_size - read_offset;
>> +                memcpy(dst_data_ptr + read_offset,
>> +                       src_data_ptr + read_offset, bytes_to_copy);
>> +
>> +                bytes_to_copy = write_offset;
>> +                memcpy(dst_data_ptr, src_data_ptr, bytes_to_copy);
>> +            }
>>
>>              src_data_ptr += buffer_size;
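For readers following the review, the partial-copy logic in the quoted hunk can be tried outside the kernel. The following is only a user-space sketch under stated assumptions: `copy_unread_log` and its parameters are hypothetical names, not driver functions, and plain byte arrays stand in for the GuC log buffer and the local snapshot. It copies just the unread region between the read and write offsets, handles the wrap-around case with two copies, and falls back to the whole buffer on overflow or out-of-range offsets.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of i915): copy only the unread region
 * [read_offset, write_offset) of a circular log buffer from src to dst,
 * keeping the data at the same offsets. Returns the bytes copied.
 */
static size_t copy_unread_log(uint8_t *dst, const uint8_t *src,
                              uint32_t buffer_size, uint32_t read_offset,
                              uint32_t write_offset, int overflow)
{
    size_t copied;

    /* On overflow, or if the offsets are unreliable, take everything. */
    if (overflow || read_offset > buffer_size || write_offset > buffer_size) {
        read_offset = 0;
        write_offset = buffer_size;
    }

    if (read_offset <= write_offset) {
        /* Contiguous case: the unread data has not wrapped. */
        copied = write_offset - read_offset;
        memcpy(dst + read_offset, src + read_offset, copied);
    } else {
        /* Wrapped case: tail of the buffer first, then the head. */
        size_t tail = buffer_size - read_offset;

        memcpy(dst + read_offset, src + read_offset, tail);
        memcpy(dst, src, write_offset);
        copied = tail + write_offset;
    }

    return copied;
}
```

With an 8-byte buffer, a read offset of 6 and a write offset of 2, the sketch copies bytes 6-7 and then bytes 0-1, leaving the rest of the destination untouched.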
Re: [Intel-gfx] [PATCH 05/20] drm/i915: Support for GuC interrupts
On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote: On 12/08/16 14:10, Goel, Akash wrote: On 8/12/2016 5:24 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Sagar Arun Kamble <sagar.a.kam...@intel.com> There are certain types of interrupts which Host can recieve from GuC. GuC ukernel sends an interrupt to Host for certain events, like for example retrieve/consume the logs generated by ukernel. This patch adds support to receive interrupts from GuC but currently enables & partially handles only the interrupt sent by GuC ukernel. Future patches will add support for handling other interrupt types. v2: - Use common low level routines for PM IER/IIR programming (Chris) - Rename interrupt functions to gen9_xxx from gen8_xxx (Chris) - Replace disabling of wake ref asserts with rpm get/put (Chris) v3: - Update comments for more clarity. (Tvrtko) - Remove the masking of GuC interrupt, which was kept masked till the start of bottom half, its not really needed as there is only a single instance of work item & wq is ordered. (Tvrtko) v4: - Rebase. - Rename guc_events to pm_guc_events so as to be indicative of the register/control block it is associated with. (Chris) - Add handling for back to back log buffer flush interrupts. v5: - Move the read & clearing of register, containing Guc2Host message bits, outside the irq spinlock. 
(Tvrtko) Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com> Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_drv.h| 1 + drivers/gpu/drm/i915/i915_guc_submission.c | 5 ++ drivers/gpu/drm/i915/i915_irq.c| 100 +++-- drivers/gpu/drm/i915/i915_reg.h| 11 drivers/gpu/drm/i915/intel_drv.h | 3 + drivers/gpu/drm/i915/intel_guc.h | 4 ++ drivers/gpu/drm/i915/intel_guc_loader.c| 4 ++ 7 files changed, 124 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a608a5c..28ffac5 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1779,6 +1779,7 @@ struct drm_i915_private { u32 pm_imr; u32 pm_ier; u32 pm_rps_events; +u32 pm_guc_events; u32 pipestat_irq_mask[I915_MAX_PIPES]; struct i915_hotplug hotplug; diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index ad3b55f..c7c679f 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1071,6 +1071,8 @@ int intel_guc_suspend(struct drm_device *dev) if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS) return 0; +gen9_disable_guc_interrupts(dev_priv); + ctx = dev_priv->kernel_context; data[0] = HOST2GUC_ACTION_ENTER_S_STATE; @@ -1097,6 +1099,9 @@ int intel_guc_resume(struct drm_device *dev) if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS) return 0; +if (i915.guc_log_level >= 0) +gen9_enable_guc_interrupts(dev_priv); + ctx = dev_priv->kernel_context; data[0] = HOST2GUC_ACTION_EXIT_S_STATE; diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 5f93309..5f1974f 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct drm_i915_private *dev_priv, } while (0) static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir); +static void gen9_guc_irq_handler(struct 
drm_i915_private *dev_priv, u32 pm_iir); /* For display hotplug interrupt */ static inline void @@ -411,6 +412,38 @@ void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv) gen6_reset_rps_interrupts(dev_priv); } +void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv) +{ +spin_lock_irq(_priv->irq_lock); +gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events); +spin_unlock_irq(_priv->irq_lock); +} + +void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv) +{ +spin_lock_irq(_priv->irq_lock); +if (!dev_priv->guc.interrupts_enabled) { +WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) & +dev_priv->pm_guc_events); +dev_priv->guc.interrupts_enabled = true; +gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events); +} +spin_unlock_irq(_priv->irq_lock); +} + +void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv) +{ +spin_lock_irq(_priv->irq_lock); +dev_priv->guc.interrupts_enabled = false; + +gen6_disable_pm_irq(dev_priv, dev_priv->pm_guc_events); + +spin_unlock_irq(_priv->irq_lock); +synchronize_irq(dev_priv->drm.irq); + +gen9_reset_guc_interrupts(dev_priv); +} + /** * bdw_update_port_irq - update DE port interrup
Re: [Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
On 8/12/2016 6:47 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Sagar Arun KambleGuC ukernel sends an interrupt to Host to flush the log buffer and expects Host to correspondingly update the read pointer information in the state structure, once it has consumed the log buffer contents by copying them to a file or buffer. Even if Host couldn't copy the contents, it can still update the read pointer so that logging state is not disturbed on GuC side. v2: - Use a dedicated workqueue for handling flush interrupt. (Tvrtko) - Reduce the overall log buffer copying time by skipping the copy of crash buffer area for regular cases and copying only the state structure data in first page. v3: - Create a vmalloc mapping of log buffer. (Chris) - Cover the flush acknowledgment under rpm get & put.(Chris) - Revert the change of skipping the copy of crash dump area, as not really needed, will be covered by subsequent patch. v4: - Destroy the wq under the same condition in which it was created, pass dev_piv pointer instead of dev to newly added GuC function, add more comments & rename variable for clarity. (Tvrtko) Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.c| 14 +++ drivers/gpu/drm/i915/i915_guc_submission.c | 150 + drivers/gpu/drm/i915/i915_irq.c| 5 +- drivers/gpu/drm/i915/intel_guc.h | 3 + 4 files changed, 170 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 0fcd1c0..fc2da32 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv) if (dev_priv->hotplug.dp_wq == NULL) goto out_free_wq; +if (HAS_GUC_SCHED(dev_priv)) { This just reminded me that a previous patch had: +if (HAS_GUC_UCODE(dev)) +dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT; In the interrupt setup. 
I don't think there is a bug right now, but there is a disagreement between the two which would be good to resolve. This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED for correctness. I think. Sorry for inconsistency, Will use HAS_GUC_SCHED in the previous patch. As per Chris's comments will move the wq init/destroy to the GuC logging setup/teardown routines (guc_create_log_extras, guc_log_cleanup) You are fine with that ?. +/* Need a dedicated wq to process log buffer flush interrupts + * from GuC without much delay so as to avoid any loss of logs. + */ +dev_priv->guc.log.wq = +alloc_ordered_workqueue("i915-guc_log", 0); +if (dev_priv->guc.log.wq == NULL) +goto out_free_hotplug_dp_wq; +} + return 0; +out_free_hotplug_dp_wq: +destroy_workqueue(dev_priv->hotplug.dp_wq); out_free_wq: destroy_workqueue(dev_priv->wq); out_err: @@ -782,6 +794,8 @@ out_err: static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv) { +if (HAS_GUC_SCHED(dev_priv)) +destroy_workqueue(dev_priv->guc.log.wq); destroy_workqueue(dev_priv->hotplug.dp_wq); destroy_workqueue(dev_priv->wq); } diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index c7c679f..2635b67 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct intel_guc *guc, return host2guc_action(guc, data, ARRAY_SIZE(data)); } +static int host2guc_logbuffer_flush_complete(struct intel_guc *guc) +{ +u32 data[1]; + +data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE; + +return host2guc_action(guc, data, 1); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -840,6 +849,127 @@ err: return NULL; } +static void guc_move_to_next_buf(struct intel_guc *guc) +{ +return; +} + +static void* guc_get_write_buffer(struct intel_guc *guc) +{ +return NULL; +} + +static void guc_read_update_log_buffer(struct intel_guc *guc) +{ 
+struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state; +struct guc_log_buffer_state log_buffer_state_local; +void *src_data_ptr, *dst_data_ptr; +u32 i, buffer_size; unsigned int i if you can be bothered. Fine will do that for both i & buffer_size. But I remember earlier in one of the patch, you suggested to use u32 as a type for some variables. Please could you share the guideline. Should u32, u64 be used we are exactly sure of the range of the variable, like for variables containing the register values ? + +if (!guc->log.buf_addr) +return; Can it hit this? If yes, I think better
Re: [Intel-gfx] [PATCH 05/20] drm/i915: Support for GuC interrupts
On 8/12/2016 5:24 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Sagar Arun KambleThere are certain types of interrupts which Host can recieve from GuC. GuC ukernel sends an interrupt to Host for certain events, like for example retrieve/consume the logs generated by ukernel. This patch adds support to receive interrupts from GuC but currently enables & partially handles only the interrupt sent by GuC ukernel. Future patches will add support for handling other interrupt types. v2: - Use common low level routines for PM IER/IIR programming (Chris) - Rename interrupt functions to gen9_xxx from gen8_xxx (Chris) - Replace disabling of wake ref asserts with rpm get/put (Chris) v3: - Update comments for more clarity. (Tvrtko) - Remove the masking of GuC interrupt, which was kept masked till the start of bottom half, its not really needed as there is only a single instance of work item & wq is ordered. (Tvrtko) v4: - Rebase. - Rename guc_events to pm_guc_events so as to be indicative of the register/control block it is associated with. (Chris) - Add handling for back to back log buffer flush interrupts. v5: - Move the read & clearing of register, containing Guc2Host message bits, outside the irq spinlock. 
(Tvrtko) Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.h| 1 + drivers/gpu/drm/i915/i915_guc_submission.c | 5 ++ drivers/gpu/drm/i915/i915_irq.c| 100 +++-- drivers/gpu/drm/i915/i915_reg.h| 11 drivers/gpu/drm/i915/intel_drv.h | 3 + drivers/gpu/drm/i915/intel_guc.h | 4 ++ drivers/gpu/drm/i915/intel_guc_loader.c| 4 ++ 7 files changed, 124 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a608a5c..28ffac5 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1779,6 +1779,7 @@ struct drm_i915_private { u32 pm_imr; u32 pm_ier; u32 pm_rps_events; +u32 pm_guc_events; u32 pipestat_irq_mask[I915_MAX_PIPES]; struct i915_hotplug hotplug; diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index ad3b55f..c7c679f 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1071,6 +1071,8 @@ int intel_guc_suspend(struct drm_device *dev) if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS) return 0; +gen9_disable_guc_interrupts(dev_priv); + ctx = dev_priv->kernel_context; data[0] = HOST2GUC_ACTION_ENTER_S_STATE; @@ -1097,6 +1099,9 @@ int intel_guc_resume(struct drm_device *dev) if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS) return 0; +if (i915.guc_log_level >= 0) +gen9_enable_guc_interrupts(dev_priv); + ctx = dev_priv->kernel_context; data[0] = HOST2GUC_ACTION_EXIT_S_STATE; diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 5f93309..5f1974f 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct drm_i915_private *dev_priv, } while (0) static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir); +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir); /* For 
display hotplug interrupt */ static inline void @@ -411,6 +412,38 @@ void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv) gen6_reset_rps_interrupts(dev_priv); } +void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv) +{ +spin_lock_irq(_priv->irq_lock); +gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events); +spin_unlock_irq(_priv->irq_lock); +} + +void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv) +{ +spin_lock_irq(_priv->irq_lock); +if (!dev_priv->guc.interrupts_enabled) { +WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) & +dev_priv->pm_guc_events); +dev_priv->guc.interrupts_enabled = true; +gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events); +} +spin_unlock_irq(_priv->irq_lock); +} + +void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv) +{ +spin_lock_irq(_priv->irq_lock); +dev_priv->guc.interrupts_enabled = false; + +gen6_disable_pm_irq(dev_priv, dev_priv->pm_guc_events); + +spin_unlock_irq(_priv->irq_lock); +synchronize_irq(dev_priv->drm.irq); + +gen9_reset_guc_interrupts(dev_priv); +} + /** * bdw_update_port_irq - update DE port interrupt * @dev_priv: driver private @@ -1167,6 +1200,21 @@ static void gen6_pm_rps_work(struct work_struct *work) mutex_unlock(_priv->rps.hw_lock); } +static void
Re: [Intel-gfx] [PATCH 14/20] drm/i915: Forcefully flush GuC log buffer on reset
On 8/12/2016 12:03 PM, Chris Wilson wrote:
> On Fri, Aug 12, 2016 at 11:55:17AM +0530, akash.g...@intel.com wrote:
>> From: Sagar Arun Kamble
>>
>> Before capturing the GuC logs as a part of error state, there should be a
>> force log buffer flush action sent to GuC before proceeding with GPU
>> reset and re-initializing GUC. There could be some data in the log buffer
>> which is yet to be captured and those logs would be particularly useful
>> to understand that why the GPU reset was initiated.
>>
>> Signed-off-by: Sagar Arun Kamble
>> Signed-off-by: Akash Goel
>> ---
>>  drivers/gpu/drm/i915/i915_gpu_error.c      |  2 ++
>>  drivers/gpu/drm/i915/i915_guc_submission.c | 27 +++
>>  drivers/gpu/drm/i915/intel_guc.h           |  1 +
>>  3 files changed, 30 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>> index 561b523..5e358e2 100644
>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>> @@ -1232,6 +1232,8 @@ static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv,
>>      if (!dev_priv->guc.log.obj)
>>          return;
>>
>> +    i915_guc_flush_logs(dev_priv);
>
> This is an invalid context for this function, flush_work() is illegal
> inside error capture.

Actually the concerned work item should not take much time for execution
and also it doesn't acquire any such locks due to which it can get blocked.
Should there be no wait whatsoever in error capture ?
Will have to drop this patch.

Best regards
Akash
> -Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
On 8/12/2016 12:21 PM, Chris Wilson wrote:
> On Fri, Aug 12, 2016 at 12:14:28PM +0530, Goel, Akash wrote:
>> On 8/12/2016 11:58 AM, Chris Wilson wrote:
>>> On Fri, Aug 12, 2016 at 11:55:09AM +0530, akash.g...@intel.com wrote:
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>>>> index 0fcd1c0..fc2da32 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>>>  static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
>>>>  {
>>>> +    if (HAS_GUC_SCHED(dev_priv))
>>>> +        destroy_workqueue(dev_priv->guc.log.wq);
>>>
>>>     if (dev_priv->guc.log.wq)
>>>         destroy_workqueue(dev_priv->guc.log.wq);
>>>
>>> This shouldn't be here, but in guc teardown. Likewise this is
>>
>> Fine will move it to GuC teardown.
>>
>>>> @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv)
>>>>      if (dev_priv->hotplug.dp_wq == NULL)
>>>>          goto out_free_wq;
>>>>
>>>> +    if (HAS_GUC_SCHED(dev_priv)) {
>>>> +        /* Need a dedicated wq to process log buffer flush interrupts
>>>> +         * from GuC without much delay so as to avoid any loss of logs.
>>>> +         */
>>>> +        dev_priv->guc.log.wq =
>>>
>>> creating guc specific wq, not drm_i915_private's. They can even be
>>> managed by guc.log?
>>
>> Sorry for the inconsistency here, but didn't get your question.
>>
>>     dev_priv->guc.log.wq
>
> Just somewhere inside guc, I was just noting that you probably already
> have setup/teardown for dev_priv->guc.log itself.

Fine, will move the dedicated wq creation/destruction in the
setup/teardown routines for guc.log.

Best Regards
Akash
> -Chris
Re: [Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
On 8/12/2016 11:58 AM, Chris Wilson wrote:
> On Fri, Aug 12, 2016 at 11:55:09AM +0530, akash.g...@intel.com wrote:
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>> index 0fcd1c0..fc2da32 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>  static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
>>  {
>> +    if (HAS_GUC_SCHED(dev_priv))
>> +        destroy_workqueue(dev_priv->guc.log.wq);
>
>     if (dev_priv->guc.log.wq)
>         destroy_workqueue(dev_priv->guc.log.wq);
>
> This shouldn't be here, but in guc teardown. Likewise this is

Fine will move it to GuC teardown.

>> @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv)
>>      if (dev_priv->hotplug.dp_wq == NULL)
>>          goto out_free_wq;
>>
>> +    if (HAS_GUC_SCHED(dev_priv)) {
>> +        /* Need a dedicated wq to process log buffer flush interrupts
>> +         * from GuC without much delay so as to avoid any loss of logs.
>> +         */
>> +        dev_priv->guc.log.wq =
>
> creating guc specific wq, not drm_i915_private's. They can even be
> managed by guc.log?

Sorry for the inconsistency here, but didn't get your question.

    dev_priv->guc.log.wq
    dev_priv->guc.events_work

Best regards
Akash
> -Chris
Re: [Intel-gfx] [PATCH 17/17] drm/i915: Use rt priority kthread to do GuC log buffer sampling
On 7/21/2016 11:13 AM, Chris Wilson wrote:
> On Thu, Jul 21, 2016 at 09:11:42AM +0530, Goel, Akash wrote:
>> On 7/21/2016 1:04 AM, Chris Wilson wrote:
>>> In the end, just the silly locking and placement of complete_all() is
>>> dangerous. reinit_completion() lacks the barrier to be used like this
>>> really, at any rate, racy with the irq handler, so use sparingly or
>>> when you control the irq handler.
>>
>> Sorry I forgot to add a comment that guc_cancel_log_flush_work_sync()
>> should be invoked only after ensuring that there will be no more flush
>> interrupts, which will happen either by explicitly disabling the
>> interrupt or disabling the logging and that's what is done at the 2
>> call sites.
>>
>> Since had covered reinit_completion() under the irq_lock, thought an
>> explicit barrier is not needed.
>
> You hadn't controlled everything via the irq_lock, and nor should you.
>
>>     spin_lock_irq(&dev_priv->irq_lock);
>>     if (guc->log.flush_signal) {
>>         guc->log.flush_signal = false;
>>         reinit_completion(&guc->log.flush_completion);
>>         spin_unlock_irq(&dev_priv->irq_lock);
>>         i915_guc_capture_logs(&dev_priv->drm);
>>         complete_all(&guc->log.flush_completion);
>>
>> The placement of complete_all isn't right for the case, where a
>> guc_cancel_log_flush_work_sync() is called but there was no prior flush
>> interrupt received.
>
> Exactly.
>
>>> (Also not sure if log.signal = 0 is sane,
>>
>> Did log.signal = 0 for fast cancellation. Will remove that.
>>
>> A smp_wmb() after reinit_completion(&flush_completion) would be fine ?
>
> Don't worry, the race can only be controlled by controlling the irq. In
> the end, I think something more like
>
>     while (signal) ...
>
>     complete_all();
>     schedule();
>     reinit_completion();
>
> is the simplest.

Thanks much, so will have the task body like this.

    do {
        set_current_state(TASK_INT);
        while (cmpxchg(&signal, 1, 0)) {
            i915_guc_capture_logs();
        };
        complete_all(log.complete);
        if (kthread_should_stop())
            break;
        schedule();
        reinit_completion();
    } while(1);

>>> or the current callsites really require the flush.)
>>
>> Sync against a ongoing/pending flush is being done for the 2 forceful
>> flush cases, which will be effective only if the pending flush is
>> completed, so forceful flush should be serialized with a pending flush.
>
> Or you just signal=true, wakeup task, wait_timeout. Otherwise you haven't
> really serialized anything without disabling the interrupt.

Agree without disabling the interrupt, serialization cannot be provided,
For the sync can use,

    {
        WARN_ON(guc->interrupts_enabled);
        wait_for_completion_interruptible_timeout(
            guc->log.complete, 5 /* in jiffies*/);
    }

Best regards
Akash
> -Chris
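The signal-draining loop agreed on in this exchange can be illustrated in user space. This is only an analogy under stated assumptions: C11 atomics stand in for the kernel's cmpxchg() and smp_store_mb(), a plain flag stands in for complete_all(), and every name below (`log_flush_state`, `raise_flush_signal`, `worker_pass`) is hypothetical rather than taken from the driver. It shows the two properties under discussion: back-to-back signals collapse into one capture, and completion is reported even when no flush was pending.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's log flush state. */
struct log_flush_state {
    atomic_int signal;  /* set to 1 by the simulated irq handler */
    int captures;       /* how many times logs were captured */
    bool completed;     /* stand-in for complete_all() */
};

/* Simulated irq handler: publish a flush request. */
static void raise_flush_signal(struct log_flush_state *s)
{
    atomic_store(&s->signal, 1);
}

/* One pass of the worker body: drain pending signals, then complete. */
static void worker_pass(struct log_flush_state *s)
{
    int expected = 1;

    /* Same spirit as: while (cmpxchg(&signal, 1, 0)) capture(); */
    while (atomic_compare_exchange_strong(&s->signal, &expected, 0)) {
        s->captures++;  /* stand-in for i915_guc_capture_logs() */
        expected = 1;
    }

    /* Signalled unconditionally, so a waiter never blocks forever. */
    s->completed = true;
}
```

In the kernel version the worker would then sleep in schedule() and reinitialise the completion on wake-up, which this single-threaded sketch does not model.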
Re: [Intel-gfx] [PATCH 17/17] drm/i915: Use rt priority kthread to do GuC log buffer sampling
On 7/21/2016 1:04 AM, Chris Wilson wrote:
> On Sun, Jul 10, 2016 at 07:11:24PM +0530, akash.g...@intel.com wrote:
>> @@ -1707,8 +1692,8 @@ static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir)
>>                  I915_READ(SOFT_SCRATCH(15)) & ~msg);
>>
>>          /* Handle flush interrupt event in bottom half */
>> -        queue_work(dev_priv->guc.log.wq,
>> -                &dev_priv->guc.events_work);
>> +        smp_store_mb(dev_priv->guc.log.flush_signal, 1);
>> +        wake_up_process(dev_priv->guc.log.flush_task);
>>      }
>
>> +void guc_cancel_log_flush_work_sync(struct drm_i915_private *dev_priv)
>> +{
>> +    spin_lock_irq(&dev_priv->irq_lock);
>> +    dev_priv->guc.log.flush_signal = false;
>> +    spin_unlock_irq(&dev_priv->irq_lock);
>> +
>> +    if (dev_priv->guc.log.flush_task)
>> +        wait_for_completion(&dev_priv->guc.log.flush_completion);
>> +}
>> +
>> +static int guc_log_flush_worker(void *arg)
>> +{
>> +    struct drm_i915_private *dev_priv = arg;
>> +    struct intel_guc *guc = &dev_priv->guc;
>> +
>> +    /* Install ourselves with high priority to reduce signalling latency */
>> +    struct sched_param param = { .sched_priority = 1 };
>> +    sched_setscheduler_nocheck(current, SCHED_FIFO, &param);
>> +
>> +    do {
>> +        set_current_state(TASK_INTERRUPTIBLE);
>> +
>> +        spin_lock_irq(&dev_priv->irq_lock);
>> +        if (guc->log.flush_signal) {
>> +            guc->log.flush_signal = false;
>> +            reinit_completion(&guc->log.flush_completion);
>> +            spin_unlock_irq(&dev_priv->irq_lock);
>> +            i915_guc_capture_logs(&dev_priv->drm);
>> +            complete_all(&guc->log.flush_completion);
>> +        } else {
>> +            spin_unlock_irq(&dev_priv->irq_lock);
>> +            if (kthread_should_stop())
>> +                break;
>> +
>> +            schedule();
>> +        }
>> +    } while (1);
>> +    __set_current_state(TASK_RUNNING);
>> +
>> +    return 0;
>
> This looks decidedly fishy.

Sorry for that.

> irq handler:
>     smp_store_mb(log.signal, 1);
>     wake_up_process(log.tsk);
>
> worker:
>     do {
>         set_current_state(TASK_INT);
>         while (cmpxchg(&signal, 1, 0)) {
>             reinit_completion(log.complete);
>             i915_guc_capture_logs();
>         }
>         complete_all(log.complete);
>         if (kthread_should_stop())
>             break;
>         schedule();
>     } while(1);
>     __set_current_state(TASK_RUNNING);
>
> flush:
>     smp_store_mb(log.signal, 0);
>     wait_for_completion(log.complete);
>
> In the end, just the silly locking and placement of complete_all() is
> dangerous. reinit_completion() lacks the barrier to be used like this
> really, at any rate, racy with the irq handler, so use sparingly or when
> you control the irq handler.

Sorry I forgot to add a comment that guc_cancel_log_flush_work_sync()
should be invoked only after ensuring that there will be no more flush
interrupts, which will happen either by explicitly disabling the interrupt
or disabling the logging and that's what is done at the 2 call sites.

Since had covered reinit_completion() under the irq_lock, thought an
explicit barrier is not needed.

    spin_lock_irq(&dev_priv->irq_lock);
    if (guc->log.flush_signal) {
        guc->log.flush_signal = false;
        reinit_completion(&guc->log.flush_completion);
        spin_unlock_irq(&dev_priv->irq_lock);
        i915_guc_capture_logs(&dev_priv->drm);
        complete_all(&guc->log.flush_completion);

The placement of complete_all isn't right for the case, where a
guc_cancel_log_flush_work_sync() is called but there was no prior flush
interrupt received.

> (Also not sure if log.signal = 0 is sane,

Did log.signal = 0 for fast cancellation. Will remove that.

A smp_wmb() after reinit_completion(&flush_completion) would be fine ?

> or the current callsites really require the flush.)

Sync against a ongoing/pending flush is being done for the 2 forceful
flush cases, which will be effective only if the pending flush is
completed, so forceful flush should be serialized with a pending flush.

Best regards
Akash
> -Chris
Re: [Intel-gfx] [PATCH 09/17] drm/i915: Debugfs support for GuC logging control
On 7/20/2016 5:20 PM, Tvrtko Ursulin wrote: On 20/07/16 12:29, Goel, Akash wrote: On 7/20/2016 4:10 PM, Tvrtko Ursulin wrote: On 20/07/16 11:12, Goel, Akash wrote: On 7/20/2016 3:17 PM, Tvrtko Ursulin wrote: +DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops, +NULL, i915_guc_log_control_set, +"0x%08llx\n"); Does the readback still work with no get method? readback will give a 'Permission denied' error Is that what we want? I think it would be nice to allow read-back unless there is a specific reason why it shouldn't be allowed. Ok can implement a dummy read back function but what should be shown/returned on read. Should I show/return the guc_log_level value (which is also available from /sys/module/i915/parameters/) ? I would return the same value that was written in. Is the problem that it is not stored anywhere? Maybe reconstruct it from i915.guc_log_level ? The verbosity value will be same as guc_log_level. But whether logging on GuC side is currently enabled or disabled can't be inferred (it could have been disabled at run time). So will have to store the exact value written by User. That's what I meant. Code currently seem to decompose the value written via debugfs and store it in i915.guc_log_level: 0x00 = -1 0x10 = -1 ... 0x01 = 0 0x11 = 1 0x21 = 2 0x31 = 3 ... So for readback you could translate back from i915.guc_log_level to the debugfs format. Sorry for all the mess. Should I add a new field 'debugfs_ctrl_val' in guc structure, to store the value previously written to debugfs file, considering guc_log_level only gives an indication of the verbosity level ? Actually in future there may be other additions also to the value written to guc_log_control debugfs, have right now exposed only logging & verbosity level controls to User, as they are deemed most useful right now. But there are some other controls also which can be passed to GuC firmware through UK_LOG_ENABLE_LOGGING host2guc action. I see. 
Would it work, for time being at least, to set i915.guc_log_level to -1 when logging is disabled via debugfs? Actually had thought about this, but didn't pursue since on doing so will have to adjust some of the guc_log_level related asserts/ conditions. Will do it now as currently this looks to be the best alternative. Thanks much for the inputs. Best regards Akash It think that also has the advantage of making the current guc logging state consistent when observed from the outside. Otherwise the debugfs value and module parameter may disagree on it, as you said before. Which is not that great I think. Apart from making the reported stated consistent, that way you could, at least for the time being, get away without storing a copy of guc_log_control but reconstruct it from the module parameter on read-back. Regards, Tvrtko You could avoid storing a copy of guc_log_control like that. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
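The encoding discussed in this thread (bit 0 enables logging, bits 4-7 carry the verbosity, and a disabled state maps to a log level of -1) can be written down as a pair of conversion helpers. This is a sketch only: the macro and function names are hypothetical, not the driver's, and it assumes the mapping quoted above (0x00 and 0x10 give -1, 0x01 gives 0, 0x11 gives 1, and so on). The second helper shows how a read-back value could be reconstructed from the level alone, as suggested.

```c
#include <stdint.h>

/* Hypothetical names for the debugfs control word layout. */
#define LOG_CTRL_ENABLE      0x1u  /* bit 0: logging enabled */
#define LOG_CTRL_VERB_SHIFT  4     /* bits 4-7: verbosity level */
#define LOG_CTRL_VERB_MASK   0xfu

/* Decompose a value written to debugfs into a guc_log_level-style int. */
static int log_control_to_level(uint64_t val)
{
    if (!(val & LOG_CTRL_ENABLE))
        return -1;  /* logging disabled, verbosity bits ignored */

    return (int)((val >> LOG_CTRL_VERB_SHIFT) & LOG_CTRL_VERB_MASK);
}

/* Rebuild the debugfs value from the level, e.g. for read-back. */
static uint64_t log_level_to_control(int level)
{
    if (level < 0)
        return 0;   /* disabled */

    return ((uint64_t)((unsigned int)level & LOG_CTRL_VERB_MASK)
            << LOG_CTRL_VERB_SHIFT) | LOG_CTRL_ENABLE;
}
```

Note that the round trip loses the verbosity bits of a disabled value (0x10 reads back as 0x00), which is the price of not storing the exact value written by the user.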
Re: [Intel-gfx] [PATCH 09/17] drm/i915: Debugfs support for GuC logging control
On 7/20/2016 4:10 PM, Tvrtko Ursulin wrote: On 20/07/16 11:12, Goel, Akash wrote: On 7/20/2016 3:17 PM, Tvrtko Ursulin wrote: +ret = -EINVAL; +goto end; +} + +intel_runtime_pm_get(dev_priv); +ret = i915_guc_log_control(dev, val); +intel_runtime_pm_put(dev_priv); + +end: +mutex_unlock(>struct_mutex); +return ret; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops, +NULL, i915_guc_log_control_set, +"0x%08llx\n"); Does the readback still work with no get method? readback will give a 'Permission denied' error Is that what we want? I think it would be nice to allow read-back unless there is a specific reason why it shouldn't be allowed. Ok can implement a dummy read back function but what should be shown/returned on read. Should I show/return the guc_log_level value (which is also available from /sys/module/i915/parameters/) ? I would return the same value that was written in. Is the problem that it is not stored anywhere? Maybe reconstruct it from i915.guc_log_level ? The verbosity value will be same as guc_log_level. But whether logging on GuC side is currently enabled or disabled can't be inferred (it could have been disabled at run time). So will have to store the exact value written by User. That's what I meant. Code currently seem to decompose the value written via debugfs and store it in i915.guc_log_level: 0x00 = -1 0x10 = -1 ... 0x01 = 0 0x11 = 1 0x21 = 2 0x31 = 3 ... So for readback you could translate back from i915.guc_log_level to the debugfs format. Sorry for all the mess. Should I add a new field 'debugfs_ctrl_val' in guc structure, to store the value previously written to debugfs file, considering guc_log_level only gives an indication of the verbosity level ? Actually in future there may be other additions also to the value written to guc_log_control debugfs, have right now exposed only logging & verbosity level controls to User, as they are deemed most useful right now. 
But there are some other controls also which can be passed to GuC firmware through UK_LOG_ENABLE_LOGGING host2guc action. Best regards Akash Although I have suggested below even more... Although it is not ideal that we got two formats for the same thing. Thinking about that, why not use the same format in debugfs as for the module param? ... that why do we have to have two formats? Isn't that a bit confusing? Why couldn't we use the same integer values from i915.guc_log_level for debugfs control ? And I forgot, i915.guc_log_level == 0 is logging enabled with minimum verbosity? i915.guc_log_level == 0 just indicates the minimum verbosity. But logging could still be disabled on GuC side. Yes, I can't remember any precedent where zero means enabled so it is just weird. But it is too late to change it now. :( For example, Driver boots with 'i915.guc_log_level = 0' so logging is enabled, later User disables the logging by echoing 0x0 on the guc_log_control debugfs file. That's fine Regards, Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 09/17] drm/i915: Debugfs support for GuC logging control
On 7/20/2016 3:17 PM, Tvrtko Ursulin wrote: On 20/07/16 10:32, Goel, Akash wrote: On 7/20/2016 2:38 PM, Tvrtko Ursulin wrote: On 20/07/16 05:42, Goel, Akash wrote: On 7/19/2016 4:54 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Sagar Arun Kamble <sagar.a.kam...@intel.com> This patch provides debugfs interface i915_guc_output_control for on the fly enabling/disabling of logging in GuC firmware and controlling the verbosity level of logs. The value written to the file, should have bit 0 set to enable logging and bits 4-7 should contain the verbosity info. v2: Add a forceful flush, to collect left over logs, on disabling logging. Useful for Validation. Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com> Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_debugfs.c| 32 - drivers/gpu/drm/i915/i915_guc_submission.c | 57 ++ drivers/gpu/drm/i915/intel_guc.h | 1 + 3 files changed, 89 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 5e35565..3c9c7f7 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2644,6 +2644,35 @@ static int i915_guc_log_dump(struct seq_file *m, void *data) return 0; } +static int +i915_guc_log_control_set(void *data, u64 val) +{ +struct drm_device *dev = data; +struct drm_i915_private *dev_priv = dev->dev_private; to_i915 should be used. Sorry for missing this, need to use this at other places also. +int ret; + +ret = mutex_lock_interruptible(>struct_mutex); +if (ret) +return ret; + +if (!i915.enable_guc_submission || !dev_priv->guc.log.obj) { Wouldn't guc.log.obj be enough? Actually failure in allocation of log buffer, at boot time, is not considered fatal and submission through GuC is still done. So i915.enable_guc_submission could be 1 with guc.log.obj as NULL. If guc.log.obj is NULL it will return -EINVAL without trying to create it here. 
If you intended for this function to try and create the log object if not already present, via i915_guc_log_control, in that case the condition above should only be if (!i915.enable_guc_submission), no? If guc.log.obj is found to be NULL, we consider logging can't be enabled at run time. Allocation of log buffer is supposed to be done at boot time only, otherwise GuC would have to be reset & firmware to be reloaded to pass the log buffer address at run time, which is probably not desirable. That's why the first patch decoupled the allocation of log buffer from log_level value. Okay so why then the check above shouldn't just be; if (!dev_priv->guc.log.obj) as I originally suggested? Right, so sorry got confused, I misread & interpreted that you are suggesting to have !i915.enable_guc_submission check instead. (!dev_priv->guc.log.obj) check should suffice. +ret = -EINVAL; +goto end; +} + +intel_runtime_pm_get(dev_priv); +ret = i915_guc_log_control(dev, val); +intel_runtime_pm_put(dev_priv); + +end: +mutex_unlock(&dev->struct_mutex); +return ret; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops, +NULL, i915_guc_log_control_set, +"0x%08llx\n"); Does the readback still work with no get method? readback will give a 'Permission denied' error Is that what we want? I think it would be nice to allow read-back unless there is a specific reason why it shouldn't be allowed. Ok can implement a dummy read back function but what should be shown/returned on read. Should I show/return the guc_log_level value (which is also available from /sys/module/i915/parameters/) ? I would return the same value that was written in. Is the problem that it is not stored anywhere? Maybe reconstruct it from i915.guc_log_level ? The verbosity value will be same as guc_log_level. But whether logging on GuC side is currently enabled or disabled can't be inferred (it could have been disabled at run time). So will have to store the exact value written by User.
Although it is not ideal that we got two formats for the same thing. Thinking about that, why not use the same format in debugfs as for the module param? And I forgot, i915.guc_log_level == 0 is logging enabled with minimum verbosity? i915.guc_log_level == 0 just indicates the minimum verbosity. But logging could still be disabled on GuC side. For example, Driver boots with 'i915.guc_log_level = 0' so logging is enabled, later User disables the logging by echoing 0x0 on the guc_log_control debugfs file. Best regards Akash Is it too late to change that? :) Regards, Tvrtko
Re: [Intel-gfx] [PATCH 08/17] drm/i915: Forcefully flush GuC log buffer on reset
On 7/20/2016 2:42 PM, Chris Wilson wrote: On Wed, Jul 20, 2016 at 09:51:45AM +0530, Goel, Akash wrote: On 7/19/2016 4:51 PM, Chris Wilson wrote: On Tue, Jul 19, 2016 at 12:12:20PM +0100, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Sagar Arun Kamble <sagar.a.kam...@intel.com> If GuC logs are being captured, there should be a force log buffer flush action sent to GuC before proceeding with GPU reset and re-initializing GUC. Those logs would be useful to understand why the GPU reset was initiated. v2: Rebase. Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com> Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_guc_submission.c | 32 ++ drivers/gpu/drm/i915/i915_irq.c| 2 ++ drivers/gpu/drm/i915/intel_guc.h | 1 + 3 files changed, 35 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 9b436fa..8cc31c6 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -183,6 +183,16 @@ static int host2guc_logbuffer_flush_complete(struct intel_guc *guc) return host2guc_action(guc, data, 1); } +static int host2guc_force_logbuffer_flush(struct intel_guc *guc) +{ + u32 data[2]; + + data[0] = HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH; + data[1] = 0; + + return host2guc_action(guc, data, 2); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -1404,6 +1414,28 @@ void i915_guc_capture_logs(struct drm_device *dev) intel_runtime_pm_put(dev_priv); } +void i915_guc_capture_logs_on_reset(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + + mutex_lock(>struct_mutex); Not sure what are the repercussion of taking the mutex on the i915_reset_and_wakeup and path (error capture, hangcheck, dont' know this area well). Check with Chris and Mika I suppose (cc-ed)? 
Took the struct_mutex, just to avoid a very remote possibility where i915_guc_capture_logs_on_reset & debugfs function i915_guc_log_control executes concurrently. Flat out invalid to take struct_mutex on the error capture path, or any lock at all really (just in case of driver bugs). Consider it to be an atomic context that may preempt the driver at any point. Actually I see that i915_reset() too takes the struct_mutex right at the beginning and I have plugged the call to i915_guc_capture_logs_on_reset() just before that. Postmortem state is captured from i915_capture_error_state(), and as I recall one of the raison d'etre for this facility was to include the guc log in the error state. Sorry I missed augmenting the error state with guc firmware logs. For that also a prior flush will be needed, will do the flush without acquiring the struct_mutex. Best regards Akash -Chris
Re: [Intel-gfx] [PATCH 09/17] drm/i915: Debugfs support for GuC logging control
On 7/20/2016 2:38 PM, Tvrtko Ursulin wrote: On 20/07/16 05:42, Goel, Akash wrote: On 7/19/2016 4:54 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Sagar Arun Kamble <sagar.a.kam...@intel.com> This patch provides debugfs interface i915_guc_output_control for on the fly enabling/disabling of logging in GuC firmware and controlling the verbosity level of logs. The value written to the file, should have bit 0 set to enable logging and bits 4-7 should contain the verbosity info. v2: Add a forceful flush, to collect left over logs, on disabling logging. Useful for Validation. Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com> Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_debugfs.c| 32 - drivers/gpu/drm/i915/i915_guc_submission.c | 57 ++ drivers/gpu/drm/i915/intel_guc.h | 1 + 3 files changed, 89 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 5e35565..3c9c7f7 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2644,6 +2644,35 @@ static int i915_guc_log_dump(struct seq_file *m, void *data) return 0; } +static int +i915_guc_log_control_set(void *data, u64 val) +{ +struct drm_device *dev = data; +struct drm_i915_private *dev_priv = dev->dev_private; to_i915 should be used. Sorry for missing this, need to use this at other places also. +int ret; + +ret = mutex_lock_interruptible(>struct_mutex); +if (ret) +return ret; + +if (!i915.enable_guc_submission || !dev_priv->guc.log.obj) { Wouldn't guc.log.obj be enough? Actually failure in allocation of log buffer, at boot time, is not considered fatal and submission through GuC is still done. So i915.enable_guc_submission could be 1 with guc.log.obj as NULL. If guc.log.obj is NULL it will return -EINVAL without trying to create it here. 
If you intended for this function to try and create the log object if not already present, via i915_guc_log_control, in that case the condition above should only be if (!i915.enable_guc_submission), no? If guc.log.obj is found to be NULL, we consider logging can't be enabled at run time. Allocation of log buffer is supposed to be done at boot time only, otherwise GuC would have to be reset & firmware to be reloaded to pass the log buffer address at run time, which is probably not desirable. That's why the first patch decoupled the allocation of log buffer from log_level value. +ret = -EINVAL; +goto end; +} + +intel_runtime_pm_get(dev_priv); +ret = i915_guc_log_control(dev, val); +intel_runtime_pm_put(dev_priv); + +end: +mutex_unlock(&dev->struct_mutex); +return ret; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops, +NULL, i915_guc_log_control_set, +"0x%08llx\n"); Does the readback still work with no get method? readback will give a 'Permission denied' error Is that what we want? I think it would be nice to allow read-back unless there is a specific reason why it shouldn't be allowed. Ok can implement a dummy read back function but what should be shown/returned on read. Should I show/return the guc_log_level value (which is also available from /sys/module/i915/parameters/) ?
+ static int i915_edp_psr_status(struct seq_file *m, void *data) { struct drm_info_node *node = m->private; @@ -5464,7 +5493,8 @@ static const struct i915_debugfs_files { {"i915_fbc_false_color", &i915_fbc_fc_fops}, {"i915_dp_test_data", &i915_displayport_test_data_fops}, {"i915_dp_test_type", &i915_displayport_test_type_fops}, -{"i915_dp_test_active", &i915_displayport_test_active_fops} +{"i915_dp_test_active", &i915_displayport_test_active_fops}, +{"i915_guc_log_control", &i915_guc_log_control_fops} }; void intel_display_crc_init(struct drm_device *dev) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 8cc31c6..2e3b723 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -193,6 +193,16 @@ static int host2guc_force_logbuffer_flush(struct intel_guc *guc) return host2guc_action(guc, data, 2); } +static int host2guc_logging_control(struct intel_guc *guc, u32 control_val) +{ +u32 data[2]; + +data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING; +data[1] = control_val; + +return host2guc_action(guc, data, 2); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -1455,3 +1465,50 @@ void i915_guc_register(struct drm_device *dev) guc_log_late_setup(dev); mutex_unlock(&dev->struct_mutex); } + +int i915_guc_log_control(struct drm_device *dev, uint64_t control_val) +{
Re: [Intel-gfx] [PATCH 09/17] drm/i915: Debugfs support for GuC logging control
On 7/19/2016 4:54 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Sagar Arun KambleThis patch provides debugfs interface i915_guc_output_control for on the fly enabling/disabling of logging in GuC firmware and controlling the verbosity level of logs. The value written to the file, should have bit 0 set to enable logging and bits 4-7 should contain the verbosity info. v2: Add a forceful flush, to collect left over logs, on disabling logging. Useful for Validation. Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_debugfs.c| 32 - drivers/gpu/drm/i915/i915_guc_submission.c | 57 ++ drivers/gpu/drm/i915/intel_guc.h | 1 + 3 files changed, 89 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 5e35565..3c9c7f7 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2644,6 +2644,35 @@ static int i915_guc_log_dump(struct seq_file *m, void *data) return 0; } +static int +i915_guc_log_control_set(void *data, u64 val) +{ +struct drm_device *dev = data; +struct drm_i915_private *dev_priv = dev->dev_private; to_i915 should be used. Sorry for missing this, need to use this at other places also. +int ret; + +ret = mutex_lock_interruptible(>struct_mutex); +if (ret) +return ret; + +if (!i915.enable_guc_submission || !dev_priv->guc.log.obj) { Wouldn't guc.log.obj be enough? Actually failure in allocation of log buffer, at boot time, is not considered fatal and submission through GuC is still done. So i915.enable_guc_submission could be 1 with guc.log.obj as NULL. +ret = -EINVAL; +goto end; +} + +intel_runtime_pm_get(dev_priv); +ret = i915_guc_log_control(dev, val); +intel_runtime_pm_put(dev_priv); + +end: +mutex_unlock(>struct_mutex); +return ret; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops, +NULL, i915_guc_log_control_set, +"0x%08llx\n"); Does the readback still work with no get method? 
readback will give a 'Permission denied' error + static int i915_edp_psr_status(struct seq_file *m, void *data) { struct drm_info_node *node = m->private; @@ -5464,7 +5493,8 @@ static const struct i915_debugfs_files { {"i915_fbc_false_color", &i915_fbc_fc_fops}, {"i915_dp_test_data", &i915_displayport_test_data_fops}, {"i915_dp_test_type", &i915_displayport_test_type_fops}, -{"i915_dp_test_active", &i915_displayport_test_active_fops} +{"i915_dp_test_active", &i915_displayport_test_active_fops}, +{"i915_guc_log_control", &i915_guc_log_control_fops} }; void intel_display_crc_init(struct drm_device *dev) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 8cc31c6..2e3b723 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -193,6 +193,16 @@ static int host2guc_force_logbuffer_flush(struct intel_guc *guc) return host2guc_action(guc, data, 2); } +static int host2guc_logging_control(struct intel_guc *guc, u32 control_val) +{ +u32 data[2]; + +data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING; +data[1] = control_val; + +return host2guc_action(guc, data, 2); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -1455,3 +1465,50 @@ void i915_guc_register(struct drm_device *dev) guc_log_late_setup(dev); mutex_unlock(&dev->struct_mutex); } + +int i915_guc_log_control(struct drm_device *dev, uint64_t control_val) +{ +struct drm_i915_private *dev_priv = dev->dev_private; to_i915 Actually, function should take dev_priv if not even guc depending on the established convention in the file. Ok for all the new logging related exported functions, will use dev_priv.
+union guc_log_control log_param; +int ret; + +log_param.logging_enabled = control_val & 0x1; +log_param.verbosity = (control_val >> 4) & 0xF; + +if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN || +log_param.verbosity > GUC_LOG_VERBOSITY_MAX) +return -EINVAL; + +/* This combination doesn't make sense & won't have any effect */ +if (!log_param.logging_enabled && (i915.guc_log_level < 0)) +return -EINVAL; Hm, disabling while already disabled - why should that return an error? Might be annoying in scripts. Just to make the User aware. Ok will suppress this and return 0. + +ret = host2guc_logging_control(&dev_priv->guc, log_param.value); +if (ret < 0) { +DRM_DEBUG_DRIVER("host2guc action failed\n"); Add ret to the log since it is easy? fine will do that. +return ret; +} + +i915.guc_log_level = log_param.verbosity; + +/* If log_level was
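A rough sketch of how the validation might look once the review comment above is applied, so that disabling logging while it is already disabled returns 0 instead of -EINVAL. The struct, the function name and the verbosity bounds are assumptions for illustration; the bounds are taken from the "0-3:enabled" wording of the guc_log_level module parameter, not from the definitions of GUC_LOG_VERBOSITY_MIN/MAX.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed bounds, from the "0-3:enabled" module parameter description. */
#define SKETCH_VERBOSITY_MIN 0
#define SKETCH_VERBOSITY_MAX 3

/* Hypothetical stand-in for the decoded control value. */
struct sketch_log_param {
	int logging_enabled;
	int verbosity;
};

/*
 * Returns 0 if the value is acceptable, -EINVAL if the verbosity is out of
 * range. On success *send_to_guc is 0 when the request is a no-op
 * (disable while already disabled) and 1 otherwise.
 */
static int sketch_parse_log_control(uint64_t control_val, int cur_log_level,
				    struct sketch_log_param *out,
				    int *send_to_guc)
{
	assert(out && send_to_guc);

	out->logging_enabled = (int)(control_val & 0x1);
	out->verbosity = (int)((control_val >> 4) & 0xF);

	if (out->verbosity < SKETCH_VERBOSITY_MIN ||
	    out->verbosity > SKETCH_VERBOSITY_MAX)
		return -EINVAL;

	/* Disabling while already disabled: nothing to do, but not an error. */
	*send_to_guc = !(!out->logging_enabled && cur_log_level < 0);
	return 0;
}
```

Returning 0 for the no-op case keeps scripts that unconditionally write 0x0 from failing, which was the concern raised in the review.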
Re: [Intel-gfx] [PATCH 08/17] drm/i915: Forcefully flush GuC log buffer on reset
On 7/19/2016 4:51 PM, Chris Wilson wrote: On Tue, Jul 19, 2016 at 12:12:20PM +0100, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Sagar Arun KambleIf GuC logs are being captured, there should be a force log buffer flush action sent to GuC before proceeding with GPU reset and re-initializing GUC. Those logs would be useful to understand why the GPU reset was initiated. v2: Rebase. Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 32 ++ drivers/gpu/drm/i915/i915_irq.c| 2 ++ drivers/gpu/drm/i915/intel_guc.h | 1 + 3 files changed, 35 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 9b436fa..8cc31c6 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -183,6 +183,16 @@ static int host2guc_logbuffer_flush_complete(struct intel_guc *guc) return host2guc_action(guc, data, 1); } +static int host2guc_force_logbuffer_flush(struct intel_guc *guc) +{ + u32 data[2]; + + data[0] = HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH; + data[1] = 0; + + return host2guc_action(guc, data, 2); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -1404,6 +1414,28 @@ void i915_guc_capture_logs(struct drm_device *dev) intel_runtime_pm_put(dev_priv); } +void i915_guc_capture_logs_on_reset(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + + mutex_lock(>struct_mutex); Not sure what are the repercussion of taking the mutex on the i915_reset_and_wakeup and path (error capture, hangcheck, dont' know this area well). Check with Chris and Mika I suppose (cc-ed)? Took the struct_mutex, just to avoid a very remote possibility where i915_guc_capture_logs_on_reset & debugfs function i915_guc_log_control executes concurrently. 
Flat out invalid to take struct_mutex on the error capture path, or any lock at all really (just in case of driver bugs). Consider it to be an atomic context that may preempt the driver at any point. Actually I see that i915_reset() too takes the struct_mutex right at the beginning and I have plugged the call to i915_guc_capture_logs_on_reset() just before that. Also it is being called after i915_error_wake_up(), so any client waiting on a request would have backed off and any new attempt by clients to lock the struct_mutex should see i915_reset_in_progress as true. Best regards Akash -Chris
Re: [Intel-gfx] [PATCH 07/17] drm/i915: Add a relay backed debugfs interface for capturing GuC logs
On 7/19/2016 5:01 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Akash GoelAdded a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the User to capture GuC firmware logs. Availed relay framework to implement the interface, where Driver will have to just use a relay API to store snapshots of the GuC log buffer in the buffer managed by relay. The snapshot will be taken when GuC firmware sends a log buffer flush interrupt and up to four snaphots could be stored in the relay buffer. The relay buffer will be operated in a mode where it will overwrite the data not yet collected by User. Besides mmap method, through which User can directly access the relay buffer contents, relay also supports the 'poll' method. Through the 'poll' call on log file, User can come to know whenever a new snapshot of the log buffer is taken by Driver, so can run in tandem with the Driver and capture the logs in a sustained/streaming manner, without any loss of data. v2: Defer the creation of relay channel & associated debugfs file, as debugfs setup is now done at the end of i915 Driver load. (Chris) v3: - Switch to no-overwrite mode for relay. - Fix the relay sub buffer switching sequence. 
Suggested-by: Chris Wilson Signed-off-by: Sourab Gupta Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.c| 2 + drivers/gpu/drm/i915/i915_guc_submission.c | 197 - drivers/gpu/drm/i915/intel_guc.h | 3 + 3 files changed, 199 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 25c6b9b..43c9900 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1177,6 +1177,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv) /* Reveal our presence to userspace */ if (drm_dev_register(dev, 0) == 0) { i915_debugfs_register(dev_priv); +i915_guc_register(dev); i915_setup_sysfs(dev); } else DRM_ERROR("Failed to register driver for userspace access!\n"); @@ -1215,6 +1216,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv) intel_opregion_unregister(dev_priv); i915_teardown_sysfs(_priv->drm); +i915_guc_unregister(_priv->drm); i915_debugfs_unregister(dev_priv); drm_dev_unregister(_priv->drm); diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index d3dbb8e..9b436fa 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -23,6 +23,8 @@ */ #include #include +#include +#include #include "i915_drv.h" #include "intel_guc.h" @@ -836,12 +838,33 @@ err: static void guc_move_to_next_buf(struct intel_guc *guc) { -return; +/* Make sure our updates are in the sub buffer are visible when + * Consumer sees a newly produced sub buffer. + */ +smp_wmb(); + +/* All data has been written, so now move the offset of sub buffer. */ +relay_reserve(guc->log.relay_chan, guc->log.obj->base.size); + +/* Switch to the next sub buffer */ +relay_flush(guc->log.relay_chan); } static void* guc_get_write_buffer(struct intel_guc *guc) { -return NULL; +/* FIXME: Cover the check under a lock ? 
*/ +if (!guc->log.relay_chan) +return NULL; + +/* Just get the base address of a new sub buffer and copy data into it + * ourselves. NULL will be returned in no-overwrite mode, if all sub + * buffers are full. Could have used the relay_write() to indirectly + * copy the data, but that would have been bit convoluted, as we need to + * write to only certain locations inside a sub buffer which cannot be + * done without using relay_reserve() along with relay_write(). So its + * better to use relay_reserve() alone. + */ +return relay_reserve(guc->log.relay_chan, 0); } static void guc_read_update_log_buffer(struct drm_device *dev) @@ -906,6 +929,119 @@ static void guc_read_update_log_buffer(struct drm_device *dev) guc_move_to_next_buf(guc); } +/* + * Sub buffer switch callback. Called whenever relay has to switch to a new + * sub buffer, relay stays on the same sub buffer if 0 is returned. + */ +static int subbuf_start_callback(struct rchan_buf *buf, + void *subbuf, + void *prev_subbuf, + size_t prev_padding) +{ +/* Use no-overwrite mode by default, where relay will stop accepting + * new data if there are no empty sub buffers left. + * There is no strict synchronization enforced by relay between Consumer + * and Producer. In overwrite mode, there is a possibility of getting + * inconsistent/garbled data, the producer could be writing on to the + * same sub buffer from which
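The no-overwrite behaviour described in the patch comments can be modelled with a toy channel, shown below as a sketch only. It is not the relay API: the struct and function names are hypothetical, and it captures just the rule that a producer gets no sub buffer (NULL in the patch, -1 here) once all sub buffers are full and none has been consumed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a channel with n sub buffers in no-overwrite mode. */
struct sketch_chan {
	size_t n_subbufs;
	size_t produced;	/* sub buffers filled by the producer */
	size_t consumed;	/* sub buffers released by the consumer */
};

/* Returns the index of the sub buffer to fill, or -1 if all are full. */
static long sketch_reserve(struct sketch_chan *c)
{
	assert(c->produced >= c->consumed);

	if (c->produced - c->consumed >= c->n_subbufs)
		return -1;	/* no-overwrite: this snapshot is dropped */

	return (long)(c->produced++ % c->n_subbufs);
}

/* Consumer side: release the oldest filled sub buffer, if any. */
static void sketch_consume(struct sketch_chan *c)
{
	if (c->consumed < c->produced)
		c->consumed++;
}
```

With four sub buffers, a fifth snapshot is refused until the consumer releases one, at which point sub buffer 0 becomes writable again. The producer never touches a sub buffer the consumer may still be reading, which is the garbled-data hazard the patch comment attributes to overwrite mode.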
Re: [Intel-gfx] [PATCH 06/17] drm/i915: Handle log buffer flush interrupt event from GuC
On 7/19/2016 4:28 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Sagar Arun KambleGuC ukernel sends an interrupt to Host to flush the log buffer and expects Host to correspondingly update the read pointer information in the state structure, once it has consumed the log buffer contents by copying them to a file or buffer. Even if Host couldn't copy the contents, it can still update the read pointer so that logging state is not disturbed on GuC side. v2: - Use a dedicated workqueue for handling flush interrupt. (Tvrtko) - Reduce the overall log buffer copying time by skipping the copy of crash buffer area for regular cases and copying only the state structure data in first page. v3: - Create a vmalloc mapping of log buffer. (Chris) - Cover the flush acknowledgment under rpm get & put.(Chris) - Revert the change of skipping the copy of crash dump area, as not really needed, will be covered by subsequent patch. Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.c| 13 +++ drivers/gpu/drm/i915/i915_guc_submission.c | 148 + drivers/gpu/drm/i915/i915_irq.c| 5 +- drivers/gpu/drm/i915/intel_guc.h | 3 + 4 files changed, 167 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index b9a8117..25c6b9b 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -791,8 +791,20 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv) if (dev_priv->hotplug.dp_wq == NULL) goto out_free_wq; +if (HAS_GUC_SCHED(dev_priv)) { +/* Need a dedicated wq to process log buffer flush interrupts + * from GuC without much delay so as to avoid any loss of logs. 
+ */ +dev_priv->guc.log.wq = +alloc_ordered_workqueue("i915-guc_log", 0); +if (dev_priv->guc.log.wq == NULL) +goto out_free_hotplug_dp_wq; +} + return 0; +out_free_hotplug_dp_wq: +destroy_workqueue(dev_priv->hotplug.dp_wq); out_free_wq: destroy_workqueue(dev_priv->wq); out_err: @@ -803,6 +815,7 @@ out_err: static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv) { +destroy_workqueue(dev_priv->guc.log.wq); I am ignoring the wq parts of the patch since the next series may look different in this respect. However you may need to have wq destruction under the same HAS_GUC_SCHED condition as when you create it. Thanks, will do. Sorry, my bad. destroy_workqueue(dev_priv->hotplug.dp_wq); destroy_workqueue(dev_priv->wq); } diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 0bac172..d3dbb8e 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct intel_guc *guc, return host2guc_action(guc, data, ARRAY_SIZE(data)); } +static int host2guc_logbuffer_flush_complete(struct intel_guc *guc) +{ +u32 data[1]; + +data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE; + +return host2guc_action(guc, data, 1); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -825,6 +834,123 @@ err: return NULL; } +static void guc_move_to_next_buf(struct intel_guc *guc) +{ +return; +} + +static void* guc_get_write_buffer(struct intel_guc *guc) +{ +return NULL; +} + +static void guc_read_update_log_buffer(struct drm_device *dev) dev_priv should be passed in for driver internal functions. 
+{ +struct drm_i915_private *dev_priv = dev->dev_private; +struct intel_guc *guc = &dev_priv->guc; +struct guc_log_buffer_state *log_buffer_state, *log_buffer_copy_state; +struct guc_log_buffer_state log_buffer_state_local; +void *src_data_ptr, *dst_data_ptr; +u32 i, buffer_size; + +if (!guc->log.obj || !guc->log.buf_addr) +return; + +log_buffer_state = src_data_ptr = guc->log.buf_addr; + +/* Get the pointer to local buffer to store the logs */ +dst_data_ptr = log_buffer_copy_state = guc_get_write_buffer(guc); This will return NULL so the loop below doesn't do anything much. I assume at this point in the patch series things are not wired up yet? The below loop will still update the state structures, lying in the first page of GuC log buffer. There is no local buffer yet to store the logs. + +/* Actual logs are present from the 2nd page */ +src_data_ptr += PAGE_SIZE; +dst_data_ptr += PAGE_SIZE; + +for (i = 0; i < GUC_MAX_LOG_BUFFER; i++) { +log_buffer_state_local = *log_buffer_state; +buffer_size = log_buffer_state_local.size; + +if (log_buffer_copy_state) { +/* First copy the state structure */ +
Re: [Intel-gfx] [PATCH 10/17] drm/i915: New module param to control the size of buffer used for storing GuC firmware logs
On 7/18/2016 6:36 PM, Tvrtko Ursulin wrote: On 18/07/16 13:19, Goel, Akash wrote: On 7/18/2016 3:36 PM, Tvrtko Ursulin wrote: On 15/07/16 16:36, Goel, Akash wrote: On 7/15/2016 4:45 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Akash Goel <akash.g...@intel.com> On recieving the log buffer flush interrupt from GuC firmware, Driver stores the snapshot of the log buffer in a local buffer, from which Userspace can pull the logs. By default Driver store, up to, 4 snapshots of the log buffer in a local buffer (managed by relay). Added a new module (read only) param, 'guc_log_size', through which User can specify the number of snapshots of log buffer to be stored in local buffer. This can be used to ensure capturing of all boot time logs even with high verbosity level. v2: Rename module param to more apt name 'guc_log_buffer_nr'. (Nikula) Suggested-by: Chris Wilson <ch...@chris-wilson.co.uk> Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_guc_submission.c | 3 +-- drivers/gpu/drm/i915/i915_params.c | 5 + drivers/gpu/drm/i915/i915_params.h | 1 + 3 files changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 2e3b723..009d7c0 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1046,8 +1046,7 @@ static int guc_create_log_relay_file(struct intel_guc *guc) /* Keep the size of sub buffers same as shared log buffer */ subbuf_size = guc->log.obj->base.size; -/* TODO: Decide based on the User's input */ -n_subbufs = 4; +n_subbufs = i915.guc_log_buffer_nr; guc_log_relay_chan = relay_open("guc_log", log_dir, subbuf_size, n_subbufs, _callbacks, dev); diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c index 8b13bfa..d30c972 100644 --- a/drivers/gpu/drm/i915/i915_params.c +++ b/drivers/gpu/drm/i915/i915_params.c @@ -57,6 +57,7 @@ struct 
i915_params i915 __read_mostly = { .enable_guc_loading = -1, .enable_guc_submission = -1, .guc_log_level = -1, +.guc_log_buffer_nr = 4, .enable_dp_mst = true, .inject_load_failure = 0, .enable_dpcd_backlight = false, @@ -214,6 +215,10 @@ module_param_named(guc_log_level, i915.guc_log_level, int, 0400); MODULE_PARM_DESC(guc_log_level, "GuC firmware logging level (-1:disabled (default), 0-3:enabled)"); +module_param_named(guc_log_buffer_nr, i915.guc_log_buffer_nr, int, 0400); +MODULE_PARM_DESC(guc_log_buffer_nr, +"Number of sub buffers to store GuC firmware logs (default: 4)"); + module_param_named_unsafe(enable_dp_mst, i915.enable_dp_mst, bool, 0600); MODULE_PARM_DESC(enable_dp_mst, "Enable multi-stream transport (MST) for new DisplayPort sinks. (default: true)"); diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h index 0ad020b..14ca855 100644 --- a/drivers/gpu/drm/i915/i915_params.h +++ b/drivers/gpu/drm/i915/i915_params.h @@ -48,6 +48,7 @@ struct i915_params { int enable_guc_loading; int enable_guc_submission; int guc_log_level; +int guc_log_buffer_nr; int use_mmio_flip; int mmio_debug; int edp_vswing; I did not figure out after a quick read of Documentation/filesystems/relay.txt whether we really need this to be configurable? If I got it right number of sub-buffers here only has a relation to the userspace relay consumer latency. If the userspace is responsive should just two be enough? Or the existing default of four was shown in practice that it is better and good enough? Yes one of the use of this module parameter is to give User some leeway i.e. more time to collect logs from the relay buffer. User may not be always able to match the rate at which logs are being produced from the GuC side. 2 could be too less. Even 4, when running a benchmark, was proving less and not able to match the Driver rate (this might change after some optimization is done from User space side also, like splice). 
Okay, it makes sense for it to be bigger than four by default then, correct? The other use is to ensure that all boot-time logs are captured, even with the maximum verbosity level. The default number of sub buffers may not always be sufficient to store all the logs from boot, by the time User is ready to capture the logs. Saw about 8 flush interrupts coming from GuC during the boot. How important is it for a default value to capture all activity since boot? I think we need to keep in mind here that the amount of that activity may be a lot different with different setups, so it might not be that interesting after all. Someone will log in via a display manager, which may generate a widely differing amount of GPU activity, until they start the logger. Someone else on the other
Re: [Intel-gfx] [PATCH 15/17] drm/i915: Increase GuC log buffer size to reduce flush interrupts
On 7/18/2016 3:24 PM, Tvrtko Ursulin wrote: On 15/07/16 17:20, Goel, Akash wrote: On 7/15/2016 8:37 PM, Tvrtko Ursulin wrote: On 15/07/16 15:42, Goel, Akash wrote: On 7/15/2016 5:27 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Akash Goel <akash.g...@intel.com> In cases where GuC generate logs at a very high rate, correspondingly the rate of flush interrupts is also very high. So far total 8 pages were allocated for storing both ISR & DPC logs. As per the half-full draining protocol followed by GuC, by doubling the number of pages, the frequency of flush interrupts can be cut down to almost half, which then helps in reducing the logging overhead. So now allocating 8 pages apiece for ISR & DPC logs. Suggested-by: Tvrtko Ursulin <tvrtko.ursu...@intel.com> Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/intel_guc_fwif.h | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h index 1de6928..7521ed5 100644 --- a/drivers/gpu/drm/i915/intel_guc_fwif.h +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h @@ -104,9 +104,9 @@ #define GUC_LOG_ALLOC_IN_MEGABYTE(1 << 3) #define GUC_LOG_CRASH_PAGES1 #define GUC_LOG_CRASH_SHIFT4 -#define GUC_LOG_DPC_PAGES3 +#define GUC_LOG_DPC_PAGES7 #define GUC_LOG_DPC_SHIFT6 -#define GUC_LOG_ISR_PAGES3 +#define GUC_LOG_ISR_PAGES7 #define GUC_LOG_ISR_SHIFT9 #define GUC_LOG_BUF_ADDR_SHIFT12 @@ -436,9 +436,9 @@ enum guc_log_buffer_type { *| Crash dump state header | * Page1 +---+ *| ISR logs| - * Page5 +---+ - *| DPC logs| * Page9 +---+ + *| DPC logs| + * Page17 +---+ *| Crash Dump logs | *+---+ * I don't mind - but does it help? And how much and for what? Haven't you later found that the uncached reads were the main issue? This change along with kthread patch, helped reduce the overflow counts and even eliminate them for some benchmarks. 
Though the impending optimization for uncached reads should bring further improvements, in my view, notwithstanding the improvement w.r.t. the overflow count, it's still a better configuration to work with, as the flush interrupt frequency is cut down to half and I am not able to see any apparent downsides to it. I was primarily thinking to go with the minimal and simplest set of patches to implement the feature. I second that and am working with the same intent. The logic was that apparently none of the smart and complex optimisations managed to solve the dropped interrupt issue, until the slowness of the uncached read was discovered to be the real/main issue. So it seems that is something that definitely needs to be implemented. (Whether or not it will be possible to use SSE instructions to do the read I don't know.) The log buffer resizing and rt priority kthread changes have definitely helped significantly. Only of late did we realize that there is a potential way to speed up uncached reads also. Moreover, I am yet to test that on the kernel side. So until that is tested and proves to be enough, we have to rely on the other optimizations and can't dismiss them. Maybe; it depends on whether, as I thought was the case, none of the other optimizations actually enabled drop-free logging in all interesting scenarios. If we conclude that simply improving the copy speed removes the need for any other optimisations and complications, we can talk about whether every individual one of those still makes sense. In my opinion we should keep this change, regardless of the copying speed-up. Moreover, this is a straightforward change. Actually this also helps in reducing the output log file size, apart from reducing the flush interrupt count. With the original settings, 44 KB was needed for one snapshot. With the modified settings, 76 KB is needed for one snapshot, but it is equivalent to 2 snapshots of the original setting. So there is a 12 KB saving for every 88 KB, compared with the original setting.
Best regards Akash Assuming it is possible, then the question is whether there is a need for all the other optimisations. I.e. do we need the kthread with rtprio or would a simple worker be enough? I think we can take a call once we have the results with the uncached read optimization. Agreed. Let's see how that works out and then discuss how the final series should look. Regards, Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
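The 44 KB, 76 KB and 12 KB figures quoted in this message can be reproduced with a few lines of arithmetic. The sketch below is only an illustration under two assumptions that the thread does not spell out: that pages are 4 KB, and that each GUC_LOG_*_PAGES value encodes "number of pages minus one" (which is what "8 pages apiece" with a define of 7 suggests). The function names are hypothetical and are not part of the patch.

```c
#include <assert.h>

/* Assumed page size; the 44 KB and 76 KB figures imply 4 KB pages. */
#define SKETCH_PAGE_SIZE_KB 4u

/*
 * Size of one log buffer snapshot in KB.
 * Assumption: each *_pages_field holds "page count minus one", so the
 * old value 3 means 4 pages and the new value 7 means 8 pages.
 */
static unsigned int guc_log_snapshot_kb(unsigned int isr_pages_field,
                                        unsigned int dpc_pages_field,
                                        unsigned int crash_pages_field)
{
    unsigned int pages = 1                         /* state header page */
                       + (isr_pages_field + 1)     /* ISR logs */
                       + (dpc_pages_field + 1)     /* DPC logs */
                       + (crash_pages_field + 1);  /* crash dump logs */

    return pages * SKETCH_PAGE_SIZE_KB;
}

/* KB saved by one new-layout snapshot versus two old-layout snapshots. */
static unsigned int guc_log_saving_kb(void)
{
    unsigned int old_kb = guc_log_snapshot_kb(3, 3, 1);
    unsigned int new_kb = guc_log_snapshot_kb(7, 7, 1);

    return 2 * old_kb - new_kb;
}
```

Under these assumptions the saving comes from the header and crash dump pages, which are written once per snapshot instead of twice when the ISR and DPC areas are doubled.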
Re: [Intel-gfx] [PATCH 10/17] drm/i915: New module param to control the size of buffer used for storing GuC firmware logs
On 7/18/2016 3:36 PM, Tvrtko Ursulin wrote: On 15/07/16 16:36, Goel, Akash wrote: On 7/15/2016 4:45 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Akash Goel <akash.g...@intel.com> On recieving the log buffer flush interrupt from GuC firmware, Driver stores the snapshot of the log buffer in a local buffer, from which Userspace can pull the logs. By default Driver store, up to, 4 snapshots of the log buffer in a local buffer (managed by relay). Added a new module (read only) param, 'guc_log_size', through which User can specify the number of snapshots of log buffer to be stored in local buffer. This can be used to ensure capturing of all boot time logs even with high verbosity level. v2: Rename module param to more apt name 'guc_log_buffer_nr'. (Nikula) Suggested-by: Chris Wilson <ch...@chris-wilson.co.uk> Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_guc_submission.c | 3 +-- drivers/gpu/drm/i915/i915_params.c | 5 + drivers/gpu/drm/i915/i915_params.h | 1 + 3 files changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 2e3b723..009d7c0 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1046,8 +1046,7 @@ static int guc_create_log_relay_file(struct intel_guc *guc) /* Keep the size of sub buffers same as shared log buffer */ subbuf_size = guc->log.obj->base.size; -/* TODO: Decide based on the User's input */ -n_subbufs = 4; +n_subbufs = i915.guc_log_buffer_nr; guc_log_relay_chan = relay_open("guc_log", log_dir, subbuf_size, n_subbufs, _callbacks, dev); diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c index 8b13bfa..d30c972 100644 --- a/drivers/gpu/drm/i915/i915_params.c +++ b/drivers/gpu/drm/i915/i915_params.c @@ -57,6 +57,7 @@ struct i915_params i915 __read_mostly = { .enable_guc_loading = -1, .enable_guc_submission = -1, 
.guc_log_level = -1, +.guc_log_buffer_nr = 4, .enable_dp_mst = true, .inject_load_failure = 0, .enable_dpcd_backlight = false, @@ -214,6 +215,10 @@ module_param_named(guc_log_level, i915.guc_log_level, int, 0400); MODULE_PARM_DESC(guc_log_level, "GuC firmware logging level (-1:disabled (default), 0-3:enabled)"); +module_param_named(guc_log_buffer_nr, i915.guc_log_buffer_nr, int, 0400); +MODULE_PARM_DESC(guc_log_buffer_nr, +"Number of sub buffers to store GuC firmware logs (default: 4)"); + module_param_named_unsafe(enable_dp_mst, i915.enable_dp_mst, bool, 0600); MODULE_PARM_DESC(enable_dp_mst, "Enable multi-stream transport (MST) for new DisplayPort sinks. (default: true)"); diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h index 0ad020b..14ca855 100644 --- a/drivers/gpu/drm/i915/i915_params.h +++ b/drivers/gpu/drm/i915/i915_params.h @@ -48,6 +48,7 @@ struct i915_params { int enable_guc_loading; int enable_guc_submission; int guc_log_level; +int guc_log_buffer_nr; int use_mmio_flip; int mmio_debug; int edp_vswing; I did not figure out after a quick read of Documentation/filesystems/relay.txt whether we really need this to be configurable? If I got it right number of sub-buffers here only has a relation to the userspace relay consumer latency. If the userspace is responsive should just two be enough? Or the existing default of four was shown in practice that it is better and good enough? Yes one of the use of this module parameter is to give User some leeway i.e. more time to collect logs from the relay buffer. User may not be always able to match the rate at which logs are being produced from the GuC side. 2 could be too less. Even 4, when running a benchmark, was proving less and not able to match the Driver rate (this might change after some optimization is done from User space side also, like splice). Okay, it makes sense for it to be bigger than four by default then, correct? 
The other use is to ensure that all boot-time logs are captured, even with the maximum verbosity level. The default number of sub buffers may not always be sufficient to store all the logs from boot, by the time User is ready to capture the logs. Saw about 8 flush interrupts coming from GuC during the boot. How important is it for a default value to capture all activity since boot? I think we need to keep in mind here that the amount of that activity may be a lot different with different setups, so it might not be that interesting after all. Someone will log in via a display manager, which may generate a widely differing amount of GPU activity, until they start the logger. Someone else on the other hand might be booting to vt only, starting the logger, and only then starting the g
Re: [Intel-gfx] [PATCH 14/17] drm/i915: Add stats for GuC log buffer flush interrupts
On 7/18/2016 5:03 PM, Tvrtko Ursulin wrote: On 18/07/16 11:59, Goel, Akash wrote: On 7/18/2016 3:46 PM, Tvrtko Ursulin wrote: On 15/07/16 16:58, Goel, Akash wrote: On 7/15/2016 5:21 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Akash Goel <akash.g...@intel.com> GuC firmware sends an interrupt to flush the log buffer when it becomes half full. GuC firmware also tracks how many times the buffer overflowed. It would be useful to maintain a statistics of how many flush interrupts were received and for which type of log buffer, along with the overflow count of each buffer type. Augmented i915_log_info debugfs to report back these statistics. Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_debugfs.c| 26 ++ drivers/gpu/drm/i915/i915_guc_submission.c | 8 drivers/gpu/drm/i915/i915_irq.c| 1 + drivers/gpu/drm/i915/intel_guc.h | 6 ++ 4 files changed, 41 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 3c9c7f7..888a18a 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2538,6 +2538,30 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data) return 0; } +static void i915_guc_log_info(struct seq_file *m, + struct drm_i915_private *dev_priv) +{ +struct intel_guc *guc = _priv->guc; + +seq_printf(m, "\nGuC logging stats:\n"); + +seq_printf(m, "\tISR: flush count %10u, overflow count %8u\n", +guc->log.flush_count[GUC_ISR_LOG_BUFFER], +guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]); + +seq_printf(m, "\tDPC: flush count %10u, overflow count %8u\n", +guc->log.flush_count[GUC_DPC_LOG_BUFFER], +guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]); + +seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n", +guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER], +guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]); + +seq_printf(m, "\tTotal flush interrupt count: %u\n", 
+guc->log.flush_interrupt_count); + +} + static void i915_guc_client_info(struct seq_file *m, struct drm_i915_private *dev_priv, struct i915_guc_client *client) @@ -2611,6 +2635,8 @@ static int i915_guc_info(struct seq_file *m, void *data) seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client); i915_guc_client_info(m, dev_priv, ); +i915_guc_log_info(m, dev_priv); + /* Add more as required ... */ return 0; diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index c1e637f..9c94a43 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -914,6 +914,14 @@ static void guc_read_update_log_buffer(struct drm_device *dev) log_buffer_state_local = *log_buffer_state; buffer_size = log_buffer_state_local.size; +guc->log.flush_count[i] += log_buffer_state_local.flush_to_file; +if (log_buffer_state_local.buffer_full_cnt != +guc->log.prev_overflow_count[i]) { +guc->log.prev_overflow_count[i] = +log_buffer_state_local.buffer_full_cnt; +guc->log.total_overflow_count[i]++; Is log_buffer_state_local.buffer_full_cnt guaranteed to be one here? Or you would need to increase total_overflow_count by its value? buffer_full_cnt will not remain as one. Its a 4 bit counter, will be incremented monotonically by GuC firmware on every new detection of overflow, so will increase from 0 to 15 & then wrap around. Hence have to use '!=' in the condition instead of '>'. But can it happen that it jumps by more than one between being sampled here? In which case you would need to replace: guc->log.total_overflow_count[i]++; by something like: guc->log.total_overflow_count[i] += log_buffer_state_local.buffer_full_cnt - guc->log.prev_overflow_count[i]; (Doesn't handle the wrap though, just to illustrate my point.) Actually logic in GuC firmware is such that overflow counter cannot increment by more than 1 without Driver coming into picture in between, by the virtue of flush interrupt. 
Hm, and what happens to the data and overflow counter if the driver is not responsive enough? GuC will not stall; it keeps writing the logs into the buffer even if the Driver is slow in responding to the previous flush interrupt. But the overflow detection is done through somewhat unusual logic, which is executed only when GuC receives the response to the last flush interrupt from the Driver, and the increment is by 1 only, irrespective of how late the acknowledgement came from the Driver
Re: [Intel-gfx] [PATCH 13/17] drm/i915: New lock to serialize the Host2GuC actions
On 7/18/2016 4:48 PM, Tvrtko Ursulin wrote: On 18/07/16 11:46, Goel, Akash wrote: On 7/18/2016 3:42 PM, Tvrtko Ursulin wrote: On 15/07/16 16:51, Goel, Akash wrote: On 7/15/2016 5:10 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Akash Goel <akash.g...@intel.com> With the addition of new Host2GuC actions related to GuC logging, there is a need of a lock to serialize them, as they can execute concurrently with each other and also with other existing actions. After which patch in this series is this required? From patch 6 or 7 saw the problem, when enabled flush interrupts from boot (guc_log_level >= 0). That means this patch should come before 6 or 7. :) Also new HOST2GUC actions LOG_BUFFER_FILE_FLUSH_COMPLETE & UK_LOG_ENABLE_LOGGING can execute concurrently with each other. Right I see, from the worker/thread vs debugfs activity. Will use mutex to serialize and place the patch earlier in the series. Please suggest which would be better, mutex_lock() or mutex_lock_interruptible(). Interruptible from the debugfs paths, otherwise not. Yes calls from debugfs path should ideally use interruptible version, but then how to determine that whether the given host2guc_action call came from debugfs path. Should I add a new argument 'interruptible_wait' to host2guc_action() or to keep things simple use mutex_lock() only ? I thought it would be cleaner to abstract the lock usage, for serialization, entirely inside the host2guc_action only. --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -88,6 +88,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len) return -EINVAL; intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL); +spin_lock(>action_lock); The code below can sleep waiting for a response from GuC so you cannot use a spinlock. Mutex I suppose... Sorry I missed the sleep. 
Probably I did not see any problem, in spite of the spinlock, as the _wait_for macro does not sleep when used in atomic context and does a busy wait instead. I wonder about that in general, since in_atomic is not a reliable indicator. But that is beside the point. You probably haven't seen it because the action completes in the first, shorter, atomic check. Actually I had profiled host2guc_logbuffer_flush_complete() and saw that on some occasions it was taking more than 100 microseconds, so presumably it would have gone past the first wait. But most of the time it took less than 10 microseconds. ret = wait_for_us(host2guc_action_response(dev_priv, ), 10); if (ret) ret = wait_for(host2guc_action_response(dev_priv, ), 10); Yes presumably so. In that case keep in mind that in_atomic always returns false in spinlock sections unless the kernel has CONFIG_PREEMPT_COUNT enabled. Thanks for this info, will be mindful of this in future. Best regards Akash Regards, Tvrtko
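The wait_for_us()/wait_for() pair quoted in this message follows a two-phase pattern: a short busy poll that never sleeps, then a longer poll that is allowed to sleep. The following is a userspace analogy only, not the i915 macros. It assumes the first timeout is in microseconds and the second in milliseconds, and every name in it (cond_fn, poll_spin, poll_sleep, wait_two_phase and the example conditions) is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef bool (*cond_fn)(void *ctx);

static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Phase 1: busy poll. Never sleeps, so it is the only phase that would
 * be safe where sleeping is not allowed. */
static int poll_spin(cond_fn cond, void *ctx, uint64_t timeout_ns)
{
    uint64_t deadline = now_ns() + timeout_ns;

    do {
        if (cond(ctx))
            return 0;
    } while (now_ns() < deadline);

    return -ETIMEDOUT;
}

/* Phase 2: poll with a 1 ms nap between checks. May sleep, which is why
 * the caller cannot hold a spinlock around it. */
static int poll_sleep(cond_fn cond, void *ctx, uint64_t timeout_ns)
{
    const struct timespec nap = { .tv_sec = 0, .tv_nsec = 1000000 };
    uint64_t deadline = now_ns() + timeout_ns;

    for (;;) {
        if (cond(ctx))
            return 0;
        if (now_ns() >= deadline)
            return -ETIMEDOUT;
        nanosleep(&nap, NULL);
    }
}

/* 10 us of spinning first, then up to 10 ms of sleeping polls. */
static int wait_two_phase(cond_fn cond, void *ctx)
{
    int ret = poll_spin(cond, ctx, 10ull * 1000ull);

    if (ret)
        ret = poll_sleep(cond, ctx, 10ull * 1000000ull);
    return ret;
}

/* Example conditions, only there to exercise the helper. */
static bool cond_ready(void *ctx) { (void)ctx; return true; }
static bool cond_never(void *ctx) { (void)ctx; return false; }
static bool cond_countdown(void *ctx) { int *n = ctx; return --(*n) <= 0; }
```

A response that arrives within the first phase never reaches the sleeping path, which is consistent with the observation that the spinlock version appeared to work most of the time.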
Re: [Intel-gfx] [PATCH 14/17] drm/i915: Add stats for GuC log buffer flush interrupts
On 7/18/2016 3:46 PM, Tvrtko Ursulin wrote: On 15/07/16 16:58, Goel, Akash wrote: On 7/15/2016 5:21 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Akash Goel <akash.g...@intel.com> GuC firmware sends an interrupt to flush the log buffer when it becomes half full. GuC firmware also tracks how many times the buffer overflowed. It would be useful to maintain a statistics of how many flush interrupts were received and for which type of log buffer, along with the overflow count of each buffer type. Augmented i915_log_info debugfs to report back these statistics. Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_debugfs.c| 26 ++ drivers/gpu/drm/i915/i915_guc_submission.c | 8 drivers/gpu/drm/i915/i915_irq.c| 1 + drivers/gpu/drm/i915/intel_guc.h | 6 ++ 4 files changed, 41 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 3c9c7f7..888a18a 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2538,6 +2538,30 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data) return 0; } +static void i915_guc_log_info(struct seq_file *m, + struct drm_i915_private *dev_priv) +{ +struct intel_guc *guc = _priv->guc; + +seq_printf(m, "\nGuC logging stats:\n"); + +seq_printf(m, "\tISR: flush count %10u, overflow count %8u\n", +guc->log.flush_count[GUC_ISR_LOG_BUFFER], +guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]); + +seq_printf(m, "\tDPC: flush count %10u, overflow count %8u\n", +guc->log.flush_count[GUC_DPC_LOG_BUFFER], +guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]); + +seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n", +guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER], +guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]); + +seq_printf(m, "\tTotal flush interrupt count: %u\n", +guc->log.flush_interrupt_count); + +} + static void i915_guc_client_info(struct seq_file *m, struct 
drm_i915_private *dev_priv, struct i915_guc_client *client) @@ -2611,6 +2635,8 @@ static int i915_guc_info(struct seq_file *m, void *data) seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client); i915_guc_client_info(m, dev_priv, ); +i915_guc_log_info(m, dev_priv); + /* Add more as required ... */ return 0; diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index c1e637f..9c94a43 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -914,6 +914,14 @@ static void guc_read_update_log_buffer(struct drm_device *dev) log_buffer_state_local = *log_buffer_state; buffer_size = log_buffer_state_local.size; +guc->log.flush_count[i] += log_buffer_state_local.flush_to_file; +if (log_buffer_state_local.buffer_full_cnt != +guc->log.prev_overflow_count[i]) { +guc->log.prev_overflow_count[i] = +log_buffer_state_local.buffer_full_cnt; +guc->log.total_overflow_count[i]++; Is log_buffer_state_local.buffer_full_cnt guaranteed to be one here? Or you would need to increase total_overflow_count by its value? buffer_full_cnt will not remain as one. Its a 4 bit counter, will be incremented monotonically by GuC firmware on every new detection of overflow, so will increase from 0 to 15 & then wrap around. Hence have to use '!=' in the condition instead of '>'. But can it happen that it jumps by more than one between being sampled here? In which case you would need to replace: guc->log.total_overflow_count[i]++; by something like: guc->log.total_overflow_count[i] += log_buffer_state_local.buffer_full_cnt - guc->log.prev_overflow_count[i]; (Doesn't handle the wrap though, just to illustrate my point.) Actually logic in GuC firmware is such that overflow counter cannot increment by more than 1 without Driver coming into picture in between, by the virtue of flush interrupt. But nevertheless the logic on Driver side should be like the way you suggested. 
Does this revised logic look fine? if (log_buffer_state_local.buffer_full_cnt != guc->log.prev_overflow_count[i]) { new_overflow = 1; guc->log.total_overflow_count[i] += (log_buffer_state_local.buffer_full_cnt - guc->log.prev_overflow_count[i]); if (log_buffer_state_local.buffer_full_cnt < guc->log.prev_overflow_count[i]) guc->log.total_overflow_count[i] += 15; log_buffer_state_local.buffer_full_cnt = guc->log.prev_overflow_
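For reference, the wrap-around handling discussed in this message can also be done with a masked subtraction. The sketch below is a standalone illustration and not the driver code: it assumes only what the thread states, namely that buffer_full_cnt is a 4 bit counter that counts 0 to 15 and then wraps. The struct and function names are hypothetical. Under that assumption a wrap from 15 to 0 is a step of one, so the correction on wrap is 16 (the modulus of the counter) rather than 15.

```c
#include <assert.h>
#include <stdint.h>

#define OVERFLOW_CNT_BITS 4u
#define OVERFLOW_CNT_MASK ((1u << OVERFLOW_CNT_BITS) - 1u)

/* Hypothetical per-buffer bookkeeping, mirroring the fields in the thread. */
struct overflow_stats {
    uint8_t  prev_cnt;  /* last sampled value of the 4 bit counter */
    uint32_t total;     /* accumulated number of overflows */
};

/*
 * Account for a newly sampled counter value and return how many new
 * overflows it represents. Subtracting in unsigned arithmetic and masking
 * to 4 bits gives the difference modulo 16, so no separate wrap branch
 * is needed.
 */
static unsigned int overflow_stats_update(struct overflow_stats *s,
                                          unsigned int cur_cnt)
{
    unsigned int delta = (cur_cnt - s->prev_cnt) & OVERFLOW_CNT_MASK;

    s->prev_cnt = (uint8_t)(cur_cnt & OVERFLOW_CNT_MASK);
    s->total += delta;
    return delta;
}
```

The masked form also remembers the sampled value in prev_cnt, which is the direction of assignment needed for the next comparison to work.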
Re: [Intel-gfx] [PATCH 13/17] drm/i915: New lock to serialize the Host2GuC actions
On 7/18/2016 3:42 PM, Tvrtko Ursulin wrote: On 15/07/16 16:51, Goel, Akash wrote: On 7/15/2016 5:10 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Akash Goel <akash.g...@intel.com> With the addition of new Host2GuC actions related to GuC logging, there is a need of a lock to serialize them, as they can execute concurrently with each other and also with other existing actions. After which patch in this series is this required? From patch 6 or 7 saw the problem, when enabled flush interrupts from boot (guc_log_level >= 0). That means this patch should come before 6 or 7. :) Also new HOST2GUC actions LOG_BUFFER_FILE_FLUSH_COMPLETE & UK_LOG_ENABLE_LOGGING can execute concurrently with each other. Right I see, from the worker/thread vs debugfs activity. Will use mutex to serialize and place the patch earlier in the series. Please suggest which would be better, mutex_lock() or mutex_lock_interruptible(). Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++ drivers/gpu/drm/i915/intel_guc.h | 3 +++ 2 files changed, 6 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 6043166..c1e637f 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -88,6 +88,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len) return -EINVAL; intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL); +spin_lock(>action_lock); The code below can sleep waiting for a response from GuC so you cannot use a spinlock. Mutex I suppose... Sorry I missed the sleep. Probably I did not see any problem, in spite of a spinlock, as _wait_for macro does not sleep when used in atomic context, does a busy wait instead. I wonder about that in general, since in_atomic is not a reliable indicator. But that is beside the point. 
You probably haven't seen it because the action completes in the first, shorter, atomic check. Actually I had profiled host2guc_logbuffer_flush_complete() and saw that on some occasions it was taking more than 100 microseconds, so presumably it would have gone past the first wait. But most of the time it took less than 10 microseconds. ret = wait_for_us(host2guc_action_response(dev_priv, ), 10); if (ret) ret = wait_for(host2guc_action_response(dev_priv, ), 10); Best regards Akash Regards, Tvrtko
Re: [Intel-gfx] [PATCH 11/17] drm/i915: Support to create write combined type vmaps
On 7/15/2016 5:01 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Chris Wilsonvmaps has a provision for controlling the page protection bits, with which we can use to control the mapping type, e.g. WB, WC, UC or even WT. To allow the caller to choose their mapping type, we add a parameter to i915_gem_object_pin_map - but we still only allow one vmap to be cached per object. If the object is currently not pinned, then we recreate the previous vmap with the new access type, but if it was pinned we report an error. This effectively limits the access via i915_gem_object_pin_map to a single mapping type for the lifetime of the object. Not usually a problem, but something to be aware of when setting up the object's vmap. We will want to vary the access type to enable WC mappings of ringbuffer and context objects on !llc platforms, as well as other objects where we need coherent access to the GPU's pages without going through the GTT v2: Remove the redundant braces around pin count check and fix the marker in documentation (Chris) Signed-off-by: Chris Wilson Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.h| 4 ++- drivers/gpu/drm/i915/i915_gem.c| 57 +++--- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 2 +- drivers/gpu/drm/i915/i915_guc_submission.c | 2 +- drivers/gpu/drm/i915/intel_lrc.c | 8 ++--- drivers/gpu/drm/i915/intel_ringbuffer.c| 2 +- 6 files changed, 54 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6e2ddfa..84afa17 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3248,6 +3248,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) /** * i915_gem_object_pin_map - return a contiguous mapping of the entire object * @obj - the object to map into kernel address space + * @use_wc - whether the mapping should be using WC or WB pgprot_t * * Calls i915_gem_object_pin_pages() to prevent reaping of the 
object's * pages and then returns a contiguous mapping of the backing storage into @@ -3259,7 +3260,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) * Returns the pointer through which to access the mapped object, or an * ERR_PTR() on error. */ -void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj); +void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj, + bool use_wc); Could you make it an enum instead of a bool? Commit message suggests more modes will potentially be added and if so, and we start with an enum straight away, it will make for less churn in the future. func(something, true) is always also quite unreadabe in the code because one has to remember or remind himself what it really means. Something like func(something, MAP_WC) would be simply self-documenting. Thanks nice suggestion, will do that. enum only or macros also will do ? #define MAP_CACHED 0x1 #define MAP_WC 0x2 /** * i915_gem_object_unpin_map - releases an earlier mapping diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 8f50919..c431b40 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2471,10 +2471,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) list_del(>global_list); if (obj->mapping) { -if (is_vmalloc_addr(obj->mapping)) -vunmap(obj->mapping); +void *ptr = (void *)((uintptr_t)obj->mapping & ~1); How many bits we have to play with here? Is there a suitable define somewhere we could use for a mask instead of hardcoded "1" or we could add one if you think that would be better? As Chris said, will use PAGE_MASK. 
+if (is_vmalloc_addr(ptr)) +vunmap(ptr); else -kunmap(kmap_to_page(obj->mapping)); +kunmap(kmap_to_page(ptr)); obj->mapping = NULL; } @@ -2647,7 +2648,8 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) } /* The 'mapping' part of i915_gem_object_pin_map() below */ -static void *i915_gem_object_map(const struct drm_i915_gem_object *obj) +static void *i915_gem_object_map(const struct drm_i915_gem_object *obj, +bool use_wc) { unsigned long n_pages = obj->base.size >> PAGE_SHIFT; struct sg_table *sgt = obj->pages; @@ -2659,7 +2661,7 @@ static void *i915_gem_object_map(const struct drm_i915_gem_object *obj) void *addr; /* A single page can always be kmapped */ -if (n_pages == 1) +if (n_pages == 1 && !use_wc) return kmap(sg_page(sgt->sgl)); if (n_pages > ARRAY_SIZE(stack_pages)) { @@ -2675,7 +2677,8 @@ static void *i915_gem_object_map(const struct
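The two review suggestions in this message (an enum instead of a bool for the mapping type, and a named mask instead of a hardcoded 1 when the type is stashed in the low bits of the cached mapping pointer) can be illustrated together. This is a userspace sketch under assumptions, not the i915 implementation: the 4096 byte page size, the SKETCH_* macros, the enum and the helper names are all hypothetical. It relies only on the fact that a page-aligned address has its low bits free.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for the sketch; the mask keeps the page-aligned part. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_PAGE_MASK (~(uintptr_t)(SKETCH_PAGE_SIZE - 1))

/* An enum reads better at call sites than a bare true/false argument. */
enum map_type {
    MAP_TYPE_WB = 0,  /* write-back (cached) */
    MAP_TYPE_WC = 1,  /* write-combined */
};

/* Store the mapping type in the unused low bits of a page-aligned address. */
static void *mapping_pack(void *addr, enum map_type type)
{
    /* The address must be page aligned, otherwise the low bits are taken. */
    assert(((uintptr_t)addr & ~SKETCH_PAGE_MASK) == 0);
    return (void *)((uintptr_t)addr | (uintptr_t)type);
}

/* Recover the real address by masking off the low bits. */
static void *mapping_addr(void *packed)
{
    return (void *)((uintptr_t)packed & SKETCH_PAGE_MASK);
}

/* Recover the mapping type from the low bits. */
static enum map_type mapping_type(void *packed)
{
    return (enum map_type)((uintptr_t)packed & ~SKETCH_PAGE_MASK);
}
```

With a page-sized mask there is room for many more mapping types than the single bit used by the hardcoded ~1, which is the headroom the question about "how many bits" is asking for.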
Re: [Intel-gfx] [PATCH 15/17] drm/i915: Increase GuC log buffer size to reduce flush interrupts
On 7/15/2016 8:37 PM, Tvrtko Ursulin wrote: On 15/07/16 15:42, Goel, Akash wrote: On 7/15/2016 5:27 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Akash Goel <akash.g...@intel.com> In cases where GuC generate logs at a very high rate, correspondingly the rate of flush interrupts is also very high. So far total 8 pages were allocated for storing both ISR & DPC logs. As per the half-full draining protocol followed by GuC, by doubling the number of pages, the frequency of flush interrupts can be cut down to almost half, which then helps in reducing the logging overhead. So now allocating 8 pages apiece for ISR & DPC logs. Suggested-by: Tvrtko Ursulin <tvrtko.ursu...@intel.com> Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/intel_guc_fwif.h | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h index 1de6928..7521ed5 100644 --- a/drivers/gpu/drm/i915/intel_guc_fwif.h +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h @@ -104,9 +104,9 @@ #define GUC_LOG_ALLOC_IN_MEGABYTE(1 << 3) #define GUC_LOG_CRASH_PAGES1 #define GUC_LOG_CRASH_SHIFT4 -#define GUC_LOG_DPC_PAGES3 +#define GUC_LOG_DPC_PAGES7 #define GUC_LOG_DPC_SHIFT6 -#define GUC_LOG_ISR_PAGES3 +#define GUC_LOG_ISR_PAGES7 #define GUC_LOG_ISR_SHIFT9 #define GUC_LOG_BUF_ADDR_SHIFT12 @@ -436,9 +436,9 @@ enum guc_log_buffer_type { *| Crash dump state header | * Page1 +---+ *| ISR logs| - * Page5 +---+ - *| DPC logs| * Page9 +---+ + *| DPC logs| + * Page17 +---+ *| Crash Dump logs | *+---+ * I don't mind - but does it help? And how much and for what? Haven't you later found that the uncached reads were the main issue? This change along with kthread patch, helped reduce the overflow counts and even eliminate them for some benchmarks. 
Though the impending optimization for uncached reads should bring further improvements, in my view, notwithstanding the improvement w.r.t. the overflow count, it's still a better configuration to work with, as the flush interrupt frequency is cut down to half and I am not able to see any apparent downsides to it. I was primarily thinking to go with the minimal and simplest set of patches to implement the feature. I second that and am working with the same intent. The logic was that apparently none of the smart and complex optimisations managed to solve the dropped interrupt issue, until the slowness of the uncached read was discovered to be the real/main issue. So it seems that is something that definitely needs to be implemented. (Whether or not it will be possible to use SSE instructions to do the read I don't know.) The log buffer resizing and rt priority kthread changes have definitely helped significantly. Only of late did we realize that there is a potential way to speed up uncached reads also. Moreover, I am yet to test that on the kernel side. So until that is tested and proves to be enough, we have to rely on the other optimizations and can't dismiss them. Assuming it is possible, then the question is whether there is a need for all the other optimisations. I.e. do we need the kthread with rtprio or would a simple worker be enough? I think we can take a call once we have the results with the uncached read optimization. Do we need the new i915 param for tweaking the relay sub-buffers? In my opinion it will be really useful to have this provision, as I tried to explain in the other mail. Do we need the increase of the log buffer size? This seems to be a benign change which is definitely good to have, but again we can decide upon it once we have the results. The extra patch to do smarter reads? If we do not have the issue of the dropped interrupts with none of these extra patches applied, then we could afford to not bother with them now.
Would make the series shorter and review easier and the feature in quicker.

Agree with you. Had none of these optimizations in the initial version of the series, but was compelled to add them later when realized the rate at which GuC was generating the logs.

Best regards
Akash

Or maybe we do need all the advanced stuff, I don't know, I am just asking the question and would like to see some data.

Regards,
Tvrtko
___
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 14/17] drm/i915: Add stats for GuC log buffer flush interrupts
On 7/15/2016 5:21 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Akash GoelGuC firmware sends an interrupt to flush the log buffer when it becomes half full. GuC firmware also tracks how many times the buffer overflowed. It would be useful to maintain a statistics of how many flush interrupts were received and for which type of log buffer, along with the overflow count of each buffer type. Augmented i915_log_info debugfs to report back these statistics. Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_debugfs.c| 26 ++ drivers/gpu/drm/i915/i915_guc_submission.c | 8 drivers/gpu/drm/i915/i915_irq.c| 1 + drivers/gpu/drm/i915/intel_guc.h | 6 ++ 4 files changed, 41 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 3c9c7f7..888a18a 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2538,6 +2538,30 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data) return 0; } +static void i915_guc_log_info(struct seq_file *m, + struct drm_i915_private *dev_priv) +{ +struct intel_guc *guc = _priv->guc; + +seq_printf(m, "\nGuC logging stats:\n"); + +seq_printf(m, "\tISR: flush count %10u, overflow count %8u\n", +guc->log.flush_count[GUC_ISR_LOG_BUFFER], +guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]); + +seq_printf(m, "\tDPC: flush count %10u, overflow count %8u\n", +guc->log.flush_count[GUC_DPC_LOG_BUFFER], +guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]); + +seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n", +guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER], +guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]); + +seq_printf(m, "\tTotal flush interrupt count: %u\n", +guc->log.flush_interrupt_count); + +} + static void i915_guc_client_info(struct seq_file *m, struct drm_i915_private *dev_priv, struct i915_guc_client *client) @@ -2611,6 +2635,8 @@ static int i915_guc_info(struct seq_file *m, void *data) 
seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client); i915_guc_client_info(m, dev_priv, ); +i915_guc_log_info(m, dev_priv); + /* Add more as required ... */ return 0; diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index c1e637f..9c94a43 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -914,6 +914,14 @@ static void guc_read_update_log_buffer(struct drm_device *dev) log_buffer_state_local = *log_buffer_state; buffer_size = log_buffer_state_local.size; +guc->log.flush_count[i] += log_buffer_state_local.flush_to_file; +if (log_buffer_state_local.buffer_full_cnt != +guc->log.prev_overflow_count[i]) { +guc->log.prev_overflow_count[i] = +log_buffer_state_local.buffer_full_cnt; +guc->log.total_overflow_count[i]++; Is log_buffer_state_local.buffer_full_cnt guaranteed to be one here? Or you would need to increase total_overflow_count by its value? buffer_full_cnt will not remain as one. Its a 4 bit counter, will be incremented monotonically by GuC firmware on every new detection of overflow, so will increase from 0 to 15 & then wrap around. Hence have to use '!=' in the condition instead of '>'. 
Best regards
Akash

+}
+
 if (log_buffer_copy_state) {
 /* First copy the state structure */
 memcpy(log_buffer_copy_state, &log_buffer_state_local,
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index bdd7a67..c3fb67e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1711,6 +1711,7 @@ static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir)
 &dev_priv->guc.events_work);
 }
 }
+dev_priv->guc.log.flush_interrupt_count++;
 spin_unlock(&dev_priv->irq_lock);
 }
 }
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 611f4a7..e911a32 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -128,6 +128,12 @@ struct intel_guc_log {
 struct workqueue_struct *wq;
 void *buf_addr;
 struct rchan *relay_chan;
+
+/* logging related stats */
+u32 flush_interrupt_count;
+u32 prev_overflow_count[GUC_MAX_LOG_BUFFER];
+u32 total_overflow_count[GUC_MAX_LOG_BUFFER];
+u32 flush_count[GUC_MAX_LOG_BUFFER];
 };
 struct intel_guc {

Regards,
Tvrtko
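The question above about buffer_full_cnt can be illustrated with a small standalone sketch. It assumes, as stated in the reply, that the firmware counter is 4 bits wide and wraps from 15 back to 0. The helper names overflow_delta() and account_overflow() and the mask macro are hypothetical and not part of the patch. Where the patch adds one per observed change, this variant adds the number of overflows between two samples, which is only correct if fewer than 16 overflows happened in between.

```c
#include <stdint.h>

/* Assumption taken from the thread: buffer_full_cnt is a 4-bit counter. */
#define OVERFLOW_CNT_MASK 0xFu

/*
 * Hypothetical helper: number of overflows between two samples of a
 * wrapping 4-bit counter. Only correct if fewer than 16 overflows
 * occurred between the two samples.
 */
static inline uint32_t overflow_delta(uint32_t prev, uint32_t cur)
{
	return (cur - prev) & OVERFLOW_CNT_MASK;
}

/* Add the delta to a running total and remember the latest sample. */
static inline void account_overflow(uint32_t *prev, uint32_t *total,
				    uint32_t cur)
{
	*total += overflow_delta(*prev, cur);
	*prev = cur & OVERFLOW_CNT_MASK;
}
```

With prev = 14 and cur = 2 the counter has wrapped once, and the masked subtraction still yields 4, which is why neither '>' nor a plain difference would be needed in such a scheme.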
Re: [Intel-gfx] [PATCH 13/17] drm/i915: New lock to serialize the Host2GuC actions
On 7/15/2016 5:10 PM, Tvrtko Ursulin wrote:
On 10/07/16 14:41, akash.g...@intel.com wrote:
From: Akash Goel

With the addition of new Host2GuC actions related to GuC logging, there is a need for a lock to serialize them, as they can execute concurrently with each other and also with other existing actions.

After which patch in this series is this required?

From patch 6 or 7 I saw the problem, when flush interrupts were enabled from boot (guc_log_level >= 0). Also the new HOST2GUC actions LOG_BUFFER_FILE_FLUSH_COMPLETE & UK_LOG_ENABLE_LOGGING can execute concurrently with each other.

Signed-off-by: Akash Goel
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
 drivers/gpu/drm/i915/intel_guc.h | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 6043166..c1e637f 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -88,6 +88,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
 return -EINVAL;
 intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+spin_lock(&guc->action_lock);

The code below can sleep waiting for a response from GuC so you cannot use a spinlock. Mutex I suppose...

Sorry, I missed the sleep. Probably I did not see any problem, in spite of the spinlock, as the _wait_for macro does not sleep when used in atomic context; it does a busy wait instead.

Best Regards
Akash

 dev_priv->guc.action_count += 1;
 dev_priv->guc.action_cmd = data[0];
@@ -126,6 +127,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
 }
 dev_priv->guc.action_status = status;
+spin_unlock(&guc->action_lock);
 intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 return ret;
@@ -1304,6 +1306,7 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
 return -ENOMEM;
 ida_init(&guc->ctx_ids);
+spin_lock_init(&guc->action_lock);

I think this should go to guc_client_alloc which is where the guc client object is allocated and initialized.
 guc_create_log(guc);
 guc_create_ads(guc);
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index d56bde6..611f4a7 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -157,6 +157,9 @@ struct intel_guc {
 uint64_t submissions[I915_NUM_ENGINES];
 uint32_t last_seqno[I915_NUM_ENGINES];
+
+/* To serialize the Host2GuC actions */
+spinlock_t action_lock;
 };
 /* intel_guc_loader.c */

Regards,
Tvrtko
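The review point above (a section that may sleep should be guarded by a sleeping lock rather than a spinlock) can be sketched in userspace. This is only an analogue under stated assumptions: a pthread mutex stands in for a kernel mutex, and struct fake_guc, FAKE_GUC_INIT and fake_host2guc_action() are hypothetical names, not the driver's code.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the state touched by host2guc_action(). */
struct fake_guc {
	pthread_mutex_t action_lock;	/* sleeping lock, like a kernel mutex */
	uint32_t action_count;
	uint32_t action_cmd;
};

#define FAKE_GUC_INIT { PTHREAD_MUTEX_INITIALIZER, 0, 0 }

/*
 * Serialize one action. The critical section may block while waiting
 * for a response, which is why a sleeping lock is used here.
 */
static inline int fake_host2guc_action(struct fake_guc *guc, uint32_t cmd)
{
	if (cmd == 0)
		return -1;	/* reject an empty command */

	pthread_mutex_lock(&guc->action_lock);
	guc->action_count += 1;
	guc->action_cmd = cmd;
	/* ... a real implementation would wait for the response here ... */
	pthread_mutex_unlock(&guc->action_lock);
	return 0;
}
```

Concurrent callers of fake_host2guc_action() would be serialized by action_lock, and a waiter is put to sleep instead of spinning while another action is in flight.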
Re: [Intel-gfx] [PATCH 10/17] drm/i915: New module param to control the size of buffer used for storing GuC firmware logs
On 7/15/2016 4:45 PM, Tvrtko Ursulin wrote:
On 10/07/16 14:41, akash.g...@intel.com wrote:
From: Akash Goel

On receiving the log buffer flush interrupt from GuC firmware, Driver stores the snapshot of the log buffer in a local buffer, from which Userspace can pull the logs. By default Driver stores up to 4 snapshots of the log buffer in a local buffer (managed by relay). Added a new module (read only) param, 'guc_log_size', through which User can specify the number of snapshots of log buffer to be stored in local buffer. This can be used to ensure capturing of all boot time logs even with high verbosity level.

v2: Rename module param to more apt name 'guc_log_buffer_nr'. (Nikula)

Suggested-by: Chris Wilson
Signed-off-by: Akash Goel
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 3 +--
 drivers/gpu/drm/i915/i915_params.c | 5 +
 drivers/gpu/drm/i915/i915_params.h | 1 +
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2e3b723..009d7c0 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1046,8 +1046,7 @@ static int guc_create_log_relay_file(struct intel_guc *guc)
 /* Keep the size of sub buffers same as shared log buffer */
 subbuf_size = guc->log.obj->base.size;
-/* TODO: Decide based on the User's input */
-n_subbufs = 4;
+n_subbufs = i915.guc_log_buffer_nr;
 guc_log_relay_chan = relay_open("guc_log", log_dir,
 subbuf_size, n_subbufs, _callbacks, dev);
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 8b13bfa..d30c972 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -57,6 +57,7 @@ struct i915_params i915 __read_mostly = {
 .enable_guc_loading = -1,
 .enable_guc_submission = -1,
 .guc_log_level = -1,
+.guc_log_buffer_nr = 4,
 .enable_dp_mst = true,
 .inject_load_failure = 0,
 .enable_dpcd_backlight = false,
@@ -214,6 +215,10 @@
module_param_named(guc_log_level, i915.guc_log_level, int, 0400); MODULE_PARM_DESC(guc_log_level, "GuC firmware logging level (-1:disabled (default), 0-3:enabled)"); +module_param_named(guc_log_buffer_nr, i915.guc_log_buffer_nr, int, 0400); +MODULE_PARM_DESC(guc_log_buffer_nr, +"Number of sub buffers to store GuC firmware logs (default: 4)"); + module_param_named_unsafe(enable_dp_mst, i915.enable_dp_mst, bool, 0600); MODULE_PARM_DESC(enable_dp_mst, "Enable multi-stream transport (MST) for new DisplayPort sinks. (default: true)"); diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h index 0ad020b..14ca855 100644 --- a/drivers/gpu/drm/i915/i915_params.h +++ b/drivers/gpu/drm/i915/i915_params.h @@ -48,6 +48,7 @@ struct i915_params { int enable_guc_loading; int enable_guc_submission; int guc_log_level; +int guc_log_buffer_nr; int use_mmio_flip; int mmio_debug; int edp_vswing; I did not figure out after a quick read of Documentation/filesystems/relay.txt whether we really need this to be configurable? If I got it right number of sub-buffers here only has a relation to the userspace relay consumer latency. If the userspace is responsive should just two be enough? Or the existing default of four was shown in practice that it is better and good enough? Yes one of the use of this module parameter is to give User some leeway i.e. more time to collect logs from the relay buffer. User may not be always able to match the rate at which logs are being produced from the GuC side. 2 could be too less. Even 4, when running a benchmark, was proving less and not able to match the Driver rate (this might change after some optimization is done from User space side also, like splice). The other use is to ensure capturing of all boot time logs, even with maximum verbosity level. The default number of sub buffers may not always be sufficient to store all the logs from boot, by the time User is ready to capture the logs. 
Saw about 8 flush interrupts coming from GuC during the boot.

I am just not sure this is a useful module parameter without some more data. Even if it is needed, as a minimum I think the name should reflect that this is about the relay side of things and not the GuC log buffer itself. So something like i915.guc_relay_log_subbuf_nr or something.

Fine, will use this name.

With the matching description of course.

Is the current description not apt? "Number of sub buffers to store GuC firmware logs (default: 4)"

Best regards
Akash

Regards,
Tvrtko
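The boot-time argument above can be put into numbers with a deliberately simplified model. The assumptions are mine, not taken from the relay implementation: nothing is consumed until userspace is ready, every flush interrupt produces one snapshot, and a snapshot is dropped once all sub-buffers hold unread data. The function name boot_snapshots_lost() is hypothetical.

```c
/*
 * Simplified model (an assumption, not the relay code): with no
 * consumer running, every snapshot beyond the number of sub-buffers
 * is lost.
 */
static inline unsigned int boot_snapshots_lost(unsigned int n_subbufs,
					       unsigned int produced)
{
	return produced > n_subbufs ? produced - n_subbufs : 0;
}
```

Under this model the roughly 8 boot-time flushes mentioned in the reply would lose 4 snapshots with the default of 4 sub-buffers and none with 8, which is the case for making the count configurable.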
Re: [Intel-gfx] [PATCH 15/17] drm/i915: Increase GuC log buffer size to reduce flush interrupts
On 7/15/2016 5:27 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Akash GoelIn cases where GuC generate logs at a very high rate, correspondingly the rate of flush interrupts is also very high. So far total 8 pages were allocated for storing both ISR & DPC logs. As per the half-full draining protocol followed by GuC, by doubling the number of pages, the frequency of flush interrupts can be cut down to almost half, which then helps in reducing the logging overhead. So now allocating 8 pages apiece for ISR & DPC logs. Suggested-by: Tvrtko Ursulin Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/intel_guc_fwif.h | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h index 1de6928..7521ed5 100644 --- a/drivers/gpu/drm/i915/intel_guc_fwif.h +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h @@ -104,9 +104,9 @@ #define GUC_LOG_ALLOC_IN_MEGABYTE(1 << 3) #define GUC_LOG_CRASH_PAGES1 #define GUC_LOG_CRASH_SHIFT4 -#define GUC_LOG_DPC_PAGES3 +#define GUC_LOG_DPC_PAGES7 #define GUC_LOG_DPC_SHIFT6 -#define GUC_LOG_ISR_PAGES3 +#define GUC_LOG_ISR_PAGES7 #define GUC_LOG_ISR_SHIFT9 #define GUC_LOG_BUF_ADDR_SHIFT12 @@ -436,9 +436,9 @@ enum guc_log_buffer_type { *| Crash dump state header | * Page1 +---+ *| ISR logs| - * Page5 +---+ - *| DPC logs| * Page9 +---+ + *| DPC logs| + * Page17 +---+ *| Crash Dump logs | *+---+ * I don't mind - but does it help? And how much and for what? Haven't you later found that the uncached reads were the main issue? This change along with kthread patch, helped reduce the overflow counts and even eliminate them for some benchmarks. Though with the impending optimization for Uncached reads there should be further improvements but in my view, notwithstanding the improvement w.r.t overflow count, its still a better configuration to work with as flush interrupt frequency is cut down to half and not able to see any apparent downsides to it. 
Best Regards Akash Regards, Tvrtko
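The "cut down to almost half" claim in the commit message follows from simple arithmetic, sketched below. The assumptions are a 4 KiB page size and exactly one flush interrupt each time half of a buffer has been filled. flush_interrupts() and PAGE_SIZE_BYTES are hypothetical names used only for illustration.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u	/* assumption: 4 KiB pages */

/*
 * Flush interrupts raised while bytes_logged bytes are written into a
 * buffer of buffer_pages pages, assuming one interrupt per half buffer.
 */
static inline uint64_t flush_interrupts(uint64_t bytes_logged,
					uint32_t buffer_pages)
{
	uint64_t half = (uint64_t)buffer_pages * PAGE_SIZE_BYTES / 2;

	return bytes_logged / half;
}
```

For 1 MiB of log data this gives 128 interrupts with the old 4-page buffer and 64 with the new 8-page buffer.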
Re: [Intel-gfx] [PATCH 05/17] drm/i915: Support for GuC interrupts
On 7/11/2016 7:13 PM, Tvrtko Ursulin wrote: On 11/07/16 14:38, Goel, Akash wrote: On 7/11/2016 6:53 PM, Tvrtko Ursulin wrote: On 11/07/16 14:15, Goel, Akash wrote: On 7/11/2016 4:00 PM, Tvrtko Ursulin wrote: +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir) +{ +if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) { +spin_lock(_priv->irq_lock); +if (dev_priv->guc.interrupts_enabled) { +/* Sample the log buffer flush related bits & clear them + * out now itself from the message identity register to + * minimize the probability of losing a flush interrupt, + * when there are back to back flush interrupts. + * There can be a new flush interrupt, for different log + * buffer type (like for ISR), whilst Host is handling + * one (for DPC). Since same bit is used in message + * register for ISR & DPC, it could happen that GuC + * sets the bit for 2nd interrupt but Host clears out + * the bit on handling the 1st interrupt. + */ +u32 msg = I915_READ(SOFT_SCRATCH(15)) & +(GUC2HOST_MSG_CRASH_DUMP_POSTED | + GUC2HOST_MSG_FLUSH_LOG_BUFFER); +if (msg) { +/* Clear the message bits that are handled */ +I915_WRITE(SOFT_SCRATCH(15), +I915_READ(SOFT_SCRATCH(15)) & ~msg); + +/* Handle flush interrupt event in bottom half */ +queue_work(dev_priv->wq, _priv->guc.events_work); Since the later patch is changing this to use a thread, since you have established worker is too slow - especially the shared one - I would really recommend you start with the kthread straight away. Not have the worker for a while in the same series and then later change it to a thread. Actually it won't be appropriate to say that shared worker thread is too slow, but having a dedicated kthread definitely helps. I kept the kthread patch at the last so that as per the response, review comments can drop it also. I think it should only be one implementation in the patch series. If we agreed on a kthread make it so from the start. 
Agree but actually right now, added the kthread patch more as an RFC and presumed this won't be the final version of the series. Will do the needful, as per the review comments, in the next version.

Ack.

And describe in the commit message why it was selected etc.

+}
+}
+spin_unlock(&dev_priv->irq_lock);

Why does the above need to be done under the irq_lock?

Using the irq_lock for 'guc.interrupts_enabled', especially useful while disabling the interrupt.

Why? I don't see how it gains you anything and so it seems preferable not to hold it over mmio accesses.

Yes, not needed for the mmio access part. Just needed for the inspection of the 'guc.interrupts_enabled' value. Will reorder the code.

You don't need it just for reading that value, you can just drop it.

It's not strictly needed as it's a mere read. But as per my limited understanding, without the spinlock (which also provides an implicit barrier) the ISR might miss the reset of the 'interrupts_enabled' flag from a thread on another CPU, and queue new work. The update will be visible eventually though. And the same applies to the case when the 'interrupts_enabled' flag is set from another CPU. Isn't it good practice to use locks for accessing shared variables?

Best regards
Akash

Regards,
Tvrtko
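The visibility concern raised in the last reply can be sketched in userspace with C11 atomics. This is an analogue only, not what the driver does: the kernel has its own primitives for this, and the names fake_guc_irq, fake_set_enabled() and fake_irq_handler() are hypothetical. The sketch shows a flag written with release ordering on one side and read with acquire ordering in the handler, without a lock around the read.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the GuC interrupt state. */
struct fake_guc_irq {
	atomic_bool interrupts_enabled;
	unsigned int queued;	/* how many times work was queued */
};

/* Writer side: publish the new flag value. */
static inline void fake_set_enabled(struct fake_guc_irq *g, bool on)
{
	atomic_store_explicit(&g->interrupts_enabled, on,
			      memory_order_release);
}

/* Handler side: queue work only while the flag reads as enabled. */
static inline bool fake_irq_handler(struct fake_guc_irq *g)
{
	if (!atomic_load_explicit(&g->interrupts_enabled,
				  memory_order_acquire))
		return false;

	g->queued++;
	return true;
}
```

A handler that has already passed the check can still queue work after the flag is cleared, with or without a lock around the read. The work function in the patch rechecks the flag before doing anything, which covers that window.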
Re: [Intel-gfx] [PATCH 05/17] drm/i915: Support for GuC interrupts
On 7/11/2016 6:53 PM, Tvrtko Ursulin wrote: On 11/07/16 14:15, Goel, Akash wrote: On 7/11/2016 4:00 PM, Tvrtko Ursulin wrote: +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir) +{ +if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) { +spin_lock(_priv->irq_lock); +if (dev_priv->guc.interrupts_enabled) { +/* Sample the log buffer flush related bits & clear them + * out now itself from the message identity register to + * minimize the probability of losing a flush interrupt, + * when there are back to back flush interrupts. + * There can be a new flush interrupt, for different log + * buffer type (like for ISR), whilst Host is handling + * one (for DPC). Since same bit is used in message + * register for ISR & DPC, it could happen that GuC + * sets the bit for 2nd interrupt but Host clears out + * the bit on handling the 1st interrupt. + */ +u32 msg = I915_READ(SOFT_SCRATCH(15)) & +(GUC2HOST_MSG_CRASH_DUMP_POSTED | + GUC2HOST_MSG_FLUSH_LOG_BUFFER); +if (msg) { +/* Clear the message bits that are handled */ +I915_WRITE(SOFT_SCRATCH(15), +I915_READ(SOFT_SCRATCH(15)) & ~msg); + +/* Handle flush interrupt event in bottom half */ +queue_work(dev_priv->wq, _priv->guc.events_work); Since the later patch is changing this to use a thread, since you have established worker is too slow - especially the shared one - I would really recommend you start with the kthread straight away. Not have the worker for a while in the same series and then later change it to a thread. Actually it won't be appropriate to say that shared worker thread is too slow, but having a dedicated kthread definitely helps. I kept the kthread patch at the last so that as per the response, review comments can drop it also. I think it should only be one implementation in the patch series. If we agreed on a kthread make it so from the start. Agree but actually right now, added the kthread patch more as a RFC and presumed this won't be the final version of the series. 
Will do the needful, as per the review comments, in the next version.

And describe in the commit message why it was selected etc.

+}
+}
+spin_unlock(&dev_priv->irq_lock);

Why does the above need to be done under the irq_lock?

Using the irq_lock for 'guc.interrupts_enabled', especially useful while disabling the interrupt.

Why? I don't see how it gains you anything and so it seems preferable not to hold it over mmio accesses.

Yes, not needed for the mmio access part. Just needed for the inspection of the 'guc.interrupts_enabled' value. Will reorder the code.

Best regards
Akash

Regards,
Tvrtko
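The sample-and-clear step described in the quoted handler comment can be modelled on a plain variable standing in for the SOFT_SCRATCH(15) register. The bit positions below are placeholders chosen for illustration (the real values live in the i915 headers), and sample_and_clear() is a hypothetical name.

```c
#include <stdint.h>

/* Placeholder bit positions, for illustration only. */
#define FAKE_MSG_CRASH_DUMP_POSTED	(1u << 1)
#define FAKE_MSG_FLUSH_LOG_BUFFER	(1u << 3)

/*
 * Read the message bits this handler cares about and clear exactly
 * those bits, leaving every other bit in the register untouched.
 * Returns the sampled bits so that a bottom half can act on them.
 */
static inline uint32_t sample_and_clear(uint32_t *scratch)
{
	uint32_t msg = *scratch & (FAKE_MSG_CRASH_DUMP_POSTED |
				   FAKE_MSG_FLUSH_LOG_BUFFER);

	if (msg)
		*scratch &= ~msg;

	return msg;
}
```

Clearing early, before the logs are actually copied, shortens the window in which a second flush request from the firmware could be wiped out by the host's clear of the first one.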
Re: [Intel-gfx] [PATCH 05/17] drm/i915: Support for GuC interrupts
On 7/11/2016 4:00 PM, Tvrtko Ursulin wrote:
On 10/07/16 14:41, akash.g...@intel.com wrote:
From: Sagar Arun Kamble

There are certain types of interrupts which Host can receive from GuC. GuC ukernel sends an interrupt to Host for certain events, for example to retrieve/consume the logs generated by ukernel. This patch adds support to receive interrupts from GuC but currently enables & partially handles only the interrupt sent by GuC ukernel. Future patches will add support for handling other interrupt types.

v2:
- Use common low level routines for PM IER/IIR programming (Chris)
- Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
- Replace disabling of wake ref asserts with rpm get/put (Chris)

v3:
- Update comments for more clarity. (Tvrtko)
- Remove the masking of GuC interrupt, which was kept masked till the start of bottom half; it's not really needed as there is only a single instance of work item & wq is ordered. (Tvrtko)

v4:
- Rebase.
- Rename guc_events to pm_guc_events so as to be indicative of the register/control block it is associated with. (Chris)
- Add handling for back to back log buffer flush interrupts.
Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.h| 1 + drivers/gpu/drm/i915/i915_guc_submission.c | 5 ++ drivers/gpu/drm/i915/i915_irq.c| 98 -- drivers/gpu/drm/i915/i915_reg.h| 11 drivers/gpu/drm/i915/intel_drv.h | 3 + drivers/gpu/drm/i915/intel_guc.h | 4 ++ drivers/gpu/drm/i915/intel_guc_loader.c| 4 ++ 7 files changed, 122 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index c3a579f..6e2ddfa 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1794,6 +1794,7 @@ struct drm_i915_private { u32 pm_imr; u32 pm_ier; u32 pm_rps_events; +u32 pm_guc_events; u32 pipestat_irq_mask[I915_MAX_PIPES]; struct i915_hotplug hotplug; + /** * bdw_update_port_irq - update DE port interrupt * @dev_priv: driver private @@ -1174,6 +1208,21 @@ static void gen6_pm_rps_work(struct work_struct *work) mutex_unlock(_priv->rps.hw_lock); } +static void gen9_guc2host_events_work(struct work_struct *work) +{ +struct drm_i915_private *dev_priv = +container_of(work, struct drm_i915_private, guc.events_work); + +spin_lock_irq(_priv->irq_lock); +/* Speed up work cancellation during disabling guc interrupts. 
*/ +if (!dev_priv->guc.interrupts_enabled) { +spin_unlock_irq(_priv->irq_lock); +return; +} +spin_unlock_irq(_priv->irq_lock); + +/* TODO: Handle the events for which GuC interrupted host */ +} /** * ivybridge_parity_work - Workqueue called when a parity error interrupt @@ -1346,11 +1395,13 @@ static irqreturn_t gen8_gt_irq_ack(struct drm_i915_private *dev_priv, DRM_ERROR("The master control interrupt lied (GT3)!\n"); } -if (master_ctl & GEN8_GT_PM_IRQ) { +if (master_ctl & (GEN8_GT_PM_IRQ | GEN8_GT_GUC_IRQ)) { gt_iir[2] = I915_READ_FW(GEN8_GT_IIR(2)); -if (gt_iir[2] & dev_priv->pm_rps_events) { +if (gt_iir[2] & (dev_priv->pm_rps_events | + dev_priv->pm_guc_events)) { I915_WRITE_FW(GEN8_GT_IIR(2), - gt_iir[2] & dev_priv->pm_rps_events); + gt_iir[2] & (dev_priv->pm_rps_events | + dev_priv->pm_guc_events)); ret = IRQ_HANDLED; } else DRM_ERROR("The master control interrupt lied (PM)!\n"); @@ -1382,6 +1433,9 @@ static void gen8_gt_irq_handler(struct drm_i915_private *dev_priv, if (gt_iir[2] & dev_priv->pm_rps_events) gen6_rps_irq_handler(dev_priv, gt_iir[2]); + +if (gt_iir[2] & dev_priv->pm_guc_events) +gen9_guc_irq_handler(dev_priv, gt_iir[2]); } static bool bxt_port_hotplug_long_detect(enum port port, u32 val) @@ -1628,6 +1682,38 @@ static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir) } } +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir) +{ +if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) { +spin_lock(_priv->irq_lock); +if (dev_priv->guc.interrupts_enabled) { +/* Sample the log buffer flush related bits & clear them + * out now itself from the message identity register to + * minimize the probability of losing a flush interrupt, + * when there are back to back flush interrupts. + * There can be a new flush interrupt, for different log + * buffer type (like for ISR), whilst Host is handling + * one (for DPC). Since same bit is used in message + * register for ISR & DPC,
Re: [Intel-gfx] [PATCH 01/17] drm/i915: Decouple GuC log setup from verbosity parameter
On 7/11/2016 5:20 PM, Tvrtko Ursulin wrote: On 11/07/16 12:41, Goel, Akash wrote: On 7/11/2016 3:07 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Sagar Arun Kamble <sagar.a.kam...@intel.com> b/drivers/gpu/drm/i915/i915_guc_submission.c index 2112e02..8a9a0cb 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c index 605c696..b211bd0 100644 --- a/drivers/gpu/drm/i915/intel_guc_loader.c +++ b/drivers/gpu/drm/i915/intel_guc_loader.c @@ -175,11 +175,13 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv) params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER | GUC_CTL_VCS2_ENABLED; -if (i915.guc_log_level >= 0) { -params[GUC_CTL_LOG_PARAMS] = guc->log_flags; +params[GUC_CTL_LOG_PARAMS] = guc->log_flags; guc->log_flags will be zero when logging is not configured because guc is a part of dev_priv. So it looks safe - although I reckon it would be clearer to set this (GUC_CTL_LOG_PARAMS) explicitly inside the if-else below? If logging is not enabled at (due to guc_log_level < 0), then also log_flags needs to be setup & passed to GuC firmware. log_flags shall not be zero even when logging is not be enabled (at boot time). Actually log_flags will also contain the address of the log buffer. Ah yes, I got confused by jumping between one file with your patch applied and one without it. + +if (i915.guc_log_level >= 0) params[GUC_CTL_DEBUG] = i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT; -} +else +params[GUC_CTL_DEBUG] = GUC_LOG_DISABLED; I also wonder how come GUC_LOG_DISABLED isn't set today when i915.guc_log_level == -1, given that: #define GUC_LOG_DISABLED (1 << 6) Is that bit set by default somehow if i915 does not program it? Yes currently GUC_LOG_DISABLED won't get set for guc_log_level = -1. But then log buffer address will go as NULL and GUC_LOG_VALID flag will go as 0, for guc_log_level = -1. 
So this way logging on the GuC side will not get enabled. I hope I understood your concern correctly.

Yes, this clarifies it. Although I do have one more question then - what happens if at boot i915.guc_log_level == -1 and then with later patches logging gets enabled via debugfs - who sets params[GUC_CTL_DEBUG], and how? Host2GuC overrides this parameter?

Yes, through the Host2GuC action type UK_LOG_ENABLE_LOGGING, Host will request GuC firmware to enable/disable logging and alter the verbosity level. The params[GUC_CTL_DEBUG] is just part of the firmware initialization parameters and is not used after that.

Best regards
Akash

Regards,
Tvrtko
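The if/else that the patch adds for params[GUC_CTL_DEBUG] can be written as a small pure function. The value of FAKE_LOG_DISABLED matches the (1 << 6) quoted in the thread. The verbosity shift of 0 is an assumption made for illustration, and guc_ctl_debug_param() is a hypothetical name.

```c
#include <stdint.h>

#define FAKE_LOG_VERBOSITY_SHIFT	0		/* assumed value */
#define FAKE_LOG_DISABLED		(1u << 6)	/* as quoted in the thread */

/*
 * Initialization-time debug parameter: the verbosity level when
 * logging is requested, otherwise the explicit "disabled" bit.
 */
static inline uint32_t guc_ctl_debug_param(int log_level)
{
	if (log_level >= 0)
		return (uint32_t)log_level << FAKE_LOG_VERBOSITY_SHIFT;

	return FAKE_LOG_DISABLED;
}
```

As the reply notes, this value only matters at firmware initialization. Later changes to logging go through the Host2GuC action instead.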
Re: [Intel-gfx] [PATCH 01/17] drm/i915: Decouple GuC log setup from verbosity parameter
On 7/11/2016 3:07 PM, Tvrtko Ursulin wrote: On 10/07/16 14:41, akash.g...@intel.com wrote: From: Sagar Arun Kambleb/drivers/gpu/drm/i915/i915_guc_submission.c index 2112e02..8a9a0cb 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -832,9 +832,6 @@ static void guc_create_log(struct intel_guc *guc) unsigned long offset; uint32_t size, flags; -if (i915.guc_log_level < GUC_LOG_VERBOSITY_MIN) -return; - if (i915.guc_log_level > GUC_LOG_VERBOSITY_MAX) i915.guc_log_level = GUC_LOG_VERBOSITY_MAX; diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c index 605c696..b211bd0 100644 --- a/drivers/gpu/drm/i915/intel_guc_loader.c +++ b/drivers/gpu/drm/i915/intel_guc_loader.c @@ -175,11 +175,13 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv) params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER | GUC_CTL_VCS2_ENABLED; -if (i915.guc_log_level >= 0) { -params[GUC_CTL_LOG_PARAMS] = guc->log_flags; +params[GUC_CTL_LOG_PARAMS] = guc->log_flags; guc->log_flags will be zero when logging is not configured because guc is a part of dev_priv. So it looks safe - although I reckon it would be clearer to set this (GUC_CTL_LOG_PARAMS) explicitly inside the if-else below? If logging is not enabled at (due to guc_log_level < 0), then also log_flags needs to be setup & passed to GuC firmware. log_flags shall not be zero even when logging is not be enabled (at boot time). Actually log_flags will also contain the address of the log buffer. + +if (i915.guc_log_level >= 0) params[GUC_CTL_DEBUG] = i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT; -} +else +params[GUC_CTL_DEBUG] = GUC_LOG_DISABLED; I also wonder how come GUC_LOG_DISABLED isn't set today when i915.guc_log_level == -1, given that: #define GUC_LOG_DISABLED (1 << 6) Is that bit set by default somehow if i915 does not program it? Yes currently GUC_LOG_DISABLED won't get set for guc_log_level = -1. 
But then the log buffer address will be passed as NULL and the GUC_LOG_VALID flag as 0, for guc_log_level = -1. So this way logging on the GuC side will not get enabled. I hope I understood your concern correctly.

Best regards
Akash

 if (guc->ads_obj) {
 u32 ads = (u32)i915_gem_obj_ggtt_offset(guc->ads_obj)

Regards,
Tvrtko
Re: [Intel-gfx] [PATCH 03/14] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set
On 7/3/2016 3:08 PM, Chris Wilson wrote: On Sun, Jul 03, 2016 at 12:21:20AM +0530, akash.g...@intel.com wrote: From: Akash GoelSo far PM IER/IIR/IMR registers were being used only for Turbo related interrupts. But interrupts coming from GuC also use the same set. As a precursor to supporting GuC interrupts, added new low level routines so as to allow sharing the programming of PM IER/IIR/IMR registers between Turbo & GuC. Also similar to PM IMR, maintaining a bitmask for PM IER register, to allow easy sharing of it between Turbo & GuC without involving a rmw operation. v2: - For appropriateness & avoid any ambiguity, rename old functions enable/disable pm_irq to mask/unmask pm_irq and rename new functions enable/disable pm_interrupts to enable/disable pm_irq. (Tvrtko) - Use u32 in place of uint32_t. (Tvrtko) Suggested-by: Chris Wilson Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_irq.c | 63 - drivers/gpu/drm/i915/intel_drv.h| 3 ++ drivers/gpu/drm/i915/intel_ringbuffer.c | 4 +-- 4 files changed, 53 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 9ef4919..85a7103 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1806,6 +1806,7 @@ struct drm_i915_private { }; u32 gt_irq_mask; u32 pm_irq_mask; + u32 pm_ier_mask; Oops. u32 pm_imr; and u32 pm_ier; Fine, will rename. 
u32 pm_rps_events; u32 pipestat_irq_mask[I915_MAX_PIPES]; diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 4378a65..dd5ae6d 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -314,7 +314,7 @@ static void snb_update_pm_irq(struct drm_i915_private *dev_priv, } } -void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask) +void gen6_unmask_pm_irq(struct drm_i915_private *dev_priv, u32 mask) { if (WARN_ON(!intel_irqs_enabled(dev_priv))) return; @@ -322,28 +322,62 @@ void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask) snb_update_pm_irq(dev_priv, mask, mask); } -static void __gen6_disable_pm_irq(struct drm_i915_private *dev_priv, - uint32_t mask) +static void __gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask) { snb_update_pm_irq(dev_priv, mask, 0); } -void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask) +void gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask) { if (WARN_ON(!intel_irqs_enabled(dev_priv))) return; - __gen6_disable_pm_irq(dev_priv, mask); + __gen6_mask_pm_irq(dev_priv, mask); } -void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv) +void gen6_reset_pm_irq(struct drm_i915_private *dev_priv, u32 reset_mask) reset_pm_iir Thanks, will update. { i915_reg_t reg = gen6_pm_iir(dev_priv); - spin_lock_irq(_priv->irq_lock); - I915_WRITE(reg, dev_priv->pm_rps_events); - I915_WRITE(reg, dev_priv->pm_rps_events); + assert_spin_locked(_priv->irq_lock); + + I915_WRITE(reg, reset_mask); + I915_WRITE(reg, reset_mask); POSTING_READ(reg); +} + +void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, u32 enable_mask) +{ + u32 new_val; + + assert_spin_locked(_priv->irq_lock); + + new_val = dev_priv->pm_ier_mask; + new_val |= enable_mask; + + dev_priv->pm_ier_mask = new_val; dev_priv->pm_ier |= new_val; Sorry, my bad. 
+ I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier_mask); + gen6_unmask_pm_irq(dev_priv, enable_mask); What barrier do you need between the hw and the caller? I presume there is a POSTING_READ in this callchain, would be nice to document it. /* unmask_pm_irq provides a POSTING_READ */ Thanks, will add the comment. So will assume that POSTING_READ is good enough here. +} + +void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, u32 disable_mask) +{ + u32 new_val; + + assert_spin_locked(_priv->irq_lock); + + new_val = dev_priv->pm_ier_mask; + new_val &= ~disable_mask; + + dev_priv->pm_ier_mask = new_val; dev_priv->pm_ier &= ~disable_mask; + __gen6_mask_pm_irq(dev_priv, disable_mask); + I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier_mask); Do we need a barrier upon disabling? (Usually we need a stronger barrier on enabling to ensure we don't miss an interrupt when enabling, but for disabling we don't care.) So no modification needed here, as you mentioned that we don't need to care about the register update getting completed in the disabling case. +} + +void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv) +{ +
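The shadow-register idea and the enable/disable ordering discussed in this review can be tried out in user space. The following is only a minimal sketch under stated assumptions: `fake_regs`, `pm_state` and the `pm_*` helpers are hypothetical stand-ins, plain memory replaces MMIO, and no locking or POSTING_READ is modelled. It is not the i915 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the PM IER/IMR registers (plain memory, not MMIO). */
struct fake_regs {
	uint32_t ier;	/* interrupt enable register */
	uint32_t imr;	/* interrupt mask register, 1 = masked */
};

/* Software shadows, in the spirit of the pm_ier / pm_imr fields from the review. */
struct pm_state {
	struct fake_regs *regs;
	uint32_t pm_ier;
	uint32_t pm_imr;
};

static void pm_init(struct pm_state *pm, struct fake_regs *regs)
{
	assert(pm != NULL && regs != NULL);
	regs->ier = 0;
	regs->imr = ~0u;	/* everything masked to start with */
	pm->regs = regs;
	pm->pm_ier = 0;
	pm->pm_imr = ~0u;
}

static void pm_unmask(struct pm_state *pm, uint32_t mask)
{
	pm->pm_imr &= ~mask;
	pm->regs->imr = pm->pm_imr;
}

static void pm_mask(struct pm_state *pm, uint32_t mask)
{
	pm->pm_imr |= mask;
	pm->regs->imr = pm->pm_imr;
}

/* Enable: update the shadow, write IER from it (no register read), then unmask. */
static void pm_enable(struct pm_state *pm, uint32_t enable_mask)
{
	pm->pm_ier |= enable_mask;
	pm->regs->ier = pm->pm_ier;
	pm_unmask(pm, enable_mask);
}

/* Disable: mask first, then write the reduced shadow value to IER. */
static void pm_disable(struct pm_state *pm, uint32_t disable_mask)
{
	pm->pm_ier &= ~disable_mask;
	pm_mask(pm, disable_mask);
	pm->regs->ier = pm->pm_ier;
}
```

Two users (for example Turbo and GuC in the thread) can call `pm_enable()` and `pm_disable()` with their own bit masks; because IER is always written from the shadow, neither has to read the register back to preserve the other's bits.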
Re: [Intel-gfx] [PATCH 13/14] drm/i915: Add stats for GuC log buffer flush interrupts
On 7/3/2016 3:14 PM, Chris Wilson wrote:
On Sun, Jul 03, 2016 at 12:21:30AM +0530, akash.g...@intel.com wrote:
From: Akash Goel

GuC firmware sends an interrupt to flush the log buffer when it becomes half full. GuC firmware also tracks how many times the buffer overflowed. It would be useful to maintain a statistics of how many flush

For what purpose? Would not tracepoints be even more useful?

Having stats would be useful to get an idea of the volume and the rate at which logs are being generated from the GuC side, and whether the driver is quick enough to capture all of them. Yes, a tracepoint would also be very useful.

Please see below the logging related stats, in the output of ‘i915_guc_info’ on execution of the ‘gem_exec_nop’ IGT.

GuC total action count: 623531
GuC action failure count: 0
GuC last action command: 0x30
GuC last action status: 0xf000
GuC last action error code: 0

GuC submissions:
	render ring :9019910, last seqno 0x01a4390b
	blitter ring:6188291, last seqno 0x01a4390d
	bsd ring:6179075, last seqno 0x01a4390c
	video enhancement ring :6156547, last seqno 0x01a4390e
	Total: 27543823

GuC execbuf client @ 8801659fb100:
	Priority 2, GuC ctx index: 0, PD offset 0x800
	Doorbell id 0, offset: 0x0, cookie 0x1a4490f
	WQ size 8192, offset: 0x1000, tail 4336
	Work queue full: 0
	Failed to queue: 0
	Failed doorbell: 0
	Last submission result: 0
	Submissions: 9019910 render ring
	Submissions: 6188291 blitter ring
	Submissions: 6179075 bsd ring
	Submissions: 6156547 video enhancement ring
	Total: 27543823

GuC logging stats:
	ISR: flush count 321718, overflow count 0
	DPC: flush count 303788, overflow count 1
	CRASH: flush count 0, overflow count 0
	Total flush interrupt count: 625511

Best regards
Akash

-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 05/14] drm/i915: Handle log buffer flush interrupt event from GuC
On 7/3/2016 5:51 PM, Goel, Akash wrote:
On 7/3/2016 2:45 PM, Chris Wilson wrote:
On Sun, Jul 03, 2016 at 12:21:22AM +0530, akash.g...@intel.com wrote:
+static void guc_read_update_log_buffer(struct drm_device *dev, bool capture_all)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_guc *guc = &dev_priv->guc;
+	struct guc_log_buffer_state *log_buffer_state;
+	struct guc_log_buffer_state *log_buffer_copy_state;
+	void *src_ptr, *dst_ptr;
+	u32 num_pages_to_copy;
+	int i;
+
+	if (!guc->log.obj)
+		return;
+
+	num_pages_to_copy = guc->log.obj->base.size / PAGE_SIZE;
+	/* Don't really need to copy crash buffer area in regular cases as there
+	 * won't be any unread data there.
+	 */
+	if (!capture_all)
+		num_pages_to_copy -= (GUC_LOG_CRASH_PAGES + 1);
+
+	log_buffer_state = src_ptr =
+		kmap_atomic(i915_gem_object_get_page(guc->log.obj, 0));

So why not use i915_gem_object_pin_map() from the start? That will cut down on the churn later.

Fine, will reorder the series and squash the other patch 'drm/i915: Use uncached(WC) mapping for accessing the GuC log buffer' with this patch.

Sorry got confused, will use the i915_gem_object_pin_map() here instead of kmap and keep the WC mapping patch at the end of series only. Then will just have to modify the call to i915_gem_object_pin_map() to pass the WC flag.

Best regards
Akash

-Chris
Re: [Intel-gfx] [PATCH 05/14] drm/i915: Handle log buffer flush interrupt event from GuC
On 7/3/2016 2:45 PM, Chris Wilson wrote:
On Sun, Jul 03, 2016 at 12:21:22AM +0530, akash.g...@intel.com wrote:
+static void guc_read_update_log_buffer(struct drm_device *dev, bool capture_all)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_guc *guc = &dev_priv->guc;
+	struct guc_log_buffer_state *log_buffer_state;
+	struct guc_log_buffer_state *log_buffer_copy_state;
+	void *src_ptr, *dst_ptr;
+	u32 num_pages_to_copy;
+	int i;
+
+	if (!guc->log.obj)
+		return;
+
+	num_pages_to_copy = guc->log.obj->base.size / PAGE_SIZE;
+	/* Don't really need to copy crash buffer area in regular cases as there
+	 * won't be any unread data there.
+	 */
+	if (!capture_all)
+		num_pages_to_copy -= (GUC_LOG_CRASH_PAGES + 1);
+
+	log_buffer_state = src_ptr =
+		kmap_atomic(i915_gem_object_get_page(guc->log.obj, 0));

So why not use i915_gem_object_pin_map() from the start? That will cut down on the churn later.

Fine, will reorder the series and squash the other patch 'drm/i915: Use uncached(WC) mapping for accessing the GuC log buffer' with this patch.

Best regards
Akash

-Chris
Re: [Intel-gfx] [PATCH 04/11] drm/i915: Support for GuC interrupts
On 7/1/2016 2:17 PM, Tvrtko Ursulin wrote: On 01/07/16 07:16, Goel, Akash wrote: [snip] +/* Process all the GuC to Host events in bottom half */ +gen6_disable_pm_irq(dev_priv, +GEN9_GUC_TO_HOST_INT_EVENT); Why it is important to disable the interrupt here? Not for the queue work I think. We want to & can handle one interrupt at a time, unless the queued work item is executed we can't process the next interrupt, so better to keep the interrupt masked. Sorry this is what is my understanding. So it is queued in hardware and will get asserted when unmasked? As per my understanding, if the interrupt is masked (IMR), it won't be queued, will be ignored & so will not be asserted on unmasking. If the interrupt wasn't masked, but was disabled (in IER) then it will be asserted (in IIR) when its enabled. Also, is it safe with regards to potentially losing the interrupt? Particularly for the FLUSH_LOG_BUFFER case, GuC won't send a new flush interrupt unless its gets an acknowledgement (flush signal) of the previous one from Host. Ah so the previous comment is really impossible? I mean the need to mask? Sorry my comments were not fully correct. GuC can send a new flush interrupt, even if the previous one is pending, but that will be for a different log buffer type (3 types of log buffer ISR, DPC, CRASH). For the same buffer type, GuC won't send a new flush interrupt unless its gets an acknowledgement of the previous one from Host. But as you said the workqueue is ordered and furthermore there is a single instance of work item, so the serialization will be provided implicitly and there is no real need to mask the interrupt. As mentioned above, a new flush interrupt can come while the previous one is being processed on Host but due to a single instance of work item either that new interrupt will not do anything effectively if work item was in a pending state or will re queue the work item if it was getting executed at that time. 
Also the state of all 3 log buffer types are being parsed irrespective for which one the interrupt actually came, and the whole buffer is being captured (this is how it has been recommended to handle the flush interrupts from Host side). So if a new interrupt comes while the work item was in a pending state, then effectively work of this new interrupt will also be done when work item is executed later. So will remove the masking then ? I think so, because if I understood what you wrote, masking can lose us an interrupt. If a new flush interrupt comes while the work item was getting executed then there is a potential of losing an opportunity to sample the log buffer. Will not mask the interrupt. Thanks for persisting on this. Possibly just put a comment up there explaining that. +queue_work(dev_priv->wq, _priv->guc.events_work); Because dev_priv->wq is a one a time in order wq so if something else is running on it and taking time, can that also be a cause of dropping an interrupt or being late with sending the flush signal to the guc and so losing some logs? Its a Driver's private workqueue and Turbo work item is also queued inside this workqueue which too needs to be executed without much delay. But yes the flush work item can get substantially delayed in case if there are other work items queued before it, especially the mm.retire_work (but generally executes every ~1 second). Best would be if the log buffer (44KB data) can be sampled in IRQ context (or Tasklet context) itself. I was just trying to understand if you perhaps need a dedicated wq. I don't have a feel at all on how much data guc logging generates per second. If the interrupt is low frequency even with a lot of cmd submission happening it could be fine like it is. Actually with maximum verbosity level, I am seeing flush interrupt every ms, with 'gem_exec_nop' IGT, as there are lot of submissions being done. But such may not happen in real life scenario. 
I think, if needed, later on we can either have a dedicated high priority work queue for logging work or use the tasklet context to do the processing. Hm, do you need to add some DRM_ERROR or something if wq starts lagging behind the flush interrupts? How many missed flush interrupts can we afford before the logging buffer starts getting overwritten? Actually if GuC is producing logs at such a fast rate then we can't afford to miss even a single interrupt, if we don't want to lose any logs. When the log buffer becomes half full, GuC sends a flush interrupt. GuC firmware expects that while it is writing to 2nd half of the buffer, first half would get consumed by Host and then get a flush completed acknowledgement from Host, so that it does not end up doing any overwrite causing loss of logs. There is a buffer_full_cnt field in the state structure which GuC firmware increments every time it detects a potential log buffer overflow. Probably this
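The point made in this exchange, that a single work item instance coalesces interrupts arriving while it is pending and is re-queued when one arrives during execution, can be illustrated with a small state model. This is a sketch under stated assumptions: `model_work` and its helpers are hypothetical and only imitate the pending/running behaviour described in the thread; they are not the kernel workqueue implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one work item; not the kernel's struct work_struct. */
struct model_work {
	bool pending;		/* queued, handler not started yet */
	bool running;		/* handler currently executing */
	unsigned int runs;	/* number of completed handler executions */
};

/*
 * Called from the (modelled) interrupt handler.
 * Returns true if the item was newly queued, false if it was already
 * pending, in which case the new interrupt is absorbed by the queued run.
 */
static bool model_queue_work(struct model_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* The worker picks the item up: pending is cleared before the handler runs. */
static bool model_work_start(struct model_work *w)
{
	if (!w->pending || w->running)
		return false;
	w->pending = false;
	w->running = true;
	return true;
}

/* The handler has finished sampling the log buffer. */
static void model_work_finish(struct model_work *w)
{
	assert(w->running);
	w->running = false;
	w->runs++;
}
```

Because the handler in the thread parses the state of all three log buffer types on every run, an interrupt absorbed while the item is pending loses nothing in this model; only a handler that starts too late relative to the producer can miss data.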
Re: [Intel-gfx] [PATCH 04/11] drm/i915: Support for GuC interrupts
On 6/28/2016 7:14 PM, Tvrtko Ursulin wrote: On 28/06/16 12:12, Goel, Akash wrote: On 6/28/2016 3:33 PM, Tvrtko Ursulin wrote: On 27/06/16 13:16, akash.g...@intel.com wrote: From: Sagar Arun Kamble <sagar.a.kam...@intel.com> There are certain types of interrupts which Host can recieve from GuC. GuC ukernel sends an interrupt to Host for certain events, like for example retrieve/consume the logs generated by ukernel. This patch adds support to receive interrupts from GuC but currently enables & partially handles only the interrupt sent by GuC ukernel. Future patches will add support for handling other interrupt types. v2: Use common low level routines for PM IER/IIR programming (Chris) Rename interrupt functions to gen9_xxx from gen8_xxx (Chris) Replace disabling of wake ref asserts with rpm get/put (Chris) Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com> Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_drv.h| 1 + drivers/gpu/drm/i915/i915_guc_submission.c | 5 ++ drivers/gpu/drm/i915/i915_irq.c| 95 -- drivers/gpu/drm/i915/i915_reg.h| 11 drivers/gpu/drm/i915/intel_drv.h | 3 + drivers/gpu/drm/i915/intel_guc.h | 5 ++ drivers/gpu/drm/i915/intel_guc_loader.c| 4 ++ 7 files changed, 120 insertions(+), 4 deletions(-) +static void gen9_guc2host_events_work(struct work_struct *work) +{ +struct drm_i915_private *dev_priv = +container_of(work, struct drm_i915_private, guc.events_work); + +spin_lock_irq(_priv->irq_lock); +/* Speed up work cancelation during disabling guc interrupts. */ +if (!dev_priv->guc.interrupts_enabled) { +spin_unlock_irq(_priv->irq_lock); +return; +} + +/* Though this work item gets synced during rpm suspend, but still need + * a rpm get/put to avoid the warning, as it could get executed in a + * window, where rpm ref count has dropped to zero but rpm suspend has + * not kicked in. Generally device is expected to be active only at this + * time so get/put should be really quick. 
+ */ +intel_runtime_pm_get(dev_priv); + +gen6_enable_pm_irq(dev_priv, GEN9_GUC_TO_HOST_INT_EVENT); +spin_unlock_irq(_priv->irq_lock); + +/* TODO: Handle the events for which GuC interrupted host */ + +intel_runtime_pm_put(dev_priv); +} static bool bxt_port_hotplug_long_detect(enum port port, u32 val) @@ -1653,6 +1722,20 @@ static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir) } } +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir) +{ +if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) { +spin_lock(_priv->irq_lock); +if (dev_priv->guc.interrupts_enabled) { So it is expected interrupts will always be enabled when i915.guc_log_level is set, correct? Yes currently only when guc_log_level > 0, interrupt should be enabled. But we need to disable/enable the interrupt upon suspend/resume and across GPU reset. So interrupt may not be always in a enabled state when guc_log_level>0. Also do you need to check against dev_priv->guc.interrupts_enabled at all then? Or from an opposite angle, would you instead need to log the fact unexpected interrupt was received here? I think this check is needed, to avoid the race in disabling interrupt. Please refer the sequence in interrupt disabling function (same as rps disabling), there we first set the interrupts_enabled flag to false, then wait for the work item to finish execution and then program the IMR register. Right I see now that it is copy-pasted existing sequence. In this case I won't question it further. :) +/* Process all the GuC to Host events in bottom half */ +gen6_disable_pm_irq(dev_priv, +GEN9_GUC_TO_HOST_INT_EVENT); Why it is important to disable the interrupt here? Not for the queue work I think. We want to & can handle one interrupt at a time, unless the queued work item is executed we can't process the next interrupt, so better to keep the interrupt masked. Sorry this is what is my understanding. So it is queued in hardware and will get asserted when unmasked? 
As per my understanding, if the interrupt is masked (IMR), it won't be queued, will be ignored & so will not be asserted on unmasking. If the interrupt wasn't masked, but was disabled (in IER) then it will be asserted (in IIR) when its enabled. Also, is it safe with regards to potentially losing the interrupt? Particularly for the FLUSH_LOG_BUFFER case, GuC won't send a new flush interrupt unless its gets an acknowledgement (flush signal) of the previous one from Host. Ah so the previous comment is really impossible? I mean the need to mask? Sorry my comments were not fully correct. GuC can send a new flush interrupt, even if the previous one
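The difference between masking (IMR) and disabling (IER), as it is described in the reply above, can be written down as a tiny model. This is a sketch that encodes only the understanding stated in the thread, not verified hardware behaviour; `irq_model` and its helpers are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical three-register model: mask, enable and identity (latched) bits. */
struct irq_model {
	uint32_t imr;	/* 1 = masked */
	uint32_t ier;	/* 1 = enabled for delivery */
	uint32_t iir;	/* 1 = event latched */
};

/* An event arrives: per the thread's description it is ignored while masked. */
static void irq_raise(struct irq_model *m, uint32_t bit)
{
	if (m->imr & bit)
		return;		/* masked: not queued, lost */
	m->iir |= bit;		/* unmasked: latched even if not enabled */
}

static void irq_unmask(struct irq_model *m, uint32_t bit)
{
	m->imr &= ~bit;
}

static void irq_enable(struct irq_model *m, uint32_t bit)
{
	m->ier |= bit;
}

/* Bits that would actually be signalled to the CPU in this model. */
static uint32_t irq_deliverable(const struct irq_model *m)
{
	return m->iir & m->ier;
}
```

Under this model, an event raised while masked never shows up after unmasking, whereas an event raised while merely disabled becomes deliverable as soon as the enable bit is set, which is the concern about losing a flush interrupt raised in the review.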
Re: [Intel-gfx] [PATCH 04/11] drm/i915: Support for GuC interrupts
On 6/28/2016 3:33 PM, Tvrtko Ursulin wrote: On 27/06/16 13:16, akash.g...@intel.com wrote: From: Sagar Arun KambleThere are certain types of interrupts which Host can recieve from GuC. GuC ukernel sends an interrupt to Host for certain events, like for example retrieve/consume the logs generated by ukernel. This patch adds support to receive interrupts from GuC but currently enables & partially handles only the interrupt sent by GuC ukernel. Future patches will add support for handling other interrupt types. v2: Use common low level routines for PM IER/IIR programming (Chris) Rename interrupt functions to gen9_xxx from gen8_xxx (Chris) Replace disabling of wake ref asserts with rpm get/put (Chris) Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.h| 1 + drivers/gpu/drm/i915/i915_guc_submission.c | 5 ++ drivers/gpu/drm/i915/i915_irq.c| 95 -- drivers/gpu/drm/i915/i915_reg.h| 11 drivers/gpu/drm/i915/intel_drv.h | 3 + drivers/gpu/drm/i915/intel_guc.h | 5 ++ drivers/gpu/drm/i915/intel_guc_loader.c| 4 ++ 7 files changed, 120 insertions(+), 4 deletions(-) +static void gen9_guc2host_events_work(struct work_struct *work) +{ +struct drm_i915_private *dev_priv = +container_of(work, struct drm_i915_private, guc.events_work); + +spin_lock_irq(_priv->irq_lock); +/* Speed up work cancelation during disabling guc interrupts. */ +if (!dev_priv->guc.interrupts_enabled) { +spin_unlock_irq(_priv->irq_lock); +return; +} + +/* Though this work item gets synced during rpm suspend, but still need + * a rpm get/put to avoid the warning, as it could get executed in a + * window, where rpm ref count has dropped to zero but rpm suspend has + * not kicked in. Generally device is expected to be active only at this + * time so get/put should be really quick. 
+ */ +intel_runtime_pm_get(dev_priv); + +gen6_enable_pm_irq(dev_priv, GEN9_GUC_TO_HOST_INT_EVENT); +spin_unlock_irq(_priv->irq_lock); + +/* TODO: Handle the events for which GuC interrupted host */ + +intel_runtime_pm_put(dev_priv); +} /** * ivybridge_parity_work - Workqueue called when a parity error interrupt @@ -1371,11 +1435,13 @@ static irqreturn_t gen8_gt_irq_ack(struct drm_i915_private *dev_priv, DRM_ERROR("The master control interrupt lied (GT3)!\n"); } -if (master_ctl & GEN8_GT_PM_IRQ) { +if (master_ctl & (GEN8_GT_PM_IRQ | GEN8_GT_GUC_IRQ)) { gt_iir[2] = I915_READ_FW(GEN8_GT_IIR(2)); -if (gt_iir[2] & dev_priv->pm_rps_events) { +if (gt_iir[2] & (dev_priv->pm_rps_events | + dev_priv->guc_events)) { I915_WRITE_FW(GEN8_GT_IIR(2), - gt_iir[2] & dev_priv->pm_rps_events); + gt_iir[2] & (dev_priv->pm_rps_events | + dev_priv->guc_events)); ret = IRQ_HANDLED; } else DRM_ERROR("The master control interrupt lied (PM)!\n"); @@ -1407,6 +1473,9 @@ static void gen8_gt_irq_handler(struct drm_i915_private *dev_priv, if (gt_iir[2] & dev_priv->pm_rps_events) gen6_rps_irq_handler(dev_priv, gt_iir[2]); + +if (gt_iir[2] & dev_priv->guc_events) +gen9_guc_irq_handler(dev_priv, gt_iir[2]); } static bool bxt_port_hotplug_long_detect(enum port port, u32 val) @@ -1653,6 +1722,20 @@ static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir) } } +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir) +{ +if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) { +spin_lock(_priv->irq_lock); +if (dev_priv->guc.interrupts_enabled) { So it is expected interrupts will always be enabled when i915.guc_log_level is set, correct? Yes currently only when guc_log_level > 0, interrupt should be enabled. But we need to disable/enable the interrupt upon suspend/resume and across GPU reset. So interrupt may not be always in a enabled state when guc_log_level>0. Also do you need to check against dev_priv->guc.interrupts_enabled at all then? 
Or from an opposite angle, would you instead need to log the fact unexpected interrupt was received here? I think this check is needed, to avoid the race in disabling interrupt. Please refer the sequence in interrupt disabling function (same as rps disabling), there we first set the interrupts_enabled flag to false, then wait for the work item to finish execution and then program the IMR register. +/* Process all the GuC to Host events in bottom half */ +gen6_disable_pm_irq(dev_priv, +GEN9_GUC_TO_HOST_INT_EVENT); Why it is important to disable the interrupt here? Not for the queue work I think. We
Re: [Intel-gfx] [PATCH 10/11] drm/i915: Support to create write combined type vmaps
On 6/28/2016 3:22 PM, Chris Wilson wrote:
On Mon, Jun 27, 2016 at 05:46:57PM +0530, akash.g...@intel.com wrote:
From: Chris Wilson

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 20c701c..3ef1ee5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3197,6 +3197,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 /**
  * i915_gem_object_pin_map - return a contiguous mapping of the entire object
  * @obj - the object to map into kernel address space
+ * &use_wc - whether the mapping should be using WC or WB pgprot_t

s/&/@/ I think

Sorry, my bad.

 /* get, pin, and map the pages of the object into kernel space */
-void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj)
+void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj, bool use_wc)
 {
+	void *ptr;
+	bool has_wc;
+	bool pinned;
 	int ret;

 	lockdep_assert_held(&obj->base.dev->struct_mutex);
+	GEM_BUG_ON((obj->ops->flags & I915_GEM_OBJECT_HAS_STRUCT_PAGE) == 0);

 	ret = i915_gem_object_get_pages(obj);
 	if (ret)
 		return ERR_PTR(ret);

+	GEM_BUG_ON(obj->pages == NULL);
 	i915_gem_object_pin_pages(obj);

-	if (!obj->mapping) {
-		obj->mapping = i915_gem_object_map(obj);
-		if (!obj->mapping) {
-			i915_gem_object_unpin_pages(obj);
-			return ERR_PTR(-ENOMEM);
+	pinned = (obj->pages_pin_count > 1);

Too many ()

Sorry, is the above condition not correct? If the pin count is more than 1 then it implies that the pages have been pinned elsewhere also, so the pages were already pinned before they were pinned one more time inside this function. Please let me know, will fix it.

Best regards
Akash

Hmm. It may look a bit dubious if I add my r-b here. But I didn't spot any rebasing errors.
-Chris
Re: [Intel-gfx] [PATCH 06/11] drm/i915: Add a relay backed debugfs interface for capturing GuC logs
On 6/28/2016 3:17 PM, Chris Wilson wrote:
On Mon, Jun 27, 2016 at 05:46:53PM +0530, akash.g...@intel.com wrote:
+static void guc_remove_log_relay_file(struct intel_guc *guc)
+{
+	relay_close(guc->log_relay_chan);
+}
+
+static void guc_create_log_relay_file(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct drm_device *dev = dev_priv->dev;
+	struct dentry *log_dir;
+	struct rchan *guc_log_relay_chan;
+	size_t n_subbufs, subbuf_size;
+
+	if (guc->log_relay_chan)
+		return;
+
+	/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
+	 * not mounted and so can't create the relay file.
+	 * The relay API seems to fit well with debugfs only.
+	 */

Ah. dev->primary->debugfs_root does not exist until the end of driver loading. You need to add an intel_guc_register() to the i915_driver_register() after we call drm_dev_register() (that then calls this function). Similarly, this needs to be torn down in unregister.

Yes, realized this today, that we can't get to the ‘dri’ directory until the end of driver load. So will have to create the relay file after i915_driver_register().

+	if (!dev->primary->debugfs_root) {
+		/* logging will remain off */
+		i915.guc_log_level = -1;
+		return;
+	}
+
+	/* For now create the log file in /sys/kernel/debug/dri dir. */
+	log_dir = dev->primary->debugfs_root->d_parent;

In future, this will be something like /sys/kernel/gpu/i915/guc_log, so I don't see a good argument for not being more canonical in the debugfs placement and using dev->primary->debugfs_root (i.e. /.../dri/0)

Yes, can now use the dev->primary->debugfs_root itself. Actually earlier 'i915_debugfs_files' were being created inside other drm_minor directories also (i.e. dri/64 & dri/128), but now they are restricted only to dri/0.

Best regards
Akash

At the very least, you need to explain why we don't use dri/0/
-Chris
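The ordering constraint raised in this review (the relay file can only be created once the debugfs root exists, and must be torn down on unregister) can be sketched as a pair of register/unregister hooks. This is a user-space model under stated assumptions: `model_dev`, `model_guc` and both functions are hypothetical, and a boolean stands in for the real relay channel, so no actual relay or debugfs call is made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device: debugfs_root stays NULL until registration completes. */
struct model_dev {
	const char *debugfs_root;
};

/* Hypothetical GuC log state: relay_open stands in for the relay channel. */
struct model_guc {
	bool relay_open;
	int log_level;
};

/* Meant to be called after the device has been registered. */
static void model_guc_register(const struct model_dev *dev, struct model_guc *guc)
{
	assert(dev != NULL && guc != NULL);

	if (guc->relay_open)
		return;			/* already created, nothing to do */

	if (dev->debugfs_root == NULL) {
		guc->log_level = -1;	/* logging will remain off */
		return;
	}

	guc->relay_open = true;		/* file would be created under debugfs_root */
}

/* Counterpart for the unregister path: close the channel if it exists. */
static void model_guc_unregister(struct model_guc *guc)
{
	guc->relay_open = false;
}
```

Calling `model_guc_register()` before the root exists turns logging off in this model, which mirrors why the creation had to move to after driver registration.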
Re: [Intel-gfx] [PATCH 03/11] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set
On 6/28/2016 2:05 PM, Tvrtko Ursulin wrote: On 27/06/16 17:35, Goel, Akash wrote: On 6/27/2016 9:16 PM, Tvrtko Ursulin wrote: On 27/06/16 13:16, akash.g...@intel.com wrote: From: Akash Goel <akash.g...@intel.com> So far PM IER/IIR/IMR registers were being used only for Turbo related interrupts. But interrupts coming from GuC also use the same set. As a precursor to supporting GuC interrupts, added new low level routines so as to allow sharing the programming of PM IER/IIR/IMR registers between Turbo & GuC. Also similar to PM IMR, maintaining a bitmask for PM IER register, to allow easy sharing of it between Turbo & GuC without involving a rmw operation. Suggested-by: Chris Wilson <ch...@chris-wilson.co.uk> Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_irq.c | 55 drivers/gpu/drm/i915/intel_drv.h | 6 + 3 files changed, 52 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 9ef4919..85a7103 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1806,6 +1806,7 @@ struct drm_i915_private { }; u32 gt_irq_mask; u32 pm_irq_mask; +u32 pm_ier_mask; u32 pm_rps_events; u32 pipestat_irq_mask[I915_MAX_PIPES]; diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 4378a65..7316ab4 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -336,14 +336,52 @@ void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask) __gen6_disable_pm_irq(dev_priv, mask); } -void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv) +void gen6_reset_pm_interrupts(struct drm_i915_private *dev_priv, + uint32_t reset_mask) Kernel prefers u32. It is not that overall i915 is clean in that respect, but every time maintainers merge patches checkpatch shouts about it, and more noise tougher it is to spot more important issues. 
I would appreciate if u32 was used throughout. Fine, will use u32. Thanks! { i915_reg_t reg = gen6_pm_iir(dev_priv); -spin_lock_irq(_priv->irq_lock); -I915_WRITE(reg, dev_priv->pm_rps_events); -I915_WRITE(reg, dev_priv->pm_rps_events); +assert_spin_locked(_priv->irq_lock); + +I915_WRITE(reg, reset_mask); +I915_WRITE(reg, reset_mask); POSTING_READ(reg); +} + +void gen6_enable_pm_interrupts(struct drm_i915_private *dev_priv, + uint32_t enable_mask) +{ +uint32_t new_val; + +assert_spin_locked(_priv->irq_lock); + +new_val = dev_priv->pm_ier_mask; +new_val |= enable_mask; + +dev_priv->pm_ier_mask = new_val; +I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier_mask); +gen6_enable_pm_irq(dev_priv, enable_mask); Hm, will this be confusing that we will have gen6_enable_pm_interrupts and gen6_enable_pm_irq, so extremely similar names and same parameters, but for different use? Sorry for using confusing, ambiguous names. Maybe rename the old one to gen6_unmask_pm_irq and name this one gen6_enable_pm_irq ? If there is really need to have both. Or add some kerneldoc explaining which one is used for what? Can I do like this, keep gen6_enable_pm_interrupts as is and rename gen6_enable_pm_irq to gen6_unmask_pm_irq. Similarly also rename gen6_disable_pm_irq to gen6_mask_pm_irq. Yes for mask/unmask, but I think the suffix really needs to be the same since it is the same functional family. Fine, so will rename gen6_enable_pm_interrupts to gen6_enable_pm_irq, and gen6_enable_pm_irq to gen6_unmask_pm_irq Best regards Akash Regards, Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 03/11] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set
On 6/27/2016 9:16 PM, Tvrtko Ursulin wrote: On 27/06/16 13:16, akash.g...@intel.com wrote: From: Akash GoelSo far PM IER/IIR/IMR registers were being used only for Turbo related interrupts. But interrupts coming from GuC also use the same set. As a precursor to supporting GuC interrupts, added new low level routines so as to allow sharing the programming of PM IER/IIR/IMR registers between Turbo & GuC. Also similar to PM IMR, maintaining a bitmask for PM IER register, to allow easy sharing of it between Turbo & GuC without involving a rmw operation. Suggested-by: Chris Wilson Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_irq.c | 55 drivers/gpu/drm/i915/intel_drv.h | 6 + 3 files changed, 52 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 9ef4919..85a7103 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1806,6 +1806,7 @@ struct drm_i915_private { }; u32 gt_irq_mask; u32 pm_irq_mask; +u32 pm_ier_mask; u32 pm_rps_events; u32 pipestat_irq_mask[I915_MAX_PIPES]; diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 4378a65..7316ab4 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -336,14 +336,52 @@ void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask) __gen6_disable_pm_irq(dev_priv, mask); } -void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv) +void gen6_reset_pm_interrupts(struct drm_i915_private *dev_priv, + uint32_t reset_mask) Kernel prefers u32. It is not that overall i915 is clean in that respect, but every time maintainers merge patches checkpatch shouts about it, and more noise tougher it is to spot more important issues. I would appreciate if u32 was used throughout. Fine, will use u32. 
{ i915_reg_t reg = gen6_pm_iir(dev_priv); -spin_lock_irq(_priv->irq_lock); -I915_WRITE(reg, dev_priv->pm_rps_events); -I915_WRITE(reg, dev_priv->pm_rps_events); +assert_spin_locked(_priv->irq_lock); + +I915_WRITE(reg, reset_mask); +I915_WRITE(reg, reset_mask); POSTING_READ(reg); +} + +void gen6_enable_pm_interrupts(struct drm_i915_private *dev_priv, + uint32_t enable_mask) +{ +uint32_t new_val; + +assert_spin_locked(_priv->irq_lock); + +new_val = dev_priv->pm_ier_mask; +new_val |= enable_mask; + +dev_priv->pm_ier_mask = new_val; +I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier_mask); +gen6_enable_pm_irq(dev_priv, enable_mask); Hm, will this be confusing that we will have gen6_enable_pm_interrupts and gen6_enable_pm_irq, so extremely similar names and same parameters, but for different use? Sorry for using confusing, ambiguous names. Maybe rename the old one to gen6_unmask_pm_irq and name this one gen6_enable_pm_irq ? If there is really need to have both. Or add some kerneldoc explaining which one is used for what? Can I do like this, keep gen6_enable_pm_interrupts as is and rename gen6_enable_pm_irq to gen6_unmask_pm_irq. Similarly also rename gen6_disable_pm_irq to gen6_mask_pm_irq. 
Best regards Akash +} + +void gen6_disable_pm_interrupts(struct drm_i915_private *dev_priv, +uint32_t disable_mask) +{ +uint32_t new_val; + +assert_spin_locked(_priv->irq_lock); + +new_val = dev_priv->pm_ier_mask; +new_val &= ~disable_mask; + +dev_priv->pm_ier_mask = new_val; +__gen6_disable_pm_irq(dev_priv, disable_mask); +I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier_mask); +} + +void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv) +{ +spin_lock_irq(_priv->irq_lock); +gen6_reset_pm_interrupts(dev_priv, dev_priv->pm_rps_events); dev_priv->rps.pm_iir = 0; spin_unlock_irq(_priv->irq_lock); } @@ -355,9 +393,7 @@ void gen6_enable_rps_interrupts(struct drm_i915_private *dev_priv) WARN_ON(dev_priv->rps.pm_iir); WARN_ON(I915_READ(gen6_pm_iir(dev_priv)) & dev_priv->pm_rps_events); dev_priv->rps.interrupts_enabled = true; -I915_WRITE(gen6_pm_ier(dev_priv), I915_READ(gen6_pm_ier(dev_priv)) | -dev_priv->pm_rps_events); -gen6_enable_pm_irq(dev_priv, dev_priv->pm_rps_events); +gen6_enable_pm_interrupts(dev_priv, dev_priv->pm_rps_events); spin_unlock_irq(_priv->irq_lock); } @@ -379,9 +415,7 @@ void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv) I915_WRITE(GEN6_PMINTRMSK, gen6_sanitize_rps_pm_mask(dev_priv, ~0)); -__gen6_disable_pm_irq(dev_priv, dev_priv->pm_rps_events); -I915_WRITE(gen6_pm_ier(dev_priv), I915_READ(gen6_pm_ier(dev_priv)) & -~dev_priv->pm_rps_events); +gen6_disable_pm_interrupts(dev_priv,
Re: [Intel-gfx] [PATCH 01/11] drm/i915: Decouple GuC log setup from verbosity parameter
On 6/27/2016 9:26 PM, Tvrtko Ursulin wrote: On 27/06/16 16:32, Goel, Akash wrote: On 6/27/2016 8:30 PM, Tvrtko Ursulin wrote: On 27/06/16 13:16, akash.g...@intel.com wrote: From: Sagar Arun Kamble <sagar.a.kam...@intel.com> GuC Log buffer allocation was tied up with verbosity level kernel parameter i915.guc_log_level. User could be given a provision to enable logging at runtime and not necessarily during load time only. This patch will perform allocation of shared log buffer always but will initially enable logging on GuC side through init params based on i915.guc_log_level. Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com> Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_guc_submission.c | 3 --- drivers/gpu/drm/i915/intel_guc_loader.c| 8 +--- 2 files changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 355b647..28a810f 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -826,9 +826,6 @@ static void guc_create_log(struct intel_guc *guc) unsigned long offset; uint32_t size, flags; -if (i915.guc_log_level < GUC_LOG_VERBOSITY_MIN) -return; - if (i915.guc_log_level > GUC_LOG_VERBOSITY_MAX) i915.guc_log_level = GUC_LOG_VERBOSITY_MAX; diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c index 8fe96a2..db3c897 100644 --- a/drivers/gpu/drm/i915/intel_guc_loader.c +++ b/drivers/gpu/drm/i915/intel_guc_loader.c @@ -173,11 +173,13 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv) params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER | GUC_CTL_VCS2_ENABLED; -if (i915.guc_log_level >= 0) { -params[GUC_CTL_LOG_PARAMS] = guc->log_flags; +params[GUC_CTL_LOG_PARAMS] = guc->log_flags; + +if (i915.guc_log_level >= 0) params[GUC_CTL_DEBUG] = i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT; -} +else +params[GUC_CTL_DEBUG] = GUC_LOG_DISABLED; 
if (guc->ads_obj) { u32 ads = (u32)i915_gem_obj_ggtt_offset(guc->ads_obj) I did not manage to understand what is the benefit of always allocating the log buffer? If the user never enables logging it just wasted 11 pages of memory, correct? Yes if User never enables the logging at runtime, 11 RAM pages will be wasted. Currently the pages are permanently pinned in GGTT also. The GGTT address of log buffer is passed in the GuC firmware init params, at firmware loading time. Probably this can be circumvented, if pages can be pinned right before enabling logging (but using the same GGTT address). Looking at the later patches in the series, could you instead create the log buffer when logging is enabled via debugfs or implicitly via the relayfs access? Or is the problem then that you would then have to reset the GuC to activate it? Yes GuC would have to be reset & firmware needs to be reloaded to pass the log buffer address. Right, as minimum I think commit message needs to explain that. The current explanation does not hold anyway since it is not possible to enable it via modifying the module parameter. Right, there should have been an explanation citing the constraint in late allocation of log buffer when logging is enabled. Sorry for missing. Btw have you considered keeping the module param as a global GuC logging enable and adding new code on top? So keep the current code to only allocate the buffer when module param is set, and then if it isn't fail the later userspace triggered attempts to start the logging (in debugfs or relayfs)? Yes that was considered, keeping module param as the master control and allowing disable/enable of logging at runtime (through debugfs) only when module param is set at boot time. IIRC there was a request from Validation to keep logging control independent of boot time value of module param. 
So even if the system booted with guc_log_level set to -1, logging can still be enabled later at runtime through a debugfs interface, 'i915_guc_log_control'. Best regards Akash Regards, Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
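To illustrate the decoupling discussed above, here is a minimal user-space sketch of how the debug init parameter could be derived once the log buffer is always allocated. It only mirrors the logic visible in the quoted diff (clamp to the maximum verbosity, encode the level when it is non-negative, otherwise mark logging as disabled). The helper name guc_debug_param() is hypothetical, and the numeric values of the GUC_LOG_* constants are placeholders chosen for the example, not the driver's real definitions.

```c
#include <stdint.h>

/* Placeholder values for illustration only; the real definitions live in
 * the i915 GuC firmware interface headers and may differ. */
#define GUC_LOG_VERBOSITY_MIN   0
#define GUC_LOG_VERBOSITY_MAX   3
#define GUC_LOG_VERBOSITY_SHIFT 4
#define GUC_LOG_DISABLED        (1u << 0)

/* Hypothetical helper: computes the value that would be stored in
 * params[GUC_CTL_DEBUG] for a given i915.guc_log_level. */
static uint32_t guc_debug_param(int log_level)
{
	/* Clamp over-large levels, as guc_create_log() does in the patch. */
	if (log_level > GUC_LOG_VERBOSITY_MAX)
		log_level = GUC_LOG_VERBOSITY_MAX;

	/* Non-negative level: logging is enabled at that verbosity. */
	if (log_level >= GUC_LOG_VERBOSITY_MIN)
		return (uint32_t)log_level << GUC_LOG_VERBOSITY_SHIFT;

	/* Negative level: the buffer still exists, but firmware logging stays off. */
	return GUC_LOG_DISABLED;
}
```

With this split, a boot-time level of -1 only affects the debug parameter; the log buffer address is still handed to the firmware at load time, which is what would allow enabling logging later without a GuC reset.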
Re: [Intel-gfx] [PATCH 01/11] drm/i915: Decouple GuC log setup from verbosity parameter
On 6/27/2016 8:30 PM, Tvrtko Ursulin wrote: On 27/06/16 13:16, akash.g...@intel.com wrote: From: Sagar Arun KambleGuC Log buffer allocation was tied up with verbosity level kernel parameter i915.guc_log_level. User could be given a provision to enable logging at runtime and not necessarily during load time only. This patch will perform allocation of shared log buffer always but will initially enable logging on GuC side through init params based on i915.guc_log_level. Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 3 --- drivers/gpu/drm/i915/intel_guc_loader.c| 8 +--- 2 files changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 355b647..28a810f 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -826,9 +826,6 @@ static void guc_create_log(struct intel_guc *guc) unsigned long offset; uint32_t size, flags; -if (i915.guc_log_level < GUC_LOG_VERBOSITY_MIN) -return; - if (i915.guc_log_level > GUC_LOG_VERBOSITY_MAX) i915.guc_log_level = GUC_LOG_VERBOSITY_MAX; diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c index 8fe96a2..db3c897 100644 --- a/drivers/gpu/drm/i915/intel_guc_loader.c +++ b/drivers/gpu/drm/i915/intel_guc_loader.c @@ -173,11 +173,13 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv) params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER | GUC_CTL_VCS2_ENABLED; -if (i915.guc_log_level >= 0) { -params[GUC_CTL_LOG_PARAMS] = guc->log_flags; +params[GUC_CTL_LOG_PARAMS] = guc->log_flags; + +if (i915.guc_log_level >= 0) params[GUC_CTL_DEBUG] = i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT; -} +else +params[GUC_CTL_DEBUG] = GUC_LOG_DISABLED; if (guc->ads_obj) { u32 ads = (u32)i915_gem_obj_ggtt_offset(guc->ads_obj) I did not manage to understand what is the benefit of always allocating the log 
buffer? If the user never enables logging it just wasted 11 pages of memory, correct? Yes if User never enables the logging at runtime, 11 RAM pages will be wasted. Currently the pages are permanently pinned in GGTT also. The GGTT address of log buffer is passed in the GuC firmware init params, at firmware loading time. Probably this can be circumvented, if pages can be pinned right before enabling logging (but using the same GGTT address). Looking at the later patches in the series, could you instead create the log buffer when logging is enabled via debugfs or implicitly via the relayfs access? Or is the problem then that you would then have to reset the GuC to activate it? Yes GuC would have to be reset & firmware needs to be reloaded to pass the log buffer address. Best regards Akash Regards, Tvrtko
Re: [Intel-gfx] [PATCH 09/11] drm/i915: New module param to control the size of buffer used for storing GuC firmware logs
On 6/27/2016 7:01 PM, Jani Nikula wrote: On Mon, 27 Jun 2016, akash.g...@intel.com wrote: From: Akash GoelOn recieving the log buffer flush interrupt from GuC firmware, Driver stores the snapshot of the log buffer in a local buffer, from which Userspace can pull the logs. By default Driver store, up to, 4 snapshots of the log buffer in a local buffer (managed by relay). Added a new module (read only) param, 'guc_log_size', through which User can specify the number of snapshots of log buffer to be stored in local buffer. This can be used to ensure capturing of all boot time logs even with high verbosity level. Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 3 +-- drivers/gpu/drm/i915/i915_params.c | 5 + drivers/gpu/drm/i915/i915_params.h | 1 + 3 files changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index fd26a9e..8c0fd83 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -999,8 +999,7 @@ static void guc_create_log_relay_file(struct intel_guc *guc) /* Keep the size of sub buffers same as shared log buffer */ subbuf_size = guc->log_obj->base.size; - /* TODO: Decide based on the User's input */ - n_subbufs = 4; + n_subbufs = i915.guc_log_size; guc_log_relay_chan = relay_open("guc_log", log_dir, subbuf_size, n_subbufs, _callbacks, dev); diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c index 8b13bfa..14ce0c4 100644 --- a/drivers/gpu/drm/i915/i915_params.c +++ b/drivers/gpu/drm/i915/i915_params.c @@ -57,6 +57,7 @@ struct i915_params i915 __read_mostly = { .enable_guc_loading = -1, .enable_guc_submission = -1, .guc_log_level = -1, + .guc_log_size = 4, .enable_dp_mst = true, .inject_load_failure = 0, .enable_dpcd_backlight = false, @@ -214,6 +215,10 @@ module_param_named(guc_log_level, i915.guc_log_level, int, 0400); MODULE_PARM_DESC(guc_log_level, "GuC firmware 
logging level (-1:disabled (default), 0-3:enabled)"); +module_param_named(guc_log_size, i915.guc_log_size, int, 0400); +MODULE_PARM_DESC(guc_log_size, + "Number of sub buffers to store GuC firmware logs (default: 4)"); + I guess my battle against adding all sorts of module parameters all the time is a futile and lost one. :( Please at least make it clear what the unit of the size is. It's not obvious to me, and I shouldn't have to look at the source for that. Sorry for not choosing a suitable name in first place. I agree the name should be indicative of the unit. As you would have seen, the parameter provides number of snapshots of the Log buffer which can be stored on Driver side. The size of one snapshot or Log buffer is not so important here and can change in future. Please suggest an appropriate name ('guc_log_buffer_nr' ?) Best regards Akash BR, Jani. module_param_named_unsafe(enable_dp_mst, i915.enable_dp_mst, bool, 0600); MODULE_PARM_DESC(enable_dp_mst, "Enable multi-stream transport (MST) for new DisplayPort sinks. (default: true)"); diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h index 0ad020b..89fa832 100644 --- a/drivers/gpu/drm/i915/i915_params.h +++ b/drivers/gpu/drm/i915/i915_params.h @@ -48,6 +48,7 @@ struct i915_params { int enable_guc_loading; int enable_guc_submission; int guc_log_level; + int guc_log_size; int use_mmio_flip; int mmio_debug; int edp_vswing; ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
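To make the unit question concrete: the parameter counts snapshots (relay sub-buffers), not bytes, so the memory actually reserved depends on the size of one snapshot. A rough sketch of the arithmetic follows. The helper name relay_capacity_bytes() is hypothetical; the 11-page snapshot size is taken from the related 01/11 thread, and the 4 KiB page size is an assumption for the example.

```c
#include <stddef.h>

/* Assumed page size, for illustration only. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical helper: total bytes held by a relay channel that keeps
 * n_subbufs snapshots, each as large as the shared log buffer. */
static size_t relay_capacity_bytes(size_t subbuf_size, size_t n_subbufs)
{
	return subbuf_size * n_subbufs;
}
```

Under those assumptions the default of 4 snapshots corresponds to 4 * 11 * 4096 = 180224 bytes, which is why a name that says "number of buffers" rather than "size" describes the parameter better.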
Re: [Intel-gfx] [RFC 00/12] Support for sustained capturing of GuC firmware logs
On 6/3/2016 12:45 PM, Daniel Vetter wrote: On Thu, Jun 02, 2016 at 12:21:49PM +0200, Johannes Berg wrote: On Thu, 2016-06-02 at 10:16 +, Daniel Vetter wrote: I still kinda like relayfs, except that it's not available in non- debug builds. But so are plenty of other really interesting files we have hidden in there. sysfs isn't the solution, I already have a black eye from the sysfs maintainer for our error state. Heh. I tend to agree though. No idea really where to put stuff. One option might be to have an official debug directory (like we have power already) in sysfs as the canonical place where drivers can dump stuff. We're not the only ones with too much data to get to userspace for debugging driver/hw issues, e.g. wireless firmware has pretty similar solutions. We have two things in wireless: 1) the devcoredump stuff, but that's a one-time event when something bad happens and dumps a big blob into userspace, doesn't seem relevant here 2) continuous logging, which uses a debugfs file (though it could be relayfs as well, doesn't really make a difference) relayfs apparently moved in with debugfs. And a requirement (or at least strong wishlist item) is that we can get at the data on production systems (which really shouldn't mount debugfs). Seems like there's no place to dump debug information outside of debugfs :( There could be something said for using tracing, but that's only independent of debugfs since the tracefs introduction in kernel 4.1. We tried looking into tracing stuff for our performance counters, and at least there the mismatch for dumping large-scale stuff was too much. But tracefs looks like just the tracing debugfs directory cut out into a separate filesystem, exactly to avoid that dreaded debugfs-is-insecure issues. I'd say we should smash it into debugfs, and if these troubles persist then maybe we need to clean up the mess in there a bit and expose it as drm_debugfs or whatever. Probably a topic for kernel summit even. 
At least I feel like there's not enough consensus to add ABI at this point. Hi Daniel, Thanks much for your inputs. So, on an interim basis, can we have only a relay-backed debugfs interface, i.e. /sys/kernel/debug/dri/guc_log? Once support for drm_debugfs is added, it's just a matter of changing the file location, i.e. moving it inside drm_debugfs. Best regards Akash -Daniel
Re: [Intel-gfx] [RFC 03/12] drm/i915: Support for GuC interrupts
On 5/28/2016 8:05 PM, Chris Wilson wrote: On Sat, May 28, 2016 at 07:15:52PM +0530, Goel, Akash wrote: On 5/28/2016 5:43 PM, Chris Wilson wrote: On Sat, May 28, 2016 at 02:52:16PM +0530, Goel, Akash wrote: On 5/28/2016 1:26 AM, Chris Wilson wrote: On Sat, May 28, 2016 at 01:12:54AM +0530, akash.g...@intel.com wrote: +void gen8_reset_guc_interrupts(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + i915_reg_t reg = gen6_pm_iir(dev_priv); >From the looks of this we have multiple shadows for the same register. That's very bad. Now the platforms might be mutually exclusive, but it is still a mistake that will catch us out. Will check how it is in newer platforms. + spin_lock_irq(_priv->irq_lock); + I915_WRITE(reg, dev_priv->guc_events); + I915_WRITE(reg, dev_priv->guc_events); What? Not even the tiniest of comments to explain? Sorry actually just copied these steps as is from the gen6_reset_rps_interrupts(), considering that the same set of registers (IIR, IER, IMR) are involved here. So the double clearing of IIR followed by posting read could be needed here also. Move it all to i915_irq.c and export routines to manipulate pm_iir such that multiple users do not conflict. Sorry but all interrupt related stuff for rps & GuC is already inside i915_irq.c file. Didn't notice, because this code didn't match my expectations for an interface exported from i915_irq.c Also the IER, IMR, IIR registers are being updated in a non conflicting manner, no overlap between the PM bits & GuC events bits. They share a register, that mandates arbitration. I think the arbitration (& serialization) is already being provided by irq_lock. You mean to say need to have single set of routines only for interrupt reset/enable/disable operations for rps & GuC. Yes. Fine will make them to use a single set of low level routines. + POSTING_READ(reg); Again. Not even the tiniest of comments to explain? 
+ spin_unlock_irq(_priv->irq_lock); +} + +void gen8_enable_guc_interrupts(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + + spin_lock_irq(_priv->irq_lock); + if (!dev_priv->guc.interrupts_enabled) { + WARN_ON(I915_READ(gen6_pm_iir(dev_priv)) & + dev_priv->guc_events); + dev_priv->guc.interrupts_enabled = true; + I915_WRITE(gen6_pm_ier(dev_priv), + I915_READ(gen6_pm_ier(dev_priv)) | dev_priv->guc_events); ier should be known, rmw on the reg should not be required. Sorry same as above, copy paste from gen6_enable_rps_interrupts(). Without rmw, would this be fine ? if (dev_priv->rps.interrupts_enabled) I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_rps_events | dev_priv->guc_events); else I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->guc_events); Still has the presumption of owning a register that is ostensibly used by others. Since pm_ier is a shared register and being used by others also, rmw seem to be more suited here. Otherwise need to be aware of who all is sharing it so as to update it without disturbing the bits owned by others. Exactly, see above. The best interfaces from i915_irq.c do not use rmw on the register values. Fine will try to do away with use rmw operation for pm_ier by maintaining a bit mask of enabled interrupts (just like pm_irq_mask). +static void gen8_guc2host_events_work(struct work_struct *work) +{ + struct drm_i915_private *dev_priv = + container_of(work, struct drm_i915_private, guc.events_work); + + spin_lock_irq(_priv->irq_lock); + /* Speed up work cancelation during disabling guc interrupts. */ + if (!dev_priv->guc.interrupts_enabled) { + spin_unlock_irq(_priv->irq_lock); + return; + } + + DISABLE_RPM_WAKEREF_ASSERTS(dev_priv); This just shouts that the code is broken. You mean to say that ideally the wakeref_count (& power.usage_count) should already be non zero here. Yes. If it is not under your control, then you have a bug in your code. 
The existing uses of DISABLE_RPM_WAKEREF_ASSERTS tell us where we know we have a bug (and hacks in place whilst we wait for patch review). This work item can also execute in a window where wakeref_count (& power.usage_count) has dropped to zero but runtime suspend has not yet kicked in (due to the auto-suspend delay), so the "RPM wakelock ref not held during HW access" warning would be raised. i.e. your code is buggy, as DISABLE_RPM_WAKEREF_ASSERTS implied. But isn't this applicable to the rps work item as well? If a way is found to circumvent this, the same can be applied to the GuC work item. DISABLE_RPM_WAKEREF_ASSERTS is a stopgap solution. void
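The point about avoiding read-modify-write on the shared PM IER register can be sketched in user space as follows. The register is modeled by a plain variable and every name here (fake_pm_ier_reg, pm_enable_interrupts(), and so on) is hypothetical; this is only an illustration of keeping a software mask of enabled bits, under the assumption that all callers are serialized by one lock (irq_lock in the thread), and is not the driver's code.

```c
#include <stdint.h>

/* Stand-ins for the hardware register and driver state (hypothetical). */
static uint32_t fake_pm_ier_reg;  /* models the shared PM IER register */
static uint32_t pm_ier_mask;      /* software copy of the enabled bits */

static void reg_write(uint32_t val) { fake_pm_ier_reg = val; }
static uint32_t reg_read(void) { return fake_pm_ier_reg; }

/* Enable the given bits. The caller is assumed to hold the lock that
 * serializes every user of the register. No register read is needed
 * because the cached mask already records what other users enabled. */
static void pm_enable_interrupts(uint32_t enable_mask)
{
	pm_ier_mask |= enable_mask;
	reg_write(pm_ier_mask);
}

/* Disable the given bits without disturbing bits owned by other users. */
static void pm_disable_interrupts(uint32_t disable_mask)
{
	pm_ier_mask &= ~disable_mask;
	reg_write(pm_ier_mask);
}
```

Both the RPS and the GuC paths would then go through the same pair of routines, so neither has to know which bits the other owns, and the register itself is only ever written, never read back to compute the new value.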
Re: [Intel-gfx] [RFC 03/12] drm/i915: Support for GuC interrupts
On 5/28/2016 5:43 PM, Chris Wilson wrote: On Sat, May 28, 2016 at 02:52:16PM +0530, Goel, Akash wrote: On 5/28/2016 1:26 AM, Chris Wilson wrote: On Sat, May 28, 2016 at 01:12:54AM +0530, akash.g...@intel.com wrote: From: Sagar Arun Kamble <sagar.a.kam...@intel.com> There are certain types of interrupts which Host can recieve from GuC. GuC ukernel sends an interrupt to Host for certain events, like for example retrieve/consume the logs generated by ukernel. This patch adds support to receive interrupts from GuC but currently enables & partially handles only the interrupt sent by GuC ukernel. Future patches will add support for handling other interrupt types. Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com> Signed-off-by: Akash Goel <akash.g...@intel.com> --- drivers/gpu/drm/i915/i915_drv.h| 1 + drivers/gpu/drm/i915/i915_guc_submission.c | 2 + drivers/gpu/drm/i915/i915_irq.c| 100 - drivers/gpu/drm/i915/i915_reg.h| 11 drivers/gpu/drm/i915/intel_drv.h | 3 + drivers/gpu/drm/i915/intel_guc.h | 5 ++ drivers/gpu/drm/i915/intel_guc_loader.c| 1 + 7 files changed, 120 insertions(+), 3 deletions(-) static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir); +static void gen8_guc_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir); /* For display hotplug interrupt */ static inline void @@ -400,6 +401,55 @@ void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv) synchronize_irq(dev_priv->dev->irq); } +void gen8_reset_guc_interrupts(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + i915_reg_t reg = gen6_pm_iir(dev_priv); >From the looks of this we have multiple shadows for the same register. That's very bad. Now the platforms might be mutually exclusive, but it is still a mistake that will catch us out. Will check how it is in newer platforms. + spin_lock_irq(_priv->irq_lock); + I915_WRITE(reg, dev_priv->guc_events); + I915_WRITE(reg, dev_priv->guc_events); What? 
Not even the tiniest of comments to explain? Sorry actually just copied these steps as is from the gen6_reset_rps_interrupts(), considering that the same set of registers (IIR, IER, IMR) are involved here. So the double clearing of IIR followed by posting read could be needed here also. Move it all to i915_irq.c and export routines to manipulate pm_iir such that multiple users do not conflict. Sorry but all interrupt related stuff for rps & GuC is already inside i915_irq.c file. Also the IER, IMR, IIR registers are being updated in a non conflicting manner, no overlap between the PM bits & GuC events bits. You mean to say need to have single set of routines only for interrupt reset/enable/disable operations for rps & GuC. + POSTING_READ(reg); Again. Not even the tiniest of comments to explain? + spin_unlock_irq(_priv->irq_lock); +} + +void gen8_enable_guc_interrupts(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + + spin_lock_irq(_priv->irq_lock); + if (!dev_priv->guc.interrupts_enabled) { + WARN_ON(I915_READ(gen6_pm_iir(dev_priv)) & + dev_priv->guc_events); + dev_priv->guc.interrupts_enabled = true; + I915_WRITE(gen6_pm_ier(dev_priv), + I915_READ(gen6_pm_ier(dev_priv)) | dev_priv->guc_events); ier should be known, rmw on the reg should not be required. Sorry same as above, copy paste from gen6_enable_rps_interrupts(). Without rmw, would this be fine ? if (dev_priv->rps.interrupts_enabled) I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_rps_events | dev_priv->guc_events); else I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->guc_events); Still has the presumption of owning a register that is ostensibly used by others. Since pm_ier is a shared register and being used by others also, rmw seem to be more suited here. Otherwise need to be aware of who all is sharing it so as to update it without disturbing the bits owned by others. 
+static void gen8_guc2host_events_work(struct work_struct *work) +{ + struct drm_i915_private *dev_priv = + container_of(work, struct drm_i915_private, guc.events_work); + + spin_lock_irq(_priv->irq_lock); + /* Speed up work cancelation during disabling guc interrupts. */ + if (!dev_priv->guc.interrupts_enabled) { + spin_unlock_irq(_priv->irq_lock); + return; + } + + DISABLE_RPM_WAKEREF_ASSERTS(dev_priv); This just shouts that the code is broken. You mean to say that ideally the wakeref_count (& power.usage_count) should alr