Re: [Intel-gfx] [PATCH 1/2] shmem: Support for registration of driver/file owner specific ops

2016-11-10 Thread Goel, Akash



On 11/10/2016 11:06 AM, Hugh Dickins wrote:

On Fri, 4 Nov 2016, akash.g...@intel.com wrote:

From: Chris Wilson 

This provides support for the drivers or shmem file owners to register
a set of callbacks, which can be invoked from the address space
operations methods implemented by shmem.  This allows the file owners to
hook into the shmem address space operations to do some extra/custom
operations in addition to the default ones.

The private_data field of address_space struct is used to store the
pointer to driver specific ops.  Currently only one ops field is defined,
which is migratepage, but can be extended on an as-needed basis.

The need for driver specific operations arises since some of the
operations (like migratepage) may not be handled completely within shmem,
so as to be effective, and would need some driver specific handling also.
Specifically, i915.ko would like to participate in migratepage().
i915.ko uses shmemfs to provide swappable backing storage for its user
objects, but when those objects are in use by the GPU it must pin the
entire object until the GPU is idle.  As a result, large chunks of memory
can be arbitrarily withdrawn from page migration, resulting in premature
out-of-memory due to fragmentation.  However, if i915.ko can receive the
migratepage() request, it can then flush the object from the GPU, remove
its pin and thus enable the migration.

Since gfx allocations are one of the major consumers of system memory, it's
imperative to have such a mechanism to effectively deal with
fragmentation.  And therefore the need for such a provision for initiating
driver specific actions during address space operations.
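
The registration mechanism described in the commit message above can be modelled in userspace as follows. This is only a sketch under stated assumptions: the struct and function names ending in _sketch, the stand-in types, default_migratepage() and owner_migratepage() are hypothetical, and the dispatch order (owner hook first, default otherwise) is an assumption, since the quoted shmem_migratepage() hunk is cut off in this archive.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct page { int pinned; };
struct address_space { void *private_data; };

/* Same idea as shmem_dev_info in the patch: a table of owner callbacks. */
struct dev_info_sketch {
	int (*migratepage)(struct address_space *mapping,
			   struct page *newpage, struct page *page);
};

/* Hypothetical default path, standing in for the generic migration helper. */
static int default_migratepage(struct address_space *mapping,
			       struct page *newpage, struct page *page)
{
	(void)mapping;
	newpage->pinned = page->pinned;
	return 0;
}

/* Registration: the ops pointer is stored in mapping->private_data. */
static void set_dev_info_sketch(struct address_space *mapping,
				struct dev_info_sketch *dev_info)
{
	mapping->private_data = dev_info;
}

/* Dispatch (assumed): call the owner hook if registered, else the default. */
static int migratepage_sketch(struct address_space *mapping,
			      struct page *newpage, struct page *page)
{
	struct dev_info_sketch *dev_info = mapping->private_data;

	if (dev_info && dev_info->migratepage)
		return dev_info->migratepage(mapping, newpage, page);

	return default_migratepage(mapping, newpage, page);
}

/* Example owner hook (simplified): refuse migration while the page is pinned. */
static int owner_migratepage(struct address_space *mapping,
			     struct page *newpage, struct page *page)
{
	if (page->pinned)
		return -EBUSY;

	return default_migratepage(mapping, newpage, page);
}
```

A mapping with no registered ops keeps the default behaviour, so owners that never call the registration helper are unaffected. The real i915 hook would try to drop its pin rather than simply refuse; the refusal here only keeps the model small.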


Thank you for persisting with this, and sorry for all my delay.



v2:
- Drop dev_ prefix from the members of shmem_dev_info structure. (Joonas)
- Change the return type of shmem_set_device_op() to void and remove the
  check for pre-existing data. (Joonas)
- Rename shmem_set_device_op() to shmem_set_dev_info() to be consistent
  with shmem_dev_info structure. (Joonas)

Cc: Hugh Dickins 
Cc: linux...@kvack.org
Cc: linux-ker...@vger.linux.org
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
Reviewed-by: Chris Wilson 


That doesn't seem quite right: the From line above implies that Chris
wrote it, and should be first Signer; but perhaps the From line is wrong.


Chris only wrote this patch initially, will do the required correction.


---
 include/linux/shmem_fs.h | 13 +
 mm/shmem.c   | 17 -
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index ff078e7..454c3ba 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -39,11 +39,24 @@ struct shmem_sb_info {
unsigned long shrinklist_len; /* Length of shrinklist */
 };

+struct shmem_dev_info {
+   void *private_data;
+   int (*migratepage)(struct address_space *mapping,
+  struct page *newpage, struct page *page,
+  enum migrate_mode mode, void *dev_priv_data);


Aren't the private_data field and dev_priv_data arg a little bit
confusing and redundant?  Can't the migratepage() deduce dev_priv
for itself from mapping->private_data (perhaps wrapped by a
shmem_get_dev_info()), by using container_of()?


Yes, it looks like migratepage() can deduce dev_priv from mapping->private_data.
Can we keep private_data as a placeholder? Will
s/dev_priv_data/private_data/.


As per your suggestion, in the other patch, object pointer can be 
derived from mapping->private_data (container_of) and dev_priv in turn 
can be derived from object pointer.
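
The container_of() derivation discussed here might look roughly like the userspace sketch below. The macro is a simplified equivalent of the kernel's, and struct gem_object_sketch, struct dev_info_sketch and mapping_to_obj() are hypothetical names used only for illustration; the actual i915 object layout is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace equivalent of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct address_space { void *private_data; };

/* Stand-in for the ops structure registered with shmem. */
struct dev_info_sketch {
	int (*migratepage)(struct address_space *mapping);
};

/* Hypothetical driver object that embeds the ops structure. */
struct gem_object_sketch {
	int id;
	void *dev_priv;               /* back-pointer to the device */
	struct dev_info_sketch info;  /* embedded, so container_of() works */
};

/* Recover the enclosing object from what is stored in private_data. */
static struct gem_object_sketch *mapping_to_obj(struct address_space *mapping)
{
	struct dev_info_sketch *info = mapping->private_data;

	return container_of(info, struct gem_object_sketch, info);
}
```

Because the info structure is embedded in the object, no separate private pointer needs to be passed to the callback: the object, and from it dev_priv, follow from the mapping alone.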



+};
+
 static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
 {
return container_of(inode, struct shmem_inode_info, vfs_inode);
 }

+static inline void shmem_set_dev_info(struct address_space *mapping,
+ struct shmem_dev_info *info)
+{
+   mapping->private_data = info;


Nit: if this stays as is, I'd prefer dev_info there and above,
since shmem.c uses info all over for its shmem_inode_info pointer.
But in second patch I suggest obj_info may be better than dev_info.


Fine will s/info/dev_info.

Best regards
Akash


+}
+
 /*
  * Functions in mm/shmem.c called directly from elsewhere:
  */
diff --git a/mm/shmem.c b/mm/shmem.c
index ad7813d..fce8de3 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1290,6 +1290,21 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
return 0;
 }

+#ifdef CONFIG_MIGRATION
+static int shmem_migratepage(struct address_space *mapping,
+struct page *newpage, struct page *page,
+enum migrate_mode mode)
+{
+   struct shmem_dev_info *dev_info = mapping->private_data;
+
+   if (dev_info && 

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-09 Thread Goel, Akash



On 11/10/2016 12:09 PM, Hugh Dickins wrote:

On Fri, 4 Nov 2016, akash.g...@intel.com wrote:

From: Chris Wilson 

On a long run of more than 2-3 days, physical memory tends to get
fragmented severely, which considerably slows down the system. In such a
scenario, the shrinker is also unable to help as lack of memory is not
the actual problem, since it has been observed that there are enough free
pages of 0 order. This also manifests itself when an individual zone in
the mm runs out of pages and if we cannot migrate pages between zones,
the kernel hits an out-of-memory even though there are free pages (and
often all of swap) available.

To address the issue of external fragmentation, kernel does a compaction
(which involves migration of pages) but its efficacy depends upon how
many pages are marked as MOVABLE, as only those pages can be migrated.

Currently the backing pages for GPU buffers are allocated from shmemfs
with GFP_RECLAIMABLE flag, in units of 4KB pages.  In the case of limited
swap space, it may not be possible always to reclaim or swap-out pages of
all the inactive objects, to make way for free space allowing formation
of higher order groups of physically-contiguous pages on compaction.

Just marking the GPU pages as MOVABLE will not suffice, as i915.ko has to
pin the pages if they are in use by GPU, which will prevent their
migration. So the migratepage callback in shmem is also hooked up to get
a notification when kernel initiates the page migration. On the
notification, i915.ko appropriately unpins the pages.  With this we can
effectively mark the GPU pages as MOVABLE and hence mitigate the
fragmentation problem.

v2:
 - Rename the migration routine to gem_shrink_migratepage, move it to the
   shrinker file, and use the existing constructs (Chris)
 - To cleanup, add a new helper function to encapsulate all page migration
   skip conditions (Chris)
 - Add a new local helper function in shrinker file, for dropping the
   backing pages, and call the same from gem_shrink() also (Chris)

v3:
 - Fix/invert the check on the return value of unsafe_drop_pages (Chris)

v4:
 - Minor tidy

v5:
 - Fix unsafe usage of unsafe_drop_pages()
 - Rebase onto vmap-notifier

v6:
- Remove i915_gem_object_get/put across unsafe_drop_pages() as with
  struct_mutex protection object can't disappear. (Chris)

Testcase: igt/gem_shrink
Bugzilla: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=90254
Cc: Hugh Dickins 
Cc: linux...@kvack.org
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Reviewed-by: Chris Wilson 


I'm confused!  But perhaps it's gone around and around between you all,
I'm not sure what the rules are then.  I think this sequence implies
that Sourab wrote it originally, then Akash and Chris passed it on
with refinements - but then Chris wouldn't add Reviewed-by.


Thank you very much for the review and sorry for all the needless confusion.

Chris actually conceived the patches and prepared an initial version of 
them (hence he is the Author).
Sourab and I did the further refinements and fixed issues (all that 
page_private stuff).

Chris then reviewed the final patch and also recently did a rebase for it.


---
 drivers/gpu/drm/i915/i915_drv.h  |   2 +
 drivers/gpu/drm/i915/i915_gem.c  |   9 ++-
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 132 +++
 3 files changed, 142 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4735b417..7f2717b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1357,6 +1357,8 @@ struct intel_l3_parity {
 };

 struct i915_gem_mm {
+   struct shmem_dev_info shmem_info;
+
/** Memory allocator for GTT stolen memory */
struct drm_mm stolen;
/** Protects the usage of the GTT stolen memory allocator. This is
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1f995ce..f0d4ce7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2164,6 +2164,7 @@ void __i915_gem_object_invalidate(struct drm_i915_gem_object *obj)
if (obj->mm.madv == I915_MADV_WILLNEED)
mark_page_accessed(page);

+   set_page_private(page, 0);
put_page(page);
}
obj->mm.dirty = false;
@@ -2310,6 +2311,7 @@ static unsigned int swiotlb_max_size(void)
sg->length += PAGE_SIZE;
}
last_pfn = page_to_pfn(page);
+   set_page_private(page, (unsigned long)obj);

/* Check that the i965g/gm workaround works. */
WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL));
@@ 

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-04 Thread Goel, Akash



On 11/4/2016 7:07 PM, Chris Wilson wrote:

Best if we send these as a new series to unconfuse CI.


Okay will send as a new series.


On Fri, Nov 04, 2016 at 06:18:26PM +0530, akash.g...@intel.com wrote:

+static int do_migrate_page(struct drm_i915_gem_object *obj)
+{
+   struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
+   int ret = 0;
+
+   if (!can_migrate_page(obj))
+   return -EBUSY;
+
+   /* HW access would be required for a GGTT bound object, for which
+* device has to be kept awake. But a deadlock scenario can arise if
+* the attempt is made to resume the device, when either a suspend
+* or a resume operation is already happening concurrently from some
+* other path and that only also triggers compaction. So only unbind
+* if the device is currently awake.
+*/
+   if (!intel_runtime_pm_get_if_in_use(dev_priv))
+   return -EBUSY;
+
+   i915_gem_object_get(obj);
+   if (!unsafe_drop_pages(obj))
+   ret = -EBUSY;
+   i915_gem_object_put(obj);


Since the object release changes, we can now do this without the
i915_gem_object_get / i915_gem_object_put (as we are guarded by the BKL
struct_mutex).

Fine, will remove object_get/put, as with struct_mutex protection the object 
can't disappear across unsafe_drop_pages().


Best regards
Akash



-Chris


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 2/2] drm/i915: Make GPU pages movable

2016-10-18 Thread Goel, Akash



On 10/18/2016 5:35 PM, Joonas Lahtinen wrote:

On ma, 2016-04-04 at 14:18 +0100, Chris Wilson wrote:

From: Akash Goel 

On a long run of more than 2-3 days, physical memory tends to get
fragmented severely, which considerably slows down the system. In such a
scenario, the shrinker is also unable to help as lack of memory is not
the actual problem, since it has been observed that there are enough free
pages of 0 order. This also manifests itself when an individual zone in
the mm runs out of pages and if we cannot migrate pages between zones,
the kernel hits an out-of-memory even though there are free pages (and
often all of swap) available.

To address the issue of external fragmentation, kernel does a compaction
(which involves migration of pages) but its efficacy depends upon how
many pages are marked as MOVABLE, as only those pages can be migrated.

Currently the backing pages for GFX buffers are allocated from shmemfs
with GFP_RECLAIMABLE flag, in units of 4KB pages.  In the case of limited
swap space, it may not be possible always to reclaim or swap-out pages of
all the inactive objects, to make way for free space allowing formation
of higher order groups of physically-contiguous pages on compaction.

Just marking the GPU pages as MOVABLE will not suffice, as i915.ko has to
pin the pages if they are in use by GPU, which will prevent their
migration. So the migratepage callback in shmem is also hooked up to get
a notification when kernel initiates the page migration. On the
notification, i915.ko appropriately unpins the pages.  With this we can
effectively mark the GPU pages as MOVABLE and hence mitigate the
fragmentation problem.

v2:
 - Rename the migration routine to gem_shrink_migratepage, move it to the
   shrinker file, and use the existing constructs (Chris)
 - To cleanup, add a new helper function to encapsulate all page migration
   skip conditions (Chris)
 - Add a new local helper function in shrinker file, for dropping the
   backing pages, and call the same from gem_shrink() also (Chris)

v3:
 - Fix/invert the check on the return value of unsafe_drop_pages (Chris)

v4:
 - Minor tidy

Testcase: igt/gem_shrink
Bugzilla: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=90254
Cc: Hugh Dickins 
Cc: linux...@kvack.org
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
Reviewed-by: Chris Wilson 


Could this patch be respun on top of current nightly?


Sure will rebase it on top of nightly.


After removing;


WARN(page_count(newpage) != 1, "Unexpected ref count for newpage\n")


and


if (ret)
DRM_DEBUG_DRIVER("page=%p migration returned %d\n", page, ret);


This is;

Reviewed-by: Joonas Lahtinen 

Thanks very much for the review.
But there is also a precursor patch, which has seen no traction so far:
[1/2] shmem: Support for registration of Driver/file owner specific ops
https://patchwork.freedesktop.org/patch/77935/

Best regards
Akash



Regards, Joonas




Re: [Intel-gfx] [PATCH] drm/i915/guc: WA to address the Ringbuffer coherency issue

2016-10-14 Thread Goel, Akash



On 10/14/2016 11:45 PM, Chris Wilson wrote:

On Fri, Oct 14, 2016 at 11:53:44PM +0530, akash.g...@intel.com wrote:

From: Akash Goel 

Driver accesses the ringbuffer pages, via GMADR BAR, if the pages are
pinned in mappable aperture portion of GGTT and for ringbuffer pages
allocated from Stolen memory, access can only be done through GMADR BAR.
In case of GuC based submission, updates done in ringbuffer via GMADR
may not get committed to memory by the time the Command streamer starts
reading them, resulting in fetching of stale data.


Please leave a blank line between paragraphs, or try to not leave so
much whitespace at the end of a sentence.


I am sorry. Will be mindful of this from now.


For Host based submission, such problem is not there as the write to Ring
Tail or ELSP register happens from the Host side prior to submission.
Access to any GFX register from CPU side goes to GTTMMADR BAR and Hw already
enforces the ordering between outstanding GMADR writes & new GTTMMADR access.
MMIO writes from GuC side do not go to GTTMMADR BAR as GuC communication to
registers within GT is contained within GT, so ordering is not enforced
resulting in a race, which can manifest in form of a hang.
To ensure the flush of in flight GMADR writes, a POSTING READ is done to
GuC register prior to doorbell ring.
There is already a similar WA in i915_gem_object_flush_gtt_write_domain(),
which takes care of GMADR writes from User space to GEM buffers, but not the
ringbuffer writes from KMD.
This WA is needed on all recent HW.

Cc: Chris Wilson 
Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index a1f76c8..43c8a72 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -601,6 +601,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
  */
 static void i915_guc_submit(struct drm_i915_gem_request *rq)
 {
+   struct drm_i915_private *dev_priv = rq->i915;
unsigned int engine_id = rq->engine->id;
struct intel_guc *guc = &rq->i915->guc;
struct i915_guc_client *client = guc->execbuf_client;
@@ -608,6 +609,11 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)

spin_lock(&client->wq_lock);
guc_wq_item_append(client, rq);
+
+   /* WA to flush out the pending GMADR writes to ring buffer. */
+   if (i915_vma_is_map_and_fenceable(rq->ring->vma))
+   POSTING_READ(GUC_STATUS);


Did you test POSTING_READ_FW() ?

Sorry, we haven't explicitly tried POSTING_READ_FW(), but it should 
work since, as per the __gen9_fw_ranges[] table, GuC registers 
(C000-Cxxx) do not lie in any Forcewake domain range.




Otherwise it makes an unfortunate amount of sense, and I feel justified
in what I had to do in flush_gtt_write_domain! :)

Yes your hunch, expectedly, was spot on :).

Best regards
Akash

-Chris




Re: [Intel-gfx] ✗ Fi.CI.BAT: warning for drm/i915: Allocate intel_engine_cs structure only for the enabled engines (rev4)

2016-10-14 Thread Goel, Akash



On 10/13/2016 10:50 PM, Patchwork wrote:

== Series Details ==

Series: drm/i915: Allocate intel_engine_cs structure only for the enabled 
engines (rev4)
URL   : https://patchwork.freedesktop.org/series/13435/
State : warning

== Summary ==

Series 13435v4 drm/i915: Allocate intel_engine_cs structure only for the 
enabled engines
https://patchwork.freedesktop.org/api/1.0/series/13435/revisions/4/mbox/

Test kms_pipe_crc_basic:
Subgroup nonblocking-crc-pipe-b:
pass   -> DMESG-WARN (fi-ilk-650)


Have filed a new BZ: https://bugs.freedesktop.org/show_bug.cgi?id=98251

Most likely the above failure isn't related to the patch in question.

Best regards
Akash

Subgroup read-crc-pipe-b-frame-sequence:
dmesg-warn -> PASS   (fi-ilk-650)
Test vgem_basic:
Subgroup unload:
skip   -> PASS   (fi-skl-6770hq)

fi-bdw-5557u total:246  pass:231  dwarn:0   dfail:0   fail:0   skip:15
fi-bsw-n3050 total:246  pass:204  dwarn:0   dfail:0   fail:0   skip:42
fi-bxt-t5700 total:246  pass:216  dwarn:0   dfail:0   fail:0   skip:30
fi-byt-j1900 total:246  pass:212  dwarn:2   dfail:0   fail:1   skip:31
fi-byt-n2820 total:246  pass:210  dwarn:0   dfail:0   fail:1   skip:35
fi-hsw-4770  total:246  pass:223  dwarn:0   dfail:0   fail:0   skip:23
fi-hsw-4770r total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22
fi-ilk-650   total:246  pass:183  dwarn:1   dfail:0   fail:2   skip:60
fi-ivb-3520m total:246  pass:221  dwarn:0   dfail:0   fail:0   skip:25
fi-ivb-3770  total:246  pass:221  dwarn:0   dfail:0   fail:0   skip:25
fi-kbl-7200u total:246  pass:222  dwarn:0   dfail:0   fail:0   skip:24
fi-skl-6260u total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14
fi-skl-6700hq total:246  pass:223  dwarn:0   dfail:0   fail:0   skip:23
fi-skl-6700k total:246  pass:221  dwarn:1   dfail:0   fail:0   skip:24
fi-skl-6770hq total:246  pass:229  dwarn:1   dfail:0   fail:1   skip:15
fi-snb-2520m total:246  pass:210  dwarn:0   dfail:0   fail:0   skip:36
fi-snb-2600  total:246  pass:209  dwarn:0   dfail:0   fail:0   skip:37

Results at /archive/results/CI_IGT_test/Patchwork_2708/

dbcf6fbb541e70fac7db669631958eab2e4e0d9c drm-intel-nightly: 2016y-10m-13d-15h-31m-19s UTC integration manifest
391ff6c drm/i915: Allocate intel_engine_cs structure only for the enabled engines




Re: [Intel-gfx] ✗ Fi.CI.BAT: warning for drm/i915: Allocate intel_engine_cs structure only for the enabled engines (rev3)

2016-10-13 Thread Goel, Akash



On 10/10/2016 6:03 PM, Patchwork wrote:

== Series Details ==

Series: drm/i915: Allocate intel_engine_cs structure only for the enabled 
engines (rev3)
URL   : https://patchwork.freedesktop.org/series/13435/
State : warning

== Summary ==

Series 13435v3 drm/i915: Allocate intel_engine_cs structure only for the 
enabled engines
https://patchwork.freedesktop.org/api/1.0/series/13435/revisions/3/mbox/

Test vgem_basic:
Subgroup unload:
pass   -> SKIP   (fi-skl-6260u)
pass   -> SKIP   (fi-skl-6700hq)
skip   -> PASS   (fi-skl-6700k)


Checked with Chris about the above failure.
He said that the above unload failure for vgem module can't be
attributed to the patch, most likely a CI framework issue.

Best regards
Akash


fi-bdw-5557u total:248  pass:231  dwarn:0   dfail:0   fail:0   skip:17
fi-bsw-n3050 total:248  pass:204  dwarn:0   dfail:0   fail:0   skip:44
fi-bxt-t5700 total:248  pass:217  dwarn:0   dfail:0   fail:0   skip:31
fi-byt-j1900 total:248  pass:214  dwarn:1   dfail:0   fail:1   skip:32
fi-byt-n2820 total:248  pass:210  dwarn:0   dfail:0   fail:1   skip:37
fi-hsw-4770  total:248  pass:224  dwarn:0   dfail:0   fail:0   skip:24
fi-hsw-4770r total:248  pass:224  dwarn:0   dfail:0   fail:0   skip:24
fi-ilk-650   total:248  pass:185  dwarn:0   dfail:0   fail:2   skip:61
fi-ivb-3520m total:248  pass:221  dwarn:0   dfail:0   fail:0   skip:27
fi-ivb-3770  total:248  pass:207  dwarn:0   dfail:0   fail:0   skip:41
fi-kbl-7200u total:248  pass:222  dwarn:0   dfail:0   fail:0   skip:26
fi-skl-6260u total:248  pass:232  dwarn:0   dfail:0   fail:0   skip:16
fi-skl-6700hq total:248  pass:223  dwarn:1   dfail:0   fail:0   skip:24
fi-skl-6700k total:248  pass:222  dwarn:1   dfail:0   fail:0   skip:25
fi-skl-6770hq total:248  pass:231  dwarn:1   dfail:0   fail:1   skip:15
fi-snb-2520m total:248  pass:211  dwarn:0   dfail:0   fail:0   skip:37
fi-snb-2600  total:248  pass:209  dwarn:0   dfail:0   fail:0   skip:39

Results at /archive/results/CI_IGT_test/Patchwork_2652/

f35ed31aea66b3230c366fcba5f3456ae2cb956e drm-intel-nightly: 2016y-10m-10d-11h-28m-51s UTC integration manifest
401facf drm/i915: Allocate intel_engine_cs structure only for the enabled engines




Re: [Intel-gfx] ✗ Fi.CI.BAT: warning for Support for sustained capturing of GuC firmware logs (rev11)

2016-10-13 Thread Goel, Akash



On 10/13/2016 1:18 PM, Tvrtko Ursulin wrote:


On 12/10/2016 19:36, Saarinen, Jani wrote:

== Series Details ==

Series: Support for sustained capturing of GuC firmware logs (rev11)
URL   : https://patchwork.freedesktop.org/series/7910/
State : warning

== Summary ==

Series 7910v11 Support for sustained capturing of GuC firmware logs
https://patchwork.freedesktop.org/api/1.0/series/7910/revisions/11/mbox/

Test drv_module_reload_basic:
 skip   -> PASS   (fi-skl-6770hq)
Test kms_flip:
 Subgroup basic-flip-vs-modeset:
 dmesg-warn -> PASS   (fi-skl-6770hq)
Test kms_pipe_crc_basic:
 Subgroup nonblocking-crc-pipe-c:
 pass   -> DMESG-WARN (fi-ivb-3770)

  [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 215
  [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 215


Either we have a BAT BZ for this or a new one should be raised.


Have filed a new BZ: https://bugs.freedesktop.org/show_bug.cgi?id=98225

Most likely the above failure isn't related to the GuC logging patch 
set. Moreover, GuC based submission (& logging) is disabled by 
default anyway.


Best regards
Akash


That and resolving the question on how to merge it given the relayfs
change, but otherwise is ready.

Regards,

Tvrtko


Test kms_psr_sink_crc:
 Subgroup psr_basic:
 dmesg-warn -> PASS   (fi-skl-6700hq)
Test vgem_basic:
 Subgroup unload:
 skip   -> PASS   (fi-kbl-7200u)
 skip   -> PASS   (fi-hsw-4770)

fi-bdw-5557u total:248  pass:232  dwarn:0   dfail:0   fail:0   skip:16
fi-bsw-n3050 total:248  pass:205  dwarn:0   dfail:0   fail:0   skip:43
fi-bxt-t5700 total:248  pass:217  dwarn:0   dfail:0   fail:0   skip:31
fi-byt-j1900 total:248  pass:213  dwarn:2   dfail:0   fail:1   skip:32
fi-byt-n2820 total:248  pass:211  dwarn:0   dfail:0   fail:1   skip:36
fi-hsw-4770  total:248  pass:225  dwarn:0   dfail:0   fail:0   skip:23
fi-hsw-4770r total:248  pass:225  dwarn:0   dfail:0   fail:0   skip:23
fi-ivb-3520m total:248  pass:222  dwarn:0   dfail:0   fail:0   skip:26
fi-ivb-3770  total:248  pass:221  dwarn:1   dfail:0   fail:0   skip:26
fi-kbl-7200u total:248  pass:223  dwarn:0   dfail:0   fail:0   skip:25
fi-skl-6260u total:248  pass:233  dwarn:0   dfail:0   fail:0   skip:15
fi-skl-6700hq total:248  pass:225  dwarn:0   dfail:0   fail:0   skip:23
fi-skl-6700k total:248  pass:222  dwarn:1   dfail:0   fail:0   skip:25
fi-skl-6770hq total:248  pass:231  dwarn:1   dfail:0   fail:1   skip:15
fi-snb-2520m total:248  pass:211  dwarn:0   dfail:0   fail:0   skip:37
fi-snb-2600  total:248  pass:210  dwarn:0   dfail:0   fail:0   skip:38

Results at /archive/results/CI_IGT_test/Patchwork_2691/

14740bb25ec36fe4ce8042af3eb48aeb45e5bc13 drm-intel-nightly: 2016y-10m-12d-16h-18m-24s UTC integration manifest
a590f8c drm/i915: Mark the GuC log buffer flush interrupts handling WQ as freezable
a001c3d drm/i915: Early creation of relay channel for capturing boot time logs
af3ee1c drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer
fbbd457 drm/i915: Debugfs support for GuC logging control
656513f drm/i915: Support for forceful flush of GuC log buffer
a68d17f drm/i915: Augment i915 error state to include the dump of GuC log buffer
da8274a drm/i915: Increase GuC log buffer size to reduce flush interrupts
4f24c12 drm/i915: Optimization to reduce the sampling time of GuC log buffer
4739ad8 drm/i915: Add stats for GuC log buffer flush interrupts
2e8c052 drm/i915: New lock to serialize the Host2GuC actions
954e48b drm/i915: Add a relay backed debugfs interface for capturing GuC logs
23a81bb relay: Use per CPU constructs for the relay channel buffer pointers
8fd01d3 drm/i915: Handle log buffer flush interrupt event from GuC
44610d4 drm/i915: Support for GuC interrupts
05ede72 drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set
ffbd48f drm/i915: New structure to contain GuC logging related fields
317ba9e drm/i915: Add GuC ukernel logging related fields to fw interface file
4832507 drm/i915: Decouple GuC log setup from verbosity parameter


Jani Saarinen
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo


Re: [Intel-gfx] [PATCH v4] tools/intel_guc_logger: Utility for capturing GuC firmware logs in a file

2016-10-10 Thread Goel, Akash



On 10/10/2016 7:22 PM, Tvrtko Ursulin wrote:


On 10/10/2016 11:59, akash.g...@intel.com wrote:

From: Akash Goel 

This patch provides a test utility which helps capture GuC firmware logs and
then dump them to file.
The logs are pulled from a debugfs file '/sys/kernel/debug/dri/guc_log' and
by default stored into a file 'guc_log_dump.dat'. The name, including the
location, of the output file can be changed through a command line argument.

The utility goes into an infinite loop where it waits for the arrival of new
logs and as soon as a new set of logs is produced it captures them in its local
buffer which is then flushed out to the file on disk.
Any time when logging needs to be ended, User can stop this utility (CTRL+C).

Before entering into a loop, it first discards whatever logs are present in
the debugfs file.
This way User can first launch this utility and then start a workload/activity
for which GuC firmware logs are to be actually captured and keep running the
utility for as long as it's needed, like once the workload is over this utility
can be forcefully stopped.

If the logging wasn't enabled on GuC side by the Driver at boot time, utility
will first enable the logging and later on when it is stopped (CTRL+C) it will
also pause the logging on GuC side.
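
As a rough illustration of the enable/pause step, the control value written to the log control file can be built as below. The encoding follows the expression used in guc_log_control() in the patch quoted further down; reading it as "bit 0 enables logging, verbosity starts at bit 4" is an inference from that expression, and the helper name format_guc_log_control() is hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the control word as a hex string, as the utility does before
 * writing it to the control file.
 * Returns the string length, or -1 if buf is too small.
 */
static int format_guc_log_control(bool enable, int verbosity,
				  char *buf, size_t len)
{
	/* Same expression as in the patch: (verbosity << 4) | 0x1, or 0. */
	uint64_t val = enable ? (((uint64_t)verbosity << 4) | 0x1) : 0;
	int ret = snprintf(buf, len, "0x%" PRIx64, val);

	if (ret < 0 || (size_t)ret >= len)
		return -1;

	return ret;
}
```

With the default verbosity of 3 this yields the string "0x31" for enabling, and "0x0" for pausing the logging again.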

v2:
- Use combination of alarm system call & SIGALRM signal to run the utility
  for required duration. (Tvrtko)
- Fix inconsistencies, do minor cleanup and refactoring. (Tvrtko)

v3:
- Fix discrepancy for the output file command line option and update the
  Usage/help string.

v4:
- Update the exit condition for flusher thread, now will exit only after
  the capture loop is over and not when the flag to stop logging is set.
  This handles a corner case, due to which the dump of last captured buffer
  was getting missed.
- Add a newline character at the end of assert messages.
- Avoid the assert for the case, which occurs very rarely, when there are no
  bytes read from the relay file.

Cc: Tvrtko Ursulin 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin  (v3)
---
  tools/Makefile.sources   |   1 +
  tools/intel_guc_logger.c | 438
+++
  2 files changed, 439 insertions(+)
  create mode 100644 tools/intel_guc_logger.c

diff --git a/tools/Makefile.sources b/tools/Makefile.sources
index 2bb6c8e..be58871 100644
--- a/tools/Makefile.sources
+++ b/tools/Makefile.sources
@@ -19,6 +19,7 @@ tools_prog_lists =\
  intel_gpu_time\
  intel_gpu_top\
  intel_gtt\
+intel_guc_logger\
  intel_infoframes\
  intel_l3_parity\
  intel_lid\
diff --git a/tools/intel_guc_logger.c b/tools/intel_guc_logger.c
new file mode 100644
index 000..159a54e
--- /dev/null
+++ b/tools/intel_guc_logger.c
@@ -0,0 +1,438 @@
+
+#define _GNU_SOURCE  /* For using O_DIRECT */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "igt.h"
+
+#define MB(x) ((uint64_t)(x) * 1024 * 1024)
+#ifndef PAGE_SIZE
+  #define PAGE_SIZE 4096
+#endif
+/* Currently the size of GuC log buffer is 19 pages & so is the size of relay
+ * subbuffer. If the size changes in future, then this define also needs to be
+ * updated accordingly.
+ */
+#define SUBBUF_SIZE (19*PAGE_SIZE)
+/* Need large buffering from logger side to hide the DISK IO latency, Driver
+ * can only store 8 snapshots of GuC log buffer in relay.
+ */
+#define NUM_SUBBUFS 100
+
+#define RELAY_FILE_NAME  "guc_log"
+#define DEFAULT_OUTPUT_FILE_NAME  "guc_log_dump.dat"
+#define CONTROL_FILE_NAME "i915_guc_log_control"
+
+char *read_buffer;
+char *out_filename;
+int poll_timeout = 2; /* by default 2ms timeout */
+pthread_mutex_t mutex;
+pthread_t flush_thread;
+int verbosity_level = 3; /* by default capture logs at max verbosity */
+uint32_t produced, consumed;
+uint64_t total_bytes_written;
+int num_buffers = NUM_SUBBUFS;
+int relay_fd, outfile_fd = -1;
+uint32_t test_duration, max_filesize;
+pthread_cond_t underflow_cond, overflow_cond;
+bool stop_logging, discard_oldlogs, capturing_stopped;
+
+static void guc_log_control(bool enable_logging)
+{
+int control_fd;
+char data[19];
+uint64_t val;
+int ret;
+
+control_fd = igt_debugfs_open(CONTROL_FILE_NAME, O_WRONLY);
+igt_assert_f(control_fd >= 0, "couldn't open the guc log control file\n");
+
+val = enable_logging ? ((verbosity_level << 4) | 0x1) : 0;
+
+ret = snprintf(data, sizeof(data), "0x%" PRIx64, val);
+igt_assert(ret > 2 && ret < sizeof(data));
+
+ret = write(control_fd, data, ret);
+igt_assert_f(ret > 0, "couldn't write to the log control file\n");
+
+close(control_fd);
+}
+
+static void int_sig_handler(int sig)
+{
+igt_info("received signal %d\n", 

Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Allocate intel_engine_cs structure only for the enabled engines

2016-10-07 Thread Goel, Akash



On 10/7/2016 5:14 PM, Chris Wilson wrote:

On Fri, Oct 07, 2016 at 09:58:07AM -, Patchwork wrote:

== Series Details ==

Series: drm/i915: Allocate intel_engine_cs structure only for the enabled 
engines
URL   : https://patchwork.freedesktop.org/series/13435/
State : failure

== Summary ==

Series 13435v1 drm/i915: Allocate intel_engine_cs structure only for the 
enabled engines
https://patchwork.freedesktop.org/api/1.0/series/13435/revisions/1/mbox/

Test drv_module_reload_basic:
dmesg-warn -> PASS   (fi-ilk-650)
Test gem_exec_parallel:
Subgroup basic:
pass   -> INCOMPLETE (fi-snb-2600)
Test gem_sync:
Subgroup basic-store-all:
pass   -> INCOMPLETE (fi-bxt-t5700)
pass   -> INCOMPLETE (fi-byt-j1900)
pass   -> INCOMPLETE (fi-bsw-n3050)
pass   -> INCOMPLETE (fi-hsw-4770)
pass   -> INCOMPLETE (fi-skl-6700k)
pass   -> INCOMPLETE (fi-skl-6770hq)
pass   -> INCOMPLETE (fi-hsw-4770r)
pass   -> INCOMPLETE (fi-snb-2520m)
pass   -> INCOMPLETE (fi-kbl-7200u)
pass   -> INCOMPLETE (fi-skl-6700hq)
pass   -> INCOMPLETE (fi-ivb-3520m)
pass   -> INCOMPLETE (fi-ivb-3770)
pass   -> INCOMPLETE (fi-bdw-5557u)
pass   -> INCOMPLETE (fi-skl-6260u)


This is due to missing:

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 8c08ced..44ef6b5 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -415,7 +415,7 @@ intel_engine_sync_index(struct intel_engine_cs *engine,
 * vcs2 -> 0 = rcs, 1 = vcs, 2 = bcs, 3 = vecs;
 */

-   idx = (other - engine) - 1;
+   idx = (other->id - engine->id) - 1;
if (idx < 0)
idx += I915_NUM_ENGINES;

I believe that's the only case where we compare elements of the array,
and even scheduled for removal.

Thank you very much for finding this anomaly.
So the cross-engine synchronization was going for a toss, causing the
above tests to get stuck or execute slowly?
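
To illustrate why the one-line change matters: once each intel_engine_cs is allocated on its own, the engines are no longer elements of one array, so subtracting their pointers no longer yields a difference of engine numbers. A minimal userspace sketch of the id-based calculation follows. The enum values, their order (taken from the comment quoted in the hunk above) and the struct are hypothetical stand-ins, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's engine ids and engine struct. */
enum engine_id { RCS = 0, VCS, BCS, VECS, VCS2, NUM_ENGINES };

struct engine {
	enum engine_id id;
};

/*
 * Index of 'other' relative to 'engine', wrapping around the engine list.
 * Only the ids are used, so the result does not depend on where each
 * struct happens to be allocated.
 */
int sync_index(const struct engine *engine, const struct engine *other)
{
	int idx;

	assert(engine != NULL && other != NULL);

	idx = ((int)other->id - (int)engine->id) - 1;
	if (idx < 0)
		idx += NUM_ENGINES;

	return idx;
}
```

With this ordering, sync_index(&vcs2, &rcs) gives 0 and sync_index(&rcs, &vcs2) gives 3, independent of where each struct lives in memory.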


best regards
Akash

-Chris





___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for Support for sustained capturing of GuC firmware logs (rev10)

2016-09-12 Thread Goel, Akash



On 9/12/2016 1:53 PM, Patchwork wrote:

== Series Details ==

Series: Support for sustained capturing of GuC firmware logs (rev10)
URL   : https://patchwork.freedesktop.org/series/7910/
State : failure

== Summary ==

Series 7910v10 Support for sustained capturing of GuC firmware logs
http://patchwork.freedesktop.org/api/1.0/series/7910/revisions/10/mbox/

Test drv_module_reload_basic:
skip   -> PASS   (fi-skl-6260u)
Test kms_cursor_legacy:
Subgroup basic-cursor-vs-flip-varying-size:
pass   -> FAIL   (fi-bsw-n3050)


This subtest seems to have a history of sporadic failures, as per
the link
http://benchsrv.fi.intel.com/archive/results/CI_IGT_test/igt@kms_cursor_leg...@basic-cursor-vs-flip-varying-size.html


Filed a new BZ: https://bugs.freedesktop.org/show_bug.cgi?id=97775

The failure is sporadic, is most likely pre-existent & unrelated to the
GuC logging patch set.

Best regards
Akash


fi-bdw-5557u total:252  pass:236  dwarn:0   dfail:0   fail:1   skip:15
fi-bsw-n3050 total:252  pass:204  dwarn:0   dfail:0   fail:2   skip:46
fi-hsw-4770k total:252  pass:229  dwarn:0   dfail:0   fail:1   skip:22
fi-hsw-4770r total:252  pass:225  dwarn:0   dfail:0   fail:1   skip:26
fi-ilk-650   total:252  pass:182  dwarn:0   dfail:0   fail:3   skip:67
fi-ivb-3520m total:252  pass:220  dwarn:0   dfail:0   fail:1   skip:31
fi-ivb-3770  total:252  pass:220  dwarn:0   dfail:0   fail:1   skip:31
fi-skl-6260u total:252  pass:237  dwarn:0   dfail:0   fail:1   skip:14
fi-skl-6700k total:252  pass:222  dwarn:1   dfail:0   fail:1   skip:28
fi-snb-2520m total:252  pass:206  dwarn:0   dfail:0   fail:1   skip:45
fi-snb-2600  total:252  pass:206  dwarn:0   dfail:0   fail:1   skip:45

Results at /archive/results/CI_IGT_test/Patchwork_2491/

5986f290e25f42d3d5df390411cc43683deb1301 drm-intel-nightly: 2016y-09m-08d-09h-11m-50s UTC integration manifest
b66ec09 drm/i915: Mark the GuC log buffer flush interrupts handling WQ as freezable
f0170a8 drm/i915: Early creation of relay channel for capturing boot time logs
28365d9 drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer
65eafef drm/i915: Debugfs support for GuC logging control
0db5fb8 drm/i915: Support for forceful flush of GuC log buffer
0a7b34a drm/i915: Augment i915 error state to include the dump of GuC log buffer
671b49b drm/i915: Increase GuC log buffer size to reduce flush interrupts
270f061 drm/i915: Optimization to reduce the sampling time of GuC log buffer
a2df951 drm/i915: Add stats for GuC log buffer flush interrupts
4147500 drm/i915: New lock to serialize the Host2GuC actions
e101194 drm/i915: Add a relay backed debugfs interface for capturing GuC logs
eabdd2a relay: Use per CPU constructs for the relay channel buffer pointers
b77518d drm/i915: Handle log buffer flush interrupt event from GuC
de54755 drm/i915: Support for GuC interrupts
c3228bb drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set
1c4e929 drm/i915: New structure to contain GuC logging related fields
b073561 drm/i915: Add GuC ukernel logging related fields to fw interface file
6ed3738 drm/i915: Decouple GuC log setup from verbosity parameter




Re: [Intel-gfx] [PATCH] tools/intel_guc_logger: Utility for capturing GuC firmware logs in a file

2016-09-07 Thread Goel, Akash



On 9/7/2016 3:07 PM, Tvrtko Ursulin wrote:


On 07/09/16 09:44, Chris Wilson wrote:

On Wed, Sep 07, 2016 at 01:40:27PM +0530, Goel, Akash wrote:

On 9/6/2016 9:22 PM, Tvrtko Ursulin wrote:


[snip]


+while (!stop_logging)
+{
+if (test_duration && (igt_seconds_elapsed() > test_duration)) {


If you agree to allow no poll period then this would not work, right? In
that case you would need to use alarm(2) or something.



Can calculate the timeout value for the poll call as:
 if (poll_timeout < 0) {
 timeout = test_duration - igt_seconds_elapsed();
 }


My point was that with an indefinite poll the loop will not run if there is
no log data, so the timeout will not work if implemented like this.


I understood your concern in the first place but probably didn't put
forth my point clearly.

For more clarity, this is how I think it can be addressed.

--- a/tools/intel_guc_logger.c
+++ b/tools/intel_guc_logger.c
@@ -370,6 +370,8 @@ int main(int argc, char **argv)
  {
  struct pollfd relay_poll_fd;
  struct timespec start={};
+uint32_t time_elapsed;
+int timeout;
  int nfds;
  int ret;

@@ -395,10 +397,17 @@ int main(int argc, char **argv)

  while (!stop_logging)
  {
-if (test_duration && (igt_seconds_elapsed() > test_duration)) {
-igt_debug("Ran for stipulated %d seconds, exit now\n", test_duration);
-stop_logging = true;
-break;
+timeout = poll_timeout;
+if (test_duration) {
+time_elapsed = igt_seconds_elapsed();
+if (time_elapsed >= test_duration) {
+igt_debug("Ran for stipulated %d seconds, exit now\n", test_duration);
+stop_logging = true;
+break;
+}
+if (poll_timeout < 0)
+timeout = (test_duration - time_elapsed) * 1000;
}

  /* Wait/poll for the new data to be available, relay doesn't
@@ -412,7 +421,7 @@ int main(int argc, char **argv)
   * than a jiffy gap between 2 flush interrupts) and relay runs
   * out of sub buffers to store the new logs.
   */
-ret = poll(&relay_poll_fd, nfds, poll_timeout);
+ret = poll(&relay_poll_fd, nfds, timeout);
  if (ret < 0) {
  if (errno == EINTR)
  break;

So I will not poll with an indefinite timeout, and will instead adjust the
timeout value as per the test's duration.
Does it look ok?
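
The timeout selection in the hunk above can be pulled out into a small helper, which makes the three cases easier to see. This is only a sketch: next_poll_timeout() and its parameters are made-up names, and the elapsed time is passed in rather than read from igt_seconds_elapsed().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Choose the timeout (in ms) for the next poll() call.
 *
 * poll_timeout_ms < 0 means "block indefinitely". When a test duration is
 * set, an indefinite wait is capped to the time that is still remaining.
 * Returns false once the duration has run out, true otherwise.
 * Note: the seconds-to-ms conversion is not guarded against overflow.
 */
bool next_poll_timeout(int poll_timeout_ms, uint32_t test_duration_s,
		       uint32_t elapsed_s, int *timeout_ms)
{
	assert(timeout_ms != NULL);

	*timeout_ms = poll_timeout_ms;

	if (!test_duration_s)
		return true;		/* no time limit requested */

	if (elapsed_s >= test_duration_s)
		return false;		/* ran for the stipulated time */

	if (poll_timeout_ms < 0)
		*timeout_ms = (int)(test_duration_s - elapsed_s) * 1000;

	return true;
}
```

A finite poll_timeout is passed through unchanged, since the loop then wakes up often enough to notice that the duration has elapsed.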


Since the comment still refers to a kernel bug that you've fixed, it can
just go. The timeout calculation is indeed more simply expressed as
alarm(timeout).


Yes I wrote privately that's especially true since there is already a
handler for SIGINT which would do the right thing for SIGALRM as well. I
don't feel so strongly about this but now that we both think the same
maybe go for the simpler implementation if you don't mind Akash?


Thanks much for the suggestion.
Will use 'alarm(timeout)', it's definitely much simpler.
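
For reference, a minimal sketch of the alarm() based variant, assuming the handler is installed without SA_RESTART so that a blocking poll() returns with EINTR when SIGALRM arrives. wait_until_alarm() and stop_handler() are hypothetical names, not part of the tool.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t stop_requested;

/* Only sets a flag, so the same handler could serve SIGINT and SIGALRM. */
static void stop_handler(int sig)
{
	(void)sig;
	stop_requested = 1;
}

/*
 * Block in poll() on 'fd' with no timeout, but give up after 'seconds'.
 * Returns true if the wait was ended by the signal, false if poll()
 * returned for another reason (data available or an error).
 */
bool wait_until_alarm(int fd, unsigned int seconds)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	struct sigaction sa;
	int rc, ret, err;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = stop_handler;	/* sa_flags stays 0: no SA_RESTART */
	sigemptyset(&sa.sa_mask);
	rc = sigaction(SIGALRM, &sa, NULL);
	assert(rc == 0);
	(void)rc;

	stop_requested = 0;
	alarm(seconds);

	ret = poll(&pfd, 1, -1);
	err = errno;

	alarm(0);			/* cancel an alarm that is still pending */

	return ret < 0 && err == EINTR && stop_requested;
}
```

There is a small window between alarm() and poll() in which the signal could arrive before poll() starts waiting; with timeouts of a second or more that is unlikely, but a loop that re-checks the flag before every poll() avoids relying on it.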


And fixing the blocking read() is about 10 lines in the kernel...


Haven't checked but if that is the case, since we are already fixing
relayfs issues, it would be good to do that one as well since it would
simplify the logger. Because if we do it straight away then we know
logger can use it, and if we leave it for later then it gets uglier for
the logger.

But if we cannot make the fix go in the same kernel version (or earlier)
than the GuC logging then I think we don't need to block on that.



Sorry, I am not sure whether we would gain much by trying to add
support for blocking read in relay.

For a regular disk file, which is of a fixed size, it makes sense to
have a provision to block the reader until the file's data is paged in
from the disk into RAM.

But for relay, the data to be read would invariably be generated
dynamically, which can stop at any time, and thus the reader could get
blocked forever.

I think the current relay semantics are fine: if there is no data left
to be read in the channel buffers, zero will be returned, and clients can
get to know about the generation of new data through poll (using a
timeout).
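
The read side of those semantics can be sketched as a drain loop that keeps reading until read() returns 0 and then goes back to poll(). drain_available() is a hypothetical helper; it assumes, as described above, that a return of 0 means nothing is left to read right now.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read from 'fd' in chunks of up to 'chunk' bytes until read() returns 0,
 * i.e. until there is nothing left to consume at the moment.
 * Returns the total number of bytes read, or -1 on a read error.
 */
ssize_t drain_available(int fd, char *buf, size_t chunk)
{
	ssize_t total = 0;

	assert(buf != NULL && chunk > 0);

	for (;;) {
		ssize_t ret = read(fd, buf, chunk);

		if (ret == 0)
			break;		/* no data left: caller returns to poll() */
		if (ret < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}

		/* A real logger would hand 'buf' to its writer here. */
		total += ret;
	}

	return total;
}
```

The caller would invoke this each time poll() reports the file as readable, and once more on shutdown to pick up whatever is left.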

Best regards
Akash


Regards,






Tvrtko



Re: [Intel-gfx] [PATCH] tools/intel_guc_logger: Utility for capturing GuC firmware logs in a file

2016-09-07 Thread Goel, Akash



On 9/6/2016 9:22 PM, Tvrtko Ursulin wrote:


On 06/09/16 16:33, Goel, Akash wrote:

On 9/6/2016 6:47 PM, Tvrtko Ursulin wrote:

Hi,

On 06/09/16 11:43, akash.g...@intel.com wrote:

From: Akash Goel <akash.g...@intel.com>

This patch provides a test utility which helps capture GuC firmware logs and
then dump them to file.
The logs are pulled from a debugfs file '/sys/kernel/debug/dri/guc_log' and
stored into a file '/tmp/guc_log_dump.dat', the name of the output file can
be changed through a command line argument.

The utility goes into an infinite loop where it waits for the arrival of new
logs and as soon as new set of logs are produced it captures them in its local
buffer which is then flushed out to the file on disk.
Any time when logging needs to be ended, User can stop this utility (CTRL+C).

Before entering into a loop, it first discards whatever logs are present in
the debugfs file.
This way User can first launch this utility and then start a workload/activity
for which GuC firmware logs are to be actually captured and keep running the
utility for as long as its needed, like once the workload is over this utility
can be forcefully stopped.

If the logging wasn't enabled on GuC side by the Driver at boot time, utility
will first enable the logging and later on when it is stopped (CTRL+C) it will
also pause the logging on GuC side.

Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  tools/Makefile.sources   |   1 +
  tools/intel_guc_logger.c | 441
+++
  2 files changed, 442 insertions(+)
  create mode 100644 tools/intel_guc_logger.c

diff --git a/tools/Makefile.sources b/tools/Makefile.sources
index 2bb6c8e..be58871 100644
--- a/tools/Makefile.sources
+++ b/tools/Makefile.sources
@@ -19,6 +19,7 @@ tools_prog_lists =\
  intel_gpu_time\
  intel_gpu_top\
  intel_gtt\
+intel_guc_logger\
  intel_infoframes\
  intel_l3_parity\
  intel_lid\
diff --git a/tools/intel_guc_logger.c b/tools/intel_guc_logger.c
new file mode 100644
index 000..92172fa
--- /dev/null
+++ b/tools/intel_guc_logger.c
@@ -0,0 +1,441 @@
+
+#define _GNU_SOURCE  /* For using O_DIRECT */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "igt.h"
+
+#define MB(x) ((uint64_t)(x) * 1024 * 1024)
+#ifndef PAGE_SIZE
+  #define PAGE_SIZE 4096
+#endif
+#define SUBBUF_SIZE (19*PAGE_SIZE)
+/* Need large buffering from logger side to hide the DISK IO latency, Driver
+ * can only store 8 snapshots of GuC log buffer in relay.
+ */
+#define NUM_SUBBUFS 100
+
+#define RELAY_FILE_NAME  "guc_log"
+#define CONTROL_FILE_NAME "i915_guc_log_control"
+
+char *read_buffer;
+char *out_filename;
+int poll_timeout = 2; /* by default 2ms timeout */
+pthread_mutex_t mutex;
+pthread_t flush_thread;
+int verbosity_level = 3; /* by default capture logs at max verbosity */
+uint32_t produced, consumed;
+uint64_t total_bytes_written;
+int num_buffers = NUM_SUBBUFS;
+int relay_fd, outfile_fd = -1;
+bool stop_logging, discard_oldlogs;
+uint32_t test_duration, max_filesize;
+pthread_cond_t underflow_cond, overflow_cond;
+
+static void guc_log_control(bool enable_logging)
+{
+int control_fd;
+char data[19];
+uint64_t val;
+int ret;
+
+control_fd = igt_debugfs_open(CONTROL_FILE_NAME, O_WRONLY);
+if (control_fd < 0)
+igt_assert_f(0, "Couldn't open the guc log control file");
+
+val = enable_logging ? ((verbosity_level << 4) | 0x1) : 0;
+
+snprintf(data, sizeof(data), "0x%" PRIx64, val);
+ret = write(control_fd, data, strlen(data) + 1);


Minor: It looks safe like it is but something like below would maybe be
more robust?

ret = snprintf(data, sizeof(data), "0x%" PRIx64, val);
igt_assert(ret > 2 && ret < sizeof(data));


ok will add, but possibility of failure will be really remote here.
but igt_assert(ret > 0) should suffice.


Yes there is no possibility for failure as it stands, just more robust
implementation should someone change something in the future. That's why
I said you could also decide to keep it as is. My version also avoided
the strlen since snprintf already tells you that.



fine, will use your version then.


ret = write(control_fd, data, ret);
igt_assert(ret > 0); // assuming short writes can't happen

Up to you.
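
A small sketch of the suggestion, assuming the same "0x%" PRIx64 format and the value encoding from the patch. format_control_value() and control_value() are hypothetical helper names. Comparing the length as size_t avoids the signed/unsigned comparison that ret < sizeof(data) would otherwise do implicitly.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Same encoding as in the patch: verbosity from bit 4 up, enable flag in bit 0. */
uint64_t control_value(int verbosity_level, int enable_logging)
{
	return enable_logging ? (((uint64_t)verbosity_level << 4) | 0x1) : 0;
}

/*
 * Format 'val' as "0x<hex>" into 'buf'.
 * Returns the string length (without the trailing NUL), which can be used
 * directly as the byte count for write(), or -1 if snprintf() failed or
 * the output did not fit.
 */
int format_control_value(char *buf, size_t size, uint64_t val)
{
	int len;

	assert(buf != NULL);

	len = snprintf(buf, size, "0x%" PRIx64, val);

	/* Expect "0x" plus at least one digit, and no truncation. */
	if (len <= 2 || (size_t)len >= size)
		return -1;

	return len;
}
```

The largest 64-bit value formats to 18 characters, which is why a 19-byte buffer is enough to hold the string and its terminator.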


+if (ret < 0)
+igt_assert_f(0, "Couldn't write to the log control file");
+
+close(control_fd);
+}
+
+static void int_sig_handler(int sig)
+{
+igt_info("Received signal %d\n", sig);
+
+stop_logging = true;
+}
+
+static void pull_leftover_data(void)
+{
+unsigned int bytes_read = 0;
+int ret;
+
+while (1) {
+/* Read the logs from relay buffer 

Re: [Intel-gfx] [PATCH] tools/intel_guc_logger: Utility for capturing GuC firmware logs in a file

2016-09-06 Thread Goel, Akash



On 9/6/2016 6:47 PM, Tvrtko Ursulin wrote:


Hi,

On 06/09/16 11:43, akash.g...@intel.com wrote:

From: Akash Goel 

This patch provides a test utility which helps capture GuC firmware logs and
then dump them to file.
The logs are pulled from a debugfs file '/sys/kernel/debug/dri/guc_log' and
stored into a file '/tmp/guc_log_dump.dat', the name of the output file can
be changed through a command line argument.

The utility goes into an infinite loop where it waits for the arrival of new
logs and as soon as new set of logs are produced it captures them in its local
buffer which is then flushed out to the file on disk.
Any time when logging needs to be ended, User can stop this utility (CTRL+C).

Before entering into a loop, it first discards whatever logs are present in
the debugfs file.
This way User can first launch this utility and then start a workload/activity
for which GuC firmware logs are to be actually captured and keep running the
utility for as long as its needed, like once the workload is over this utility
can be forcefully stopped.

If the logging wasn't enabled on GuC side by the Driver at boot time, utility
will first enable the logging and later on when it is stopped (CTRL+C) it will
also pause the logging on GuC side.

Signed-off-by: Akash Goel 
---
  tools/Makefile.sources   |   1 +
  tools/intel_guc_logger.c | 441
+++
  2 files changed, 442 insertions(+)
  create mode 100644 tools/intel_guc_logger.c

diff --git a/tools/Makefile.sources b/tools/Makefile.sources
index 2bb6c8e..be58871 100644
--- a/tools/Makefile.sources
+++ b/tools/Makefile.sources
@@ -19,6 +19,7 @@ tools_prog_lists =\
  intel_gpu_time\
  intel_gpu_top\
  intel_gtt\
+intel_guc_logger\
  intel_infoframes\
  intel_l3_parity\
  intel_lid\
diff --git a/tools/intel_guc_logger.c b/tools/intel_guc_logger.c
new file mode 100644
index 000..92172fa
--- /dev/null
+++ b/tools/intel_guc_logger.c
@@ -0,0 +1,441 @@
+
+#define _GNU_SOURCE  /* For using O_DIRECT */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "igt.h"
+
+#define MB(x) ((uint64_t)(x) * 1024 * 1024)
+#ifndef PAGE_SIZE
+  #define PAGE_SIZE 4096
+#endif
+#define SUBBUF_SIZE (19*PAGE_SIZE)
+/* Need large buffering from logger side to hide the DISK IO latency, Driver
+ * can only store 8 snapshots of GuC log buffer in relay.
+ */
+#define NUM_SUBBUFS 100
+
+#define RELAY_FILE_NAME  "guc_log"
+#define CONTROL_FILE_NAME "i915_guc_log_control"
+
+char *read_buffer;
+char *out_filename;
+int poll_timeout = 2; /* by default 2ms timeout */
+pthread_mutex_t mutex;
+pthread_t flush_thread;
+int verbosity_level = 3; /* by default capture logs at max verbosity */
+uint32_t produced, consumed;
+uint64_t total_bytes_written;
+int num_buffers = NUM_SUBBUFS;
+int relay_fd, outfile_fd = -1;
+bool stop_logging, discard_oldlogs;
+uint32_t test_duration, max_filesize;
+pthread_cond_t underflow_cond, overflow_cond;
+
+static void guc_log_control(bool enable_logging)
+{
+int control_fd;
+char data[19];
+uint64_t val;
+int ret;
+
+control_fd = igt_debugfs_open(CONTROL_FILE_NAME, O_WRONLY);
+if (control_fd < 0)
+igt_assert_f(0, "Couldn't open the guc log control file");
+
+val = enable_logging ? ((verbosity_level << 4) | 0x1) : 0;
+
+snprintf(data, sizeof(data), "0x%" PRIx64, val);
+ret = write(control_fd, data, strlen(data) + 1);


Minor: It looks safe like it is but something like below would maybe be
more robust?

ret = snprintf(data, sizeof(data), "0x%" PRIx64, val);
igt_assert(ret > 2 && ret < sizeof(data));


ok will add, but possibility of failure will be really remote here.
but igt_assert(ret > 0) should suffice.


ret = write(control_fd, data, ret);
igt_assert(ret > 0); // assuming short writes can't happen

Up to you.


+if (ret < 0)
+igt_assert_f(0, "Couldn't write to the log control file");
+
+close(control_fd);
+}
+
+static void int_sig_handler(int sig)
+{
+igt_info("Received signal %d\n", sig);
+
+stop_logging = true;
+}
+
+static void pull_leftover_data(void)
+{
+unsigned int bytes_read = 0;
+int ret;
+
+while (1) {
+/* Read the logs from relay buffer */
+ret = read(relay_fd, read_buffer, SUBBUF_SIZE);
+if (!ret)
+break;
+else if (ret < 0)
+igt_assert_f(0, "Failed to read from the guc log file");
+else if (ret < SUBBUF_SIZE)
+igt_assert_f(0, "invalid read from relay file");
+
+bytes_read += ret;
+
+if (outfile_fd > 0) {



>= 0 I think. Or is it even needed since open_output_file asserts if it
fails to open?


Actually pull_leftover_data() will be called twice, once before opening
the 

Re: [Intel-gfx] [PATCH 15/19] drm/i915: Debugfs support for GuC logging control

2016-08-20 Thread Goel, Akash



On 8/19/2016 11:48 PM, Chris Wilson wrote:

On Fri, Aug 19, 2016 at 02:13:14PM +0530, akash.g...@intel.com wrote:

+static int i915_guc_log_control_get(void *data, u64 *val)
+{
+   struct drm_device *dev = data;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+
+   if (!dev_priv->guc.log.vma)
+   return -EINVAL;


return -ENODEV;

Fine will change the return code.



+
+   *val = i915.guc_log_level;
+
+   return 0;
+}
+
+static int i915_guc_log_control_set(void *data, u64 val)
+{
+   struct drm_device *dev = data;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+   int ret;


if (!dev_priv->guc.log.vma)
return -ENODEV;

you don't need struct_mutex to check for its existence.


Fine will lock the struct_mutex after the NULL vma check.


Re: [Intel-gfx] [PATCH 14/19] drm/i915: Forcefully flush GuC log buffer on reset

2016-08-20 Thread Goel, Akash



On 8/19/2016 11:40 PM, Chris Wilson wrote:

On Fri, Aug 19, 2016 at 02:13:13PM +0530, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

Before capturing the GuC logs as a part of error state, there should be a
force log buffer flush action sent to GuC before proceeding with GPU reset
and re-initializing GUC. There could be some data in the log buffer which
is yet to be captured and those logs would be particularly useful to
understand why the GPU reset was initiated.


There's no point if we can't wait for any writes to complete, so just take
the snapshot of the log at the time of the hang.





+void i915_guc_flush_logs(struct drm_i915_private *dev_priv, bool can_wait)
+{
+   if (!i915.enable_guc_submission || (i915.guc_log_level < 0))
+   return;
+
+   /* First disable the interrupts, will be renabled afterwards */
+   gen9_disable_guc_interrupts(dev_priv);


calls synchronize_irq() which is also illegal from the atomic context of
error capture.


Fine, will not call gen9_disable_guc_interrupts, just like flush_work, 
from the error state capture path.


But I feel it could still be useful to invoke 
host2guc_force_logbuffer_flush().


Best regards
Akash

-Chris




Re: [Intel-gfx] [PATCH 17/19] drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer

2016-08-20 Thread Goel, Akash



On 8/19/2016 11:49 PM, Chris Wilson wrote:

On Fri, Aug 19, 2016 at 02:13:16PM +0530, akash.g...@intel.com wrote:

From: Akash Goel 

In order to have fast reads from the GuC log buffer, used SSE4.1 movntdqa
based memcpy function i915_memcpy_from_wc.
GuC log buffer has a WC type vmalloc mapping and copying using movntdqa
from WC type memory is almost as fast as reading from WB memory.
This will further reduce the log buffer sampling time, so is needed dearly
to deal with the flush interrupt storm when GuC is generating logs at a
very high rate.
Ideally SSE 4.1 should be present on all chipsets supporting GuC based
submissions, but if not then logging will not be enabled.

v2: Rebase.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 


Should be squashed with patch 16 (use MAP_WC).

Fine, will squash, but could you please tell me what issue there could
be with the 2 patches being separate? Either both will be merged or
neither of them will be.


Best regards
Akash


-Chris




Re: [Intel-gfx] [PATCH 08/19] drm/i915: Add a relay backed debugfs interface for capturing GuC logs

2016-08-20 Thread Goel, Akash



On 8/19/2016 11:33 PM, Chris Wilson wrote:

On Fri, Aug 19, 2016 at 02:13:07PM +0530, akash.g...@intel.com wrote:

 static void *guc_get_write_buffer(struct intel_guc *guc)
 {
-   return NULL;
+   /* FIXME: Cover the check under a lock ? */
+   if (!guc->log.relay_chan)
+   return NULL;
+
+   /* Just get the base address of a new sub buffer and copy data into it
+* ourselves. NULL will be returned in no-overwrite mode, if all sub
+* buffers are full. Could have used the relay_write() to indirectly
+* copy the data, but that would have been bit convoluted, as we need to
+* write to only certain locations inside a sub buffer which cannot be
+* done without using relay_reserve() along with relay_write(). So its
+* better to use relay_reserve() alone.
+*/
+   return relay_reserve(guc->log.relay_chan, 0);
 }


You have to chase through the code a long way to check whether or not
the allocation is correct.

Please do consider adding a check such as

GEM_BUG_ON(guc->log.relay_chan->size < guc->log.vma->size);
(near the allocation)


Fine, will add a check after the allocation, but not sure how useful it
will be, as we shall trust relay to do the memory allocation for the
sub-buffers as per the requested 'subbuf_size'.

subbuf_size = guc->log.vma->obj->base.size;
n_subbufs = 8;
guc_log_relay_chan = relay_open(NULL, NULL, subbuf_size,
n_subbufs, _callbacks, dev_priv);


GEM_BUG_ON(guc->log.relay_chan->subbuf_size < guc->log.vma->obj->base.size);




GEM_BUG_ON(write_offset + buffer_size > guc->log.relay_chan->size);
(before the memcpy, or whatever is appropriate).


There is a check already for read_offset/write_offset before the memcpy.

I think it would be better to add this check
GEM_BUG_ON(guc->log.relay_chan->subbuf_size < guc->log.vma->obj->base.size);

just before
return relay_reserve(guc->log.relay_chan, 0);

Best regards
Akash


Just to leave some clues to the reader as to what is going on.
-Chris




Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend

2016-08-18 Thread Goel, Akash



On 8/18/2016 8:25 PM, Imre Deak wrote:

On to, 2016-08-18 at 20:05 +0530, Goel, Akash wrote:


On 8/18/2016 7:48 PM, Imre Deak wrote:

On to, 2016-08-18 at 19:17 +0530, Goel, Akash wrote:

[...]
Thanks for the inputs. Sorry not familiar with freezable WQ semantics.
But after looking at code, this is what I understood :-
1. freezable Workqueues will be frozen before the system suspend
callbacks are invoked for the devices.


Yes.


2. Any work item queued after the WQ is marked frozen will be scheduled
later, on resume.


Yes.


3. But if a work item was already present in the freezable Workqueue,
before it was frozen and it did not complete, then system suspend
itself will be aborted.


System suspend will be aborted only if any kernel thread didn't
complete within a reasonable amount of time (freeze_timeout_msecs, 20
sec by default). Otherwise already queued items will be properly
waited upon and suspend will proceed.

Sorry for getting this wrong.
What I understood is that even if there are pending work items on
freezable WQ after freeze_timeout_msecs, then also system suspend would
be performed.


In case of timeout suspend_prepare()->suspend_freeze_processes()
->freeze_kernel_threads()->try_to_freeze_tasks() will return -EBUSY and
suspend will fail.

So sorry, there was a typo in my last mail, instead of writing 'system 
suspend would be aborted', I wrote 'system suspend would be performed'.



Sorry couldn't find an explicit/synchronous wait in kernel for the
pending work items for freezable WQs, but it doesn't matter.


The above try_to_freeze_tasks() will wait until
freeze_workqueues_busy() indicates that there are no work items active
on any freezable queues.


Thanks much for clarifying. I will go through that function again.

Best regards
Akash

--Imre






4. So if the log.flush_wq is marked as freezable, then flush of
work item will not be required for the system suspend case.
And runtime suspend case is already covered with rpm get/put
around register access in work item function.


Yes.



It seems there are 2 config options CONFIG_SUSPEND_FREEZER


This is set whenever system suspend is enabled.


and
CONFIG_FREEZER


This is set except for one platform (powerpc), where I assume freezing
of the tasks is achieved in a different way. In any case it doesn't
matter for us.


Many thanks for providing all this info.

Will then mark the log.flush_wq as freezable.

Best regards
Akash

--Imre


which have to be enabled for all the above to happen.
If these config options will always be enabled then probably marking
log.flush_wq would work.

Please kindly confirm whether I understood correctly or not, accordingly
will proceed further.

Best regards
Akash




--Imre




Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend

2016-08-18 Thread Goel, Akash



On 8/18/2016 7:48 PM, Imre Deak wrote:

On to, 2016-08-18 at 19:17 +0530, Goel, Akash wrote:

[...]
Thanks for the inputs. Sorry not familiar with freezable WQ semantics.
But after looking at code, this is what I understood :-
1. freezable Workqueues will be frozen before the system suspend
callbacks are invoked for the devices.


Yes.


2. Any work item queued after the WQ is marked frozen will be scheduled
later, on resume.


Yes.


3. But if a work item was already present in the freezable Workqueue,
before it was frozen and it did not complete, then system suspend
itself will be aborted.


System suspend will be aborted only if any kernel thread didn't
complete within a reasonable amount of time (freeze_timeout_msecs, 20
sec by default). Otherwise already queued items will be properly
waited upon and suspend will proceed.

Sorry for getting this wrong.
What I understood is that even if there are pending work items on
freezable WQ after freeze_timeout_msecs, then also system suspend would 
be performed.
Sorry couldn't find an explicit/synchronous wait in kernel for the 
pending work items for freezable WQs, but it doesn't matter.





4. So if the log.flush_wq is marked as freezable, then flush of
work item will not be required for the system suspend case.
And runtime suspend case is already covered with rpm get/put
around register access in work item function.


Yes.



It seems there are 2 config options CONFIG_SUSPEND_FREEZER


This is set whenever system suspend is enabled.


and
CONFIG_FREEZER


This is set except for one platform (powerpc), where I assume freezing
of the tasks is achieved in a different way. In any case it doesn't
matter for us.


Many thanks for providing all this info.

Will then mark the log.flush_wq as freezable.

Best regards
Akash

--Imre


which have to be enabled for all the above to happen.
If these config options will always be enabled then probably marking
log.flush_wq would work.

Please kindly confirm whether I understood correctly or not, accordingly
will proceed further.

Best regards
Akash




--Imre




Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend

2016-08-18 Thread Goel, Akash



On 8/18/2016 6:29 PM, Imre Deak wrote:

On to, 2016-08-18 at 16:54 +0530, Goel, Akash wrote:


On 8/18/2016 4:25 PM, Imre Deak wrote:

On to, 2016-08-18 at 09:15 +0530, Goel, Akash wrote:


On 8/17/2016 9:07 PM, Goel, Akash wrote:



On 8/17/2016 6:41 PM, Imre Deak wrote:

On ke, 2016-08-17 at 18:15 +0530, Goel, Akash wrote:


On 8/17/2016 5:11 PM, Chris Wilson wrote:

On Wed, Aug 17, 2016 at 12:27:30PM +0100, Tvrtko Ursulin
wrote:





+int intel_guc_suspend(struct drm_device *dev, bool rpm_suspend)
 {
 struct drm_i915_private *dev_priv = to_i915(dev);
 struct intel_guc *guc = &dev_priv->guc;
@@ -1530,6 +1530,12 @@ int intel_guc_suspend(struct drm_device *dev)
 return 0;

 gen9_disable_guc_interrupts(dev_priv);
+/* Sync is needed only for the system suspend case, runtime suspend
+ * case is covered due to rpm get/put calls used around Hw access in
+ * the work item function.
+ */
+if (!rpm_suspend && (i915.guc_log_level >= 0))
+flush_work(&dev_priv->guc.log.flush_work);


In which case (rpm suspend) the flush_work is idle and this is a noop.
That you have to pass around such state suggests that you are papering
over a bug?

In case of rpm suspend the flush_work may not be a NOOP.
Can use the flush_work for runtime suspend also but in spite of that
can't prevent the 'RPM wakelock' asserts, as the work item can get
executed after the rpm ref count drops to zero and before runtime
suspend kicks in (after autosuspend delay).

For that you had earlier suggested to use rpm get/put in the work item
function, around the register access, but with that had to remove the
flush_work from the suspend hook, otherwise a deadlock can happen.
So doing the flush_work conditionally for system suspend case, as rpm
get/put won't cause the resume of device in that case.

Actually I had discussed about this with Imre and as per his inputs
prepared this patch.


There would be this alternative:


Thanks much for suggesting the alternate approach.

Just to confirm whether I understood everything correctly,


in gen9_guc_irq_handler():
   WARN_ON(!intel_runtime_pm_get_if_in_use());

Used WARN, as we don't expect the device to be suspended at this
juncture, so intel_runtime_pm_get_if_in_use() should return true.


   if (!queue_work(log.flush_work))

If queue_work returns 0, then work item is already pending, so it won't
be queued hence can release the rpm ref count now only.

   intel_runtime_pm_put();




and dropping the reference at the end of the work item.

This will be just like the __intel_autoenable_gt_powersave


This would make the flush_work() a nop in case of runtime_suspend().

So can call the flush_work unconditionally.

Hope I understood it correctly.


Yes, the above is correct except for my mistake in
handling intel_runtime_pm_get_if_in_use() returning false as discussed
below.




Hi Imre,

You had suggested to use the below code from irq handler, suspecting
that intel_runtime_pm_get_if_in_use() can return false, if interrupt
gets handled just after device goes out of use.

if (intel_runtime_pm_get_if_in_use()) {
if (!queue_work(log.flush_work))
intel_runtime_pm_put();
}

Do you mean to say that interrupt can come when rpm suspend has already
started but before the interrupt is disabled from the suspend hook ?
Like if interrupt comes b/w 1) & 4), then runtime_pm_get_if_in_use()
will return false.
1)  Autosuspend delay elapses (device is marked as suspending)
2)  intel_runtime_suspend
3)  intel_guc_suspend
4)  gen9_disable_guc_interrupts(dev_priv);


No, it can return false anytime the last RPM reference is dropped, that
is even before the autosuspend delay elapses.


Sorry I missed that pm_runtime_get_if_in_use() will return 0 if RPM ref
count has dropped to 0, even if device is still in runtime active state
(as autosuspend delay has not elapsed).

But that still makes the likelihood for a missed work item scheduling
small, because 1) we want to reduce the autosuspend delay considerably
from the current 10 sec and 2) because what you say below about the GPU
actually idling before the RPM refcount going to 0.


If the above hypothesis is correct, then it implies that interrupt has
to come after autosuspend delay has elapsed for the above scenario to
arise.

I think it would be unlikely for the interrupt to come so late because
device would have gone idle just before the autosuspend period started
and so no GuC submissions would have been done after that.


Right.


So the probability of missing a work item would be very low and we
can bear that.


I haven't looked into what is the consequence of missing a work item,
you know this better. In any case - since it is still a possibility -
if it's a problem you could still make sure in intel_guc_suspend() that
any pending work is completed by calling gu

Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend

2016-08-18 Thread Goel, Akash



On 8/18/2016 4:25 PM, Imre Deak wrote:

On Thu, 2016-08-18 at 09:15 +0530, Goel, Akash wrote:


On 8/17/2016 9:07 PM, Goel, Akash wrote:



On 8/17/2016 6:41 PM, Imre Deak wrote:

On Wed, 2016-08-17 at 18:15 +0530, Goel, Akash wrote:


On 8/17/2016 5:11 PM, Chris Wilson wrote:

On Wed, Aug 17, 2016 at 12:27:30PM +0100, Tvrtko Ursulin
wrote:





+int intel_guc_suspend(struct drm_device *dev, bool rpm_suspend)
 {
 struct drm_i915_private *dev_priv = to_i915(dev);
 struct intel_guc *guc = &dev_priv->guc;
@@ -1530,6 +1530,12 @@ int intel_guc_suspend(struct drm_device *dev)
 return 0;

 gen9_disable_guc_interrupts(dev_priv);
+/* Sync is needed only for the system suspend case, runtime suspend
+ * case is covered due to rpm get/put calls used around Hw access in
+ * the work item function.
+ */
+if (!rpm_suspend && (i915.guc_log_level >= 0))
+flush_work(&dev_priv->guc.log.flush_work);


In which case (rpm suspend) the flush_work is idle and this is a noop. That
you have to pass around such state suggests that you are papering over a
bug?

In case of rpm suspend the flush_work may not be a NOOP.
Can use the flush_work for runtime suspend also but in spite of that
can't prevent the 'RPM wakelock' asserts, as the work item can get
executed after the rpm ref count drops to zero and before runtime
suspend kicks in (after autosuspend delay).

For that you had earlier suggested to use rpm get/put in the work item
function, around the register access, but with that had to remove the
flush_work from the suspend hook, otherwise a deadlock can happen.
So doing the flush_work conditionally for system suspend case, as rpm
get/put won't cause the resume of device in that case.

Actually I had discussed about this with Imre and as per his inputs
prepared this patch.


There would be this alternative:


Thanks much for suggesting the alternate approach.

Just to confirm whether I understood everything correctly,


in gen9_guc_irq_handler():
   WARN_ON(!intel_runtime_pm_get_if_in_use());

Used WARN, as we don't expect the device to be suspended at this
juncture, so intel_runtime_pm_get_if_in_use() should return true.


   if (!queue_work(log.flush_work))

If queue_work returns 0, then work item is already pending, so it won't
be queued hence can release the rpm ref count now only.

   intel_runtime_pm_put();




and dropping the reference at the end of the work item.

This will be just like the __intel_autoenable_gt_powersave


This would make the flush_work() a nop in case of runtime_suspend().

So can call the flush_work unconditionally.

Hope I understood it correctly.


Yes, the above is correct except for my mistake in
handling intel_runtime_pm_get_if_in_use() returning false as discussed
below.




Hi Imre,

You had suggested to use the below code from irq handler, suspecting
that intel_runtime_pm_get_if_in_use() can return false, if interrupt
gets handled just after device goes out of use.

if (intel_runtime_pm_get_if_in_use()) {
if (!queue_work(log.flush_work))
intel_runtime_pm_put();
}

Do you mean to say that interrupt can come when rpm suspend has already
started but before the interrupt is disabled from the suspend hook ?
Like if interrupt comes b/w 1) & 4), then runtime_pm_get_if_in_use()
will return false.
1)  Autosuspend delay elapses (device is marked as suspending)
2)  intel_runtime_suspend
3)  intel_guc_suspend
4)  gen9_disable_guc_interrupts(dev_priv);


No, it can return false anytime the last RPM reference is dropped, that
is even before the autosuspend delay elapses.


Sorry I missed that pm_runtime_get_if_in_use() will return 0 if RPM ref 
count has dropped to 0, even if device is still in runtime active state 
(as autosuspend delay has not elapsed).
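
To make the point above concrete, here is a rough userspace sketch of the behaviour being discussed: the "get if in use" helper only succeeds when the device is active and somebody already holds a reference. The struct and the toy_* names are made up for illustration and are not the kernel's data structures; only the rule itself is taken from this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a device's runtime-PM state; NOT the kernel's struct. */
enum toy_rpm_status { TOY_RPM_ACTIVE, TOY_RPM_SUSPENDING, TOY_RPM_SUSPENDED };

struct toy_rpm {
	enum toy_rpm_status status;
	int usage_count;
};

/* Takes a reference only if the device is active AND already in use. */
static bool toy_get_if_in_use(struct toy_rpm *rpm)
{
	if (rpm->status != TOY_RPM_ACTIVE || rpm->usage_count == 0)
		return false;
	rpm->usage_count++;
	return true;
}

/* Drops one reference; the device stays active until autosuspend fires. */
static void toy_put(struct toy_rpm *rpm)
{
	assert(rpm->usage_count > 0);
	rpm->usage_count--;
}
```

So once the last reference is dropped the helper fails, even though the status is still "active" because the autosuspend delay has not elapsed yet.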


But that still makes the likelihood for a missed work item scheduling
small, because 1) we want to reduce the autosuspend delay considerably
from the current 10 sec and 2) because what you say below about the GPU
actually idling before the RPM refcount going to 0.


If the above hypothesis is correct, then it implies that interrupt has
to come after autosuspend delay has elapsed for the above scenario to
arise.

I think it would be unlikely for the interrupt to come so late because
device would have gone idle just before the autosuspend period started
and so no GuC submissions would have been done after that.


Right.


So the probability of missing a work item would be very low and we
can bear that.


I haven't looked into what is the consequence of missing a work item,
you know this better. In any case - since it is still a possibility -
if it's a problem you could still make sure in intel_guc_suspend() that
any pending work is completed by calling guc_read_update_log_buffer(),
host2guc_logbuffer_flush_complete() if necessary after disabling
interrupts in intel

Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend

2016-08-17 Thread Goel, Akash



On 8/17/2016 9:07 PM, Goel, Akash wrote:



On 8/17/2016 6:41 PM, Imre Deak wrote:

On Wed, 2016-08-17 at 18:15 +0530, Goel, Akash wrote:


On 8/17/2016 5:11 PM, Chris Wilson wrote:

On Wed, Aug 17, 2016 at 12:27:30PM +0100, Tvrtko Ursulin wrote:





+int intel_guc_suspend(struct drm_device *dev, bool rpm_suspend)
 {
 struct drm_i915_private *dev_priv = to_i915(dev);
 struct intel_guc *guc = &dev_priv->guc;
@@ -1530,6 +1530,12 @@ int intel_guc_suspend(struct drm_device *dev)
 return 0;

 gen9_disable_guc_interrupts(dev_priv);
+/* Sync is needed only for the system suspend case, runtime suspend
+ * case is covered due to rpm get/put calls used around Hw access in
+ * the work item function.
+ */
+if (!rpm_suspend && (i915.guc_log_level >= 0))
+flush_work(&dev_priv->guc.log.flush_work);


In which case (rpm suspend) the flush_work is idle and this is a noop. That
you have to pass around such state suggests that you are papering over a
bug?

In case of rpm suspend the flush_work may not be a NOOP.
Can use the flush_work for runtime suspend also but in spite of that
can't prevent the 'RPM wakelock' asserts, as the work item can get
executed after the rpm ref count drops to zero and before runtime
suspend kicks in (after autosuspend delay).

For that you had earlier suggested to use rpm get/put in the work item
function, around the register access, but with that had to remove the
flush_work from the suspend hook, otherwise a deadlock can happen.
So doing the flush_work conditionally for system suspend case, as rpm
get/put won't cause the resume of device in that case.

Actually I had discussed about this with Imre and as per his inputs
prepared this patch.


There would be this alternative:


Thanks much for suggesting the alternate approach.

Just to confirm whether I understood everything correctly,


in gen9_guc_irq_handler():
   WARN_ON(!intel_runtime_pm_get_if_in_use());

Used WARN, as we don't expect the device to be suspended at this
juncture, so intel_runtime_pm_get_if_in_use() should return true.


   if (!queue_work(log.flush_work))

If queue_work returns 0, then work item is already pending, so it won't
be queued hence can release the rpm ref count now only.

   intel_runtime_pm_put();




and dropping the reference at the end of the work item.

This will be just like the __intel_autoenable_gt_powersave


This would make the flush_work() a nop in case of runtime_suspend().

So can call the flush_work unconditionally.

Hope I understood it correctly.


Hi Imre,

You had suggested to use the below code from irq handler, suspecting 
that intel_runtime_pm_get_if_in_use() can return false, if interrupt 
gets handled just after device goes out of use.


if (intel_runtime_pm_get_if_in_use()) {
if (!queue_work(log.flush_work))
intel_runtime_pm_put();
}

Do you mean to say that interrupt can come when rpm suspend has already 
started but before the interrupt is disabled from the suspend hook ?

Like if interrupt comes b/w 1) & 4), then runtime_pm_get_if_in_use()
will return false.
1)  Autosuspend delay elapses (device is marked as suspending)
2)  intel_runtime_suspend
3)  intel_guc_suspend
4)  gen9_disable_guc_interrupts(dev_priv);

If the above hypothesis is correct, then it implies that interrupt has 
to come after autosuspend delay has elapsed for the above scenario to arise.


I think it would be unlikely for the interrupt to come so late because 
device would have gone idle just before the autosuspend period started 
and so no GuC submissions would have been done after that.

So the probability of missing a work item would be very low and we
can bear that.
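
For reference, the reference handoff between the irq handler and the work item discussed above could be modelled roughly like this. It is only a userspace sketch: the toy_* helpers and the missed_irqs counter are made-up stand-ins for the runtime-PM and workqueue calls, and the real handler of course has more to do.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are the real i915 or workqueue APIs. */
struct toy_dev {
	int rpm_refs;       /* runtime-PM usage count */
	bool work_pending;  /* flush work queued but not yet run */
	int missed_irqs;    /* interrupts dropped as device was not in use */
};

static bool toy_get_if_in_use(struct toy_dev *d)
{
	if (d->rpm_refs == 0)
		return false;
	d->rpm_refs++;
	return true;
}

static void toy_put(struct toy_dev *d)
{
	assert(d->rpm_refs > 0);
	d->rpm_refs--;
}

/* Like queue_work(): returns false if the work was already pending. */
static bool toy_queue_work(struct toy_dev *d)
{
	if (d->work_pending)
		return false;
	d->work_pending = true;
	return true;
}

static void toy_irq_handler(struct toy_dev *d)
{
	if (!toy_get_if_in_use(d)) {
		d->missed_irqs++;   /* the "missed work item" case */
		return;
	}
	/* Already pending: the queued instance owns a reference, drop ours. */
	if (!toy_queue_work(d))
		toy_put(d);
}

static void toy_flush_work_fn(struct toy_dev *d)
{
	/* ... log buffer copy and ack to GuC would happen here ... */
	d->work_pending = false;
	toy_put(d);             /* reference taken in the irq handler */
}
```

With this scheme a pending work item always holds a reference, so the device cannot runtime suspend underneath it; the only cost is the dropped interrupt when nobody else holds a reference.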

Best regards
Akash


Best regards
Akash


--Imre


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend

2016-08-17 Thread Goel, Akash



On 8/17/2016 6:41 PM, Imre Deak wrote:

On Wed, 2016-08-17 at 18:15 +0530, Goel, Akash wrote:


On 8/17/2016 5:11 PM, Chris Wilson wrote:

On Wed, Aug 17, 2016 at 12:27:30PM +0100, Tvrtko Ursulin wrote:


On 17/08/16 11:14, akash.g...@intel.com wrote:

From: Akash Goel <akash.g...@intel.com>

The GuC log buffer flush work item does a register access to send the ack
to GuC and this work item, if not synced before suspend, can potentially
get executed after the GFX device is suspended.
The work item function uses rpm_get/rpm_put calls around the Hw access,
this covers the runtime suspend case but for system suspend case (which can
be done asynchronously/forcefully) sync would be required as kernel can
potentially schedule the work items even after some devices, including GFX,
have been put to suspend.
Also sync has to be done conditionally i.e. only for the system suspend
case, as sync along with rpm_get/rpm_put calls can cause a deadlock for rpm
suspend path.

Cc: Imre Deak <imre.d...@intel.com>
Signed-off-by: Akash Goel <akash.g...@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c| 4 ++--
 drivers/gpu/drm/i915/i915_guc_submission.c | 8 +++-
 drivers/gpu/drm/i915/intel_guc.h   | 2 +-
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index cdee60b..2ae0ad4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1427,7 +1427,7 @@ static int i915_drm_suspend(struct drm_device *dev)
goto out;
}

-   intel_guc_suspend(dev);
+   intel_guc_suspend(dev, false);

intel_display_suspend(dev);

@@ -2321,7 +2321,7 @@ static int intel_runtime_suspend(struct device *device)
i915_gem_release_all_mmaps(dev_priv);
mutex_unlock(&dev->struct_mutex);

-   intel_guc_suspend(dev);
+   intel_guc_suspend(dev, true);

intel_runtime_pm_disable_interrupts(dev_priv);

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index ef0c116..1af8a8b 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1519,7 +1519,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
  * intel_guc_suspend() - notify GuC entering suspend state
  * @dev:   drm device
  */
-int intel_guc_suspend(struct drm_device *dev)
+int intel_guc_suspend(struct drm_device *dev, bool rpm_suspend)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
struct intel_guc *guc = &dev_priv->guc;
@@ -1530,6 +1530,12 @@ int intel_guc_suspend(struct drm_device *dev)
return 0;

gen9_disable_guc_interrupts(dev_priv);
+   /* Sync is needed only for the system suspend case, runtime suspend
+* case is covered due to rpm get/put calls used around Hw access in
+* the work item function.
+*/
+   if (!rpm_suspend && (i915.guc_log_level >= 0))
+   flush_work(&dev_priv->guc.log.flush_work);


In which case (rpm suspend) the flush_work is idle and this is a noop. That
you have to pass around such state suggests that you are papering over a
bug?

In case of rpm suspend the flush_work may not be a NOOP.
Can use the flush_work for runtime suspend also but in spite of that
can't prevent the 'RPM wakelock' asserts, as the work item can get
executed after the rpm ref count drops to zero and before runtime
suspend kicks in (after autosuspend delay).

For that you had earlier suggested to use rpm get/put in the work item
function, around the register access, but with that had to remove the
flush_work from the suspend hook, otherwise a deadlock can happen.
So doing the flush_work conditionally for system suspend case, as rpm
get/put won't cause the resume of device in that case.

Actually I had discussed about this with Imre and as per his inputs
prepared this patch.


There would be this alternative:


Thanks much for suggesting the alternate approach.

Just to confirm whether I understood everything correctly,


in gen9_guc_irq_handler():
   WARN_ON(!intel_runtime_pm_get_if_in_use());
Used WARN, as we don't expect the device to be suspended at this 
juncture, so intel_runtime_pm_get_if_in_use() should return true.



   if (!queue_work(log.flush_work))

If queue_work returns 0, then work item is already pending, so it won't
be queued hence can release the rpm ref count now only.

   intel_runtime_pm_put();




and dropping the reference at the end of the work item.

This will be just like the __intel_autoenable_gt_powersave


This would make the flush_work() a nop in case of runtime_suspend().

So can call the flush_work unconditionally.

Hope I understood it correctly.

Best regards
Akash


--Imre




Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend

2016-08-17 Thread Goel, Akash



On 8/17/2016 5:11 PM, Chris Wilson wrote:

On Wed, Aug 17, 2016 at 12:27:30PM +0100, Tvrtko Ursulin wrote:


On 17/08/16 11:14, akash.g...@intel.com wrote:

From: Akash Goel 

The GuC log buffer flush work item does a register access to send the ack
to GuC and this work item, if not synced before suspend, can potentially
get executed after the GFX device is suspended.
The work item function uses rpm_get/rpm_put calls around the Hw access,
this covers the runtime suspend case but for system suspend case (which can
be done asynchronously/forcefully) sync would be required as kernel can
potentially schedule the work items even after some devices, including GFX,
have been put to suspend.
Also sync has to be done conditionally i.e. only for the system suspend
case, as sync along with rpm_get/rpm_put calls can cause a deadlock for rpm
suspend path.

Cc: Imre Deak 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_drv.c| 4 ++--
 drivers/gpu/drm/i915/i915_guc_submission.c | 8 +++-
 drivers/gpu/drm/i915/intel_guc.h   | 2 +-
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index cdee60b..2ae0ad4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1427,7 +1427,7 @@ static int i915_drm_suspend(struct drm_device *dev)
goto out;
}

-   intel_guc_suspend(dev);
+   intel_guc_suspend(dev, false);

intel_display_suspend(dev);

@@ -2321,7 +2321,7 @@ static int intel_runtime_suspend(struct device *device)
i915_gem_release_all_mmaps(dev_priv);
mutex_unlock(&dev->struct_mutex);

-   intel_guc_suspend(dev);
+   intel_guc_suspend(dev, true);

intel_runtime_pm_disable_interrupts(dev_priv);

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index ef0c116..1af8a8b 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1519,7 +1519,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
  * intel_guc_suspend() - notify GuC entering suspend state
  * @dev:   drm device
  */
-int intel_guc_suspend(struct drm_device *dev)
+int intel_guc_suspend(struct drm_device *dev, bool rpm_suspend)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
struct intel_guc *guc = &dev_priv->guc;
@@ -1530,6 +1530,12 @@ int intel_guc_suspend(struct drm_device *dev)
return 0;

gen9_disable_guc_interrupts(dev_priv);
+   /* Sync is needed only for the system suspend case, runtime suspend
+* case is covered due to rpm get/put calls used around Hw access in
+* the work item function.
+*/
+   if (!rpm_suspend && (i915.guc_log_level >= 0))
+   flush_work(&dev_priv->guc.log.flush_work);


In which case (rpm suspend) the flush_work is idle and this is a noop. That
you have to pass around such state suggests that you are papering over a
bug?

In case of rpm suspend the flush_work may not be a NOOP.
Can use the flush_work for runtime suspend also but in spite of that
can't prevent the 'RPM wakelock' asserts, as the work item can get
executed after the rpm ref count drops to zero and before runtime
suspend kicks in (after autosuspend delay).

For that you had earlier suggested to use rpm get/put in the work item 
function, around the register access, but with that had to remove the 
flush_work from the suspend hook, otherwise a deadlock can happen.
So doing the flush_work conditionally for system suspend case, as rpm 
get/put won't cause the resume of device in that case.


Actually I had discussed about this with Imre and as per his inputs 
prepared this patch.


Best regards
Akash






-Chris




Re: [Intel-gfx] [PATCH 06/19] drm/i915: Handle log buffer flush interrupt event from GuC

2016-08-17 Thread Goel, Akash



On 8/17/2016 4:37 PM, Tvrtko Ursulin wrote:


On 17/08/16 11:14, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

GuC ukernel sends an interrupt to Host to flush the log buffer
and expects Host to correspondingly update the read pointer
information in the state structure, once it has consumed the
log buffer contents by copying them to a file or buffer.
Even if Host couldn't copy the contents, it can still update the
read pointer so that logging state is not disturbed on GuC side.

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of
   crash buffer area for regular cases and copying only the state
   structure data in first page.

v3:
  - Create a vmalloc mapping of log buffer. (Chris)
  - Cover the flush acknowledgment under rpm get & put.(Chris)
  - Revert the change of skipping the copy of crash dump area, as
not really needed, will be covered by subsequent patch.

v4:
  - Destroy the wq under the same condition in which it was created,
pass dev_priv pointer instead of dev to newly added GuC function,
add more comments & rename variable for clarity. (Tvrtko)

v5:
- Allocate & destroy the dedicated wq, for handling flush interrupt,
   from the setup/teardown routines of GuC logging. (Chris)
- Validate the log buffer size value retrieved from state structure
   and do some minor cleanup. (Tvrtko)
- Fix error/warnings reported by checkpatch. (Tvrtko)
- Rebase.

v6:
  - Remove the interrupts_enabled check from guc_capture_logs_work, need
to process that last work item also, queued just before disabling the
interrupt as log buffer flush interrupt handling is a bit different
case where GuC is actually expecting an ACK from host, which should be
provided to keep the logging going.
Sync against the work will be done by caller disabling the interrupt.
  - Don't sample the log buffer size value from state structure, directly
use the expected value to move the pointer & do the copy and that cannot
go wrong (out of bounds) as Driver only allocated the log buffer and the
relay buffers. Driver should refrain from interpreting the log packet,
as much as possible and let Userspace parser detect the anomaly. (Chris)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 186
+
  drivers/gpu/drm/i915/i915_irq.c|  28 -
  drivers/gpu/drm/i915/intel_guc.h   |   4 +
  3 files changed, 217 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index b062da6..ade51cb 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct
intel_guc *guc,
  return host2guc_action(guc, data, ARRAY_SIZE(data));
  }

+static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
+{
+u32 data[1];
+
+data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
+
+return host2guc_action(guc, data, 1);
+}
+
  /*
   * Initialise, update, or clear doorbell data shared with the GuC
   *
@@ -828,6 +837,163 @@ err:
  return NULL;
  }

+static void guc_move_to_next_buf(struct intel_guc *guc)
+{
+}
+
+static void *guc_get_write_buffer(struct intel_guc *guc)
+{
+return NULL;
+}
+
+static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
+{
+if (type == GUC_ISR_LOG_BUFFER)
+return (GUC_LOG_ISR_PAGES + 1) * PAGE_SIZE;
+else if (type == GUC_DPC_LOG_BUFFER)
+return (GUC_LOG_DPC_PAGES + 1) * PAGE_SIZE;
+else
+return (GUC_LOG_CRASH_PAGES + 1) * PAGE_SIZE;
+}


Could do it with a switch statement to get automatic reminder of size
not being handled if some day a new log buffer type gets added. It would
probably more in the style of the rest of the driver as well.


Fine will use switch statement here.

Should I use BUG_ON for the default/unhandled case ?
case GUC_ISR_LOG_BUFFER

case GUC_DPC_LOG_BUFFER

case GUC_LOG_CRASH_PAGES

default
BUG_ON(1)
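
Spelled out, the switch version could look roughly like the sketch below. Note the assumptions: the case labels need to be the buffer type enumerators rather than the page-count macro, GUC_CRASH_DUMP_LOG_BUFFER is an assumed name for the third type, the page counts and TOY_PAGE_SIZE are placeholder values, and assert() stands in for BUG_ON() so that it builds in userspace.

```c
#include <assert.h>

/* Placeholder values; the real ones come from the GuC interface header. */
#define TOY_PAGE_SIZE       4096u
#define GUC_LOG_ISR_PAGES   3u
#define GUC_LOG_DPC_PAGES   7u
#define GUC_LOG_CRASH_PAGES 1u

enum guc_log_buffer_type {
	GUC_ISR_LOG_BUFFER,
	GUC_DPC_LOG_BUFFER,
	GUC_CRASH_DUMP_LOG_BUFFER, /* assumed name for the third type */
	GUC_MAX_LOG_BUFFER
};

static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
{
	switch (type) {
	case GUC_ISR_LOG_BUFFER:
		return (GUC_LOG_ISR_PAGES + 1) * TOY_PAGE_SIZE;
	case GUC_DPC_LOG_BUFFER:
		return (GUC_LOG_DPC_PAGES + 1) * TOY_PAGE_SIZE;
	case GUC_CRASH_DUMP_LOG_BUFFER:
		return (GUC_LOG_CRASH_PAGES + 1) * TOY_PAGE_SIZE;
	default:
		/* Userspace stand-in for BUG_ON(1) on an unhandled type. */
		assert(!"unhandled GuC log buffer type");
		return 0;
	}
}
```

One trade-off to keep in mind: with GCC, the presence of a default label suppresses the -Wswitch warning for a missing enumerator, so the compile-time reminder is lost and only the runtime check remains.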


+
+static void guc_read_update_log_buffer(struct intel_guc *guc)
+{
+struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state;
+struct guc_log_buffer_state log_buffer_state_local;
+void *src_data_ptr, *dst_data_ptr;
+unsigned int buffer_size;
+enum guc_log_buffer_type type;
+
+if (WARN_ON(!guc->log.buf_addr))
+return;
+
+/* Get the pointer to shared GuC log buffer */
+log_buffer_state = src_data_ptr = guc->log.buf_addr;
+
+/* Get the pointer to local buffer to store the logs */
+dst_data_ptr = log_buffer_snapshot_state = guc_get_write_buffer(guc);
+
+/* Actual logs are present from the 2nd page */
+src_data_ptr 

Re: [Intel-gfx] [PATCH 14/18] drm/i915: Forcefully flush GuC log buffer on reset

2016-08-16 Thread Goel, Akash



On 8/16/2016 4:57 PM, Tvrtko Ursulin wrote:


On 15/08/16 15:49, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

Before capturing the GuC logs as a part of error state, there should be a
force log buffer flush action sent to GuC before proceeding with GPU reset
and re-initializing GuC. There could be some data in the log buffer which
is yet to be captured and those logs would be particularly useful to
understand why the GPU reset was initiated.

v2:
- Avoid the wait via flush_work, to serialize against an ongoing log
   buffer flush, from the error state capture path. (Chris)
- Rebase.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_gpu_error.c  |  2 ++
  drivers/gpu/drm/i915/i915_guc_submission.c | 30
++
  drivers/gpu/drm/i915/intel_guc.h   |  1 +
  3 files changed, 33 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 94297aa..b73c671 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1301,6 +1301,8 @@ static void
i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv,
  if (!dev_priv->guc.log.vma || (i915.guc_log_level < 0))
  return;

+i915_guc_flush_logs(dev_priv, false);
+
  error->guc_log = i915_error_object_create(dev_priv,
dev_priv->guc.log.vma);
  }
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index b8d6313..85df2f3 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -185,6 +185,16 @@ static int
host2guc_logbuffer_flush_complete(struct intel_guc *guc)
  return host2guc_action(guc, data, 1);
  }

+static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
+{
+u32 data[2];
+
+data[0] = HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH;
+data[1] = 0;
+
+return host2guc_action(guc, data, 2);
+}
+
  /*
   * Initialise, update, or clear doorbell data shared with the GuC
   *
@@ -1536,6 +1546,26 @@ void i915_guc_capture_logs(struct
drm_i915_private *dev_priv)
  intel_runtime_pm_put(dev_priv);
  }

+void i915_guc_flush_logs(struct drm_i915_private *dev_priv, bool can_wait)
+{
+if (!i915.enable_guc_submission || (i915.guc_log_level < 0))
+return;
+
+/* First disable the interrupts, will be re-enabled afterwards */
+gen9_disable_guc_interrupts(dev_priv);
+
+/* Before initiating the forceful flush, wait for any pending/ongoing
+ * flush to complete otherwise forceful flush may not happen, but wait
+ * can't be done for some paths like error state capture in which case
+ * take a chance & directly attempt the forceful flush.
+ */
+if (can_wait)
+flush_work(&dev_priv->guc.log.flush_work);
+
+/* Ask GuC to update the log buffer state */
+host2guc_force_logbuffer_flush(&dev_priv->guc);


Should you just call i915_guc_capture_logs from here? Error capture
could also potentially benefit from it and you could remove it from the
debugfs patch then.

Actually it was done like that earlier, but now, after adding the patch
"[PATCH 13/18] drm/i915: Augment i915 error state to include the dump of
GuC log buffer", the contents of the GuC log buffer are anyway made part
of the error state, so I thought it may not be of any real use to also
capture the leftover logs in the relay sub buffer.
For analysis purposes, the GuC logs in the error state dump would be
good enough.


best regards
Akash

+}
+
  void i915_guc_unregister(struct drm_i915_private *dev_priv)
  {
  if (!i915.enable_guc_submission)
diff --git a/drivers/gpu/drm/i915/intel_guc.h
b/drivers/gpu/drm/i915/intel_guc.h
index 8598f38..d7eda42 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -182,6 +182,7 @@ int i915_guc_wq_check_space(struct
drm_i915_gem_request *rq);
  void i915_guc_submission_disable(struct drm_i915_private *dev_priv);
  void i915_guc_submission_fini(struct drm_i915_private *dev_priv);
  void i915_guc_capture_logs(struct drm_i915_private *dev_priv);
+void i915_guc_flush_logs(struct drm_i915_private *dev_priv, bool
can_wait);
  void i915_guc_register(struct drm_i915_private *dev_priv);
  void i915_guc_unregister(struct drm_i915_private *dev_priv);




Regards,

Tvrtko




Re: [Intel-gfx] [PATCH 14/18] drm/i915: Forcefully flush GuC log buffer on reset

2016-08-16 Thread Goel, Akash



On 8/16/2016 2:55 PM, Tvrtko Ursulin wrote:


On 16/08/16 06:25, Goel, Akash wrote:

On 8/15/2016 9:18 PM, Tvrtko Ursulin wrote:

On 15/08/16 15:49, akash.g...@intel.com wrote:

From: Sagar Arun Kamble <sagar.a.kam...@intel.com>

Before capturing the GuC logs as a part of error state, there should be a
force log buffer flush action sent to GuC before proceeding with GPU reset
and re-initializing GuC. There could be some data in the log buffer which
is yet to be captured and those logs would be particularly useful to
understand why the GPU reset was initiated.

v2:
- Avoid the wait via flush_work, to serialize against an ongoing log
   buffer flush, from the error state capture path. (Chris)


Could you explain if the patch does anything now that the flush has been
removed?


flush_work for the regular log buffer flush work item has been removed
but the forceful command is still sent to GuC.


In fact I don't even understand what it was doing before. :)


I am sorry for that.


If the idea is to send a flush command to GuC so it can raise an
interrupt for a partially full buffer,

Yes exactly this is the idea.


But then isn't the order wrong? Should it first send the flush command
to the GuC and then wait for something maybe gets flushed?
As I tried to clarify in my last email, the GuC firmware just ignores
the forceful flush command received from Host, if it sees there is a
pending request for regular log buffer flush, for which it hasn't
received the ack.

So from the Host side, we need to first wait for the regular log buffer
flush work item to finish execution, if any, and then send the forceful
flush command to GuC.


I can see that it could be tricky since the timing is undefined, but I don't

Yes it is definitely tricky with respect to the timing.


understand where it currently actually processes that potential extra
packets.
The extra left over logs are captured manually just after sending the 
forceful flush command to GuC.

i915_guc_flush_logs(dev_priv, true);
/* GuC would have updated log buffer by now, so capture it */
i915_guc_capture_logs(dev_priv);
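
Why the wait has to come before the forced flush can be seen in a small model of the firmware behaviour described above. This is only a sketch under that description: the toy_guc struct and toy_* functions are hypothetical and do not correspond to real driver or firmware code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the firmware behaviour described above; not GuC code. */
struct toy_guc {
	bool irq_enabled;
	bool regular_flush_unacked; /* flush irq raised, host has not acked */
	bool buffer_state_updated;  /* set when a forced flush is honoured */
};

/* Stand-in for flush_work(): the pending work item runs and sends the ack. */
static void toy_wait_for_flush_work(struct toy_guc *g)
{
	g->regular_flush_unacked = false;
}

/* The firmware ignores a forced flush while a regular one awaits its ack. */
static bool toy_force_flush(struct toy_guc *g)
{
	if (g->regular_flush_unacked)
		return false;
	g->buffer_state_updated = true;
	return true;
}

/* Same ordering as the patch: disable irq, optionally wait, then force. */
static bool toy_flush_logs(struct toy_guc *g, bool can_wait)
{
	g->irq_enabled = false;
	if (can_wait)
		toy_wait_for_flush_work(g);
	return toy_force_flush(g);
}
```

In this model the can_wait == false path (error state capture) only succeeds when no regular flush is outstanding, which is the "take a chance" case mentioned in the patch comment.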


Especially since it disabled interrupts beforehand.

I had disabled the interrupt, out of paranoia, to avoid a situation where the
work item gets scheduled again (for a different buffer type) while we
manually collect the extra logs.


Best regards
Akash



Regards,

Tvrtko


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/18] drm/i915: Handle log buffer flush interrupt event from GuC

2016-08-15 Thread Goel, Akash



On 8/15/2016 10:26 PM, Chris Wilson wrote:

On Mon, Aug 15, 2016 at 10:16:56PM +0530, Goel, Akash wrote:



On 8/15/2016 9:36 PM, Chris Wilson wrote:

On Mon, Aug 15, 2016 at 08:19:47PM +0530, akash.g...@intel.com wrote:

+static void guc_read_update_log_buffer(struct intel_guc *guc)
+{
+   struct guc_log_buffer_state *log_buffer_state, 
*log_buffer_snapshot_state;
+   struct guc_log_buffer_state log_buffer_state_local;
+   void *src_data_ptr, *dst_data_ptr;
+   unsigned int buffer_size, expected_size;
+   enum guc_log_buffer_type type;
+
+   if (WARN_ON(!guc->log.buf_addr))
+   return;
+
+   /* Get the pointer to shared GuC log buffer */
+   log_buffer_state = src_data_ptr = guc->log.buf_addr;
+
+   /* Get the pointer to local buffer to store the logs */
+   dst_data_ptr = log_buffer_snapshot_state = guc_get_write_buffer(guc);
+
+   /* Actual logs are present from the 2nd page */
+   src_data_ptr += PAGE_SIZE;
+   dst_data_ptr += PAGE_SIZE;
+
+   for (type = GUC_ISR_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) {
+   /* Make a copy of the state structure in GuC log buffer (which
+* is uncached mapped) on the stack to avoid reading from it
+* multiple times.
+*/
+   memcpy(&log_buffer_state_local, log_buffer_state,
+  sizeof(struct guc_log_buffer_state));
+   buffer_size = log_buffer_state_local.size;
+
+   if (log_buffer_snapshot_state) {
+   /* First copy the state structure in snapshot buffer */
+   memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
+  sizeof(struct guc_log_buffer_state));
+
+   /* The write pointer could have been updated by the GuC
+* firmware, after sending the flush interrupt to Host,
+* for consistency set the write pointer value to same
+* value of sampled_write_ptr in the snapshot buffer.
+*/
+   log_buffer_snapshot_state->write_ptr =
+   log_buffer_snapshot_state->sampled_write_ptr;
+
+   log_buffer_snapshot_state++;
+
+   /* Now copy the actual logs, but before that validate
+* the buffer size value retrieved from state structure.
+*/
+   if (type == GUC_ISR_LOG_BUFFER)
+   expected_size = (GUC_LOG_ISR_PAGES+1)*PAGE_SIZE;
+   else if (type == GUC_DPC_LOG_BUFFER)
+   expected_size = (GUC_LOG_DPC_PAGES+1)*PAGE_SIZE;
+   else
+   expected_size = 
(GUC_LOG_CRASH_PAGES+1)*PAGE_SIZE;
+
+   if (unlikely(buffer_size != expected_size)) {
+   DRM_ERROR("unexpected log buffer size\n");
+   /* Continue with further copying, already state
+* structure has been copied which is enough to
+* let Userspace know about the anomaly.
+*/
+   buffer_size = expected_size;


Urm, no.

You tell userspace one thing and then do another. This code should just
be a conduit and not apply its own outdated interpretation.


Userspace parser would get to know from the state structure about
the anomalous buffer size.


It will, but it won't be told what the kernel did. So if it believes the
GuC (as it should since it is a packet that should be unadulterated) the
entire stream is then corrupt.


Please suggest what should ideally be done here.

Should the further copying (for this snapshot) be skipped?


The kernel should be avoiding interpreting the log packets as much as
possible - I would prefer it if we just moved the byte stream without
trying to interpret it as datagrams. But there is probably some merit to
at least using the log packets (datagrams).

It would have been ideal if the log packets could be dumped without any
interpretation.

We copy the payload without any interpretation; we parse only some bits of
the header.

We also have to interpret the header (in a subsequent patch) to copy only
the updated payload data, for better performance.



+   }
+
+   memcpy(dst_data_ptr, src_data_ptr, buffer_size);


Where do you validate that buffer_size is sane before copying?

Sorry didn't get you, the check for buffer_size is being done right
before this memcpy.


There is no explicit check for valid src_data_ptr + buffer_size or
dst_data_ptr + buffer_size, and a quick glance at the code suggested no
reason to believe they must be valid.
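
The concern raised here, that neither `src_data_ptr + buffer_size` nor `dst_data_ptr + buffer_size` is explicitly checked, can be sketched in plain C as follows. This is an illustration only, under the assumption that the lengths of both regions are known to the caller; `copy_region_checked` and its parameters are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy 'size' bytes starting at 'offset' from src to
 * dst, but only if that range fits inside both regions.
 */
static bool copy_region_checked(void *dst, size_t dst_len,
				const void *src, size_t src_len,
				size_t offset, size_t size)
{
	assert(dst && src);

	/* Compare against (len - offset) so that offset + size cannot overflow. */
	if (offset > src_len || size > src_len - offset)
		return false;
	if (offset > dst_len || size > dst_len - offset)
		return false;

	memcpy((char *)dst + offset, (const char *)src + offset, size);
	return true;
}
```

A caller would refuse the copy (rather than silently substituting another size) whenever the helper returns false.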
Actually if buffer_size has been validated & corrected, then bot

Re: [Intel-gfx] [PATCH 14/18] drm/i915: Forcefully flush GuC log buffer on reset

2016-08-15 Thread Goel, Akash



On 8/15/2016 9:18 PM, Tvrtko Ursulin wrote:


On 15/08/16 15:49, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

Before capturing the GuC logs as a part of error state, there should be a
force log buffer flush action sent to GuC before proceeding with GPU reset
and re-initializing GuC. There could be some data in the log buffer which
is yet to be captured and those logs would be particularly useful to
understand why the GPU reset was initiated.

v2:
- Avoid the wait via flush_work, to serialize against an ongoing log
   buffer flush, from the error state capture path. (Chris)


Could you explain if the patch does anything now that the flush has been
removed?


flush_work for the regular log buffer flush work item has been removed
but the forceful command is still sent to GuC.


In fact I don't even understand what it was doing before. :)


I am sorry for that.


If the idea is to send a flush command to GuC so it can raise an
interrupt for a partially full buffer,

Yes exactly this is the idea.


then i915_guc_flush_logs should
send the flush command and wait for that interrupt/work.

But the function is first waiting for the work item to complete and then
sending the flush command to the GuC. So I am confused.

Actually GuC firmware just ignores the forceful flush command received
from Host, if it sees there is a pending request for regular log buffer
flush, for which it hasn't received the ack.

So that's why from Host side, before sending the forceful flush command
to GuC, had to first wait for the regular log buffer flush work item to
finish execution.

Best regards
Akash


Regards,

Tvrtko


- Rebase.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_gpu_error.c  |  2 ++
  drivers/gpu/drm/i915/i915_guc_submission.c | 30
++
  drivers/gpu/drm/i915/intel_guc.h   |  1 +
  3 files changed, 33 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 94297aa..b73c671 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1301,6 +1301,8 @@ static void
i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv,
  if (!dev_priv->guc.log.vma || (i915.guc_log_level < 0))
  return;

+i915_guc_flush_logs(dev_priv, false);
+
  error->guc_log = i915_error_object_create(dev_priv,
dev_priv->guc.log.vma);
  }
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index b8d6313..85df2f3 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -185,6 +185,16 @@ static int
host2guc_logbuffer_flush_complete(struct intel_guc *guc)
  return host2guc_action(guc, data, 1);
  }

+static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
+{
+u32 data[2];
+
+data[0] = HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH;
+data[1] = 0;
+
+return host2guc_action(guc, data, 2);
+}
+
  /*
   * Initialise, update, or clear doorbell data shared with the GuC
   *
@@ -1536,6 +1546,26 @@ void i915_guc_capture_logs(struct
drm_i915_private *dev_priv)
  intel_runtime_pm_put(dev_priv);
  }

+void i915_guc_flush_logs(struct drm_i915_private *dev_priv, bool can_wait)
+{
+if (!i915.enable_guc_submission || (i915.guc_log_level < 0))
+return;
+
+/* First disable the interrupts, will be re-enabled afterwards */
+gen9_disable_guc_interrupts(dev_priv);
+
+/* Before initiating the forceful flush, wait for any pending/ongoing
+ * flush to complete otherwise forceful flush may not happen, but wait
+ * can't be done for some paths like error state capture in which case
+ * take a chance & directly attempt the forceful flush.
+ */
+if (can_wait)
+flush_work(&dev_priv->guc.log.flush_work);
+
+/* Ask GuC to update the log buffer state */
+host2guc_force_logbuffer_flush(&dev_priv->guc);
+}
+
  void i915_guc_unregister(struct drm_i915_private *dev_priv)
  {
  if (!i915.enable_guc_submission)
diff --git a/drivers/gpu/drm/i915/intel_guc.h
b/drivers/gpu/drm/i915/intel_guc.h
index 8598f38..d7eda42 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -182,6 +182,7 @@ int i915_guc_wq_check_space(struct
drm_i915_gem_request *rq);
  void i915_guc_submission_disable(struct drm_i915_private *dev_priv);
  void i915_guc_submission_fini(struct drm_i915_private *dev_priv);
  void i915_guc_capture_logs(struct drm_i915_private *dev_priv);
+void i915_guc_flush_logs(struct drm_i915_private *dev_priv, bool can_wait);
  void i915_guc_register(struct drm_i915_private *dev_priv);
  void i915_guc_unregister(struct drm_i915_private *dev_priv);




Re: [Intel-gfx] [PATCH 06/18] drm/i915: Handle log buffer flush interrupt event from GuC

2016-08-15 Thread Goel, Akash



On 8/15/2016 9:36 PM, Chris Wilson wrote:

On Mon, Aug 15, 2016 at 08:19:47PM +0530, akash.g...@intel.com wrote:

+static void guc_read_update_log_buffer(struct intel_guc *guc)
+{
+   struct guc_log_buffer_state *log_buffer_state, 
*log_buffer_snapshot_state;
+   struct guc_log_buffer_state log_buffer_state_local;
+   void *src_data_ptr, *dst_data_ptr;
+   unsigned int buffer_size, expected_size;
+   enum guc_log_buffer_type type;
+
+   if (WARN_ON(!guc->log.buf_addr))
+   return;
+
+   /* Get the pointer to shared GuC log buffer */
+   log_buffer_state = src_data_ptr = guc->log.buf_addr;
+
+   /* Get the pointer to local buffer to store the logs */
+   dst_data_ptr = log_buffer_snapshot_state = guc_get_write_buffer(guc);
+
+   /* Actual logs are present from the 2nd page */
+   src_data_ptr += PAGE_SIZE;
+   dst_data_ptr += PAGE_SIZE;
+
+   for (type = GUC_ISR_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) {
+   /* Make a copy of the state structure in GuC log buffer (which
+* is uncached mapped) on the stack to avoid reading from it
+* multiple times.
+*/
+   memcpy(&log_buffer_state_local, log_buffer_state,
+  sizeof(struct guc_log_buffer_state));
+   buffer_size = log_buffer_state_local.size;
+
+   if (log_buffer_snapshot_state) {
+   /* First copy the state structure in snapshot buffer */
+   memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
+  sizeof(struct guc_log_buffer_state));
+
+   /* The write pointer could have been updated by the GuC
+* firmware, after sending the flush interrupt to Host,
+* for consistency set the write pointer value to same
+* value of sampled_write_ptr in the snapshot buffer.
+*/
+   log_buffer_snapshot_state->write_ptr =
+   log_buffer_snapshot_state->sampled_write_ptr;
+
+   log_buffer_snapshot_state++;
+
+   /* Now copy the actual logs, but before that validate
+* the buffer size value retrieved from state structure.
+*/
+   if (type == GUC_ISR_LOG_BUFFER)
+   expected_size = (GUC_LOG_ISR_PAGES+1)*PAGE_SIZE;
+   else if (type == GUC_DPC_LOG_BUFFER)
+   expected_size = (GUC_LOG_DPC_PAGES+1)*PAGE_SIZE;
+   else
+   expected_size = 
(GUC_LOG_CRASH_PAGES+1)*PAGE_SIZE;
+
+   if (unlikely(buffer_size != expected_size)) {
+   DRM_ERROR("unexpected log buffer size\n");
+   /* Continue with further copying, already state
+* structure has been copied which is enough to
+* let Userspace know about the anomaly.
+*/
+   buffer_size = expected_size;


Urm, no.

You tell userspace one thing and then do another. This code should just
be a conduit and not apply its own outdated interpretation.

Userspace parser would get to know from the state structure about the 
anomalous buffer size.


Please suggest what should ideally be done here.

Should the further copying (for this snapshot) be skipped?


+   }
+
+   memcpy(dst_data_ptr, src_data_ptr, buffer_size);


Where do you validate that buffer_size is sane before copying?

Sorry, I didn't get you; the check for buffer_size is being done right
before this memcpy.


Best regards
Akash

-Chris




Re: [Intel-gfx] [PATCH 08/18] drm/i915: Add a relay backed debugfs interface for capturing GuC logs

2016-08-15 Thread Goel, Akash



On 8/15/2016 9:42 PM, Chris Wilson wrote:

On Mon, Aug 15, 2016 at 05:09:45PM +0100, Chris Wilson wrote:

On Mon, Aug 15, 2016 at 08:19:49PM +0530, akash.g...@intel.com wrote:

+void i915_guc_register(struct drm_i915_private *dev_priv)
+{
+   if (!i915.enable_guc_submission)
+   return;


The final state of i915.enable_guc_submission is not known at this time.


As per the below sequence, i915.enable_guc_submission would have been 
set to its final value by this time,


i915_driver_load
i915_load_modeset_init
i915_gem_init_hw
intel_guc_setup
i915_guc_submission_init
i915_guc_submission_enable
i915_driver_register
i915_debugfs_register
i915_guc_register


Does it matter if you set up the log even though guc is not used?


I think it would be better to do setup only if guc submission is enabled.


Would this not be better driven from guc_submission_enable and
guc_submission_disable?





With the caveat that you probably need both. i.e. you have to wait for
both the GuC to be enabled and for sysfs to be available.

Sorry, I am really confused.
Isn't this the right location, creating the relay file after the debugfs
registration has been done?

Other logging related setup is being done at i915_guc_submission_init().

Best regards
Akash


-Chris




Re: [Intel-gfx] [PATCH 11/18] drm/i915: Optimization to reduce the sampling time of GuC log buffer

2016-08-15 Thread Goel, Akash



On 8/15/2016 9:06 PM, Tvrtko Ursulin wrote:


On 15/08/16 15:49, akash.g...@intel.com wrote:

From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it becomes
half full, so Driver doesn't really need to sample the complete buffer
and can just copy only the newly written data by GuC into the local
buffer, i.e. as per the read & write pointer values.
Moreover the flush interrupt would generally come for one type of log
buffer, when it becomes half full, so at that time the other 2 types of
log buffer would comparatively have much lesser unread data in them.
In case of overflow reported by GuC, Driver does need to copy the entire
buffer as the whole buffer would contain the unread data.
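
The copy policy described in this commit message can be sketched in userspace C as below. It is a rough model, not the patch itself: `copy_unread` is a hypothetical helper, and the handling of the wrapped case (tail of the buffer first, then the head) is an assumption, since the quoted diff is cut off before that branch is complete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: copy only the unread part of one log buffer.
 * Returns the number of bytes copied.
 */
static size_t copy_unread(char *dst, const char *src, size_t buffer_size,
			  size_t read_offset, size_t write_offset,
			  bool new_overflow)
{
	size_t copied;

	assert(dst && src);

	if (new_overflow || read_offset > buffer_size ||
	    write_offset > buffer_size) {
		/* Overflow or untrustworthy offsets: take the whole buffer. */
		read_offset = 0;
		write_offset = buffer_size;
	}

	if (read_offset <= write_offset) {
		/* Contiguous region; zero bytes when the offsets are equal. */
		copied = write_offset - read_offset;
		memcpy(dst + read_offset, src + read_offset, copied);
	} else {
		/* Wrapped (assumed): tail of the buffer first, then the head. */
		copied = buffer_size - read_offset;
		memcpy(dst + read_offset, src + read_offset, copied);
		memcpy(dst, src, write_offset);
		copied += write_offset;
	}

	return copied;
}
```

Keeping the data at the same offsets in the destination means a consumer can interpret the snapshot with the same read and write pointers that the state structure reports.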

v2: Rebase.

v3: Fix the blooper of doing the copy twice. (Tvrtko)

Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 40
+-
  1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index c7b4a57..b8d6313 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1003,6 +1003,8 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
  void *src_data_ptr, *dst_data_ptr;
  unsigned int buffer_size, expected_size;
  enum guc_log_buffer_type type;
+unsigned int read_offset, write_offset, bytes_to_copy;
+bool new_overflow;

  if (WARN_ON(!guc->log.buf_addr))
  return;
@@ -1025,11 +1027,14 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
  memcpy(&log_buffer_state_local, log_buffer_state,
 sizeof(struct guc_log_buffer_state));
  buffer_size = log_buffer_state_local.size;
+read_offset = log_buffer_state_local.read_ptr;
+write_offset = log_buffer_state_local.sampled_write_ptr;

  /* Bookkeeping stuff */
  guc->log.flush_count[type] +=
log_buffer_state_local.flush_to_file;
  if (log_buffer_state_local.buffer_full_cnt !=
  guc->log.prev_overflow_count[type]) {
+new_overflow = 1;
  guc->log.total_overflow_count[type] +=
  (log_buffer_state_local.buffer_full_cnt -
   guc->log.prev_overflow_count[type]);
@@ -1043,7 +1048,8 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
  guc->log.prev_overflow_count[type] =
  log_buffer_state_local.buffer_full_cnt;
  DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
-}
+} else
+new_overflow = 0;


Nitpick: normally the rule is if one branch has curlies all of them have
to. Checkpatch I think warns about that, or maybe only in strict mode.


I did run checkpatch with the strict option.
I probably overlooked the warning; will check again.



  if (log_buffer_snapshot_state) {
  /* First copy the state structure in snapshot buffer */
@@ -1055,8 +1061,7 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
   * for consistency set the write pointer value to same
   * value of sampled_write_ptr in the snapshot buffer.
   */
-log_buffer_snapshot_state->write_ptr =
-log_buffer_snapshot_state->sampled_write_ptr;
+log_buffer_snapshot_state->write_ptr = write_offset;

  log_buffer_snapshot_state++;

@@ -1079,7 +1084,31 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
  buffer_size = expected_size;
  }

-memcpy(dst_data_ptr, src_data_ptr, buffer_size);
+if (unlikely(new_overflow)) {
+/* copy the whole buffer in case of overflow */
+read_offset = 0;
+write_offset = buffer_size;
+} else if (unlikely((read_offset > buffer_size) ||
+(write_offset > buffer_size))) {


Could also check for read_offset == write_offset for even more safety?

That is already handled implicitly; in this case we don't do any copy.
As per the code below, bytes_to_copy will come out as 0.

if (read_offset <= write_offset) {
bytes_to_copy = write_offset - read_offset;

Best regards
Akash

+DRM_ERROR("invalid log buffer state\n");
+/* copy whole buffer as offsets are unreliable */
+read_offset = 0;
+write_offset = buffer_size;
+}
+
+/* Just copy the newly written data */
+if (read_offset <= write_offset) {
+bytes_to_copy = write_offset - read_offset;
+memcpy(dst_data_ptr + read_offset,
+   src_data_ptr + read_offset, bytes_to_copy);
+} else {
+bytes_to_copy = buffer_size - read_offset;
+memcpy(dst_data_ptr + read_offset,
+   

Re: [Intel-gfx] [PATCH 08/18] drm/i915: Add a relay backed debugfs interface for capturing GuC logs

2016-08-15 Thread Goel, Akash



On 8/15/2016 8:59 PM, Tvrtko Ursulin wrote:


On 15/08/16 15:49, akash.g...@intel.com wrote:

From: Akash Goel 

Added a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the
User to capture GuC firmware logs. Availed relay framework to implement
the interface, where Driver will have to just use a relay API to store
snapshots of the GuC log buffer in the buffer managed by relay.
The snapshot will be taken when GuC firmware sends a log buffer flush
interrupt and up to four snapshots could be stored in the relay buffer.
The relay buffer will be operated in a mode where it will overwrite the
data not yet collected by User.
Besides mmap method, through which User can directly access the relay
buffer contents, relay also supports the 'poll' method. Through the 'poll'
call on log file, User can come to know whenever a new snapshot of the
log buffer is taken by Driver, so can run in tandem with the Driver and
capture the logs in a sustained/streaming manner, without any loss of data.

v2: Defer the creation of relay channel & associated debugfs file, as
 debugfs setup is now done at the end of i915 Driver load. (Chris)

v3:
- Switch to no-overwrite mode for relay.
- Fix the relay sub buffer switching sequence.

v4:
- Update i915 Kconfig to select RELAY config. (Tvrtko)
- Log a message when there is no sub buffer available to capture
   the GuC log buffer. (Tvrtko)
- Increase the number of relay sub buffers to 8 from 4, to have
   sufficient buffering for boot time logs

v5:
- Fix the alignment, indentation issues and some minor cleanup. (Tvrtko)
- Update the comment to elaborate on why a relay channel has to be
   associated with the debugfs file. (Tvrtko)

Suggested-by: Chris Wilson 
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/Kconfig   |   1 +
  drivers/gpu/drm/i915/i915_drv.c|   2 +
  drivers/gpu/drm/i915/i915_guc_submission.c | 211
-
  drivers/gpu/drm/i915/intel_guc.h   |   3 +
  4 files changed, 215 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 7769e46..fc900d2 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -11,6 +11,7 @@ config DRM_I915
  select DRM_KMS_HELPER
  select DRM_PANEL
  select DRM_MIPI_DSI
+select RELAY
  # i915 depends on ACPI_VIDEO when ACPI is enabled
  # but for select to work, need to select ACPI_VIDEO's
dependencies, ick
  select BACKLIGHT_LCD_SUPPORT if ACPI
diff --git a/drivers/gpu/drm/i915/i915_drv.c
b/drivers/gpu/drm/i915/i915_drv.c
index 13ae340..cdee60b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1133,6 +1133,7 @@ static void i915_driver_register(struct
drm_i915_private *dev_priv)
  /* Reveal our presence to userspace */
  if (drm_dev_register(dev, 0) == 0) {
  i915_debugfs_register(dev_priv);
+i915_guc_register(dev_priv);
  i915_setup_sysfs(dev);
  } else
  DRM_ERROR("Failed to register driver for userspace access!\n");
@@ -1171,6 +1172,7 @@ static void i915_driver_unregister(struct
drm_i915_private *dev_priv)
  intel_opregion_unregister(dev_priv);

  i915_teardown_sysfs(&dev_priv->drm);
+i915_guc_unregister(dev_priv);
  i915_debugfs_unregister(dev_priv);
  drm_dev_unregister(&dev_priv->drm);

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2b27b87..9b1054c 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -23,6 +23,8 @@
   */
  #include 
  #include 
+#include 
+#include 
  #include "i915_drv.h"
  #include "intel_guc.h"

@@ -837,13 +839,159 @@ err:
  return NULL;
  }

+/*
+ * Sub buffer switch callback. Called whenever relay has to switch to a new
+ * sub buffer, relay stays on the same sub buffer if 0 is returned.
+ */
+static int subbuf_start_callback(struct rchan_buf *buf,
+ void *subbuf,
+ void *prev_subbuf,
+ size_t prev_padding)
+{
+/* Use no-overwrite mode by default, where relay will stop accepting
+ * new data if there are no empty sub buffers left.
+ * There is no strict synchronization enforced by relay between Consumer
+ * and Producer. In overwrite mode, there is a possibility of getting
+ * inconsistent/garbled data, the producer could be writing on to the
+ * same sub buffer from which Consumer is reading. This can't be avoided
+ * unless Consumer is fast enough and can always run in tandem with
+ * Producer.
+ */
+if (relay_buf_full(buf))
+return 0;
+
+return 1;
+}
+
+/*
+ * file_create() callback. Creates relay file in debugfs.
+ */
+static struct dentry *create_buf_file_callback(const char *filename,
+

Re: [Intel-gfx] [PATCH 06/18] drm/i915: Handle log buffer flush interrupt event from GuC

2016-08-15 Thread Goel, Akash



On 8/15/2016 8:50 PM, Tvrtko Ursulin wrote:



On 15/08/16 15:49, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

GuC ukernel sends an interrupt to Host to flush the log buffer
and expects Host to correspondingly update the read pointer
information in the state structure, once it has consumed the
log buffer contents by copying them to a file or buffer.
Even if Host couldn't copy the contents, it can still update the
read pointer so that logging state is not disturbed on GuC side.
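
The handshake described above can be modelled in a few lines of C. This is a simplified sketch under stated assumptions: `log_state_model` is a hypothetical stand-in for the shared state structure (only `read_ptr` and `sampled_write_ptr` are names taken from the patch; `flush_pending` is invented for the illustration), and the actual copy is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for the shared state structure. */
struct log_state_model {
	uint32_t read_ptr;          /* updated by the Host */
	uint32_t sampled_write_ptr; /* write position sampled by the firmware */
	bool flush_pending;         /* hypothetical "flush requested" flag */
};

/*
 * Handle one flush request. 'dst_available' says whether the Host has
 * somewhere to store the logs; the copy itself is elided here.
 * Returns true if the logs were consumed, false if they were dropped.
 */
static bool host_handle_flush(struct log_state_model *state, bool dst_available)
{
	assert(state);

	/* Advance the read pointer in both cases so the firmware can go on. */
	state->read_ptr = state->sampled_write_ptr;
	state->flush_pending = false;

	return dst_available;
}
```

The point of the sketch is only that the read pointer moves forward whether or not the contents could be stored, which is what keeps the logging state on the GuC side undisturbed.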

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of
   crash buffer area for regular cases and copying only the state
   structure data in first page.

v3:
  - Create a vmalloc mapping of log buffer. (Chris)
  - Cover the flush acknowledgment under rpm get & put.(Chris)
  - Revert the change of skipping the copy of crash dump area, as
not really needed, will be covered by subsequent patch.

v4:
  - Destroy the wq under the same condition in which it was created,
pass dev_priv pointer instead of dev to newly added GuC function,
add more comments & rename variable for clarity. (Tvrtko)

v5:
- Allocate & destroy the dedicated wq, for handling flush interrupt,
   from the setup/teardown routines of GuC logging. (Chris)
- Validate the log buffer size value retrieved from state structure
   and do some minor cleanup. (Tvrtko)
- Fix error/warnings reported by checkpatch. (Tvrtko)
- Rebase.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 202
+
  drivers/gpu/drm/i915/i915_irq.c|  29 -
  drivers/gpu/drm/i915/intel_guc.h   |   4 +
  3 files changed, 234 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index b062da6..2b27b87 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct
intel_guc *guc,
  return host2guc_action(guc, data, ARRAY_SIZE(data));
  }

+static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
+{
+u32 data[1];
+
+data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
+
+return host2guc_action(guc, data, 1);
+}
+
  /*
   * Initialise, update, or clear doorbell data shared with the GuC
   *
@@ -828,6 +837,179 @@ err:
  return NULL;
  }

+static void guc_move_to_next_buf(struct intel_guc *guc)
+{
+}
+
+static void *guc_get_write_buffer(struct intel_guc *guc)
+{
+return NULL;
+}
+
+static void guc_read_update_log_buffer(struct intel_guc *guc)
+{
+struct guc_log_buffer_state *log_buffer_state,
*log_buffer_snapshot_state;
+struct guc_log_buffer_state log_buffer_state_local;
+void *src_data_ptr, *dst_data_ptr;
+unsigned int buffer_size, expected_size;
+enum guc_log_buffer_type type;
+
+if (WARN_ON(!guc->log.buf_addr))
+return;
+
+/* Get the pointer to shared GuC log buffer */
+log_buffer_state = src_data_ptr = guc->log.buf_addr;
+
+/* Get the pointer to local buffer to store the logs */
+dst_data_ptr = log_buffer_snapshot_state =
guc_get_write_buffer(guc);
+
+/* Actual logs are present from the 2nd page */
+src_data_ptr += PAGE_SIZE;
+dst_data_ptr += PAGE_SIZE;
+
+for (type = GUC_ISR_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) {
+/* Make a copy of the state structure in GuC log buffer (which
+ * is uncached mapped) on the stack to avoid reading from it
+ * multiple times.
+ */
+memcpy(&log_buffer_state_local, log_buffer_state,
+   sizeof(struct guc_log_buffer_state));
+buffer_size = log_buffer_state_local.size;
+
+if (log_buffer_snapshot_state) {
+/* First copy the state structure in snapshot buffer */
+memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
+   sizeof(struct guc_log_buffer_state));
+
+/* The write pointer could have been updated by the GuC
+ * firmware, after sending the flush interrupt to Host,
+ * for consistency set the write pointer value to same
+ * value of sampled_write_ptr in the snapshot buffer.
+ */
+log_buffer_snapshot_state->write_ptr =
+log_buffer_snapshot_state->sampled_write_ptr;
+
+log_buffer_snapshot_state++;
+
+/* Now copy the actual logs, but before that validate
+ * the buffer size value retrieved from state structure.
+ */
+if (type == GUC_ISR_LOG_BUFFER)
+expected_size = (GUC_LOG_ISR_PAGES+1)*PAGE_SIZE;
+else if (type == GUC_DPC_LOG_BUFFER)
+expected_size = (GUC_LOG_DPC_PAGES+1)*PAGE_SIZE;
+else
+

Re: [Intel-gfx] [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs

2016-08-15 Thread Goel, Akash



On 8/15/2016 2:50 PM, Tvrtko Ursulin wrote:


On 12/08/16 17:31, Goel, Akash wrote:

On 8/12/2016 9:52 PM, Tvrtko Ursulin wrote:

On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel <akash.g...@intel.com>

As per the current i915 Driver load sequence, debugfs registration is done
at the end and so the relay channel debugfs file is also created after that
but the GuC firmware is loaded much earlier in the sequence.
As a result Driver could miss capturing the boot-time logs of GuC firmware
if there are flush interrupts from the GuC side.
Relay has a provision to support early logging where initially only relay
channel can be created, to have buffers for storing logs, and later on
channel can be associated with a debugfs file at appropriate time.
Have availed that, which allows Driver to capture boot time logs also,
which can be collected once Userspace comes up.

Suggested-by: Chris Wilson <ch...@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 61
+-
  1 file changed, 44 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index af48f62..1c287d7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct
intel_guc *guc)
  relay_close(guc->log.relay_chan);
  }

-static int guc_create_log_relay_file(struct intel_guc *guc)
+static int guc_create_relay_channel(struct intel_guc *guc)
  {
  struct drm_i915_private *dev_priv = guc_to_i915(guc);
  struct rchan *guc_log_relay_chan;
-struct dentry *log_dir;
  size_t n_subbufs, subbuf_size;

-/* For now create the log file in /sys/kernel/debug/dri/0 dir */
-log_dir = dev_priv->drm.primary->debugfs_root;
-
-/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
- * not mounted and so can't create the relay file.
- * The relay API seems to fit well with debugfs only.


It only needs a dentry, I don't see that it has to be a debugfs one.


Besides dentry, there are other requirements for using relay, which can
be met only for a debugfs file.
debugfs wasn't the preferred choice to place the log file, but had no
other option, as relay API is compatible with debugfs only.


What are those and

To use relay there are 3 requirements:
a) The associated 'dentry' pointer of the file is needed while opening the
   relay channel.
b) The file should be able to use the 'relay_file_operations' fops.
c) The 'i_private' field of the file's inode should be set to the pointer of
   the relay channel buffer.

All 3 requirements above can be met for a debugfs file in a
straightforward manner. But not all of them can be met for a file
created inside sysfs, or for a file created inside /dev as a
character device file.



should they be mentioned in the comment above?


Or should I mention them in the cover letter or commit message?

Best regards
Akash


Regards,

Tvrtko



Re: [Intel-gfx] [PATCH 15/20] drm/i915: Debugfs support for GuC logging control

2016-08-12 Thread Goel, Akash



On 8/12/2016 9:27 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

This patch provides debugfs interface i915_guc_output_control for
on the fly enabling/disabling of logging in GuC firmware and controlling
the verbosity level of logs.
The value written to the file should have bit 0 set to enable logging and
bits 4-7 should contain the verbosity info.
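
The value layout described above (bit 0 as the enable flag, bits 4-7 as the verbosity) can be sketched as follows. The structure and both helper names are hypothetical and exist only for this illustration; the patch itself may decode the value differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of the value written to the control file. */
struct log_control_model {
	bool enable;            /* bit 0 */
	unsigned int verbosity; /* bits 4-7 */
};

static struct log_control_model decode_log_control(uint64_t val)
{
	struct log_control_model c;

	c.enable = (val & 0x1) != 0;
	c.verbosity = (unsigned int)((val >> 4) & 0xf);
	return c;
}

static uint64_t encode_log_control(bool enable, unsigned int verbosity)
{
	assert(verbosity <= 0xf); /* only four bits are available */
	return ((uint64_t)verbosity << 4) | (enable ? 1u : 0u);
}
```

Under this layout, enabling logging at verbosity 2 corresponds to the value 0x21, and 0x30 carries verbosity 3 with logging disabled.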

v2: Add a forceful flush, to collect left over logs, on disabling logging.
 Useful for Validation.

v3: Besides minor cleanup, implement read method for the debugfs file and
 set the guc_log_level to -1 when logging is disabled. (Tvrtko)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_debugfs.c| 44 -
  drivers/gpu/drm/i915/i915_guc_submission.c | 63
++
  drivers/gpu/drm/i915/intel_guc.h   |  1 +
  3 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
b/drivers/gpu/drm/i915/i915_debugfs.c
index 14e0dcf..f472fbcd3 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2674,6 +2674,47 @@ static int i915_guc_log_dump(struct seq_file
*m, void *data)
  return 0;
  }

+static int i915_guc_log_control_get(void *data, u64 *val)
+{
+struct drm_device *dev = data;
+struct drm_i915_private *dev_priv = to_i915(dev);
+
+if (!dev_priv->guc.log.obj)
+return -EINVAL;
+
+*val = i915.guc_log_level;
+
+return 0;
+}
+
+static int i915_guc_log_control_set(void *data, u64 val)
+{
+struct drm_device *dev = data;
+struct drm_i915_private *dev_priv = to_i915(dev);
+int ret;
+
+ret = mutex_lock_interruptible(&dev->struct_mutex);
+if (ret)
+return ret;
+
+if (!dev_priv->guc.log.obj) {
+ret = -EINVAL;
+goto end;
+}
+
+intel_runtime_pm_get(dev_priv);
+ret = i915_guc_log_control(dev_priv, val);
+intel_runtime_pm_put(dev_priv);
+
+end:
+mutex_unlock(&dev->struct_mutex);
+return ret;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
+i915_guc_log_control_get, i915_guc_log_control_set,
+"%lld\n");
+
  static int i915_edp_psr_status(struct seq_file *m, void *data)
  {
  struct drm_info_node *node = m->private;
@@ -5477,7 +5518,8 @@ static const struct i915_debugfs_files {
  {"i915_fbc_false_color", &i915_fbc_fc_fops},
  {"i915_dp_test_data", &i915_displayport_test_data_fops},
  {"i915_dp_test_type", &i915_displayport_test_type_fops},
-{"i915_dp_test_active", &i915_displayport_test_active_fops}
+{"i915_dp_test_active", &i915_displayport_test_active_fops},
+{"i915_guc_log_control", &i915_guc_log_control_fops}
  };

  void intel_display_crc_init(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 4a75c16..041cf68 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -195,6 +195,16 @@ static int host2guc_force_logbuffer_flush(struct
intel_guc *guc)
  return host2guc_action(guc, data, 2);
  }

+static int host2guc_logging_control(struct intel_guc *guc, u32
control_val)
+{
+u32 data[2];
+
+data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING;
+data[1] = control_val;
+
+return host2guc_action(guc, data, 2);
+}
+
  /*
   * Initialise, update, or clear doorbell data shared with the GuC
   *
@@ -1538,3 +1548,56 @@ void i915_guc_register(struct drm_i915_private
*dev_priv)
  guc_log_late_setup(&dev_priv->guc);
  mutex_unlock(&dev_priv->drm.struct_mutex);
  }
+
+int i915_guc_log_control(struct drm_i915_private *dev_priv, u64
control_val)
+{
+union guc_log_control log_param;
+int ret;
+
+log_param.logging_enabled = control_val & 0x1;
+log_param.verbosity = (control_val >> 4) & 0xF;


Maybe "log_param.value = control_val" would also work since
guc_log_control is conveniently defined as an union. Doesn't matter though.


+
+if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN ||
+log_param.verbosity > GUC_LOG_VERBOSITY_MAX)
+return -EINVAL;
+
+/* This combination doesn't make sense & won't have any effect */
+if (!log_param.logging_enabled && (i915.guc_log_level < 0))
+return 0;


I wonder if it would work and maybe look nicer to generalize as:

int guc_log_level;

guc_log_level = log_param.logging_enabled ? log_param.verbosity : -1;
if (i915.guc_log_level == guc_log_level)
return 0;


Fine, will try to refactor the code as per your suggestions.
Thanks for the suggestions.
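
The generalisation suggested above can be sketched in plain user-space C.
This is only an illustration, not the driver code: the helper names and the
verbosity limits (0 to 3) are assumptions made for the example, and explicit
shifts are used instead of the guc_log_control union. The bit layout (bit 0
enables logging, bits 4-7 carry the verbosity) is taken from the commit
message.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed limits for this sketch; the driver uses GUC_LOG_VERBOSITY_MIN/MAX. */
#define SKETCH_VERBOSITY_MIN 0
#define SKETCH_VERBOSITY_MAX 3

/*
 * Translate the value written to the control file into a log level:
 * bit 0 enables logging, bits 4-7 hold the verbosity.
 * Returns the verbosity when logging is enabled, -1 when it is disabled,
 * or -EINVAL when the verbosity is out of range.
 */
static int sketch_requested_log_level(uint64_t control_val)
{
	int enabled = control_val & 0x1;
	int verbosity = (control_val >> 4) & 0xF;

	if (verbosity < SKETCH_VERBOSITY_MIN || verbosity > SKETCH_VERBOSITY_MAX)
		return -EINVAL;

	return enabled ? verbosity : -1;
}

/*
 * Returns 1 when the firmware would have to be told about a change,
 * 0 when the request matches the current level (nothing to do),
 * or -EINVAL for an invalid request.
 */
static int sketch_log_control_changed(int current_level, uint64_t control_val)
{
	int level = sketch_requested_log_level(control_val);

	if (level == -EINVAL)
		return level;

	return level != current_level;
}
```

With this shape a single comparison covers both "disable while already
disabled" and "same verbosity requested again", which is the point of the
suggestion.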


+
+ret = host2guc_logging_control(&dev_priv->guc, log_param.value);
+if (ret < 0) {
+DRM_DEBUG_DRIVER("host2guc action failed %d\n", ret);
+return ret;
+}
+
+i915.guc_log_level = log_param.verbosity;


This would then become i915.guc_log_level = 

Re: [Intel-gfx] [PATCH 16/20] drm/i915: Support to create write combined type vmaps

2016-08-12 Thread Goel, Akash



On 8/12/2016 8:46 PM, Chris Wilson wrote:

On Fri, Aug 12, 2016 at 08:43:58PM +0530, Goel, Akash wrote:

On 8/12/2016 4:19 PM, Tvrtko Ursulin wrote:

Unrelated and unmentioned change to no guard page. Best to remove IMHO.
Can keep the RB in that case.


Though it's not called out, sorry for that, isn't it better to
avoid using the guard page? That saves 4KB of vmalloc virtual
space (which is scarce) for every mapping created by the driver.

Would it be fine to update the commit message to mention this?


Too late, already applied without the new flag.


Oh, is the patch already queued for merge?


Yes, that's why I dropped the guard page when I found out it was being
added. Send a patch to add the flag and we can discuss whether we think
our code is adequate to not require the protection.


Fine, will prepare a separate patch to avoid using the guard page.

Best regards
Akash


-Chris




Re: [Intel-gfx] [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs

2016-08-12 Thread Goel, Akash



On 8/12/2016 9:52 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

As per the current i915 Driver load sequence, debugfs registration is done
at the end, so the relay channel debugfs file is also created only then,
but the GuC firmware is loaded much earlier in the sequence.
As a result the Driver could miss capturing the boot-time logs of the GuC
firmware if there are flush interrupts from the GuC side.
Relay has a provision to support early logging, where initially only the
relay channel is created, to have buffers for storing logs, and later on the
channel is associated with a debugfs file at an appropriate time.
This patch uses that provision, which allows the Driver to capture boot-time
logs also, which can be collected once Userspace comes up.
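
The two-step sequence described above (relay_open() without a file first,
relay_late_setup_files() later) can be modelled in user space as follows.
This is a toy model only: the sketch_* names, the buffer sizes and the
struct layout are made up for illustration and are not the relay API.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_SUBBUF_SIZE 64
#define SKETCH_N_SUBBUFS   8

/* Hypothetical stand-in for a relay channel. */
struct sketch_chan {
	char bufs[SKETCH_N_SUBBUFS][SKETCH_SUBBUF_SIZE];
	size_t produced;       /* sub-buffers filled so far */
	const char *file_name; /* NULL until a file is attached */
};

/* Step 1 (early): create the channel with buffers but without a file. */
static void sketch_chan_open(struct sketch_chan *c)
{
	memset(c->bufs, 0, sizeof(c->bufs));
	c->produced = 0;
	c->file_name = NULL;
}

/*
 * Store one snapshot. Returns 0 on success, or -1 when the snapshot does
 * not fit or all sub-buffers are full (no-overwrite behaviour).
 */
static int sketch_chan_store(struct sketch_chan *c, const void *data, size_t len)
{
	if (c->produced == SKETCH_N_SUBBUFS || len > SKETCH_SUBBUF_SIZE)
		return -1;

	memcpy(c->bufs[c->produced++], data, len);
	return 0;
}

/*
 * Step 2 (late): attach the channel to a file name once the directory
 * exists. Returns how many snapshots were captured before the file existed.
 */
static size_t sketch_chan_late_setup(struct sketch_chan *c, const char *name)
{
	c->file_name = name;
	return c->produced;
}
```

Anything stored between the two steps stays in the sub-buffers, so a reader
that opens the file after the late setup still finds the boot-time data, up
to the number of sub-buffers available.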

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 61
+-
  1 file changed, 44 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index af48f62..1c287d7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct
intel_guc *guc)
  relay_close(guc->log.relay_chan);
  }

-static int guc_create_log_relay_file(struct intel_guc *guc)
+static int guc_create_relay_channel(struct intel_guc *guc)
  {
  struct drm_i915_private *dev_priv = guc_to_i915(guc);
  struct rchan *guc_log_relay_chan;
-struct dentry *log_dir;
  size_t n_subbufs, subbuf_size;

-/* For now create the log file in /sys/kernel/debug/dri/0 dir */
-log_dir = dev_priv->drm.primary->debugfs_root;
-
-/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
- * not mounted and so can't create the relay file.
- * The relay API seems to fit well with debugfs only.


It only needs a dentry, I don't see that it has to be a debugfs one.

Besides the dentry, there are other requirements for using relay, which can
be met only by a debugfs file.
debugfs wasn't the preferred place for the log file, but there was no
other option, as the relay API is compatible with debugfs only.


Also, retrieving the dentry of a file is not as straightforward as it might
seem (I spent considerable time on this initially).




- */
-if (!log_dir) {
-DRM_DEBUG_DRIVER("Parent debugfs directory not available
yet\n");
-return -ENODEV;
-}
-
  /* Keep the size of sub buffers same as shared log buffer */
  subbuf_size = guc->log.obj->base.size;
  /* Store up to 8 snaphosts, which is large enough to buffer
sufficient
@@ -1127,7 +1114,7 @@ static int guc_create_log_relay_file(struct
intel_guc *guc)
   */
  n_subbufs = 8;

-guc_log_relay_chan = relay_open("guc_log", log_dir,
+guc_log_relay_chan = relay_open(NULL, NULL,
  subbuf_size, n_subbufs, &relay_callbacks, dev_priv);

  if (!guc_log_relay_chan) {
@@ -1140,6 +1127,33 @@ static int guc_create_log_relay_file(struct
intel_guc *guc)
  return 0;
  }

+static int guc_create_log_relay_file(struct intel_guc *guc)
+{
+struct drm_i915_private *dev_priv = guc_to_i915(guc);
+struct dentry *log_dir;
+int ret;
+
+/* For now create the log file in /sys/kernel/debug/dri/0 dir */
+log_dir = dev_priv->drm.primary->debugfs_root;
+
+/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
+ * not mounted and so can't create the relay file.
+ * The relay API seems to fit well with debugfs only.
+ */
+if (!log_dir) {
+DRM_DEBUG_DRIVER("Parent debugfs directory not available
yet\n");
+return -ENODEV;
+}
+
+ret = relay_late_setup_files(guc->log.relay_chan, "guc_log",
log_dir);
+if (ret) {
+DRM_DEBUG_DRIVER("Couldn't associate the channel with file
%d\n", ret);
+return ret;
+}
+
+return 0;
+}
+
  static void guc_log_cleanup(struct intel_guc *guc)
  {
  struct drm_i915_private *dev_priv = guc_to_i915(guc);
@@ -1167,7 +1181,7 @@ static int guc_create_log_extras(struct
intel_guc *guc)
  {
  struct drm_i915_private *dev_priv = guc_to_i915(guc);
  void *vaddr;
-int ret;
+int ret = 0;

  lockdep_assert_held(&dev_priv->drm.struct_mutex);

@@ -1190,7 +1204,15 @@ static int guc_create_log_extras(struct
intel_guc *guc)
  guc->log.buf_addr = vaddr;
  }

-return 0;
+if (!guc->log.relay_chan) {
+/* Create a relay channel, so that we have buffers for storing
+ * the GuC firmware logs, the channel will be linked with a file
+ * later on when debugfs is registered.
+ */
+ret = guc_create_relay_channel(guc);
+}
+
+return ret;
  }

  static void guc_create_log(struct intel_guc *guc)
@@ -1231,6 +1253,7 @@ static void guc_create_log(struct intel_guc *guc)
  

Re: [Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC

2016-08-12 Thread Goel, Akash



On 8/12/2016 7:37 PM, Tvrtko Ursulin wrote:


On 12/08/16 14:45, Goel, Akash wrote:



On 8/12/2016 6:47 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Sagar Arun Kamble <sagar.a.kam...@intel.com>

GuC ukernel sends an interrupt to Host to flush the log buffer
and expects Host to correspondingly update the read pointer
information in the state structure, once it has consumed the
log buffer contents by copying them to a file or buffer.
Even if Host couldn't copy the contents, it can still update the
read pointer so that logging state is not disturbed on GuC side.
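
The handshake described above can be sketched in user-space C. The struct
below is a simplified, hypothetical stand-in for the state structure (the
field names are assumptions), and the sketch assumes a linear buffer with no
wrap-around. It only shows the rule stated in the commit message: the read
pointer is advanced whether or not the contents could be copied.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified log buffer state shared with the firmware. */
struct sketch_log_state {
	uint32_t read_ptr;      /* offset up to which the host has consumed */
	uint32_t write_ptr;     /* offset up to which the firmware has written */
	uint32_t flush_pending; /* set by the firmware along with the interrupt */
};

/*
 * Handle one flush request. The unread bytes are copied to dst when a
 * destination with enough room is available; the read pointer is updated
 * and the request acknowledged in either case.
 * Returns the number of bytes copied.
 */
static size_t sketch_handle_flush(struct sketch_log_state *st,
				  const uint8_t *log,
				  uint8_t *dst, size_t dst_len)
{
	size_t unread = st->write_ptr - st->read_ptr;
	size_t copied = 0;

	if (dst && unread <= dst_len) {
		memcpy(dst, log + st->read_ptr, unread);
		copied = unread;
	}

	/* Always catch up, so the firmware's logging state is not disturbed. */
	st->read_ptr = st->write_ptr;
	st->flush_pending = 0;

	return copied;
}
```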

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of
   crash buffer area for regular cases and copying only the state
   structure data in first page.

v3:
  - Create a vmalloc mapping of log buffer. (Chris)
  - Cover the flush acknowledgment under rpm get & put.(Chris)
  - Revert the change of skipping the copy of crash dump area, as
not really needed, will be covered by subsequent patch.

v4:
  - Destroy the wq under the same condition in which it was created,
pass dev_priv pointer instead of dev to newly added GuC function,
add more comments & rename variable for clarity. (Tvrtko)

Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com>
Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/i915_drv.c|  14 +++
  drivers/gpu/drm/i915/i915_guc_submission.c | 150
+
  drivers/gpu/drm/i915/i915_irq.c|   5 +-
  drivers/gpu/drm/i915/intel_guc.h   |   3 +
  4 files changed, 170 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c
b/drivers/gpu/drm/i915/i915_drv.c
index 0fcd1c0..fc2da32 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -770,8 +770,20 @@ static int i915_workqueues_init(struct
drm_i915_private *dev_priv)
  if (dev_priv->hotplug.dp_wq == NULL)
  goto out_free_wq;

+if (HAS_GUC_SCHED(dev_priv)) {


This just reminded me that a previous patch had:

+if (HAS_GUC_UCODE(dev))
+dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT;

In the interrupt setup. I don't think there is a bug right now, but
there is a disagreement between the two which would be good to resolve.

This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED
for correctness. I think.


Sorry for the inconsistency, will use HAS_GUC_SCHED in the previous patch.

As per Chris's comments, will move the wq init/destroy to the GuC logging
setup/teardown routines (guc_create_log_extras, guc_log_cleanup).
Are you fine with that?


Yes, that's OK I think.




+/* Need a dedicated wq to process log buffer flush interrupts
+ * from GuC without much delay so as to avoid any loss of
logs.
+ */
+dev_priv->guc.log.wq =
+alloc_ordered_workqueue("i915-guc_log", 0);
+if (dev_priv->guc.log.wq == NULL)
+goto out_free_hotplug_dp_wq;
+}
+
  return 0;

+out_free_hotplug_dp_wq:
+destroy_workqueue(dev_priv->hotplug.dp_wq);
  out_free_wq:
  destroy_workqueue(dev_priv->wq);
  out_err:
@@ -782,6 +794,8 @@ out_err:

  static void i915_workqueues_cleanup(struct drm_i915_private
*dev_priv)
  {
+if (HAS_GUC_SCHED(dev_priv))
+destroy_workqueue(dev_priv->guc.log.wq);
  destroy_workqueue(dev_priv->hotplug.dp_wq);
  destroy_workqueue(dev_priv->wq);
  }
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index c7c679f..2635b67 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct
intel_guc *guc,
  return host2guc_action(guc, data, ARRAY_SIZE(data));
  }

+static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
+{
+u32 data[1];
+
+data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
+
+return host2guc_action(guc, data, 1);
+}
+
  /*
   * Initialise, update, or clear doorbell data shared with the GuC
   *
@@ -840,6 +849,127 @@ err:
  return NULL;
  }

+static void guc_move_to_next_buf(struct intel_guc *guc)
+{
+return;
+}
+
+static void* guc_get_write_buffer(struct intel_guc *guc)
+{
+return NULL;
+}
+
+static void guc_read_update_log_buffer(struct intel_guc *guc)
+{
+struct guc_log_buffer_state *log_buffer_state,
*log_buffer_snapshot_state;
+struct guc_log_buffer_state log_buffer_state_local;
+void *src_data_ptr, *dst_data_ptr;
+u32 i, buffer_size;


unsigned int i if you can be bothered.


Fine, will do that for both i & buffer_size.


buffer_size can match the type of log_buffer_state_local.size or use
something else if more appropriate.


But I remember that earlier, in one of the patches, you suggested using u32 as
a ty

Re: [Intel-gfx] [PATCH 08/20] drm/i915: Add a relay backed debugfs interface for capturing GuC logs

2016-08-12 Thread Goel, Akash



On 8/12/2016 7:23 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

Added a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the
User to capture GuC firmware logs. Availed relay framework to implement
the interface, where Driver will have to just use a relay API to store
snapshots of the GuC log buffer in the buffer managed by relay.
The snapshot will be taken when GuC firmware sends a log buffer flush
interrupt and up to four snaphots could be stored in the relay buffer.


snapshots


The relay buffer will be operated in a mode where it will overwrite the
data not yet collected by User.
Besides mmap method, through which User can directly access the relay
buffer contents, relay also supports the 'poll' method. Through the
'poll'
call on log file, User can come to know whenever a new snapshot of the
log buffer is taken by Driver, so can run in tandem with the Driver and
capture the logs in a sustained/streaming manner, without any loss of
data.
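
From the user-space side, the 'poll' based streaming described above could
look roughly like the sketch below. It is a minimal illustration using only
POSIX poll() and read(); the function name is made up, and which descriptor
is passed in (for example the opened guc_log debugfs file) is left to the
caller.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for new data on fd, then read at most len bytes.
 * Returns the number of bytes read, 0 on timeout (or end of file),
 * or -1 on error.
 */
static ssize_t sketch_wait_and_read(int fd, void *buf, size_t len,
				    int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret < 0)
		return -1;
	if (ret == 0)
		return 0;	/* nothing new within the timeout */
	if (!(pfd.revents & POLLIN))
		return -1;	/* error or hang-up without data */

	return read(fd, buf, len);
}
```

A capture tool would call this in a loop and append each chunk to its own
output file, so that it keeps pace with the Driver instead of sampling the
log at arbitrary times.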

v2: Defer the creation of relay channel & associated debugfs file, as
 debugfs setup is now done at the end of i915 Driver load. (Chris)

v3:
- Switch to no-overwrite mode for relay.
- Fix the relay sub buffer switching sequence.

v4:
- Update i915 Kconfig to select RELAY config. (Tvrtko)
- Log a message when there is no sub buffer available to capture
   the GuC log buffer. (Tvrtko)
- Increase the number of relay sub buffers to 8 from 4, to have
   sufficient buffering for boot time logs

Suggested-by: Chris Wilson 
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/Kconfig   |   1 +
  drivers/gpu/drm/i915/i915_drv.c|   2 +
  drivers/gpu/drm/i915/i915_guc_submission.c | 206
-
  drivers/gpu/drm/i915/intel_guc.h   |   3 +
  4 files changed, 209 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 7769e46..fc900d2 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -11,6 +11,7 @@ config DRM_I915
  select DRM_KMS_HELPER
  select DRM_PANEL
  select DRM_MIPI_DSI
+select RELAY
  # i915 depends on ACPI_VIDEO when ACPI is enabled
  # but for select to work, need to select ACPI_VIDEO's
dependencies, ick
  select BACKLIGHT_LCD_SUPPORT if ACPI
diff --git a/drivers/gpu/drm/i915/i915_drv.c
b/drivers/gpu/drm/i915/i915_drv.c
index fc2da32..cb8c943 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1145,6 +1145,7 @@ static void i915_driver_register(struct
drm_i915_private *dev_priv)
  /* Reveal our presence to userspace */
  if (drm_dev_register(dev, 0) == 0) {
  i915_debugfs_register(dev_priv);
+i915_guc_register(dev_priv);
  i915_setup_sysfs(dev);
  } else
  DRM_ERROR("Failed to register driver for userspace access!\n");
@@ -1183,6 +1184,7 @@ static void i915_driver_unregister(struct
drm_i915_private *dev_priv)
  intel_opregion_unregister(dev_priv);

  i915_teardown_sysfs(&dev_priv->drm);
+i915_guc_unregister(dev_priv);
  i915_debugfs_unregister(dev_priv);
  drm_dev_unregister(&dev_priv->drm);

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2635b67..1a2d648 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -23,6 +23,8 @@
   */
  #include 
  #include 
+#include 
+#include 
  #include "i915_drv.h"
  #include "intel_guc.h"

@@ -851,12 +853,33 @@ err:

  static void guc_move_to_next_buf(struct intel_guc *guc)
  {
-return;
+/* Make sure the updates made in the sub buffer are visible when
+ * Consumer sees the following update to offset inside the sub
buffer.
+ */
+smp_wmb();
+
+/* All data has been written, so now move the offset of sub
buffer. */
+relay_reserve(guc->log.relay_chan, guc->log.obj->base.size);
+
+/* Switch to the next sub buffer */
+relay_flush(guc->log.relay_chan);
  }

  static void* guc_get_write_buffer(struct intel_guc *guc)
  {
-return NULL;
+/* FIXME: Cover the check under a lock ? */


Need to resolve before r-b in any case.
After the last patch in this series, where the relay channel is created
before the GuC interrupts are enabled, the lock will no longer be needed,
so will remove these comments in that patch.





+if (!guc->log.relay_chan)
+return NULL;
+
+/* Just get the base address of a new sub buffer and copy data
into it
+ * ourselves. NULL will be returned in no-overwrite mode, if all sub
+ * buffers are full. Could have used the relay_write() to indirectly
+ * copy the data, but that would have been bit convoluted, as we
need to
+ * write to only certain locations inside a sub buffer which
cannot be
+ * done 

Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer

2016-08-12 Thread Goel, Akash



On 8/12/2016 9:22 PM, Chris Wilson wrote:

On Fri, Aug 12, 2016 at 09:16:03PM +0530, Goel, Akash wrote:

On 8/12/2016 9:02 PM, Chris Wilson wrote:

There's (or will be) a function to dump the error object in a uniform
manner. This patch is obsolete.


There is a print_error_obj() function, but that prints one dword per line.


It used to. It will shortly be a compressed stream.



Pretty printing is left to userspace.
But invariably, only we will be interpreting the error state or GuC log
buffer dump, and it will be really convenient if we can have 4 dwords
per line, matching the log sample size.



Best regards
Akash

-Chris




Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer

2016-08-12 Thread Goel, Akash



On 8/12/2016 9:02 PM, Chris Wilson wrote:

On Fri, Aug 12, 2016 at 04:20:03PM +0100, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

Added the dump of GuC log buffer to i915 error state, as the contents of
GuC log buffer would also be useful to determine why the GPU reset
was triggered.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_drv.h   |  1 +
 drivers/gpu/drm/i915/i915_gpu_error.c | 27 +++
 2 files changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 28ffac5..4bd3790 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -509,6 +509,7 @@ struct drm_i915_error_state {
struct intel_overlay_error_state *overlay;
struct intel_display_error_state *display;
struct drm_i915_error_object *semaphore_obj;
+   struct drm_i915_error_object *guc_log_obj;

struct drm_i915_error_engine {
int engine_id;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index eecb870..561b523 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -546,6 +546,21 @@ int i915_error_state_to_str(struct 
drm_i915_error_state_buf *m,
}
}

+   if ((obj = error->guc_log_obj)) {
+   err_printf(m, "GuC log buffer = 0x%08x\n",
+  lower_32_bits(obj->gtt_offset));
+   for (i = 0; i < obj->page_count; i++) {
+   for (elt = 0; elt < PAGE_SIZE/4; elt += 4) {


Should the condition be PAGE_SIZE / 16 ? I am not sure, looks like
it is counting in u32 * 4 chunks so it might be. Or I might be
confused..

It will be PAGE_SIZE / 4 only. It took me some iterations to get it right.
PAGE_SIZE/4 is the number of dwords in a page, and
elt += 4 covers 4 dwords in every iteration.




There's (or will be) a function to dump the error object in a uniform
manner. This patch is obsolete.


There is a print_error_obj() function, but that prints one dword per line.
For the GuC log buffer it's better (for ease of interpretation) to print 4
dwords per line, as each sample is 4 dwords; the headers are 8 dwords.
The other benefit is that it reduces the line count of the error state file
(compared to other captured buffers like the ring buffer, batch buffers and
status page, the log buffer is larger, at 76 KB).
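
The loop bounds discussed above can be checked with a small user-space
sketch. The function name is made up for the example and fprintf() stands in
for err_printf(); the bounds mirror the patch: size / 4 gives the number of
dwords, and each iteration consumes four of them.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Print a buffer as four dwords per line.
 * Returns the number of lines written.
 */
static size_t sketch_dump_dwords(FILE *out, const uint32_t *buf,
				 size_t size_bytes)
{
	size_t n_dwords = size_bytes / 4;
	size_t elt, lines = 0;

	for (elt = 0; elt + 4 <= n_dwords; elt += 4) {
		fprintf(out, "%08" PRIx32 " %08" PRIx32 " %08" PRIx32 " %08" PRIx32 "\n",
			buf[elt], buf[elt + 1], buf[elt + 2], buf[elt + 3]);
		lines++;
	}

	return lines;
}
```

For a 4096-byte page this gives 1024 dwords and so 256 lines. For a 76 KB
buffer (77824 bytes, 19456 dwords) it gives 4864 lines, against 19456 lines
at one dword per line.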


Best regards
Akash




-Chris




Re: [Intel-gfx] [PATCH 05/20] drm/i915: Support for GuC interrupts

2016-08-12 Thread Goel, Akash



On 8/12/2016 8:35 PM, Tvrtko Ursulin wrote:


On 12/08/16 15:31, Goel, Akash wrote:

On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote:

+static void gen9_guc2host_events_work(struct work_struct *work)
+{
+struct drm_i915_private *dev_priv =
+container_of(work, struct drm_i915_private,
guc.events_work);
+
+spin_lock_irq(&dev_priv->irq_lock);
+/* Speed up work cancellation during disabling guc
interrupts. */
+if (!dev_priv->guc.interrupts_enabled) {
+spin_unlock_irq(&dev_priv->irq_lock);
+return;


I suppose locking for early exit is something about ensuring the
worker
sees the update to dev_priv->guc.interrupts_enabled done on another
CPU?


Yes, locking (which provides an implicit barrier) will ensure that an update
made from another CPU is immediately visible to the worker.


What if the disable happens after the unlock above? It would wait in
disable until the irq handler exits.

Most probably it will not have to wait, as the irq handler would have
completed if the work item has begun executing.
The irq handler just queues the work item, which gets scheduled later on.

Using the lock is beneficial for the case where the execution of the work
item and the interrupt disabling happen around the same time.


Ok maybe I am missing something.

When can the interrupt disabling happen? Will it be controlled by the
debugfs file or is it driver load/unload and suspend/resume?


Yes, disabling will happen in all 3 of the above scenarios.


+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
u32 gt_iir)
+{
+bool interrupts_enabled;
+
+if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
+spin_lock(&dev_priv->irq_lock);
+interrupts_enabled = dev_priv->guc.interrupts_enabled;
+spin_unlock(&dev_priv->irq_lock);


Not sure that taking a lock around only this read is needed.


Again the same reason as above, to make sure an update made on another CPU
is immediately visible to the irq handler.


I don't get it, see above. :)


Here also, if interrupt disabling & ISR execution happen around the same
time then the ISR might miss the reset of the 'interrupts_enabled' flag and
queue the new work.


What if reset of interrupts_enabled happens just as the ISR releases the
lock?


Then the ISR will proceed ahead and queue the work item.

The lock is useful if the reset of the interrupts_enabled flag happens just
before the ISR inspects the value of that flag.
Also, the lock will help when the interrupts_enabled flag is set again: the
next ISR will definitely see it as set.



And the same applies to the case when the interrupt is re-enabled: the ISR
might still see the 'interrupts_enabled' flag as false.
It will eventually see the update though.




+if (interrupts_enabled) {
+/* Sample the log buffer flush related bits & clear them
+ * out now itself from the message identity register to
+ * minimize the probability of losing a flush interrupt,
+ * when there are back to back flush interrupts.
+ * There can be a new flush interrupt, for different log
+ * buffer type (like for ISR), whilst Host is handling
+ * one (for DPC). Since same bit is used in message
+ * register for ISR & DPC, it could happen that GuC
+ * sets the bit for 2nd interrupt but Host clears out
+ * the bit on handling the 1st interrupt.
+ */
+u32 msg = I915_READ(SOFT_SCRATCH(15)) &
+(GUC2HOST_MSG_CRASH_DUMP_POSTED |
+ GUC2HOST_MSG_FLUSH_LOG_BUFFER);
+if (msg) {
+/* Clear the message bits that are handled */
+I915_WRITE(SOFT_SCRATCH(15),
+I915_READ(SOFT_SCRATCH(15)) & ~msg);


Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it
twice?


Thought reading it again (just before the update) is a bit safer compared
to reading it once, as there is a potential race here.
GuC could also write to the SOFT_SCRATCH(15) register and set new event
bits while Host clears the bits of the handled events.


Don't get it. If there is a race between read and write there still is,
don't see how a second read makes it safer.


Yes, double reads can't avoid the race completely, but they can reduce the
race window size.


There was only one thing between the two reads, and that was "if (msg)":

 +u32 msg = I915_READ(SOFT_SCRATCH(15)) &
 +(GUC2HOST_MSG_CRASH_DUMP_POSTED |
 + GUC2HOST_MSG_FLUSH_LOG_BUFFER);

 +if (msg) {

 +/* Clear the message bits that are handled */
 +I915_WRITE(SOFT_SCRATCH(15),
 +I915_READ(SOFT_SCRATCH(15)) & ~msg);
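
The single-read variant being discussed can be sketched in user-space C with
an ordinary variable standing in for the SOFT_SCRATCH(15) register. The
function name and the bit positions of the two message flags are assumptions
made for this example only.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define SKETCH_MSG_CRASH_DUMP_POSTED (1u << 1)
#define SKETCH_MSG_FLUSH_LOG_BUFFER  (1u << 3)

/*
 * Read the register once, pick out the log related bits and clear only
 * those, leaving every other bit that was seen in that read untouched.
 * Returns the bits that were handled.
 */
static uint32_t sketch_ack_handled_bits(volatile uint32_t *reg)
{
	uint32_t val = *reg;
	uint32_t msg = val & (SKETCH_MSG_CRASH_DUMP_POSTED |
			      SKETCH_MSG_FLUSH_LOG_BUFFER);

	if (msg)
		*reg = val & ~msg;

	return msg;
}
```

In both the single-read and the double-read form, a bit set by the other
side after the last read and before the write is overwritten by that write;
the second read only shortens that window, it does not close it.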



Also I felt code looked better in current form, as macros
GUC2HOST_MSG_CRASH_DUMP_POSTED & GUC2HOST_MSG_FLUSH_LOG_BUFFER were used
only once.

Will change as per the initial implementation.

 u32 msg = I915_REA

Re: [Intel-gfx] [PATCH 16/20] drm/i915: Support to create write combined type vmaps

2016-08-12 Thread Goel, Akash



On 8/12/2016 4:19 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Chris Wilson 

vmaps has a provision for controlling the page protection bits, which
we can use to control the mapping type, e.g. WB, WC, UC or even WT.
To allow the caller to choose their mapping type, we add a parameter to
i915_gem_object_pin_map - but we still only allow one vmap to be cached
per object. If the object is currently not pinned, then we recreate the
previous vmap with the new access type, but if it was pinned we report an
error. This effectively limits the access via i915_gem_object_pin_map
to a
single mapping type for the lifetime of the object. Not usually a
problem,
but something to be aware of when setting up the object's vmap.
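
Since only one vmap is cached per object, its type has to be remembered
somewhere. The diff further down masks obj->mapping with PAGE_MASK before
unmapping, which suggests the type is kept in the low bits of the
page-aligned pointer. A user-space sketch of that idea follows; the names
and the 4096-byte page size are assumptions for the example.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_PAGE_MASK (~(uintptr_t)(SKETCH_PAGE_SIZE - 1))

enum sketch_map_type {
	SKETCH_MAP_WB = 0,
	SKETCH_MAP_WC,
};

/* Store the mapping type in the unused low bits of a page-aligned address. */
static void *sketch_pack_mapping(void *addr, enum sketch_map_type type)
{
	return (void *)((uintptr_t)addr | (uintptr_t)type);
}

/* Recover the real address, e.g. before handing it to an unmap routine. */
static void *sketch_mapping_addr(void *packed)
{
	return (void *)((uintptr_t)packed & SKETCH_PAGE_MASK);
}

/* Recover the mapping type that the cached mapping was created with. */
static enum sketch_map_type sketch_mapping_type(void *packed)
{
	return (enum sketch_map_type)((uintptr_t)packed & ~SKETCH_PAGE_MASK);
}
```

This works only because a vmap address is page aligned, so the low bits are
always zero and are free to carry a small tag.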

We will want to vary the access type to enable WC mappings of ringbuffer
and context objects on !llc platforms, as well as other objects where we
need coherent access to the GPU's pages without going through the GTT

v2: Remove the redundant braces around pin count check and fix the marker
 in documentation (Chris)

v3:
- Add a new enum for the vmalloc mapping type & pass that as an
argument to
   i915_object_pin_map. (Tvrtko)
- Use PAGE_MASK to extract or filter the mapping type info and remove a
   superfluous BUG_ON.(Tvrtko)

v4:
- Rename the enums and clean up the pin_map function. (Chris)

Signed-off-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_drv.h|  9 -
  drivers/gpu/drm/i915/i915_gem.c| 58
+++---
  drivers/gpu/drm/i915/i915_gem_dmabuf.c |  2 +-
  drivers/gpu/drm/i915/i915_guc_submission.c |  2 +-
  drivers/gpu/drm/i915/intel_lrc.c   |  8 ++---
  drivers/gpu/drm/i915/intel_ringbuffer.c|  2 +-
  6 files changed, 60 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
index 4bd3790..6603812 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -834,6 +834,11 @@ enum i915_cache_level {
  I915_CACHE_WT, /* hsw:gt3e WriteThrough for scanouts */
  };

+enum i915_map_type {
+I915_MAP_WB = 0,
+I915_MAP_WC,
+};
+
  struct i915_ctx_hang_stats {
  /* This context had batch pending when hang was declared */
  unsigned batch_pending;
@@ -3150,6 +3155,7 @@ static inline void
i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
  /**
   * i915_gem_object_pin_map - return a contiguous mapping of the
entire object
   * @obj - the object to map into kernel address space
+ * @map_type - whether the vmalloc mapping should be using WC or WB
pgprot_t
   *
   * Calls i915_gem_object_pin_pages() to prevent reaping of the object's
   * pages and then returns a contiguous mapping of the backing
storage into
@@ -3161,7 +3167,8 @@ static inline void
i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
   * Returns the pointer through which to access the mapped object, or an
   * ERR_PTR() on error.
   */
-void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object
*obj);
+void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object
*obj,
+enum i915_map_type map_type);

  /**
   * i915_gem_object_unpin_map - releases an earlier mapping
diff --git a/drivers/gpu/drm/i915/i915_gem.c
b/drivers/gpu/drm/i915/i915_gem.c
index 03548db..7dabbc3f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2077,10 +2077,11 @@ i915_gem_object_put_pages(struct
drm_i915_gem_object *obj)
  list_del(&obj->global_list);

  if (obj->mapping) {
-if (is_vmalloc_addr(obj->mapping))
-vunmap(obj->mapping);
+void *ptr = (void *)((uintptr_t)obj->mapping & PAGE_MASK);
+if (is_vmalloc_addr(ptr))
+vunmap(ptr);
  else
-kunmap(kmap_to_page(obj->mapping));
+kunmap(kmap_to_page(ptr));
  obj->mapping = NULL;
  }

@@ -2253,7 +2254,8 @@ i915_gem_object_get_pages(struct
drm_i915_gem_object *obj)
  }

  /* The 'mapping' part of i915_gem_object_pin_map() below */
-static void *i915_gem_object_map(const struct drm_i915_gem_object *obj)
+static void *i915_gem_object_map(const struct drm_i915_gem_object *obj,
+ enum i915_map_type type)
  {
  unsigned long n_pages = obj->base.size >> PAGE_SHIFT;
  struct sg_table *sgt = obj->pages;
@@ -2263,9 +2265,10 @@ static void *i915_gem_object_map(const struct
drm_i915_gem_object *obj)
  struct page **pages = stack_pages;
  unsigned long i = 0;
  void *addr;
+bool use_wc = (type == I915_MAP_WC);

  /* A single page can always be kmapped */
-if (n_pages == 1)
+if (n_pages == 1 && !use_wc)
  return kmap(sg_page(sgt->sgl));

  if (n_pages > ARRAY_SIZE(stack_pages)) {
@@ -2281,7 +2284,8 @@ static void *i915_gem_object_map(const struct
drm_i915_gem_object *obj)
  

Re: [Intel-gfx] [PATCH 09/20] drm/i915: New lock to serialize the Host2GuC actions

2016-08-12 Thread Goel, Akash



On 8/12/2016 7:25 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

With the addition of new Host2GuC actions related to GuC logging, there
is a need for a lock to serialize them, as they can execute concurrently
with each other and also with other existing actions.

v2: Use mutex in place of spinlock to serialize, as sleep can happen
 while waiting for the action's response from GuC. (Tvrtko)

Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
  drivers/gpu/drm/i915/intel_guc.h   | 3 +++
  2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1a2d648..cb9672b 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -88,6 +88,7 @@ static int host2guc_action(struct intel_guc *guc,
u32 *data, u32 len)
  return -EINVAL;

  intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+mutex_lock(&guc->action_lock);


I would probably take the mutex before grabbing forcewake as a general
rule. Not that I think it matters in this case since we don't expect any
contention on this one.

Yes, I did not expect any contention for this mutex, hence thought of using it 
just around the code where it is actually needed.
Will move it before the forcewake, as you suggested, to conform to the 
rule.


Best regards
Akash


  dev_priv->guc.action_count += 1;
  dev_priv->guc.action_cmd = data[0];
@@ -126,6 +127,7 @@ static int host2guc_action(struct intel_guc *guc,
u32 *data, u32 len)
  }
  dev_priv->guc.action_status = status;

+mutex_unlock(&guc->action_lock);
  intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);

  return ret;
@@ -1312,6 +1314,7 @@ int i915_guc_submission_init(struct
drm_i915_private *dev_priv)
  return -ENOMEM;

  ida_init(&guc->ctx_ids);
+mutex_init(&guc->action_lock);
  guc_create_log(guc);
  guc_create_ads(guc);

diff --git a/drivers/gpu/drm/i915/intel_guc.h
b/drivers/gpu/drm/i915/intel_guc.h
index 96ef7dc..e4ec8d8 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -156,6 +156,9 @@ struct intel_guc {

  uint64_t submissions[I915_NUM_ENGINES];
  uint32_t last_seqno[I915_NUM_ENGINES];
+
+/* To serialize the Host2GuC actions */
+struct mutex action_lock;
  };

  /* intel_guc_loader.c */



With or without the mutex vs forcewake ordering change:

Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko
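
As an illustration of the ordering discussed above, here is a minimal userspace sketch in which the lock is taken before the wake reference and both are released in reverse order. It is only a sketch under assumptions: a pthread mutex stands in for the kernel mutex, a plain counter stands in for forcewake, and `fake_guc`, `fake_guc_init()` and `fake_host2guc_action()` are hypothetical names, not the i915 code.

```c
#include <pthread.h>

/* Hypothetical stand-in for the driver state; not the i915 structures. */
struct fake_guc {
	pthread_mutex_t action_lock;	/* serializes host-to-GuC actions */
	int forcewake_refs;		/* models the forcewake get/put balance */
	unsigned int action_count;	/* number of actions sent so far */
};

static void fake_guc_init(struct fake_guc *guc)
{
	pthread_mutex_init(&guc->action_lock, NULL);
	guc->forcewake_refs = 0;
	guc->action_count = 0;
}

/* Take the lock first, then the wake reference; release in reverse order. */
static int fake_host2guc_action(struct fake_guc *guc)
{
	pthread_mutex_lock(&guc->action_lock);
	guc->forcewake_refs++;		/* forcewake "get" */

	guc->action_count++;		/* the serialized action itself */

	guc->forcewake_refs--;		/* forcewake "put" */
	pthread_mutex_unlock(&guc->action_lock);

	return 0;
}
```

With this ordering a caller that has to wait for the lock does not hold a wake reference while it sleeps.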

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 10/20] drm/i915: Add stats for GuC log buffer flush interrupts

2016-08-12 Thread Goel, Akash



On 8/12/2016 7:56 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain statistics of how many flush
interrupts were received and for which type of log buffer,
along with the overflow count of each buffer type.
Augmented i915_log_info debugfs to report back these statistics.

v2:
- Update the logic to detect multiple overflows between the 2
   flush interrupts and also log a message for overflow (Tvrtko)
- Track the number of times there was no free sub buffer to capture
   the GuC log buffer. (Tvrtko)

Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_debugfs.c| 28

  drivers/gpu/drm/i915/i915_guc_submission.c | 19 +++
  drivers/gpu/drm/i915/i915_irq.c|  2 ++
  drivers/gpu/drm/i915/intel_guc.h   |  7 +++
  4 files changed, 56 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
b/drivers/gpu/drm/i915/i915_debugfs.c
index 51b59d5..14e0dcf 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2539,6 +2539,32 @@ static int i915_guc_load_status_info(struct
seq_file *m, void *data)
  return 0;
  }

+static void i915_guc_log_info(struct seq_file *m,
+ struct drm_i915_private *dev_priv)
+{
+struct intel_guc *guc = &dev_priv->guc;
+
+seq_printf(m, "\nGuC logging stats:\n");
+
+seq_printf(m, "\tISR:   flush count %10u, overflow count %8u\n",
+guc->log.flush_count[GUC_ISR_LOG_BUFFER],
+guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]);
+
+seq_printf(m, "\tDPC:   flush count %10u, overflow count %8u\n",
+guc->log.flush_count[GUC_DPC_LOG_BUFFER],
+guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]);
+
+seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n",
+guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER],
+guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]);


Why is the width for overflow only 8 chars and not 10 like for flush
since both are u32?


Looks to be a discrepancy. I will check.
Both should be 10 as per the max value of u32, which takes 10 digits in 
decimal form.
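
To make the width point concrete, here is a small sketch (plain userspace C, with a hypothetical `format_stats()` helper that is not part of the patch) that formats both counters with a field width of 10, so that the largest u32 value still lines up:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats one stats line into a buffer, using the
 * same field width (10) for both u32 counters.
 * Returns the number of characters snprintf() would have written.
 */
static int format_stats(char *buf, size_t len, uint32_t flush, uint32_t overflow)
{
	return snprintf(buf, len, "flush count %10u, overflow count %10u",
			(unsigned int)flush, (unsigned int)overflow);
}
```

UINT32_MAX is 4294967295, which is exactly 10 decimal digits, so a width of 10 never has to grow for a u32.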





+
+seq_printf(m, "\tTotal flush interrupt count: %u\n",
+   guc->log.flush_interrupt_count);
+
+seq_printf(m, "\tCapture miss count: %u\n",
+   guc->log.capture_miss_count);
+}
+
  static void i915_guc_client_info(struct seq_file *m,
   struct drm_i915_private *dev_priv,
   struct i915_guc_client *client)
@@ -2613,6 +2639,8 @@ static int i915_guc_info(struct seq_file *m,
void *data)
  seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
  i915_guc_client_info(m, dev_priv, );

+i915_guc_log_info(m, dev_priv);
+
  /* Add more as required ... */

  return 0;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index cb9672b..1ca1866 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -913,6 +913,24 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
  sizeof(struct guc_log_buffer_state));
  buffer_size = log_buffer_state_local.size;

+guc->log.flush_count[i] += log_buffer_state_local.flush_to_file;
+if (log_buffer_state_local.buffer_full_cnt !=
+guc->log.prev_overflow_count[i]) {
+guc->log.total_overflow_count[i] +=
+(log_buffer_state_local.buffer_full_cnt -
+ guc->log.prev_overflow_count[i]);
+
+if (log_buffer_state_local.buffer_full_cnt <
+guc->log.prev_overflow_count[i]) {
+/* buffer_full_cnt is a 4 bit counter */
+guc->log.total_overflow_count[i] += 16;
+}
+
+guc->log.prev_overflow_count[i] =
+log_buffer_state_local.buffer_full_cnt;
+DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
+}
+
  if (log_buffer_snapshot_state) {
  /* First copy the state structure in local buffer */
  memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
@@ -953,6 +971,7 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
   * getting consumed by User at a slow rate.
   */
  DRM_ERROR_RATELIMITED("no sub-buffer to capture log buffer\n");
+guc->log.capture_miss_count++;
  }
  }

diff --git a/drivers/gpu/drm/i915/i915_irq.c
b/drivers/gpu/drm/i915/i915_irq.c
index d4d6f0a..b08d1d2 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1705,6 +1705,8 @@ static void gen9_guc_irq_handler(struct
drm_i915_private *dev_priv, 
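
The overflow accounting in the guc_read_update_log_buffer() hunk above can be sketched in isolation as follows. This is a userspace sketch, assuming (as the comment in the patch says) that the firmware counter is 4 bits wide; `overflow_stats` and `overflow_update()` are hypothetical names. Masking the unsigned difference handles the wrap-around without a separate "+= 16" branch.

```c
#include <stdint.h>

#define FULL_CNT_BITS 4u
#define FULL_CNT_MASK ((1u << FULL_CNT_BITS) - 1u)

/* Hypothetical bookkeeping for one log buffer type. */
struct overflow_stats {
	uint32_t prev;	/* last raw 4-bit counter value seen */
	uint32_t total;	/* accumulated overflow count */
};

/* Folds a new raw counter sample in; returns the number of new overflows. */
static uint32_t overflow_update(struct overflow_stats *s, uint32_t raw)
{
	uint32_t cur = raw & FULL_CNT_MASK;
	/* Unsigned subtraction modulo 16 copes with the counter wrapping. */
	uint32_t delta = (cur - s->prev) & FULL_CNT_MASK;

	s->total += delta;
	s->prev = cur;

	return delta;
}
```

Note that, like the logic in the patch, this cannot distinguish 16 or more overflows between two samples from fewer.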

Re: [Intel-gfx] [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer

2016-08-12 Thread Goel, Akash



On 8/12/2016 8:12 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it becomes
half full, so Driver doesn't really need to sample the complete buffer
and can just copy only the newly written data by GuC into the local
buffer, i.e. as per the read & write pointer values.
Moreover the flush interrupt would generally come for one type of log
buffer, when it becomes half full, so at that time the other 2 types of
log buffer would comparatively have much less unread data in them.
In case of overflow reported by GuC, Driver does need to copy the entire
buffer as the whole buffer would contain the unread data.

v2: Rebase.

Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 40
+-
  1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1ca1866..8e0f360 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -889,7 +889,8 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
  struct guc_log_buffer_state *log_buffer_state,
*log_buffer_snapshot_state;
  struct guc_log_buffer_state log_buffer_state_local;
  void *src_data_ptr, *dst_data_ptr;
-u32 i, buffer_size;
+bool new_overflow;
+u32 i, buffer_size, read_offset, write_offset, bytes_to_copy;

  if (!guc->log.buf_addr)
  return;
@@ -912,10 +913,13 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
  memcpy(&log_buffer_state_local, log_buffer_state,
  sizeof(struct guc_log_buffer_state));
  buffer_size = log_buffer_state_local.size;
+read_offset = log_buffer_state_local.read_ptr;
+write_offset = log_buffer_state_local.sampled_write_ptr;

  guc->log.flush_count[i] +=
log_buffer_state_local.flush_to_file;
  if (log_buffer_state_local.buffer_full_cnt !=
  guc->log.prev_overflow_count[i]) {


Wrong alignment. You can try checkpatch.pl for all of those.


Sorry for all the alignment & indentation issues.

Should the above condition be written like this ?

if (log_buffer_state_local.buffer_full_cnt !=
guc->log.prev_overflow_count[i]) {



+new_overflow = 1;


true/false since it is a bool

fine will do that.



  guc->log.total_overflow_count[i] +=
  (log_buffer_state_local.buffer_full_cnt -
   guc->log.prev_overflow_count[i]);
@@ -929,7 +933,8 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
  guc->log.prev_overflow_count[i] =
  log_buffer_state_local.buffer_full_cnt;
  DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
-}
+} else
+new_overflow = 0;

  if (log_buffer_snapshot_state) {
  /* First copy the state structure in local buffer */
@@ -941,13 +946,37 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
   * for consistency set the write pointer value to same
   * value of sampled_write_ptr in the snapshot buffer.
   */
-log_buffer_snapshot_state->write_ptr =
-log_buffer_snapshot_state->sampled_write_ptr;
+log_buffer_snapshot_state->write_ptr = write_offset;

  log_buffer_snapshot_state++;

  /* Now copy the actual logs */
  memcpy(dst_data_ptr, src_data_ptr, buffer_size);


The confusing bit - the memcpy above still copies the whole buffer, no?


Really very sorry for this blooper.

Best regards
Akash


+if (unlikely(new_overflow)) {
+/* copy the whole buffer in case of overflow */
+read_offset = 0;
+write_offset = buffer_size;
+} else if (unlikely((read_offset > buffer_size) ||
+(write_offset > buffer_size))) {
+DRM_ERROR("invalid log buffer state\n");
+/* copy whole buffer as offsets are unreliable */
+read_offset = 0;
+write_offset = buffer_size;
+}
+
+/* Just copy the newly written data */
+if (read_offset <= write_offset) {
+bytes_to_copy = write_offset - read_offset;
+memcpy(dst_data_ptr + read_offset,
+ src_data_ptr + read_offset, bytes_to_copy);
+} else {
+bytes_to_copy = buffer_size - read_offset;
+memcpy(dst_data_ptr + read_offset,
+ src_data_ptr + read_offset, bytes_to_copy);
+
+bytes_to_copy = write_offset;
+memcpy(dst_data_ptr, src_data_ptr, bytes_to_copy);
+}

  src_data_ptr += buffer_size;
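
For reference, the partial copy described in the commit message can be sketched as a standalone function. This is a userspace sketch with a hypothetical `copy_new_data()` helper, not the driver code: it copies only the bytes between the read and write offsets, handles the wrapped case with two copies, and falls back to copying the whole buffer on overflow or when the offsets are out of range.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copies the unread part of a circular buffer from
 * src to the same offsets in dst. Returns the number of bytes copied.
 */
static size_t copy_new_data(unsigned char *dst, const unsigned char *src,
			    size_t size, size_t rd, size_t wr, int overflowed)
{
	/* On overflow or unreliable offsets, copy the whole buffer. */
	if (overflowed || rd > size || wr > size) {
		rd = 0;
		wr = size;
	}

	if (rd <= wr) {
		/* Unread data is one contiguous range. */
		memcpy(dst + rd, src + rd, wr - rd);
		return wr - rd;
	}

	/* Wrapped: copy the tail first, then the head. */
	memcpy(dst + rd, src + rd, size - rd);
	memcpy(dst, src, wr);

	return (size - rd) + wr;
}
```

Unlike the hunk quoted above, this sketch has no unconditional memcpy of the full buffer ahead of the partial copy.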
  

Re: [Intel-gfx] [PATCH 05/20] drm/i915: Support for GuC interrupts

2016-08-12 Thread Goel, Akash



On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote:


On 12/08/16 14:10, Goel, Akash wrote:

On 8/12/2016 5:24 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Sagar Arun Kamble <sagar.a.kam...@intel.com>

There are certain types of interrupts which Host can receive from GuC.
GuC ukernel sends an interrupt to Host for certain events, like for
example retrieve/consume the logs generated by ukernel.
This patch adds support to receive interrupts from GuC but currently
enables & partially handles only the interrupt sent by GuC ukernel.
Future patches will add support for handling other interrupt types.

v2:
- Use common low level routines for PM IER/IIR programming (Chris)
- Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
- Replace disabling of wake ref asserts with rpm get/put (Chris)

v3:
- Update comments for more clarity. (Tvrtko)
- Remove the masking of GuC interrupt, which was kept masked till the
   start of bottom half, it's not really needed as there is only a
   single instance of work item & wq is ordered. (Tvrtko)

v4:
- Rebase.
- Rename guc_events to pm_guc_events so as to be indicative of the
   register/control block it is associated with. (Chris)
- Add handling for back to back log buffer flush interrupts.

v5:
- Move the read & clearing of register, containing Guc2Host message
   bits, outside the irq spinlock. (Tvrtko)

Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com>
Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/i915_drv.h|   1 +
  drivers/gpu/drm/i915/i915_guc_submission.c |   5 ++
  drivers/gpu/drm/i915/i915_irq.c| 100
+++--
  drivers/gpu/drm/i915/i915_reg.h|  11 
  drivers/gpu/drm/i915/intel_drv.h   |   3 +
  drivers/gpu/drm/i915/intel_guc.h   |   4 ++
  drivers/gpu/drm/i915/intel_guc_loader.c|   4 ++
  7 files changed, 124 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
index a608a5c..28ffac5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1779,6 +1779,7 @@ struct drm_i915_private {
  u32 pm_imr;
  u32 pm_ier;
  u32 pm_rps_events;
+u32 pm_guc_events;
  u32 pipestat_irq_mask[I915_MAX_PIPES];

  struct i915_hotplug hotplug;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index ad3b55f..c7c679f 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1071,6 +1071,8 @@ int intel_guc_suspend(struct drm_device *dev)
  if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
  return 0;

+gen9_disable_guc_interrupts(dev_priv);
+
  ctx = dev_priv->kernel_context;

  data[0] = HOST2GUC_ACTION_ENTER_S_STATE;
@@ -1097,6 +1099,9 @@ int intel_guc_resume(struct drm_device *dev)
  if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
  return 0;

+if (i915.guc_log_level >= 0)
+gen9_enable_guc_interrupts(dev_priv);
+
  ctx = dev_priv->kernel_context;

  data[0] = HOST2GUC_ACTION_EXIT_S_STATE;
diff --git a/drivers/gpu/drm/i915/i915_irq.c
b/drivers/gpu/drm/i915/i915_irq.c
index 5f93309..5f1974f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct
drm_i915_private *dev_priv,
  } while (0)

  static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv,
u32 pm_iir);
+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
u32 pm_iir);

  /* For display hotplug interrupt */
  static inline void
@@ -411,6 +412,38 @@ void gen6_disable_rps_interrupts(struct
drm_i915_private *dev_priv)
  gen6_reset_rps_interrupts(dev_priv);
  }

+void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+spin_lock_irq(&dev_priv->irq_lock);
+gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events);
+spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+spin_lock_irq(&dev_priv->irq_lock);
+if (!dev_priv->guc.interrupts_enabled) {
+WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) &
+dev_priv->pm_guc_events);
+dev_priv->guc.interrupts_enabled = true;
+gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events);
+}
+spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+spin_lock_irq(&dev_priv->irq_lock);
+dev_priv->guc.interrupts_enabled = false;
+
+gen6_disable_pm_irq(dev_priv, dev_priv->pm_guc_events);
+
+spin_unlock_irq(&dev_priv->irq_lock);
+synchronize_irq(dev_priv->drm.irq);
+
+gen9_reset_guc_interrupts(dev_priv);
+}
+
  /**
   * bdw_update_port_irq - update DE port interrup

Re: [Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC

2016-08-12 Thread Goel, Akash



On 8/12/2016 6:47 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

GuC ukernel sends an interrupt to Host to flush the log buffer
and expects Host to correspondingly update the read pointer
information in the state structure, once it has consumed the
log buffer contents by copying them to a file or buffer.
Even if Host couldn't copy the contents, it can still update the
read pointer so that logging state is not disturbed on GuC side.

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of
   crash buffer area for regular cases and copying only the state
   structure data in first page.

v3:
  - Create a vmalloc mapping of log buffer. (Chris)
  - Cover the flush acknowledgment under rpm get & put.(Chris)
  - Revert the change of skipping the copy of crash dump area, as
not really needed, will be covered by subsequent patch.

v4:
  - Destroy the wq under the same condition in which it was created,
pass dev_priv pointer instead of dev to newly added GuC function,
add more comments & rename variable for clarity. (Tvrtko)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_drv.c|  14 +++
  drivers/gpu/drm/i915/i915_guc_submission.c | 150
+
  drivers/gpu/drm/i915/i915_irq.c|   5 +-
  drivers/gpu/drm/i915/intel_guc.h   |   3 +
  4 files changed, 170 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c
b/drivers/gpu/drm/i915/i915_drv.c
index 0fcd1c0..fc2da32 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -770,8 +770,20 @@ static int i915_workqueues_init(struct
drm_i915_private *dev_priv)
  if (dev_priv->hotplug.dp_wq == NULL)
  goto out_free_wq;

+if (HAS_GUC_SCHED(dev_priv)) {


This just reminded me that a previous patch had:

+if (HAS_GUC_UCODE(dev))
+dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT;

In the interrupt setup. I don't think there is a bug right now, but
there is a disagreement between the two which would be good to resolve.

This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED
for correctness. I think.


Sorry for the inconsistency, will use HAS_GUC_SCHED in the previous patch.

As per Chris's comments, I will move the wq init/destroy to the GuC logging 
setup/teardown routines (guc_create_log_extras, guc_log_cleanup).

Are you fine with that?




+/* Need a dedicated wq to process log buffer flush interrupts
+ * from GuC without much delay so as to avoid any loss of logs.
+ */
+dev_priv->guc.log.wq =
+alloc_ordered_workqueue("i915-guc_log", 0);
+if (dev_priv->guc.log.wq == NULL)
+goto out_free_hotplug_dp_wq;
+}
+
  return 0;

+out_free_hotplug_dp_wq:
+destroy_workqueue(dev_priv->hotplug.dp_wq);
  out_free_wq:
  destroy_workqueue(dev_priv->wq);
  out_err:
@@ -782,6 +794,8 @@ out_err:

  static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
  {
+if (HAS_GUC_SCHED(dev_priv))
+destroy_workqueue(dev_priv->guc.log.wq);
  destroy_workqueue(dev_priv->hotplug.dp_wq);
  destroy_workqueue(dev_priv->wq);
  }
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index c7c679f..2635b67 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct
intel_guc *guc,
  return host2guc_action(guc, data, ARRAY_SIZE(data));
  }

+static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
+{
+u32 data[1];
+
+data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
+
+return host2guc_action(guc, data, 1);
+}
+
  /*
   * Initialise, update, or clear doorbell data shared with the GuC
   *
@@ -840,6 +849,127 @@ err:
  return NULL;
  }

+static void guc_move_to_next_buf(struct intel_guc *guc)
+{
+return;
+}
+
+static void* guc_get_write_buffer(struct intel_guc *guc)
+{
+return NULL;
+}
+
+static void guc_read_update_log_buffer(struct intel_guc *guc)
+{
+struct guc_log_buffer_state *log_buffer_state,
*log_buffer_snapshot_state;
+struct guc_log_buffer_state log_buffer_state_local;
+void *src_data_ptr, *dst_data_ptr;
+u32 i, buffer_size;


unsigned int i if you can be bothered.


Fine, will do that for both i & buffer_size.

But I remember that earlier, in one of the patches, you suggested using u32 as 
the type for some variables.

Could you please share the guideline?
Should u32, u64 be used only when we are exactly sure of the range of the 
variable, like for variables containing register values?






+
+if (!guc->log.buf_addr)
+return;


Can it hit this? If yes, I think better 

Re: [Intel-gfx] [PATCH 05/20] drm/i915: Support for GuC interrupts

2016-08-12 Thread Goel, Akash



On 8/12/2016 5:24 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

There are certain types of interrupts which Host can receive from GuC.
GuC ukernel sends an interrupt to Host for certain events, like for
example retrieve/consume the logs generated by ukernel.
This patch adds support to receive interrupts from GuC but currently
enables & partially handles only the interrupt sent by GuC ukernel.
Future patches will add support for handling other interrupt types.

v2:
- Use common low level routines for PM IER/IIR programming (Chris)
- Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
- Replace disabling of wake ref asserts with rpm get/put (Chris)

v3:
- Update comments for more clarity. (Tvrtko)
- Remove the masking of GuC interrupt, which was kept masked till the
   start of bottom half, it's not really needed as there is only a
   single instance of work item & wq is ordered. (Tvrtko)

v4:
- Rebase.
- Rename guc_events to pm_guc_events so as to be indicative of the
   register/control block it is associated with. (Chris)
- Add handling for back to back log buffer flush interrupts.

v5:
- Move the read & clearing of register, containing Guc2Host message
   bits, outside the irq spinlock. (Tvrtko)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_drv.h|   1 +
  drivers/gpu/drm/i915/i915_guc_submission.c |   5 ++
  drivers/gpu/drm/i915/i915_irq.c| 100
+++--
  drivers/gpu/drm/i915/i915_reg.h|  11 
  drivers/gpu/drm/i915/intel_drv.h   |   3 +
  drivers/gpu/drm/i915/intel_guc.h   |   4 ++
  drivers/gpu/drm/i915/intel_guc_loader.c|   4 ++
  7 files changed, 124 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
index a608a5c..28ffac5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1779,6 +1779,7 @@ struct drm_i915_private {
  u32 pm_imr;
  u32 pm_ier;
  u32 pm_rps_events;
+u32 pm_guc_events;
  u32 pipestat_irq_mask[I915_MAX_PIPES];

  struct i915_hotplug hotplug;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index ad3b55f..c7c679f 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1071,6 +1071,8 @@ int intel_guc_suspend(struct drm_device *dev)
  if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
  return 0;

+gen9_disable_guc_interrupts(dev_priv);
+
  ctx = dev_priv->kernel_context;

  data[0] = HOST2GUC_ACTION_ENTER_S_STATE;
@@ -1097,6 +1099,9 @@ int intel_guc_resume(struct drm_device *dev)
  if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
  return 0;

+if (i915.guc_log_level >= 0)
+gen9_enable_guc_interrupts(dev_priv);
+
  ctx = dev_priv->kernel_context;

  data[0] = HOST2GUC_ACTION_EXIT_S_STATE;
diff --git a/drivers/gpu/drm/i915/i915_irq.c
b/drivers/gpu/drm/i915/i915_irq.c
index 5f93309..5f1974f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct
drm_i915_private *dev_priv,
  } while (0)

  static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv,
u32 pm_iir);
+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
u32 pm_iir);

  /* For display hotplug interrupt */
  static inline void
@@ -411,6 +412,38 @@ void gen6_disable_rps_interrupts(struct
drm_i915_private *dev_priv)
  gen6_reset_rps_interrupts(dev_priv);
  }

+void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+spin_lock_irq(&dev_priv->irq_lock);
+gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events);
+spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+spin_lock_irq(&dev_priv->irq_lock);
+if (!dev_priv->guc.interrupts_enabled) {
+WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) &
+dev_priv->pm_guc_events);
+dev_priv->guc.interrupts_enabled = true;
+gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events);
+}
+spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+spin_lock_irq(&dev_priv->irq_lock);
+dev_priv->guc.interrupts_enabled = false;
+
+gen6_disable_pm_irq(dev_priv, dev_priv->pm_guc_events);
+
+spin_unlock_irq(&dev_priv->irq_lock);
+synchronize_irq(dev_priv->drm.irq);
+
+gen9_reset_guc_interrupts(dev_priv);
+}
+
  /**
   * bdw_update_port_irq - update DE port interrupt
   * @dev_priv: driver private
@@ -1167,6 +1200,21 @@ static void gen6_pm_rps_work(struct work_struct
*work)
  mutex_unlock(&dev_priv->rps.hw_lock);
  }

+static void 

Re: [Intel-gfx] [PATCH 14/20] drm/i915: Forcefully flush GuC log buffer on reset

2016-08-12 Thread Goel, Akash



On 8/12/2016 12:03 PM, Chris Wilson wrote:

On Fri, Aug 12, 2016 at 11:55:17AM +0530, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

Before capturing the GuC logs as a part of error state, there should be a
force log buffer flush action sent to GuC before proceeding with GPU reset
and re-initializing GUC. There could be some data in the log buffer which is
yet to be captured and those logs would be particularly useful to understand
why the GPU reset was initiated.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_gpu_error.c  |  2 ++
 drivers/gpu/drm/i915/i915_guc_submission.c | 27 +++
 drivers/gpu/drm/i915/intel_guc.h   |  1 +
 3 files changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 561b523..5e358e2 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1232,6 +1232,8 @@ static void i915_gem_capture_guc_log_buffer(struct 
drm_i915_private *dev_priv,
if (!dev_priv->guc.log.obj)
return;

+   i915_guc_flush_logs(dev_priv);


This is an invalid context for this function, flush_work() is illegal
inside error capture.


Actually the concerned work item should not take much time to execute, 
and it also doesn't acquire any locks on which it could get blocked.


Should there be no wait whatsoever in error capture?
If so, will have to drop this patch.

Best regards
Akash

-Chris




Re: [Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC

2016-08-12 Thread Goel, Akash



On 8/12/2016 12:21 PM, Chris Wilson wrote:

On Fri, Aug 12, 2016 at 12:14:28PM +0530, Goel, Akash wrote:



On 8/12/2016 11:58 AM, Chris Wilson wrote:

On Fri, Aug 12, 2016 at 11:55:09AM +0530, akash.g...@intel.com wrote:

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 0fcd1c0..fc2da32 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
{
+   if (HAS_GUC_SCHED(dev_priv))
+   destroy_workqueue(dev_priv->guc.log.wq);


if (dev_priv->guc.log.wq)
destroy_workqueue(dev_priv->guc.log.wq);

This shouldn't be here, but in guc teardown.

Likewise this is


Fine will move it to GuC teardown.


@@ -770,8 +770,20 @@ static int i915_workqueues_init(struct drm_i915_private 
*dev_priv)
if (dev_priv->hotplug.dp_wq == NULL)
goto out_free_wq;

+   if (HAS_GUC_SCHED(dev_priv)) {
+   /* Need a dedicated wq to process log buffer flush interrupts
+* from GuC without much delay so as to avoid any loss of logs.
+*/
+   dev_priv->guc.log.wq =


creating guc specific wq, not drm_i915_private's. They can even be
managed by guc.log?

Sorry for the inconsistency here, but didn't get your question.
dev_priv->guc.log.wq


Just somewhere inside guc, I was just noting that you probably already
have setup/teardown for dev_priv->guc.log itself.


Fine, will move the dedicated wq creation/destruction into the
setup/teardown routines for guc.log.
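
A rough sketch of what that could look like, in plain userspace C: the optional resource is owned by the log state, is created in its setup routine, and is destroyed in its teardown routine only if it exists (the `if (wq)` check suggested above). All names here (`fake_log`, `fake_wq`, `fake_log_setup()`, `fake_log_teardown()`) are hypothetical, and malloc()/free() stand in for workqueue allocation and destruction.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; not the i915 structures. */
struct fake_wq {
	int unused;
};

struct fake_log {
	struct fake_wq *wq;	/* optional, only created when wanted */
	void *buf;		/* always created */
};

/* Returns 0 on success, -1 on failure with nothing left allocated. */
static int fake_log_setup(struct fake_log *log, int want_wq)
{
	log->wq = NULL;

	log->buf = malloc(64);
	if (!log->buf)
		return -1;

	if (want_wq) {
		log->wq = malloc(sizeof(*log->wq));
		if (!log->wq)
			goto err_free_buf;
	}

	return 0;

err_free_buf:
	free(log->buf);
	log->buf = NULL;
	return -1;
}

/* Tears down whatever setup created; keyed on the pointer, not a feature flag. */
static void fake_log_teardown(struct fake_log *log)
{
	if (log->wq) {
		free(log->wq);
		log->wq = NULL;
	}

	free(log->buf);
	log->buf = NULL;
}
```

Checking the pointer in teardown keeps creation and destruction consistent even if the condition used at setup time changes later.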

Best Regards
Akash


-Chris




Re: [Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC

2016-08-12 Thread Goel, Akash



On 8/12/2016 11:58 AM, Chris Wilson wrote:

On Fri, Aug 12, 2016 at 11:55:09AM +0530, akash.g...@intel.com wrote:

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 0fcd1c0..fc2da32 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
 static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
 {
+   if (HAS_GUC_SCHED(dev_priv))
+   destroy_workqueue(dev_priv->guc.log.wq);


if (dev_priv->guc.log.wq)
destroy_workqueue(dev_priv->guc.log.wq);

This shouldn't be here, but in guc teardown.

Likewise this is


Fine will move it to GuC teardown.


@@ -770,8 +770,20 @@ static int i915_workqueues_init(struct drm_i915_private 
*dev_priv)
if (dev_priv->hotplug.dp_wq == NULL)
goto out_free_wq;

+   if (HAS_GUC_SCHED(dev_priv)) {
+   /* Need a dedicated wq to process log buffer flush interrupts
+* from GuC without much delay so as to avoid any loss of logs.
+*/
+   dev_priv->guc.log.wq =


creating guc specific wq, not drm_i915_private's. They can even be
managed by guc.log?

Sorry for the inconsistency here, but didn't get your question.
dev_priv->guc.log.wq

dev_priv->guc.events_work

Best regards
Akash


-Chris




Re: [Intel-gfx] [PATCH 17/17] drm/i915: Use rt priority kthread to do GuC log buffer sampling

2016-07-21 Thread Goel, Akash



On 7/21/2016 11:13 AM, Chris Wilson wrote:

On Thu, Jul 21, 2016 at 09:11:42AM +0530, Goel, Akash wrote:



On 7/21/2016 1:04 AM, Chris Wilson wrote:

In the end, just the silly locking and placement of complete_all() is
dangerous. reinit_completion() lacks the barrier to be used like this
really, at any rate, racy with the irq handler, so use sparingly or when
you control the irq handler.

Sorry, I forgot to add a comment that
guc_cancel_log_flush_work_sync() should be invoked only after
ensuring that there will be no more flush interrupts, which is
achieved either by explicitly disabling the interrupt or by disabling the
logging, and that's what is done at the 2 call sites.

Since I had covered reinit_completion() under the irq_lock, I thought an
explicit barrier was not needed.


You hadn't controlled everything via the irq_lock, and nor should you.



spin_lock_irq(&dev_priv->irq_lock);
if (guc->log.flush_signal) {
guc->log.flush_signal = false;
reinit_completion(&guc->log.flush_completion);
spin_unlock_irq(&dev_priv->irq_lock);
i915_guc_capture_logs(&dev_priv->drm);
complete_all(&guc->log.flush_completion);

The placement of complete_all() isn't right for the case where
guc_cancel_log_flush_work_sync() is called but no prior
flush interrupt was received.


Exactly.


(Also not sure if log.signal = 0 is sane,

Did log.signal = 0 for fast cancellation. Will remove that.

A smp_wmb() after reinit_completion(&flush_completion) would be fine?


Don't worry, the race can only be controlled by controlling the irq.




In the end, I think something more like

while (signal) ...

complete_all();
schedule();
reinit_completion();

is the simplest.

Thanks much, so will have the task body like this.
do {
    set_current_state(TASK_INTERRUPTIBLE);
    while (cmpxchg(&log.signal, 1, 0)) {
        i915_guc_capture_logs();
    }
    complete_all(log.complete);
    if (kthread_should_stop())
        break;
    schedule();
    reinit_completion();
} while (1);
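
The drain step in the loop above, "while (cmpxchg(&signal, 1, 0))", can be tried out in user space. The following is only a sketch under assumptions: C11 atomics stand in for the kernel's cmpxchg() and smp_store_mb(), and flush_signal, captures, capture_logs_stub(), raise_flush_signal() and drain_flush_signal() are hypothetical names that are not part of the patch. It shows that several signals raised before the worker runs collapse into a single capture pass.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for guc->log.flush_signal. */
static atomic_int flush_signal;
/* Counts how often the stubbed capture ran. */
static int captures;

/* Stub for i915_guc_capture_logs(); the real one copies the log buffer. */
static void capture_logs_stub(void)
{
    captures++;
}

/* User-space analogue of smp_store_mb(log.signal, 1) in the irq handler. */
static void raise_flush_signal(void)
{
    atomic_store(&flush_signal, 1);
}

/*
 * User-space analogue of "while (cmpxchg(&log.signal, 1, 0)) capture();".
 * Returns the number of captures done in this pass.
 */
static int drain_flush_signal(void)
{
    int done = 0;
    int expected = 1;

    while (atomic_compare_exchange_strong(&flush_signal, &expected, 0)) {
        capture_logs_stub();
        done++;
        /* A failed exchange overwrites 'expected'; keep it at 1 for the retry. */
        expected = 1;
    }
    return done;
}
```

Calling raise_flush_signal() twice and then drain_flush_signal() once performs one capture and leaves the signal at 0; a second drain does nothing. The sketch deliberately leaves out the completion and the sleep/wakeup handling, which need the real kernel primitives.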




or the current callsites really require the flush.)


Sync against an ongoing/pending flush is being done for the 2
forceful flush cases, which will be effective only if the pending
flush is completed, so forceful flush should be serialized with a
pending flush.


Or you just signal=true, wakeup task, wait_timeout. Otherwise you
haven't really serialized anything without disabling the interrupt.

Agree, without disabling the interrupt serialization cannot be provided.

For the sync can use,
{
WARN_ON(guc->interrupts_enabled);
wait_for_completion_interruptible_timeout(
guc->log.complete, 5 /* in jiffies*/);
}


Best regards
Akash

-Chris




Re: [Intel-gfx] [PATCH 17/17] drm/i915: Use rt priority kthread to do GuC log buffer sampling

2016-07-20 Thread Goel, Akash



On 7/21/2016 1:04 AM, Chris Wilson wrote:

On Sun, Jul 10, 2016 at 07:11:24PM +0530, akash.g...@intel.com wrote:

@@ -1707,8 +1692,8 @@ static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir)
I915_READ(SOFT_SCRATCH(15)) & ~msg);

/* Handle flush interrupt event in bottom half */
-   queue_work(dev_priv->guc.log.wq,
-   &dev_priv->guc.events_work);
+   smp_store_mb(dev_priv->guc.log.flush_signal, 1);
+   wake_up_process(dev_priv->guc.log.flush_task);
}



+void guc_cancel_log_flush_work_sync(struct drm_i915_private *dev_priv)
+{
+   spin_lock_irq(&dev_priv->irq_lock);
+   dev_priv->guc.log.flush_signal = false;
+   spin_unlock_irq(&dev_priv->irq_lock);
+
+   if (dev_priv->guc.log.flush_task)
+       wait_for_completion(&dev_priv->guc.log.flush_completion);
+}
+
+static int guc_log_flush_worker(void *arg)
+{
+   struct drm_i915_private *dev_priv = arg;
+   struct intel_guc *guc = &dev_priv->guc;
+
+   /* Install ourselves with high priority to reduce signalling latency */
+   struct sched_param param = { .sched_priority = 1 };
+   sched_setscheduler_nocheck(current, SCHED_FIFO, &param);
+
+   do {
+       set_current_state(TASK_INTERRUPTIBLE);
+
+       spin_lock_irq(&dev_priv->irq_lock);
+       if (guc->log.flush_signal) {
+           guc->log.flush_signal = false;
+           reinit_completion(&guc->log.flush_completion);
+           spin_unlock_irq(&dev_priv->irq_lock);
+           i915_guc_capture_logs(&dev_priv->drm);
+           complete_all(&guc->log.flush_completion);
+       } else {
+           spin_unlock_irq(&dev_priv->irq_lock);
+           if (kthread_should_stop())
+               break;
+
+           schedule();
+       }
+   } while (1);
+   __set_current_state(TASK_RUNNING);
+
+   return 0;


This looks decidedly fishy.


Sorry for that.


irq handler:
    smp_store_mb(log.signal, 1);
    wake_up_process(log.tsk);

worker:
    do {
        set_current_state(TASK_INTERRUPTIBLE);

        while (cmpxchg(&log.signal, 1, 0)) {
            reinit_completion(log.complete);
            i915_guc_capture_logs();
        }

        complete_all(log.complete);
        if (kthread_should_stop())
            break;

        schedule();
    } while (1);
    __set_current_state(TASK_RUNNING);

flush:
    smp_store_mb(log.signal, 0);
    wait_for_completion(log.complete);


In the end, just the silly locking and placement of complete_all() is
dangerous. reinit_completion() lacks the barrier to be used like this
really, at any rate, racy with the irq handler, so use sparingly or when
you control the irq handler.

Sorry I forgot to add a comment that guc_cancel_log_flush_work_sync()
should be invoked only after ensuring that there will be no more flush
interrupts, which will happen either by explicitly disabling the
interrupt or disabling the logging and that's what is done at the 2 call
sites.


Since I had covered reinit_completion() under the irq_lock, I thought an
explicit barrier was not needed.


spin_lock_irq(&dev_priv->irq_lock);
if (guc->log.flush_signal) {
    guc->log.flush_signal = false;
    reinit_completion(&guc->log.flush_completion);
    spin_unlock_irq(&dev_priv->irq_lock);
    i915_guc_capture_logs(&dev_priv->drm);
    complete_all(&guc->log.flush_completion);

The placement of complete_all isn't right for the case, where
a guc_cancel_log_flush_work_sync() is called but there was no prior 
flush interrupt received.



(Also not sure if log.signal = 0 is sane,

Did log.signal = 0 for fast cancellation. Will remove that.

A smp_wmb() after reinit_completion(&flush_completion) would be fine?


or the current callsites really require the flush.)


Sync against an ongoing/pending flush is being done for the 2 forceful
flush cases, which will be effective only if the pending flush is 
completed, so forceful flush should be serialized with a pending flush.


Best regards
Akash


-Chris




Re: [Intel-gfx] [PATCH 09/17] drm/i915: Debugfs support for GuC logging control

2016-07-20 Thread Goel, Akash



On 7/20/2016 5:20 PM, Tvrtko Ursulin wrote:


On 20/07/16 12:29, Goel, Akash wrote:

On 7/20/2016 4:10 PM, Tvrtko Ursulin wrote:

On 20/07/16 11:12, Goel, Akash wrote:

On 7/20/2016 3:17 PM, Tvrtko Ursulin wrote:



+DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
+NULL, i915_guc_log_control_set,
+"0x%08llx\n");


Does the readback still work with no get method?


readback will give a 'Permission denied' error


Is that what we want? I think it would be nice to allow read-back unless
there is a specific reason why it shouldn't be allowed.



Ok can implement a dummy read back function but what should be
shown/returned on read.

Should I show/return the guc_log_level value (which is also available
from /sys/module/i915/parameters/) ?


I would return the same value that was written in. Is the problem that
it is not stored anywhere? Maybe reconstruct it from
i915.guc_log_level ?



The verbosity value will be same as guc_log_level. But whether logging
on GuC side is currently enabled or disabled can't be inferred (it could
have been disabled at run time).
So will have to store the exact value written by User.


That's what I meant. Code currently seems to decompose the value written
via debugfs and store it in i915.guc_log_level:

0x00 = -1
0x10 = -1
...
0x01 = 0
0x11 = 1
0x21 = 2
0x31 = 3
...

So for readback you could translate back from i915.guc_log_level to the
debugfs format.
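
The translation described above can be sketched in plain C. This is a minimal illustration, assuming the layout given in the commit message (bit 0 enables logging, bits 4-7 carry the verbosity); the macro and function names are hypothetical and do not come from the patch.

```c
#include <stdint.h>

/* Assumed layout from the commit message: bit 0 = enable, bits 4-7 = verbosity. */
#define CTRL_ENABLE_BIT      0x1u
#define CTRL_VERBOSITY_SHIFT 4
#define CTRL_VERBOSITY_MASK  0xFu

/* Value written to debugfs -> module-parameter style level (-1 means disabled). */
static int ctrl_to_log_level(uint64_t val)
{
    if (!(val & CTRL_ENABLE_BIT))
        return -1;
    return (int)((val >> CTRL_VERBOSITY_SHIFT) & CTRL_VERBOSITY_MASK);
}

/* Module-parameter style level -> value that could be reported on read-back. */
static uint64_t log_level_to_ctrl(int level)
{
    if (level < 0)
        return 0;
    return (((uint64_t)level & CTRL_VERBOSITY_MASK) << CTRL_VERBOSITY_SHIFT) |
           CTRL_ENABLE_BIT;
}
```

One consequence of reconstructing the value this way is that a write of 0x10 (verbosity set, logging disabled) would read back as 0x00, because the level alone does not remember the verbosity bits of a disabled setting.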


Sorry for all the mess.

Should I add a new field 'debugfs_ctrl_val' in guc structure, to store
the value previously written to debugfs file, considering guc_log_level
only gives an indication of the verbosity level ?

Actually in future there may be other additions also to the value
written to guc_log_control debugfs, have right now exposed only logging
& verbosity level controls to User, as they are deemed most useful right
now.
But there are some other controls also which can be passed to GuC
firmware through UK_LOG_ENABLE_LOGGING host2guc action.


I see.

Would it work, for time being at least, to set i915.guc_log_level to -1
when logging is disabled via debugfs?

I had actually thought about this, but didn't pursue it since doing so
means adjusting some of the guc_log_level related asserts/conditions.

Will do it now as currently this looks to be the best alternative.
Thanks much for the inputs.

Best regards
Akash


I think that also has the advantage of making the current guc logging
state consistent when observed from the outside. Otherwise the debugfs
value and module parameter may disagree on it, as you said before. Which
is not that great I think.

Apart from making the reported state consistent, that way you could, at
least for the time being, get away without storing a copy of
guc_log_control but reconstruct it from the module parameter on read-back.

Regards,

Tvrtko



You could avoid storing a copy of guc_log_control like that.






Re: [Intel-gfx] [PATCH 09/17] drm/i915: Debugfs support for GuC logging control

2016-07-20 Thread Goel, Akash



On 7/20/2016 4:10 PM, Tvrtko Ursulin wrote:


On 20/07/16 11:12, Goel, Akash wrote:

On 7/20/2016 3:17 PM, Tvrtko Ursulin wrote:

+ret = -EINVAL;
+goto end;
+}
+
+intel_runtime_pm_get(dev_priv);
+ret = i915_guc_log_control(dev, val);
+intel_runtime_pm_put(dev_priv);
+
+end:
+mutex_unlock(&dev->struct_mutex);
+return ret;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
+NULL, i915_guc_log_control_set,
+"0x%08llx\n");


Does the readback still work with no get method?


readback will give a 'Permission denied' error


Is that what we want? I think it would be nice to allow read-back unless
there is a specific reason why it shouldn't be allowed.



Ok can implement a dummy read back function but what should be
shown/returned on read.

Should I show/return the guc_log_level value (which is also available
from /sys/module/i915/parameters/) ?


I would return the same value that was written in. Is the problem that
it is not stored anywhere? Maybe reconstruct it from
i915.guc_log_level ?



The verbosity value will be same as guc_log_level. But whether logging
on GuC side is currently enabled or disabled can't be inferred (it could
have been disabled at run time).
So will have to store the exact value written by User.


That's what I meant. Code currently seems to decompose the value written
via debugfs and store it in i915.guc_log_level:

0x00 = -1
0x10 = -1
...
0x01 = 0
0x11 = 1
0x21 = 2
0x31 = 3
...

So for readback you could translate back from i915.guc_log_level to the
debugfs format.


Sorry for all the mess.

Should I add a new field 'debugfs_ctrl_val' in guc structure, to store 
the value previously written to debugfs file, considering guc_log_level 
only gives an indication of the verbosity level ?


Actually in future there may be other additions also to the value 
written to guc_log_control debugfs, have right now exposed only logging 
& verbosity level controls to User, as they are deemed most useful right 
now.
But there are some other controls also which can be passed to GuC 
firmware through UK_LOG_ENABLE_LOGGING host2guc action.


Best regards
Akash


Although I have suggested below even more...


Although it is not ideal that we got two formats for the same thing.
Thinking about that, why not use the same format in debugfs as for the
module param?


... that why do we have to have two formats? Isn't that a bit confusing?

Why couldn't we use the same integer values from i915.guc_log_level for
debugfs control ?





And I forgot, i915.guc_log_level == 0 is logging enabled with minimum
verbosity?


i915.guc_log_level == 0 just indicates the minimum verbosity. But
logging could still be disabled on GuC side.


Yes, I can't remember any precedent where zero means enabled so it is
just weird. But it is too late to change it now. :(


For example, Driver boots with 'i915.guc_log_level = 0' so logging is
enabled, later User disables the logging by echoing 0x0 on the
guc_log_control debugfs file.


That's fine

Regards,

Tvrtko



Re: [Intel-gfx] [PATCH 09/17] drm/i915: Debugfs support for GuC logging control

2016-07-20 Thread Goel, Akash



On 7/20/2016 3:17 PM, Tvrtko Ursulin wrote:


On 20/07/16 10:32, Goel, Akash wrote:



On 7/20/2016 2:38 PM, Tvrtko Ursulin wrote:


On 20/07/16 05:42, Goel, Akash wrote:

On 7/19/2016 4:54 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Sagar Arun Kamble <sagar.a.kam...@intel.com>

This patch provides debugfs interface i915_guc_log_control for
on the fly enabling/disabling of logging in GuC firmware and controlling
the verbosity level of logs.
The value written to the file should have bit 0 set to enable logging,
and bits 4-7 should contain the verbosity info.

v2: Add a forceful flush, to collect left over logs, on disabling logging.
 Useful for Validation.

Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com>
Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/i915_debugfs.c| 32 -
  drivers/gpu/drm/i915/i915_guc_submission.c | 57
++
  drivers/gpu/drm/i915/intel_guc.h   |  1 +
  3 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
b/drivers/gpu/drm/i915/i915_debugfs.c
index 5e35565..3c9c7f7 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2644,6 +2644,35 @@ static int i915_guc_log_dump(struct seq_file
*m, void *data)
  return 0;
  }

+static int
+i915_guc_log_control_set(void *data, u64 val)
+{
+struct drm_device *dev = data;
+struct drm_i915_private *dev_priv = dev->dev_private;


to_i915 should be used.

Sorry for missing this, need to use this at other places also.




+int ret;
+
+ret = mutex_lock_interruptible(&dev->struct_mutex);
+if (ret)
+return ret;
+
+if (!i915.enable_guc_submission || !dev_priv->guc.log.obj) {


Wouldn't guc.log.obj be enough?


Actually failure in allocation of log buffer, at boot time, is not
considered fatal and submission through GuC is still done.
So i915.enable_guc_submission could be 1 with guc.log.obj as NULL.


If guc.log.obj is NULL it will return -EINVAL without trying to create
it here. If you intended for this function to try and create the log
object if not already present, via i915_guc_log_control, in that case
the condition above should only be if (!i915.enable_guc_submission), no?


If guc.log.obj is found to be NULL, we consider that logging can't be
enabled at run time. Allocation of the log buffer is supposed to be done
at boot time only, otherwise GuC would have to be reset & firmware
reloaded to pass the log buffer address at run time, which is probably
not desirable. That's why the first patch decoupled the allocation of
log buffer from log_level value.


Okay so why then the check above shouldn't just be;

if (!dev_priv->guc.log.obj)

as I originally suggested?


Right, sorry, I got confused. I misread your comment as suggesting a
!i915.enable_guc_submission check instead.


(!dev_priv->guc.log.obj) check should suffice.








+ret = -EINVAL;
+goto end;
+}
+
+intel_runtime_pm_get(dev_priv);
+ret = i915_guc_log_control(dev, val);
+intel_runtime_pm_put(dev_priv);
+
+end:
+mutex_unlock(&dev->struct_mutex);
+return ret;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
+NULL, i915_guc_log_control_set,
+"0x%08llx\n");


Does the readback still work with no get method?


readback will give a 'Permission denied' error


Is that what we want? I think it would be nice to allow read-back unless
there is a specific reason why it shouldn't be allowed.



Ok can implement a dummy read back function but what should be
shown/returned on read.

Should I show/return the guc_log_level value (which is also available
from /sys/module/i915/parameters/) ?


I would return the same value that was written in. Is the problem that
it is not stored anywhere? Maybe reconstruct it from
i915.guc_log_level ?



The verbosity value will be same as guc_log_level. But whether logging 
on GuC side is currently enabled or disabled can't be inferred (it could 
have been disabled at run time).

So will have to store the exact value written by User.


Although it is not ideal that we got two formats for the same thing.
Thinking about that, why not use the same format in debugfs as for the
module param?

And I forgot, i915.guc_log_level == 0 is logging enabled with minimum
verbosity?

i915.guc_log_level == 0 just indicates the minimum verbosity. But 
logging could still be disabled on GuC side.


For example, Driver boots with 'i915.guc_log_level = 0' so logging is 
enabled, later User disables the logging by echoing 0x0 on the 
guc_log_control debugfs file.


Best regards
Akash


Is it too late to change that? :)

Regards,

Tvrtko




Re: [Intel-gfx] [PATCH 08/17] drm/i915: Forcefully flush GuC log buffer on reset

2016-07-20 Thread Goel, Akash



On 7/20/2016 2:42 PM, Chris Wilson wrote:

On Wed, Jul 20, 2016 at 09:51:45AM +0530, Goel, Akash wrote:



On 7/19/2016 4:51 PM, Chris Wilson wrote:

On Tue, Jul 19, 2016 at 12:12:20PM +0100, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Sagar Arun Kamble <sagar.a.kam...@intel.com>

If GuC logs are being captured, there should be a force log buffer flush
action sent to GuC before proceeding with GPU reset and re-initializing
GUC. Those logs would be useful to understand why the GPU reset was
initiated.

v2: Rebase.

Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com>
Signed-off-by: Akash Goel <akash.g...@intel.com>
---
drivers/gpu/drm/i915/i915_guc_submission.c | 32 ++
drivers/gpu/drm/i915/i915_irq.c|  2 ++
drivers/gpu/drm/i915/intel_guc.h   |  1 +
3 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 9b436fa..8cc31c6 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -183,6 +183,16 @@ static int host2guc_logbuffer_flush_complete(struct 
intel_guc *guc)
return host2guc_action(guc, data, 1);
}

+static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
+{
+   u32 data[2];
+
+   data[0] = HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH;
+   data[1] = 0;
+
+   return host2guc_action(guc, data, 2);
+}
+
/*
 * Initialise, update, or clear doorbell data shared with the GuC
 *
@@ -1404,6 +1414,28 @@ void i915_guc_capture_logs(struct drm_device *dev)
intel_runtime_pm_put(dev_priv);
}

+void i915_guc_capture_logs_on_reset(struct drm_device *dev)
+{
+   struct drm_i915_private *dev_priv = dev->dev_private;
+
+   mutex_lock(&dev->struct_mutex);


Not sure what the repercussions are of taking the mutex on the
i915_reset_and_wakeup path (error capture, hangcheck, don't know
this area well). Check with Chris and Mika I suppose (cc-ed)?




Took the struct_mutex, just to avoid a very remote possibility where
i915_guc_capture_logs_on_reset & debugfs function
i915_guc_log_control executes concurrently.


Flat out invalid to take struct_mutex on the error capture path, or any
lock at all really (just in case of driver bugs). Consider it to be an
atomic context that may preempt the driver at any point.


Actually I see that i915_reset() too takes the struct_mutex right at
the beginning and I have plugged the call to
i915_guc_capture_logs_on_reset() just before that.


Postmortem state is captured from i915_capture_error_state(), and as I
recall one of the raison d'etre for this facility was to include the guc
log in the error state.


Sorry I missed augmenting the error state with guc firmware logs.
For that also a prior flush will be needed, will do the flush without 
acquiring the struct_mutex.


Best regards
Akash


-Chris




Re: [Intel-gfx] [PATCH 09/17] drm/i915: Debugfs support for GuC logging control

2016-07-20 Thread Goel, Akash



On 7/20/2016 2:38 PM, Tvrtko Ursulin wrote:


On 20/07/16 05:42, Goel, Akash wrote:

On 7/19/2016 4:54 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Sagar Arun Kamble <sagar.a.kam...@intel.com>

This patch provides debugfs interface i915_guc_log_control for
on the fly enabling/disabling of logging in GuC firmware and controlling
the verbosity level of logs.
The value written to the file should have bit 0 set to enable logging,
and bits 4-7 should contain the verbosity info.

v2: Add a forceful flush, to collect left over logs, on disabling logging.
 Useful for Validation.

Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com>
Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/i915_debugfs.c| 32 -
  drivers/gpu/drm/i915/i915_guc_submission.c | 57
++
  drivers/gpu/drm/i915/intel_guc.h   |  1 +
  3 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
b/drivers/gpu/drm/i915/i915_debugfs.c
index 5e35565..3c9c7f7 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2644,6 +2644,35 @@ static int i915_guc_log_dump(struct seq_file
*m, void *data)
  return 0;
  }

+static int
+i915_guc_log_control_set(void *data, u64 val)
+{
+struct drm_device *dev = data;
+struct drm_i915_private *dev_priv = dev->dev_private;


to_i915 should be used.

Sorry for missing this, need to use this at other places also.




+int ret;
+
+ret = mutex_lock_interruptible(&dev->struct_mutex);
+if (ret)
+return ret;
+
+if (!i915.enable_guc_submission || !dev_priv->guc.log.obj) {


Wouldn't guc.log.obj be enough?


Actually failure in allocation of log buffer, at boot time, is not
considered fatal and submission through GuC is still done.
So i915.enable_guc_submission could be 1 with guc.log.obj as NULL.


If guc.log.obj is NULL it will return -EINVAL without trying to create
it here. If you intended for this function to try and create the log
object if not already present, via i915_guc_log_control, in that case
the condition above should only be if (!i915.enable_guc_submission), no?

If guc.log.obj is found to be NULL, we consider that logging can't be
enabled at run time. Allocation of the log buffer is supposed to be done
at boot time only, otherwise GuC would have to be reset & firmware
reloaded to pass the log buffer address at run time, which is probably
not desirable. That's why the first patch decoupled the allocation of
log buffer from log_level value.





+ret = -EINVAL;
+goto end;
+}
+
+intel_runtime_pm_get(dev_priv);
+ret = i915_guc_log_control(dev, val);
+intel_runtime_pm_put(dev_priv);
+
+end:
+mutex_unlock(&dev->struct_mutex);
+return ret;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
+NULL, i915_guc_log_control_set,
+"0x%08llx\n");


Does the readback still work with no get method?


readback will give a 'Permission denied' error


Is that what we want? I think it would be nice to allow read-back unless
there is a specific reason why it shouldn't be allowed.



Ok can implement a dummy read back function but what should be 
shown/returned on read.


Should I show/return the guc_log_level value (which is also available 
from /sys/module/i915/parameters/) ?






+
  static int i915_edp_psr_status(struct seq_file *m, void *data)
  {
  struct drm_info_node *node = m->private;
@@ -5464,7 +5493,8 @@ static const struct i915_debugfs_files {
  {"i915_fbc_false_color", &i915_fbc_fc_fops},
  {"i915_dp_test_data", &i915_displayport_test_data_fops},
  {"i915_dp_test_type", &i915_displayport_test_type_fops},
-{"i915_dp_test_active", &i915_displayport_test_active_fops}
+{"i915_dp_test_active", &i915_displayport_test_active_fops},
+{"i915_guc_log_control", &i915_guc_log_control_fops}
  };

  void intel_display_crc_init(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 8cc31c6..2e3b723 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -193,6 +193,16 @@ static int host2guc_force_logbuffer_flush(struct
intel_guc *guc)
  return host2guc_action(guc, data, 2);
  }

+static int host2guc_logging_control(struct intel_guc *guc, u32
control_val)
+{
+u32 data[2];
+
+data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING;
+data[1] = control_val;
+
+return host2guc_action(guc, data, 2);
+}
+
  /*
   * Initialise, update, or clear doorbell data shared with the GuC
   *
@@ -1455,3 +1465,50 @@ void i915_guc_register(struct drm_device *dev)
  guc_log_late_setup(dev);
  mutex_unlock(&dev->struct_mutex);
  }
+
+int i915_guc_log_control(struct drm_device *dev, uint64_t control_val)
+{

Re: [Intel-gfx] [PATCH 09/17] drm/i915: Debugfs support for GuC logging control

2016-07-19 Thread Goel, Akash



On 7/19/2016 4:54 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

This patch provides debugfs interface i915_guc_log_control for
on the fly enabling/disabling of logging in GuC firmware and controlling
the verbosity level of logs.
The value written to the file should have bit 0 set to enable logging,
and bits 4-7 should contain the verbosity info.

v2: Add a forceful flush, to collect left over logs, on disabling logging.
 Useful for Validation.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_debugfs.c| 32 -
  drivers/gpu/drm/i915/i915_guc_submission.c | 57
++
  drivers/gpu/drm/i915/intel_guc.h   |  1 +
  3 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
b/drivers/gpu/drm/i915/i915_debugfs.c
index 5e35565..3c9c7f7 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2644,6 +2644,35 @@ static int i915_guc_log_dump(struct seq_file
*m, void *data)
  return 0;
  }

+static int
+i915_guc_log_control_set(void *data, u64 val)
+{
+struct drm_device *dev = data;
+struct drm_i915_private *dev_priv = dev->dev_private;


to_i915 should be used.

Sorry for missing this, need to use this at other places also.




+int ret;
+
+ret = mutex_lock_interruptible(&dev->struct_mutex);
+if (ret)
+return ret;
+
+if (!i915.enable_guc_submission || !dev_priv->guc.log.obj) {


Wouldn't guc.log.obj be enough?


Actually failure in allocation of log buffer, at boot time, is not 
considered fatal and submission through GuC is still done.

So i915.enable_guc_submission could be 1 with guc.log.obj as NULL.




+ret = -EINVAL;
+goto end;
+}
+
+intel_runtime_pm_get(dev_priv);
+ret = i915_guc_log_control(dev, val);
+intel_runtime_pm_put(dev_priv);
+
+end:
+mutex_unlock(&dev->struct_mutex);
+return ret;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
+NULL, i915_guc_log_control_set,
+"0x%08llx\n");


Does the readback still work with no get method?


readback will give a 'Permission denied' error




+
  static int i915_edp_psr_status(struct seq_file *m, void *data)
  {
  struct drm_info_node *node = m->private;
@@ -5464,7 +5493,8 @@ static const struct i915_debugfs_files {
  {"i915_fbc_false_color", &i915_fbc_fc_fops},
  {"i915_dp_test_data", &i915_displayport_test_data_fops},
  {"i915_dp_test_type", &i915_displayport_test_type_fops},
-{"i915_dp_test_active", &i915_displayport_test_active_fops}
+{"i915_dp_test_active", &i915_displayport_test_active_fops},
+{"i915_guc_log_control", &i915_guc_log_control_fops}
  };

  void intel_display_crc_init(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 8cc31c6..2e3b723 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -193,6 +193,16 @@ static int host2guc_force_logbuffer_flush(struct
intel_guc *guc)
  return host2guc_action(guc, data, 2);
  }

+static int host2guc_logging_control(struct intel_guc *guc, u32
control_val)
+{
+u32 data[2];
+
+data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING;
+data[1] = control_val;
+
+return host2guc_action(guc, data, 2);
+}
+
  /*
   * Initialise, update, or clear doorbell data shared with the GuC
   *
@@ -1455,3 +1465,50 @@ void i915_guc_register(struct drm_device *dev)
  guc_log_late_setup(dev);
  mutex_unlock(&dev->struct_mutex);
  }
+
+int i915_guc_log_control(struct drm_device *dev, uint64_t control_val)
+{
+struct drm_i915_private *dev_priv = dev->dev_private;


to_i915

Actually, function should take dev_priv if not even guc depending on the
established convention in the file.


Ok for all the new logging related exported functions, will use dev_priv.


+union guc_log_control log_param;
+int ret;
+
+log_param.logging_enabled = control_val & 0x1;
+log_param.verbosity = (control_val >> 4) & 0xF;
+
+if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN ||
+log_param.verbosity > GUC_LOG_VERBOSITY_MAX)
+return -EINVAL;
+
+/* This combination doesn't make sense & won't have any effect */
+if (!log_param.logging_enabled && (i915.guc_log_level < 0))
+return -EINVAL;


Hm, disabling while already disabled - why should that return an error?
Might be annoying in scripts.


Just to make the User aware. Ok will suppress this and return 0.
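
The agreed behaviour (reject an out-of-range verbosity, but treat "disable while already disabled" as success) could look roughly like the sketch below. It is an illustration only: VERBOSITY_MIN/VERBOSITY_MAX are assumed bounds standing in for GUC_LOG_VERBOSITY_MIN/MAX, and struct log_request and check_log_control() are hypothetical names, not the driver's.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed bounds; the real GUC_LOG_VERBOSITY_MIN/MAX live in the driver headers. */
#define VERBOSITY_MIN 0
#define VERBOSITY_MAX 3

struct log_request {
    int enable;     /* bit 0 of the written value */
    int verbosity;  /* bits 4-7 of the written value */
};

/*
 * Returns 0 and fills *req when the value is acceptable, -EINVAL otherwise.
 * *changed is set to 0 when the request would be a no-op (disable while
 * already disabled), which is reported as success rather than as an error.
 */
static int check_log_control(uint64_t val, int current_level,
                             struct log_request *req, int *changed)
{
    req->enable = (int)(val & 0x1);
    req->verbosity = (int)((val >> 4) & 0xF);

    if (req->verbosity < VERBOSITY_MIN || req->verbosity > VERBOSITY_MAX)
        return -EINVAL;

    *changed = !(!req->enable && current_level < 0);
    return 0;
}
```

A caller would skip the host2guc action when *changed is 0, so a script that disables logging twice still sees a zero return.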



+
+ret = host2guc_logging_control(&dev_priv->guc, log_param.value);
+if (ret < 0) {
+DRM_DEBUG_DRIVER("host2guc action failed\n");


Add ret to the log since it is easy?


fine will do that.

+return ret;
+}
+
+i915.guc_log_level = log_param.verbosity;
+
+/* If log_level was 

Re: [Intel-gfx] [PATCH 08/17] drm/i915: Forcefully flush GuC log buffer on reset

2016-07-19 Thread Goel, Akash



On 7/19/2016 4:51 PM, Chris Wilson wrote:

On Tue, Jul 19, 2016 at 12:12:20PM +0100, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

If GuC logs are being captured, there should be a force log buffer flush
action sent to GuC before proceeding with GPU reset and re-initializing
GUC. Those logs would be useful to understand why the GPU reset was
initiated.

v2: Rebase.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 32 ++
 drivers/gpu/drm/i915/i915_irq.c|  2 ++
 drivers/gpu/drm/i915/intel_guc.h   |  1 +
 3 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 9b436fa..8cc31c6 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -183,6 +183,16 @@ static int host2guc_logbuffer_flush_complete(struct 
intel_guc *guc)
return host2guc_action(guc, data, 1);
 }

+static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
+{
+   u32 data[2];
+
+   data[0] = HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH;
+   data[1] = 0;
+
+   return host2guc_action(guc, data, 2);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -1404,6 +1414,28 @@ void i915_guc_capture_logs(struct drm_device *dev)
intel_runtime_pm_put(dev_priv);
 }

+void i915_guc_capture_logs_on_reset(struct drm_device *dev)
+{
+   struct drm_i915_private *dev_priv = dev->dev_private;
+
+   mutex_lock(&dev->struct_mutex);


Not sure what the repercussions are of taking the mutex on the
i915_reset_and_wakeup path (error capture, hangcheck, don't know
this area well). Check with Chris and Mika I suppose (cc-ed)?




Took the struct_mutex, just to avoid a very remote possibility where
i915_guc_capture_logs_on_reset & debugfs function i915_guc_log_control 
executes concurrently.



Flat out invalid to take struct_mutex on the error capture path, or any
lock at all really (just in case of driver bugs). Consider it to be an
atomic context that may preempt the driver at any point.


Actually I see that i915_reset() too takes the struct_mutex right at the 
beginning and I have plugged the call to 
i915_guc_capture_logs_on_reset() just before that.


Also it is being called after i915_error_wake_up(), so any client
waiting on a request would have backed off and any new attempt by
clients to lock the struct_mutex should see i915_reset_in_progress as
true.

Best regards
Akash

-Chris




Re: [Intel-gfx] [PATCH 07/17] drm/i915: Add a relay backed debugfs interface for capturing GuC logs

2016-07-19 Thread Goel, Akash



On 7/19/2016 5:01 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Akash Goel 

Add a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the
User to capture GuC firmware logs. The relay framework is used to
implement the interface, so the Driver just has to use a relay API to
store snapshots of the GuC log buffer in the buffer managed by relay.
A snapshot is taken when GuC firmware sends a log buffer flush
interrupt, and up to four snapshots can be stored in the relay buffer.
The relay buffer is operated in a mode where it overwrites the
data not yet collected by User.
Besides the mmap method, through which User can directly access the relay
buffer contents, relay also supports the 'poll' method. Through the 'poll'
call on the log file, User learns whenever a new snapshot of the
log buffer is taken by Driver, so it can run in tandem with the Driver and
capture the logs in a sustained/streaming manner, without any loss of
data.

v2: Defer the creation of relay channel & associated debugfs file, as
 debugfs setup is now done at the end of i915 Driver load. (Chris)

v3:
- Switch to no-overwrite mode for relay.
- Fix the relay sub buffer switching sequence.
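
The no-overwrite behaviour mentioned in v3 can be modelled in plain userspace C. This is only a toy sketch and not the kernel relay API: every toy_* name and both sizes are hypothetical, chosen to mirror guc_get_write_buffer() and guc_move_to_next_buf() from the patch. It shows the producer getting NULL once all sub buffers hold unread snapshots.

```c
#include <assert.h>
#include <stddef.h>

#define N_SUBBUFS   4   /* mirrors "up to four snapshots" in the text */
#define SUBBUF_SIZE 64  /* arbitrary size for the model */

/* Hypothetical stand-in for a relay channel. */
struct toy_chan {
    unsigned char data[N_SUBBUFS][SUBBUF_SIZE];
    size_t produced;  /* sub buffers filled by the producer */
    size_t consumed;  /* sub buffers released by the consumer */
};

/* Returns the next free sub buffer, or NULL if all are full (no-overwrite). */
static void *toy_get_write_buffer(struct toy_chan *c)
{
    if (c->produced - c->consumed >= N_SUBBUFS)
        return NULL;
    return c->data[c->produced % N_SUBBUFS];
}

/* Producer finished writing a snapshot: hand it over to the consumer. */
static void toy_move_to_next_buf(struct toy_chan *c)
{
    /* Advancing past a full channel would overwrite unread data. */
    assert(c->produced - c->consumed < N_SUBBUFS);
    c->produced++;
}

/* Consumer finished with the oldest snapshot; returns 0 if none was pending. */
static int toy_consume(struct toy_chan *c)
{
    if (c->consumed == c->produced)
        return 0;
    c->consumed++;
    return 1;
}
```

In this model a slow consumer causes new snapshots to be dropped rather than old ones to be corrupted, which is the trade-off the no-overwrite mode makes.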

Suggested-by: Chris Wilson 
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_drv.c|   2 +
  drivers/gpu/drm/i915/i915_guc_submission.c | 197
-
  drivers/gpu/drm/i915/intel_guc.h   |   3 +
  3 files changed, 199 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c
b/drivers/gpu/drm/i915/i915_drv.c
index 25c6b9b..43c9900 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1177,6 +1177,7 @@ static void i915_driver_register(struct
drm_i915_private *dev_priv)
  /* Reveal our presence to userspace */
  if (drm_dev_register(dev, 0) == 0) {
  i915_debugfs_register(dev_priv);
+i915_guc_register(dev);
  i915_setup_sysfs(dev);
  } else
  DRM_ERROR("Failed to register driver for userspace access!\n");
@@ -1215,6 +1216,7 @@ static void i915_driver_unregister(struct
drm_i915_private *dev_priv)
  intel_opregion_unregister(dev_priv);

  i915_teardown_sysfs(&dev_priv->drm);
+i915_guc_unregister(&dev_priv->drm);
  i915_debugfs_unregister(dev_priv);
  drm_dev_unregister(&dev_priv->drm);

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index d3dbb8e..9b436fa 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -23,6 +23,8 @@
   */
  #include 
  #include 
+#include 
+#include 
  #include "i915_drv.h"
  #include "intel_guc.h"

@@ -836,12 +838,33 @@ err:

  static void guc_move_to_next_buf(struct intel_guc *guc)
  {
-return;
+/* Make sure our updates in the sub buffer are visible when
+ * Consumer sees a newly produced sub buffer.
+ */
+smp_wmb();
+
+/* All data has been written, so now move the offset of sub
buffer. */
+relay_reserve(guc->log.relay_chan, guc->log.obj->base.size);
+
+/* Switch to the next sub buffer */
+relay_flush(guc->log.relay_chan);
  }

  static void* guc_get_write_buffer(struct intel_guc *guc)
  {
-return NULL;
+/* FIXME: Cover the check under a lock ? */
+if (!guc->log.relay_chan)
+return NULL;
+
+/* Just get the base address of a new sub buffer and copy data
into it
+ * ourselves. NULL will be returned in no-overwrite mode, if all sub
+ * buffers are full. Could have used the relay_write() to indirectly
+ * copy the data, but that would have been bit convoluted, as we
need to
+ * write to only certain locations inside a sub buffer which
cannot be
+ * done without using relay_reserve() along with relay_write().
So its
+ * better to use relay_reserve() alone.
+ */
+return relay_reserve(guc->log.relay_chan, 0);
  }

  static void guc_read_update_log_buffer(struct drm_device *dev)
@@ -906,6 +929,119 @@ static void guc_read_update_log_buffer(struct
drm_device *dev)
  guc_move_to_next_buf(guc);
  }

+/*
+ * Sub buffer switch callback. Called whenever relay has to switch to
a new
+ * sub buffer, relay stays on the same sub buffer if 0 is returned.
+ */
+static int subbuf_start_callback(struct rchan_buf *buf,
+ void *subbuf,
+ void *prev_subbuf,
+ size_t prev_padding)
+{
+/* Use no-overwrite mode by default, where relay will stop accepting
+ * new data if there are no empty sub buffers left.
+ * There is no strict synchronization enforced by relay between
Consumer
+ * and Producer. In overwrite mode, there is a possibility of
getting
+ * inconsistent/garbled data, the producer could be writing on to
the
+ * same sub buffer from which 

Re: [Intel-gfx] [PATCH 06/17] drm/i915: Handle log buffer flush interrupt event from GuC

2016-07-19 Thread Goel, Akash



On 7/19/2016 4:28 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

GuC ukernel sends an interrupt to Host to flush the log buffer
and expects Host to correspondingly update the read pointer
information in the state structure, once it has consumed the
log buffer contents by copying them to a file or buffer.
Even if Host couldn't copy the contents, it can still update the
read pointer so that logging state is not disturbed on GuC side.
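
A rough userspace illustration of the read-pointer handshake described above, under stated assumptions: the struct layout and the names toy_log_state, read_ptr, write_ptr and toy_consume_log are hypothetical and not taken from the GuC firmware interface. The point it shows is that the read pointer is advanced whether or not a destination buffer was available for the copy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-buffer state; field names are assumptions. */
struct toy_log_state {
    uint32_t size;       /* size of the circular log area in bytes */
    uint32_t write_ptr;  /* advanced by the log producer */
    uint32_t read_ptr;   /* advanced by the host after consuming */
};

/*
 * Copy the pending bytes to dst when dst is non-NULL, handling wrap-around,
 * and always advance read_ptr so the producer's view stays consistent.
 * Returns the number of bytes copied.
 */
static uint32_t toy_consume_log(struct toy_log_state *st,
                                const uint8_t *buf, uint8_t *dst)
{
    uint32_t rd = st->read_ptr, wr = st->write_ptr, copied = 0;

    assert(rd < st->size && wr < st->size);

    if (dst) {
        if (wr >= rd) {
            memcpy(dst, buf + rd, wr - rd);
            copied = wr - rd;
        } else {
            /* Pending data wraps past the end of the buffer. */
            memcpy(dst, buf + rd, st->size - rd);
            memcpy(dst + (st->size - rd), buf, wr);
            copied = st->size - rd + wr;
        }
    }

    /* Acknowledge consumption even when nothing could be copied. */
    st->read_ptr = wr;
    return copied;
}
```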

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of
   crash buffer area for regular cases and copying only the state
   structure data in first page.

v3:
  - Create a vmalloc mapping of log buffer. (Chris)
  - Cover the flush acknowledgment under rpm get & put.(Chris)
  - Revert the change of skipping the copy of crash dump area, as
not really needed, will be covered by subsequent patch.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_drv.c|  13 +++
  drivers/gpu/drm/i915/i915_guc_submission.c | 148
+
  drivers/gpu/drm/i915/i915_irq.c|   5 +-
  drivers/gpu/drm/i915/intel_guc.h   |   3 +
  4 files changed, 167 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c
b/drivers/gpu/drm/i915/i915_drv.c
index b9a8117..25c6b9b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -791,8 +791,20 @@ static int i915_workqueues_init(struct
drm_i915_private *dev_priv)
  if (dev_priv->hotplug.dp_wq == NULL)
  goto out_free_wq;

+if (HAS_GUC_SCHED(dev_priv)) {
+/* Need a dedicated wq to process log buffer flush interrupts
+ * from GuC without much delay so as to avoid any loss of logs.
+ */
+dev_priv->guc.log.wq =
+alloc_ordered_workqueue("i915-guc_log", 0);
+if (dev_priv->guc.log.wq == NULL)
+goto out_free_hotplug_dp_wq;
+}
+
  return 0;

+out_free_hotplug_dp_wq:
+destroy_workqueue(dev_priv->hotplug.dp_wq);
  out_free_wq:
  destroy_workqueue(dev_priv->wq);
  out_err:
@@ -803,6 +815,7 @@ out_err:

  static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
  {
+destroy_workqueue(dev_priv->guc.log.wq);


I am ignoring the wq parts of the patch since the next series may look
different in this respect.

However you may need to have wq destruction under the same HAS_GUC_SCHED
condition as when you create it.
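
The symmetry being asked for can be sketched with userspace stand-ins. None of this is the kernel workqueue API: mock_wq, mock_dev and the mock_* functions are hypothetical, and the mock treats destroying a queue that was never created as a bug so that the asymmetry becomes visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for a workqueue; NOT the kernel API. */
struct mock_wq { const char *name; };

static int live_wqs;  /* number of queues currently allocated */

static struct mock_wq *mock_alloc_wq(const char *name)
{
    struct mock_wq *wq = malloc(sizeof(*wq));

    if (wq) {
        wq->name = name;
        live_wqs++;
    }
    return wq;
}

static void mock_destroy_wq(struct mock_wq *wq)
{
    assert(wq != NULL);  /* in this mock, destroying NULL is a bug */
    free(wq);
    live_wqs--;
}

struct mock_dev {
    bool has_guc_sched;          /* stands in for HAS_GUC_SCHED() */
    struct mock_wq *wq;
    struct mock_wq *guc_log_wq;  /* only created when has_guc_sched */
};

static int mock_workqueues_init(struct mock_dev *d)
{
    d->wq = mock_alloc_wq("i915");
    if (!d->wq)
        return -1;

    if (d->has_guc_sched) {
        d->guc_log_wq = mock_alloc_wq("i915-guc_log");
        if (!d->guc_log_wq) {
            mock_destroy_wq(d->wq);
            return -1;
        }
    }
    return 0;
}

static void mock_workqueues_cleanup(struct mock_dev *d)
{
    /* Same condition as at creation time. */
    if (d->has_guc_sched)
        mock_destroy_wq(d->guc_log_wq);
    mock_destroy_wq(d->wq);
}
```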


Thanks, will do.
Sorry, my bad.



  destroy_workqueue(dev_priv->hotplug.dp_wq);
  destroy_workqueue(dev_priv->wq);
  }
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 0bac172..d3dbb8e 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct
intel_guc *guc,
  return host2guc_action(guc, data, ARRAY_SIZE(data));
  }

+static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
+{
+u32 data[1];
+
+data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
+
+return host2guc_action(guc, data, 1);
+}
+
  /*
   * Initialise, update, or clear doorbell data shared with the GuC
   *
@@ -825,6 +834,123 @@ err:
  return NULL;
  }

+static void guc_move_to_next_buf(struct intel_guc *guc)
+{
+return;
+}
+
+static void* guc_get_write_buffer(struct intel_guc *guc)
+{
+return NULL;
+}
+
+static void guc_read_update_log_buffer(struct drm_device *dev)


dev_priv should be passed in for driver internal functions.


+{
+struct drm_i915_private *dev_priv = dev->dev_private;
+struct intel_guc *guc = &dev_priv->guc;
+struct guc_log_buffer_state *log_buffer_state,
*log_buffer_copy_state;
+struct guc_log_buffer_state log_buffer_state_local;
+void *src_data_ptr, *dst_data_ptr;
+u32 i, buffer_size;
+
+if (!guc->log.obj || !guc->log.buf_addr)
+return;
+
+log_buffer_state = src_data_ptr = guc->log.buf_addr;
+
+/* Get the pointer to local buffer to store the logs */
+dst_data_ptr = log_buffer_copy_state = guc_get_write_buffer(guc);


This will return NULL so the loop below doesn't do anything much. I
assume at this point in the patch series things are not wired up yet?

The below loop will still update the state structures, lying in the 
first page of GuC log buffer.

There is no local buffer yet to store the logs.


+
+/* Actual logs are present from the 2nd page */
+src_data_ptr += PAGE_SIZE;
+dst_data_ptr += PAGE_SIZE;
+
+for (i = 0; i < GUC_MAX_LOG_BUFFER; i++) {
+log_buffer_state_local = *log_buffer_state;
+buffer_size = log_buffer_state_local.size;
+
+if (log_buffer_copy_state) {
+/* First copy the state structure */
+ 

Re: [Intel-gfx] [PATCH 10/17] drm/i915: New module param to control the size of buffer used for storing GuC firmware logs

2016-07-18 Thread Goel, Akash



On 7/18/2016 6:36 PM, Tvrtko Ursulin wrote:


On 18/07/16 13:19, Goel, Akash wrote:

On 7/18/2016 3:36 PM, Tvrtko Ursulin wrote:

On 15/07/16 16:36, Goel, Akash wrote:

On 7/15/2016 4:45 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Akash Goel <akash.g...@intel.com>

On receiving the log buffer flush interrupt from GuC firmware, Driver
stores the snapshot of the log buffer in a local buffer, from which
Userspace can pull the logs. By default Driver stores up to 4 snapshots
of the log buffer in a local buffer (managed by relay).
Added a new module (read only) param, 'guc_log_size', through which User
can specify the number of snapshots of log buffer to be stored in local
buffer. This can be used to ensure capturing of all boot time logs even
with high verbosity level.

v2: Rename module param to more apt name 'guc_log_buffer_nr'.
(Nikula)

Suggested-by: Chris Wilson <ch...@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 3 +--
  drivers/gpu/drm/i915/i915_params.c | 5 +
  drivers/gpu/drm/i915/i915_params.h | 1 +
  3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2e3b723..009d7c0 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1046,8 +1046,7 @@ static int guc_create_log_relay_file(struct
intel_guc *guc)

  /* Keep the size of sub buffers same as shared log buffer */
  subbuf_size = guc->log.obj->base.size;
-/* TODO: Decide based on the User's input */
-n_subbufs = 4;
+n_subbufs = i915.guc_log_buffer_nr;

  guc_log_relay_chan = relay_open("guc_log", log_dir,
  subbuf_size, n_subbufs, _callbacks, dev);
diff --git a/drivers/gpu/drm/i915/i915_params.c
b/drivers/gpu/drm/i915/i915_params.c
index 8b13bfa..d30c972 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -57,6 +57,7 @@ struct i915_params i915 __read_mostly = {
  .enable_guc_loading = -1,
  .enable_guc_submission = -1,
  .guc_log_level = -1,
+.guc_log_buffer_nr = 4,
  .enable_dp_mst = true,
  .inject_load_failure = 0,
  .enable_dpcd_backlight = false,
@@ -214,6 +215,10 @@ module_param_named(guc_log_level,
i915.guc_log_level, int, 0400);
  MODULE_PARM_DESC(guc_log_level,
  "GuC firmware logging level (-1:disabled (default),
0-3:enabled)");

+module_param_named(guc_log_buffer_nr, i915.guc_log_buffer_nr, int,
0400);
+MODULE_PARM_DESC(guc_log_buffer_nr,
+"Number of sub buffers to store GuC firmware logs (default:
4)");
+
  module_param_named_unsafe(enable_dp_mst, i915.enable_dp_mst, bool,
0600);
  MODULE_PARM_DESC(enable_dp_mst,
  "Enable multi-stream transport (MST) for new DisplayPort sinks.
(default: true)");
diff --git a/drivers/gpu/drm/i915/i915_params.h
b/drivers/gpu/drm/i915/i915_params.h
index 0ad020b..14ca855 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -48,6 +48,7 @@ struct i915_params {
  int enable_guc_loading;
  int enable_guc_submission;
  int guc_log_level;
+int guc_log_buffer_nr;
  int use_mmio_flip;
  int mmio_debug;
  int edp_vswing;



I did not figure out after a quick read of
Documentation/filesystems/relay.txt whether we really need this to be
configurable?

If I got it right number of sub-buffers here only has a relation to
the
userspace relay consumer latency. If the userspace is responsive
should
just two be enough? Or the existing default of four was shown in
practice that it is better and good enough?


Yes, one of the uses of this module parameter is to give User some leeway,
i.e. more time to collect logs from the relay buffer. User may not
always be able to match the rate at which logs are being produced from the
GuC side.

2 could be too few.
Even 4, when running a benchmark, proved too few to keep up with
the Driver rate (this might change after some optimization is done on the
User space side also, like splice).


Okay, it makes sense for it to be bigger than four by default then,
correct?


The other use is to ensure capturing of all boot time logs, even with
maximum verbosity level. The default number of sub buffers may not
always be sufficient to store all the logs from boot, by the time User
is ready to capture the logs.
Saw about 8 flush interrupts coming from GuC during the boot.


How important is it for a default value to capture all activity since
boot?

I think we need to keep in mind here that amount of that activity may be
a lot different with different setups so it might not be that
interesting after all.

Someone will log in via a display manager, which may generate a widely
differing amount of GPU activity, until they start the logger. Someone
else on the other

Re: [Intel-gfx] [PATCH 15/17] drm/i915: Increase GuC log buffer size to reduce flush interrupts

2016-07-18 Thread Goel, Akash



On 7/18/2016 3:24 PM, Tvrtko Ursulin wrote:


On 15/07/16 17:20, Goel, Akash wrote:

On 7/15/2016 8:37 PM, Tvrtko Ursulin wrote:

On 15/07/16 15:42, Goel, Akash wrote:

On 7/15/2016 5:27 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Akash Goel <akash.g...@intel.com>

In cases where GuC generates logs at a very high rate, the rate of
flush interrupts is correspondingly very high.
So far a total of 8 pages were allocated for storing both ISR & DPC logs.
As per the half-full draining protocol followed by GuC, doubling
the number of pages cuts the frequency of flush interrupts
to almost half, which then helps in reducing the logging overhead.
So now allocating 8 pages apiece for ISR & DPC logs.

Suggested-by: Tvrtko Ursulin <tvrtko.ursu...@intel.com>
Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/intel_guc_fwif.h | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h
b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 1de6928..7521ed5 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -104,9 +104,9 @@
  #define   GUC_LOG_ALLOC_IN_MEGABYTE(1 << 3)
  #define   GUC_LOG_CRASH_PAGES1
  #define   GUC_LOG_CRASH_SHIFT4
-#define   GUC_LOG_DPC_PAGES3
+#define   GUC_LOG_DPC_PAGES7
  #define   GUC_LOG_DPC_SHIFT6
-#define   GUC_LOG_ISR_PAGES3
+#define   GUC_LOG_ISR_PAGES7
  #define   GUC_LOG_ISR_SHIFT9
  #define   GUC_LOG_BUF_ADDR_SHIFT12

@@ -436,9 +436,9 @@ enum guc_log_buffer_type {
   *|   Crash dump state header |
   * Page1  +---+
   *|   ISR logs|
- * Page5  +---+
- *|   DPC logs|
   * Page9  +---+
+ *|   DPC logs|
+ * Page17 +---+
   *| Crash Dump logs   |
   *+---+
   *



I don't mind - but does it help? And how much and for what? Haven't
you
later found that the uncached reads were the main issue?

This change along with kthread patch, helped reduce the overflow counts
and even eliminate them for some benchmarks.

With the impending optimization for Uncached reads there should
be further improvements. But in my view, notwithstanding the improvement
w.r.t. overflow count, it's still a better configuration to work with, as
the flush interrupt frequency is cut down to half and I am not able to see
any apparent downsides to it.


I was primarily thinking to go with a minimal and simplest set of
patches to implement the feature.


I second that and working with the same intent.


Logic was that apparently none of the smart and complex optimisations
managed to solve the dropped interrupt issue, until the slowness of the
uncached read was discovered to be the real/main issue.

So it seems that is something that definitely needs to be implemented.
(Whether or not it will be possible to use SSE instructions to do the
read I don't know.)



log buffer resizing and rt priority kthread changes have definitely
helped significantly.

Only of late we realized that there is a potential way to speed up
Uncached reads also. Moreover I am yet to test that on kernel side.
So until that is tested & proves to be enough, we have to rely on the
other optimizations & can't dismiss them


Maybe, depends if, what I thought was the case, none of the other
optimizations actually enabled a drop-free logging in all interesting
scenarios.

If we conclude that simply improving the copy speed removes the need for
any other optimisations and complications, we can talk about whether
every individual one of those still makes sense.

In my opinion we should keep this change, regardless of the copying
speed up. Moreover this is a straightforward change.


Actually this also helps in reducing the output log file size, apart
from reducing the flush interrupt count.
With the original settings, 44 KB was needed for one snapshot.
With the modified settings, 76 KB is needed for one snapshot but it
will be equivalent to 2 snapshots of the original setting.
So 12KB saving, every 88 KB, over the original setting.
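
The arithmetic behind these figures can be checked with a small sketch. It assumes 4 KB pages, one leading page for the state headers, and that the GUC_LOG_*_PAGES fields encode (pages - 1); the last point is inferred from the patch turning a field value of 7 into "8 pages apiece". The function names are hypothetical.

```c
#include <assert.h>

#define PAGE_KB 4u  /* assumes 4 KB pages */

/*
 * Size of one snapshot in KB: 1 state page + ISR + DPC + crash dump pages.
 * Each *_field argument is the register field value, assumed to be pages - 1.
 */
static unsigned snapshot_kb(unsigned isr_field, unsigned dpc_field,
                            unsigned crash_field)
{
    unsigned pages = 1 + (isr_field + 1) + (dpc_field + 1) + (crash_field + 1);

    return pages * PAGE_KB;
}

/* Reproduces the numbers quoted in the mail. */
static void check_quoted_numbers(void)
{
    unsigned old_kb = snapshot_kb(3, 3, 1);  /* original settings */
    unsigned new_kb = snapshot_kb(7, 7, 1);  /* settings after this patch */

    assert(old_kb == 44);
    assert(new_kb == 76);
    /* One new snapshot carries as much log data as two old ones. */
    assert(2 * old_kb - new_kb == 12);
}
```

The saving comes from the state page and the crash dump pages being stored once per snapshot instead of twice.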

Best regards
Akash


Assuming it is possible, then the question is whether there is need for
all the other optimisations. Ie. do we need the kthread with rtprio or
would a simple worker be enough?

I think we can take a call, once we have the results with Uncached read
optimization.


Agreed. Let's see how that works out and then discuss what the final
series should look like.

Regards,

Tvrtko



Re: [Intel-gfx] [PATCH 10/17] drm/i915: New module param to control the size of buffer used for storing GuC firmware logs

2016-07-18 Thread Goel, Akash



On 7/18/2016 3:36 PM, Tvrtko Ursulin wrote:


On 15/07/16 16:36, Goel, Akash wrote:

On 7/15/2016 4:45 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Akash Goel <akash.g...@intel.com>

On receiving the log buffer flush interrupt from GuC firmware, Driver
stores the snapshot of the log buffer in a local buffer, from which
Userspace can pull the logs. By default Driver stores up to 4 snapshots
of the log buffer in a local buffer (managed by relay).
Added a new module (read only) param, 'guc_log_size', through which User
can specify the number of snapshots of log buffer to be stored in local
buffer. This can be used to ensure capturing of all boot time logs even
with high verbosity level.

v2: Rename module param to more apt name 'guc_log_buffer_nr'. (Nikula)

Suggested-by: Chris Wilson <ch...@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 3 +--
  drivers/gpu/drm/i915/i915_params.c | 5 +
  drivers/gpu/drm/i915/i915_params.h | 1 +
  3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2e3b723..009d7c0 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1046,8 +1046,7 @@ static int guc_create_log_relay_file(struct
intel_guc *guc)

  /* Keep the size of sub buffers same as shared log buffer */
  subbuf_size = guc->log.obj->base.size;
-/* TODO: Decide based on the User's input */
-n_subbufs = 4;
+n_subbufs = i915.guc_log_buffer_nr;

  guc_log_relay_chan = relay_open("guc_log", log_dir,
  subbuf_size, n_subbufs, _callbacks, dev);
diff --git a/drivers/gpu/drm/i915/i915_params.c
b/drivers/gpu/drm/i915/i915_params.c
index 8b13bfa..d30c972 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -57,6 +57,7 @@ struct i915_params i915 __read_mostly = {
  .enable_guc_loading = -1,
  .enable_guc_submission = -1,
  .guc_log_level = -1,
+.guc_log_buffer_nr = 4,
  .enable_dp_mst = true,
  .inject_load_failure = 0,
  .enable_dpcd_backlight = false,
@@ -214,6 +215,10 @@ module_param_named(guc_log_level,
i915.guc_log_level, int, 0400);
  MODULE_PARM_DESC(guc_log_level,
  "GuC firmware logging level (-1:disabled (default),
0-3:enabled)");

+module_param_named(guc_log_buffer_nr, i915.guc_log_buffer_nr, int,
0400);
+MODULE_PARM_DESC(guc_log_buffer_nr,
+"Number of sub buffers to store GuC firmware logs (default: 4)");
+
  module_param_named_unsafe(enable_dp_mst, i915.enable_dp_mst, bool,
0600);
  MODULE_PARM_DESC(enable_dp_mst,
  "Enable multi-stream transport (MST) for new DisplayPort sinks.
(default: true)");
diff --git a/drivers/gpu/drm/i915/i915_params.h
b/drivers/gpu/drm/i915/i915_params.h
index 0ad020b..14ca855 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -48,6 +48,7 @@ struct i915_params {
  int enable_guc_loading;
  int enable_guc_submission;
  int guc_log_level;
+int guc_log_buffer_nr;
  int use_mmio_flip;
  int mmio_debug;
  int edp_vswing;



I did not figure out after a quick read of
Documentation/filesystems/relay.txt whether we really need this to be
configurable?

If I got it right number of sub-buffers here only has a relation to the
userspace relay consumer latency. If the userspace is responsive should
just two be enough? Or the existing default of four was shown in
practice that it is better and good enough?


Yes, one of the uses of this module parameter is to give User some leeway,
i.e. more time to collect logs from the relay buffer. User may not
always be able to match the rate at which logs are being produced from the
GuC side.

2 could be too few.
Even 4, when running a benchmark, proved too few to keep up with
the Driver rate (this might change after some optimization is done on the
User space side also, like splice).


Okay, it makes sense for it to be bigger than four by default then,
correct?


The other use is to ensure capturing of all boot time logs, even with
maximum verbosity level. The default number of sub buffers may not
always be sufficient to store all the logs from boot, by the time User
is ready to capture the logs.
Saw about 8 flush interrupts coming from GuC during the boot.


How important is it for a default value to capture all activity since boot?

I think we need to keep in mind here that amount of that activity may be
a lot different with different setups so it might not be that
interesting after all.

Someone will log in via a display manager, which may generate a widely
differing amount of GPU activity, until they start the logger. Someone
else on the other hand might be booting to vt only, starting the logger,
and only then starting the g

Re: [Intel-gfx] [PATCH 14/17] drm/i915: Add stats for GuC log buffer flush interrupts

2016-07-18 Thread Goel, Akash



On 7/18/2016 5:03 PM, Tvrtko Ursulin wrote:


On 18/07/16 11:59, Goel, Akash wrote:

On 7/18/2016 3:46 PM, Tvrtko Ursulin wrote:


On 15/07/16 16:58, Goel, Akash wrote:

On 7/15/2016 5:21 PM, Tvrtko Ursulin wrote:

On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Akash Goel <akash.g...@intel.com>

GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain a statistics of how many flush
interrupts were received and for which type of log buffer,
along with the overflow count of each buffer type.
Augmented i915_log_info debugfs to report back these statistics.

Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/i915_debugfs.c| 26
++
  drivers/gpu/drm/i915/i915_guc_submission.c |  8 
  drivers/gpu/drm/i915/i915_irq.c|  1 +
  drivers/gpu/drm/i915/intel_guc.h   |  6 ++
  4 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
b/drivers/gpu/drm/i915/i915_debugfs.c
index 3c9c7f7..888a18a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2538,6 +2538,30 @@ static int i915_guc_load_status_info(struct
seq_file *m, void *data)
  return 0;
  }

+static void i915_guc_log_info(struct seq_file *m,
+ struct drm_i915_private *dev_priv)
+{
+struct intel_guc *guc = &dev_priv->guc;
+
+seq_printf(m, "\nGuC logging stats:\n");
+
+seq_printf(m, "\tISR:   flush count %10u, overflow count %8u\n",
+guc->log.flush_count[GUC_ISR_LOG_BUFFER],
+guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]);
+
+seq_printf(m, "\tDPC:   flush count %10u, overflow count %8u\n",
+guc->log.flush_count[GUC_DPC_LOG_BUFFER],
+guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]);
+
+seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n",
+guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER],
+guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]);
+
+seq_printf(m, "\tTotal flush interrupt count: %u\n",
+guc->log.flush_interrupt_count);
+
+}
+
  static void i915_guc_client_info(struct seq_file *m,
   struct drm_i915_private *dev_priv,
   struct i915_guc_client *client)
@@ -2611,6 +2635,8 @@ static int i915_guc_info(struct seq_file *m,
void *data)
  seq_printf(m, "\nGuC execbuf client @ %p:\n",
guc.execbuf_client);
  i915_guc_client_info(m, dev_priv, );

+i915_guc_log_info(m, dev_priv);
+
  /* Add more as required ... */

  return 0;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index c1e637f..9c94a43 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -914,6 +914,14 @@ static void guc_read_update_log_buffer(struct
drm_device *dev)
  log_buffer_state_local = *log_buffer_state;
  buffer_size = log_buffer_state_local.size;

+guc->log.flush_count[i] +=
log_buffer_state_local.flush_to_file;
+if (log_buffer_state_local.buffer_full_cnt !=
+guc->log.prev_overflow_count[i]) {
+guc->log.prev_overflow_count[i] =
+log_buffer_state_local.buffer_full_cnt;
+guc->log.total_overflow_count[i]++;


Is log_buffer_state_local.buffer_full_cnt guaranteed to be one
here? Or
you would need to increase total_overflow_count by its value?



buffer_full_cnt will not remain as one. It's a 4-bit counter that is
incremented monotonically by GuC firmware on every new detection of
overflow, so it will increase from 0 to 15 and then wrap around.
Hence I have to use '!=' in the condition instead of '>'.


But can it happen that it jumps by more than one between being sampled
here? In which case you would need to replace:

guc->log.total_overflow_count[i]++;

by something like:


guc->log.total_overflow_count[i] +=
log_buffer_state_local.buffer_full_cnt -
guc->log.prev_overflow_count[i];

(Doesn't handle the wrap though, just to illustrate my point.)
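
A wrap-aware version of that illustration, as a standalone sketch. It assumes the 4-bit width of buffer_full_cnt stated earlier in the thread; overflow_delta() and struct overflow_stats are hypothetical names standing in for the prev_overflow_count/total_overflow_count bookkeeping. The result is only correct if fewer than 16 overflows occur between two samples.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption from the thread: buffer_full_cnt is a 4-bit counter (0..15). */
#define OVERFLOW_CNT_BITS 4u
#define OVERFLOW_CNT_MASK ((1u << OVERFLOW_CNT_BITS) - 1u)

/* New overflows since the previous sample, modulo the counter width. */
static uint32_t overflow_delta(uint32_t prev, uint32_t cur)
{
    return (cur - prev) & OVERFLOW_CNT_MASK;
}

/* Hypothetical per-buffer bookkeeping. */
struct overflow_stats {
    uint32_t prev;   /* last sampled counter value */
    uint32_t total;  /* accumulated overflow count */
};

static void overflow_stats_update(struct overflow_stats *s, uint32_t sampled)
{
    assert(sampled <= OVERFLOW_CNT_MASK);
    s->total += overflow_delta(s->prev, sampled);
    s->prev = sampled;
}
```

Unsigned subtraction followed by the mask gives the right delta across the 15 to 0 wrap, and it yields zero when the counter has not moved, so no separate '!=' check is needed.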


Actually logic in GuC firmware is such that overflow counter cannot
increment by more than 1 without Driver coming into picture in between,
by the virtue of flush interrupt.


Hm, and what happens to the data and overflow counter if the driver is
not responsive enough?

GuC will not stall and keep writing the logs into the buffer, if Driver 
is slow in responding to the previous flush interrupt.


But the overflow detection is done through somewhat odd logic, which is
executed only when GuC receives the response to the last flush interrupt
from Driver, and the increment is by 1 only, irrespective of how late
the acknowledgement came from Driver

Re: [Intel-gfx] [PATCH 13/17] drm/i915: New lock to serialize the Host2GuC actions

2016-07-18 Thread Goel, Akash



On 7/18/2016 4:48 PM, Tvrtko Ursulin wrote:


On 18/07/16 11:46, Goel, Akash wrote:

On 7/18/2016 3:42 PM, Tvrtko Ursulin wrote:


On 15/07/16 16:51, Goel, Akash wrote:



On 7/15/2016 5:10 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Akash Goel <akash.g...@intel.com>

With the addition of new Host2GuC actions related to GuC logging,
there
is a need of a lock to serialize them, as they can execute
concurrently
with each other and also with other existing actions.


After which patch in this series is this required?


 From patch 6 or 7 saw the problem, when enabled flush interrupts from
boot (guc_log_level >= 0).


That means this patch should come before 6 or 7. :)


Also new HOST2GUC actions LOG_BUFFER_FILE_FLUSH_COMPLETE &
UK_LOG_ENABLE_LOGGING can execute concurrently with each other.


Right I see, from the worker/thread vs debugfs activity.


Will use mutex to serialize and place the patch earlier in the series.
Please suggest which would be better,
mutex_lock()
or
mutex_lock_interruptible().


Interruptible from the debugfs paths, otherwise not.


Yes, calls from the debugfs path should ideally use the interruptible
version, but then how do we determine whether a given host2guc_action call
came from the debugfs path?
Should I add a new argument 'interruptible_wait' to host2guc_action(), or,
to keep things simple, use mutex_lock() only?
I thought it would be cleaner to abstract the lock usage, for
serialization, entirely inside host2guc_action.



--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -88,6 +88,7 @@ static int host2guc_action(struct intel_guc *guc,
u32 *data, u32 len)
  return -EINVAL;

  intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+spin_lock(>action_lock);


The code below can sleep waiting for a response from GuC so you cannot
use a spinlock. Mutex I suppose...


Sorry I missed the sleep.
Probably I did not see any problem, in spite of a spinlock, as
_wait_for
macro does not sleep when used in atomic context, does a busy wait
instead.


I wonder about that in general, since in_atomic is not a reliable
indicator. But that is beside the point. You probably haven't seen it
because the action completes in the first shorter, atomic sleep, check.


Actually I had profiled host2guc_logbuffer_flush_complete() and saw that
on some occasions it was taking more than 100 microseconds,
so presumably it would have gone past the first wait.
But most of the time it was less than 10 microseconds only.

ret = wait_for_us(host2guc_action_response(dev_priv, ), 10);
if (ret)
 ret = wait_for(host2guc_action_response(dev_priv, ), 10);


Yes presumably so. In that case keep in mind that in_atomic always
returns false in spinlock sections unless the kernel has
CONFIG_PREEMPT_COUNT enabled.
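
The two-step wait quoted above (a short busy poll, then a longer wait that may sleep) can be imitated in userspace. This is only an analogy using POSIX clock_gettime() and nanosleep(); it is not the i915 wait_for_us()/wait_for() macros, and two_phase_wait(), cond_fn and the example conditions are hypothetical names. The second phase is the part that sleeps, which is why the caller cannot hold a spinlock around it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

typedef bool (*cond_fn)(void *ctx);

static long long now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Returns 0 once cond() is true, -1 if both phases time out. */
static int two_phase_wait(cond_fn cond, void *ctx, long spin_us, long sleep_ms)
{
    long long deadline;

    assert(cond != NULL);

    /* Phase 1: busy poll, cheap when the response arrives quickly. */
    deadline = now_ns() + spin_us * 1000LL;
    do {
        if (cond(ctx))
            return 0;
    } while (now_ns() < deadline);

    /* Phase 2: poll with 1 ms naps; this sleeps between checks. */
    deadline = now_ns() + sleep_ms * 1000000LL;
    do {
        struct timespec nap = { 0, 1000000L };

        nanosleep(&nap, NULL);
        if (cond(ctx))
            return 0;
    } while (now_ns() < deadline);

    return -1;
}

/* Example condition: becomes true after a fixed number of polls. */
static bool countdown_done(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0)
        (*remaining)--;
    return *remaining == 0;
}

/* Example condition that never becomes true, to exercise the timeout. */
static bool never_done(void *ctx)
{
    (void)ctx;
    return false;
}
```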


Thanks for this info, will be mindful of this in future.

Best regards
Akash


Regards,

Tvrtko



Re: [Intel-gfx] [PATCH 14/17] drm/i915: Add stats for GuC log buffer flush interrupts

2016-07-18 Thread Goel, Akash



On 7/18/2016 3:46 PM, Tvrtko Ursulin wrote:


On 15/07/16 16:58, Goel, Akash wrote:

On 7/15/2016 5:21 PM, Tvrtko Ursulin wrote:

On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Akash Goel <akash.g...@intel.com>

GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain a statistics of how many flush
interrupts were received and for which type of log buffer,
along with the overflow count of each buffer type.
Augmented i915_log_info debugfs to report back these statistics.

Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/i915_debugfs.c| 26
++
  drivers/gpu/drm/i915/i915_guc_submission.c |  8 
  drivers/gpu/drm/i915/i915_irq.c|  1 +
  drivers/gpu/drm/i915/intel_guc.h   |  6 ++
  4 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
b/drivers/gpu/drm/i915/i915_debugfs.c
index 3c9c7f7..888a18a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2538,6 +2538,30 @@ static int i915_guc_load_status_info(struct
seq_file *m, void *data)
  return 0;
  }

+static void i915_guc_log_info(struct seq_file *m,
+ struct drm_i915_private *dev_priv)
+{
+struct intel_guc *guc = &dev_priv->guc;
+
+seq_printf(m, "\nGuC logging stats:\n");
+
+seq_printf(m, "\tISR:   flush count %10u, overflow count %8u\n",
+guc->log.flush_count[GUC_ISR_LOG_BUFFER],
+guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]);
+
+seq_printf(m, "\tDPC:   flush count %10u, overflow count %8u\n",
+guc->log.flush_count[GUC_DPC_LOG_BUFFER],
+guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]);
+
+seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n",
+guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER],
+guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]);
+
+seq_printf(m, "\tTotal flush interrupt count: %u\n",
+guc->log.flush_interrupt_count);
+
+}
+
  static void i915_guc_client_info(struct seq_file *m,
   struct drm_i915_private *dev_priv,
   struct i915_guc_client *client)
@@ -2611,6 +2635,8 @@ static int i915_guc_info(struct seq_file *m,
void *data)
  seq_printf(m, "\nGuC execbuf client @ %p:\n",
guc.execbuf_client);
  i915_guc_client_info(m, dev_priv, &client);

+i915_guc_log_info(m, dev_priv);
+
  /* Add more as required ... */

  return 0;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index c1e637f..9c94a43 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -914,6 +914,14 @@ static void guc_read_update_log_buffer(struct
drm_device *dev)
  log_buffer_state_local = *log_buffer_state;
  buffer_size = log_buffer_state_local.size;

+guc->log.flush_count[i] +=
log_buffer_state_local.flush_to_file;
+if (log_buffer_state_local.buffer_full_cnt !=
+guc->log.prev_overflow_count[i]) {
+guc->log.prev_overflow_count[i] =
+log_buffer_state_local.buffer_full_cnt;
+guc->log.total_overflow_count[i]++;


Is log_buffer_state_local.buffer_full_cnt guaranteed to be one here? Or
you would need to increase total_overflow_count by its value?



buffer_full_cnt will not remain as one. It's a 4-bit counter, will be
incremented monotonically by GuC firmware on every new detection of
overflow, so will increase from 0 to 15 & then wrap around.
Hence have to use '!=' in the condition instead of '>'.


But can it happen that it jumps by more than one between being sampled
here? In which case you would need to replace:

guc->log.total_overflow_count[i]++;

by something like:


guc->log.total_overflow_count[i] +=
log_buffer_state_local.buffer_full_cnt - guc->log.prev_overflow_count[i];

(Doesn't handle the wrap though, just to illustrate my point.)
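
One way to handle the wrap, sketched here under the assumption stated in the thread that the firmware counter is 4 bits wide (0 to 15), is to mask the unsigned difference to the counter width. The names overflow_delta(), struct overflow_stats and overflow_stats_update() are hypothetical and only illustrate the arithmetic; they are not from the patch.

```c
#include <stdint.h>

/* Width of the firmware overflow counter as described in the thread. */
#define OVERFLOW_CNT_BITS 4u
#define OVERFLOW_CNT_MASK ((1u << OVERFLOW_CNT_BITS) - 1u)

/*
 * Number of overflows since the previous sample. Masking the unsigned
 * difference gives the right answer across a single wrap, e.g.
 * prev = 14, cur = 1 yields 3. A jump of 16 or more between two
 * samples cannot be detected with a 4-bit counter.
 */
static uint32_t overflow_delta(uint32_t cur, uint32_t prev)
{
	return (cur - prev) & OVERFLOW_CNT_MASK;
}

/* Hypothetical stand-in for the per-buffer stats kept by the driver. */
struct overflow_stats {
	uint32_t prev;  /* last sampled counter value */
	uint32_t total; /* accumulated overflow count */
};

static void overflow_stats_update(struct overflow_stats *s, uint32_t cur)
{
	s->total += overflow_delta(cur, s->prev);
	s->prev = cur & OVERFLOW_CNT_MASK;
}
```

With this form no separate comparison of the two counter values is needed, since an unchanged counter simply yields a delta of zero.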

Actually logic in GuC firmware is such that overflow counter cannot 
increment by more than 1 without Driver coming into picture in between, 
by the virtue of flush interrupt.

But nevertheless the logic on Driver side should be like the way you
suggested.

Does this revised logic look fine?

if (log_buffer_state_local.buffer_full_cnt !=
guc->log.prev_overflow_count[i]) {
new_overflow = 1;
	guc->log.total_overflow_count[i] += 
(log_buffer_state_local.buffer_full_cnt - guc->log.prev_overflow_count[i]);


	if (log_buffer_state_local.buffer_full_cnt < 
guc->log.prev_overflow_count[i])

guc->log.total_overflow_count[i] += 15;

log_buffer_state_local.buffer_full_cnt = 
guc->log.prev_overflow_

Re: [Intel-gfx] [PATCH 13/17] drm/i915: New lock to serialize the Host2GuC actions

2016-07-18 Thread Goel, Akash



On 7/18/2016 3:42 PM, Tvrtko Ursulin wrote:


On 15/07/16 16:51, Goel, Akash wrote:



On 7/15/2016 5:10 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Akash Goel <akash.g...@intel.com>

With the addition of new Host2GuC actions related to GuC logging, there
is a need of a lock to serialize them, as they can execute concurrently
with each other and also with other existing actions.


After which patch in this series is this required?


 From patch 6 or 7 saw the problem, when enabled flush interrupts from
boot (guc_log_level >= 0).


That means this patch should come before 6 or 7. :)


Also new HOST2GUC actions LOG_BUFFER_FILE_FLUSH_COMPLETE &
UK_LOG_ENABLE_LOGGING can execute concurrently with each other.


Right I see, from the worker/thread vs debugfs activity.


Will use mutex to serialize and place the patch earlier in the series.
Please suggest which would be better,
mutex_lock()
or
mutex_lock_interruptible().



Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
  drivers/gpu/drm/i915/intel_guc.h   | 3 +++
  2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 6043166..c1e637f 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -88,6 +88,7 @@ static int host2guc_action(struct intel_guc *guc,
u32 *data, u32 len)
  return -EINVAL;

  intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+spin_lock(&guc->action_lock);


The code below can sleep waiting for a response from GuC so you cannot
use a spinlock. Mutex I suppose...


Sorry I missed the sleep.
Probably I did not see any problem, in spite of a spinlock, as _wait_for
macro does not sleep when used in atomic context, does a busy wait
instead.


I wonder about that in general, since in_atomic is not a reliable
indicator. But that is beside the point. You probably haven't seen it
because the action completes in the first shorter, atomic sleep, check.

Actually I had profiled host2guc_logbuffer_flush_complete() and saw that
on some occasions it was taking more than 100 microseconds,
so presumably it would have gone past the first wait.
But most of the time it was less than 10 microseconds.

ret = wait_for_us(host2guc_action_response(dev_priv, &status), 10);
if (ret)
ret = wait_for(host2guc_action_response(dev_priv, &status), 10);

Best regards
Akash

Regards,

Tvrtko



Re: [Intel-gfx] [PATCH 11/17] drm/i915: Support to create write combined type vmaps

2016-07-15 Thread Goel, Akash



On 7/15/2016 5:01 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Chris Wilson 

vmaps has a provision for controlling the page protection bits, which
we can use to control the mapping type, e.g. WB, WC, UC or even WT.
To allow the caller to choose their mapping type, we add a parameter to
i915_gem_object_pin_map - but we still only allow one vmap to be cached
per object. If the object is currently not pinned, then we recreate the
previous vmap with the new access type, but if it was pinned we report an
error. This effectively limits the access via i915_gem_object_pin_map
to a
single mapping type for the lifetime of the object. Not usually a
problem,
but something to be aware of when setting up the object's vmap.

We will want to vary the access type to enable WC mappings of ringbuffer
and context objects on !llc platforms, as well as other objects where we
need coherent access to the GPU's pages without going through the GTT

v2: Remove the redundant braces around pin count check and fix the marker
 in documentation (Chris)

Signed-off-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_drv.h|  4 ++-
  drivers/gpu/drm/i915/i915_gem.c| 57
+++---
  drivers/gpu/drm/i915/i915_gem_dmabuf.c |  2 +-
  drivers/gpu/drm/i915/i915_guc_submission.c |  2 +-
  drivers/gpu/drm/i915/intel_lrc.c   |  8 ++---
  drivers/gpu/drm/i915/intel_ringbuffer.c|  2 +-
  6 files changed, 54 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
index 6e2ddfa..84afa17 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3248,6 +3248,7 @@ static inline void
i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
  /**
   * i915_gem_object_pin_map - return a contiguous mapping of the
entire object
   * @obj - the object to map into kernel address space
+ * @use_wc - whether the mapping should be using WC or WB pgprot_t
   *
   * Calls i915_gem_object_pin_pages() to prevent reaping of the object's
   * pages and then returns a contiguous mapping of the backing
storage into
@@ -3259,7 +3260,8 @@ static inline void
i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
   * Returns the pointer through which to access the mapped object, or an
   * ERR_PTR() on error.
   */
-void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object
*obj);
+void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object
*obj,
+ bool use_wc);


Could you make it an enum instead of a bool? Commit message suggests
more modes will potentially be added and if so, and we start with an
enum straight away, it will make for less churn in the future.

func(something, true) is always also quite unreadable in the code because
one has to remember or remind himself what it really means.

Something like func(something, MAP_WC) would be simply self-documenting.


Thanks, nice suggestion, will do that.
Should it be an enum only, or will macros also do?
#define MAP_CACHED  0x1
#define MAP_WC  0x2
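
To illustrate the enum variant being discussed, here is a small stand-alone sketch. The names enum obj_map_type, OBJ_MAP_WB, OBJ_MAP_WC, obj_map_type_name() and can_use_kmap() are hypothetical, chosen only for this example; the thread itself only suggests "something like MAP_WC". The helper mirrors the n_pages == 1 && !use_wc check from the patch, on the assumption that the single-page kmap shortcut only provides a cached (WB) mapping.

```c
#include <stdbool.h>

/* Hypothetical mapping types; more (UC, WT) could be appended later. */
enum obj_map_type {
	OBJ_MAP_WB = 0, /* write-back, i.e. normal cached mapping */
	OBJ_MAP_WC,     /* write-combined mapping */
};

/* Human-readable name, e.g. for debug output. */
static const char *obj_map_type_name(enum obj_map_type type)
{
	switch (type) {
	case OBJ_MAP_WB:
		return "WB";
	case OBJ_MAP_WC:
		return "WC";
	}
	return "unknown";
}

/*
 * The single-page shortcut is only valid for a cached mapping, so any
 * other type has to take the general vmap-style path.
 */
static bool can_use_kmap(unsigned long n_pages, enum obj_map_type type)
{
	return n_pages == 1 && type == OBJ_MAP_WB;
}
```

A call such as can_use_kmap(n_pages, OBJ_MAP_WC) documents itself at the call site, which is the readability point raised in the review, and the switch lets the compiler warn when a newly added type is not handled.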



  /**
   * i915_gem_object_unpin_map - releases an earlier mapping
diff --git a/drivers/gpu/drm/i915/i915_gem.c
b/drivers/gpu/drm/i915/i915_gem.c
index 8f50919..c431b40 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2471,10 +2471,11 @@ i915_gem_object_put_pages(struct
drm_i915_gem_object *obj)
  list_del(&obj->global_list);

  if (obj->mapping) {
-if (is_vmalloc_addr(obj->mapping))
-vunmap(obj->mapping);
+void *ptr = (void *)((uintptr_t)obj->mapping & ~1);


How many bits we have to play with here? Is there a suitable define
somewhere we could use for a mask instead of hardcoded "1" or we could
add one if you think that would be better?


As Chris said, will use PAGE_MASK.
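
As an illustration of why a page mask works here: a page-aligned mapping address has its low bits free, so the mapping type can be kept in them and stripped again before the pointer is used. The sketch below assumes 4 KiB pages and uses hypothetical names (SKETCH_PAGE_MASK, pack_mapping(), mapping_addr(), mapping_type()); it is not the patch's code, and it only holds if the stored address is page aligned and the type value is smaller than the page size.

```c
#include <stdint.h>

/* Assumed 4 KiB pages for this sketch; the kernel provides PAGE_MASK. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Store a small type value in the unused low bits of an aligned address. */
static void *pack_mapping(void *addr, unsigned int type)
{
	return (void *)((uintptr_t)addr | (uintptr_t)type);
}

/* Recover the real address by clearing all sub-page bits. */
static void *mapping_addr(void *packed)
{
	return (void *)((uintptr_t)packed & SKETCH_PAGE_MASK);
}

/* Recover the type from the sub-page bits. */
static unsigned int mapping_type(void *packed)
{
	return (unsigned int)((uintptr_t)packed & ~SKETCH_PAGE_MASK);
}
```

Compared with a hardcoded "& ~1", masking with the page mask leaves room for more than two mapping types without touching the unpacking code.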




+if (is_vmalloc_addr(ptr))
+vunmap(ptr);
  else
-kunmap(kmap_to_page(obj->mapping));
+kunmap(kmap_to_page(ptr));
  obj->mapping = NULL;
  }

@@ -2647,7 +2648,8 @@ i915_gem_object_get_pages(struct
drm_i915_gem_object *obj)
  }

  /* The 'mapping' part of i915_gem_object_pin_map() below */
-static void *i915_gem_object_map(const struct drm_i915_gem_object *obj)
+static void *i915_gem_object_map(const struct drm_i915_gem_object *obj,
+bool use_wc)
  {
  unsigned long n_pages = obj->base.size >> PAGE_SHIFT;
  struct sg_table *sgt = obj->pages;
@@ -2659,7 +2661,7 @@ static void *i915_gem_object_map(const struct
drm_i915_gem_object *obj)
  void *addr;

  /* A single page can always be kmapped */
-if (n_pages == 1)
+if (n_pages == 1 && !use_wc)
  return kmap(sg_page(sgt->sgl));

  if (n_pages > ARRAY_SIZE(stack_pages)) {
@@ -2675,7 +2677,8 @@ static void *i915_gem_object_map(const struct

Re: [Intel-gfx] [PATCH 15/17] drm/i915: Increase GuC log buffer size to reduce flush interrupts

2016-07-15 Thread Goel, Akash



On 7/15/2016 8:37 PM, Tvrtko Ursulin wrote:


On 15/07/16 15:42, Goel, Akash wrote:

On 7/15/2016 5:27 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Akash Goel <akash.g...@intel.com>

In cases where GuC generates logs at a very high rate, correspondingly
the rate of flush interrupts is also very high.
So far total 8 pages were allocated for storing both ISR & DPC logs.
As per the half-full draining protocol followed by GuC, by doubling
the number of pages, the frequency of flush interrupts can be cut down
to almost half, which then helps in reducing the logging overhead.
So now allocating 8 pages apiece for ISR & DPC logs.

Suggested-by: Tvrtko Ursulin <tvrtko.ursu...@intel.com>
Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/intel_guc_fwif.h | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h
b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 1de6928..7521ed5 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -104,9 +104,9 @@
  #define   GUC_LOG_ALLOC_IN_MEGABYTE(1 << 3)
  #define   GUC_LOG_CRASH_PAGES1
  #define   GUC_LOG_CRASH_SHIFT4
-#define   GUC_LOG_DPC_PAGES3
+#define   GUC_LOG_DPC_PAGES7
  #define   GUC_LOG_DPC_SHIFT6
-#define   GUC_LOG_ISR_PAGES3
+#define   GUC_LOG_ISR_PAGES7
  #define   GUC_LOG_ISR_SHIFT9
  #define   GUC_LOG_BUF_ADDR_SHIFT12

@@ -436,9 +436,9 @@ enum guc_log_buffer_type {
   *|   Crash dump state header |
   * Page1  +---+
   *|   ISR logs|
- * Page5  +---+
- *|   DPC logs|
   * Page9  +---+
+ *|   DPC logs|
+ * Page17 +---+
   *| Crash Dump logs   |
   *+---+
   *



I don't mind - but does it help? And how much and for what? Haven't you
later found that the uncached reads were the main issue?

This change along with kthread patch, helped reduce the overflow counts
and even eliminate them for some benchmarks.

Though with the impending optimization for Uncached reads there should
be further improvements but in my view, notwithstanding the improvement
w.r.t overflow count, it's still a better configuration to work with as
flush interrupt frequency is cut down to half and not able to see any
apparent downsides to it.
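
For reference, the page numbers in the layout comment of the patch can be reproduced with a few lines of arithmetic. This sketch assumes, based on the diagram and the "8 pages apiece" wording, that each GUC_LOG_*_PAGES value encodes the number of pages minus one, and that page 0 holds the state headers. struct log_layout and guc_log_layout() are hypothetical names used only here.

```c
#include <stdint.h>

/* First page of each region, plus the total size, all in pages. */
struct log_layout {
	uint32_t isr_first_page;
	uint32_t dpc_first_page;
	uint32_t crash_first_page;
	uint32_t total_pages;
};

/*
 * isr, dpc and crash are the raw *_PAGES values from the header, each
 * assumed to mean (pages - 1). Page 0 is taken by the state headers.
 */
static struct log_layout guc_log_layout(uint32_t isr, uint32_t dpc,
					uint32_t crash)
{
	struct log_layout l;

	l.isr_first_page = 1;
	l.dpc_first_page = l.isr_first_page + isr + 1;
	l.crash_first_page = l.dpc_first_page + dpc + 1;
	l.total_pages = l.crash_first_page + crash + 1;
	return l;
}
```

With the old values (3, 3, 1) this gives DPC at page 5 and crash dump at page 9; with the new values (7, 7, 1) it gives page 9 and page 17, matching the updated comment. Under the half-full protocol a region of 8 pages takes twice as much log data to reach the flush threshold as one of 4 pages.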


I was primarily thinking to go with a minimal and simplest set of
patches to implement the feature.


I second that and working with the same intent.


Logic was that apparently none of the smart and complex optimisations
managed to solve the dropped interrupt issue, until the slowness of the
uncached read was discovered to be the real/main issue.

So it seems that is something that definitely needs to be implemented.
(Whether or not it will be possible to use SSE instructions to do the
read I don't know.)



log buffer resizing and rt priority kthread changes have definitely 
helped significantly.


Only of late we realized that there is a potential way to speed up 
Uncached reads also. Moreover I am yet to test that on kernel side.
So until that is tested & proves to be enough, we have to rely on the 
other optimizations & can't dismiss them



Assuming it is possible, then the question is whether there is need for
all the other optimisations. Ie. do we need the kthread with rtprio or
would a simple worker be enough?
I think we can take a call, once we have the results with Uncached read 
optimization.



Do we need the new i915 param for tweaking the relay sub-buffers?

In my opinion it will be really useful to have this provision, as I
tried to explain in the other mail.


Do we need the increase of the log buffer size?
Though this seems to be a benign change which is definitely good to 
have, but again can decide upon it once we have the results.


The extra patch to do smarter reads?


If we do not have the issue of the dropped interrupts with none of these
extra patches applied, then we could afford to not bother with them now.
Would make the series shorter and review easier and the feature in quicker.


Agree with you.
Had none of these optimizations in the initial version of the series, 
but was compelled to add them later when realized the rate at which GuC 
was generating the logs.


Best regards
Akash


Or maybe we do need all the advanced stuff, I don't know, I am just
asking the question and would like to see some data.

Regards,

Tvrtko



Re: [Intel-gfx] [PATCH 14/17] drm/i915: Add stats for GuC log buffer flush interrupts

2016-07-15 Thread Goel, Akash



On 7/15/2016 5:21 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain a statistics of how many flush
interrupts were received and for which type of log buffer,
along with the overflow count of each buffer type.
Augmented i915_log_info debugfs to report back these statistics.

Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_debugfs.c| 26
++
  drivers/gpu/drm/i915/i915_guc_submission.c |  8 
  drivers/gpu/drm/i915/i915_irq.c|  1 +
  drivers/gpu/drm/i915/intel_guc.h   |  6 ++
  4 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
b/drivers/gpu/drm/i915/i915_debugfs.c
index 3c9c7f7..888a18a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2538,6 +2538,30 @@ static int i915_guc_load_status_info(struct
seq_file *m, void *data)
  return 0;
  }

+static void i915_guc_log_info(struct seq_file *m,
+ struct drm_i915_private *dev_priv)
+{
+struct intel_guc *guc = &dev_priv->guc;
+
+seq_printf(m, "\nGuC logging stats:\n");
+
+seq_printf(m, "\tISR:   flush count %10u, overflow count %8u\n",
+guc->log.flush_count[GUC_ISR_LOG_BUFFER],
+guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]);
+
+seq_printf(m, "\tDPC:   flush count %10u, overflow count %8u\n",
+guc->log.flush_count[GUC_DPC_LOG_BUFFER],
+guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]);
+
+seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n",
+guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER],
+guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]);
+
+seq_printf(m, "\tTotal flush interrupt count: %u\n",
+guc->log.flush_interrupt_count);
+
+}
+
  static void i915_guc_client_info(struct seq_file *m,
   struct drm_i915_private *dev_priv,
   struct i915_guc_client *client)
@@ -2611,6 +2635,8 @@ static int i915_guc_info(struct seq_file *m,
void *data)
  seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
  i915_guc_client_info(m, dev_priv, &client);

+i915_guc_log_info(m, dev_priv);
+
  /* Add more as required ... */

  return 0;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index c1e637f..9c94a43 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -914,6 +914,14 @@ static void guc_read_update_log_buffer(struct
drm_device *dev)
  log_buffer_state_local = *log_buffer_state;
  buffer_size = log_buffer_state_local.size;

+guc->log.flush_count[i] += log_buffer_state_local.flush_to_file;
+if (log_buffer_state_local.buffer_full_cnt !=
+guc->log.prev_overflow_count[i]) {
+guc->log.prev_overflow_count[i] =
+log_buffer_state_local.buffer_full_cnt;
+guc->log.total_overflow_count[i]++;


Is log_buffer_state_local.buffer_full_cnt guaranteed to be one here? Or
you would need to increase total_overflow_count by its value?



buffer_full_cnt will not remain as one. It's a 4-bit counter, will be
incremented monotonically by GuC firmware on every new detection of 
overflow, so will increase from 0 to 15 & then wrap around.

Hence have to use '!=' in the condition instead of '>'.

Best regards
Akash


+}
+
  if (log_buffer_copy_state) {
  /* First copy the state structure */
  memcpy(log_buffer_copy_state, &log_buffer_state_local,
diff --git a/drivers/gpu/drm/i915/i915_irq.c
b/drivers/gpu/drm/i915/i915_irq.c
index bdd7a67..c3fb67e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1711,6 +1711,7 @@ static void gen9_guc_irq_handler(struct
drm_i915_private *dev_priv, u32 gt_iir)
  &dev_priv->guc.events_work);
  }
  }
+dev_priv->guc.log.flush_interrupt_count++;
  spin_unlock(&dev_priv->irq_lock);
  }
  }
diff --git a/drivers/gpu/drm/i915/intel_guc.h
b/drivers/gpu/drm/i915/intel_guc.h
index 611f4a7..e911a32 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -128,6 +128,12 @@ struct intel_guc_log {
  struct workqueue_struct *wq;
  void *buf_addr;
  struct rchan *relay_chan;
+
+/* logging related stats */
+u32 flush_interrupt_count;
+u32 prev_overflow_count[GUC_MAX_LOG_BUFFER];
+u32 total_overflow_count[GUC_MAX_LOG_BUFFER];
+u32 flush_count[GUC_MAX_LOG_BUFFER];
  };

  struct intel_guc {



Regards,

Tvrtko



Re: [Intel-gfx] [PATCH 13/17] drm/i915: New lock to serialize the Host2GuC actions

2016-07-15 Thread Goel, Akash



On 7/15/2016 5:10 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Akash Goel 

With the addition of new Host2GuC actions related to GuC logging, there
is a need of a lock to serialize them, as they can execute concurrently
with each other and also with other existing actions.


After which patch in this series is this required?

From patch 6 or 7 saw the problem, when enabled flush interrupts from 
boot (guc_log_level >= 0).


Also new HOST2GUC actions LOG_BUFFER_FILE_FLUSH_COMPLETE & 
UK_LOG_ENABLE_LOGGING can execute concurrently with each other.




Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
  drivers/gpu/drm/i915/intel_guc.h   | 3 +++
  2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 6043166..c1e637f 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -88,6 +88,7 @@ static int host2guc_action(struct intel_guc *guc,
u32 *data, u32 len)
  return -EINVAL;

  intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+spin_lock(&guc->action_lock);


The code below can sleep waiting for a response from GuC so you cannot
use a spinlock. Mutex I suppose...


Sorry I missed the sleep.
Probably I did not see any problem, in spite of a spinlock, as _wait_for 
macro does not sleep when used in atomic context, does a busy wait instead.


Best Regards
Akash





  dev_priv->guc.action_count += 1;
  dev_priv->guc.action_cmd = data[0];
@@ -126,6 +127,7 @@ static int host2guc_action(struct intel_guc *guc,
u32 *data, u32 len)
  }
  dev_priv->guc.action_status = status;

+spin_unlock(&guc->action_lock);
  intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);

  return ret;
@@ -1304,6 +1306,7 @@ int i915_guc_submission_init(struct
drm_i915_private *dev_priv)
  return -ENOMEM;

  ida_init(&guc->ctx_ids);
+spin_lock_init(&guc->action_lock);


I think this should go to guc_client_alloc which is where the guc client
object is allocated and initialized.


  guc_create_log(guc);
  guc_create_ads(guc);

diff --git a/drivers/gpu/drm/i915/intel_guc.h
b/drivers/gpu/drm/i915/intel_guc.h
index d56bde6..611f4a7 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -157,6 +157,9 @@ struct intel_guc {

  uint64_t submissions[I915_NUM_ENGINES];
  uint32_t last_seqno[I915_NUM_ENGINES];
+
+/* To serialize the Host2GuC actions */
+spinlock_t action_lock;
  };

  /* intel_guc_loader.c */



Regards,

Tvrtko




Re: [Intel-gfx] [PATCH 10/17] drm/i915: New module param to control the size of buffer used for storing GuC firmware logs

2016-07-15 Thread Goel, Akash



On 7/15/2016 4:45 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Akash Goel 

On receiving the log buffer flush interrupt from GuC firmware, Driver
stores the snapshot of the log buffer in a local buffer, from which
Userspace can pull the logs. By default Driver stores up to 4 snapshots
of the log buffer in a local buffer (managed by relay).
Added a new module (read only) param, 'guc_log_size', through which User
can specify the number of snapshots of log buffer to be stored in local
buffer. This can be used to ensure capturing of all boot time logs even
with high verbosity level.

v2: Rename module param to more apt name 'guc_log_buffer_nr'. (Nikula)

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 3 +--
  drivers/gpu/drm/i915/i915_params.c | 5 +
  drivers/gpu/drm/i915/i915_params.h | 1 +
  3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2e3b723..009d7c0 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1046,8 +1046,7 @@ static int guc_create_log_relay_file(struct
intel_guc *guc)

  /* Keep the size of sub buffers same as shared log buffer */
  subbuf_size = guc->log.obj->base.size;
-/* TODO: Decide based on the User's input */
-n_subbufs = 4;
+n_subbufs = i915.guc_log_buffer_nr;

  guc_log_relay_chan = relay_open("guc_log", log_dir,
  subbuf_size, n_subbufs, _callbacks, dev);
diff --git a/drivers/gpu/drm/i915/i915_params.c
b/drivers/gpu/drm/i915/i915_params.c
index 8b13bfa..d30c972 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -57,6 +57,7 @@ struct i915_params i915 __read_mostly = {
  .enable_guc_loading = -1,
  .enable_guc_submission = -1,
  .guc_log_level = -1,
+.guc_log_buffer_nr = 4,
  .enable_dp_mst = true,
  .inject_load_failure = 0,
  .enable_dpcd_backlight = false,
@@ -214,6 +215,10 @@ module_param_named(guc_log_level,
i915.guc_log_level, int, 0400);
  MODULE_PARM_DESC(guc_log_level,
  "GuC firmware logging level (-1:disabled (default), 0-3:enabled)");

+module_param_named(guc_log_buffer_nr, i915.guc_log_buffer_nr, int,
0400);
+MODULE_PARM_DESC(guc_log_buffer_nr,
+"Number of sub buffers to store GuC firmware logs (default: 4)");
+
  module_param_named_unsafe(enable_dp_mst, i915.enable_dp_mst, bool,
0600);
  MODULE_PARM_DESC(enable_dp_mst,
  "Enable multi-stream transport (MST) for new DisplayPort sinks.
(default: true)");
diff --git a/drivers/gpu/drm/i915/i915_params.h
b/drivers/gpu/drm/i915/i915_params.h
index 0ad020b..14ca855 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -48,6 +48,7 @@ struct i915_params {
  int enable_guc_loading;
  int enable_guc_submission;
  int guc_log_level;
+int guc_log_buffer_nr;
  int use_mmio_flip;
  int mmio_debug;
  int edp_vswing;



I did not figure out after a quick read of
Documentation/filesystems/relay.txt whether we really need this to be
configurable?

If I got it right number of sub-buffers here only has a relation to the
userspace relay consumer latency. If the userspace is responsive should
just two be enough? Or the existing default of four was shown in
practice that it is better and good enough?

Yes one of the use of this module parameter is to give User some leeway 
i.e. more time to collect logs from the relay buffer. User may not be 
always able to match the rate at which logs are being produced from the 
GuC side.


2 could be too few.
Even 4, when running a benchmark, was proving too few to match
the Driver rate (this might change after some optimization is done from
User space side also, like splice).


The other use is to ensure capturing of all boot time logs, even with 
maximum verbosity level. The default number of sub buffers may not 
always be sufficient to store all the logs from boot, by the time User 
is ready to capture the logs.

Saw about 8 flush interrupts coming from GuC during the boot.


I am just not sure this is a useful module parameter without some more
data.

Even if it is needed, as a minimum I think the name should reflect this is
about the relay side of things and not the GuC log buffer itself. So
something like i915.guc_relay_log_subbuf_nr or something.

Fine will use this name.


With the matching description of course.


Is the current description not apt?
"Number of sub buffers to store GuC firmware logs (default: 4)"

Best regards
Akash


Regards,

Tvrtko



Re: [Intel-gfx] [PATCH 15/17] drm/i915: Increase GuC log buffer size to reduce flush interrupts

2016-07-15 Thread Goel, Akash



On 7/15/2016 5:27 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Akash Goel 

In cases where GuC generates logs at a very high rate, correspondingly
the rate of flush interrupts is also very high.
So far total 8 pages were allocated for storing both ISR & DPC logs.
As per the half-full draining protocol followed by GuC, by doubling
the number of pages, the frequency of flush interrupts can be cut down
to almost half, which then helps in reducing the logging overhead.
So now allocating 8 pages apiece for ISR & DPC logs.

Suggested-by: Tvrtko Ursulin 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/intel_guc_fwif.h | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h
b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 1de6928..7521ed5 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -104,9 +104,9 @@
  #define   GUC_LOG_ALLOC_IN_MEGABYTE(1 << 3)
  #define   GUC_LOG_CRASH_PAGES1
  #define   GUC_LOG_CRASH_SHIFT4
-#define   GUC_LOG_DPC_PAGES3
+#define   GUC_LOG_DPC_PAGES7
  #define   GUC_LOG_DPC_SHIFT6
-#define   GUC_LOG_ISR_PAGES3
+#define   GUC_LOG_ISR_PAGES7
  #define   GUC_LOG_ISR_SHIFT9
  #define   GUC_LOG_BUF_ADDR_SHIFT12

@@ -436,9 +436,9 @@ enum guc_log_buffer_type {
   *|   Crash dump state header |
   * Page1  +---+
   *|   ISR logs|
- * Page5  +---+
- *|   DPC logs|
   * Page9  +---+
+ *|   DPC logs|
+ * Page17 +---+
   *| Crash Dump logs   |
   *+---+
   *



I don't mind - but does it help? And how much and for what? Haven't you
later found that the uncached reads were the main issue?
This change along with kthread patch, helped reduce the overflow counts 
and even eliminate them for some benchmarks.


Though with the impending optimization for Uncached reads there should 
be further improvements but in my view, notwithstanding the improvement 
w.r.t overflow count, it's still a better configuration to work with as 
flush interrupt frequency is cut down to half and not able to see any 
apparent downsides to it.


Best Regards
Akash


Regards,

Tvrtko



Re: [Intel-gfx] [PATCH 05/17] drm/i915: Support for GuC interrupts

2016-07-11 Thread Goel, Akash



On 7/11/2016 7:13 PM, Tvrtko Ursulin wrote:


On 11/07/16 14:38, Goel, Akash wrote:

On 7/11/2016 6:53 PM, Tvrtko Ursulin wrote:


On 11/07/16 14:15, Goel, Akash wrote:

On 7/11/2016 4:00 PM, Tvrtko Ursulin wrote:





+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
u32 gt_iir)
+{
+if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
+spin_lock(&dev_priv->irq_lock);
+if (dev_priv->guc.interrupts_enabled) {
+/* Sample the log buffer flush related bits & clear them
+ * out now itself from the message identity register to
+ * minimize the probability of losing a flush interrupt,
+ * when there are back to back flush interrupts.
+ * There can be a new flush interrupt, for different log
+ * buffer type (like for ISR), whilst Host is handling
+ * one (for DPC). Since same bit is used in message
+ * register for ISR & DPC, it could happen that GuC
+ * sets the bit for 2nd interrupt but Host clears out
+ * the bit on handling the 1st interrupt.
+ */
+u32 msg = I915_READ(SOFT_SCRATCH(15)) &
+(GUC2HOST_MSG_CRASH_DUMP_POSTED |
+ GUC2HOST_MSG_FLUSH_LOG_BUFFER);
+if (msg) {
+/* Clear the message bits that are handled */
+I915_WRITE(SOFT_SCRATCH(15),
+I915_READ(SOFT_SCRATCH(15)) & ~msg);
+
+/* Handle flush interrupt event in bottom half */
+queue_work(dev_priv->wq,
&dev_priv->guc.events_work);


Since the later patch is changing this to use a thread, since you have
established worker is too slow - especially the shared one - I would
really recommend you start with the kthread straight away. Not have the
worker for a while in the same series and then later change it to a
thread.


Actually it won't be appropriate to say that shared worker thread is too
slow, but having a dedicated kthread definitely helps.

I kept the kthread patch at the last so that as per the response,
review comments can drop it also.


I think it should only be one implementation in the patch series. If we
agreed on a kthread make it so from the start.


Agree but actually right now, added the kthread patch more as a RFC and
presumed this won't be the final version of the series.
Will do the needful, as per the review comments, in the next version.


Ack.


And describe in the commit message why it was selected etc.


+}
+}
+spin_unlock(&dev_priv->irq_lock);


Why does the above need to be done under the irq_lock?


Using the irq_lock for 'guc.interrupts_enabled', especially useful
while disabling the interrupt.


Why? I don't see how it gains you anything and so it seems preferable
not to hold it over mmio accesses.


Yes not needed for the mmio access part.
Just needed for the inspection of 'guc.interrupts_enabled' value.
Will reorder the code.


You don't need it just for reading that value, you can just drop it.



It's not strictly needed, as it's a mere read. But as per my limited
understanding, without the spinlock (which also provides an implicit
barrier) the ISR might miss the reset of the 'interrupts_enabled' flag,
done from a thread on another CPU, and queue new work. The update will be
visible eventually, though. The same applies to the case when the
'interrupts_enabled' flag is set from another CPU.
Isn't it good practice to use locks for accessing shared variables?
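
The visibility concern above can be sketched in user space. This is only an illustration, assuming C11 atomics as a stand-in for the kernel's own primitives; the names interrupts_enabled, set_interrupts_enabled() and irq_handler() are hypothetical and not taken from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for guc.interrupts_enabled. */
static atomic_bool interrupts_enabled;

/* Hypothetical counter of work items queued by the handler. */
static int queued_work;

/* Writer side: publish the new flag value with release ordering. */
static void set_interrupts_enabled(bool on)
{
    atomic_store_explicit(&interrupts_enabled, on, memory_order_release);
}

/* Reader side (the "ISR"): acquire-load the flag, queue work only if set. */
static bool irq_handler(void)
{
    if (!atomic_load_explicit(&interrupts_enabled, memory_order_acquire))
        return false;
    queued_work++;
    return true;
}
```

An atomic flag avoids a data race on the read, but it does not stop a handler that has already observed the old value, so a disable path would still have to wait for any queued work to finish.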

Best regards
Akash



Regards,

Tvrtko





Re: [Intel-gfx] [PATCH 05/17] drm/i915: Support for GuC interrupts

2016-07-11 Thread Goel, Akash



On 7/11/2016 6:53 PM, Tvrtko Ursulin wrote:


On 11/07/16 14:15, Goel, Akash wrote:

On 7/11/2016 4:00 PM, Tvrtko Ursulin wrote:





+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
u32 gt_iir)
+{
+if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
+spin_lock(&dev_priv->irq_lock);
+if (dev_priv->guc.interrupts_enabled) {
+/* Sample the log buffer flush related bits & clear them
+ * out now itself from the message identity register to
+ * minimize the probability of losing a flush interrupt,
+ * when there are back to back flush interrupts.
+ * There can be a new flush interrupt, for different log
+ * buffer type (like for ISR), whilst Host is handling
+ * one (for DPC). Since same bit is used in message
+ * register for ISR & DPC, it could happen that GuC
+ * sets the bit for 2nd interrupt but Host clears out
+ * the bit on handling the 1st interrupt.
+ */
+u32 msg = I915_READ(SOFT_SCRATCH(15)) &
+(GUC2HOST_MSG_CRASH_DUMP_POSTED |
+ GUC2HOST_MSG_FLUSH_LOG_BUFFER);
+if (msg) {
+/* Clear the message bits that are handled */
+I915_WRITE(SOFT_SCRATCH(15),
+I915_READ(SOFT_SCRATCH(15)) & ~msg);
+
+/* Handle flush interrupt event in bottom half */
+queue_work(dev_priv->wq, &dev_priv->guc.events_work);


Since the later patch is changing this to use a thread, since you have
established worker is too slow - especially the shared one - I would
really recommend you start with the kthread straight away. Not have the
worker for a while in the same series and then later change it to a
thread.


Actually it won't be appropriate to say that shared worker thread is too
slow, but having a dedicated kthread definitely helps.

I kept the kthread patch at the last so that as per the response,
review comments can drop it also.


I think it should only be one implementation in the patch series. If we
agreed on a kthread make it so from the start.


Agree but actually right now, added the kthread patch more as a RFC and
presumed this won't be the final version of the series.
Will do the needful, as per the review comments, in the next version.


And describe in the commit message why it was selected etc.


+}
+}
+spin_unlock(&dev_priv->irq_lock);


Why does the above need to be done under the irq_lock?


Using the irq_lock for 'guc.interrupts_enabled', especially useful
while disabling the interrupt.


Why? I don't see how it gains you anything and so it seems preferable
not to hold it over mmio accesses.


Yes not needed for the mmio access part.
Just needed for the inspection of 'guc.interrupts_enabled' value.
Will reorder the code.
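
A rough user-space sketch of that reordering, under stated assumptions: a pthread mutex stands in for irq_lock, a plain variable stands in for SOFT_SCRATCH(15), and the bit values and the struct fake_dev type are illustrative only. The lock covers just the flag inspection; the register read/clear and the queuing happen outside it.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values; the real definitions live in the i915 headers. */
#define MSG_CRASH_DUMP_POSTED (1u << 1)
#define MSG_FLUSH_LOG_BUFFER  (1u << 3)

/* Hypothetical model of the driver state touched by the handler. */
struct fake_dev {
    pthread_mutex_t irq_lock;   /* stands in for dev_priv->irq_lock */
    bool interrupts_enabled;    /* stands in for guc.interrupts_enabled */
    uint32_t scratch15;         /* stands in for SOFT_SCRATCH(15) */
    unsigned int work_queued;   /* counts simulated queue_work() calls */
};

static void fake_guc_irq_handler(struct fake_dev *dev)
{
    bool enabled;
    uint32_t msg;

    /* Hold the lock only while inspecting the flag. */
    pthread_mutex_lock(&dev->irq_lock);
    enabled = dev->interrupts_enabled;
    pthread_mutex_unlock(&dev->irq_lock);

    if (!enabled)
        return;

    /* Register accesses happen outside the lock. */
    msg = dev->scratch15 & (MSG_CRASH_DUMP_POSTED | MSG_FLUSH_LOG_BUFFER);
    if (msg) {
        dev->scratch15 &= ~msg;   /* clear only the bits being handled */
        dev->work_queued++;       /* defer the real work to a bottom half */
    }
}
```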

Best regards
Akash


Regards,

Tvrtko





Re: [Intel-gfx] [PATCH 05/17] drm/i915: Support for GuC interrupts

2016-07-11 Thread Goel, Akash



On 7/11/2016 4:00 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

There are certain types of interrupts which Host can receive from GuC.
GuC ukernel sends an interrupt to Host for certain events, like for
example retrieve/consume the logs generated by ukernel.
This patch adds support to receive interrupts from GuC but currently
enables & partially handles only the interrupt sent by GuC ukernel.
Future patches will add support for handling other interrupt types.

v2:
- Use common low level routines for PM IER/IIR programming (Chris)
- Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
- Replace disabling of wake ref asserts with rpm get/put (Chris)

v3:
- Update comments for more clarity. (Tvrtko)
- Remove the masking of GuC interrupt, which was kept masked till the
   start of bottom half; it's not really needed as there is only a
   single instance of work item & wq is ordered. (Tvrtko)

v4:
- Rebase.
- Rename guc_events to pm_guc_events so as to be indicative of the
   register/control block it is associated with. (Chris)
- Add handling for back to back log buffer flush interrupts.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_drv.h|  1 +
  drivers/gpu/drm/i915/i915_guc_submission.c |  5 ++
  drivers/gpu/drm/i915/i915_irq.c| 98
--
  drivers/gpu/drm/i915/i915_reg.h| 11 
  drivers/gpu/drm/i915/intel_drv.h   |  3 +
  drivers/gpu/drm/i915/intel_guc.h   |  4 ++
  drivers/gpu/drm/i915/intel_guc_loader.c|  4 ++
  7 files changed, 122 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
index c3a579f..6e2ddfa 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1794,6 +1794,7 @@ struct drm_i915_private {
  u32 pm_imr;
  u32 pm_ier;
  u32 pm_rps_events;
+u32 pm_guc_events;
  u32 pipestat_irq_mask[I915_MAX_PIPES];

  struct i915_hotplug hotplug;

+
  /**
   * bdw_update_port_irq - update DE port interrupt
   * @dev_priv: driver private
@@ -1174,6 +1208,21 @@ static void gen6_pm_rps_work(struct work_struct
*work)
  mutex_unlock(&dev_priv->rps.hw_lock);
  }

+static void gen9_guc2host_events_work(struct work_struct *work)
+{
+struct drm_i915_private *dev_priv =
+container_of(work, struct drm_i915_private, guc.events_work);
+
+spin_lock_irq(&dev_priv->irq_lock);
+/* Speed up work cancellation during disabling guc interrupts. */
+if (!dev_priv->guc.interrupts_enabled) {
+spin_unlock_irq(&dev_priv->irq_lock);
+return;
+}
+spin_unlock_irq(&dev_priv->irq_lock);
+
+/* TODO: Handle the events for which GuC interrupted host */
+}

  /**
   * ivybridge_parity_work - Workqueue called when a parity error
interrupt
@@ -1346,11 +1395,13 @@ static irqreturn_t gen8_gt_irq_ack(struct
drm_i915_private *dev_priv,
  DRM_ERROR("The master control interrupt lied (GT3)!\n");
  }

-if (master_ctl & GEN8_GT_PM_IRQ) {
+if (master_ctl & (GEN8_GT_PM_IRQ | GEN8_GT_GUC_IRQ)) {
  gt_iir[2] = I915_READ_FW(GEN8_GT_IIR(2));
-if (gt_iir[2] & dev_priv->pm_rps_events) {
+if (gt_iir[2] & (dev_priv->pm_rps_events |
+ dev_priv->pm_guc_events)) {
  I915_WRITE_FW(GEN8_GT_IIR(2),
-  gt_iir[2] & dev_priv->pm_rps_events);
+  gt_iir[2] & (dev_priv->pm_rps_events |
+   dev_priv->pm_guc_events));
  ret = IRQ_HANDLED;
  } else
  DRM_ERROR("The master control interrupt lied (PM)!\n");
@@ -1382,6 +1433,9 @@ static void gen8_gt_irq_handler(struct
drm_i915_private *dev_priv,

  if (gt_iir[2] & dev_priv->pm_rps_events)
  gen6_rps_irq_handler(dev_priv, gt_iir[2]);
+
+if (gt_iir[2] & dev_priv->pm_guc_events)
+gen9_guc_irq_handler(dev_priv, gt_iir[2]);
  }

  static bool bxt_port_hotplug_long_detect(enum port port, u32 val)
@@ -1628,6 +1682,38 @@ static void gen6_rps_irq_handler(struct
drm_i915_private *dev_priv, u32 pm_iir)
  }
  }

+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
u32 gt_iir)
+{
+if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
+spin_lock(&dev_priv->irq_lock);
+if (dev_priv->guc.interrupts_enabled) {
+/* Sample the log buffer flush related bits & clear them
+ * out now itself from the message identity register to
+ * minimize the probability of losing a flush interrupt,
+ * when there are back to back flush interrupts.
+ * There can be a new flush interrupt, for different log
+ * buffer type (like for ISR), whilst Host is handling
+ * one (for DPC). Since same bit is used in message
+ * register for ISR & DPC, 

Re: [Intel-gfx] [PATCH 01/17] drm/i915: Decouple GuC log setup from verbosity parameter

2016-07-11 Thread Goel, Akash



On 7/11/2016 5:20 PM, Tvrtko Ursulin wrote:


On 11/07/16 12:41, Goel, Akash wrote:

On 7/11/2016 3:07 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Sagar Arun Kamble <sagar.a.kam...@intel.com>

b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2112e02..8a9a0cb 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c



diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c
b/drivers/gpu/drm/i915/intel_guc_loader.c
index 605c696..b211bd0 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -175,11 +175,13 @@ static void set_guc_init_params(struct
drm_i915_private *dev_priv)
  params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
  GUC_CTL_VCS2_ENABLED;

-if (i915.guc_log_level >= 0) {
-params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+params[GUC_CTL_LOG_PARAMS] = guc->log_flags;


guc->log_flags will be zero when logging is not configured because guc
is a part of dev_priv. So it looks safe - although I reckon it would be
clearer to set this (GUC_CTL_LOG_PARAMS) explicitly inside the if-else
below?


Even if logging is not enabled at boot (due to guc_log_level < 0),
log_flags still needs to be set up & passed to the GuC firmware.
log_flags shall not be zero even when logging is not enabled (at boot
time).
Actually, log_flags will also contain the address of the log buffer.


Ah yes, I got confused by jumping between one file with your patch
applied and one without it.


+
+if (i915.guc_log_level >= 0)
  params[GUC_CTL_DEBUG] =
  i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
-}
+else
+params[GUC_CTL_DEBUG] = GUC_LOG_DISABLED;


I also wonder how come GUC_LOG_DISABLED isn't set today when
i915.guc_log_level == -1, given that:

#define   GUC_LOG_DISABLED (1 << 6)

Is that bit set by default somehow if i915 does not program it?



Yes currently GUC_LOG_DISABLED won't get set for guc_log_level = -1.
But then log buffer address will go as NULL and GUC_LOG_VALID flag will
go as 0, for guc_log_level = -1. So this way logging on GuC side will
not get enabled.
I hope I understood your concern correctly.


Yes, this clarifies it. Although I do have one more question then - what
happens if at boot i915.guc_log_level == -1 and then with later patches
logging gets enabled via debugfs - who and how sets
params[GUC_CTL_DEBUG]? Host2GuC overrides this parameter?



Yes, through the Host2GuC action type UK_LOG_ENABLE_LOGGING, the Host
will request the GuC firmware to enable/disable logging and alter the
verbosity level.

The params[GUC_CTL_DEBUG] is just part of the firmware initialization
parameters and is not used after that.
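
As a minimal sketch of the boot-time choice discussed here: GUC_LOG_DISABLED uses the value quoted in this thread, while the shift value and the helper name guc_ctl_debug_param() are assumptions made for illustration and are not taken from the patch.

```c
#include <stdint.h>

#define GUC_LOG_DISABLED        (1u << 6)  /* value quoted in the thread */
#define GUC_LOG_VERBOSITY_SHIFT 4          /* assumed value, for illustration */

/*
 * Hypothetical helper: returns the value that would be placed in
 * params[GUC_CTL_DEBUG] at firmware initialization time.
 */
static uint32_t guc_ctl_debug_param(int guc_log_level)
{
    if (guc_log_level >= 0)
        return (uint32_t)guc_log_level << GUC_LOG_VERBOSITY_SHIFT;

    /* Negative level: tell the firmware that logging starts disabled. */
    return GUC_LOG_DISABLED;
}
```

Any later change of verbosity would go through the Host2GuC action mentioned above rather than through this parameter.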

Best regards
Akash


Regards,

Tvrtko





Re: [Intel-gfx] [PATCH 01/17] drm/i915: Decouple GuC log setup from verbosity parameter

2016-07-11 Thread Goel, Akash



On 7/11/2016 3:07 PM, Tvrtko Ursulin wrote:


On 10/07/16 14:41, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2112e02..8a9a0cb 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -832,9 +832,6 @@ static void guc_create_log(struct intel_guc *guc)
  unsigned long offset;
  uint32_t size, flags;

-if (i915.guc_log_level < GUC_LOG_VERBOSITY_MIN)
-return;
-
  if (i915.guc_log_level > GUC_LOG_VERBOSITY_MAX)
  i915.guc_log_level = GUC_LOG_VERBOSITY_MAX;

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c
b/drivers/gpu/drm/i915/intel_guc_loader.c
index 605c696..b211bd0 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -175,11 +175,13 @@ static void set_guc_init_params(struct
drm_i915_private *dev_priv)
  params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
  GUC_CTL_VCS2_ENABLED;

-if (i915.guc_log_level >= 0) {
-params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+params[GUC_CTL_LOG_PARAMS] = guc->log_flags;


guc->log_flags will be zero when logging is not configured because guc
is a part of dev_priv. So it looks safe - although I reckon it would be
clearer to set this (GUC_CTL_LOG_PARAMS) explicitly inside the if-else
below?


Even if logging is not enabled at boot (due to guc_log_level < 0),
log_flags still needs to be set up & passed to the GuC firmware.
log_flags shall not be zero even when logging is not enabled (at boot
time).

Actually, log_flags will also contain the address of the log buffer.




+
+if (i915.guc_log_level >= 0)
  params[GUC_CTL_DEBUG] =
  i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
-}
+else
+params[GUC_CTL_DEBUG] = GUC_LOG_DISABLED;


I also wonder how come GUC_LOG_DISABLED isn't set today when
i915.guc_log_level == -1, given that:

#define   GUC_LOG_DISABLED (1 << 6)

Is that bit set by default somehow if i915 does not program it?



Yes currently GUC_LOG_DISABLED won't get set for guc_log_level = -1.
But then log buffer address will go as NULL and GUC_LOG_VALID flag will
go as 0, for guc_log_level = -1. So this way logging on GuC side will 
not get enabled.

I hope I understood your concern correctly.

Best regards
Akash



  if (guc->ads_obj) {
  u32 ads = (u32)i915_gem_obj_ggtt_offset(guc->ads_obj)



Regards,

Tvrtko




Re: [Intel-gfx] [PATCH 03/14] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set

2016-07-03 Thread Goel, Akash



On 7/3/2016 3:08 PM, Chris Wilson wrote:

On Sun, Jul 03, 2016 at 12:21:20AM +0530, akash.g...@intel.com wrote:

From: Akash Goel 

So far PM IER/IIR/IMR registers were being used only for Turbo related
interrupts. But interrupts coming from GuC also use the same set.
As a precursor to supporting GuC interrupts, added new low level routines
so as to allow sharing the programming of PM IER/IIR/IMR registers between
Turbo & GuC.
Also similar to PM IMR, maintaining a bitmask for PM IER register, to allow
easy sharing of it between Turbo & GuC without involving a rmw operation.
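
The shadow-bitmask idea can be sketched as follows. This is a simplified model, not the driver code: struct pm_irq_state, hw_write_ier() and the hw_ier field (which stands in for the real register) are hypothetical names. The point is that each update derives the full register value from the software copy, so the hardware register is never read back.

```c
#include <stdint.h>

/* Hypothetical model of the PM IER register and its software shadow. */
struct pm_irq_state {
    uint32_t pm_ier;   /* software shadow of the IER register */
    uint32_t hw_ier;   /* stands in for the hardware register */
};

/* Stand-in for a register write. */
static void hw_write_ier(struct pm_irq_state *s, uint32_t val)
{
    s->hw_ier = val;
}

/* Enable some interrupt bits without reading the register back. */
static void pm_irq_enable(struct pm_irq_state *s, uint32_t mask)
{
    s->pm_ier |= mask;
    hw_write_ier(s, s->pm_ier);
}

/* Disable some interrupt bits; the other user's bits are left intact. */
static void pm_irq_disable(struct pm_irq_state *s, uint32_t mask)
{
    s->pm_ier &= ~mask;
    hw_write_ier(s, s->pm_ier);
}
```

With two users (say Turbo and GuC) owning disjoint bits, either one can enable or disable its own bits without disturbing the other's.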

v2:
- For appropriateness & avoid any ambiguity, rename old functions
  enable/disable pm_irq to mask/unmask pm_irq and rename new functions
  enable/disable pm_interrupts to enable/disable pm_irq. (Tvrtko)
- Use u32 in place of uint32_t. (Tvrtko)

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_drv.h |  1 +
 drivers/gpu/drm/i915/i915_irq.c | 63 -
 drivers/gpu/drm/i915/intel_drv.h|  3 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  4 +--
 4 files changed, 53 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9ef4919..85a7103 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1806,6 +1806,7 @@ struct drm_i915_private {
};
u32 gt_irq_mask;
u32 pm_irq_mask;
+   u32 pm_ier_mask;


Oops. u32 pm_imr; and u32 pm_ier;


Fine, will rename.


u32 pm_rps_events;
u32 pipestat_irq_mask[I915_MAX_PIPES];

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 4378a65..dd5ae6d 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -314,7 +314,7 @@ static void snb_update_pm_irq(struct drm_i915_private 
*dev_priv,
}
 }

-void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
+void gen6_unmask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
if (WARN_ON(!intel_irqs_enabled(dev_priv)))
return;
@@ -322,28 +322,62 @@ void gen6_enable_pm_irq(struct drm_i915_private 
*dev_priv, uint32_t mask)
snb_update_pm_irq(dev_priv, mask, mask);
 }

-static void __gen6_disable_pm_irq(struct drm_i915_private *dev_priv,
- uint32_t mask)
+static void __gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
snb_update_pm_irq(dev_priv, mask, 0);
 }

-void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
+void gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
if (WARN_ON(!intel_irqs_enabled(dev_priv)))
return;

-   __gen6_disable_pm_irq(dev_priv, mask);
+   __gen6_mask_pm_irq(dev_priv, mask);
 }

-void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv)
+void gen6_reset_pm_irq(struct drm_i915_private *dev_priv, u32 reset_mask)


reset_pm_iir


Thanks, will update.


 {
i915_reg_t reg = gen6_pm_iir(dev_priv);

-   spin_lock_irq(&dev_priv->irq_lock);
-   I915_WRITE(reg, dev_priv->pm_rps_events);
-   I915_WRITE(reg, dev_priv->pm_rps_events);
+   assert_spin_locked(&dev_priv->irq_lock);
+
+   I915_WRITE(reg, reset_mask);
+   I915_WRITE(reg, reset_mask);
POSTING_READ(reg);
+}
+
+void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, u32 enable_mask)
+{
+   u32 new_val;
+
+   assert_spin_locked(&dev_priv->irq_lock);
+
+   new_val = dev_priv->pm_ier_mask;
+   new_val |= enable_mask;
+
+   dev_priv->pm_ier_mask = new_val;


dev_priv->pm_ier |= new_val;


Sorry, my bad.



+   I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier_mask);
+   gen6_unmask_pm_irq(dev_priv, enable_mask);


What barrier do you need between the hw and the caller? I presume there
is a POSTING_READ in this callchain, would be nice to document it.

/* unmask_pm_irq provides a POSTING_READ */


Thanks, will add the comment.
So will assume that POSTING_READ is good enough here.


+}
+
+void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, u32 disable_mask)
+{
+   u32 new_val;
+
+   assert_spin_locked(&dev_priv->irq_lock);
+
+   new_val = dev_priv->pm_ier_mask;
+   new_val &= ~disable_mask;
+
+   dev_priv->pm_ier_mask = new_val;


dev_priv->pm_ier &= ~disable_mask;


+   __gen6_mask_pm_irq(dev_priv, disable_mask);
+   I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier_mask);


Do we need a barrier upon disabling? (Usually we need a stronger barrier
on enabling to ensure we don't miss an interrupt when enabling, but for
disabling we don't care.)

So no modification needed here, as you mentioned that we don't need to 
care about the register update getting completed in the disabling case.



+}
+
+void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv)
+{
+   

Re: [Intel-gfx] [PATCH 13/14] drm/i915: Add stats for GuC log buffer flush interrupts

2016-07-03 Thread Goel, Akash



On 7/3/2016 3:14 PM, Chris Wilson wrote:

On Sun, Jul 03, 2016 at 12:21:30AM +0530, akash.g...@intel.com wrote:

From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain a statistics of how many flush


For what purpose? Would not tracepoints be even more useful?
Having stats would be useful to get an idea of the volume & the rate
at which logs are being generated on the GuC side, and whether the
driver is quick enough to capture all of them.


Yes tracepoint would also be very useful.

Please see below the logging related stats, in the output of
‘i915_guc_info’ on execution of ‘gem_exec_nop’ IGT.

GuC total action count: 623531
GuC action failure count: 0
GuC last action command: 0x30
GuC last action status: 0xf000
GuC last action error code: 0

GuC submissions:
render ring :9019910, last seqno 0x01a4390b
blitter ring:6188291, last seqno 0x01a4390d
bsd ring:6179075, last seqno 0x01a4390c
video enhancement ring  :6156547, last seqno 0x01a4390e
Total: 27543823

GuC execbuf client @ 8801659fb100:
Priority 2, GuC ctx index: 0, PD offset 0x800
Doorbell id 0, offset: 0x0, cookie 0x1a4490f
WQ size 8192, offset: 0x1000, tail 4336
Work queue full: 0
Failed to queue: 0
Failed doorbell: 0
Last submission result: 0
Submissions: 9019910 render ring
Submissions: 6188291 blitter ring
Submissions: 6179075 bsd ring
Submissions: 6156547 video enhancement ring
Total: 27543823

GuC logging stats:
ISR:   flush count 321718, overflow count 0
DPC:   flush count 303788, overflow count 1
CRASH: flush count      0, overflow count 0
Total flush interrupt count: 625511


Best regards
Akash

-Chris




Re: [Intel-gfx] [PATCH 05/14] drm/i915: Handle log buffer flush interrupt event from GuC

2016-07-03 Thread Goel, Akash



On 7/3/2016 5:51 PM, Goel, Akash wrote:



On 7/3/2016 2:45 PM, Chris Wilson wrote:

On Sun, Jul 03, 2016 at 12:21:22AM +0530, akash.g...@intel.com wrote:

+static void guc_read_update_log_buffer(struct drm_device *dev, bool
capture_all)
+{
+struct drm_i915_private *dev_priv = dev->dev_private;
+struct intel_guc *guc = &dev_priv->guc;
+struct guc_log_buffer_state *log_buffer_state;
+struct guc_log_buffer_state *log_buffer_copy_state;
+void *src_ptr, *dst_ptr;
+u32 num_pages_to_copy;
+int i;
+
+if (!guc->log.obj)
+return;
+
+num_pages_to_copy = guc->log.obj->base.size / PAGE_SIZE;
+/* Don't really need to copy crash buffer area in regular cases as there
+ * won't be any unread data there.
+ */
+if (!capture_all)
+num_pages_to_copy -= (GUC_LOG_CRASH_PAGES + 1);
+
+log_buffer_state = src_ptr =
+kmap_atomic(i915_gem_object_get_page(guc->log.obj, 0));


So why not use i915_gem_object_pin_map() from the start?

That will cut down on the churn later.


Fine, will reorder the series and squash the other patch 'drm/i915: Use
uncached(WC) mapping for accessing the GuC log buffer' with this patch.

Sorry got confused, will use the i915_gem_object_pin_map() here instead 
of kmap and keep the WC mapping patch at the end of series only. Then 
will just have to modify the call to i915_gem_object_pin_map() to pass 
the WC flag.




Best regards
Akash

-Chris




Re: [Intel-gfx] [PATCH 05/14] drm/i915: Handle log buffer flush interrupt event from GuC

2016-07-03 Thread Goel, Akash



On 7/3/2016 2:45 PM, Chris Wilson wrote:

On Sun, Jul 03, 2016 at 12:21:22AM +0530, akash.g...@intel.com wrote:

+static void guc_read_update_log_buffer(struct drm_device *dev, bool 
capture_all)
+{
+   struct drm_i915_private *dev_priv = dev->dev_private;
+   struct intel_guc *guc = &dev_priv->guc;
+   struct guc_log_buffer_state *log_buffer_state;
+   struct guc_log_buffer_state *log_buffer_copy_state;
+   void *src_ptr, *dst_ptr;
+   u32 num_pages_to_copy;
+   int i;
+
+   if (!guc->log.obj)
+   return;
+
+   num_pages_to_copy = guc->log.obj->base.size / PAGE_SIZE;
+   /* Don't really need to copy crash buffer area in regular cases as there
+* won't be any unread data there.
+*/
+   if (!capture_all)
+   num_pages_to_copy -= (GUC_LOG_CRASH_PAGES + 1);
+
+   log_buffer_state = src_ptr =
+   kmap_atomic(i915_gem_object_get_page(guc->log.obj, 0));


So why not use i915_gem_object_pin_map() from the start?

That will cut down on the churn later.


Fine, will reorder the series and squash the other patch 'drm/i915: Use 
uncached(WC) mapping for accessing the GuC log buffer' with this patch.


Best regards
Akash

-Chris




Re: [Intel-gfx] [PATCH 04/11] drm/i915: Support for GuC interrupts

2016-07-01 Thread Goel, Akash



On 7/1/2016 2:17 PM, Tvrtko Ursulin wrote:



On 01/07/16 07:16, Goel, Akash wrote:

[snip]




+/* Process all the GuC to Host events in bottom half */
+gen6_disable_pm_irq(dev_priv,
+GEN9_GUC_TO_HOST_INT_EVENT);


Why it is important to disable the interrupt here? Not for the queue
work I think.


We want to & can handle one interrupt at a time, unless the queued work
item is executed we can't process the next interrupt, so better to keep
the interrupt masked.
Sorry this is what is my understanding.


So it is queued in hardware and will get asserted when unmasked?

As per my understanding, if the interrupt is masked (IMR), it won't be
queued, will be ignored & so will not be asserted on unmasking.

If the interrupt wasn't masked, but was disabled (in IER), then it
will be asserted (in IIR) when it's enabled.
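
The behaviour described above can be written down as a tiny model. It only encodes the understanding stated in this mail and is not verified hardware behaviour; struct irq_regs, hw_raise() and cpu_pending() are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical model of one IMR/IER/IIR register triplet. */
struct irq_regs {
    uint32_t imr;   /* set bit = event is masked */
    uint32_t ier;   /* set bit = event may interrupt the CPU */
    uint32_t iir;   /* set bit = event has been latched */
};

/* Hardware raises an event: masked events are dropped, not latched. */
static void hw_raise(struct irq_regs *r, uint32_t bit)
{
    if (r->imr & bit)
        return;
    r->iir |= bit;
}

/* Events that would currently be delivered to the CPU. */
static uint32_t cpu_pending(const struct irq_regs *r)
{
    return r->iir & r->ier;
}
```

In this model an event raised while masked is gone for good, whereas an event raised while merely disabled in IER shows up as soon as its IER bit is set.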







Also, is it safe with regards to potentially losing the interrupt?


Particularly for the FLUSH_LOG_BUFFER case, GuC won't send a new flush
interrupt unless it gets an acknowledgement (flush signal) of the
previous one from Host.


Ah so the previous comment is really impossible? I mean the need to
mask?


Sorry my comments were not fully correct. GuC can send a new flush
interrupt, even if the previous one is pending, but that will be for a
different log buffer type (3 types of log buffer ISR, DPC, CRASH).
For the same buffer type, GuC won't send a new flush interrupt unless
it gets an acknowledgement of the previous one from Host.

But as you said the workqueue is ordered and furthermore there is a
single instance of work item, so the serialization will be provided
implicitly and there is no real need to mask the interrupt.

As mentioned above, a new flush interrupt can come while the previous
one is being processed on Host but due to a single instance of work item
either that new interrupt will not do anything effectively if work
item was in a pending state or will re queue the work item if it was
getting executed at that time.

Also the state of all 3 log buffer types are being parsed irrespective
for which one the interrupt actually came, and the whole buffer is being
captured (this is how it has been recommended to handle the flush
interrupts from Host side). So if a new interrupt comes while the work
item was in a pending state, then effectively work of this new interrupt
will also be done when work item is executed later.
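
The single-work-item behaviour described above can be modelled in a few lines. This is a simplified sketch that assumes the pending state is cleared just before the callback runs; struct work_item, queue_work_once(), work_begin() and work_end() are hypothetical names, not kernel APIs.

```c
#include <stdbool.h>

/* Hypothetical model of a single work item on an ordered queue. */
struct work_item {
    bool pending;        /* queued but not yet started */
    unsigned int runs;   /* completed executions */
};

/* Interrupt side: queuing is a no-op if the item is already pending. */
static bool queue_work_once(struct work_item *w)
{
    if (w->pending)
        return false;
    w->pending = true;
    return true;
}

/*
 * Worker side: pending is cleared before the callback body runs, so an
 * interrupt arriving during execution can queue the item again.
 */
static bool work_begin(struct work_item *w)
{
    if (!w->pending)
        return false;
    w->pending = false;
    return true;
}

static void work_end(struct work_item *w)
{
    w->runs++;
}
```

Two interrupts arriving while the item is pending result in one execution, while an interrupt arriving during execution results in a second one.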

So will remove the masking then ?


I think so, because if I understood what you wrote, masking can lose us
an interrupt.



If a new flush interrupt comes while the work item was getting executed
then there is a potential of losing an opportunity to sample the log buffer.
Will not mask the interrupt.
Thanks for persisting on this.



Possibly just put a comment up there explaining that.




+queue_work(dev_priv->wq, &dev_priv->guc.events_work);


Because dev_priv->wq is a one at a time in order wq so if something else is
running on it and taking time, can that also be a cause of dropping an
interrupt or being late with sending the flush signal to the guc and so
losing some logs?


It's the driver's private workqueue, and the Turbo work item is also
queued on this workqueue, which too needs to be executed without much
delay.
But yes, the flush work item can get substantially delayed if there are
other work items queued before it, especially the mm.retire_work (but
that generally executes every ~1 second).

Best would be if the log buffer (44KB data) can be sampled in IRQ
context (or Tasklet context) itself.


I was just trying to understand if you perhaps need a dedicated wq. I
don't have a feel at all on how much data guc logging generates per
second. If the interrupt is low frequency even with a lot of cmd
submission happening it could be fine like it is.


Actually with the maximum verbosity level, I am seeing a flush interrupt
every ms with the 'gem_exec_nop' IGT, as there are a lot of submissions
being done. But that may not happen in a real-life scenario.

I think, if needed, later on we can either have a dedicated high
priority work queue for logging work or use the tasklet context to do
the processing.


Hm, do you need to add some DRM_ERROR or something if wq starts lagging
behind the flush interrupts? How many missed flush interrupts can we
afford before the logging buffer starts getting overwritten?



Actually if GuC is producing logs at such a fast rate then we can't 
afford to miss even a single interrupt, if we don't want to lose any logs.

When the log buffer becomes half full, GuC sends a flush interrupt.
GuC firmware expects that while it is writing to 2nd half of the buffer,
first half would get consumed by Host and then get a flush completed
acknowledgement from Host, so that it does not end up doing any 
overwrite causing loss of logs.
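
A much simplified model of this handshake, assuming a fixed-size buffer tracked only by a count of unread entries. LOG_SIZE, struct log_buf, producer_write() and host_flush() are hypothetical names, and full_cnt merely plays the role of an overflow counter; none of this is the firmware's actual layout.

```c
#define LOG_SIZE 8u   /* illustrative capacity, in entries */

/* Hypothetical producer/consumer bookkeeping for one log buffer. */
struct log_buf {
    unsigned int unread;       /* entries written but not yet consumed */
    unsigned int flush_irqs;   /* flush interrupts sent to the host */
    unsigned int full_cnt;     /* overflows detected by the producer */
};

/* Producer (firmware side): write one entry. */
static void producer_write(struct log_buf *b)
{
    if (b->unread == LOG_SIZE) {
        b->full_cnt++;         /* no room left: an unread entry is lost */
        return;
    }
    b->unread++;
    if (b->unread == LOG_SIZE / 2)
        b->flush_irqs++;       /* half full: ask the host to flush */
}

/* Consumer (host side): drain everything and acknowledge the flush. */
static unsigned int host_flush(struct log_buf *b)
{
    unsigned int n = b->unread;

    b->unread = 0;
    return n;
}
```

If the host drains the buffer before the second half fills up, the overflow counter stays at zero; if it falls behind, the counter starts to grow.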


There is a buffer_full_cnt field in the state structure which GuC 
firmware increments every time it detects a potential log buffer 
overflow. Probably this

Re: [Intel-gfx] [PATCH 04/11] drm/i915: Support for GuC interrupts

2016-07-01 Thread Goel, Akash



On 6/28/2016 7:14 PM, Tvrtko Ursulin wrote:


On 28/06/16 12:12, Goel, Akash wrote:



On 6/28/2016 3:33 PM, Tvrtko Ursulin wrote:


On 27/06/16 13:16, akash.g...@intel.com wrote:

From: Sagar Arun Kamble <sagar.a.kam...@intel.com>

There are certain types of interrupts which Host can receive from GuC.
GuC ukernel sends an interrupt to Host for certain events, like for
example retrieve/consume the logs generated by ukernel.
This patch adds support to receive interrupts from GuC but currently
enables & partially handles only the interrupt sent by GuC ukernel.
Future patches will add support for handling other interrupt types.

v2: Use common low level routines for PM IER/IIR programming (Chris)
 Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
 Replace disabling of wake ref asserts with rpm get/put (Chris)

Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com>
Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/i915_drv.h|  1 +
  drivers/gpu/drm/i915/i915_guc_submission.c |  5 ++
  drivers/gpu/drm/i915/i915_irq.c| 95
--
  drivers/gpu/drm/i915/i915_reg.h| 11 
  drivers/gpu/drm/i915/intel_drv.h   |  3 +
  drivers/gpu/drm/i915/intel_guc.h   |  5 ++
  drivers/gpu/drm/i915/intel_guc_loader.c|  4 ++
  7 files changed, 120 insertions(+), 4 deletions(-)



+static void gen9_guc2host_events_work(struct work_struct *work)
+{
+struct drm_i915_private *dev_priv =
+container_of(work, struct drm_i915_private, guc.events_work);
+
+spin_lock_irq(&dev_priv->irq_lock);
+/* Speed up work cancelation during disabling guc interrupts. */
+if (!dev_priv->guc.interrupts_enabled) {
+spin_unlock_irq(&dev_priv->irq_lock);
+return;
+}
+
+/* Though this work item gets synced during rpm suspend, but still need
+ * a rpm get/put to avoid the warning, as it could get executed in a
+ * window, where rpm ref count has dropped to zero but rpm suspend has
+ * not kicked in. Generally device is expected to be active only at this
+ * time so get/put should be really quick.
+ */
+intel_runtime_pm_get(dev_priv);
+
+gen6_enable_pm_irq(dev_priv, GEN9_GUC_TO_HOST_INT_EVENT);
+spin_unlock_irq(_priv->irq_lock);
+
+/* TODO: Handle the events for which GuC interrupted host */
+
+intel_runtime_pm_put(dev_priv);
+}

  static bool bxt_port_hotplug_long_detect(enum port port, u32 val)
@@ -1653,6 +1722,20 @@ static void gen6_rps_irq_handler(struct
drm_i915_private *dev_priv, u32 pm_iir)
  }
  }

+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
u32 gt_iir)
+{
+if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
+spin_lock(&dev_priv->irq_lock);
+if (dev_priv->guc.interrupts_enabled) {


So it is expected interrupts will always be enabled when
i915.guc_log_level is set, correct?


Yes currently only when guc_log_level > 0, interrupt should be enabled.

But we need to disable/enable the interrupt upon suspend/resume and
across GPU reset.
So the interrupt may not always be in an enabled state when guc_log_level>0.


Also do you need to check against dev_priv->guc.interrupts_enabled at
all then? Or from an opposite angle, would you instead need to log the
fact unexpected interrupt was received here?


I think this check is needed, to avoid the race in disabling interrupt.
Please refer the sequence in interrupt disabling function (same as rps
disabling), there we first set the interrupts_enabled flag to false,
then wait for the work item to finish execution and then program the IMR
register.
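The ordering described above (clear the flag, let the work item finish, then program the IMR) can be illustrated with a small single-threaded model. This is only a sketch: the struct, the function names and the bit value are made up for illustration and are not the i915 code, and the spinlock and real work-queue synchronisation are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define GUC_EVENT_BIT 0x1u /* hypothetical bit, not the real register layout */

struct guc_irq_model {
	bool interrupts_enabled; /* software flag checked by the handler */
	bool work_queued;        /* stands in for the queued work item */
	uint32_t imr;            /* set bit = interrupt masked */
};

/* Top half: only mask the interrupt and queue work while the flag is set. */
static bool model_irq_handler(struct guc_irq_model *m)
{
	if (!m->interrupts_enabled)
		return false;
	m->imr |= GUC_EVENT_BIT;
	m->work_queued = true;
	return true;
}

/* Bottom half: bail out early if disabling is in progress, else unmask. */
static void model_work(struct guc_irq_model *m)
{
	m->work_queued = false;
	if (!m->interrupts_enabled)
		return;
	m->imr &= ~GUC_EVENT_BIT;
}

/* Disable path: clear the flag first, let pending work finish, then mask. */
static void model_disable(struct guc_irq_model *m)
{
	m->interrupts_enabled = false;
	if (m->work_queued)
		model_work(m);
	m->imr |= GUC_EVENT_BIT;
}
```

Because the flag is cleared before anything else, a handler or work item that runs late sees it and does nothing, so the interrupt cannot be unmasked again behind the back of the disable path.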


Right I see now that it is copy-pasted existing sequence. In this case I
won't question it further. :)




+/* Process all the GuC to Host events in bottom half */
+gen6_disable_pm_irq(dev_priv,
+GEN9_GUC_TO_HOST_INT_EVENT);


Why it is important to disable the interrupt here? Not for the queue
work I think.


We want to & can handle one interrupt at a time, unless the queued work
item is executed we can't process the next interrupt, so better to keep
the interrupt masked.
Sorry this is what is my understanding.


So it is queued in hardware and will get asserted when unmasked?

As per my understanding, if the interrupt is masked (IMR), it won't be
queued, will be ignored & so will not be asserted on unmasking.

If the interrupt wasn't masked, but was disabled (in IER) then it
will be asserted (in IIR) when it's enabled.
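The difference between the two cases can be written down as a toy register model. It encodes only the understanding stated in this reply and has not been checked against hardware documentation; the struct and function names are invented for the example.

```c
#include <stdint.h>

struct irq_regs {
	uint32_t imr; /* set bit = source masked */
	uint32_t ier; /* set bit = delivery to the CPU enabled */
	uint32_t iir; /* set bit = event latched */
};

/* Hardware side: an event on a masked source is dropped, not remembered. */
static void raise_event(struct irq_regs *r, uint32_t bit)
{
	if (r->imr & bit)
		return;
	r->iir |= bit;
}

/* Events that would actually interrupt the CPU right now. */
static uint32_t pending_for_cpu(const struct irq_regs *r)
{
	return r->iir & r->ier;
}
```

In this model an event raised while the IMR bit is set is lost even after unmasking, whereas an event raised with the IER bit clear stays latched in IIR and becomes pending once IER is set.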







Also, is it safe with regards to potentially losing the interrupt?


Particularly for the FLUSH_LOG_BUFFER case, GuC won't send a new flush
interrupt unless it gets an acknowledgement (flush signal) of the
previous one from Host.


Ah so the previous comment is really impossible? I mean the need to mask?


Sorry my comments were not fully correct. GuC can send a new flush 
interrupt, even if the previous one 

Re: [Intel-gfx] [PATCH 04/11] drm/i915: Support for GuC interrupts

2016-06-28 Thread Goel, Akash



On 6/28/2016 3:33 PM, Tvrtko Ursulin wrote:


On 27/06/16 13:16, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

There are certain types of interrupts which Host can receive from GuC.
GuC ukernel sends an interrupt to Host for certain events, like for
example retrieve/consume the logs generated by ukernel.
This patch adds support to receive interrupts from GuC but currently
enables & partially handles only the interrupt sent by GuC ukernel.
Future patches will add support for handling other interrupt types.

v2: Use common low level routines for PM IER/IIR programming (Chris)
 Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
 Replace disabling of wake ref asserts with rpm get/put (Chris)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_drv.h|  1 +
  drivers/gpu/drm/i915/i915_guc_submission.c |  5 ++
  drivers/gpu/drm/i915/i915_irq.c| 95
--
  drivers/gpu/drm/i915/i915_reg.h| 11 
  drivers/gpu/drm/i915/intel_drv.h   |  3 +
  drivers/gpu/drm/i915/intel_guc.h   |  5 ++
  drivers/gpu/drm/i915/intel_guc_loader.c|  4 ++
  7 files changed, 120 insertions(+), 4 deletions(-)



+static void gen9_guc2host_events_work(struct work_struct *work)
+{
+struct drm_i915_private *dev_priv =
+container_of(work, struct drm_i915_private, guc.events_work);
+
+spin_lock_irq(&dev_priv->irq_lock);
+/* Speed up work cancelation during disabling guc interrupts. */
+if (!dev_priv->guc.interrupts_enabled) {
+spin_unlock_irq(&dev_priv->irq_lock);
+return;
+}
+
+/* Though this work item gets synced during rpm suspend, but still need
+ * a rpm get/put to avoid the warning, as it could get executed in a
+ * window, where rpm ref count has dropped to zero but rpm suspend has
+ * not kicked in. Generally device is expected to be active only at this
+ * time so get/put should be really quick.
+ */
+intel_runtime_pm_get(dev_priv);
+
+gen6_enable_pm_irq(dev_priv, GEN9_GUC_TO_HOST_INT_EVENT);
+spin_unlock_irq(&dev_priv->irq_lock);
+
+/* TODO: Handle the events for which GuC interrupted host */
+
+intel_runtime_pm_put(dev_priv);
+}

  /**
   * ivybridge_parity_work - Workqueue called when a parity error
interrupt
@@ -1371,11 +1435,13 @@ static irqreturn_t gen8_gt_irq_ack(struct
drm_i915_private *dev_priv,
  DRM_ERROR("The master control interrupt lied (GT3)!\n");
  }

-if (master_ctl & GEN8_GT_PM_IRQ) {
+if (master_ctl & (GEN8_GT_PM_IRQ | GEN8_GT_GUC_IRQ)) {
  gt_iir[2] = I915_READ_FW(GEN8_GT_IIR(2));
-if (gt_iir[2] & dev_priv->pm_rps_events) {
+if (gt_iir[2] & (dev_priv->pm_rps_events |
+ dev_priv->guc_events)) {
  I915_WRITE_FW(GEN8_GT_IIR(2),
-  gt_iir[2] & dev_priv->pm_rps_events);
+  gt_iir[2] & (dev_priv->pm_rps_events |
+   dev_priv->guc_events));
  ret = IRQ_HANDLED;
  } else
  DRM_ERROR("The master control interrupt lied (PM)!\n");
@@ -1407,6 +1473,9 @@ static void gen8_gt_irq_handler(struct
drm_i915_private *dev_priv,

  if (gt_iir[2] & dev_priv->pm_rps_events)
  gen6_rps_irq_handler(dev_priv, gt_iir[2]);
+
+if (gt_iir[2] & dev_priv->guc_events)
+gen9_guc_irq_handler(dev_priv, gt_iir[2]);
  }

  static bool bxt_port_hotplug_long_detect(enum port port, u32 val)
@@ -1653,6 +1722,20 @@ static void gen6_rps_irq_handler(struct
drm_i915_private *dev_priv, u32 pm_iir)
  }
  }

+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
u32 gt_iir)
+{
+if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
+spin_lock(&dev_priv->irq_lock);
+if (dev_priv->guc.interrupts_enabled) {


So it is expected interrupts will always be enabled when
i915.guc_log_level is set, correct?


Yes currently only when guc_log_level > 0, interrupt should be enabled.

But we need to disable/enable the interrupt upon suspend/resume and
across GPU reset.
So the interrupt may not always be in an enabled state when guc_log_level>0.


Also do you need to check against dev_priv->guc.interrupts_enabled at
all then? Or from an opposite angle, would you instead need to log the
fact unexpected interrupt was received here?


I think this check is needed, to avoid the race in disabling interrupt.
Please refer the sequence in interrupt disabling function (same as rps
disabling), there we first set the interrupts_enabled flag to false,
then wait for the work item to finish execution and then program the IMR 
register.





+/* Process all the GuC to Host events in bottom half */
+gen6_disable_pm_irq(dev_priv,
+GEN9_GUC_TO_HOST_INT_EVENT);


Why it is important to disable the interrupt here? Not for the queue
work I think.


We 

Re: [Intel-gfx] [PATCH 10/11] drm/i915: Support to create write combined type vmaps

2016-06-28 Thread Goel, Akash



On 6/28/2016 3:22 PM, Chris Wilson wrote:

On Mon, Jun 27, 2016 at 05:46:57PM +0530, akash.g...@intel.com wrote:

From: Chris Wilson 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 20c701c..3ef1ee5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3197,6 +3197,7 @@ static inline void i915_gem_object_unpin_pages(struct 
drm_i915_gem_object *obj)
 /**
  * i915_gem_object_pin_map - return a contiguous mapping of the entire object
  * @obj - the object to map into kernel address space
+ * &use_wc - whether the mapping should be using WC or WB pgprot_t


s/&/@/ I think


Sorry my bad.




 /* get, pin, and map the pages of the object into kernel space */
-void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj)
+void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj, bool use_wc)
 {
+   void *ptr;
+   bool has_wc;
+   bool pinned;
int ret;

lockdep_assert_held(&obj->base.dev->struct_mutex);
+   GEM_BUG_ON((obj->ops->flags & I915_GEM_OBJECT_HAS_STRUCT_PAGE) == 0);

ret = i915_gem_object_get_pages(obj);
if (ret)
return ERR_PTR(ret);

+   GEM_BUG_ON(obj->pages == NULL);
i915_gem_object_pin_pages(obj);

-   if (!obj->mapping) {
-   obj->mapping = i915_gem_object_map(obj);
-   if (!obj->mapping) {
-   i915_gem_object_unpin_pages(obj);
-   return ERR_PTR(-ENOMEM);
+   pinned = (obj->pages_pin_count > 1);


Too many ()


Sorry, is the above condition not correct?
If the pin count is more than 1, it implies that the pages have been pinned
elsewhere as well, so they were already pinned before being pinned
one more time inside this function.
Please let me know, will fix it.

Best regards
Akash


Hmm. It may look a bit dubious if I add my r-b here. But I didn't spot
any rebasing errors.
-Chris


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/11] drm/i915: Add a relay backed debugfs interface for capturing GuC logs

2016-06-28 Thread Goel, Akash



On 6/28/2016 3:17 PM, Chris Wilson wrote:

On Mon, Jun 27, 2016 at 05:46:53PM +0530, akash.g...@intel.com wrote:

+static void guc_remove_log_relay_file(struct intel_guc *guc)
+{
+   relay_close(guc->log_relay_chan);
+}
+
+static void guc_create_log_relay_file(struct intel_guc *guc)
+{
+   struct drm_i915_private *dev_priv = guc_to_i915(guc);
+   struct drm_device *dev = dev_priv->dev;
+   struct dentry *log_dir;
+   struct rchan *guc_log_relay_chan;
+   size_t n_subbufs, subbuf_size;
+
+   if (guc->log_relay_chan)
+   return;
+
+   /* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
+* not mounted and so can't create the relay file.
+* The relay API seems to fit well with debugfs only.
+*/


Ah. dev->primary->debugfs_root does not exist until the end of driver
loading.

You need to add an intel_guc_register() to the i915_driver_register()
after we call drm_dev_register() (that then calls this function).

Similarly, this needs to be torn down in unregister.


Yes, realized this today, that can’t get to the ‘dri’ directory until
the end of Driver load.
So will have to create the relay file after i915_driver_register().




+   if (!dev->primary->debugfs_root) {
+   /* logging will remain off */
+   i915.guc_log_level = -1;
+   return;
+   }
+
+   /* For now create the log file in /sys/kernel/debug/dri dir. */
+   log_dir = dev->primary->debugfs_root->d_parent;


In future, this will be something like /sys/kernel/gpu/i915/guc_log, so
I don't see a good argument for not being more canonical in the debugfs
placement and using dev->primary->debugfs_root (i.e. /.../dri/0)


Yes can now use the dev->primary->debugfs_root itself.

Actually earlier 'i915_debugfs_files' were being created inside other
drm_minor directories also (i.e. dri/64 & dri/128), but now they are
restricted only to dri/0.

Best regards
Akash


At the very least, you need to explain why we don't use dri/0/
-Chris




Re: [Intel-gfx] [PATCH 03/11] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set

2016-06-28 Thread Goel, Akash



On 6/28/2016 2:05 PM, Tvrtko Ursulin wrote:


On 27/06/16 17:35, Goel, Akash wrote:

On 6/27/2016 9:16 PM, Tvrtko Ursulin wrote:


On 27/06/16 13:16, akash.g...@intel.com wrote:

From: Akash Goel <akash.g...@intel.com>

So far PM IER/IIR/IMR registers were being used only for Turbo related
interrupts. But interrupts coming from GuC also use the same set.
As a precursor to supporting GuC interrupts, added new low level
routines
so as to allow sharing the programming of PM IER/IIR/IMR registers
between
Turbo & GuC.
Also similar to PM IMR, maintaining a bitmask for PM IER register, to
allow
easy sharing of it between Turbo & GuC without involving a rmw
operation.
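The idea of keeping a software copy of the IER bits, so that two users can share the register without a read-modify-write, can be sketched as follows. The struct and function names here are hypothetical and the register is simulated by a plain variable; this is not the patch's code.

```c
#include <stdint.h>

struct pm_irq_model {
	uint32_t pm_ier_mask; /* software copy of the enable bits */
	uint32_t ier_reg;     /* stands in for the hardware IER register */
};

/* Add bits to the cached mask, then write the whole mask out. */
static void model_enable_pm(struct pm_irq_model *m, uint32_t mask)
{
	m->pm_ier_mask |= mask;
	m->ier_reg = m->pm_ier_mask; /* plain write, no register read */
}

/* Remove bits from the cached mask, then write the whole mask out. */
static void model_disable_pm(struct pm_irq_model *m, uint32_t mask)
{
	m->pm_ier_mask &= ~mask;
	m->ier_reg = m->pm_ier_mask;
}
```

With this layout, disabling one user's bits (say Turbo) leaves the other user's bits (say GuC) intact, because every write is derived from the cached mask rather than from a register read.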

Suggested-by: Chris Wilson <ch...@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/i915_drv.h  |  1 +
  drivers/gpu/drm/i915/i915_irq.c  | 55

  drivers/gpu/drm/i915/intel_drv.h |  6 +
  3 files changed, 52 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
index 9ef4919..85a7103 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1806,6 +1806,7 @@ struct drm_i915_private {
  };
  u32 gt_irq_mask;
  u32 pm_irq_mask;
+u32 pm_ier_mask;
  u32 pm_rps_events;
  u32 pipestat_irq_mask[I915_MAX_PIPES];

diff --git a/drivers/gpu/drm/i915/i915_irq.c
b/drivers/gpu/drm/i915/i915_irq.c
index 4378a65..7316ab4 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -336,14 +336,52 @@ void gen6_disable_pm_irq(struct drm_i915_private
*dev_priv, uint32_t mask)
  __gen6_disable_pm_irq(dev_priv, mask);
  }

-void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv)
+void gen6_reset_pm_interrupts(struct drm_i915_private *dev_priv,
+   uint32_t reset_mask)


Kernel prefers u32. It is not that overall i915 is clean in that
respect, but every time maintainers merge patches checkpatch shouts
about it, and the more noise there is, the tougher it is to spot more
important issues. I would appreciate if u32 was used throughout.


Fine, will use u32.


Thanks!


  {
  i915_reg_t reg = gen6_pm_iir(dev_priv);

-spin_lock_irq(&dev_priv->irq_lock);
-I915_WRITE(reg, dev_priv->pm_rps_events);
-I915_WRITE(reg, dev_priv->pm_rps_events);
+assert_spin_locked(&dev_priv->irq_lock);
+
+I915_WRITE(reg, reset_mask);
+I915_WRITE(reg, reset_mask);
  POSTING_READ(reg);
+}
+
+void gen6_enable_pm_interrupts(struct drm_i915_private *dev_priv,
+   uint32_t enable_mask)
+{
+uint32_t new_val;
+
+assert_spin_locked(&dev_priv->irq_lock);
+
+new_val = dev_priv->pm_ier_mask;
+new_val |= enable_mask;
+
+dev_priv->pm_ier_mask = new_val;
+I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier_mask);
+gen6_enable_pm_irq(dev_priv, enable_mask);


Hm, will this be confusing that we will have gen6_enable_pm_interrupts
and gen6_enable_pm_irq, so extremely similar names and same parameters,
but for different use?

Sorry for using confusing, ambiguous names.


Maybe rename the old one to gen6_unmask_pm_irq and name this one
gen6_enable_pm_irq ? If there is really need to have both. Or add some
kerneldoc explaining which one is used for what?


Can I do like this, keep gen6_enable_pm_interrupts as is and rename
gen6_enable_pm_irq to gen6_unmask_pm_irq.
Similarly also rename gen6_disable_pm_irq to gen6_mask_pm_irq.


Yes for mask/unmask, but I think the suffix really needs to be the same
since it is the same functional family.

Fine, so will rename gen6_enable_pm_interrupts to gen6_enable_pm_irq,
and gen6_enable_pm_irq to gen6_unmask_pm_irq

Best regards
Akash


Regards,

Tvrtko



Re: [Intel-gfx] [PATCH 03/11] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set

2016-06-27 Thread Goel, Akash



On 6/27/2016 9:16 PM, Tvrtko Ursulin wrote:


On 27/06/16 13:16, akash.g...@intel.com wrote:

From: Akash Goel 

So far PM IER/IIR/IMR registers were being used only for Turbo related
interrupts. But interrupts coming from GuC also use the same set.
As a precursor to supporting GuC interrupts, added new low level routines
so as to allow sharing the programming of PM IER/IIR/IMR registers
between
Turbo & GuC.
Also similar to PM IMR, maintaining a bitmask for PM IER register, to
allow
easy sharing of it between Turbo & GuC without involving a rmw operation.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_drv.h  |  1 +
  drivers/gpu/drm/i915/i915_irq.c  | 55

  drivers/gpu/drm/i915/intel_drv.h |  6 +
  3 files changed, 52 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
index 9ef4919..85a7103 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1806,6 +1806,7 @@ struct drm_i915_private {
  };
  u32 gt_irq_mask;
  u32 pm_irq_mask;
+u32 pm_ier_mask;
  u32 pm_rps_events;
  u32 pipestat_irq_mask[I915_MAX_PIPES];

diff --git a/drivers/gpu/drm/i915/i915_irq.c
b/drivers/gpu/drm/i915/i915_irq.c
index 4378a65..7316ab4 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -336,14 +336,52 @@ void gen6_disable_pm_irq(struct drm_i915_private
*dev_priv, uint32_t mask)
  __gen6_disable_pm_irq(dev_priv, mask);
  }

-void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv)
+void gen6_reset_pm_interrupts(struct drm_i915_private *dev_priv,
+   uint32_t reset_mask)


Kernel prefers u32. It is not that overall i915 is clean in that
respect, but every time maintainers merge patches checkpatch shouts
about it, and the more noise there is, the tougher it is to spot more
important issues. I would appreciate if u32 was used throughout.


Fine, will use u32.




  {
  i915_reg_t reg = gen6_pm_iir(dev_priv);

-spin_lock_irq(&dev_priv->irq_lock);
-I915_WRITE(reg, dev_priv->pm_rps_events);
-I915_WRITE(reg, dev_priv->pm_rps_events);
+assert_spin_locked(&dev_priv->irq_lock);
+
+I915_WRITE(reg, reset_mask);
+I915_WRITE(reg, reset_mask);
  POSTING_READ(reg);
+}
+
+void gen6_enable_pm_interrupts(struct drm_i915_private *dev_priv,
+   uint32_t enable_mask)
+{
+uint32_t new_val;
+
+assert_spin_locked(&dev_priv->irq_lock);
+
+new_val = dev_priv->pm_ier_mask;
+new_val |= enable_mask;
+
+dev_priv->pm_ier_mask = new_val;
+I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier_mask);
+gen6_enable_pm_irq(dev_priv, enable_mask);


Hm, will this be confusing that we will have gen6_enable_pm_interrupts
and gen6_enable_pm_irq, so extremely similar names and same parameters,
but for different use?

Sorry for using confusing, ambiguous names.


Maybe rename the old one to gen6_unmask_pm_irq and name this one
gen6_enable_pm_irq ? If there is really need to have both. Or add some
kerneldoc explaining which one is used for what?


Can I do like this, keep gen6_enable_pm_interrupts as is and rename 
gen6_enable_pm_irq to gen6_unmask_pm_irq.

Similarly also rename gen6_disable_pm_irq to gen6_mask_pm_irq.

Best regards
Akash



+}
+
+void gen6_disable_pm_interrupts(struct drm_i915_private *dev_priv,
+uint32_t disable_mask)
+{
+uint32_t new_val;
+
+assert_spin_locked(&dev_priv->irq_lock);
+
+new_val = dev_priv->pm_ier_mask;
+new_val &= ~disable_mask;
+
+dev_priv->pm_ier_mask = new_val;
+__gen6_disable_pm_irq(dev_priv, disable_mask);
+I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier_mask);
+}
+
+void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv)
+{
+spin_lock_irq(&dev_priv->irq_lock);
+gen6_reset_pm_interrupts(dev_priv, dev_priv->pm_rps_events);
  dev_priv->rps.pm_iir = 0;
  spin_unlock_irq(&dev_priv->irq_lock);
  }
@@ -355,9 +393,7 @@ void gen6_enable_rps_interrupts(struct
drm_i915_private *dev_priv)
  WARN_ON(dev_priv->rps.pm_iir);
  WARN_ON(I915_READ(gen6_pm_iir(dev_priv)) &
dev_priv->pm_rps_events);
  dev_priv->rps.interrupts_enabled = true;
-I915_WRITE(gen6_pm_ier(dev_priv), I915_READ(gen6_pm_ier(dev_priv)) |
-dev_priv->pm_rps_events);
-gen6_enable_pm_irq(dev_priv, dev_priv->pm_rps_events);
+gen6_enable_pm_interrupts(dev_priv, dev_priv->pm_rps_events);

  spin_unlock_irq(&dev_priv->irq_lock);
  }
@@ -379,9 +415,7 @@ void gen6_disable_rps_interrupts(struct
drm_i915_private *dev_priv)

  I915_WRITE(GEN6_PMINTRMSK, gen6_sanitize_rps_pm_mask(dev_priv,
~0));

-__gen6_disable_pm_irq(dev_priv, dev_priv->pm_rps_events);
-I915_WRITE(gen6_pm_ier(dev_priv), I915_READ(gen6_pm_ier(dev_priv)) &
-~dev_priv->pm_rps_events);
+gen6_disable_pm_interrupts(dev_priv, 

Re: [Intel-gfx] [PATCH 01/11] drm/i915: Decouple GuC log setup from verbosity parameter

2016-06-27 Thread Goel, Akash



On 6/27/2016 9:26 PM, Tvrtko Ursulin wrote:


On 27/06/16 16:32, Goel, Akash wrote:



On 6/27/2016 8:30 PM, Tvrtko Ursulin wrote:


On 27/06/16 13:16, akash.g...@intel.com wrote:

From: Sagar Arun Kamble <sagar.a.kam...@intel.com>

GuC Log buffer allocation was tied up with verbosity level kernel
parameter
i915.guc_log_level. User could be given a provision to enable
logging at
runtime and not necessarily during load time only. This patch will
perform
allocation of shared log buffer always but will initially enable
logging on
GuC side through init params based on i915.guc_log_level.
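The mapping from the module parameter to the debug init parameter, as the diff below expresses it, can be sketched in isolation. The constant values for the shift and the disabled flag are placeholders chosen for the example, not taken from the firmware interface header; only the 0-3 verbosity range comes from the parameter description. The sketch also folds the clamping (done in guc_create_log() in the patch) into the same helper for brevity.

```c
#include <stdint.h>

#define LOG_VERBOSITY_MAX   3         /* module param text says 0-3 */
#define LOG_VERBOSITY_SHIFT 4         /* placeholder value */
#define LOG_DISABLED        (1u << 6) /* placeholder value */

/* Negative level means logging stays off; otherwise clamp and shift. */
static uint32_t model_debug_param(int log_level)
{
	if (log_level < 0)
		return LOG_DISABLED;
	if (log_level > LOG_VERBOSITY_MAX)
		log_level = LOG_VERBOSITY_MAX;
	return (uint32_t)log_level << LOG_VERBOSITY_SHIFT;
}
```

The point of the change is that the log buffer parameters are always passed, while this value alone decides whether the firmware starts out logging.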

Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com>
Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 3 ---
  drivers/gpu/drm/i915/intel_guc_loader.c| 8 +---
  2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 355b647..28a810f 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -826,9 +826,6 @@ static void guc_create_log(struct intel_guc *guc)
  unsigned long offset;
  uint32_t size, flags;

-if (i915.guc_log_level < GUC_LOG_VERBOSITY_MIN)
-return;
-
  if (i915.guc_log_level > GUC_LOG_VERBOSITY_MAX)
  i915.guc_log_level = GUC_LOG_VERBOSITY_MAX;

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c
b/drivers/gpu/drm/i915/intel_guc_loader.c
index 8fe96a2..db3c897 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -173,11 +173,13 @@ static void set_guc_init_params(struct
drm_i915_private *dev_priv)
  params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
  GUC_CTL_VCS2_ENABLED;

-if (i915.guc_log_level >= 0) {
-params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+
+if (i915.guc_log_level >= 0)
  params[GUC_CTL_DEBUG] =
  i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
-}
+else
+params[GUC_CTL_DEBUG] = GUC_LOG_DISABLED;

  if (guc->ads_obj) {
  u32 ads = (u32)i915_gem_obj_ggtt_offset(guc->ads_obj)



I did not manage to understand what is the benefit of always allocating
the log buffer? If the user never enables logging it just wasted 11
pages of memory, correct?


Yes if User never enables the logging at runtime, 11 RAM pages will be
wasted.

Currently the pages are permanently pinned in GGTT also.
The GGTT address of log buffer is passed in the GuC firmware init
params, at firmware loading time.

Probably this can be circumvented, if pages can be pinned right before
enabling logging (but using the same GGTT address).


Looking at the later patches in the series, could you instead create the
log buffer when logging is enabled via debugfs or implicitly via the
relayfs access?

Or is the problem then that you would then have to reset the GuC to
activate it?


Yes GuC would have to be reset & firmware needs to be reloaded to pass
the log buffer address.


Right, as minimum I think commit message needs to explain that. The
current explanation does not hold anyway since it is not possible to
enable it via modifying the module parameter.


Right, there should have been an explanation citing the constraint in 
late allocation of log buffer when logging is enabled.

Sorry for missing.



Btw have you considered keeping the module param as a global GuC logging
enable and adding new code on top? So keep the current code to only
allocate the buffer when module param is set, and then if it isn't fail
the later userspace triggered attempts to start the logging (in debugfs
or relayfs)?


Yes, that was considered: keeping the module param as the master control and
allowing disable/enable of logging at runtime (through debugfs) only
when the module param is set at boot time.

IIRC there was a request from Validation to keep logging control
independent of the boot-time value of the module param. So even if the system
booted with guc_log_level as -1, logging can still be enabled
at runtime later, through a debugfs interface 'i915_guc_log_control'.

Best regards
Akash


Regards,

Tvrtko



Re: [Intel-gfx] [PATCH 01/11] drm/i915: Decouple GuC log setup from verbosity parameter

2016-06-27 Thread Goel, Akash



On 6/27/2016 8:30 PM, Tvrtko Ursulin wrote:


On 27/06/16 13:16, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

GuC Log buffer allocation was tied up with verbosity level kernel
parameter
i915.guc_log_level. User could be given a provision to enable logging at
runtime and not necessarily during load time only. This patch will
perform
allocation of shared log buffer always but will initially enable
logging on
GuC side through init params based on i915.guc_log_level.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 3 ---
  drivers/gpu/drm/i915/intel_guc_loader.c| 8 +---
  2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 355b647..28a810f 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -826,9 +826,6 @@ static void guc_create_log(struct intel_guc *guc)
  unsigned long offset;
  uint32_t size, flags;

-if (i915.guc_log_level < GUC_LOG_VERBOSITY_MIN)
-return;
-
  if (i915.guc_log_level > GUC_LOG_VERBOSITY_MAX)
  i915.guc_log_level = GUC_LOG_VERBOSITY_MAX;

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c
b/drivers/gpu/drm/i915/intel_guc_loader.c
index 8fe96a2..db3c897 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -173,11 +173,13 @@ static void set_guc_init_params(struct
drm_i915_private *dev_priv)
  params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
  GUC_CTL_VCS2_ENABLED;

-if (i915.guc_log_level >= 0) {
-params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+
+if (i915.guc_log_level >= 0)
  params[GUC_CTL_DEBUG] =
  i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
-}
+else
+params[GUC_CTL_DEBUG] = GUC_LOG_DISABLED;

  if (guc->ads_obj) {
  u32 ads = (u32)i915_gem_obj_ggtt_offset(guc->ads_obj)



I did not manage to understand what is the benefit of always allocating
the log buffer? If the user never enables logging it just wasted 11
pages of memory, correct?

Yes if User never enables the logging at runtime, 11 RAM pages will be 
wasted.


Currently the pages are permanently pinned in GGTT also.
The GGTT address of log buffer is passed in the GuC firmware init 
params, at firmware loading time.


Probably this can be circumvented, if pages can be pinned right before
enabling logging (but using the same GGTT address).


Looking at the later patches in the series, could you instead create the
log buffer when logging is enabled via debugfs or implicitly via the
relayfs access?

Or is the problem then that you would then have to reset the GuC to
activate it?


Yes GuC would have to be reset & firmware needs to be reloaded to pass 
the log buffer address.


Best regards
Akash



Regards,

Tvrtko



Re: [Intel-gfx] [PATCH 09/11] drm/i915: New module param to control the size of buffer used for storing GuC firmware logs

2016-06-27 Thread Goel, Akash



On 6/27/2016 7:01 PM, Jani Nikula wrote:

On Mon, 27 Jun 2016, akash.g...@intel.com wrote:

From: Akash Goel 

On receiving the log buffer flush interrupt from GuC firmware, Driver
stores the snapshot of the log buffer in a local buffer, from which
Userspace can pull the logs. By default Driver stores up to 4 snapshots
of the log buffer in a local buffer (managed by relay).
Added a new module (read only) param, 'guc_log_size', through which User
can specify the number of snapshots of log buffer to be stored in local
buffer. This can be used to ensure capturing of all boot time logs even
with high verbosity level.

Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 3 +--
 drivers/gpu/drm/i915/i915_params.c | 5 +
 drivers/gpu/drm/i915/i915_params.h | 1 +
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index fd26a9e..8c0fd83 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -999,8 +999,7 @@ static void guc_create_log_relay_file(struct intel_guc *guc)

/* Keep the size of sub buffers same as shared log buffer */
subbuf_size = guc->log_obj->base.size;
-   /* TODO: Decide based on the User's input */
-   n_subbufs = 4;
+   n_subbufs = i915.guc_log_size;

guc_log_relay_chan = relay_open("guc_log", log_dir,
subbuf_size, n_subbufs, _callbacks, dev);
diff --git a/drivers/gpu/drm/i915/i915_params.c 
b/drivers/gpu/drm/i915/i915_params.c
index 8b13bfa..14ce0c4 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -57,6 +57,7 @@ struct i915_params i915 __read_mostly = {
.enable_guc_loading = -1,
.enable_guc_submission = -1,
.guc_log_level = -1,
+   .guc_log_size = 4,
.enable_dp_mst = true,
.inject_load_failure = 0,
.enable_dpcd_backlight = false,
@@ -214,6 +215,10 @@ module_param_named(guc_log_level, i915.guc_log_level, int, 
0400);
 MODULE_PARM_DESC(guc_log_level,
"GuC firmware logging level (-1:disabled (default), 0-3:enabled)");

+module_param_named(guc_log_size, i915.guc_log_size, int, 0400);
+MODULE_PARM_DESC(guc_log_size,
+   "Number of sub buffers to store GuC firmware logs (default: 4)");
+


I guess my battle against adding all sorts of module parameters all the
time is a futile and lost one. :(

Please at least make it clear what the unit of the size is. It's not
obvious to me, and I shouldn't have to look at the source for that.



Sorry for not choosing a suitable name in first place.
I agree the name should be indicative of the unit.
As you would have seen, the parameter provides number of snapshots of 
the Log buffer which can be stored on Driver side.

The size of one snapshot or Log buffer is not so important here and can
change in future.
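To make the unit concrete: the parameter counts sub buffers, and the memory held on the Driver side is that count multiplied by the size of one snapshot. A minimal sketch, assuming a 4 KiB page size and the 11-page log buffer mentioned in the discussion of patch 01/11 (both are assumptions for the example, and the helper name is made up):

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size */

/* Total bytes held by the relay channel for a given number of snapshots. */
static size_t relay_total_bytes(size_t subbuf_size, size_t n_subbufs)
{
	return subbuf_size * n_subbufs;
}
```

Under those assumptions the default of 4 sub buffers corresponds to 4 * 11 * 4096 bytes of local storage.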

Please suggest an appropriate name ('guc_log_buffer_nr' ?)

Best regards
Akash

BR,
Jani.



 module_param_named_unsafe(enable_dp_mst, i915.enable_dp_mst, bool, 0600);
 MODULE_PARM_DESC(enable_dp_mst,
"Enable multi-stream transport (MST) for new DisplayPort sinks. (default: 
true)");
diff --git a/drivers/gpu/drm/i915/i915_params.h 
b/drivers/gpu/drm/i915/i915_params.h
index 0ad020b..89fa832 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -48,6 +48,7 @@ struct i915_params {
int enable_guc_loading;
int enable_guc_submission;
int guc_log_level;
+   int guc_log_size;
int use_mmio_flip;
int mmio_debug;
int edp_vswing;





Re: [Intel-gfx] [RFC 00/12] Support for sustained capturing of GuC firmware logs

2016-06-03 Thread Goel, Akash



On 6/3/2016 12:45 PM, Daniel Vetter wrote:

On Thu, Jun 02, 2016 at 12:21:49PM +0200, Johannes Berg wrote:

On Thu, 2016-06-02 at 10:16 +, Daniel Vetter wrote:


I still kinda like relayfs, except that it's not available in non-
debug builds. But so are plenty of other really interesting files we
have hidden in there.

sysfs isn't the solution, I already have a black eye from the sysfs
maintainer for our error state.


Heh. I tend to agree though.


No idea really where to put stuff. One option might be to have an
official debug directory (like we have power already) in sysfs as the
canonical place where drivers can dump stuff. We're not the only ones
with too much data to get to userspace for debugging driver/hw
issues, e.g. wireless firmware has pretty similar solutions.


We have two things in wireless:

 1) the devcoredump stuff, but that's a one-time event when something
bad happens and dumps a big blob into userspace, doesn't seem
relevant here

 2) continuous logging, which uses a debugfs file (though it could be
relayfs as well, doesn't really make a difference)


relayfs apparently moved in with debugfs. And a requirement (or at least
strong wishlist item) is that we can get at the data on production systems
(which really shouldn't mount debugfs). Seems like there's no place to
dump debug information outside of debugfs :(


There could be something said for using tracing, but that's only
independent of debugfs since the tracefs introduction in kernel 4.1.


We tried looking into tracing stuff for our performance counters, and at
least there the mismatch for dumping large-scale stuff was too much. But
tracefs looks like just the tracing debugfs directory cut out into a
separate filesystem, exactly to avoid that dreaded debugfs-is-insecure
issues.

I'd say we should smash it into debugfs, and if these troubles persist
then maybe we need to clean up the mess in there a bit and expose it as
drm_debugfs or whatever. Probably a topic for kernel summit even. At least
I feel like there's not enough consensus to add ABI at this point.

Hi Daniel,

Thanks much for your inputs.

So, on an interim basis, can we have only a relay-backed debugfs interface,
i.e. /sys/kernel/debug/dri/guc_log?

And once the support for drm_debugfs is added, it's just a matter of
changing the file location, i.e. moving it inside drm_debugfs.


Best regards
Akash


-Daniel




Re: [Intel-gfx] [RFC 03/12] drm/i915: Support for GuC interrupts

2016-05-28 Thread Goel, Akash



On 5/28/2016 8:05 PM, Chris Wilson wrote:

On Sat, May 28, 2016 at 07:15:52PM +0530, Goel, Akash wrote:



On 5/28/2016 5:43 PM, Chris Wilson wrote:

On Sat, May 28, 2016 at 02:52:16PM +0530, Goel, Akash wrote:



On 5/28/2016 1:26 AM, Chris Wilson wrote:

On Sat, May 28, 2016 at 01:12:54AM +0530, akash.g...@intel.com wrote:

+void gen8_reset_guc_interrupts(struct drm_device *dev)
+{
+   struct drm_i915_private *dev_priv = dev->dev_private;
+   i915_reg_t reg = gen6_pm_iir(dev_priv);



From the looks of this we have multiple shadows for the same register.

That's very bad.

Now the platforms might be mutually exclusive, but it is still a mistake
that will catch us out.


Will check how it is in newer platforms.


+   spin_lock_irq(&dev_priv->irq_lock);
+   I915_WRITE(reg, dev_priv->guc_events);
+   I915_WRITE(reg, dev_priv->guc_events);


What? Not even the tiniest of comments to explain?


Sorry actually just copied these steps as is from the
gen6_reset_rps_interrupts(), considering that the same set of
registers (IIR, IER, IMR) are involved here.
So the double clearing of IIR followed by posting read could be
needed here also.
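
For reference, a toy user-space model of why the clearing write might be issued twice. It assumes that IIR can latch a second event behind the one currently visible, which is an assumption about the hardware and is not stated in this thread. The `fake_iir` struct and its helpers are hypothetical and are not driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an IIR that can hold one extra event per bit. */
struct fake_iir {
    uint32_t visible; /* bits software sees when it reads IIR */
    uint32_t queued;  /* second occurrence latched behind a visible bit */
};

/* Hardware side: an event arrives for the given bits. */
void fake_iir_raise(struct fake_iir *r, uint32_t bits)
{
    assert(r != NULL);
    r->queued |= r->visible & bits; /* already pending: queue it */
    r->visible |= bits;
}

/* Software side: write-1-to-clear. A queued event refills the bit. */
void fake_iir_write(struct fake_iir *r, uint32_t bits)
{
    uint32_t refill;

    assert(r != NULL);
    refill = r->queued & bits;
    r->visible &= ~bits;
    r->visible |= refill;
    r->queued &= ~refill;
}
```

In this model a single write leaves the bit set whenever a second event was queued, and only the second write brings the visible value to zero. The posting read is not modelled here.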


Move it all to i915_irq.c and export routines to manipulate pm_iir such
that multiple users do not conflict.


Sorry but all interrupt related stuff for rps & GuC is already
inside i915_irq.c file.


Didn't notice, because this code didn't match my expectations for an
interface exported from i915_irq.c


Also the IER, IMR, IIR registers are being updated in a non
conflicting manner, no overlap between the PM bits & GuC events
bits.


They share a register, that mandates arbitration.



I think the arbitration (& serialization) is already being provided by 
irq_lock.



You mean to say we need to have only a single set of routines for interrupt
reset/enable/disable operations for rps & GuC.


Yes.



Fine, will make them use a single set of low-level routines.


+   POSTING_READ(reg);


Again. Not even the tiniest of comments to explain?


+   spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen8_enable_guc_interrupts(struct drm_device *dev)
+{
+   struct drm_i915_private *dev_priv = dev->dev_private;
+
+   spin_lock_irq(&dev_priv->irq_lock);
+   if (!dev_priv->guc.interrupts_enabled) {
+       WARN_ON(I915_READ(gen6_pm_iir(dev_priv)) &
+               dev_priv->guc_events);
+       dev_priv->guc.interrupts_enabled = true;
+       I915_WRITE(gen6_pm_ier(dev_priv),
+                  I915_READ(gen6_pm_ier(dev_priv)) | dev_priv->guc_events);


ier should be known, rmw on the reg should not be required.


Sorry same as above, copy paste from gen6_enable_rps_interrupts().
Without rmw, would this be fine ?

if (dev_priv->rps.interrupts_enabled)
    I915_WRITE(gen6_pm_ier(dev_priv),
               dev_priv->pm_rps_events | dev_priv->guc_events);
else
    I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->guc_events);


Still has the presumption of owning a register that is ostensibly used
by others.


Since pm_ier is a shared register and is being used by others also, rmw
seems to be more suited here. Otherwise we need to be aware of who all is
sharing it so as to update it without disturbing the bits owned by
others.


Exactly, see above. The best interfaces from i915_irq.c do not use rmw
on the register values.


Fine, will try to do away with the rmw operation on pm_ier by
maintaining a bit mask of enabled interrupts (just like pm_irq_mask).
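
A minimal user-space sketch of that idea, assuming a single software copy of the enabled bits that every user of the shared register goes through. The `fake_*` names, the `pm_ier_*` helpers and the bit values are hypothetical and only illustrate the shape of such an interface; in the driver the helpers would run with irq_lock held.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments, chosen only so the two users do not overlap. */
#define FAKE_RPS_EVENTS 0x00000070u
#define FAKE_GUC_EVENTS 0x01000000u

/* Stand-in for the shared PM IER hardware register. */
static uint32_t fake_pm_ier_reg;
/* Software copy of every enabled bit; the register is only written from it. */
static uint32_t pm_ier_mask;

static void fake_write_pm_ier(uint32_t val)
{
    fake_pm_ier_reg = val;
}

/* Enable some bits without reading the register back. */
void pm_ier_enable(uint32_t bits)
{
    assert(bits != 0); /* enabling nothing is a caller bug */
    pm_ier_mask |= bits;
    fake_write_pm_ier(pm_ier_mask);
}

/* Disable some bits; bits owned by other users stay untouched. */
void pm_ier_disable(uint32_t bits)
{
    assert(bits != 0);
    pm_ier_mask &= ~bits;
    fake_write_pm_ier(pm_ier_mask);
}

/* Test helper: what the fake hardware currently holds. */
uint32_t fake_read_pm_ier(void)
{
    return fake_pm_ier_reg;
}
```

With this shape neither rps nor GuC needs to know who else shares the register: each passes only its own bits, and the cached mask supplies the rest.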




+static void gen8_guc2host_events_work(struct work_struct *work)
+{
+   struct drm_i915_private *dev_priv =
+       container_of(work, struct drm_i915_private, guc.events_work);
+
+   spin_lock_irq(&dev_priv->irq_lock);
+   /* Speed up work cancelation during disabling guc interrupts. */
+   if (!dev_priv->guc.interrupts_enabled) {
+       spin_unlock_irq(&dev_priv->irq_lock);
+       return;
+   }
+
+   DISABLE_RPM_WAKEREF_ASSERTS(dev_priv);


This just shouts that the code is broken.


You mean to say that ideally the wakeref_count (& power.usage_count)
should already be non zero here.


Yes. If it is not under your control, then you have a bug in your code.
Existing DISABLE_RPM_WAKEREF_ASSERTS tell us where we know we have a bug
(and hacks in place whilst we wait for patch review).



This work item can also execute in a window where wakeref_count (&
power.usage_count) have become zero but runtime suspend has not yet
kicked in (due to auto-suspend delay), so "RPM wakelock ref not held
during HW access" warning would come.


i.e. your code is buggy, as DISABLE_RPM_WAKEREF_ASSERTS implied.



But isn't this applicable to the rps work item also?
If a way is found to circumvent this, then the same can be applied to the
GuC work item. DISABLE_RPM_WAKEREF_ASSERTS is a stopgap solution.
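
One conventional way to bring the wakeref under the work item's control is to take a reference where the work is queued and drop it when the work has finished. The toy user-space model below sketches that pattern. It is not what the patch or the rps code does, and all `fake_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device's runtime-PM usage count. */
int fake_wakeref_count;
/* Number of modelled register accesses that passed the wakeref check. */
int fake_hw_accesses;
static bool fake_work_pending;

static void fake_rpm_get(void)
{
    fake_wakeref_count++;
}

static void fake_rpm_put(void)
{
    assert(fake_wakeref_count > 0);
    fake_wakeref_count--;
}

/* Every register access checks that someone holds a wakeref. */
static void fake_hw_access(void)
{
    assert(fake_wakeref_count > 0);
    fake_hw_accesses++;
}

/* Interrupt side: pin the device before handing off to the worker. */
void fake_irq_handler(void)
{
    if (fake_work_pending)
        return; /* the pending work already owns a reference */
    fake_rpm_get();
    fake_work_pending = true;
}

/* Worker side: runs later, touches hardware, then drops the reference. */
void fake_events_work(void)
{
    if (!fake_work_pending)
        return;
    fake_hw_access();
    fake_work_pending = false;
    fake_rpm_put();
}
```

Because the reference is held from queueing until completion, the count cannot reach zero in the window between the interrupt and the worker, so the access check never needs to be disabled in this model.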



void 

Re: [Intel-gfx] [RFC 03/12] drm/i915: Support for GuC interrupts

2016-05-28 Thread Goel, Akash



On 5/28/2016 5:43 PM, Chris Wilson wrote:

On Sat, May 28, 2016 at 02:52:16PM +0530, Goel, Akash wrote:



On 5/28/2016 1:26 AM, Chris Wilson wrote:

On Sat, May 28, 2016 at 01:12:54AM +0530, akash.g...@intel.com wrote:

From: Sagar Arun Kamble <sagar.a.kam...@intel.com>

There are certain types of interrupts which Host can receive from GuC.
GuC ukernel sends an interrupt to Host for certain events, for
example to retrieve/consume the logs generated by ukernel.
This patch adds support to receive interrupts from GuC but currently
enables & partially handles only the interrupt sent by GuC ukernel.
Future patches will add support for handling other interrupt types.

Signed-off-by: Sagar Arun Kamble <sagar.a.kam...@intel.com>
Signed-off-by: Akash Goel <akash.g...@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h|   1 +
drivers/gpu/drm/i915/i915_guc_submission.c |   2 +
drivers/gpu/drm/i915/i915_irq.c| 100 -
drivers/gpu/drm/i915/i915_reg.h|  11 
drivers/gpu/drm/i915/intel_drv.h   |   3 +
drivers/gpu/drm/i915/intel_guc.h   |   5 ++
drivers/gpu/drm/i915/intel_guc_loader.c|   1 +
7 files changed, 120 insertions(+), 3 deletions(-)



static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir);
+static void gen8_guc_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir);

/* For display hotplug interrupt */
static inline void
@@ -400,6 +401,55 @@ void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv)
synchronize_irq(dev_priv->dev->irq);
}

+void gen8_reset_guc_interrupts(struct drm_device *dev)
+{
+   struct drm_i915_private *dev_priv = dev->dev_private;
+   i915_reg_t reg = gen6_pm_iir(dev_priv);



From the looks of this we have multiple shadows for the same register.

That's very bad.

Now the platforms might be mutually exclusive, but it is still a mistake
that will catch us out.


Will check how it is in newer platforms.


+   spin_lock_irq(&dev_priv->irq_lock);
+   I915_WRITE(reg, dev_priv->guc_events);
+   I915_WRITE(reg, dev_priv->guc_events);


What? Not even the tiniest of comments to explain?


Sorry actually just copied these steps as is from the
gen6_reset_rps_interrupts(), considering that the same set of
registers (IIR, IER, IMR) are involved here.
So the double clearing of IIR followed by posting read could be
needed here also.


Move it all to i915_irq.c and export routines to manipulate pm_iir such
that multiple users do not conflict.

Sorry but all interrupt related stuff for rps & GuC is already inside 
i915_irq.c file.
Also the IER, IMR, IIR registers are being updated in a non conflicting 
manner, no overlap between the PM bits & GuC events bits.


You mean to say we need to have only a single set of routines for interrupt
reset/enable/disable operations for rps & GuC.


+   POSTING_READ(reg);


Again. Not even the tiniest of comments to explain?


+   spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen8_enable_guc_interrupts(struct drm_device *dev)
+{
+   struct drm_i915_private *dev_priv = dev->dev_private;
+
+   spin_lock_irq(&dev_priv->irq_lock);
+   if (!dev_priv->guc.interrupts_enabled) {
+       WARN_ON(I915_READ(gen6_pm_iir(dev_priv)) &
+               dev_priv->guc_events);
+       dev_priv->guc.interrupts_enabled = true;
+       I915_WRITE(gen6_pm_ier(dev_priv),
+                  I915_READ(gen6_pm_ier(dev_priv)) | dev_priv->guc_events);


ier should be known, rmw on the reg should not be required.


Sorry same as above, copy paste from gen6_enable_rps_interrupts().
Without rmw, would this be fine ?

if (dev_priv->rps.interrupts_enabled)
    I915_WRITE(gen6_pm_ier(dev_priv),
               dev_priv->pm_rps_events | dev_priv->guc_events);
else
    I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->guc_events);


Still has the presumption of owning a register that is ostensibly used
by others.


Since pm_ier is a shared register and is being used by others also, rmw
seems to be more suited here. Otherwise we need to be aware of who all is
sharing it so as to update it without disturbing the bits owned by
others.


+static void gen8_guc2host_events_work(struct work_struct *work)
+{
+   struct drm_i915_private *dev_priv =
+       container_of(work, struct drm_i915_private, guc.events_work);
+
+   spin_lock_irq(&dev_priv->irq_lock);
+   /* Speed up work cancelation during disabling guc interrupts. */
+   if (!dev_priv->guc.interrupts_enabled) {
+       spin_unlock_irq(&dev_priv->irq_lock);
+       return;
+   }
+
+   DISABLE_RPM_WAKEREF_ASSERTS(dev_priv);


This just shouts that the code is broken.


You mean to say that ideally the wakeref_count (& power.usage_count)
should alr
