On 7/18/2016 5:03 PM, Tvrtko Ursulin wrote:

On 18/07/16 11:59, Goel, Akash wrote:
On 7/18/2016 3:46 PM, Tvrtko Ursulin wrote:

On 15/07/16 16:58, Goel, Akash wrote:
On 7/15/2016 5:21 PM, Tvrtko Ursulin wrote:
On 10/07/16 14:41, akash.g...@intel.com wrote:
From: Akash Goel <akash.g...@intel.com>

GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain a statistics of how many flush
interrupts were received and for which type of log buffer,
along with the overflow count of each buffer type.
Augmented i915_log_info debugfs to report back these statistics.

Signed-off-by: Akash Goel <akash.g...@intel.com>
---
  drivers/gpu/drm/i915/i915_debugfs.c        | 26
++++++++++++++++++++++++++
  drivers/gpu/drm/i915/i915_guc_submission.c |  8 ++++++++
  drivers/gpu/drm/i915/i915_irq.c            |  1 +
  drivers/gpu/drm/i915/intel_guc.h           |  6 ++++++
  4 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
b/drivers/gpu/drm/i915/i915_debugfs.c
index 3c9c7f7..888a18a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2538,6 +2538,30 @@ static int i915_guc_load_status_info(struct
seq_file *m, void *data)
      return 0;
  }

+static void i915_guc_log_info(struct seq_file *m,
+                 struct drm_i915_private *dev_priv)
+{
+    struct intel_guc *guc = &dev_priv->guc;
+
+    seq_printf(m, "\nGuC logging stats:\n");
+
+    seq_printf(m, "\tISR:   flush count %10u, overflow count %8u\n",
+        guc->log.flush_count[GUC_ISR_LOG_BUFFER],
+        guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]);
+
+    seq_printf(m, "\tDPC:   flush count %10u, overflow count %8u\n",
+        guc->log.flush_count[GUC_DPC_LOG_BUFFER],
+        guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]);
+
+    seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n",
+        guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER],
+        guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]);
+
+    seq_printf(m, "\tTotal flush interrupt count: %u\n",
+            guc->log.flush_interrupt_count);
+
+}
+
  static void i915_guc_client_info(struct seq_file *m,
                   struct drm_i915_private *dev_priv,
                   struct i915_guc_client *client)
@@ -2611,6 +2635,8 @@ static int i915_guc_info(struct seq_file *m,
void *data)
      seq_printf(m, "\nGuC execbuf client @ %p:\n",
guc.execbuf_client);
      i915_guc_client_info(m, dev_priv, &client);

+    i915_guc_log_info(m, dev_priv);
+
      /* Add more as required ... */

      return 0;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index c1e637f..9c94a43 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -914,6 +914,14 @@ static void guc_read_update_log_buffer(struct
drm_device *dev)
          log_buffer_state_local = *log_buffer_state;
          buffer_size = log_buffer_state_local.size;

+        guc->log.flush_count[i] +=
log_buffer_state_local.flush_to_file;
+        if (log_buffer_state_local.buffer_full_cnt !=
+                    guc->log.prev_overflow_count[i]) {
+            guc->log.prev_overflow_count[i] =
+                    log_buffer_state_local.buffer_full_cnt;
+            guc->log.total_overflow_count[i]++;

Is log_buffer_state_local.buffer_full_cnt guaranteed to be one
here? Or
you would need to increase total_overflow_count by its value?


buffer_full_cnt will not remain as one. Its a 4 bit counter, will be
incremented monotonically by GuC firmware on every new detection of
overflow, so will increase from 0 to 15 & then wrap around.
Hence have to use '!=' in the condition instead of '>'.

But can it happen that it jumps by more than one between being sampled
here? In which case you would need to replace:

guc->log.total_overflow_count[i]++;

by something like:


guc->log.total_overflow_count[i] +=
log_buffer_state_local.buffer_full_cnt -
guc->log.prev_overflow_count[i];

(Doesn't handle the wrap though, just to illustrate my point.)

Actually logic in GuC firmware is such that overflow counter cannot
increment by more than 1 without Driver coming into picture in between,
by the virtue of flush interrupt.

Hm, and what happens to the data and overflow counter if the driver is
not responsive enough?

GuC will not stall and keep writing the logs into the buffer, if Driver is slow in responding to the previous flush interrupt.

But the overflow detection is done through a bit weird logic, which is executed only when GuC receives the response of the last flush interrupt from Driver, and increment is done by 1 only irrespective of how late the acknowledgement came from Driver side.

But nevertheless the logic on Driver side should be like the way you
suggested.

Does this revised logic looks fine ?

if (log_buffer_state_local.buffer_full_cnt !=
         guc->log.prev_overflow_count[i]) {
     new_overflow = 1;
     guc->log.total_overflow_count[i] +=
(log_buffer_state_local.buffer_full_cnt -
guc->log.prev_overflow_count[i]);

     if (log_buffer_state_local.buffer_full_cnt <
guc->log.prev_overflow_count[i])
         guc->log.total_overflow_count[i] += 15;

     log_buffer_state_local.buffer_full_cnt =
guc->log.prev_overflow_count[i];
}

Not sure, maybe += 16 ? If counter goes from 15 to zero, then -15, +15 =
no change which is wrong?

Right, it should be 16. Sorry, my bad.

But in general if the condition should not happen I would also put a
WARN_ON_ONCE or a rate limited DRM_ERROR in that case.

I think rate limited DRM_ERROR would be apt here.

Best regards
Akash
Regards,

Tvrtko


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to