[Intel-gfx] ✗ Ro.CI.BAT: failure for drm/i915: intel_dp_link_is_valid() should only return status of link (rev3)
== Series Details ==

Series: drm/i915: intel_dp_link_is_valid() should only return status of link (rev3)
URL   : https://patchwork.freedesktop.org/series/9737/
State : failure

== Summary ==

Series 9737v3 drm/i915: intel_dp_link_is_valid() should only return status of link
http://patchwork.freedesktop.org/api/1.0/series/9737/revisions/3/mbox

Test kms_cursor_legacy:
        Subgroup basic-flip-vs-cursor-legacy:
                fail       -> PASS       (ro-byt-n2820)
                pass       -> FAIL       (ro-bdw-i5-5250u)
        Subgroup basic-flip-vs-cursor-varying-size:
                pass       -> FAIL       (ro-bdw-i5-5250u)
                pass       -> DMESG-FAIL (fi-skl-i7-6700k)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-a:
                dmesg-warn -> PASS       (ro-bdw-i7-5600u)
                dmesg-warn -> SKIP       (ro-bdw-i5-5250u)
        Subgroup suspend-read-crc-pipe-b:
                pass       -> INCOMPLETE (fi-hsw-i7-4770k)
                skip       -> DMESG-WARN (ro-bdw-i5-5250u)
        Subgroup suspend-read-crc-pipe-c:
                skip       -> DMESG-WARN (ro-bdw-i5-5250u)

fi-hsw-i7-4770k  total:207  pass:186  dwarn:0   dfail:0   fail:0   skip:20
fi-kbl-qkkr      total:244  pass:185  dwarn:29  dfail:0   fail:3   skip:27
fi-skl-i7-6700k  total:244  pass:208  dwarn:4   dfail:2   fail:2   skip:28
fi-snb-i7-2600   total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42
ro-bdw-i5-5250u  total:240  pass:218  dwarn:3   dfail:0   fail:2   skip:17
ro-bdw-i7-5600u  total:240  pass:207  dwarn:0   dfail:0   fail:1   skip:32
ro-bsw-n3050     total:240  pass:195  dwarn:0   dfail:0   fail:3   skip:42
ro-byt-n2820     total:240  pass:198  dwarn:0   dfail:0   fail:2   skip:40
ro-hsw-i3-4010u  total:240  pass:214  dwarn:0   dfail:0   fail:0   skip:26
ro-hsw-i7-4770r  total:240  pass:185  dwarn:0   dfail:0   fail:0   skip:55
ro-ilk1-i5-650   total:235  pass:173  dwarn:0   dfail:0   fail:2   skip:60
ro-ivb-i7-3770   total:240  pass:205  dwarn:0   dfail:0   fail:0   skip:35
ro-ivb2-i7-3770  total:240  pass:209  dwarn:0   dfail:0   fail:0   skip:31
ro-skl3-i5-6260u total:240  pass:222  dwarn:0   dfail:0   fail:4   skip:14

Results at /archive/results/CI_IGT_test/RO_Patchwork_1859/

3612906 drm-intel-nightly: 2016y-08m-12d-15h-08m-02s UTC integration manifest
b41a36b drm/i915: intel_dp_link_is_valid() should only return status of link

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Ro.CI.BAT: failure for series starting with [v2,1/2] drm/i915/mst: Validate modes against available link bandwidth
== Series Details ==

Series: series starting with [v2,1/2] drm/i915/mst: Validate modes against available link bandwidth
URL   : https://patchwork.freedesktop.org/series/11039/
State : failure

== Summary ==

Series 11039v1 Series without cover letter
http://patchwork.freedesktop.org/api/1.0/series/11039/revisions/1/mbox

Test kms_cursor_legacy:
        Subgroup basic-flip-vs-cursor-varying-size:
                fail       -> PASS       (ro-byt-n2820)
                pass       -> FAIL       (ro-bdw-i5-5250u)
                pass       -> DMESG-FAIL (fi-skl-i7-6700k)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-a:
                dmesg-warn -> PASS       (ro-bdw-i7-5600u)
        Subgroup suspend-read-crc-pipe-b:
                pass       -> DMESG-WARN (ro-bdw-i7-5600u)
                skip       -> DMESG-WARN (ro-bdw-i5-5250u)
        Subgroup suspend-read-crc-pipe-c:
                pass       -> DMESG-WARN (ro-bdw-i7-5600u)

fi-hsw-i7-4770k  total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22
fi-kbl-qkkr      total:244  pass:186  dwarn:29  dfail:0   fail:3   skip:26
fi-skl-i7-6700k  total:244  pass:208  dwarn:4   dfail:2   fail:2   skip:28
fi-snb-i7-2600   total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42
ro-bdw-i5-5250u  total:240  pass:219  dwarn:3   dfail:0   fail:1   skip:17
ro-bdw-i7-5600u  total:240  pass:205  dwarn:2   dfail:0   fail:1   skip:32
ro-bsw-n3050     total:240  pass:194  dwarn:0   dfail:0   fail:4   skip:42
ro-byt-n2820     total:240  pass:198  dwarn:0   dfail:0   fail:2   skip:40
ro-hsw-i3-4010u  total:240  pass:214  dwarn:0   dfail:0   fail:0   skip:26
ro-hsw-i7-4770r  total:240  pass:185  dwarn:0   dfail:0   fail:0   skip:55
ro-ilk1-i5-650   total:235  pass:173  dwarn:0   dfail:0   fail:2   skip:60
ro-ivb-i7-3770   total:240  pass:205  dwarn:0   dfail:0   fail:0   skip:35
ro-ivb2-i7-3770  total:240  pass:209  dwarn:0   dfail:0   fail:0   skip:31
ro-skl3-i5-6260u total:240  pass:222  dwarn:0   dfail:0   fail:4   skip:14

Results at /archive/results/CI_IGT_test/RO_Patchwork_1858/

3612906 drm-intel-nightly: 2016y-08m-12d-15h-08m-02s UTC integration manifest
200cbdb drm/mst: A Helper function that returns available link bandwidth
813da48 drm/i915/mst: Validate modes against available link bandwidth
[Intel-gfx] ✗ Ro.CI.BAT: failure for drm/i915: Embrace the race in busy-ioctl
== Series Details ==

Series: drm/i915: Embrace the race in busy-ioctl
URL   : https://patchwork.freedesktop.org/series/11034/
State : failure

== Summary ==

Series 11034v1 drm/i915: Embrace the race in busy-ioctl
http://patchwork.freedesktop.org/api/1.0/series/11034/revisions/1/mbox

Test kms_cursor_legacy:
        Subgroup basic-cursor-vs-flip-legacy:
                pass       -> FAIL       (ro-byt-n2820)
        Subgroup basic-cursor-vs-flip-varying-size:
                fail       -> PASS       (ro-ilk1-i5-650)
        Subgroup basic-flip-vs-cursor-legacy:
                fail       -> PASS       (ro-skl3-i5-6260u)
        Subgroup basic-flip-vs-cursor-varying-size:
                pass       -> FAIL       (ro-bdw-i5-5250u)
                pass       -> DMESG-FAIL (fi-skl-i7-6700k)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-a:
                dmesg-warn -> PASS       (ro-bdw-i7-5600u)
                dmesg-warn -> SKIP       (ro-bdw-i5-5250u)
        Subgroup suspend-read-crc-pipe-c:
                skip       -> DMESG-WARN (ro-bdw-i5-5250u)

fi-hsw-i7-4770k  total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22
fi-kbl-qkkr      total:244  pass:186  dwarn:28  dfail:0   fail:3   skip:27
fi-skl-i7-6700k  total:244  pass:208  dwarn:4   dfail:2   fail:2   skip:28
fi-snb-i7-2600   total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42
ro-bdw-i5-5250u  total:240  pass:219  dwarn:2   dfail:0   fail:1   skip:18
ro-bdw-i7-5600u  total:240  pass:207  dwarn:0   dfail:0   fail:1   skip:32
ro-bsw-n3050     total:240  pass:194  dwarn:0   dfail:0   fail:4   skip:42
ro-byt-n2820     total:240  pass:196  dwarn:0   dfail:0   fail:4   skip:40
ro-hsw-i3-4010u  total:240  pass:214  dwarn:0   dfail:0   fail:0   skip:26
ro-hsw-i7-4770r  total:240  pass:185  dwarn:0   dfail:0   fail:0   skip:55
ro-ilk1-i5-650   total:235  pass:174  dwarn:0   dfail:0   fail:1   skip:60
ro-ivb-i7-3770   total:240  pass:205  dwarn:0   dfail:0   fail:0   skip:35
ro-ivb2-i7-3770  total:240  pass:209  dwarn:0   dfail:0   fail:0   skip:31
ro-skl3-i5-6260u total:240  pass:223  dwarn:0   dfail:0   fail:3   skip:14

Results at /archive/results/CI_IGT_test/RO_Patchwork_1857/

3612906 drm-intel-nightly: 2016y-08m-12d-15h-08m-02s UTC integration manifest
eb6c27a drm/i915: Embrace the race in busy-ioctl
[Intel-gfx] [PATCH v3] drm/i915: intel_dp_link_is_valid() should only return status of link
Intel_dp_link_is_valid() function reads the Link status registers
and returns a boolean to indicate link is valid or not.
If the link has lost lock and is not valid any more, link
training is performed outside the function else previously trained link
is retained.
This gives us flexibility of checking whether link is valid and training
it independently.

v3:
* Removed some unnecessary DEBUG prints
* Optimized the conditional checking (Dhinakaran Pandiyan)
v2:
* Changed the function name from intel_dp_check_link_status()
to intel_dp_link_is_valid() (Lukas Wunner)
* Checks for CRTC and active CRTC are moved outside the
intel_dp_link_is_valid() function (Rodrigo Vivi)

Signed-off-by: Manasi Navare
---
 drivers/gpu/drm/i915/intel_dp.c | 53 ++---
 1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 364db90..d234042 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -3881,36 +3881,32 @@ go_again:
 	return -EINVAL;
 }
 
-static void
-intel_dp_check_link_status(struct intel_dp *intel_dp)
+static bool
+intel_dp_link_is_valid(struct intel_dp *intel_dp)
 {
-	struct intel_encoder *intel_encoder = &dp_to_dig_port(intel_dp)->base;
 	struct drm_device *dev = intel_dp_to_dev(intel_dp);
 	u8 link_status[DP_LINK_STATUS_SIZE];
 
 	WARN_ON(!drm_modeset_is_locked(&dev->mode_config.connection_mutex));
 
 	if (!intel_dp_get_link_status(intel_dp, link_status)) {
-		DRM_ERROR("Failed to get link status\n");
-		return;
+		DRM_DEBUG_KMS("Failed to get link status\n");
+		return false;
 	}
 
-	if (!intel_encoder->base.crtc)
-		return;
+	/* Check if the link is valid by reading the bits of Link status
+	 * registers
+	 */
+	if (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count)) {
+		DRM_DEBUG_KMS("Channel EQ or CR not ok, need to retrain\n");
+		return false;
+	}
 
-	if (!to_intel_crtc(intel_encoder->base.crtc)->active)
-		return;
+	return true;
 
-	/* if link training is requested we should perform it always */
-	if ((intel_dp->compliance_test_type == DP_TEST_LINK_TRAINING) ||
-	    (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count))) {
-		DRM_DEBUG_KMS("%s: channel EQ not ok, retraining\n",
-			      intel_encoder->base.name);
-		intel_dp_start_link_train(intel_dp);
-		intel_dp_stop_link_train(intel_dp);
-	}
 }
 
+
 /*
  * According to DP spec
  * 5.1.2:
@@ -3928,6 +3924,8 @@ static bool
 intel_dp_short_pulse(struct intel_dp *intel_dp)
 {
 	struct drm_device *dev = intel_dp_to_dev(intel_dp);
+	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
+	struct intel_encoder *intel_encoder = &intel_dig_port->base;
 	u8 sink_irq_vector = 0;
 	u8 old_sink_count = intel_dp->sink_count;
 	bool ret;
@@ -3968,8 +3966,17 @@ intel_dp_short_pulse(struct intel_dp *intel_dp)
 			DRM_DEBUG_DRIVER("CP or sink specific irq unhandled\n");
 	}
 
+	/* Do not train the link if there is no crtc */
+	if (!intel_encoder->base.crtc ||
+	    !to_intel_crtc(intel_encoder->base.crtc)->active)
+		return true;
+
 	drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
-	intel_dp_check_link_status(intel_dp);
+	if (!intel_dp_link_is_valid(intel_dp) ||
+	    intel_dp->compliance_test_type == DP_TEST_LINK_TRAINING) {
+		intel_dp_start_link_train(intel_dp);
+		intel_dp_stop_link_train(intel_dp);
+	}
 	drm_modeset_unlock(&dev->mode_config.connection_mutex);
 
 	return true;
@@ -4298,8 +4305,16 @@ intel_dp_long_pulse(struct intel_connector *intel_connector)
 		 * check links status, there has been known issues of
 		 * link loss triggerring long pulse
 		 */
+		/* Do not train the link if there is no crtc */
+		if (!intel_encoder->base.crtc ||
+		    !to_intel_crtc(intel_encoder->base.crtc)->active)
+			goto out;
+
 		drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
-		intel_dp_check_link_status(intel_dp);
+		if (!intel_dp_link_is_valid(intel_dp)) {
+			intel_dp_start_link_train(intel_dp);
+			intel_dp_stop_link_train(intel_dp);
+		}
 		drm_modeset_unlock(&dev->mode_config.connection_mutex);
 		goto out;
 	}
-- 
1.9.1
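The split described in the commit message above (a query that only reports link state, with the retrain decision left to each caller) can be sketched outside the driver. This is a minimal illustration only: `struct fake_dp` and every `fake_*` function are hypothetical stand-ins invented here, not i915 code, and the sketch assumes a retrain always succeeds.

```c
#include <stdbool.h>

/* Hypothetical stand-in for driver state; not the real struct intel_dp. */
struct fake_dp {
	bool has_active_crtc;    /* stands in for the crtc and crtc->active checks */
	bool status_readable;    /* stands in for intel_dp_get_link_status() succeeding */
	bool channel_eq_ok;      /* stands in for drm_dp_channel_eq_ok() */
	bool compliance_retrain; /* stands in for a DP_TEST_LINK_TRAINING request */
	int retrain_count;       /* number of start/stop link-train pairs issued */
};

/* Pure query: reports the link state and never retrains. */
static bool fake_link_is_valid(const struct fake_dp *dp)
{
	if (!dp->status_readable)
		return false;
	return dp->channel_eq_ok;
}

/* Assumption for the sketch: retraining always restores the link. */
static void fake_retrain(struct fake_dp *dp)
{
	dp->retrain_count++;
	dp->channel_eq_ok = true;
}

/* Short-pulse path: retrain on a bad link or on a compliance request. */
static void fake_short_pulse(struct fake_dp *dp)
{
	if (!dp->has_active_crtc)
		return;
	if (!fake_link_is_valid(dp) || dp->compliance_retrain)
		fake_retrain(dp);
}

/* Long-pulse path: retrain only when the link is bad. */
static void fake_long_pulse(struct fake_dp *dp)
{
	if (!dp->has_active_crtc)
		return;
	if (!fake_link_is_valid(dp))
		fake_retrain(dp);
}
```

Keeping the query free of side effects is what lets the two callers apply different retrain policies, as the short-pulse and long-pulse hunks in the patch do.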
Re: [Intel-gfx] [PATCH v3] drm/i915/dp: DP audio API changes for MST
On Sat, 2016-08-13 at 00:16 +, Pandiyan, Dhinakaran wrote:
> On Fri, 2016-08-12 at 08:18 +0300, Ville Syrjälä wrote:
> > On Fri, Aug 12, 2016 at 04:28:09AM +, Pandiyan, Dhinakaran wrote:
> > > On Thu, 2016-08-11 at 10:39 +0300, Ville Syrjälä wrote:
> > > > On Thu, Aug 11, 2016 at 07:10:39AM +, Pandiyan, Dhinakaran wrote:
> > > > > On Thu, 2016-08-11 at 09:26 +0300, Ville Syrjälä wrote:
> > > > > > On Wed, Aug 10, 2016 at 12:41:57PM -0700, Dhinakaran Pandiyan wrote:
> > > > > > > DP MST provides the capability to send multiple video and audio streams
> > > > > > > through a single port. This requires the API's between i915 and audio
> > > > > > > drivers to distinguish between multiple audio capable displays that can be
> > > > > > > connected to a port. Currently only the port identity is shared in the
> > > > > > > APIs. This patch adds support for MST with an additional parameter
> > > > > > > 'int pipe'. The existing parameter 'port' does not change it's meaning.
> > > > > > >
> > > > > > > pipe =
> > > > > > > 	MST	: display pipe that the stream originates from
> > > > > > > 	Non-MST	: -1
> > > > > > >
> > > > > > > Affected APIs:
> > > > > > > struct i915_audio_component_ops
> > > > > > > -	int (*sync_audio_rate)(struct device *, int port, int rate);
> > > > > > > +	int (*sync_audio_rate)(struct device *, int port, int pipe,
> > > > > > > +			       int rate);
> > > > > > >
> > > > > > > -	int (*get_eld)(struct device *, int port, bool *enabled,
> > > > > > > -		       unsigned char *buf, int max_bytes);
> > > > > > > +	int (*get_eld)(struct device *, int port, int pipe,
> > > > > > > +		       bool *enabled, unsigned char *buf, int max_bytes);
> > > > > > >
> > > > > > > struct i915_audio_component_audio_ops
> > > > > > > -	void (*pin_eld_notify)(void *audio_ptr, int port);
> > > > > > > +	void (*pin_eld_notify)(void *audio_ptr, int port, int pipe);
> > > > > > >
> > > > > > > This patch makes dummy changes in the audio drivers (Libin) for build to
> > > > > > > succeed. The audio side drivers will send the right 'pipe' values in
> > > > > > > patches that will follow.
> > > > > > >
> > > > > > > v2:
> > > > > > > Renamed the new API parameter from 'dev_id' to 'pipe'. (Jim, Ville)
> > > > > > > Included Asoc driver API compatibility changes from Jeeja.
> > > > > > > Added WARN_ON() for invalid pipe in get_saved_encoder(). (Takashi)
> > > > > > > Added comment for av_enc_map[] definition. (Takashi)
> > > > > > >
> > > > > > > v3:
> > > > > > > Fixed logic error introduced while renaming 'dev_id' as 'pipe' (Ville)
> > > > > > > Renamed get_saved_encoder() to get_saved_enc() to reduce line length
> > > > > > >
> > > > > > > Signed-off-by: Dhinakaran Pandiyan
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/i915/i915_drv.h    |  3 +-
> > > > > > >  drivers/gpu/drm/i915/intel_audio.c | 93 ++
> > > > > > >  include/drm/i915_component.h       |  6 +--
> > > > > > >  include/sound/hda_i915.h           | 11 +++--
> > > > > > >  sound/hda/hdac_i915.c              |  9 ++--
> > > > > > >  sound/pci/hda/patch_hdmi.c         |  7 +--
> > > > > > >  sound/soc/codecs/hdac_hdmi.c       |  2 +-
> > > > > > >  7 files changed, 86 insertions(+), 45 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > > > > > index c36d176..8e4a88f 100644
> > > > > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > > > @@ -2036,7 +2036,8 @@ struct drm_i915_private {
> > > > > > >  	/* perform PHY state sanity checks? */
> > > > > > >  	bool chv_phy_assert[2];
> > > > > > >
> > > > > > > -	struct intel_encoder *dig_port_map[I915_MAX_PORTS];
> > > > > > > +	/* Used to save the pipe-to-encoder mapping for audio */
> > > > > > > +	struct intel_encoder *av_enc_map[I915_MAX_PIPES];
> > > > > > >
> > > > > > >  	/*
> > > > > > >  	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
> > > > > > > diff --git a/drivers/gpu/drm/i915/intel_audio.c b/drivers/gpu/drm/i915/intel_audio.c
> > > > > > > index ef20875..a7467ea 100644
> > > > > > > --- a/drivers/gpu/drm/i915/intel_audio.c
> > > > > > > +++ b/drivers/gpu/drm/i915/intel_audio.c
> > > > > > > @@ -500,6 +500,7 @@ void intel_audio_codec_enable(struct intel_encoder *intel_encoder)
> > > > > > >  	struct i915_audio_component *acomp = dev_priv->audio_component;
> > > > > > >  	struct intel_digital_port *intel_dig_port = enc_to_dig_port(encoder);
> > > > > > >  	enum port port = intel_dig_port->port;
> > > > > > > +	enum pipe pipe = crtc->pipe;
> > > > > > >
Re: [Intel-gfx] [PATCH v2] drm/i915: intel_dp_link_is_valid() should only return status of link
On Fri, Aug 12, 2016 at 02:50:58PM -0700, Pandiyan, Dhinakaran wrote:
> On Fri, 2016-08-12 at 10:56 -0700, Manasi Navare wrote:
> > On Thu, Aug 11, 2016 at 08:18:54PM -0700, Pandiyan, Dhinakaran wrote:
> > > On Thu, 2016-08-11 at 15:23 -0700, Manasi Navare wrote:
> > > > Intel_dp_link_is_valid() function reads the Link status registers
> > > > and returns a boolean to indicate link is valid or not.
> > > > If the link has lost lock and is not valid any more, link
> > > > training is performed outside the function else previously trained link
> > > > is retained.
> > > > This gives us flexibility of checking whether link is valid and training
> > > > it independently.
> > > >
> > > > v2:
> > > > * Changed the function name from intel_dp_check_link_status()
> > > > to intel_dp_link_is_valid() (Lukas Wunner)
> > > > * Checks for CRTC and active CRTC are moved outside the
> > > > intel_dp_link_is_valid() function (Rodrigo Vivi)
> > > >
> > > > Signed-off-by: Manasi Navare
> > > > ---
> > > >  drivers/gpu/drm/i915/intel_dp.c | 56 +++--
> > > >  1 file changed, 37 insertions(+), 19 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> > > > index 364db90..891147d 100644
> > > > --- a/drivers/gpu/drm/i915/intel_dp.c
> > > > +++ b/drivers/gpu/drm/i915/intel_dp.c
> > > > @@ -3881,36 +3881,33 @@ go_again:
> > > >  	return -EINVAL;
> > > >  }
> > > >
> > > > -static void
> > > > -intel_dp_check_link_status(struct intel_dp *intel_dp)
> > > > +static bool
> > > > +intel_dp_link_is_valid(struct intel_dp *intel_dp)
> > > >  {
> > > > -	struct intel_encoder *intel_encoder = &dp_to_dig_port(intel_dp)->base;
> > > >  	struct drm_device *dev = intel_dp_to_dev(intel_dp);
> > > >  	u8 link_status[DP_LINK_STATUS_SIZE];
> > > >
> > > >  	WARN_ON(!drm_modeset_is_locked(&dev->mode_config.connection_mutex));
> > > >
> > > >  	if (!intel_dp_get_link_status(intel_dp, link_status)) {
> > > > -		DRM_ERROR("Failed to get link status\n");
> > > > -		return;
> > > > +		DRM_DEBUG_KMS("Failed to get link status\n");
> > > > +		return false;
> > > >  	}
> > > >
> > > > -	if (!intel_encoder->base.crtc)
> > > > -		return;
> > > > +	/* Check if the link is valid by reading the bits of Link status
> > > > +	 * registers
> > > > +	 */
> > > > +	if (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count)) {
> > > > +		DRM_DEBUG_KMS("Channel EQ or CR not ok, need to retrain\n");
> > > drm_dp_channel_eq_ok() does not check for CR. Should we just say
> > > "Channel EQ not ok" to preempt ambiguity while debugging ?
> >
> > Actually this macro checks for DP_CHANNEL_EQ_BITS which is defined as:
> > #define DP_CHANNEL_EQ_BITS (DP_LANE_CR_DONE | \
> >                             DP_LANE_CHANNEL_EQ_DONE | \
> >                             DP_LANE_SYMBOL_LOCKED)
> > So it includes checking for Channel EQ and Clock Recovery CR bits
> >
>
> Thank you, I should have looked hard. I will leave this to you.
>
> > >
> > > > +		return false;
> > > > +	}
> > > >
> > > > -	if (!to_intel_crtc(intel_encoder->base.crtc)->active)
> > > > -		return;
> > > > +	DRM_DEBUG_KMS("Link is good, no need to retrain\n");
> > > The caller does not expect us to link train anymore, I don't think we
> > > have to explicitly state "no need to retrain". Also, do we need debug
> > > messages if the link is good?
> >
> > I agree, maybe this is not needed. I will remove this
> >
> > >
> > > > +	return true;
> > > >
> > > > -	/* if link training is requested we should perform it always */
> > > > -	if ((intel_dp->compliance_test_type == DP_TEST_LINK_TRAINING) ||
> > > > -	    (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count))) {
> > > > -		DRM_DEBUG_KMS("%s: channel EQ not ok, retraining\n",
> > > > -			      intel_encoder->base.name);
> > > > -		intel_dp_start_link_train(intel_dp);
> > > > -		intel_dp_stop_link_train(intel_dp);
> > > > -	}
> > > >  }
> > > >
> > > > +
> > > >  /*
> > > >   * According to DP spec
> > > >   * 5.1.2:
> > > > @@ -3928,6 +3925,8 @@ static bool
> > > >  intel_dp_short_pulse(struct intel_dp *intel_dp)
> > > >  {
> > > >  	struct drm_device *dev = intel_dp_to_dev(intel_dp);
> > > > +	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
> > > > +	struct intel_encoder *intel_encoder = &intel_dig_port->base;
> > > >  	u8 sink_irq_vector = 0;
> > > >  	u8 old_sink_count = intel_dp->sink_count;
> > > >  	bool ret;
> > > > @@ -3968,8 +3967,18 @@ intel_dp_short_pulse(struct intel_dp *intel_dp)
> > > >  			DRM_DEBUG_D
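The point made in the review exchange above, that testing against DP_CHANNEL_EQ_BITS covers clock recovery as well as channel equalization and symbol lock, can be shown with a small standalone check. This is a sketch, not the kernel helper: `sketch_channel_eq_ok()` is a hypothetical name, the bit positions are assumed to match the per-lane status layout in the DRM DP helper header, and the input is assumed to hold one already-unpacked status nibble per lane.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed per-lane status bit values (same names as the DRM DP helper header). */
#define DP_LANE_CR_DONE         (1 << 0)
#define DP_LANE_CHANNEL_EQ_DONE (1 << 1)
#define DP_LANE_SYMBOL_LOCKED   (1 << 2)

/* Composition exactly as quoted in the thread. */
#define DP_CHANNEL_EQ_BITS (DP_LANE_CR_DONE |		\
			    DP_LANE_CHANNEL_EQ_DONE |	\
			    DP_LANE_SYMBOL_LOCKED)

/*
 * Hypothetical helper: lane_status[] holds one status nibble per lane.
 * A lane passes only if all three bits are set, so a lane that lost
 * clock recovery fails this check even if its EQ bit is still set.
 */
static bool sketch_channel_eq_ok(const uint8_t *lane_status, int lane_count)
{
	for (int lane = 0; lane < lane_count; lane++) {
		if ((lane_status[lane] & DP_CHANNEL_EQ_BITS) != DP_CHANNEL_EQ_BITS)
			return false;
	}
	return true;
}
```

The in-kernel helper is understood to also unpack the nibbles from the raw DPCD bytes and to look at inter-lane alignment; both steps are left out of this sketch.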
Re: [Intel-gfx] [PATCH v3] drm/i915/dp: DP audio API changes for MST
On Fri, 2016-08-12 at 08:18 +0300, Ville Syrjälä wrote:
> On Fri, Aug 12, 2016 at 04:28:09AM +, Pandiyan, Dhinakaran wrote:
> > On Thu, 2016-08-11 at 10:39 +0300, Ville Syrjälä wrote:
> > > On Thu, Aug 11, 2016 at 07:10:39AM +, Pandiyan, Dhinakaran wrote:
> > > > On Thu, 2016-08-11 at 09:26 +0300, Ville Syrjälä wrote:
> > > > > On Wed, Aug 10, 2016 at 12:41:57PM -0700, Dhinakaran Pandiyan wrote:
> > > > > > DP MST provides the capability to send multiple video and audio streams
> > > > > > through a single port. This requires the API's between i915 and audio
> > > > > > drivers to distinguish between multiple audio capable displays that can be
> > > > > > connected to a port. Currently only the port identity is shared in the
> > > > > > APIs. This patch adds support for MST with an additional parameter
> > > > > > 'int pipe'. The existing parameter 'port' does not change it's meaning.
> > > > > >
> > > > > > pipe =
> > > > > > 	MST	: display pipe that the stream originates from
> > > > > > 	Non-MST	: -1
> > > > > >
> > > > > > Affected APIs:
> > > > > > struct i915_audio_component_ops
> > > > > > -	int (*sync_audio_rate)(struct device *, int port, int rate);
> > > > > > +	int (*sync_audio_rate)(struct device *, int port, int pipe,
> > > > > > +			       int rate);
> > > > > >
> > > > > > -	int (*get_eld)(struct device *, int port, bool *enabled,
> > > > > > -		       unsigned char *buf, int max_bytes);
> > > > > > +	int (*get_eld)(struct device *, int port, int pipe,
> > > > > > +		       bool *enabled, unsigned char *buf, int max_bytes);
> > > > > >
> > > > > > struct i915_audio_component_audio_ops
> > > > > > -	void (*pin_eld_notify)(void *audio_ptr, int port);
> > > > > > +	void (*pin_eld_notify)(void *audio_ptr, int port, int pipe);
> > > > > >
> > > > > > This patch makes dummy changes in the audio drivers (Libin) for build to
> > > > > > succeed. The audio side drivers will send the right 'pipe' values in
> > > > > > patches that will follow.
> > > > > >
> > > > > > v2:
> > > > > > Renamed the new API parameter from 'dev_id' to 'pipe'. (Jim, Ville)
> > > > > > Included Asoc driver API compatibility changes from Jeeja.
> > > > > > Added WARN_ON() for invalid pipe in get_saved_encoder(). (Takashi)
> > > > > > Added comment for av_enc_map[] definition. (Takashi)
> > > > > >
> > > > > > v3:
> > > > > > Fixed logic error introduced while renaming 'dev_id' as 'pipe' (Ville)
> > > > > > Renamed get_saved_encoder() to get_saved_enc() to reduce line length
> > > > > >
> > > > > > Signed-off-by: Dhinakaran Pandiyan
> > > > > > ---
> > > > > >  drivers/gpu/drm/i915/i915_drv.h    |  3 +-
> > > > > >  drivers/gpu/drm/i915/intel_audio.c | 93 ++
> > > > > >  include/drm/i915_component.h       |  6 +--
> > > > > >  include/sound/hda_i915.h           | 11 +++--
> > > > > >  sound/hda/hdac_i915.c              |  9 ++--
> > > > > >  sound/pci/hda/patch_hdmi.c         |  7 +--
> > > > > >  sound/soc/codecs/hdac_hdmi.c       |  2 +-
> > > > > >  7 files changed, 86 insertions(+), 45 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > > > > index c36d176..8e4a88f 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > > @@ -2036,7 +2036,8 @@ struct drm_i915_private {
> > > > > >  	/* perform PHY state sanity checks? */
> > > > > >  	bool chv_phy_assert[2];
> > > > > >
> > > > > > -	struct intel_encoder *dig_port_map[I915_MAX_PORTS];
> > > > > > +	/* Used to save the pipe-to-encoder mapping for audio */
> > > > > > +	struct intel_encoder *av_enc_map[I915_MAX_PIPES];
> > > > > >
> > > > > >  	/*
> > > > > >  	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
> > > > > > diff --git a/drivers/gpu/drm/i915/intel_audio.c b/drivers/gpu/drm/i915/intel_audio.c
> > > > > > index ef20875..a7467ea 100644
> > > > > > --- a/drivers/gpu/drm/i915/intel_audio.c
> > > > > > +++ b/drivers/gpu/drm/i915/intel_audio.c
> > > > > > @@ -500,6 +500,7 @@ void intel_audio_codec_enable(struct intel_encoder *intel_encoder)
> > > > > >  	struct i915_audio_component *acomp = dev_priv->audio_component;
> > > > > >  	struct intel_digital_port *intel_dig_port = enc_to_dig_port(encoder);
> > > > > >  	enum port port = intel_dig_port->port;
> > > > > > +	enum pipe pipe = crtc->pipe;
> > > > > >
> > > > > >  	connector = drm_select_eld(encoder);
> > > > > >  	if (!connector)
> > > > > > @@ -524,12 +525,18 @@ void intel_audio_codec_enable(struct intel_encoder *intel_encoder)
> > > > > >
> > > > > >  	mutex_lock(&dev_priv->av_mutex);
> > > > > >  	intel_encoder->audi
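The quoted commit message defines 'pipe' as the originating display pipe for MST and -1 for non-MST, with encoders saved per pipe in av_enc_map[]. One way such a lookup could work is sketched below. The body of get_saved_enc() is not visible in the quoted text, so everything here is an assumption: the `sketch_*` names, the reduced encoder struct, and the non-MST search by port are all hypothetical.

```c
#include <stddef.h>

#define SKETCH_MAX_PIPES 3 /* assumption; the patch uses I915_MAX_PIPES */

/* Hypothetical, trimmed-down encoder: only what the lookup needs. */
struct sketch_encoder {
	int port;
	int is_mst;
};

/* Mirrors the idea of av_enc_map[]: one saved encoder per pipe. */
struct sketch_priv {
	struct sketch_encoder *av_enc_map[SKETCH_MAX_PIPES];
};

/*
 * pipe >= 0  : MST, look the encoder up directly by pipe.
 * pipe == -1 : non-MST, search for the saved encoder driving 'port'.
 */
static struct sketch_encoder *
sketch_get_saved_enc(struct sketch_priv *priv, int port, int pipe)
{
	/* The changelog mentions a WARN_ON() for invalid pipes; here we just bail. */
	if (pipe >= SKETCH_MAX_PIPES || pipe < -1)
		return NULL;

	if (pipe >= 0) {
		struct sketch_encoder *enc = priv->av_enc_map[pipe];

		return (enc && enc->is_mst && enc->port == port) ? enc : NULL;
	}

	for (int i = 0; i < SKETCH_MAX_PIPES; i++) {
		struct sketch_encoder *enc = priv->av_enc_map[i];

		if (enc && !enc->is_mst && enc->port == port)
			return enc;
	}
	return NULL;
}
```

Indexing by pipe rather than by port is what allows several MST streams behind one port to be told apart, which is the stated motivation for replacing dig_port_map[] with av_enc_map[].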
Re: [Intel-gfx] drm/i915/fbc: disable FBC on FIFO underruns
On Fri, 2016-06-10 at 22:18 -0300, Paulo Zanoni wrote:
> Ever since I started working on FBC I was already aware that FBC can
> really amplify the FIFO underrun symptoms. On systems where FIFO
> underruns were harmless error messages, enabling FBC would cause the
> underruns to give black screens.
>

Do we know why we get black screens in this scenario?

> We recently tried to enable FBC on Haswell and got reports of a system
> that would hang after some hours of uptime, and the first bad commit
> was the one that enabled FBC. We also observed that this system had
> FIFO underrun error messages on its dmesg. Although we don't have any
> evidence that fixing the underruns would solve the bug and make FBC
> work properly on this machine, IMHO it's better if we minimize the
> amount of possible problems by just giving up FBC whenever we detect
> an underrun.
>
> v2: new version, different implementation and commit message.
>
> Cc: Stefan Richter
> Cc: Lyude
> Cc: Steven Honeyman
> Signed-off-by: Paulo Zanoni
> ---
>  drivers/gpu/drm/i915/i915_drv.h            |  3 ++
>  drivers/gpu/drm/i915/intel_drv.h           |  1 +
>  drivers/gpu/drm/i915/intel_fbc.c           | 53 ++
>  drivers/gpu/drm/i915/intel_fifo_underrun.c |  2 ++
>  4 files changed, 59 insertions(+)
>
>
> Since my test machines don't produce FIFO underrun errors, I tested this by
> creating a debugfs file that just calls intel_fbc_handle_fifo_underrun(). I'd
> appreciate some Tested-by tags, if possible.
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 20a676d..18b4257 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -908,6 +908,9 @@ struct intel_fbc {
>  	bool enabled;
>  	bool active;
>
> +	bool underrun_detected;
> +	struct work_struct underrun_work;
> +
>  	struct intel_fbc_state_cache {
>  		struct {
>  			unsigned int mode_flags;
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index ebe7b34..7bf97b1 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1436,6 +1436,7 @@ void intel_fbc_invalidate(struct drm_i915_private *dev_priv,
>  void intel_fbc_flush(struct drm_i915_private *dev_priv,
>  		     unsigned int frontbuffer_bits, enum fb_op_origin origin);
>  void intel_fbc_cleanup_cfb(struct drm_i915_private *dev_priv);
> +void intel_fbc_handle_fifo_underrun(struct drm_i915_private *dev_priv);
>
>  /* intel_hdmi.c */
>  void intel_hdmi_init(struct drm_device *dev, i915_reg_t hdmi_reg, enum port port);
> diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
> index d268f76..2363bff 100644
> --- a/drivers/gpu/drm/i915/intel_fbc.c
> +++ b/drivers/gpu/drm/i915/intel_fbc.c
> @@ -755,6 +755,13 @@ static bool intel_fbc_can_activate(struct intel_crtc *crtc)
>  	struct intel_fbc *fbc = &dev_priv->fbc;
>  	struct intel_fbc_state_cache *cache = &fbc->state_cache;
>
> +	/* We don't need to use a state cache here since this information is
> +	 * global for every CRTC. */
> +	if (fbc->underrun_detected) {
> +		fbc->no_fbc_reason = "underrun detected";
> +		return false;
> +	}
> +
>  	if (!cache->plane.visible) {
>  		fbc->no_fbc_reason = "primary plane not visible";
>  		return false;
> @@ -1195,6 +1202,51 @@ void intel_fbc_global_disable(struct drm_i915_private *dev_priv)
>  	cancel_work_sync(&fbc->work.work);
>  }
>
> +static void intel_fbc_underrun_work_fn(struct work_struct *work)
> +{
> +	struct drm_i915_private *dev_priv =
> +		container_of(work, struct drm_i915_private, fbc.underrun_work);
> +	struct intel_fbc *fbc = &dev_priv->fbc;
> +
> +	mutex_lock(&fbc->lock);
> +
> +	/* Maybe we were scheduled twice. */
> +	if (fbc->underrun_detected)
> +		goto out;
> +
> +	DRM_DEBUG_KMS("Disabling FBC due to FIFO underrun.\n");
> +	fbc->underrun_detected = true;
> +
> +	intel_fbc_deactivate(dev_priv);
> +out:
> +	mutex_unlock(&fbc->lock);
> +}
> +
> +/**
> + * intel_fbc_handle_fifo_underrun - disable FBC when we get a FIFO underrun
> + * @dev_priv: i915 device instance
> + *
> + * Without FBC, most underruns are harmless and don't really cause too many
> + * problems, except for an annoying message on dmesg. With FBC, underruns can
> + * become black screens or even worse, especially when paired with bad
> + * watermarks. So in order for us to be on the safe side, completely disable FBC
> + * in case we ever detect a FIFO underrun on any pipe. An underrun on any pipe
> + * already suggests that watermarks may be bad, so try to be as safe as
> + * possible.
> + */
> +void intel_fbc_handle_fifo_underrun(struct drm_i915_private *dev_priv)
> +{
> +	struct intel_fbc *fbc = &dev_priv->fbc;
> +
> +	if (!fbc_sup
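The quoted patch combines two ideas: a sticky underrun_detected flag that the activation check consults first, and a work function that tolerates being scheduled more than once. A reduced, single-threaded sketch of that state machine follows. `struct sketch_fbc` and the `sketch_*` functions are hypothetical stand-ins, and the locking the real work function does under fbc->lock is deliberately omitted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced FBC state; the real struct intel_fbc has many more fields. */
struct sketch_fbc {
	bool active;
	bool underrun_detected;    /* sticky: never cleared in this sketch */
	const char *no_fbc_reason;
	int deactivate_calls;      /* counts how often FBC was actually torn down */
};

/* Work-item body: idempotent, so a double schedule deactivates only once. */
static void sketch_underrun_work(struct sketch_fbc *fbc)
{
	if (fbc->underrun_detected)
		return; /* maybe we were scheduled twice */

	fbc->underrun_detected = true;
	fbc->active = false;
	fbc->deactivate_calls++;
}

/* Activation check: the global underrun flag vetoes FBC before anything else. */
static bool sketch_can_activate(struct sketch_fbc *fbc)
{
	if (fbc->underrun_detected) {
		fbc->no_fbc_reason = "underrun detected";
		return false;
	}
	return true;
}
```

In the patch the deactivation is deferred to a work item, presumably because underruns are reported from a context where taking fbc->lock is not possible; the sketch only models the resulting state transitions.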
Re: [Intel-gfx] [PATCH v2] drm/i915: intel_dp_link_is_valid() should only return status of link
On Fri, 2016-08-12 at 10:56 -0700, Manasi Navare wrote:
> On Thu, Aug 11, 2016 at 08:18:54PM -0700, Pandiyan, Dhinakaran wrote:
> > On Thu, 2016-08-11 at 15:23 -0700, Manasi Navare wrote:
> > > Intel_dp_link_is_valid() function reads the Link status registers
> > > and returns a boolean to indicate link is valid or not.
> > > If the link has lost lock and is not valid any more, link
> > > training is performed outside the function else previously trained link
> > > is retained.
> > > This gives us flexibility of checking whether link is valid and training
> > > it independently.
> > >
> > > v2:
> > > * Changed the function name from intel_dp_check_link_status()
> > > to intel_dp_link_is_valid() (Lukas Wunner)
> > > * Checks for CRTC and active CRTC are moved outside the
> > > intel_dp_link_is_valid() function (Rodrigo Vivi)
> > >
> > > Signed-off-by: Manasi Navare
> > > ---
> > >  drivers/gpu/drm/i915/intel_dp.c | 56 +++--
> > >  1 file changed, 37 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> > > index 364db90..891147d 100644
> > > --- a/drivers/gpu/drm/i915/intel_dp.c
> > > +++ b/drivers/gpu/drm/i915/intel_dp.c
> > > @@ -3881,36 +3881,33 @@ go_again:
> > >  	return -EINVAL;
> > >  }
> > >
> > > -static void
> > > -intel_dp_check_link_status(struct intel_dp *intel_dp)
> > > +static bool
> > > +intel_dp_link_is_valid(struct intel_dp *intel_dp)
> > >  {
> > > -	struct intel_encoder *intel_encoder = &dp_to_dig_port(intel_dp)->base;
> > >  	struct drm_device *dev = intel_dp_to_dev(intel_dp);
> > >  	u8 link_status[DP_LINK_STATUS_SIZE];
> > >
> > >  	WARN_ON(!drm_modeset_is_locked(&dev->mode_config.connection_mutex));
> > >
> > >  	if (!intel_dp_get_link_status(intel_dp, link_status)) {
> > > -		DRM_ERROR("Failed to get link status\n");
> > > -		return;
> > > +		DRM_DEBUG_KMS("Failed to get link status\n");
> > > +		return false;
> > >  	}
> > >
> > > -	if (!intel_encoder->base.crtc)
> > > -		return;
> > > +	/* Check if the link is valid by reading the bits of Link status
> > > +	 * registers
> > > +	 */
> > > +	if (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count)) {
> > > +		DRM_DEBUG_KMS("Channel EQ or CR not ok, need to retrain\n");
> > drm_dp_channel_eq_ok() does not check for CR. Should we just say
> > "Channel EQ not ok" to preempt ambiguity while debugging ?
>
> Actually this macro checks for DP_CHANNEL_EQ_BITS which is defined as:
> #define DP_CHANNEL_EQ_BITS (DP_LANE_CR_DONE | \
>                             DP_LANE_CHANNEL_EQ_DONE | \
>                             DP_LANE_SYMBOL_LOCKED)
> So it includes checking for Channel EQ and Clock Recovery CR bits
>

Thank you, I should have looked hard. I will leave this to you.

> >
> > > +		return false;
> > > +	}
> > >
> > > -	if (!to_intel_crtc(intel_encoder->base.crtc)->active)
> > > -		return;
> > > +	DRM_DEBUG_KMS("Link is good, no need to retrain\n");
> > The caller does not expect us to link train anymore, I don't think we
> > have to explicitly state "no need to retrain". Also, do we need debug
> > messages if the link is good?
>
> I agree, maybe this is not needed. I will remove this
>
> >
> > > +	return true;
> > >
> > > -	/* if link training is requested we should perform it always */
> > > -	if ((intel_dp->compliance_test_type == DP_TEST_LINK_TRAINING) ||
> > > -	    (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count))) {
> > > -		DRM_DEBUG_KMS("%s: channel EQ not ok, retraining\n",
> > > -			      intel_encoder->base.name);
> > > -		intel_dp_start_link_train(intel_dp);
> > > -		intel_dp_stop_link_train(intel_dp);
> > > -	}
> > >  }
> > >
> > > +
> > >  /*
> > >   * According to DP spec
> > >   * 5.1.2:
> > > @@ -3928,6 +3925,8 @@ static bool
> > >  intel_dp_short_pulse(struct intel_dp *intel_dp)
> > >  {
> > >  	struct drm_device *dev = intel_dp_to_dev(intel_dp);
> > > +	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
> > > +	struct intel_encoder *intel_encoder = &intel_dig_port->base;
> > >  	u8 sink_irq_vector = 0;
> > >  	u8 old_sink_count = intel_dp->sink_count;
> > >  	bool ret;
> > > @@ -3968,8 +3967,18 @@ intel_dp_short_pulse(struct intel_dp *intel_dp)
> > >  			DRM_DEBUG_DRIVER("CP or sink specific irq unhandled\n");
> > >  	}
> > >
> > > +	/* Do not train the link if there is no crtc */
> > > +	if (!intel_encoder->base.crtc)
> > > +		return true;
> > > +	if (!to_intel_crtc(intel_encoder->base.crtc)->active)
> > > +		return true;
> > > +
> > I might be completely off base here. Shouldn't we keep the link valid
> > irrespective of whether there is an active crtc? I thought that is what
> > the refactoring is supposed to enable. Does intel_dp_short_pulse() get
> > called when there is a link loss during upfront link tr
Re: [Intel-gfx] [PATCH v2 1/2] drm/i915/mst: Validate modes against available link bandwidth
Hi Anusha,

[auto build test ERROR on drm/drm-next]
[also build test ERROR on v4.8-rc1 next-20160812]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Anusha-Srivatsa/drm-i915-mst-Validate-modes-against-available-link-bandwidth/20160813-050818
base:   git://people.freedesktop.org/~airlied/linux.git drm-next
config: i386-randconfig-x014-201632 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

Note: the linux-review/Anusha-Srivatsa/drm-i915-mst-Validate-modes-against-available-link-bandwidth/20160813-050818 HEAD 413b1bb45dc2c58540a17c5ca642b6aee2c97405 builds fine.
      It only hurts bisectibility.

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/intel_dp_mst.c: In function 'intel_dp_mst_mode_valid':
>> drivers/gpu/drm/i915/intel_dp_mst.c:363:14: error: implicit declaration of function 'drm_dp_mst_get_avail_pbn' [-Werror=implicit-function-declaration]
     avail_pbn = drm_dp_mst_get_avail_pbn(mgr, port);
                 ^~~~
   cc1: some warnings being treated as errors

vim +/drm_dp_mst_get_avail_pbn +363 drivers/gpu/drm/i915/intel_dp_mst.c

   357		struct intel_connector *intel_connector = to_intel_connector(connector);
   358		struct intel_dp *intel_dp = intel_connector->mst_port;
   359		struct drm_dp_mst_topology_mgr *mgr = &intel_dp->mst_mgr;
   360		struct drm_dp_mst_port *port = (struct drm_dp_mst_port *) (intel_connector->port);
   361		int max_dotclk = to_i915(connector->dev)->max_dotclk_freq;
   362	
 > 363		avail_pbn = drm_dp_mst_get_avail_pbn(mgr, port);
   364		req_pbn = drm_dp_calc_pbn_mode(mode->clock, 24);
   365		if (req_pbn > avail_pbn)
   366			return MODE_H_ILLEGAL;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v2 2/2] drm/mst: A Helper function that returns available link bandwidth
Add a function that returns the available link bandwidth for MST port
so that we can accurately determine whether a new mode is valid for
the link or not.

v2: Put the Signed-off to the end of commit message

Cc: dri-de...@lists.freedesktop.org
Cc: dhinakaran.pandi...@intel.com
Signed-off-by: Anusha Srivatsa
---
 drivers/gpu/drm/drm_dp_mst_topology.c | 12 
 include/drm/drm_dp_mst_helper.h       |  1 +
 2 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index 04e4571..7a239f6 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -43,6 +43,8 @@ static bool dump_dp_payload_table(struct drm_dp_mst_topology_mgr *mgr,
 				  char *buf);
 static int test_calc_pbn_mode(void);
 
+int drm_dp_mst_get_avail_pbn(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_port *port);
+
 static void drm_dp_put_port(struct drm_dp_mst_port *port);
 
 static int drm_dp_dpcd_write_payload(struct drm_dp_mst_topology_mgr *mgr,
@@ -2730,6 +2732,16 @@ static int test_calc_pbn_mode(void)
 	return 0;
 }
 
+int drm_dp_mst_get_avail_pbn(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_port *port)
+{
+port = drm_dp_get_validated_port_ref(mgr,port);
+if (port)
+return port->available_pbn;
+
+return -EINVAL;
+}
+EXPORT_SYMBOL(drm_dp_mst_get_avail_pbn);
+
 /* we want to kick the TX after we've ack the up/down IRQs. */
 static void drm_dp_mst_kick_tx(struct drm_dp_mst_topology_mgr *mgr)
 {
diff --git a/include/drm/drm_dp_mst_helper.h b/include/drm/drm_dp_mst_helper.h
index 0032076..74dc4ab 100644
--- a/include/drm/drm_dp_mst_helper.h
+++ b/include/drm/drm_dp_mst_helper.h
@@ -576,6 +576,7 @@ struct edid *drm_dp_mst_get_edid(struct drm_connector *connector, struct drm_dp_
 
 int drm_dp_calc_pbn_mode(int clock, int bpp);
 
+int drm_dp_mst_get_avail_pbn(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_port *port);
 
 bool drm_dp_mst_allocate_vcpi(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_port *port, int pbn, int *slots);
 
-- 
2.7.4
[Intel-gfx] [PATCH v2 1/2] drm/i915/mst: Validate modes against available link bandwidth
Validate the modes against available link bandwidth rather than maximum
link bandwidth so that we have a better idea as to whether a proposed
mode can truly run beside existing stream.

v2: Put the Signed-off to the end of the commit message

Cc: dhinakaran.pandi...@intel.com
Signed-off-by: Anusha Srivatsa
---
 drivers/gpu/drm/i915/intel_dp_mst.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_dp_mst.c b/drivers/gpu/drm/i915/intel_dp_mst.c
index 629337d..e7e87d7 100644
--- a/drivers/gpu/drm/i915/intel_dp_mst.c
+++ b/drivers/gpu/drm/i915/intel_dp_mst.c
@@ -352,13 +352,23 @@ static enum drm_mode_status
 intel_dp_mst_mode_valid(struct drm_connector *connector,
 			struct drm_display_mode *mode)
 {
+	int req_pbn = 0;
+	int avail_pbn = 0;
+	struct intel_connector *intel_connector = to_intel_connector(connector);
+	struct intel_dp *intel_dp = intel_connector->mst_port;
+	struct drm_dp_mst_topology_mgr *mgr = &intel_dp->mst_mgr;
+	struct drm_dp_mst_port *port = (struct drm_dp_mst_port *) (intel_connector->port);
 	int max_dotclk = to_i915(connector->dev)->max_dotclk_freq;
 
-	/* TODO - validate mode against available PBN for link */
+	avail_pbn = drm_dp_mst_get_avail_pbn(mgr, port);
+	req_pbn = drm_dp_calc_pbn_mode(mode->clock, 24);
+	if (req_pbn > avail_pbn)
+		return MODE_H_ILLEGAL;
+
 	if (mode->clock < 1)
 		return MODE_CLOCK_LOW;
 
-	if (mode->flags & DRM_MODE_FLAG_DBLCLK)
+if (mode->flags & DRM_MODE_FLAG_DBLCLK)
 		return MODE_H_ILLEGAL;
 
 	if (mode->clock > max_dotclk)
-- 
2.7.4
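
For readers following the req_pbn > avail_pbn comparison in the patch above, here is a minimal, self-contained sketch of the arithmetic in plain C. It assumes the PBN helper takes the stream rate (pixel clock in kHz times bits per pixel), applies a 1.006 link margin, expresses the result in units of 54/64 MByte/s, and rounds up; that is my reading of drm_dp_calc_pbn_mode() in kernels of this era, not something the patch states. The names calc_pbn_sketch() and mode_fits_sketch() are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for drm_dp_calc_pbn_mode().
 * Assumed formula: PBN = ceil(clock_khz * bpp * 1.006 * (64 / 54) / 8 / 1000),
 * evaluated with integers so that no floating point is needed.
 */
static int calc_pbn_sketch(int clock_khz, int bpp)
{
	uint64_t kbps = (uint64_t)clock_khz * (uint64_t)bpp;
	uint64_t numerator = kbps * 64u * 1006u;
	uint64_t denominator = 54ull * 8 * 1000 * 1000;

	/* Integer ceiling division. */
	return (int)((numerator + denominator - 1) / denominator);
}

/*
 * Hypothetical mode check: a negative avail_pbn stands for an error code
 * such as -EINVAL from the lookup, and is treated as "mode does not fit".
 */
static bool mode_fits_sketch(int clock_khz, int bpp, int avail_pbn)
{
	if (avail_pbn < 0)
		return false;

	return calc_pbn_sketch(clock_khz, bpp) <= avail_pbn;
}
```

Under these assumptions a 297000 kHz mode at 24 bpp needs 1063 PBN, so mode_fits_sketch(297000, 24, 1063) holds while an available budget of 1062 does not. The explicit negative check is a design choice of the sketch: in the patch, an error return from the lookup is rejected only implicitly, because any positive req_pbn compares greater than a negative avail_pbn.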
Re: [Intel-gfx] [PATCH 12/20] drm/doc: Update drm_framebuffer docs
On Wed, Aug 10, 2016 at 06:15:56PM +0300, Ville Syrjälä wrote: > On Tue, Aug 09, 2016 at 03:41:23PM +0200, Daniel Vetter wrote: > > - Move the intro section into a DOC comment, and update it slightly. > > - kernel-doc for struct drm_framebuffer! > > > > Signed-off-by: Daniel Vetter > > --- > > Documentation/gpu/drm-kms.rst | 26 +-- > > drivers/gpu/drm/drm_framebuffer.c | 35 +++ > > include/drm/drm_framebuffer.h | 94 > > +-- > > 3 files changed, 118 insertions(+), 37 deletions(-) > > > > diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst > > index 8264a88a8695..d244e03658cc 100644 > > --- a/Documentation/gpu/drm-kms.rst > > +++ b/Documentation/gpu/drm-kms.rst > > @@ -39,30 +39,8 @@ Atomic Mode Setting Function Reference > > Frame Buffer Abstraction > > > > > > -Frame buffers are abstract memory objects that provide a source of > > -pixels to scanout to a CRTC. Applications explicitly request the > > -creation of frame buffers through the DRM_IOCTL_MODE_ADDFB(2) ioctls > > -and receive an opaque handle that can be passed to the KMS CRTC control, > > -plane configuration and page flip functions. > > - > > -Frame buffers rely on the underneath memory manager for low-level memory > > -operations. When creating a frame buffer applications pass a memory > > -handle (or a list of memory handles for multi-planar formats) through > > -the ``drm_mode_fb_cmd2`` argument. For drivers using GEM as their > > -userspace buffer management interface this would be a GEM handle. > > -Drivers are however free to use their own backing storage object > > -handles, e.g. vmwgfx directly exposes special TTM handles to userspace > > -and so expects TTM handles in the create ioctl and not GEM handles. > > - > > -The lifetime of a drm framebuffer is controlled with a reference count, > > -drivers can grab additional references with > > -:c:func:`drm_framebuffer_reference()`and drop them again with > > -:c:func:`drm_framebuffer_unreference()`. 
For driver-private > > -framebuffers for which the last reference is never dropped (e.g. for the > > -fbdev framebuffer when the struct :c:type:`struct drm_framebuffer > > -` is embedded into the fbdev helper struct) > > -drivers can manually clean up a framebuffer at module unload time with > > -:c:func:`drm_framebuffer_unregister_private()`. > > +.. kernel-doc:: drivers/gpu/drm/drm_framebuffer.c > > + :doc: overview > > > > Frame Buffer Functions Reference > > > > diff --git a/drivers/gpu/drm/drm_framebuffer.c > > b/drivers/gpu/drm/drm_framebuffer.c > > index c7a8a623b336..f2f4928c7262 100644 > > --- a/drivers/gpu/drm/drm_framebuffer.c > > +++ b/drivers/gpu/drm/drm_framebuffer.c > > @@ -28,6 +28,41 @@ > > #include "drm_crtc_internal.h" > > > > /** > > + * DOC: overview > > + * > > + * Frame buffers are abstract memory objects that provide a source of > > pixels to > > + * scanout to a CRTC. Applications explicitly request the creation of frame > > + * buffers through the DRM_IOCTL_MODE_ADDFB(2) ioctls and receive an opaque > > + * handle that can be passed to the KMS CRTC control, plane configuration > > and > > + * page flip functions. > > + * > > + * Frame buffers rely on the underlying memory manager for allocating > > backing > > + * storage. When creating a frame buffer applications pass a memory handle > > + * (or a list of memory handles for multi-planar formats) through the > > + * struct &drm_mode_fb_cmd2 argument. For drivers using GEM as their > > userspace > > + * buffer management interface this would be a GEM handle. Drivers are > > however > > + * free to use their own backing storage object handles, e.g. vmwgfx > > directly > > + * exposes special TTM handles to userspace and so expects TTM handles in > > the > > + * create ioctl and not GEM handles. > > + * > > + * Framebuffers are tracked with struct &drm_framebuffer. 
They are > > published > > + * using drm_framebuffer_init() - after calling that function userspace > > can use > > + * and access the framebuffer object. The helper function > > + * drm_helper_mode_fill_fb_struct() can be used to pre-fill the required > > + * metadata fields. > > + * > > + * The lifetime of a drm framebuffer is controlled with a reference count, > > + * drivers can grab additional references with drm_framebuffer_reference() > > and > > + * drop them again with drm_framebuffer_unreference(). For driver-private > > + * framebuffers for which the last reference is never dropped (e.g. for the > > + * fbdev framebuffer when the struct struct &drm_framebuffer is embedded > > into > > + * the fbdev helper struct) drivers can manually clean up a framebuffer at > > + * module unload time with drm_framebuffer_unregister_private(). But doing > > this > > + * is not recommended, and it's better to have a normal free-standing > > struct > > + * &drm_framebuffer. > > + */ > > + >
Re: [Intel-gfx] [PATCH 11/20] drm: Extract drm_framebuffer.[hc]
On Wed, Aug 10, 2016 at 10:48:20AM -0400, Sean Paul wrote:
> On Tue, Aug 9, 2016 at 9:41 AM, Daniel Vetter wrote:
> >
> > -/**
> > - * drm_crtc_force_disable_all - Forcibly turn off all enabled CRTCs
> > - * @dev: DRM device whose CRTCs to turn off
> > - *
> > - * Drivers may want to call this on unload to ensure that all displays are
> > - * unlit and the GPU is in a consistent, low power state. Takes modeset locks.
> > - *
> > - * Returns:
> > - * Zero on success, error code on failure.
> > - */
> > -int drm_crtc_force_disable_all(struct drm_device *dev)
> > -{
> > -	struct drm_crtc *crtc;
> > -	int ret = 0;
> > -
> > -	drm_modeset_lock_all(dev);
> > -	drm_for_each_crtc(crtc, dev)
> > -		if (crtc->enabled) {
> > -			ret = drm_crtc_force_disable(crtc);
> > -			if (ret)
> > -				goto out;
> > -		}
> > -out:
> > -	drm_modeset_unlock_all(dev);
> > -	return ret;
> > -}
> > -EXPORT_SYMBOL(drm_crtc_force_disable_all);
> >
>
> I'm not so sure about moving this one. If it's going to be declared in
> drm_crtc.h, it should stay here (with force_disable). Alternatively,
> assuming no one else is using this (didn't check), move it to
> drm_framebuffer and make it a static helper function there (removing
> the declaration from drm_crtc.h).

This shouldn't be moved, accidentally overselected. Will fix.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [Intel-gfx] [PATCH 5/9] drm/i915/cmdparser: Improve hash function
On 12 August 2016 at 18:58, Chris Wilson wrote:
> On Fri, Aug 12, 2016 at 06:42:30PM +0100, Matthew Auld wrote:
>> > -#define STD_MI_OPCODE_MASK  0xFF800000
>> > -#define STD_3D_OPCODE_MASK  0xFFFF0000
>> > -#define STD_2D_OPCODE_MASK  0xFFC00000
>> > -#define STD_MFX_OPCODE_MASK 0xFFFF0000
>> > +#define STD_MI_OPCODE_SHIFT  (32 - 9)
>> > +#define STD_3D_OPCODE_SHIFT  (32 - 16)
>> > +#define STD_2D_OPCODE_SHIFT  (32 - 10)
>> > +#define STD_MFX_OPCODE_SHIFT (32 - 16)
>> Why don't we make use of this one in cmd_header_key? What client is it
>> supposed to map to?
>
> It doesn't map to its own CLIENT, it reuses the RC_CLIENT for its
> commands. (iirc)

hmm, okay.
Reviewed-by: Matthew Auld
Re: [Intel-gfx] [PATCH 1/2] Revert "drm/fb-helper: Reduce READ_ONCE(master) to lockless_dereference"
On Fri, Aug 12, 2016 at 08:25:43PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 11, 2016 at 11:26:47AM -0700, Paul E. McKenney wrote:
> > If my upcoming testing of the two changes together pans out, I will
> > give you a Tested-by -- I am guessing that you don't want to wait
> > until the next merge window for these changes.
>
> I was planning to stuff them in tip/locking/urgent, so they'd end up in
> this release.

They seem to work fine for me, so for both:

Tested-by: Paul E. McKenney

Thanx, Paul
Re: [Intel-gfx] [PATCH 1/2] Revert "drm/fb-helper: Reduce READ_ONCE(master) to lockless_dereference"
On Thu, Aug 11, 2016 at 11:26:47AM -0700, Paul E. McKenney wrote:
> If my upcoming testing of the two changes together pans out, I will
> give you a Tested-by -- I am guessing that you don't want to wait
> until the next merge window for these changes.

I was planning to stuff them in tip/locking/urgent, so they'd end up in
this release.
Re: [Intel-gfx] [PATCH 5/9] drm/i915/cmdparser: Improve hash function
On Fri, Aug 12, 2016 at 06:42:30PM +0100, Matthew Auld wrote:
> > -#define STD_MI_OPCODE_MASK  0xFF800000
> > -#define STD_3D_OPCODE_MASK  0xFFFF0000
> > -#define STD_2D_OPCODE_MASK  0xFFC00000
> > -#define STD_MFX_OPCODE_MASK 0xFFFF0000
> > +#define STD_MI_OPCODE_SHIFT  (32 - 9)
> > +#define STD_3D_OPCODE_SHIFT  (32 - 16)
> > +#define STD_2D_OPCODE_SHIFT  (32 - 10)
> > +#define STD_MFX_OPCODE_SHIFT (32 - 16)
> Why don't we make use of this one in cmd_header_key? What client is it
> supposed to map to?

It doesn't map to its own CLIENT, it reuses the RC_CLIENT for its
commands. (iirc)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
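
To make the exchange above concrete, here is a small sketch in plain C of how a per-client shift can turn a command header into a hash key. The three shift values come from the quoted patch; the position of the client field (bits 31:29) and the client numbers (MI = 0, BC/2D = 2, RC/3D = 3) are my assumptions about the i915 command encoding, and header_key_sketch() is a hypothetical name rather than the driver's cmd_header_key(). There is no separate MFX case because, per the reply above, MFX commands reuse the RC client, and the MFX shift equals the 3D shift anyway.

```c
#include <stdint.h>

/* Shift values as quoted in the patch under discussion. */
#define STD_MI_OPCODE_SHIFT  (32 - 9)
#define STD_3D_OPCODE_SHIFT  (32 - 16)
#define STD_2D_OPCODE_SHIFT  (32 - 10)

/* Assumed encoding, for illustration only: client lives in bits 31:29. */
#define CLIENT_SHIFT_SKETCH  29
#define CLIENT_MI_SKETCH     0x0u
#define CLIENT_BC_SKETCH     0x2u
#define CLIENT_RC_SKETCH     0x3u

/*
 * Hypothetical key function: pick the opcode width from the client field,
 * then keep only the opcode bits by shifting the rest of the header away.
 */
static uint32_t header_key_sketch(uint32_t header)
{
	unsigned int shift;

	switch (header >> CLIENT_SHIFT_SKETCH) {
	case CLIENT_RC_SKETCH:
		shift = STD_3D_OPCODE_SHIFT;
		break;
	case CLIENT_BC_SKETCH:
		shift = STD_2D_OPCODE_SHIFT;
		break;
	case CLIENT_MI_SKETCH:
	default:
		shift = STD_MI_OPCODE_SHIFT;
		break;
	}

	return header >> shift;
}
```

Compared with masking, shifting leaves the opcode in the low bits of the key, so headers that differ only in their length or flag bits collapse to the same small integer. Whether that is the exact motivation of the patch is not stated in this thread.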
[Intel-gfx] [PATCH] drm/i915: Embrace the race in busy-ioctl
Daniel Vetter proposed a new challenge to the serialisation inside the
busy-ioctl that exposed a flaw that could result in us reporting the
wrong engine as being busy. If the request is reallocated as we test
its busyness and then reassigned to this object by another thread, we
would not notice that the test itself was incorrect.

We are faced with a choice of using __i915_gem_active_get_request_rcu()
to first acquire a reference to the request preventing the race, or to
acknowledge the race and accept the limitations upon the accuracy of
the busy flags. Note that we guarantee that we never falsely report the
object as idle (providing userspace itself doesn't race), and so the
most important use of the busy-ioctl and its guarantees are fulfilled.

Signed-off-by: Chris Wilson
Cc: Daniel Vetter
Cc: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_gem.c | 87 ++---
 include/uapi/drm/i915_drm.h     | 15 ++-
 2 files changed, 60 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5566916870eb..c77915378768 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3791,49 +3791,54 @@ static __always_inline unsigned int
 __busy_set_if_active(const struct i915_gem_active *active,
 		     unsigned int (*flag)(unsigned int id))
 {
-	/* For more discussion about the barriers and locking concerns,
-	 * see __i915_gem_active_get_rcu().
-	 */
-	do {
-		struct drm_i915_gem_request *request;
-		unsigned int id;
-
-		request = rcu_dereference(active->request);
-		if (!request || i915_gem_request_completed(request))
-			return 0;
+	struct drm_i915_gem_request *request;
 
-		id = request->engine->exec_id;
+	request = rcu_dereference(active->request);
+	if (!request || i915_gem_request_completed(request))
+		return 0;
 
-		/* Check that the pointer wasn't reassigned and overwritten.
-		 *
-		 * In __i915_gem_active_get_rcu(), we enforce ordering between
-		 * the first rcu pointer dereference (imposing a
-		 * read-dependency only on access through the pointer) and
-		 * the second lockless access through the memory barrier
-		 * following a successful atomic_inc_not_zero(). Here there
-		 * is no such barrier, and so we must manually insert an
-		 * explicit read barrier to ensure that the following
-		 * access occurs after all the loads through the first
-		 * pointer.
-		 *
-		 * It is worth comparing this sequence with
-		 * raw_write_seqcount_latch() which operates very similarly.
-		 * The challenge here is the visibility of the other CPU
-		 * writes to the reallocated request vs the local CPU ordering.
-		 * Before the other CPU can overwrite the request, it will
-		 * have updated our active->request and gone through a wmb.
-		 * During the read here, we want to make sure that the values
-		 * we see have not been overwritten as we do so - and we do
-		 * that by serialising the second pointer check with the writes
-		 * on other other CPUs.
-		 *
-		 * The corresponding write barrier is part of
-		 * rcu_assign_pointer().
-		 */
-		smp_rmb();
-		if (request == rcu_access_pointer(active->request))
-			return flag(id);
-	} while (1);
+	/* This is racy. See __i915_gem_active_get_rcu() for a in detail
+	 * discussion of how to handle the race correctly, but for reporting
+	 * the busy state we err on the side of potentially reporting the
+	 * wrong engine as being busy (but we guarantee that the result
+	 * is at least self-consistent).
+	 *
+	 * As we use SLAB_DESTROY_BY_RCU, the request may be reallocated
+	 * whilst we are inspecting it, even under the RCU read lock as we are.
+	 * This means that there is a small window for the engine and/or the
+	 * seqno to have been overwritten. The seqno will always be in the
+	 * future compared to the intended, and so we know that if that
+	 * seqno is idle (on whatever engine) our request is idle and the
+	 * return 0 above is correct.
+	 *
+	 * The issue is that if the engine is switched, it is just as likely
+	 * to report that it is busy (but since the switch happened, we know
+	 * the request should be idle). So there is a small chance that a busy
+	 * result is actually the wrong engine.
+	 *
+	 * So why don't we care?
+	 *
+	 * For starters, the busy ioctl is a heuristic that
Re: [Intel-gfx] [PATCH 7/9] drm/i915/cmdparser: Check for SKIP descriptors first
Reviewed-by: Matthew Auld
Re: [Intel-gfx] [PATCH v2] drm/i915: intel_dp_link_is_valid() should only return status of link
On Thu, Aug 11, 2016 at 08:18:54PM -0700, Pandiyan, Dhinakaran wrote:
> On Thu, 2016-08-11 at 15:23 -0700, Manasi Navare wrote:
> > Intel_dp_link_is_valid() function reads the Link status registers
> > and returns a boolean to indicate link is valid or not.
> > If the link has lost lock and is not valid any more, link
> > training is performed outside the function else previously trained link
> > is retained.
> > This gives us flexibility of checking whether link is valid and training
> > it independently.
> >
> > v2:
> > * Changed the function name from intel_dp_check_link_status()
> > to intel_dp_link_is_valid() (Lukas Wunner)
> > * Checks for CRTC and active CRTC are moved outside the
> > intel_dp_link_is_valid() function (Rodrigo Vivi)
> >
> > Signed-off-by: Manasi Navare
> > ---
> >  drivers/gpu/drm/i915/intel_dp.c | 56 +++--
> >  1 file changed, 37 insertions(+), 19 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> > index 364db90..891147d 100644
> > --- a/drivers/gpu/drm/i915/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/intel_dp.c
> > @@ -3881,36 +3881,33 @@ go_again:
> >  	return -EINVAL;
> >  }
> >
> > -static void
> > -intel_dp_check_link_status(struct intel_dp *intel_dp)
> > +static bool
> > +intel_dp_link_is_valid(struct intel_dp *intel_dp)
> >  {
> > -	struct intel_encoder *intel_encoder = &dp_to_dig_port(intel_dp)->base;
> >  	struct drm_device *dev = intel_dp_to_dev(intel_dp);
> >  	u8 link_status[DP_LINK_STATUS_SIZE];
> >
> >  	WARN_ON(!drm_modeset_is_locked(&dev->mode_config.connection_mutex));
> >
> >  	if (!intel_dp_get_link_status(intel_dp, link_status)) {
> > -		DRM_ERROR("Failed to get link status\n");
> > -		return;
> > +		DRM_DEBUG_KMS("Failed to get link status\n");
> > +		return false;
> >  	}
> >
> > -	if (!intel_encoder->base.crtc)
> > -		return;
> > +	/* Check if the link is valid by reading the bits of Link status
> > +	 * registers
> > +	 */
> > +	if (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count)) {
> > +		DRM_DEBUG_KMS("Channel EQ or CR not ok, need to retrain\n");
> drm_dp_channel_eq_ok() does not check for CR. Should we just say
> "Channel EQ not ok" to preempt ambiguity while debugging ?

Actually this macro checks for DP_CHANNEL_EQ_BITS which is defined as:
#define DP_CHANNEL_EQ_BITS (DP_LANE_CR_DONE |		\
			    DP_LANE_CHANNEL_EQ_DONE |	\
			    DP_LANE_SYMBOL_LOCKED)
So it includes checking for Channel EQ and Clock Recovery CR bits

>
> > +		return false;
> > +	}
> >
> > -	if (!to_intel_crtc(intel_encoder->base.crtc)->active)
> > -		return;
> > +	DRM_DEBUG_KMS("Link is good, no need to retrain\n");
> The caller does not expect us to link train anymore, I don't think we
> have to explicitly state "no need to retrain". Also, do we need debug
> messages if the link is good?

I agree, maybe this is not needed. I will remove this

>
> > +	return true;
> >
> > -	/* if link training is requested we should perform it always */
> > -	if ((intel_dp->compliance_test_type == DP_TEST_LINK_TRAINING) ||
> > -	    (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count))) {
> > -		DRM_DEBUG_KMS("%s: channel EQ not ok, retraining\n",
> > -			      intel_encoder->base.name);
> > -		intel_dp_start_link_train(intel_dp);
> > -		intel_dp_stop_link_train(intel_dp);
> > -	}
> >  }
> >
> > +
> >  /*
> >   * According to DP spec
> >   * 5.1.2:
> > @@ -3928,6 +3925,8 @@ static bool
> >  intel_dp_short_pulse(struct intel_dp *intel_dp)
> >  {
> >  	struct drm_device *dev = intel_dp_to_dev(intel_dp);
> > +	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
> > +	struct intel_encoder *intel_encoder = &intel_dig_port->base;
> >  	u8 sink_irq_vector = 0;
> >  	u8 old_sink_count = intel_dp->sink_count;
> >  	bool ret;
> > @@ -3968,8 +3967,18 @@ intel_dp_short_pulse(struct intel_dp *intel_dp)
> >  		DRM_DEBUG_DRIVER("CP or sink specific irq unhandled\n");
> >  	}
> >
> > +	/* Do not train the link if there is no crtc */
> > +	if (!intel_encoder->base.crtc)
> > +		return true;
> > +	if (!to_intel_crtc(intel_encoder->base.crtc)->active)
> > +		return true;
> > +
> I might be completely off base here. Shouldn't we keep the link valid
> irrespective of whether there is an active crtc? I thought that is what
> the refactoring is supposed to enable. Does intel_dp_short_pulse() get
> called when there is a link loss during upfront link training? And in
> that case, shouldn't we retrain even without a crtc?

We cannot ever retrain without a CRTC. This check is more for making sure
that the clocks are set up before we try to retrain else we will see
AUX channel failures. If I track this back in the kerne
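
As a self-contained illustration of the point made in the reply above (a single DP_CHANNEL_EQ_BITS test covers clock recovery, channel equalization and symbol lock together), here is a sketch in plain C. The three bit values and the packing of two lanes per status byte follow my understanding of the DPCD lane status layout; lane_status_sketch() and channel_eq_ok_sketch() are hypothetical names. The real drm_dp_channel_eq_ok() also checks inter-lane alignment, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-lane status bits; DP_CHANNEL_EQ_BITS is defined as quoted in the thread. */
#define DP_LANE_CR_DONE          (1 << 0)
#define DP_LANE_CHANNEL_EQ_DONE  (1 << 1)
#define DP_LANE_SYMBOL_LOCKED    (1 << 2)
#define DP_CHANNEL_EQ_BITS (DP_LANE_CR_DONE |		\
			    DP_LANE_CHANNEL_EQ_DONE |	\
			    DP_LANE_SYMBOL_LOCKED)

/*
 * Assumed layout: byte 0 holds lanes 0 and 1, byte 1 holds lanes 2 and 3,
 * with the even lane in the low nibble and the odd lane in the high nibble.
 */
static uint8_t lane_status_sketch(const uint8_t *lane_bytes, int lane)
{
	uint8_t byte = lane_bytes[lane >> 1];
	int shift = (lane & 1) * 4;

	return (byte >> shift) & 0xf;
}

/* True only if every active lane reports CR, EQ and symbol lock together. */
static bool channel_eq_ok_sketch(const uint8_t *lane_bytes, int lane_count)
{
	for (int lane = 0; lane < lane_count; lane++) {
		uint8_t status = lane_status_sketch(lane_bytes, lane);

		if ((status & DP_CHANNEL_EQ_BITS) != DP_CHANNEL_EQ_BITS)
			return false;
	}

	return true;
}
```

With this layout, a lane nibble of 0x6 (EQ done and symbol locked, but clock recovery lost) fails the check just as a missing EQ bit would, which is why a debug message mentioning both EQ and CR is not misleading.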
Re: [Intel-gfx] [PATCH 5/9] drm/i915/cmdparser: Improve hash function
> -#define STD_MI_OPCODE_MASK  0xFF800000
> -#define STD_3D_OPCODE_MASK  0xFFFF0000
> -#define STD_2D_OPCODE_MASK  0xFFC00000
> -#define STD_MFX_OPCODE_MASK 0xFFFF0000
> +#define STD_MI_OPCODE_SHIFT  (32 - 9)
> +#define STD_3D_OPCODE_SHIFT  (32 - 16)
> +#define STD_2D_OPCODE_SHIFT  (32 - 10)
> +#define STD_MFX_OPCODE_SHIFT (32 - 16)
Why don't we make use of this one in cmd_header_key? What client is it
supposed to map to?
[Intel-gfx] [PULL] topic/drm-misc
Hi Dave,

- more fence destaging and cleanup (Gustavo&Sumit)
- DRIVER_LEGACY to untangle from DRIVER_MODESET
- drm_mm refactor (Chris)
- fbdev-less compile fixes
- clipped plane src/dst rects (Ville)
- + a few mediatek patches that build on top of that (Bibby+Daniel)
- small stuff all over really

Cheers, Daniel

The following changes since commit 29b4817d4018df78086157ea3a55c1d9424a7cfc:

  Linux 4.8-rc1 (2016-08-07 18:18:00 -0700)

are available in the git repository at:

  git://anongit.freedesktop.org/drm-intel tags/topic/drm-misc-2016-08-12

for you to fetch changes up to 3590d50e2313644cd192ff55e83df76dea232319:

  dma-buf/fence: kerneldoc: remove spurious section header (2016-08-12 20:32:14 +0530)

Bibby Hsieh (2):
      drm/mediatek: Use drm_atomic destroy_state helpers
      drm/mediatek: Fix mtk_atomic_complete for runtime_pm

Chris Wilson (4):
      drm: Track drm_mm nodes with an interval tree
      drm: Convert drm_vma_manager to embedded interval-tree in drm_mm
      drm: Skip initialising the drm_mm_node->hole_stack
      drm: Declare that create drm_mm nodes with size 0 is illegal

Daniel Kurtz (5):
      drm/mediatek: Remove mtk_drm_crtc_check_flush
      drm/mediatek: plane: Remove plane zpos/index
      drm/mediatek: Remove mtk_drm_plane
      drm/mediatek: plane: Merge mtk_plane_enable into mtk_plane_atomic_update
      drm/mediatek: plane: Use FB's format's cpp to compute x offset

Daniel Vetter (8):
      drm: Mark up legacy/dri1 drivers with DRM_LEGACY
      drm: Used DRM_LEGACY for all legacy functions
      drm: Make sure drm_vblank_no_hw_counter isn't abused
      drm/fb-helper: Add a dummy remove_conflicting_framebuffers
      drm: Remove superflous linux/fb.h includes
      drm/vmwgfx: select CONFIG_FB
      drm/radeon|amgpu: Make fbdev emulation optional
      drm: Protect fb_defio in drivers with CONFIG_KMS_FBDEV_EMULATION

David Herrmann (1):
      drm: rename DRM_MINOR_LEGACY to DRM_MINOR_PRIMARY

Gustavo Padovan (5):
      dma-buf/fence-array: add fence_is_array()
      dma-buf/sync_file: refactor fence storage in struct sync_file
      dma-buf/sync_file: add sync_file_get_fence()
      Documentation: add doc for sync_file_get_fence()
      dma-buf/sync_file: only enable fence signalling on poll()

Joonas Lahtinen (1):
      drm: BIT(DRM_ROTATE_?) -> DRM_ROTATE_?

Keith Packard (1):
      drm: Don't prepare or cleanup unchanging frame buffers [v3]

Lyude (3):
      drm: Add ratelimited versions of the DRM_DEBUG* macros
      drm/dp_helper: Print first error received on failure in drm_dp_dpcd_access()
      drm/dp_helper: Rate limit timeout errors from drm_dp_i2c_do_msg()

Peter Chen (1):
      Revert "gpu: drm: omapdrm: dss-of: add missing of_node_put after calling of_parse_phandle"

Rodrigo Vivi (1):
      drm: Avoid printing negative values for unsigned variables.

Sumit Semwal (2):
      dma-buf/fence: kerneldoc: remove unused struct members
      dma-buf/fence: kerneldoc: remove spurious section header

Ville Syrjälä (9):
      drm: Warn about negative sizes when calculating scale factor
      drm: Store clipped src/dst coordinatee in drm_plane_state
      drm/plane-helper: Add drm_plane_helper_check_state()
      drm/i915: Use drm_plane_state.{src,dst,visible}
      drm/i915: Use drm_plane_helper_check_state()
      drm/rockchip: Use drm_plane_state.{src, dst}
      drm/rockchip: Use drm_plane_helper_check_state()
      drm/mediatek: Use drm_plane_helper_check_state()
      drm/simple_kms_helper: Use drm_plane_helper_check_state()

 Documentation/gpu/drm-internals.rst                | 9 +-
 Documentation/sync_file.txt                        | 14 ++
 drivers/dma-buf/fence-array.c                      | 1 +
 drivers/dma-buf/sync_file.c                        | 204 ++---
 drivers/gpu/drm/Kconfig                            | 8 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c            | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c             | 1 -
 drivers/gpu/drm/amd/powerplay/hwmgr/fiji_hwmgr.c   | 1 -
 .../gpu/drm/amd/powerplay/hwmgr/polaris10_hwmgr.c  | 1 -
 drivers/gpu/drm/amd/powerplay/hwmgr/ppatomctrl.c   | 1 -
 drivers/gpu/drm/amd/powerplay/hwmgr/tonga_hwmgr.c  | 1 -
 .../amd/powerplay/hwmgr/tonga_processpptables.c    | 1 -
 drivers/gpu/drm/arm/malidp_drv.h                   | 2 +-
 drivers/gpu/drm/arm/malidp_planes.c                | 20 +-
 drivers/gpu/drm/armada/armada_fbdev.c              | 1 -
 drivers/gpu/drm/armada/armada_overlay.c            | 2 +-
 drivers/gpu/drm/ast/ast_fb.c                       | 1 -
 drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c    | 22 +--
 drivers/gpu/drm/bochs/bochs.h                      | 1 -
 drivers/gpu/drm/bochs/bochs_drv.c                  | 3 +-
 drivers/gpu/drm/bridge/parade-ps8622.c             | 1 -
 drivers/gpu/drm/cirrus/cirrus_drv.c                | 2 +-
 drivers/gpu/drm/cirrus/cirrus_fbdev.c
[Intel-gfx] [PULL] drm-intel-next
Hi Dave,

drm-intel-next-2016-08-08:
- refactor ddi buffer programming a bit (Ville)
- large-scale renaming to untangle naming in the gem code (Chris)
- rework vma/active tracking for accurately reaping idle mappings of shared
  objects (Chris)
- misc dp sst/mst probing corner case fixes (Ville)
- tons of cleanup&tunings all around in gem
- lockless (rcu-protected) request lookup, plus use it everywhere for
  non(b)locking waits (Chris)
- pipe crc debugfs fixes (Rodrigo)
- random fixes all over

drm-intel-next-2016-07-25:
- more engine code unification (Tvrtko)
- reorganize rps&rc6 setup (Chris Wilson)
- hotplug polling when in deep rpm states, especially fixes vls (Lyude)
- mocs fix for bxt (Imre)
- convert i915 request to use dma fences (Chris)
- prep work for lockless i915 requests/fences (needed for full sync
  integration) from Chris Wilson
- wait for external rendering/fences attached to dma_bufs (Chris)
- tons of small bugfixes all over

Note also contains a backmerge (git got confused), but when you've pulled
in all pending pulls (there's a few now) I want to do another backmerge to
get at the latest fences stuff from Gustavo.

Cheers, Daniel The following changes since commit 1cf915d305b6e1d57db6c35c208016f9747ba3c6: Merge tag 'imx-drm-fixes-2016-07-27' of git://git.pengutronix.de/git/pza/linux into drm-next (2016-07-30 05:45:30 +1000) are available in the git repository at: git://anongit.freedesktop.org/drm-intel tags/drm-intel-next-2016-08-08 for you to fetch changes up to c5b7e97b27db4f8a8ffe1072506620679043f006: drm/i915: Update DRIVER_DATE to 20160808 (2016-08-08 09:37:31 +0200) - refactor ddi buffer programming a bit (Ville) - large-scale renaming to untangle naming in the gem code (Chris) - rework vma/active tracking for accurately reaping idle mappings of shared objects (Chris) - misc dp sst/mst probing corner case fixes (Ville) - tons of cleanup&tunings all around in gem - lockless (rcu-protected) request lookup, plus use it everywhere for non(b)locking waits (Chris) - pipe crc debugfs fixes (Rodrigo) - random fixes all over Akash Goel (1): drm/i915/gen9: Update i915_drpc_info debugfs for coarse pg & forcewake info Bob Paauwe (1): drm/i915: Set legacy properties when using legacy gamma set IOCTL. 
(v2) Chris Wilson (152): drm/i915/breadcrumbs: Queue hangcheck before sleeping drm/i915: Kick hangcheck from retire worker drm/i915: Remove temporary RPM wakeref assert disables drm/i915: Update ifdeffery for mutex->owner drm/i915: Provide argument names for static stubs drm/i915: Flush GT idle status upon reset drm/i915: Preserve current RPS frequency across init drm/i915: Perform static RPS frequency setup before userspace drm/i915: Move overclocking detection to alongside RPS frequency detection drm/i915: Define a separate variable and control for RPS waitboost frequency drm/i915: Remove superfluous powersave work flushing drm/i915: Defer enabling rc6 til after we submit the first batch/context drm/i915: Hide gen6_update_ring_freq() drm/i915/fbdev: Drain the suspend worker on retiring drm/i915/fbdev: Check for the framebuffer before use drm/i915/evict: Always switch away from the current context drm/i915: Flush logical context image out to memory upon suspend drm/i915: Handle ENOSPC after failing to insert a mappable node drm/i915: Move GEM request routines to i915_gem_request.c drm/i915: Retire oldest completed request before allocating next drm/i915: Mark all current requests as complete before resetting them drm/i915: Derive GEM requests from dma-fence drm/i915: Disable waitboosting for fence_wait() drm/i915: Disable waitboosting for mmioflips/semaphores drm/i915: Mark imported dma-buf objects as being coherent drm/i915: Wait on external rendering for GEM objects drm/i915: Rename request reference/unreference to get/put drm/i915: Rename i915_gem_context_reference/unreference() drm/i915: Wrap drm_gem_object_lookup in i915_gem_object_lookup drm/i915: Wrap drm_gem_object_reference in i915_gem_object_get drm/i915: Rename drm_gem_object_unreference in preparation for lockless free drm/i915: Rename drm_gem_object_unreference_unlocked in preparation for lockless free drm/i915: Treat ringbuffer writes as write to normal memory drm/i915: Rename ring->virtual_start as 
ring->vaddr drm/i915: Convert i915_semaphores_is_enabled over to early sanitize drm/i915: Enable RC6 immediately Revert "drm/i915: Enable RC6 immediately" drm/i915: Drop racy markup of missed-irqs from idle-worker drm/i915: Update the breadcrumb interrupt counter before enabling drm/i915: Reduce breadcrumb lock coverage for intel_engine
Re: [Intel-gfx] [PATCH 15/20] drm/i915: Debugfs support for GuC logging control
On 8/12/2016 9:27 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Sagar Arun Kamble This patch provides debugfs interface i915_guc_output_control for on the fly enabling/disabling of logging in GuC firmware and controlling the verbosity level of logs. The value written to the file, should have bit 0 set to enable logging and bits 4-7 should contain the verbosity info. v2: Add a forceful flush, to collect left over logs, on disabling logging. Useful for Validation. v3: Besides minor cleanup, implement read method for the debugfs file and set the guc_log_level to -1 when logging is disabled. (Tvrtko) Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_debugfs.c| 44 - drivers/gpu/drm/i915/i915_guc_submission.c | 63 ++ drivers/gpu/drm/i915/intel_guc.h | 1 + 3 files changed, 107 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 14e0dcf..f472fbcd3 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2674,6 +2674,47 @@ static int i915_guc_log_dump(struct seq_file *m, void *data) return 0; } +static int i915_guc_log_control_get(void *data, u64 *val) +{ +struct drm_device *dev = data; +struct drm_i915_private *dev_priv = to_i915(dev); + +if (!dev_priv->guc.log.obj) +return -EINVAL; + +*val = i915.guc_log_level; + +return 0; +} + +static int i915_guc_log_control_set(void *data, u64 val) +{ +struct drm_device *dev = data; +struct drm_i915_private *dev_priv = to_i915(dev); +int ret; + +ret = mutex_lock_interruptible(&dev->struct_mutex); +if (ret) +return ret; + +if (!dev_priv->guc.log.obj) { +ret = -EINVAL; +goto end; +} + +intel_runtime_pm_get(dev_priv); +ret = i915_guc_log_control(dev_priv, val); +intel_runtime_pm_put(dev_priv); + +end: +mutex_unlock(&dev->struct_mutex); +return ret; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops, +i915_guc_log_control_get, i915_guc_log_control_set, 
+"%lld\n"); + static int i915_edp_psr_status(struct seq_file *m, void *data) { struct drm_info_node *node = m->private; @@ -5477,7 +5518,8 @@ static const struct i915_debugfs_files { {"i915_fbc_false_color", &i915_fbc_fc_fops}, {"i915_dp_test_data", &i915_displayport_test_data_fops}, {"i915_dp_test_type", &i915_displayport_test_type_fops}, -{"i915_dp_test_active", &i915_displayport_test_active_fops} +{"i915_dp_test_active", &i915_displayport_test_active_fops}, +{"i915_guc_log_control", &i915_guc_log_control_fops} }; void intel_display_crc_init(struct drm_device *dev) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 4a75c16..041cf68 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -195,6 +195,16 @@ static int host2guc_force_logbuffer_flush(struct intel_guc *guc) return host2guc_action(guc, data, 2); } +static int host2guc_logging_control(struct intel_guc *guc, u32 control_val) +{ +u32 data[2]; + +data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING; +data[1] = control_val; + +return host2guc_action(guc, data, 2); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -1538,3 +1548,56 @@ void i915_guc_register(struct drm_i915_private *dev_priv) guc_log_late_setup(&dev_priv->guc); mutex_unlock(&dev_priv->drm.struct_mutex); } + +int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val) +{ +union guc_log_control log_param; +int ret; + +log_param.logging_enabled = control_val & 0x1; +log_param.verbosity = (control_val >> 4) & 0xF; Maybe "log_param.value = control_val" would also work since guc_log_control is conveniently defined as an union. Doesn't matter though. 
+ +if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN || +log_param.verbosity > GUC_LOG_VERBOSITY_MAX) +return -EINVAL; + +/* This combination doesn't make sense & won't have any effect */ +if (!log_param.logging_enabled && (i915.guc_log_level < 0)) +return 0; I wonder if it would work and maybe look nicer to generalize as: int guc_log_level; guc_log_level = log_param.logging_enabled ? log_param.verbosity : -1; if (i915.guc_log_level == guc_log_level) return 0; Fine, will try to refactor the code as per your suggestions. Thanks for the suggestions. + +ret = host2guc_logging_control(&dev_priv->guc, log_param.value); +if (ret < 0) { +DRM_DEBUG_DRIVER("host2guc action failed %d\n", ret); +return ret; +} + +i915.guc_log_level = log_param.verbosity; This would then become i915.guc_log_level = guc_log_level. + +/*
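As a rough illustration of the control value layout described in the commit message (bit 0 enables logging, bits 4-7 carry the verbosity) and of the generalization Tvrtko suggests, here is a minimal userspace sketch. It is not the driver code: the names `example_decode_log_level`, `example_needs_update` and the `EXAMPLE_VERBOSITY_*` bounds are hypothetical, and the bounds are assumed values because the patch only names `GUC_LOG_VERBOSITY_MIN`/`GUC_LOG_VERBOSITY_MAX` without showing them.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bounds for illustration only; the real GUC_LOG_VERBOSITY_MIN/MAX
 * values are not shown in the quoted patch. */
#define EXAMPLE_VERBOSITY_MIN 0
#define EXAMPLE_VERBOSITY_MAX 3

/* Decode a value written to the control file.
 * Bit 0 is the enable flag, bits 4-7 hold the verbosity.
 * On success stores the resulting log level in *level (-1 means logging
 * disabled) and returns 0; returns -EINVAL for an out-of-range verbosity. */
int example_decode_log_level(uint64_t control_val, int *level)
{
	int enabled = (int)(control_val & 0x1);
	int verbosity = (int)((control_val >> 4) & 0xF);

	assert(level != NULL);

	if (verbosity < EXAMPLE_VERBOSITY_MIN ||
	    verbosity > EXAMPLE_VERBOSITY_MAX)
		return -EINVAL;

	*level = enabled ? verbosity : -1;
	return 0;
}

/* Single comparison that covers "already disabled" as well as
 * "already at this verbosity": no action is needed when nothing changes. */
int example_needs_update(int current_level, int new_level)
{
	return current_level != new_level;
}
```

With this shape, a caller would decode first, return early when `example_needs_update()` reports no change, and only then send the host-to-GuC action and store the new level.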
Re: [Intel-gfx] [PATCH 16/20] drm/i915: Support to create write combined type vmaps
On 8/12/2016 8:46 PM, Chris Wilson wrote: On Fri, Aug 12, 2016 at 08:43:58PM +0530, Goel, Akash wrote: On 8/12/2016 4:19 PM, Tvrtko Ursulin wrote: Unrelated and unmentioned change to no guard page. Best to remove IMHO. Can keep the RB in that case. Though it's not called out, sorry for that, but isn't it better to avoid using the guard page, which will save 4KB of vmalloc virtual space (which is scarce) for every mapping created by Driver. Would updating the commit message to mention this be fine? Too late, already applied without the new flag. ohh, the patch is already queued for merge ? Yes, that's why I dropped the guard page when I found out it was being added. Send a patch to add the flag and we can discuss whether we think our code is adequate to not require the protection. Fine, will prepare a separate patch to avoid using the guard page. Best regards Akash -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs
On 8/12/2016 9:52 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel As per the current i915 Driver load sequence, debugfs registration is done at the end and so the relay channel debugfs file is also created after that but the GuC firmware is loaded much earlier in the sequence. As a result Driver could miss capturing the boot-time logs of GuC firmware if there are flush interrupts from the GuC side. Relay has a provision to support early logging where initially only relay channel can be created, to have buffers for storing logs, and later on channel can be associated with a debugfs file at appropriate time. Have availed that, which allows Driver to capture boot time logs also, which can be collected once Userspace comes up. Suggested-by: Chris Wilson Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 61 +- 1 file changed, 44 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index af48f62..1c287d7 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct intel_guc *guc) relay_close(guc->log.relay_chan); } -static int guc_create_log_relay_file(struct intel_guc *guc) +static int guc_create_relay_channel(struct intel_guc *guc) { struct drm_i915_private *dev_priv = guc_to_i915(guc); struct rchan *guc_log_relay_chan; -struct dentry *log_dir; size_t n_subbufs, subbuf_size; -/* For now create the log file in /sys/kernel/debug/dri/0 dir */ -log_dir = dev_priv->drm.primary->debugfs_root; - -/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is - * not mounted and so can't create the relay file. - * The relay API seems to fit well with debugfs only. It only needs a dentry, I don't see that it has to be a debugfs one. 
Besides dentry, there are other requirements for using relay, which can be met only for a debugfs file. debugfs wasn't the preferred choice to place the log file, but had no other option, as relay API is compatible with debugfs only. Also retrieving dentry of a file is not so straight forward, as it might seem (spent considerable time on this initially). - */ -if (!log_dir) { -DRM_DEBUG_DRIVER("Parent debugfs directory not available yet\n"); -return -ENODEV; -} - /* Keep the size of sub buffers same as shared log buffer */ subbuf_size = guc->log.obj->base.size; /* Store up to 8 snaphosts, which is large enough to buffer sufficient @@ -1127,7 +1114,7 @@ static int guc_create_log_relay_file(struct intel_guc *guc) */ n_subbufs = 8; -guc_log_relay_chan = relay_open("guc_log", log_dir, +guc_log_relay_chan = relay_open(NULL, NULL, subbuf_size, n_subbufs, &relay_callbacks, dev_priv); if (!guc_log_relay_chan) { @@ -1140,6 +1127,33 @@ static int guc_create_log_relay_file(struct intel_guc *guc) return 0; } +static int guc_create_log_relay_file(struct intel_guc *guc) +{ +struct drm_i915_private *dev_priv = guc_to_i915(guc); +struct dentry *log_dir; +int ret; + +/* For now create the log file in /sys/kernel/debug/dri/0 dir */ +log_dir = dev_priv->drm.primary->debugfs_root; + +/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is + * not mounted and so can't create the relay file. + * The relay API seems to fit well with debugfs only. 
+ */ +if (!log_dir) { +DRM_DEBUG_DRIVER("Parent debugfs directory not available yet\n"); +return -ENODEV; +} + +ret = relay_late_setup_files(guc->log.relay_chan, "guc_log", log_dir); +if (ret) { +DRM_DEBUG_DRIVER("Couldn't associate the channel with file %d\n", ret); +return ret; +} + +return 0; +} + static void guc_log_cleanup(struct intel_guc *guc) { struct drm_i915_private *dev_priv = guc_to_i915(guc); @@ -1167,7 +1181,7 @@ static int guc_create_log_extras(struct intel_guc *guc) { struct drm_i915_private *dev_priv = guc_to_i915(guc); void *vaddr; -int ret; +int ret = 0; lockdep_assert_held(&dev_priv->drm.struct_mutex); @@ -1190,7 +1204,15 @@ static int guc_create_log_extras(struct intel_guc *guc) guc->log.buf_addr = vaddr; } -return 0; +if (!guc->log.relay_chan) { +/* Create a relay channel, so that we have buffers for storing + * the GuC firmware logs, the channel will be linked with a file + * later on when debugfs is registered. + */ +ret = guc_create_relay_channel(guc); +} + +return ret; } static void guc_create_log(struct intel_guc *guc) @@ -1231,6 +1253,7 @@ static void guc_create_log(struct intel_guc *guc) guc->log.obj = obj; if (guc_create_log_extras(guc
Re: [Intel-gfx] [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs
On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel As per the current i915 Driver load sequence, debugfs registration is done at the end and so the relay channel debugfs file is also created after that but the GuC firmware is loaded much earlier in the sequence. As a result Driver could miss capturing the boot-time logs of GuC firmware if there are flush interrupts from the GuC side. Relay has a provision to support early logging where initially only relay channel can be created, to have buffers for storing logs, and later on channel can be associated with a debugfs file at appropriate time. Have availed that, which allows Driver to capture boot time logs also, which can be collected once Userspace comes up. Suggested-by: Chris Wilson Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 61 +- 1 file changed, 44 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index af48f62..1c287d7 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct intel_guc *guc) relay_close(guc->log.relay_chan); } -static int guc_create_log_relay_file(struct intel_guc *guc) +static int guc_create_relay_channel(struct intel_guc *guc) { struct drm_i915_private *dev_priv = guc_to_i915(guc); struct rchan *guc_log_relay_chan; - struct dentry *log_dir; size_t n_subbufs, subbuf_size; - /* For now create the log file in /sys/kernel/debug/dri/0 dir */ - log_dir = dev_priv->drm.primary->debugfs_root; - - /* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is -* not mounted and so can't create the relay file. -* The relay API seems to fit well with debugfs only. It only needs a dentry, I don't see that it has to be a debugfs one. 
-*/ - if (!log_dir) { - DRM_DEBUG_DRIVER("Parent debugfs directory not available yet\n"); - return -ENODEV; - } - /* Keep the size of sub buffers same as shared log buffer */ subbuf_size = guc->log.obj->base.size; /* Store up to 8 snaphosts, which is large enough to buffer sufficient @@ -1127,7 +1114,7 @@ static int guc_create_log_relay_file(struct intel_guc *guc) */ n_subbufs = 8; - guc_log_relay_chan = relay_open("guc_log", log_dir, + guc_log_relay_chan = relay_open(NULL, NULL, subbuf_size, n_subbufs, &relay_callbacks, dev_priv); if (!guc_log_relay_chan) { @@ -1140,6 +1127,33 @@ static int guc_create_log_relay_file(struct intel_guc *guc) return 0; } +static int guc_create_log_relay_file(struct intel_guc *guc) +{ + struct drm_i915_private *dev_priv = guc_to_i915(guc); + struct dentry *log_dir; + int ret; + + /* For now create the log file in /sys/kernel/debug/dri/0 dir */ + log_dir = dev_priv->drm.primary->debugfs_root; + + /* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is +* not mounted and so can't create the relay file. +* The relay API seems to fit well with debugfs only. 
+*/ + if (!log_dir) { + DRM_DEBUG_DRIVER("Parent debugfs directory not available yet\n"); + return -ENODEV; + } + + ret = relay_late_setup_files(guc->log.relay_chan, "guc_log", log_dir); + if (ret) { + DRM_DEBUG_DRIVER("Couldn't associate the channel with file %d\n", ret); + return ret; + } + + return 0; +} + static void guc_log_cleanup(struct intel_guc *guc) { struct drm_i915_private *dev_priv = guc_to_i915(guc); @@ -1167,7 +1181,7 @@ static int guc_create_log_extras(struct intel_guc *guc) { struct drm_i915_private *dev_priv = guc_to_i915(guc); void *vaddr; - int ret; + int ret = 0; lockdep_assert_held(&dev_priv->drm.struct_mutex); @@ -1190,7 +1204,15 @@ static int guc_create_log_extras(struct intel_guc *guc) guc->log.buf_addr = vaddr; } - return 0; + if (!guc->log.relay_chan) { + /* Create a relay channel, so that we have buffers for storing +* the GuC firmware logs, the channel will be linked with a file +* later on when debugfs is registered. +*/ + ret = guc_create_relay_channel(guc); + } + + return ret; } static void guc_create_log(struct intel_guc *guc) @@ -1231,6 +1253,7 @@ static void guc_create_log(struct intel_guc *guc) guc->log.obj = obj; if (guc_create_log_extras(guc)) { + guc_log_cleanup(guc); gem_release_guc_obj(guc->log.obj); guc->log.obj = NULL;
Re: [Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
On 8/12/2016 7:37 PM, Tvrtko Ursulin wrote: On 12/08/16 14:45, Goel, Akash wrote: On 8/12/2016 6:47 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Sagar Arun Kamble GuC ukernel sends an interrupt to Host to flush the log buffer and expects Host to correspondingly update the read pointer information in the state structure, once it has consumed the log buffer contents by copying them to a file or buffer. Even if Host couldn't copy the contents, it can still update the read pointer so that logging state is not disturbed on GuC side. v2: - Use a dedicated workqueue for handling flush interrupt. (Tvrtko) - Reduce the overall log buffer copying time by skipping the copy of crash buffer area for regular cases and copying only the state structure data in first page. v3: - Create a vmalloc mapping of log buffer. (Chris) - Cover the flush acknowledgment under rpm get & put.(Chris) - Revert the change of skipping the copy of crash dump area, as not really needed, will be covered by subsequent patch. v4: - Destroy the wq under the same condition in which it was created, pass dev_piv pointer instead of dev to newly added GuC function, add more comments & rename variable for clarity. 
(Tvrtko) Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.c| 14 +++ drivers/gpu/drm/i915/i915_guc_submission.c | 150 + drivers/gpu/drm/i915/i915_irq.c| 5 +- drivers/gpu/drm/i915/intel_guc.h | 3 + 4 files changed, 170 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 0fcd1c0..fc2da32 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv) if (dev_priv->hotplug.dp_wq == NULL) goto out_free_wq; +if (HAS_GUC_SCHED(dev_priv)) { This just reminded me that a previous patch had: +if (HAS_GUC_UCODE(dev)) +dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT; In the interrupt setup. I don't think there is a bug right now, but there is a disagreement between the two which would be good to resolve. This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED for correctness. I think. Sorry for inconsistency, Will use HAS_GUC_SCHED in the previous patch. As per Chris's comments will move the wq init/destroy to the GuC logging setup/teardown routines (guc_create_log_extras, guc_log_cleanup) You are fine with that ?. Yes thats OK I think. +/* Need a dedicated wq to process log buffer flush interrupts + * from GuC without much delay so as to avoid any loss of logs. 
+ */ +dev_priv->guc.log.wq = +alloc_ordered_workqueue("i915-guc_log", 0); +if (dev_priv->guc.log.wq == NULL) +goto out_free_hotplug_dp_wq; +} + return 0; +out_free_hotplug_dp_wq: +destroy_workqueue(dev_priv->hotplug.dp_wq); out_free_wq: destroy_workqueue(dev_priv->wq); out_err: @@ -782,6 +794,8 @@ out_err: static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv) { +if (HAS_GUC_SCHED(dev_priv)) +destroy_workqueue(dev_priv->guc.log.wq); destroy_workqueue(dev_priv->hotplug.dp_wq); destroy_workqueue(dev_priv->wq); } diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index c7c679f..2635b67 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct intel_guc *guc, return host2guc_action(guc, data, ARRAY_SIZE(data)); } +static int host2guc_logbuffer_flush_complete(struct intel_guc *guc) +{ +u32 data[1]; + +data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE; + +return host2guc_action(guc, data, 1); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -840,6 +849,127 @@ err: return NULL; } +static void guc_move_to_next_buf(struct intel_guc *guc) +{ +return; +} + +static void* guc_get_write_buffer(struct intel_guc *guc) +{ +return NULL; +} + +static void guc_read_update_log_buffer(struct intel_guc *guc) +{ +struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state; +struct guc_log_buffer_state log_buffer_state_local; +void *src_data_ptr, *dst_data_ptr; +u32 i, buffer_size; unsigned int i if you can be bothered. Fine will do that for both i & buffer_size. buffer_size can match the type of log_buffer_state_local.size or use something else if more appropriate. But I remember earlier in one of the patch, you suggested to use u32 as a type for some variables. Please could you share the guideline. 
Should u32, u64 be used we are exactly sure of the range of the variable, like for
Re: [Intel-gfx] [PATCH 08/20] drm/i915: Add a relay backed debugfs interface for capturing GuC logs
On 8/12/2016 7:23 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel Added a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the User to capture GuC firmware logs. Availed relay framework to implement the interface, where Driver will have to just use a relay API to store snapshots of the GuC log buffer in the buffer managed by relay. The snapshot will be taken when GuC firmware sends a log buffer flush interrupt and up to four snaphots could be stored in the relay buffer. snapshots The relay buffer will be operated in a mode where it will overwrite the data not yet collected by User. Besides mmap method, through which User can directly access the relay buffer contents, relay also supports the 'poll' method. Through the 'poll' call on log file, User can come to know whenever a new snapshot of the log buffer is taken by Driver, so can run in tandem with the Driver and capture the logs in a sustained/streaming manner, without any loss of data. v2: Defer the creation of relay channel & associated debugfs file, as debugfs setup is now done at the end of i915 Driver load. (Chris) v3: - Switch to no-overwrite mode for relay. - Fix the relay sub buffer switching sequence. v4: - Update i915 Kconfig to select RELAY config. (TvrtKo) - Log a message when there is no sub buffer available to capture the GuC log buffer. 
(Tvrtko) - Increase the number of relay sub buffers to 8 from 4, to have sufficient buffering for boot time logs Suggested-by: Chris Wilson Signed-off-by: Sourab Gupta Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/Kconfig | 1 + drivers/gpu/drm/i915/i915_drv.c| 2 + drivers/gpu/drm/i915/i915_guc_submission.c | 206 - drivers/gpu/drm/i915/intel_guc.h | 3 + 4 files changed, 209 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 7769e46..fc900d2 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -11,6 +11,7 @@ config DRM_I915 select DRM_KMS_HELPER select DRM_PANEL select DRM_MIPI_DSI +select RELAY # i915 depends on ACPI_VIDEO when ACPI is enabled # but for select to work, need to select ACPI_VIDEO's dependencies, ick select BACKLIGHT_LCD_SUPPORT if ACPI diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index fc2da32..cb8c943 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1145,6 +1145,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv) /* Reveal our presence to userspace */ if (drm_dev_register(dev, 0) == 0) { i915_debugfs_register(dev_priv); +i915_guc_register(dev_priv); i915_setup_sysfs(dev); } else DRM_ERROR("Failed to register driver for userspace access!\n"); @@ -1183,6 +1184,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv) intel_opregion_unregister(dev_priv); i915_teardown_sysfs(&dev_priv->drm); +i915_guc_unregister(dev_priv); i915_debugfs_unregister(dev_priv); drm_dev_unregister(&dev_priv->drm); diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 2635b67..1a2d648 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -23,6 +23,8 @@ */ #include #include +#include +#include #include "i915_drv.h" #include "intel_guc.h" @@ -851,12 +853,33 @@ err: static void 
guc_move_to_next_buf(struct intel_guc *guc) { -return; +/* Make sure the updates made in the sub buffer are visible when + * Consumer sees the following update to offset inside the sub buffer. + */ +smp_wmb(); + +/* All data has been written, so now move the offset of sub buffer. */ +relay_reserve(guc->log.relay_chan, guc->log.obj->base.size); + +/* Switch to the next sub buffer */ +relay_flush(guc->log.relay_chan); } static void* guc_get_write_buffer(struct intel_guc *guc) { -return NULL; +/* FIXME: Cover the check under a lock ? */ Need to resolve before r-b in any case. After the last patch in this series, where relay channel will be created before enabling the GuC interrupts, the need of lock will not be there so will remove these comments in that patch. +if (!guc->log.relay_chan) +return NULL; + +/* Just get the base address of a new sub buffer and copy data into it + * ourselves. NULL will be returned in no-overwrite mode, if all sub + * buffers are full. Could have used the relay_write() to indirectly + * copy the data, but that would have been bit convoluted, as we need to + * write to only certain locations inside a sub buffer which cannot be + * done without using relay_reserve() along with relay_write(). So its + * better to use relay_
Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer
On Fri, Aug 12, 2016 at 09:34:23PM +0530, Goel, Akash wrote:
> On 8/12/2016 9:22 PM, Chris Wilson wrote:
> >On Fri, Aug 12, 2016 at 09:16:03PM +0530, Goel, Akash wrote:
> >>On 8/12/2016 9:02 PM, Chris Wilson wrote:
> >>>There's (or will be) a function to dump the error object in a uniform
> >>>manner. This patch is obsolete.
> >>
> >>There is a print_error_obj() function, but that prints one dword per line.
> >
> >It used to. It will shortly be a compressed stream.
>
> >Pretty printing is left to userspace.
> But invariably, we only will be interpreting the error state or Guc
> log buffer dump, and it will be really convenient if we can have 4
> dwords per line matching the log sample size.

That's fine. Do it in userspace.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH 19/20] drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer
On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel In order to have fast reads from the GuC log buffer, used SSE4.1 movntdqa based memcpy function i915_memcpy_from_wc. GuC log buffer has a WC type vmalloc mapping and copying using movntqda from WC type memory is almost as fast as reading from WB memory. This will further reduce the log buffer sampling time, so is needed dearly to deal with the flush interrupt storm when GuC is generating logs at a very high rate. Ideally SSE 4.1 should be present on all chipsets supporting GuC based submisssions, but if not then logging will not be enabled. Suggested-by: Chris Wilson Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 17 ++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 1818343..af48f62 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -987,15 +987,16 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) /* Just copy the newly written data */ if (read_offset <= write_offset) { bytes_to_copy = write_offset - read_offset; - memcpy(dst_data_ptr + read_offset, + i915_memcpy_from_wc(dst_data_ptr + read_offset, src_data_ptr + read_offset, bytes_to_copy); } else { bytes_to_copy = buffer_size - read_offset; - memcpy(dst_data_ptr + read_offset, + i915_memcpy_from_wc(dst_data_ptr + read_offset, src_data_ptr + read_offset, bytes_to_copy); bytes_to_copy = write_offset; - memcpy(dst_data_ptr, src_data_ptr, bytes_to_copy); + i915_memcpy_from_wc(dst_data_ptr, src_data_ptr, +bytes_to_copy); } src_data_ptr += buffer_size; @@ -1210,6 +1211,16 @@ static void guc_create_log(struct intel_guc *guc) obj = guc->log.obj; if (!obj) { + /* We require SSE 4.1 for fast reads from the GuC log buffer and +* it should be present on the chipsets supporting GuC based +* submisssions. 
+*/ + if (WARN_ON(!i915_memcpy_from_wc(NULL, NULL, 0))) { + /* logging will not be enabled */ + i915.guc_log_level = -1; + return; + } + obj = gem_allocate_guc_obj(dev_priv, size); if (!obj) { /* logging will be off */ Reviewed-by: Tvrtko Ursulin Regards, Tvrtko
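The copy logic touched by the diff in this patch (copy only the newly written region, handling the case where the write offset has wrapped around behind the read offset) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `example_copy_new_log_data` is a hypothetical name, and ordinary `memcpy` stands in for `i915_memcpy_from_wc`, which is a driver-internal helper and not available outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the bytes written since the last read from a circular log region
 * in src to the same offsets in dst. Returns the number of bytes copied.
 * Plain memcpy is used here as a stand-in for the WC-optimized copy. */
size_t example_copy_new_log_data(void *dst, const void *src,
				 size_t buffer_size,
				 size_t read_offset, size_t write_offset)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t copied;

	assert(read_offset <= buffer_size && write_offset <= buffer_size);

	if (read_offset <= write_offset) {
		/* No wrap: new data is the contiguous range [read, write). */
		copied = write_offset - read_offset;
		memcpy(d + read_offset, s + read_offset, copied);
	} else {
		/* Wrapped: copy the tail [read, end), then the head [0, write). */
		size_t tail = buffer_size - read_offset;

		memcpy(d + read_offset, s + read_offset, tail);
		memcpy(d, s, write_offset);
		copied = tail + write_offset;
	}

	return copied;
}
```

After such a copy the reader would advance its read pointer to `write_offset`, which matches the `read_ptr = write_offset` update seen elsewhere in this series.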
Re: [Intel-gfx] [PATCH 17/20] drm/i915: Use uncached(WC) mapping for acessing the GuC log buffer
On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel Host needs to sample the GuC log buffer on every flush interrupt from GuC. To ensure that we always get the up-to-date data from log buffer, its better to access the buffer through an uncached CPU mapping. Also the way buffer is accessed from GuC & Host side, manually doing cache flush may not be effective always if cached CPU mapping is used. Though there could be some performance implication with Uncached read, but reliability of data will be ensured. v2: Rebase. v3: Rebase. Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 1d58d36..1818343 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1002,8 +1002,6 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) dst_data_ptr += buffer_size; } - /* FIXME: invalidate/flush for log buffer needed */ - /* Update the read pointer in the shared log buffer */ log_buffer_state->read_ptr = write_offset; @@ -1177,8 +1175,11 @@ static int guc_create_log_extras(struct intel_guc *guc) return 0; if (!guc->log.buf_addr) { - /* Create a vmalloc mapping of log buffer pages */ - vaddr = i915_gem_object_pin_map(guc->log.obj, I915_MAP_WB); + /* Create a WC (Uncached for read) vmalloc mapping of log +* buffer pages, so that we can directly get the data +* (up-to-date) from memory. +*/ + vaddr = i915_gem_object_pin_map(guc->log.obj, I915_MAP_WC); if (IS_ERR(vaddr)) { ret = PTR_ERR(vaddr); DRM_ERROR("Couldn't map log buffer pages %d\n", ret); Reviewed-by: Tvrtko Ursulin Hopefully no one applies this without 19/20. :) Regards, Tvrtko
Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer
On 8/12/2016 9:22 PM, Chris Wilson wrote:
> On Fri, Aug 12, 2016 at 09:16:03PM +0530, Goel, Akash wrote:
> > On 8/12/2016 9:02 PM, Chris Wilson wrote:
> > >There's (or will be) a function to dump the error object in a uniform
> > >manner. This patch is obsolete.
> >
> > There is a print_error_obj() function, but that prints one dword per line.
>
> It used to. It will shortly be a compressed stream. Pretty printing is
> left to userspace.

But invariably, we only will be interpreting the error state or GuC log buffer dump, and it will be really convenient if we can have 4 dwords per line matching the log sample size.

Best regards
Akash

> -Chris
Re: [Intel-gfx] [PATCH 8/9] drm/i915/cmdparser: Use binary search for faster register lookup
On 12 August 2016 at 16:07, Chris Wilson wrote:
> A signifcant proportion of the cmdparsing time for some batches is the
> cost to find the register in the mmiotable. We ensure that those tables
> are in ascending order such that we could do a binary search if it was
> ever merited. It is.
>
> Signed-off-by: Chris Wilson

Cool.

s/signifcant/significant/

Reviewed-by: Matthew Auld
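For readers following the thread, the lookup that the commit message describes can be sketched as a plain binary search over a table sorted by ascending register offset. This is a minimal standalone sketch, not the driver code: `struct reg_desc` and `find_reg_sorted()` are hypothetical names chosen for illustration, and the sketch compares offsets directly instead of subtracting them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: only the register offset matters here. */
struct reg_desc {
	uint32_t offset;
};

/*
 * Binary search over a table sorted by ascending offset.
 * Returns the matching entry, or NULL if addr is not in the table.
 */
static const struct reg_desc *
find_reg_sorted(const struct reg_desc *table, size_t count, uint32_t addr)
{
	size_t start = 0, end = count;

	while (start < end) {
		size_t mid = start + (end - start) / 2;

		if (addr < table[mid].offset)
			end = mid;		/* target is in the lower half */
		else if (addr > table[mid].offset)
			start = mid + 1;	/* target is in the upper half */
		else
			return &table[mid];
	}

	return NULL;
}
```

The search only works if the table really is sorted, which is the invariant the commit message relies on; a linear scan has no such requirement.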
Re: [Intel-gfx] [PATCH 07/15] drm/omap: Use per-plane rotation property
On Thu, Aug 11, 2016 at 04:33:32PM +0300, Ville Syrjälä wrote: > On Thu, Aug 11, 2016 at 02:32:44PM +0300, Tomi Valkeinen wrote: > > Hi, > > > > On 22/07/16 16:43, ville.syrj...@linux.intel.com wrote: > > > From: Ville Syrjälä > > > > > > The global mode_config.rotation_property is going away, switch over to > > > per-plane rotation_property. > > > > > > Not sure I got the annoying crtc rotation_property handling right. > > > Might work, or migth not. > > > > I think something is funny with this patch or the series. I fetched your > > branch, and with your series, it looks like the primary planes lose all > > their props. modetest says: > > > > could not get plane 26 properties: Invalid argument > > could not get plane 30 properties: Invalid argument > > Hmm. Weird. Is it really the get props ioctl that fails? > > The first EINVAL I can spot there is > if (!obj->properties) { > ret = -EINVAL; > goto out_unref; > } > which definitely makes no sense since this is assigned > as plane->base.properties = &plane->properties. So can't be that unless > we manage to clear the pointer somehow after the init. > > The only other direct EINVAL I see there is if > drm_object_property_get_value(obj->properties->properties[i]) > fails to find the passed prop in the properties array. Which clearly > can't happen since we got it from the array in the first place. Also, > clearly that code is rather inefficient, perhaps someone should rewrite > it a bit. > > Can't quite see how this could fail for the plane in other ways. But I > might be blind. I tried to think on this a bit more, and the only think I came up with was that we end up doing the drm_plane_create_rotation_property() twice for the primary planes. I tried that on i915 but it'd didn't result in anything bad AFAICS. 
Would leak a bit, but so what :P Dunno, I guess you could try something like: --- a/drivers/gpu/drm/omapdrm/omap_plane.c +++ b/drivers/gpu/drm/omapdrm/omap_plane.c @@ -211,11 +211,12 @@ void omap_plane_install_properties(struct drm_plane *plane, struct omap_drm_private *priv = dev->dev_private; if (priv->has_dmm) { - drm_plane_create_rotation_property(plane, - BIT(DRM_ROTATE_0), - BIT(DRM_ROTATE_0) | BIT(DRM_ROTATE_90) | - BIT(DRM_ROTATE_180) | BIT(DRM_ROTATE_270) | - BIT(DRM_REFLECT_X) | BIT(DRM_REFLECT_Y)); + if (!plane->rotation_property) + drm_plane_create_rotation_property(plane, + BIT(DRM_ROTATE_0), + BIT(DRM_ROTATE_0) | BIT(DRM_ROTATE_90) | + BIT(DRM_ROTATE_180) | BIT(DRM_ROTATE_270) | + BIT(DRM_REFLECT_X) | BIT(DRM_REFLECT_Y)); > > > > > and > > > > Planes: > > id crtcfb CRTC x,yx,y gamma size possible > > crtcs > > 26 28 55 0,0 0,0 0 0x0001 > > formats: RG16 RX12 XR12 RA12 AR12 XR15 AR15 RG24 RX24 XR24 RA24 AR24 > > no properties found > > 30 0 0 0,0 0,0 0 0x0002 > > formats: RG16 RX12 XR12 RA12 AR12 XR15 AR15 RG24 RX24 XR24 RA24 AR24 > > NV12 YUYV UYVY > > no properties found > > > > I didn't look closer yet. > > > > Tomi > > > > > > > -- > Ville Syrjälä > Intel OTC > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Ville Syrjälä Intel OTC ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 15/20] drm/i915: Debugfs support for GuC logging control
On 12/08/16 07:25, akash.g...@intel.com wrote: From: Sagar Arun Kamble This patch provides debugfs interface i915_guc_output_control for on the fly enabling/disabling of logging in GuC firmware and controlling the verbosity level of logs. The value written to the file, should have bit 0 set to enable logging and bits 4-7 should contain the verbosity info. v2: Add a forceful flush, to collect left over logs, on disabling logging. Useful for Validation. v3: Besides minor cleanup, implement read method for the debugfs file and set the guc_log_level to -1 when logging is disabled. (Tvrtko) Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_debugfs.c| 44 - drivers/gpu/drm/i915/i915_guc_submission.c | 63 ++ drivers/gpu/drm/i915/intel_guc.h | 1 + 3 files changed, 107 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 14e0dcf..f472fbcd3 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2674,6 +2674,47 @@ static int i915_guc_log_dump(struct seq_file *m, void *data) return 0; } +static int i915_guc_log_control_get(void *data, u64 *val) +{ + struct drm_device *dev = data; + struct drm_i915_private *dev_priv = to_i915(dev); + + if (!dev_priv->guc.log.obj) + return -EINVAL; + + *val = i915.guc_log_level; + + return 0; +} + +static int i915_guc_log_control_set(void *data, u64 val) +{ + struct drm_device *dev = data; + struct drm_i915_private *dev_priv = to_i915(dev); + int ret; + + ret = mutex_lock_interruptible(&dev->struct_mutex); + if (ret) + return ret; + + if (!dev_priv->guc.log.obj) { + ret = -EINVAL; + goto end; + } + + intel_runtime_pm_get(dev_priv); + ret = i915_guc_log_control(dev_priv, val); + intel_runtime_pm_put(dev_priv); + +end: + mutex_unlock(&dev->struct_mutex); + return ret; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops, + i915_guc_log_control_get, i915_guc_log_control_set, + "%lld\n"); + static int 
i915_edp_psr_status(struct seq_file *m, void *data) { struct drm_info_node *node = m->private; @@ -5477,7 +5518,8 @@ static const struct i915_debugfs_files { {"i915_fbc_false_color", &i915_fbc_fc_fops}, {"i915_dp_test_data", &i915_displayport_test_data_fops}, {"i915_dp_test_type", &i915_displayport_test_type_fops}, - {"i915_dp_test_active", &i915_displayport_test_active_fops} + {"i915_dp_test_active", &i915_displayport_test_active_fops}, + {"i915_guc_log_control", &i915_guc_log_control_fops} }; void intel_display_crc_init(struct drm_device *dev) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 4a75c16..041cf68 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -195,6 +195,16 @@ static int host2guc_force_logbuffer_flush(struct intel_guc *guc) return host2guc_action(guc, data, 2); } +static int host2guc_logging_control(struct intel_guc *guc, u32 control_val) +{ + u32 data[2]; + + data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING; + data[1] = control_val; + + return host2guc_action(guc, data, 2); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -1538,3 +1548,56 @@ void i915_guc_register(struct drm_i915_private *dev_priv) guc_log_late_setup(&dev_priv->guc); mutex_unlock(&dev_priv->drm.struct_mutex); } + +int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val) +{ + union guc_log_control log_param; + int ret; + + log_param.logging_enabled = control_val & 0x1; + log_param.verbosity = (control_val >> 4) & 0xF; Maybe "log_param.value = control_val" would also work since guc_log_control is conveniently defined as an union. Doesn't matter though. 
+ + if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN || + log_param.verbosity > GUC_LOG_VERBOSITY_MAX) + return -EINVAL; + + /* This combination doesn't make sense & won't have any effect */ + if (!log_param.logging_enabled && (i915.guc_log_level < 0)) + return 0; I wonder if it would work and maybe look nicer to generalize as: int guc_log_level; guc_log_level = log_param.logging_enabled ? log_param.verbosity : -1; if (i915.guc_log_level == guc_log_level) return 0; + + ret = host2guc_logging_control(&dev_priv->guc, log_param.value); + if (ret < 0) { + DRM_DEBUG_DRIVER("host2guc action failed %d\n", ret); + return ret; + } + + i915.guc_log_level = log_p
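The control value layout discussed in this message (bit 0 enables logging, bits 4-7 carry the verbosity) together with the reviewer's suggested generalisation can be sketched in isolation as follows. This is only an illustration: `struct log_control`, `decode_log_control()` and `effective_log_level()` are hypothetical names, and the range check against the verbosity limits is left out because their values are not shown in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of the value written to the debugfs file. */
struct log_control {
	bool enabled;		/* bit 0 */
	unsigned int verbosity;	/* bits 4-7 */
};

static struct log_control decode_log_control(uint64_t val)
{
	struct log_control c;

	c.enabled = val & 0x1;
	c.verbosity = (val >> 4) & 0xF;
	return c;
}

/* Effective log level: the verbosity when enabled, -1 when disabled. */
static int effective_log_level(struct log_control c)
{
	return c.enabled ? (int)c.verbosity : -1;
}
```

Comparing the effective level against the current one, as suggested in the review, would cover both "already disabled" and "already at this verbosity" with a single early return.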
Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer
On Fri, Aug 12, 2016 at 09:16:03PM +0530, Goel, Akash wrote:
> On 8/12/2016 9:02 PM, Chris Wilson wrote:
> >There's (or will be) a function to dump the error object in a uniform
> >manner. This patch is obsolete.
>
> There is a print_error_obj() function, but that prints one dword per line.

It used to. It will shortly be a compressed stream. Pretty printing is
left to userspace.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer
On 8/12/2016 9:02 PM, Chris Wilson wrote: On Fri, Aug 12, 2016 at 04:20:03PM +0100, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel Added the dump of GuC log buffer to i915 error state, as the contents of GuC log buffer would also be useful to determine that why the GPU reset was triggered. Suggested-by: Chris Wilson Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gpu_error.c | 27 +++ 2 files changed, 28 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 28ffac5..4bd3790 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -509,6 +509,7 @@ struct drm_i915_error_state { struct intel_overlay_error_state *overlay; struct intel_display_error_state *display; struct drm_i915_error_object *semaphore_obj; + struct drm_i915_error_object *guc_log_obj; struct drm_i915_error_engine { int engine_id; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index eecb870..561b523 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -546,6 +546,21 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, } } + if ((obj = error->guc_log_obj)) { + err_printf(m, "GuC log buffer = 0x%08x\n", + lower_32_bits(obj->gtt_offset)); + for (i = 0; i < obj->page_count; i++) { + for (elt = 0; elt < PAGE_SIZE/4; elt += 4) { Should the condition be PAGE_SIZE / 16 ? I am not sure, looks like it is counting in u32 * 4 chunks so it might be. Or I might be confused.. It will be PAGE_SIZE / 4 only. It took me some iterations to get it right. PAGE_SIZE/4 is number of dwords and elt+=4 is covering 4 dwords in every iteration There's (or will be) a function to dump the error object in a uniform manner. This patch is obsolete. There is a print_error_obj() function, but that prints one dword per line. 
For the GuC log buffer it's better (for ease of interpretation) to print 4 dwords per line, as each sample is 4 dwords and the headers are 8 dwords. The other benefit is that it reduces the line count of the error state file (compared to other captured buffers like the ring buffer, batch buffers and status page, the log buffer is larger: 76 KB).

Best regards
Akash

> -Chris
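The loop bound question raised earlier in this thread can be checked with a small standalone sketch. It assumes a 4096-byte page (`SKETCH_PAGE_SIZE` is a made-up constant for this example) and a hypothetical `dump_page()` helper; it is not the driver's error-state code.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/*
 * Print one page as rows of four dwords and return the number of rows.
 * elt indexes dwords, so the bound is the dword count (bytes / 4) and
 * the step of 4 consumes one 16-byte row per iteration.
 */
static unsigned int dump_page(FILE *out, unsigned int page_idx,
			      const uint32_t *page)
{
	unsigned int elt, rows = 0;

	for (elt = 0; elt < SKETCH_PAGE_SIZE / 4; elt += 4) {
		fprintf(out, "[%08x] %08" PRIx32 " %08" PRIx32
			" %08" PRIx32 " %08" PRIx32 "\n",
			page_idx * SKETCH_PAGE_SIZE + elt * 4,
			page[elt], page[elt + 1], page[elt + 2], page[elt + 3]);
		rows++;
	}

	return rows;
}
```

With these assumptions a page has 1024 dwords and the loop emits 256 rows, covering the whole page. A bound of `SKETCH_PAGE_SIZE / 16` with the same step of 4 would stop after 64 rows, i.e. a quarter of the page.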
[Intel-gfx] ✗ Ro.CI.BAT: failure for series starting with [1/9] drm/i915/cmdparser: Make initialisation failure non-fatal
== Series Details ==

Series: series starting with [1/9] drm/i915/cmdparser: Make initialisation failure non-fatal
URL   : https://patchwork.freedesktop.org/series/11031/
State : failure

== Summary ==

Applying: drm/i915/cmdparser: Make initialisation failure non-fatal
fatal: sha1 information is lacking or useless (drivers/gpu/drm/i915/i915_drv.h).
error: could not build fake ancestor
Patch failed at 0001 drm/i915/cmdparser: Make initialisation failure non-fatal
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
Re: [Intel-gfx] [PATCH 05/20] drm/i915: Support for GuC interrupts
On 8/12/2016 8:35 PM, Tvrtko Ursulin wrote: On 12/08/16 15:31, Goel, Akash wrote: On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote: +static void gen9_guc2host_events_work(struct work_struct *work) +{ +struct drm_i915_private *dev_priv = +container_of(work, struct drm_i915_private, guc.events_work); + +spin_lock_irq(&dev_priv->irq_lock); +/* Speed up work cancellation during disabling guc interrupts. */ +if (!dev_priv->guc.interrupts_enabled) { +spin_unlock_irq(&dev_priv->irq_lock); +return; I suppose locking for early exit is something about ensuring the worker sees the update to dev_priv->guc.interrupts_enabled done on another CPU? Yes locking (providing implicit barrier) will ensure that update made from another CPU is immediately visible to the worker. What if the disable happens after the unlock above? It would wait in disable until the irq handler exits. Most probably it will not have to wait, as irq handler would have completed if work item began the execution. Irq handler just queues the work item, which gets scheduled later on. Using the lock is beneficial for the case where the execution of work item and interrupt disabling is done around the same time. Ok maybe I am missing something. When can the interrupt disabling happen? Will it be controlled by the debugfs file or is it driver load/unload and suspend/resume? yes disabling will happen for all the above 3 scenarios. +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir) +{ +bool interrupts_enabled; + +if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) { +spin_lock(&dev_priv->irq_lock); +interrupts_enabled = dev_priv->guc.interrupts_enabled; +spin_unlock(&dev_priv->irq_lock); Not sure that taking a lock around only this read is needed. Again same reason as above, to make sure an update made on another CPU is immediately visible to the irq handler. I don't get it, see above. 
:) Here also If interrupt disabling & ISR execution happens around the same time then ISR might miss the reset of 'interrupts_enabled' flag and queue the new work. What if reset of interrupts_enabled happens just as the ISR releases the lock? Then ISR will proceed ahead and queue the work item. Lock is useful if reset of interrupts_enabled flag just happens before the ISR inspects the value of that flag. Also lock will help when interrupts_enabled flag is set again, next ISR will definitely see it as set. And same applies to the case when interrupt is re-enabled, ISR might still see the 'interrupts_enabled' flag as false. It will eventually see the update though. +if (interrupts_enabled) { +/* Sample the log buffer flush related bits & clear them + * out now itself from the message identity register to + * minimize the probability of losing a flush interrupt, + * when there are back to back flush interrupts. + * There can be a new flush interrupt, for different log + * buffer type (like for ISR), whilst Host is handling + * one (for DPC). Since same bit is used in message + * register for ISR & DPC, it could happen that GuC + * sets the bit for 2nd interrupt but Host clears out + * the bit on handling the 1st interrupt. + */ +u32 msg = I915_READ(SOFT_SCRATCH(15)) & +(GUC2HOST_MSG_CRASH_DUMP_POSTED | + GUC2HOST_MSG_FLUSH_LOG_BUFFER); +if (msg) { +/* Clear the message bits that are handled */ +I915_WRITE(SOFT_SCRATCH(15), +I915_READ(SOFT_SCRATCH(15)) & ~msg); Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it twice? Thought reading it again (just before the update) is bit safer compared to reading it once, as there is a potential race problem here. GuC could also write to the SOFT_SCRATCH(15) register, set new events bit, while Host clears off the bit of handled events. Don't get it. If there is a race between read and write there still is, don't see how a second read makes it safer. 
Yes can't avoid the race completely by double reads, but can reduce the race window size. There was only one thing between the two reads, and that was "if (msg)": +u32 msg = I915_READ(SOFT_SCRATCH(15)) & +(GUC2HOST_MSG_CRASH_DUMP_POSTED | + GUC2HOST_MSG_FLUSH_LOG_BUFFER); +if (msg) { +/* Clear the message bits that are handled */ +I915_WRITE(SOFT_SCRATCH(15), +I915_READ(SOFT_SCRATCH(15)) & ~msg); Also I felt code looked better in current form, as macros GUC2HOST_MSG_CRASH_DUMP_POSTED & GUC2HOST_MSG_FLUSH_LOG_BUFFER were used only once. Will change as per the initial implementation. u32 msg = I915_READ(SOFT_SCRATCH(15)); if (msg & (GUC2HOST_MSG_CRASH_DUMP_
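The single-read variant being discussed can be sketched outside the driver as below. Everything here is a stand-in: `scratch_reg` models the MMIO scratch register as a plain variable, the `MSG_*` bit positions are hypothetical, and `take_handled_events()` is an illustrative name rather than a function from the patch.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define MSG_CRASH_DUMP_POSTED	(1u << 1)
#define MSG_FLUSH_LOG_BUFFER	(1u << 3)

/* Stand-in for the MMIO scratch register shared with the firmware. */
static volatile uint32_t scratch_reg;

/*
 * Read the register once, pick out the event bits we handle, and write
 * back the cached value with only those bits cleared.
 * Returns the handled event bits (0 if none were set).
 */
static uint32_t take_handled_events(void)
{
	uint32_t val = scratch_reg;	/* single read */
	uint32_t msg = val & (MSG_CRASH_DUMP_POSTED | MSG_FLUSH_LOG_BUFFER);

	if (msg)
		scratch_reg = val & ~msg;	/* clear only the handled bits */

	return msg;
}
```

As noted in the exchange above, a read-modify-write like this remains racy against a concurrent writer whether the register is read once or twice; a second read only narrows the window.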
Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer
On Fri, Aug 12, 2016 at 04:20:03PM +0100, Tvrtko Ursulin wrote: > > On 12/08/16 07:25, akash.g...@intel.com wrote: > >From: Akash Goel > > > >Added the dump of GuC log buffer to i915 error state, as the contents of > >GuC log buffer would also be useful to determine that why the GPU reset > >was triggered. > > > >Suggested-by: Chris Wilson > >Signed-off-by: Akash Goel > >--- > > drivers/gpu/drm/i915/i915_drv.h | 1 + > > drivers/gpu/drm/i915/i915_gpu_error.c | 27 +++ > > 2 files changed, 28 insertions(+) > > > >diff --git a/drivers/gpu/drm/i915/i915_drv.h > >b/drivers/gpu/drm/i915/i915_drv.h > >index 28ffac5..4bd3790 100644 > >--- a/drivers/gpu/drm/i915/i915_drv.h > >+++ b/drivers/gpu/drm/i915/i915_drv.h > >@@ -509,6 +509,7 @@ struct drm_i915_error_state { > > struct intel_overlay_error_state *overlay; > > struct intel_display_error_state *display; > > struct drm_i915_error_object *semaphore_obj; > >+struct drm_i915_error_object *guc_log_obj; > > > > struct drm_i915_error_engine { > > int engine_id; > >diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c > >b/drivers/gpu/drm/i915/i915_gpu_error.c > >index eecb870..561b523 100644 > >--- a/drivers/gpu/drm/i915/i915_gpu_error.c > >+++ b/drivers/gpu/drm/i915/i915_gpu_error.c > >@@ -546,6 +546,21 @@ int i915_error_state_to_str(struct > >drm_i915_error_state_buf *m, > > } > > } > > > >+if ((obj = error->guc_log_obj)) { > >+err_printf(m, "GuC log buffer = 0x%08x\n", > >+ lower_32_bits(obj->gtt_offset)); > >+for (i = 0; i < obj->page_count; i++) { > >+for (elt = 0; elt < PAGE_SIZE/4; elt += 4) { > > Should the condition be PAGE_SIZE / 16 ? I am not sure, looks like > it is counting in u32 * 4 chunks so it might be. Or I might be > confused.. There's (or will be) a function to dump the error object in a uniform manner. This patch is obsolete. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer
On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel Added the dump of GuC log buffer to i915 error state, as the contents of GuC log buffer would also be useful to determine that why the GPU reset was triggered. Suggested-by: Chris Wilson Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gpu_error.c | 27 +++ 2 files changed, 28 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 28ffac5..4bd3790 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -509,6 +509,7 @@ struct drm_i915_error_state { struct intel_overlay_error_state *overlay; struct intel_display_error_state *display; struct drm_i915_error_object *semaphore_obj; + struct drm_i915_error_object *guc_log_obj; struct drm_i915_error_engine { int engine_id; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index eecb870..561b523 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -546,6 +546,21 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, } } + if ((obj = error->guc_log_obj)) { + err_printf(m, "GuC log buffer = 0x%08x\n", + lower_32_bits(obj->gtt_offset)); + for (i = 0; i < obj->page_count; i++) { + for (elt = 0; elt < PAGE_SIZE/4; elt += 4) { Should the condition be PAGE_SIZE / 16 ? I am not sure, looks like it is counting in u32 * 4 chunks so it might be. Or I might be confused.. 
+ err_printf(m, "[%08x] %08x %08x %08x %08x\n", + (u32)(i*PAGE_SIZE) + elt*4, + obj->pages[i][elt], + obj->pages[i][elt+1], + obj->pages[i][elt+2], + obj->pages[i][elt+3]); + } + } + } + if (error->overlay) intel_overlay_print_error_state(m, error->overlay); @@ -625,6 +640,7 @@ static void i915_error_state_free(struct kref *error_ref) } i915_error_object_free(error->semaphore_obj); + i915_error_object_free(error->guc_log_obj); for (i = 0; i < error->vm_count; i++) kfree(error->active_bo[i]); @@ -1210,6 +1226,16 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, } } +static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv, +struct drm_i915_error_state *error) Alignment. +{ + if (!dev_priv->guc.log.obj) + return; + + error->guc_log_obj = i915_error_ggtt_object_create(dev_priv, + dev_priv->guc.log.obj); +} + /* FIXME: Since pin count/bound list is global, we duplicate what we capture per * VM. */ @@ -1439,6 +1465,7 @@ void i915_capture_error_state(struct drm_i915_private *dev_priv, i915_gem_capture_buffers(dev_priv, error); i915_gem_record_fences(dev_priv, error); i915_gem_record_rings(dev_priv, error); + i915_gem_capture_guc_log_buffer(dev_priv, error); do_gettimeofday(&error->time); Regards, Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 16/20] drm/i915: Support to create write combined type vmaps
On Fri, Aug 12, 2016 at 08:43:58PM +0530, Goel, Akash wrote:
> On 8/12/2016 4:19 PM, Tvrtko Ursulin wrote:
> >Unrelated and unmentioned change to no guard page. Best to remove IMHO.
> >Can keep the RB in that case.
>
> Though it's not called out, sorry for that, but isn't it better to
> avoid using the guard page, which will save 4KB of vmalloc virtual
> space (which is scarce) for every mapping created by the driver?
>
> Would updating the commit message to mention this be fine?

Too late, already applied without the new flag. Yes, that's why I dropped the guard page when I found out it was being added. Send a patch to add the flag and we can discuss whether we think our code is adequate to not require the protection.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH 9/9] drm/i915/cmdparser: Accelerate copies from WC memory
On Fri, Aug 12, 2016 at 04:07:30PM +0100, Chris Wilson wrote:
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 2fe88d930ca7..8dcdc27afe80 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -715,18 +715,13 @@ static int i915_gem_seqno_info(struct seq_file *m, void *data)
>  	struct drm_device *dev = node->minor->dev;
>  	struct drm_i915_private *dev_priv = to_i915(dev);
>  	struct intel_engine_cs *engine;
> -	int ret;
>  
> -	ret = mutex_lock_interruptible(&dev->struct_mutex);
> -	if (ret)
> -		return ret;
>  	intel_runtime_pm_get(dev_priv);
>  
>  	for_each_engine(engine, dev_priv)
>  		i915_ring_seqno_info(m, engine);
>  
>  	intel_runtime_pm_put(dev_priv);
> -	mutex_unlock(&dev->struct_mutex);

On noes, rebase damage. /o\
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH 16/20] drm/i915: Support to create write combined type vmaps
On 8/12/2016 4:19 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Chris Wilson vmaps has a provision for controlling the page protection bits, with which we can use to control the mapping type, e.g. WB, WC, UC or even WT. To allow the caller to choose their mapping type, we add a parameter to i915_gem_object_pin_map - but we still only allow one vmap to be cached per object. If the object is currently not pinned, then we recreate the previous vmap with the new access type, but if it was pinned we report an error. This effectively limits the access via i915_gem_object_pin_map to a single mapping type for the lifetime of the object. Not usually a problem, but something to be aware of when setting up the object's vmap. We will want to vary the access type to enable WC mappings of ringbuffer and context objects on !llc platforms, as well as other objects where we need coherent access to the GPU's pages without going through the GTT v2: Remove the redundant braces around pin count check and fix the marker in documentation (Chris) v3: - Add a new enum for the vmalloc mapping type & pass that as an argument to i915_object_pin_map. (Tvrtko) - Use PAGE_MASK to extract or filter the mapping type info and remove a superfluous BUG_ON.(Tvrtko) v4: - Rename the enums and clean up the pin_map function. 
(Chris) Signed-off-by: Chris Wilson Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.h| 9 - drivers/gpu/drm/i915/i915_gem.c| 58 +++--- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 2 +- drivers/gpu/drm/i915/i915_guc_submission.c | 2 +- drivers/gpu/drm/i915/intel_lrc.c | 8 ++--- drivers/gpu/drm/i915/intel_ringbuffer.c| 2 +- 6 files changed, 60 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 4bd3790..6603812 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -834,6 +834,11 @@ enum i915_cache_level { I915_CACHE_WT, /* hsw:gt3e WriteThrough for scanouts */ }; +enum i915_map_type { +I915_MAP_WB = 0, +I915_MAP_WC, +}; + struct i915_ctx_hang_stats { /* This context had batch pending when hang was declared */ unsigned batch_pending; @@ -3150,6 +3155,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) /** * i915_gem_object_pin_map - return a contiguous mapping of the entire object * @obj - the object to map into kernel address space + * @map_type - whether the vmalloc mapping should be using WC or WB pgprot_t * * Calls i915_gem_object_pin_pages() to prevent reaping of the object's * pages and then returns a contiguous mapping of the backing storage into @@ -3161,7 +3167,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) * Returns the pointer through which to access the mapped object, or an * ERR_PTR() on error. 
*/ -void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj); +void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj, +enum i915_map_type map_type); /** * i915_gem_object_unpin_map - releases an earlier mapping diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 03548db..7dabbc3f 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2077,10 +2077,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) list_del(&obj->global_list); if (obj->mapping) { -if (is_vmalloc_addr(obj->mapping)) -vunmap(obj->mapping); +void *ptr = (void *)((uintptr_t)obj->mapping & PAGE_MASK); +if (is_vmalloc_addr(ptr)) +vunmap(ptr); else -kunmap(kmap_to_page(obj->mapping)); +kunmap(kmap_to_page(ptr)); obj->mapping = NULL; } @@ -2253,7 +2254,8 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) } /* The 'mapping' part of i915_gem_object_pin_map() below */ -static void *i915_gem_object_map(const struct drm_i915_gem_object *obj) +static void *i915_gem_object_map(const struct drm_i915_gem_object *obj, + enum i915_map_type type) { unsigned long n_pages = obj->base.size >> PAGE_SHIFT; struct sg_table *sgt = obj->pages; @@ -2263,9 +2265,10 @@ static void *i915_gem_object_map(const struct drm_i915_gem_object *obj) struct page **pages = stack_pages; unsigned long i = 0; void *addr; +bool use_wc = (type == I915_MAP_WC); /* A single page can always be kmapped */ -if (n_pages == 1) +if (n_pages == 1 && !use_wc) return kmap(sg_page(sgt->sgl)); if (n_pages > ARRAY_SIZE(stack_pages)) { @@ -2281,7 +2284,8 @@ static void *i915_gem_object_map(const struct drm_i915_gem_object *obj) /* Check that we have the expected number of pages */ GEM_B
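The `PAGE_MASK` handling mentioned in the changelog and visible in the `put_pages` hunk suggests that the mapping type travels in the low bits of the page-aligned mapping pointer. A minimal sketch of that idea, under that assumption and with made-up names (`SKETCH_PAGE_MASK`, `pack_mapping()`, `mapping_addr()`, `mapping_type()`), might look like this; how the patch itself packs the type is not fully visible in this excerpt.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE ((uintptr_t)4096)	/* assumed page size */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical mirror of the two mapping types named in the patch. */
enum map_type { MAP_WB = 0, MAP_WC = 1 };

/* Store the type in the low bits of a page-aligned address. */
static void *pack_mapping(void *addr, enum map_type type)
{
	return (void *)((uintptr_t)addr | (uintptr_t)type);
}

/* Recover the usable, page-aligned address. */
static void *mapping_addr(void *packed)
{
	return (void *)((uintptr_t)packed & SKETCH_PAGE_MASK);
}

/* Recover the mapping type from the low bits. */
static enum map_type mapping_type(void *packed)
{
	return (enum map_type)((uintptr_t)packed & ~SKETCH_PAGE_MASK);
}
```

This only works because a vmap or kmap address is page aligned, so its low bits are always zero and free to carry a small tag; every user of the stored pointer has to mask it before dereferencing or unmapping.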
[Intel-gfx] [PATCH 7/9] drm/i915/cmdparser: Check for SKIP descriptors first
If the command descriptor says to skip it, ignore checking for any other conflict.

Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 3b1100a0e0cb..b88607bb971a 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1022,6 +1022,9 @@ static bool check_cmd(const struct intel_engine_cs *engine,
 		      const bool is_master,
 		      bool *oacontrol_set)
 {
+	if (desc->flags & CMD_DESC_SKIP)
+		return true;
+
 	if (desc->flags & CMD_DESC_REJECT) {
 		DRM_DEBUG_DRIVER("CMD: Rejected command: 0x%08X\n", *cmd);
 		return false;
-- 
2.8.1
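The effect of the ordering can be seen in a reduced sketch. The flag values and `check_desc()` below are hypothetical stand-ins for illustration, not the command parser's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define DESC_SKIP	(1u << 0)
#define DESC_REJECT	(1u << 1)

/*
 * Returns true if the command is accepted. The skip test comes first,
 * so a descriptor marked skip is accepted without any further checks.
 */
static bool check_desc(uint32_t flags)
{
	if (flags & DESC_SKIP)
		return true;

	if (flags & DESC_REJECT)
		return false;

	/* Further checks would follow here. */
	return true;
}
```

In this sketch a descriptor carrying both flags is accepted, which is the behavioural difference from testing the reject flag first.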
[Intel-gfx] [PATCH 9/9] drm/i915/cmdparser: Accelerate copies from WC memory
If we need to use clflush to prepare our batch for reads from memory, we can bypass the cache instead by using non-temporal copies. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_cmd_parser.c | 58 ++ drivers/gpu/drm/i915/i915_debugfs.c| 24 -- drivers/gpu/drm/i915/i915_drv.c| 19 --- drivers/gpu/drm/i915/i915_gem.c| 48 drivers/gpu/drm/i915/i915_gem_gtt.c| 17 +++--- drivers/gpu/drm/i915/i915_gem_tiling.c | 4 --- drivers/gpu/drm/i915/i915_irq.c| 2 -- drivers/gpu/drm/i915/intel_uncore.c| 6 ++-- 8 files changed, 81 insertions(+), 97 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index cea3ef7299cc..3244ef1401ad 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -969,8 +969,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj, { unsigned int src_needs_clflush; unsigned int dst_needs_clflush; - void *dst, *ptr; - int offset, n; + void *dst; int ret; ret = i915_gem_obj_prepare_shmem_read(src_obj, &src_needs_clflush); @@ -987,24 +986,43 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj, if (IS_ERR(dst)) goto unpin_dst; - ptr = dst; - offset = offset_in_page(batch_start_offset); - if (dst_needs_clflush & CLFLUSH_BEFORE) - batch_len = roundup(batch_len, boot_cpu_data.x86_clflush_size); - - for (n = batch_start_offset >> PAGE_SHIFT; batch_len; n++) { - int len = min_t(int, batch_len, PAGE_SIZE - offset); - void *vaddr; - - vaddr = kmap_atomic(i915_gem_object_get_page(src_obj, n)); - if (src_needs_clflush) - drm_clflush_virt_range(vaddr + offset, len); - memcpy(ptr, vaddr + offset, len); - kunmap_atomic(vaddr); - - ptr += len; - batch_len -= len; - offset = 0; + if (src_needs_clflush && + i915_memcpy_from_wc((void *)(uintptr_t)batch_start_offset, 0, 0)) { + void *src; + + src = i915_gem_object_pin_map(src_obj, I915_MAP_WC); + if (IS_ERR(src)) + goto shmem_copy; + + i915_memcpy_from_wc(dst, + src + batch_start_offset, + ALIGN(batch_len, 
16)); + i915_gem_object_unpin_map(src_obj); + } else { + void *ptr; + int offset, n; + +shmem_copy: + offset = offset_in_page(batch_start_offset); + if (dst_needs_clflush & CLFLUSH_BEFORE) + batch_len = roundup(batch_len, + boot_cpu_data.x86_clflush_size); + + ptr = dst; + for (n = batch_start_offset >> PAGE_SHIFT; batch_len; n++) { + int len = min_t(int, batch_len, PAGE_SIZE - offset); + void *vaddr; + + vaddr = kmap_atomic(i915_gem_object_get_page(src_obj, n)); + if (src_needs_clflush) + drm_clflush_virt_range(vaddr + offset, len); + memcpy(ptr, vaddr + offset, len); + kunmap_atomic(vaddr); + + ptr += len; + batch_len -= len; + offset = 0; + } } /* dst_obj is returned with vmap pinned */ diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 2fe88d930ca7..8dcdc27afe80 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -715,18 +715,13 @@ static int i915_gem_seqno_info(struct seq_file *m, void *data) struct drm_device *dev = node->minor->dev; struct drm_i915_private *dev_priv = to_i915(dev); struct intel_engine_cs *engine; - int ret; - ret = mutex_lock_interruptible(&dev->struct_mutex); - if (ret) - return ret; intel_runtime_pm_get(dev_priv); for_each_engine(engine, dev_priv) i915_ring_seqno_info(m, engine); intel_runtime_pm_put(dev_priv); - mutex_unlock(&dev->struct_mutex); return 0; } @@ -1379,11 +1374,7 @@ static int ironlake_drpc_info(struct seq_file *m) struct drm_i915_private *dev_priv = to_i915(dev); u32 rgvmodectl, rstdbyctl; u16 crstandvid; - int ret; - ret = mutex_lock_interruptible(&dev->struct_mutex); - if (ret) - return ret; intel_runtime_pm_get(dev_priv); rgvmodectl = I915_READ(MEMMODECTL); @@ -1391,7 +1382,6 @@ static int ironlake_drpc_info(struct seq_file *m) crstandvid = I915_R
[Intel-gfx] [PATCH 8/9] drm/i915/cmdparser: Use binary search for faster register lookup
A significant proportion of the cmdparsing time for some batches is the
cost to find the register in the mmio table. We ensure that those tables
are in ascending order such that we could do a binary search if it was
ever merited. It is.

Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 42 --
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index b88607bb971a..cea3ef7299cc 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -925,36 +925,37 @@ find_cmd(struct intel_engine_cs *engine,
 }
 
 static const struct drm_i915_reg_descriptor *
-find_reg(const struct drm_i915_reg_descriptor *table,
-	 int count, u32 addr)
+__find_reg(const struct drm_i915_reg_descriptor *table, int count, u32 addr)
 {
-	int i;
-
-	for (i = 0; i < count; i++) {
-		if (i915_mmio_reg_offset(table[i].addr) == addr)
-			return &table[i];
+	int start = 0, end = count;
+	while (start < end) {
+		int mid = start + (end - start) / 2;
+		int ret = addr - i915_mmio_reg_offset(table[mid].addr);
+		if (ret < 0)
+			end = mid;
+		else if (ret > 0)
+			start = mid + 1;
+		else
+			return &table[mid];
 	}
-
 	return NULL;
 }
 
 static const struct drm_i915_reg_descriptor *
-find_reg_in_tables(const struct drm_i915_reg_table *tables,
-		   int count, bool is_master, u32 addr)
+find_reg(const struct intel_engine_cs *engine, bool is_master, u32 addr)
 {
-	int i;
-	const struct drm_i915_reg_table *table;
-	const struct drm_i915_reg_descriptor *reg;
+	const struct drm_i915_reg_table *table = engine->reg_tables;
+	int count = engine->reg_table_count;
 
-	for (i = 0; i < count; i++) {
-		table = &tables[i];
+	do {
 		if (!table->master || is_master) {
-			reg = find_reg(table->regs, table->num_regs,
-				       addr);
+			const struct drm_i915_reg_descriptor *reg;
+
+			reg = __find_reg(table->regs, table->num_regs, addr);
 			if (reg != NULL)
 				return reg;
 		}
-	}
+	} while (table++, --count);
 
 	return NULL;
 }
 
@@ -1049,10 +1050,7 @@ static bool check_cmd(const struct intel_engine_cs *engine,
 		     offset += step) {
 			const u32 reg_addr = cmd[offset] & desc->reg.mask;
 			const struct drm_i915_reg_descriptor *reg =
-				find_reg_in_tables(engine->reg_tables,
-						   engine->reg_table_count,
-						   is_master,
-						   reg_addr);
+				find_reg(engine, is_master, reg_addr);
 
 			if (!reg) {
 				DRM_DEBUG_DRIVER("CMD: Rejected register 0x%08X in command: 0x%08X (exec_id=%d)\n",
-- 
2.8.1
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
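For readers following along outside the kernel tree, the lookup in this patch can be sketched as a standalone binary search over a table sorted by ascending offset. This is only an illustration: struct reg_desc and find_reg_sorted() are hypothetical names, not the driver's, and the sketch uses explicit unsigned comparisons rather than the signed subtraction in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct drm_i915_reg_descriptor. */
struct reg_desc {
	uint32_t addr;	/* MMIO offset; the table is sorted ascending on this */
};

/*
 * Binary search over the half-open range [start, end).
 * Returns NULL when addr is not present in the table.
 */
static const struct reg_desc *
find_reg_sorted(const struct reg_desc *table, size_t count, uint32_t addr)
{
	size_t start = 0, end = count;

	while (start < end) {
		size_t mid = start + (end - start) / 2;

		if (addr < table[mid].addr)
			end = mid;
		else if (addr > table[mid].addr)
			start = mid + 1;
		else
			return &table[mid];
	}

	return NULL;
}
```

Comparing the two u32 values directly keeps the ordering correct even for offsets that differ by more than INT_MAX, at the cost of one extra comparison per step.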
[Intel-gfx] [PATCH 5/9] drm/i915/cmdparser: Improve hash function
The existing code's hashfunction is very suboptimal (most 3D commands use the same bucket degrading the hash to a long list). The code even acknowledge that the issue was known and the fix simple: /* * If we attempt to generate a perfect hash, we should be able to look at bits * 31:29 of a command from a batch buffer and use the full mask for that * client. The existing INSTR_CLIENT_MASK/SHIFT defines can be used for this. */ Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_cmd_parser.c | 51 +- 1 file changed, 31 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 4c903081604c..274f2136a846 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -86,24 +86,24 @@ * general bitmasking mechanism. */ -#define STD_MI_OPCODE_MASK 0xFF80 -#define STD_3D_OPCODE_MASK 0x -#define STD_2D_OPCODE_MASK 0xFFC0 -#define STD_MFX_OPCODE_MASK 0x +#define STD_MI_OPCODE_SHIFT (32 - 9) +#define STD_3D_OPCODE_SHIFT (32 - 16) +#define STD_2D_OPCODE_SHIFT (32 - 10) +#define STD_MFX_OPCODE_SHIFT (32 - 16) #define CMD(op, opm, f, lm, fl, ...) \ { \ .flags = (fl) | ((f) ? CMD_DESC_FIXED : 0), \ - .cmd = { (op), (opm) }, \ + .cmd = { (op), ~0u << (opm) }, \ .length = { (lm) }, \ __VA_ARGS__ \ } /* Convenience macros to compress the tables */ -#define SMI STD_MI_OPCODE_MASK -#define S3D STD_3D_OPCODE_MASK -#define S2D STD_2D_OPCODE_MASK -#define SMFX STD_MFX_OPCODE_MASK +#define SMI STD_MI_OPCODE_SHIFT +#define S3D STD_3D_OPCODE_SHIFT +#define S2D STD_2D_OPCODE_SHIFT +#define SMFX STD_MFX_OPCODE_SHIFT #define F true #define S CMD_DESC_SKIP #define R CMD_DESC_REJECT @@ -696,12 +696,26 @@ struct cmd_node { * non-opcode bits being set. But if we don't include those bits, some 3D * commands may hash to the same bucket due to not including opcode bits that * make the command unique. For now, we will risk hashing to the same bucket. 
- * - * If we attempt to generate a perfect hash, we should be able to look at bits - * 31:29 of a command from a batch buffer and use the full mask for that - * client. The existing INSTR_CLIENT_MASK/SHIFT defines can be used for this. */ -#define CMD_HASH_MASK STD_MI_OPCODE_MASK +static inline u32 cmd_header_key(u32 x) +{ + u32 shift; + + switch (x >> INSTR_CLIENT_SHIFT) { + default: + case INSTR_MI_CLIENT: + shift = STD_MI_OPCODE_SHIFT; + break; + case INSTR_RC_CLIENT: + shift = STD_3D_OPCODE_SHIFT; + break; + case INSTR_BC_CLIENT: + shift = STD_2D_OPCODE_SHIFT; + break; + } + + return x >> shift; +} static int init_hash_table(struct intel_engine_cs *engine, const struct drm_i915_cmd_table *cmd_tables, @@ -725,7 +739,7 @@ static int init_hash_table(struct intel_engine_cs *engine, desc_node->desc = desc; hash_add(engine->cmd_hash, &desc_node->node, -desc->cmd.value & CMD_HASH_MASK); +cmd_header_key(desc->cmd.value)); } } @@ -864,12 +878,9 @@ find_cmd_in_table(struct intel_engine_cs *engine, struct cmd_node *desc_node; hash_for_each_possible(engine->cmd_hash, desc_node, node, - cmd_header & CMD_HASH_MASK) { + cmd_header_key(cmd_header)) { const struct drm_i915_cmd_descriptor *desc = desc_node->desc; - u32 masked_cmd = desc->cmd.mask & cmd_header; - u32 masked_value = desc->cmd.value & desc->cmd.mask; - - if (masked_cmd == masked_value) + if (((cmd_header ^ desc->cmd.value) & desc->cmd.mask) == 0) return desc; } -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
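A standalone sketch of the per-client hash key from this patch, to show why it spreads 3D commands across buckets. The INSTR_* values below are assumptions about what the i915 headers define, and opcode_mask() is a hypothetical helper that mirrors the `~0u << (opm)` expression in the CMD() macro.

```c
#include <stdint.h>

/* Assumed values, believed to match the i915 register headers. */
#define INSTR_CLIENT_SHIFT	29
#define INSTR_MI_CLIENT		0x0
#define INSTR_BC_CLIENT		0x2
#define INSTR_RC_CLIENT		0x3

#define STD_MI_OPCODE_SHIFT	(32 - 9)
#define STD_3D_OPCODE_SHIFT	(32 - 16)
#define STD_2D_OPCODE_SHIFT	(32 - 10)

/* Hypothetical helper: rebuild the opcode mask from its shift. */
static inline uint32_t opcode_mask(unsigned int shift)
{
	return ~0u << shift;
}

/* Pick the shift from the client field (bits 31:29), then keep only opcode bits. */
static inline uint32_t cmd_header_key(uint32_t x)
{
	unsigned int shift;

	switch (x >> INSTR_CLIENT_SHIFT) {
	default:
	case INSTR_MI_CLIENT:
		shift = STD_MI_OPCODE_SHIFT;
		break;
	case INSTR_RC_CLIENT:
		shift = STD_3D_OPCODE_SHIFT;
		break;
	case INSTR_BC_CLIENT:
		shift = STD_2D_OPCODE_SHIFT;
		break;
	}

	return x >> shift;
}
```

With the old fixed 9-bit mask, two render-client headers such as 0x79000000 and 0x79010000 (illustrative values) collapse to the same key; with the per-client shift they differ in bit 16 and land in different buckets.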
[Intel-gfx] [PATCH 6/9] drm/i915/cmdparser: Compare against the previous command descriptor
On the blitter (and in test code), we see long sequences of repeated commands, e.g. XY_PIXEL_BLT, XY_SCANLINE_BLT, or XY_SRC_COPY. For these, we can skip the hashtable lookup by remembering the previous command descriptor and doing a straightforward compare of the command header. The corollary is that we need to do one extra comparison before lookup up new commands. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_cmd_parser.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 274f2136a846..3b1100a0e0cb 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -350,6 +350,9 @@ static const struct drm_i915_cmd_descriptor hsw_blt_cmds[] = { CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, R ), }; +static const struct drm_i915_cmd_descriptor noop_desc = + CMD(MI_NOOP, SMI, F, 1, S); + #undef CMD #undef SMI #undef S3D @@ -898,11 +901,14 @@ find_cmd_in_table(struct intel_engine_cs *engine, static const struct drm_i915_cmd_descriptor* find_cmd(struct intel_engine_cs *engine, u32 cmd_header, +const struct drm_i915_cmd_descriptor *desc, struct drm_i915_cmd_descriptor *default_desc) { - const struct drm_i915_cmd_descriptor *desc; u32 mask; + if (((cmd_header ^ desc->cmd.value) & desc->cmd.mask) == 0) + return desc; + desc = find_cmd_in_table(engine, cmd_header); if (desc) return desc; @@ -911,10 +917,10 @@ find_cmd(struct intel_engine_cs *engine, if (!mask) return NULL; - BUG_ON(!default_desc); - default_desc->flags = CMD_DESC_SKIP; + default_desc->cmd.value = cmd_header; + default_desc->cmd.mask = 0x; default_desc->length.mask = mask; - + default_desc->flags = CMD_DESC_SKIP; return default_desc; } @@ -1165,7 +1171,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine, bool is_master) { u32 *cmd, *batch_end; - struct drm_i915_cmd_descriptor default_desc = { 0 }; + struct drm_i915_cmd_descriptor default_desc = 
noop_desc; + const struct drm_i915_cmd_descriptor *desc = &default_desc; bool oacontrol_set = false; /* OACONTROL tracking. See check_cmd() */ bool needs_clflush_after = false; int ret = 0; @@ -1185,13 +1192,12 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine, */ batch_end = cmd + (batch_len / sizeof(*batch_end)); while (cmd < batch_end) { - const struct drm_i915_cmd_descriptor *desc; u32 length; if (*cmd == MI_BATCH_BUFFER_END) break; - desc = find_cmd(engine, *cmd, &default_desc); + desc = find_cmd(engine, *cmd, desc, &default_desc); if (!desc) { DRM_DEBUG_DRIVER("CMD: Unrecognized command: 0x%08X\n", *cmd); -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
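The "compare against the previous descriptor" idea can be sketched in isolation as below. Everything here is hypothetical scaffolding: struct cmd_desc, lookup_slow() (a linear scan standing in for the hash lookup) and the slow_lookups counter exist only to make the fast path observable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified command descriptor. */
struct cmd_desc {
	uint32_t value;
	uint32_t mask;
};

/* Same test as in the patch: XOR exposes differing bits, the mask selects opcode bits. */
static inline int desc_matches(const struct cmd_desc *desc, uint32_t header)
{
	return ((header ^ desc->value) & desc->mask) == 0;
}

/* Stand-in for the hashtable lookup; counts how often it is reached. */
static const struct cmd_desc *
lookup_slow(const struct cmd_desc *table, size_t count, uint32_t header,
	    unsigned int *slow_lookups)
{
	(*slow_lookups)++;
	for (size_t i = 0; i < count; i++) {
		if (desc_matches(&table[i], header))
			return &table[i];
	}
	return NULL;
}

/* Try the previously matched descriptor first, then fall back to the lookup. */
static const struct cmd_desc *
find_cmd_cached(const struct cmd_desc *table, size_t count, uint32_t header,
		const struct cmd_desc *prev, unsigned int *slow_lookups)
{
	if (prev && desc_matches(prev, header))
		return prev;

	return lookup_slow(table, count, header, slow_lookups);
}
```

The patch avoids the NULL check by seeding the loop with a copy of noop_desc; the sketch accepts a NULL prev instead so that it stays self-contained.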
[Intel-gfx] [PATCH 1/9] drm/i915/cmdparser: Make initialisation failure non-fatal
If the developer adds a register in the wrong order, we BUG during boot. That makes development and testing very difficult. Let's be a bit more friendly and disable the command parser with a big warning if the tables are invalid. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_cmd_parser.c | 30 ++ drivers/gpu/drm/i915/i915_drv.h| 2 +- drivers/gpu/drm/i915/intel_engine_cs.c | 6 -- 3 files changed, 23 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index a1f4683f5c35..1882dc28c750 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -746,17 +746,15 @@ static void fini_hash_table(struct intel_engine_cs *engine) * Optionally initializes fields related to batch buffer command parsing in the * struct intel_engine_cs based on whether the platform requires software * command parsing. - * - * Return: non-zero if initialization fails */ -int intel_engine_init_cmd_parser(struct intel_engine_cs *engine) +void intel_engine_init_cmd_parser(struct intel_engine_cs *engine) { const struct drm_i915_cmd_table *cmd_tables; int cmd_table_count; int ret; if (!IS_GEN7(engine->i915)) - return 0; + return; switch (engine->id) { case RCS: @@ -811,24 +809,32 @@ int intel_engine_init_cmd_parser(struct intel_engine_cs *engine) break; default: MISSING_CASE(engine->id); - BUG(); + return; } - BUG_ON(!validate_cmds_sorted(engine, cmd_tables, cmd_table_count)); - BUG_ON(!validate_regs_sorted(engine)); + if (!hash_empty(engine->cmd_hash)) { + DRM_DEBUG_DRIVER("%s: no commands?\n", engine->name); + return; + } - WARN_ON(!hash_empty(engine->cmd_hash)); + if (!validate_cmds_sorted(engine, cmd_tables, cmd_table_count)) { + DRM_ERROR("%s: command descriptions are not sorted\n", + engine->name); + return; + } + if (!validate_regs_sorted(engine)) { + DRM_ERROR("%s: registers are not sorted\n", engine->name); + return; + } ret = init_hash_table(engine, cmd_tables, 
cmd_table_count); if (ret) { - DRM_ERROR("CMD: cmd_parser_init failed!\n"); + DRM_ERROR("%s: initialised failed!\n", engine->name); fini_hash_table(engine); - return ret; + return; } engine->needs_cmd_parser = true; - - return 0; } /** diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 52207b086286..f5b187662059 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3608,7 +3608,7 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type); /* i915_cmd_parser.c */ int i915_cmd_parser_get_version(struct drm_i915_private *dev_priv); -int intel_engine_init_cmd_parser(struct intel_engine_cs *engine); +void intel_engine_init_cmd_parser(struct intel_engine_cs *engine); void intel_engine_cleanup_cmd_parser(struct intel_engine_cs *engine); int intel_engine_cmd_parser(struct intel_engine_cs *engine, struct drm_i915_gem_object *batch_obj, diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index 63440c6a6349..0eb19388eba4 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -239,6 +239,8 @@ void intel_engine_setup_common(struct intel_engine_cs *engine) intel_engine_init_requests(engine); intel_engine_init_hangcheck(engine); i915_gem_batch_pool_init(engine, &engine->batch_pool); + + intel_engine_init_cmd_parser(engine); } int intel_engine_create_scratch(struct intel_engine_cs *engine, int size) @@ -305,7 +307,7 @@ int intel_engine_init_common(struct intel_engine_cs *engine) if (ret) return ret; - return intel_engine_init_cmd_parser(engine); + return 0; } /** @@ -319,8 +321,8 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine) { intel_engine_cleanup_scratch(engine); - intel_engine_cleanup_cmd_parser(engine); i915_gem_render_state_fini(engine); intel_engine_fini_breadcrumbs(engine); + intel_engine_cleanup_cmd_parser(engine); i915_gem_batch_pool_fini(&engine->batch_pool); } -- 2.8.1 ___ 
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
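A minimal sketch of the non-fatal initialisation pattern from patch 1/9: validate the table, complain loudly, and leave the feature disabled rather than calling BUG(). The names struct parser, regs_sorted() and parser_init() are hypothetical, and requiring strictly ascending offsets is an assumption of this sketch rather than something the patch states.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical parser state: only tracks whether parsing is enabled. */
struct parser {
	bool enabled;
};

/* Returns false (and reports where) if the offsets are not strictly ascending. */
static bool regs_sorted(const uint32_t *regs, size_t count)
{
	for (size_t i = 1; i < count; i++) {
		if (regs[i] <= regs[i - 1]) {
			fprintf(stderr,
				"register table not sorted at index %zu (0x%08x after 0x%08x)\n",
				i, (unsigned int)regs[i], (unsigned int)regs[i - 1]);
			return false;
		}
	}
	return true;
}

/* On an invalid table, warn and leave the parser disabled instead of aborting. */
static void parser_init(struct parser *p, const uint32_t *regs, size_t count)
{
	p->enabled = false;

	if (!regs_sorted(regs, count))
		return;

	p->enabled = true;
}
```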
[Intel-gfx] [PATCH 4/9] drm/i915/cmdparser: Only cache the dst vmap
For simplicity, we want to continue using a contiguous mapping of the command buffer, but we can reduce the number of vmappings we hold by switching over to a page-by-page copy from the user batch buffer to the shadow. The cost for saving one linear mapping is about 5% in trivial workloads - which is more or less the overhead in calling kmap_atomic(). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_cmd_parser.c | 34 +++--- 1 file changed, 19 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 545c333663c0..4c903081604c 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -951,7 +951,8 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj, { unsigned int src_needs_clflush; unsigned int dst_needs_clflush; - void *src, *dst; + void *dst, *ptr; + int offset, n; int ret; ret = i915_gem_obj_prepare_shmem_read(src_obj, &src_needs_clflush); @@ -964,30 +965,33 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj, goto unpin_src; } - src = i915_gem_object_pin_map(src_obj, I915_MAP_WB); - if (IS_ERR(src)) { - dst = src; - goto unpin_dst; - } - dst = i915_gem_object_pin_map(dst_obj, I915_MAP_WB); if (IS_ERR(dst)) - goto unmap_src; - - src += batch_start_offset; - if (src_needs_clflush) - drm_clflush_virt_range(src, batch_len); + goto unpin_dst; + ptr = dst; + offset = offset_in_page(batch_start_offset); if (dst_needs_clflush & CLFLUSH_BEFORE) batch_len = roundup(batch_len, boot_cpu_data.x86_clflush_size); - memcpy(dst, src, batch_len); + for (n = batch_start_offset >> PAGE_SHIFT; batch_len; n++) { + int len = min_t(int, batch_len, PAGE_SIZE - offset); + void *vaddr; + + vaddr = kmap_atomic(i915_gem_object_get_page(src_obj, n)); + if (src_needs_clflush) + drm_clflush_virt_range(vaddr + offset, len); + memcpy(ptr, vaddr + offset, len); + kunmap_atomic(vaddr); + + ptr += len; + batch_len -= len; + offset = 0; + } /* dst_obj is 
returned with vmap pinned */ *needs_clflush_after = dst_needs_clflush & CLFLUSH_AFTER; -unmap_src: - i915_gem_object_unpin_map(src_obj); unpin_dst: i915_gem_object_unpin_pages(dst_obj); unpin_src: -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
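The page-by-page copy loop in patch 4/9 can be sketched in userspace as follows. This is an illustration only: the source is modelled as an array of page pointers, SKETCH_PAGE_SIZE is deliberately tiny, and the kmap_atomic()/clflush steps of the real code are omitted.

```c
#include <stddef.h>
#include <string.h>

/* Deliberately small "page" so the chunking is easy to follow. */
#define SKETCH_PAGE_SIZE 16u

/*
 * Copy len bytes starting at start_offset from a paged source into a
 * contiguous destination. Only the first chunk can start mid-page;
 * every later chunk starts at offset 0 of the next page.
 */
static void copy_from_pages(void *dst, unsigned char *const *pages,
			    size_t start_offset, size_t len)
{
	unsigned char *ptr = dst;
	size_t offset = start_offset % SKETCH_PAGE_SIZE;
	size_t n = start_offset / SKETCH_PAGE_SIZE;

	while (len) {
		size_t chunk = SKETCH_PAGE_SIZE - offset;

		if (chunk > len)
			chunk = len;

		memcpy(ptr, pages[n] + offset, chunk);

		ptr += chunk;
		len -= chunk;
		offset = 0;
		n++;
	}
}
```

The destination stays one linear mapping, which is what lets the parser keep walking the shadow batch as a plain array.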
[Intel-gfx] cmdparser perf improvement
From the moment the cmdparser was enabled (4.0) we have been getting reports of performance regressions, most notably on Baytrail: http://www.spinics.net/lists/dri-devel/msg80933.html msg->id:1428627643.3417.22.ca...@collabora.com Whilst this series doesn't make the cmdparser free, it does significantly reduce the overhead. (The cached vmappings and better hash were tested at the time and demonstrated to reduce the impact on the user's workload to the point where the new kernel was an improvement over the last known good.) This builds upon the regression fixes that stop the cmdparser falling over in the first place. -Chris
[Intel-gfx] [PATCH 2/9] drm/i915/cmdparser: Add the TIMESTAMP register for the other engines
Since I have been using the BCS_TIMESTAMP to measure latency of execution
upon the blitter ring, allow regular userspace to also read from that
register. They are already allowed RCS_TIMESTAMP!

Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 1882dc28c750..5fbd049f8095 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -458,6 +458,7 @@ static const struct drm_i915_reg_descriptor gen7_render_regs[] = {
 	REG32(GEN7_GPGPU_DISPATCHDIMX),
 	REG32(GEN7_GPGPU_DISPATCHDIMY),
 	REG32(GEN7_GPGPU_DISPATCHDIMZ),
+	REG64_IDX(RING_TIMESTAMP, BSD_RING_BASE),
 	REG64_IDX(GEN7_SO_NUM_PRIMS_WRITTEN, 0),
 	REG64_IDX(GEN7_SO_NUM_PRIMS_WRITTEN, 1),
 	REG64_IDX(GEN7_SO_NUM_PRIMS_WRITTEN, 2),
@@ -473,6 +474,7 @@ static const struct drm_i915_reg_descriptor gen7_render_regs[] = {
 	REG32(GEN7_L3SQCREG1),
 	REG32(GEN7_L3CNTLREG2),
 	REG32(GEN7_L3CNTLREG3),
+	REG64_IDX(RING_TIMESTAMP, BLT_RING_BASE),
 };
 
 static const struct drm_i915_reg_descriptor hsw_render_regs[] = {
@@ -502,7 +504,10 @@ static const struct drm_i915_reg_descriptor hsw_render_regs[] = {
 };
 
 static const struct drm_i915_reg_descriptor gen7_blt_regs[] = {
+	REG64_IDX(RING_TIMESTAMP, RENDER_RING_BASE),
+	REG64_IDX(RING_TIMESTAMP, BSD_RING_BASE),
 	REG32(BCS_SWCTRL),
+	REG64_IDX(RING_TIMESTAMP, BLT_RING_BASE),
 };
 
 static const struct drm_i915_reg_descriptor ivb_master_regs[] = {
-- 
2.8.1
[Intel-gfx] [PATCH 3/9] drm/i915/cmdparser: Use cached vmappings
The single largest factor in the overhead of parsing the commands is the setup of the virtual mapping to provide a contiguous block for the batch buffer. If we keep those vmappings around (against the better judgement of mm/vmalloc.c, which we offset by handwaving and looking suggestively at the shrinker), we can dramatically improve the performance of the parser for small batches (such as media workloads). Furthermore, we can use the prepare shmem read/write functions to determine how best to clflush the range (rather than every page of the object). The impact of caching both src/dst vmaps is +80% on ivb and +140% on byt for the throughput on small batches. (Caching just the dst vmap and iterating over the src, doing a page-by-page copy, is roughly 5% slower on both platforms. That may be an acceptable trade-off to eliminate one cached vmapping, and we may be able to reduce the per-page copying overhead further.) For *this* simple test case, the cmdparser is now within a factor of 2 of ideal performance.
Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_cmd_parser.c | 121 ++--- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 6 ++ 2 files changed, 47 insertions(+), 80 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 5fbd049f8095..545c333663c0 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -942,98 +942,57 @@ find_reg_in_tables(const struct drm_i915_reg_table *tables, return NULL; } -static u32 *vmap_batch(struct drm_i915_gem_object *obj, - unsigned start, unsigned len) -{ - int i; - void *addr = NULL; - struct sg_page_iter sg_iter; - int first_page = start >> PAGE_SHIFT; - int last_page = (len + start + 4095) >> PAGE_SHIFT; - int npages = last_page - first_page; - struct page **pages; - - pages = drm_malloc_ab(npages, sizeof(*pages)); - if (pages == NULL) { - DRM_DEBUG_DRIVER("Failed to get space for pages\n"); - goto finish; - } - - i = 0; - for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, first_page) { - pages[i++] = sg_page_iter_page(&sg_iter); - if (i == npages) - break; - } - - addr = vmap(pages, i, 0, PAGE_KERNEL); - if (addr == NULL) { - DRM_DEBUG_DRIVER("Failed to vmap pages\n"); - goto finish; - } - -finish: - if (pages) - drm_free_large(pages); - return (u32*)addr; -} - -/* Returns a vmap'd pointer to dest_obj, which the caller must unmap */ -static u32 *copy_batch(struct drm_i915_gem_object *dest_obj, +/* Returns a vmap'd pointer to dst_obj, which the caller must unmap */ +static u32 *copy_batch(struct drm_i915_gem_object *dst_obj, struct drm_i915_gem_object *src_obj, u32 batch_start_offset, - u32 batch_len) + u32 batch_len, + bool *needs_clflush_after) { - unsigned int needs_clflush; - void *src_base, *src; - void *dst = NULL; + unsigned int src_needs_clflush; + unsigned int dst_needs_clflush; + void *src, *dst; int ret; - if (batch_len > dest_obj->base.size || - batch_len + batch_start_offset > src_obj->base.size) - 
return ERR_PTR(-E2BIG); - - if (WARN_ON(dest_obj->pages_pin_count == 0)) - return ERR_PTR(-ENODEV); - - ret = i915_gem_obj_prepare_shmem_read(src_obj, &needs_clflush); - if (ret) { - DRM_DEBUG_DRIVER("CMD: failed to prepare shadow batch\n"); + ret = i915_gem_obj_prepare_shmem_read(src_obj, &src_needs_clflush); + if (ret) return ERR_PTR(ret); - } - src_base = vmap_batch(src_obj, batch_start_offset, batch_len); - if (!src_base) { - DRM_DEBUG_DRIVER("CMD: Failed to vmap batch\n"); - ret = -ENOMEM; + ret = i915_gem_obj_prepare_shmem_write(dst_obj, &dst_needs_clflush); + if (ret) { + dst = ERR_PTR(ret); goto unpin_src; } - ret = i915_gem_object_set_to_cpu_domain(dest_obj, true); - if (ret) { - DRM_DEBUG_DRIVER("CMD: Failed to set shadow batch to CPU\n"); - goto unmap_src; + src = i915_gem_object_pin_map(src_obj, I915_MAP_WB); + if (IS_ERR(src)) { + dst = src; + goto unpin_dst; } - dst = vmap_batch(dest_obj, 0, batch_len); - if (!dst) { - DRM_DEBUG_DRIVER("CMD: Failed to vmap shadow batch\n"); - ret = -ENOMEM; + dst = i915_gem_object_pin_map(dst_obj, I915_MAP_WB); + if (IS_ERR(dst)) goto unmap_src; - } - src = src_base + offset_in_pa
Re: [Intel-gfx] [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer
On 12/08/16 15:48, Goel, Akash wrote: On 8/12/2016 8:12 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel GuC firmware sends an interrupt to flush the log buffer when it becomes half full, so Driver doesn't really need to sample the complete buffer and can just copy only the newly written data by GuC into the local buffer, i.e. as per the read & write pointer values. Moreover the flush interrupt would generally come for one type of log buffer, when it becomes half full, so at that time the other 2 types of log buffer would comparatively have much lesser unread data in them. In case of overflow reported by GuC, Driver do need to copy the entire buffer as the whole buffer would contain the unread data. v2: Rebase. Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 40 +- 1 file changed, 34 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 1ca1866..8e0f360 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -889,7 +889,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state; struct guc_log_buffer_state log_buffer_state_local; void *src_data_ptr, *dst_data_ptr; -u32 i, buffer_size; +bool new_overflow; +u32 i, buffer_size, read_offset, write_offset, bytes_to_copy; if (!guc->log.buf_addr) return; @@ -912,10 +913,13 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) memcpy(&log_buffer_state_local, log_buffer_state, sizeof(struct guc_log_buffer_state)); buffer_size = log_buffer_state_local.size; +read_offset = log_buffer_state_local.read_ptr; +write_offset = log_buffer_state_local.sampled_write_ptr; guc->log.flush_count[i] += log_buffer_state_local.flush_to_file; if (log_buffer_state_local.buffer_full_cnt != guc->log.prev_overflow_count[i]) { Wrong alignment. 
You can try checkpatch.pl for all of those. Sorry for all the alignment & indentation issues. Should the above condition be written like this ? if (log_buffer_state_local.buffer_full_cnt != guc->log.prev_overflow_count[i]) { Yes, but checkpatch.pl is your friend. :) +new_overflow = 1; true/false since it is a bool fine will do that. guc->log.total_overflow_count[i] += (log_buffer_state_local.buffer_full_cnt - guc->log.prev_overflow_count[i]); @@ -929,7 +933,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) guc->log.prev_overflow_count[i] = log_buffer_state_local.buffer_full_cnt; DRM_ERROR_RATELIMITED("GuC log buffer overflow\n"); -} +} else +new_overflow = 0; if (log_buffer_snapshot_state) { /* First copy the state structure in local buffer */ @@ -941,13 +946,37 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) * for consistency set the write pointer value to same * value of sampled_write_ptr in the snapshot buffer. */ -log_buffer_snapshot_state->write_ptr = -log_buffer_snapshot_state->sampled_write_ptr; +log_buffer_snapshot_state->write_ptr = write_offset; log_buffer_snapshot_state++; /* Now copy the actual logs */ memcpy(dst_data_ptr, src_data_ptr, buffer_size); The confusing bit - the memcpy above still copies the whole buffer, no? Really very sorry for this blooper. No worries, it happens to everyone! Regards, Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
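Since the review above hinges on the memcpy still copying the whole buffer, here is a rough sketch of what "copy only the unread data" could look like for a circular log buffer. It is an interpretation of the commit message, not the code from the series: copy_unread() and its parameters are hypothetical, and it assumes read_off and write_off are both at most size and that equal offsets mean nothing new was written.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the unread part of a circular log from src into the same offsets
 * of dst (both are size bytes long). Returns the number of bytes copied.
 * On overflow the whole buffer is treated as unread.
 */
static size_t copy_unread(unsigned char *dst, const unsigned char *src,
			  size_t size, size_t read_off, size_t write_off,
			  bool overflow)
{
	if (overflow) {
		memcpy(dst, src, size);
		return size;
	}

	if (write_off >= read_off) {
		/* Unread data is one contiguous span. */
		memcpy(dst + read_off, src + read_off, write_off - read_off);
		return write_off - read_off;
	}

	/* The writer wrapped: copy the tail, then the head. */
	memcpy(dst + read_off, src + read_off, size - read_off);
	memcpy(dst, src, write_off);
	return (size - read_off) + write_off;
}
```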
Re: [Intel-gfx] [PATCH 05/20] drm/i915: Support for GuC interrupts
On 12/08/16 15:31, Goel, Akash wrote: On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote: +static void gen9_guc2host_events_work(struct work_struct *work) +{ +struct drm_i915_private *dev_priv = +container_of(work, struct drm_i915_private, guc.events_work); + +spin_lock_irq(&dev_priv->irq_lock); +/* Speed up work cancellation during disabling guc interrupts. */ +if (!dev_priv->guc.interrupts_enabled) { +spin_unlock_irq(&dev_priv->irq_lock); +return; I suppose locking for early exit is something about ensuring the worker sees the update to dev_priv->guc.interrupts_enabled done on another CPU? Yes locking (providing implicit barrier) will ensure that update made from another CPU is immediately visible to the worker. What if the disable happens after the unlock above? It would wait in disable until the irq handler exits. Most probably it will not have to wait, as irq handler would have completed if work item began the execution. Irq handler just queues the work item, which gets scheduled later on. Using the lock is beneficial for the case where the execution of work item and interrupt disabling is done around the same time. Ok maybe I am missing something. When can the interrupt disabling happen? Will it be controlled by the debugfs file or is it driver load/unload and suspend/resume? +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir) +{ +bool interrupts_enabled; + +if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) { +spin_lock(&dev_priv->irq_lock); +interrupts_enabled = dev_priv->guc.interrupts_enabled; +spin_unlock(&dev_priv->irq_lock); Not sure that taking a lock around only this read is needed. Again same reason as above, to make sure an update made on another CPU is immediately visible to the irq handler. I don't get it, see above. :) Here also If interrupt disabling & ISR execution happens around the same time then ISR might miss the reset of 'interrupts_enabled' flag and queue the new work. 
What if reset of interrupts_enabled happens just as the ISR releases the lock? And same applies to the case when interrupt is re-enabled, ISR might still see the 'interrupts_enabled' flag as false. It will eventually see the update though. +if (interrupts_enabled) { +/* Sample the log buffer flush related bits & clear them + * out now itself from the message identity register to + * minimize the probability of losing a flush interrupt, + * when there are back to back flush interrupts. + * There can be a new flush interrupt, for different log + * buffer type (like for ISR), whilst Host is handling + * one (for DPC). Since same bit is used in message + * register for ISR & DPC, it could happen that GuC + * sets the bit for 2nd interrupt but Host clears out + * the bit on handling the 1st interrupt. + */ +u32 msg = I915_READ(SOFT_SCRATCH(15)) & +(GUC2HOST_MSG_CRASH_DUMP_POSTED | + GUC2HOST_MSG_FLUSH_LOG_BUFFER); +if (msg) { +/* Clear the message bits that are handled */ +I915_WRITE(SOFT_SCRATCH(15), +I915_READ(SOFT_SCRATCH(15)) & ~msg); Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it twice? Thought reading it again (just before the update) is bit safer compared to reading it once, as there is a potential race problem here. GuC could also write to the SOFT_SCRATCH(15) register, set new events bit, while Host clears off the bit of handled events. Don't get it. If there is a race between read and write there still is, don't see how a second read makes it safer. Yes can't avoid the race completely by double reads, but can reduce the race window size. 
There was only one thing between the two reads, and that was "if (msg)": +u32 msg = I915_READ(SOFT_SCRATCH(15)) & +(GUC2HOST_MSG_CRASH_DUMP_POSTED | + GUC2HOST_MSG_FLUSH_LOG_BUFFER); +if (msg) { +/* Clear the message bits that are handled */ +I915_WRITE(SOFT_SCRATCH(15), +I915_READ(SOFT_SCRATCH(15)) & ~msg); Also I felt code looked better in current form, as macros GUC2HOST_MSG_CRASH_DUMP_POSTED & GUC2HOST_MSG_FLUSH_LOG_BUFFER were used only once. Will change as per the initial implementation. u32 msg = I915_READ(SOFT_SCRATCH(15)); if (msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED | GUC2HOST_MSG_FLUSH_LOG_BUFFER) { msg &= ~(GUC2HOST_MSG_CRASH_DUMP_POSTED | GUC2HOST_MSG_FLUSH_LOG_BUFFER); I915_WRITE(SOFT_SCRATCH(15), msg); } Or: u32 msg, flush; msg = I915_READ(SOFT_SCRATCH(15)); flush = msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED | GUC2HOST_MSG_FLUSH_LOG_BUFFER); if (flush
Re: [Intel-gfx] [PATCH 09/20] drm/i915: New lock to serialize the Host2GuC actions
On 8/12/2016 7:25 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel With the addition of new Host2GuC actions related to GuC logging, there is a need of a lock to serialize them, as they can execute concurrently with each other and also with other existing actions. v2: Use mutex in place of spinlock to serialize, as sleep can happen while waiting for the action's response from GuC. (Tvrtko) Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++ drivers/gpu/drm/i915/intel_guc.h | 3 +++ 2 files changed, 6 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 1a2d648..cb9672b 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -88,6 +88,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len) return -EINVAL; intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL); +mutex_lock(&guc->action_lock); I would probably take the mutex before grabbing forcewake as a general rule. Not that I think it matters in this case since we don't expect any contention on this one. Yes, I did not expect any contention on this mutex, hence I thought to use it just around the code where it is actually needed. Will move it before the forcewake, as you suggested, to conform to the rules. 
Best regards Akash dev_priv->guc.action_count += 1; dev_priv->guc.action_cmd = data[0]; @@ -126,6 +127,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len) } dev_priv->guc.action_status = status; +mutex_unlock(&guc->action_lock); intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL); return ret; @@ -1312,6 +1314,7 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv) return -ENOMEM; ida_init(&guc->ctx_ids); +mutex_init(&guc->action_lock); guc_create_log(guc); guc_create_ads(guc); diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h index 96ef7dc..e4ec8d8 100644 --- a/drivers/gpu/drm/i915/intel_guc.h +++ b/drivers/gpu/drm/i915/intel_guc.h @@ -156,6 +156,9 @@ struct intel_guc { uint64_t submissions[I915_NUM_ENGINES]; uint32_t last_seqno[I915_NUM_ENGINES]; + +/* To serialize the Host2GuC actions */ +struct mutex action_lock; }; /* intel_guc_loader.c */ With or without the mutex vs forcewake ordering change: Reviewed-by: Tvrtko Ursulin Regards, Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
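The ordering agreed on in this exchange (take the lock first, then the wake reference, and release in the reverse order) can be sketched in user space as below. This is an assumption-laden illustration, not the driver code: `pthread_mutex_t` stands in for the kernel mutex, `wake_get()`/`wake_put()` are hypothetical stand-ins for the forcewake get/put calls, and `send_action()` is a made-up name. One plausible motivation for the rule, not stated in the thread, is that a caller blocked on the lock then does not hold the hardware awake while it waits.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t action_lock = PTHREAD_MUTEX_INITIALIZER;
static int wake_refs;		/* stand-in for a forcewake reference count */
static int action_count;	/* state protected by action_lock */

static void wake_get(void)
{
	wake_refs++;
}

static void wake_put(void)
{
	assert(wake_refs > 0);
	wake_refs--;
}

/* Serialize actions: lock first, then wake reference; undo in reverse. */
static int send_action(unsigned int cmd)
{
	pthread_mutex_lock(&action_lock);
	wake_get();

	(void)cmd;		/* a real implementation would issue cmd here */
	action_count++;

	wake_put();
	pthread_mutex_unlock(&action_lock);

	return 0;
}
```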
Re: [Intel-gfx] [PATCH 10/20] drm/i915: Add stats for GuC log buffer flush interrupts
On 8/12/2016 7:56 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel GuC firmware sends an interrupt to flush the log buffer when it becomes half full. GuC firmware also tracks how many times the buffer overflowed. It would be useful to maintain a statistics of how many flush interrupts were received and for which type of log buffer, along with the overflow count of each buffer type. Augmented i915_log_info debugfs to report back these statistics. v2: - Update the logic to detect multiple overflows between the 2 flush interrupts and also log a message for overflow (Tvrtko) - Track the number of times there was no free sub buffer to capture the GuC log buffer. (Tvrtko) Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_debugfs.c| 28 drivers/gpu/drm/i915/i915_guc_submission.c | 19 +++ drivers/gpu/drm/i915/i915_irq.c| 2 ++ drivers/gpu/drm/i915/intel_guc.h | 7 +++ 4 files changed, 56 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 51b59d5..14e0dcf 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2539,6 +2539,32 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data) return 0; } +static void i915_guc_log_info(struct seq_file *m, + struct drm_i915_private *dev_priv) +{ +struct intel_guc *guc = &dev_priv->guc; + +seq_printf(m, "\nGuC logging stats:\n"); + +seq_printf(m, "\tISR: flush count %10u, overflow count %8u\n", +guc->log.flush_count[GUC_ISR_LOG_BUFFER], +guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]); + +seq_printf(m, "\tDPC: flush count %10u, overflow count %8u\n", +guc->log.flush_count[GUC_DPC_LOG_BUFFER], +guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]); + +seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n", +guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER], +guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]); Why is the width for overflow only 8 chars and not 10 like for 
flush since both are u32? Looks to be a discrepancy. I will check. Both should be 10 as per the max value of u32, which takes 10 digits in decimal form. + +seq_printf(m, "\tTotal flush interrupt count: %u\n", + guc->log.flush_interrupt_count); + +seq_printf(m, "\tCapture miss count: %u\n", + guc->log.capture_miss_count); +} + static void i915_guc_client_info(struct seq_file *m, struct drm_i915_private *dev_priv, struct i915_guc_client *client) @@ -2613,6 +2639,8 @@ static int i915_guc_info(struct seq_file *m, void *data) seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client); i915_guc_client_info(m, dev_priv, &client); +i915_guc_log_info(m, dev_priv); + /* Add more as required ... */ return 0; diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index cb9672b..1ca1866 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -913,6 +913,24 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) sizeof(struct guc_log_buffer_state)); buffer_size = log_buffer_state_local.size; +guc->log.flush_count[i] += log_buffer_state_local.flush_to_file; +if (log_buffer_state_local.buffer_full_cnt != +guc->log.prev_overflow_count[i]) { +guc->log.total_overflow_count[i] += +(log_buffer_state_local.buffer_full_cnt - + guc->log.prev_overflow_count[i]); + +if (log_buffer_state_local.buffer_full_cnt < +guc->log.prev_overflow_count[i]) { +/* buffer_full_cnt is a 4 bit counter */ +guc->log.total_overflow_count[i] += 16; +} + +guc->log.prev_overflow_count[i] = +log_buffer_state_local.buffer_full_cnt; +DRM_ERROR_RATELIMITED("GuC log buffer overflow\n"); +} + if (log_buffer_snapshot_state) { /* First copy the state structure in local buffer */ memcpy(log_buffer_snapshot_state, &log_buffer_state_local, @@ -953,6 +971,7 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) * getting consumed by User at a slow rate. 
*/ DRM_ERROR_RATELIMITED("no sub-buffer to capture log buffer\n"); +guc->log.capture_miss_count++; } } diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index d4d6f0a..b08d1d2 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -1705,6 +1705,8 @@ static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir)
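The overflow accounting in the hunk above adds the difference between the current and previous counter values and corrects by 16 when the 4-bit counter has wrapped. A compact way to express the same arithmetic is a masked subtraction, sketched below under stated assumptions: `struct overflow_stats` and `account_overflows()` are hypothetical names for this example only, and the 4-bit width is taken from the comment in the patch. As with the patch, a counter that advances by a full 16 between two samples cannot be detected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FULL_CNT_BITS 4u			/* per the patch comment */
#define FULL_CNT_MOD  (1u << FULL_CNT_BITS)	/* counter wraps at 16 */

struct overflow_stats {
	uint32_t prev;	/* last sampled counter value */
	uint32_t total;	/* running total of overflows */
};

/* Returns the number of overflows since the previous sample. */
static uint32_t account_overflows(struct overflow_stats *s, uint32_t cur)
{
	uint32_t delta;

	assert(s != NULL);
	assert(cur < FULL_CNT_MOD);

	/* Unsigned subtraction plus mask handles the wrap-around case. */
	delta = (cur - s->prev) & (FULL_CNT_MOD - 1);

	s->total += delta;
	s->prev = cur;

	return delta;
}
```

For example, with a previous value of 14 and a current value of 2, the counter went 14, 15, 0, 1, 2, so four overflows are accounted, which matches what the patch computes as (2 - 14) + 16.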
Re: [Intel-gfx] [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer
On 8/12/2016 8:12 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel GuC firmware sends an interrupt to flush the log buffer when it becomes half full, so Driver doesn't really need to sample the complete buffer and can just copy only the newly written data by GuC into the local buffer, i.e. as per the read & write pointer values. Moreover the flush interrupt would generally come for one type of log buffer, when it becomes half full, so at that time the other 2 types of log buffer would comparatively have much lesser unread data in them. In case of overflow reported by GuC, Driver do need to copy the entire buffer as the whole buffer would contain the unread data. v2: Rebase. Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 40 +- 1 file changed, 34 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 1ca1866..8e0f360 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -889,7 +889,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state; struct guc_log_buffer_state log_buffer_state_local; void *src_data_ptr, *dst_data_ptr; -u32 i, buffer_size; +bool new_overflow; +u32 i, buffer_size, read_offset, write_offset, bytes_to_copy; if (!guc->log.buf_addr) return; @@ -912,10 +913,13 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) memcpy(&log_buffer_state_local, log_buffer_state, sizeof(struct guc_log_buffer_state)); buffer_size = log_buffer_state_local.size; +read_offset = log_buffer_state_local.read_ptr; +write_offset = log_buffer_state_local.sampled_write_ptr; guc->log.flush_count[i] += log_buffer_state_local.flush_to_file; if (log_buffer_state_local.buffer_full_cnt != guc->log.prev_overflow_count[i]) { Wrong alignment. You can try checkpatch.pl for all of those. 
Sorry for all the alignment & indentation issues. Should the above condition be written like this? if (log_buffer_state_local.buffer_full_cnt != guc->log.prev_overflow_count[i]) { +new_overflow = 1; true/false since it is a bool Fine, will do that. guc->log.total_overflow_count[i] += (log_buffer_state_local.buffer_full_cnt - guc->log.prev_overflow_count[i]); @@ -929,7 +933,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) guc->log.prev_overflow_count[i] = log_buffer_state_local.buffer_full_cnt; DRM_ERROR_RATELIMITED("GuC log buffer overflow\n"); -} +} else +new_overflow = 0; if (log_buffer_snapshot_state) { /* First copy the state structure in local buffer */ @@ -941,13 +946,37 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) * for consistency set the write pointer value to same * value of sampled_write_ptr in the snapshot buffer. */ -log_buffer_snapshot_state->write_ptr = -log_buffer_snapshot_state->sampled_write_ptr; +log_buffer_snapshot_state->write_ptr = write_offset; log_buffer_snapshot_state++; /* Now copy the actual logs */ memcpy(dst_data_ptr, src_data_ptr, buffer_size); The confusing bit - the memcpy above still copies the whole buffer, no? Really very sorry for this blooper. 
Best regards Akash +if (unlikely(new_overflow)) { +/* copy the whole buffer in case of overflow */ +read_offset = 0; +write_offset = buffer_size; +} else if (unlikely((read_offset > buffer_size) || +(write_offset > buffer_size))) { +DRM_ERROR("invalid log buffer state\n"); +/* copy whole buffer as offsets are unreliable */ +read_offset = 0; +write_offset = buffer_size; +} + +/* Just copy the newly written data */ +if (read_offset <= write_offset) { +bytes_to_copy = write_offset - read_offset; +memcpy(dst_data_ptr + read_offset, + src_data_ptr + read_offset, bytes_to_copy); +} else { +bytes_to_copy = buffer_size - read_offset; +memcpy(dst_data_ptr + read_offset, + src_data_ptr + read_offset, bytes_to_copy); + +bytes_to_copy = write_offset; +memcpy(dst_data_ptr, src_data_ptr, bytes_to_copy); +} src_data_ptr += buffer_size; dst_data_ptr += buffer_size;
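The intended optimization, copying only the region between the read and write offsets and falling back to a full copy on overflow or implausible offsets, can be sketched as a standalone helper. This is a minimal illustration, not the patch itself: `copy_new_data()` and its parameters are hypothetical names, byte pointers replace the `void *` arithmetic of the kernel code, and the unconditional whole-buffer `memcpy` that the review points out is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy only the unread part of a circular log buffer from src to dst,
 * keeping each byte at the same offset. Returns the number of bytes
 * copied. On overflow, or if an offset is out of range, the whole
 * buffer is copied instead.
 */
static size_t copy_new_data(unsigned char *dst, const unsigned char *src,
			    size_t size, size_t rd, size_t wr, int overflow)
{
	size_t copied;

	assert(dst != NULL && src != NULL);

	if (overflow || rd > size || wr > size) {
		/* offsets are unusable or everything is unread */
		rd = 0;
		wr = size;
	}

	if (rd <= wr) {
		/* contiguous region [rd, wr) */
		copied = wr - rd;
		memcpy(dst + rd, src + rd, copied);
	} else {
		/* wrapped: tail [rd, size) followed by head [0, wr) */
		copied = size - rd;
		memcpy(dst + rd, src + rd, copied);
		memcpy(dst, src, wr);
		copied += wr;
	}

	return copied;
}
```

With an 8-byte buffer, a read offset of 6 and a write offset of 2, the helper copies bytes 6..7 and then bytes 0..1, four bytes in total, and leaves bytes 2..5 of the destination untouched.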
Re: [Intel-gfx] [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer
On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel GuC firmware sends an interrupt to flush the log buffer when it becomes half full, so Driver doesn't really need to sample the complete buffer and can just copy only the newly written data by GuC into the local buffer, i.e. as per the read & write pointer values. Moreover the flush interrupt would generally come for one type of log buffer, when it becomes half full, so at that time the other 2 types of log buffer would comparatively have much lesser unread data in them. In case of overflow reported by GuC, Driver do need to copy the entire buffer as the whole buffer would contain the unread data. v2: Rebase. Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 40 +- 1 file changed, 34 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 1ca1866..8e0f360 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -889,7 +889,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state; struct guc_log_buffer_state log_buffer_state_local; void *src_data_ptr, *dst_data_ptr; - u32 i, buffer_size; + bool new_overflow; + u32 i, buffer_size, read_offset, write_offset, bytes_to_copy; if (!guc->log.buf_addr) return; @@ -912,10 +913,13 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) memcpy(&log_buffer_state_local, log_buffer_state, sizeof(struct guc_log_buffer_state)); buffer_size = log_buffer_state_local.size; + read_offset = log_buffer_state_local.read_ptr; + write_offset = log_buffer_state_local.sampled_write_ptr; guc->log.flush_count[i] += log_buffer_state_local.flush_to_file; if (log_buffer_state_local.buffer_full_cnt != guc->log.prev_overflow_count[i]) { Wrong alignment. You can try checkpatch.pl for all of those. 
+ new_overflow = 1; true/false since it is a bool guc->log.total_overflow_count[i] += (log_buffer_state_local.buffer_full_cnt - guc->log.prev_overflow_count[i]); @@ -929,7 +933,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) guc->log.prev_overflow_count[i] = log_buffer_state_local.buffer_full_cnt; DRM_ERROR_RATELIMITED("GuC log buffer overflow\n"); - } + } else + new_overflow = 0; if (log_buffer_snapshot_state) { /* First copy the state structure in local buffer */ @@ -941,13 +946,37 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) * for consistency set the write pointer value to same * value of sampled_write_ptr in the snapshot buffer. */ - log_buffer_snapshot_state->write_ptr = - log_buffer_snapshot_state->sampled_write_ptr; + log_buffer_snapshot_state->write_ptr = write_offset; log_buffer_snapshot_state++; /* Now copy the actual logs */ memcpy(dst_data_ptr, src_data_ptr, buffer_size); The confusing bit - the memcpy above still copies the whole buffer, no? + if (unlikely(new_overflow)) { + /* copy the whole buffer in case of overflow */ + read_offset = 0; + write_offset = buffer_size; + } else if (unlikely((read_offset > buffer_size) || + (write_offset > buffer_size))) { + DRM_ERROR("invalid log buffer state\n"); + /* copy whole buffer as offsets are unreliable */ + read_offset = 0; + write_offset = buffer_size; + } + + /* Just copy the newly written data */ + if (read_offset <= write_offset) { + bytes_to_copy = write_offset - read_offset; + memcpy(dst_data_ptr + read_offset, +src_data_ptr + read_offset, bytes_to_copy); + } else { + bytes_to_copy = buffer_size - read_offset; + memcpy(dst_data_ptr + read_offset, +
Re: [Intel-gfx] [PATCH 05/20] drm/i915: Support for GuC interrupts
On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote: On 12/08/16 14:10, Goel, Akash wrote: On 8/12/2016 5:24 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Sagar Arun Kamble There are certain types of interrupts which Host can receive from GuC. GuC ukernel sends an interrupt to Host for certain events, for example to retrieve/consume the logs generated by ukernel. This patch adds support to receive interrupts from GuC but currently enables & partially handles only the interrupt sent by GuC ukernel. Future patches will add support for handling other interrupt types. v2: - Use common low level routines for PM IER/IIR programming (Chris) - Rename interrupt functions to gen9_xxx from gen8_xxx (Chris) - Replace disabling of wake ref asserts with rpm get/put (Chris) v3: - Update comments for more clarity. (Tvrtko) - Remove the masking of GuC interrupt, which was kept masked till the start of bottom half, it's not really needed as there is only a single instance of work item & wq is ordered. (Tvrtko) v4: - Rebase. - Rename guc_events to pm_guc_events so as to be indicative of the register/control block it is associated with. (Chris) - Add handling for back to back log buffer flush interrupts. v5: - Move the read & clearing of register, containing Guc2Host message bits, outside the irq spinlock. 
(Tvrtko) Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.h| 1 + drivers/gpu/drm/i915/i915_guc_submission.c | 5 ++ drivers/gpu/drm/i915/i915_irq.c| 100 +++-- drivers/gpu/drm/i915/i915_reg.h| 11 drivers/gpu/drm/i915/intel_drv.h | 3 + drivers/gpu/drm/i915/intel_guc.h | 4 ++ drivers/gpu/drm/i915/intel_guc_loader.c| 4 ++ 7 files changed, 124 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a608a5c..28ffac5 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1779,6 +1779,7 @@ struct drm_i915_private { u32 pm_imr; u32 pm_ier; u32 pm_rps_events; +u32 pm_guc_events; u32 pipestat_irq_mask[I915_MAX_PIPES]; struct i915_hotplug hotplug; diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index ad3b55f..c7c679f 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1071,6 +1071,8 @@ int intel_guc_suspend(struct drm_device *dev) if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS) return 0; +gen9_disable_guc_interrupts(dev_priv); + ctx = dev_priv->kernel_context; data[0] = HOST2GUC_ACTION_ENTER_S_STATE; @@ -1097,6 +1099,9 @@ int intel_guc_resume(struct drm_device *dev) if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS) return 0; +if (i915.guc_log_level >= 0) +gen9_enable_guc_interrupts(dev_priv); + ctx = dev_priv->kernel_context; data[0] = HOST2GUC_ACTION_EXIT_S_STATE; diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 5f93309..5f1974f 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct drm_i915_private *dev_priv, } while (0) static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir); +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir); /* For 
display hotplug interrupt */ static inline void @@ -411,6 +412,38 @@ void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv) gen6_reset_rps_interrupts(dev_priv); } +void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv) +{ +spin_lock_irq(&dev_priv->irq_lock); +gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events); +spin_unlock_irq(&dev_priv->irq_lock); +} + +void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv) +{ +spin_lock_irq(&dev_priv->irq_lock); +if (!dev_priv->guc.interrupts_enabled) { +WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) & +dev_priv->pm_guc_events); +dev_priv->guc.interrupts_enabled = true; +gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events); +} +spin_unlock_irq(&dev_priv->irq_lock); +} + +void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv) +{ +spin_lock_irq(&dev_priv->irq_lock); +dev_priv->guc.interrupts_enabled = false; + +gen6_disable_pm_irq(dev_priv, dev_priv->pm_guc_events); + +spin_unlock_irq(&dev_priv->irq_lock); +synchronize_irq(dev_priv->drm.irq); + +gen9_reset_guc_interrupts(dev_priv); +} + /** * bdw_update_port_irq - update DE port interrupt * @dev_priv: driver private @@ -1167,6 +1200,21 @@ static void gen6_pm_rps_work(struct work_struct *work) mutex_unlock(&dev_priv->rps.hw_
Re: [Intel-gfx] [PATCH 10/20] drm/i915: Add stats for GuC log buffer flush interrupts
On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel GuC firmware sends an interrupt to flush the log buffer when it becomes half full. GuC firmware also tracks how many times the buffer overflowed. It would be useful to maintain a statistics of how many flush interrupts were received and for which type of log buffer, along with the overflow count of each buffer type. Augmented i915_log_info debugfs to report back these statistics. v2: - Update the logic to detect multiple overflows between the 2 flush interrupts and also log a message for overflow (Tvrtko) - Track the number of times there was no free sub buffer to capture the GuC log buffer. (Tvrtko) Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_debugfs.c| 28 drivers/gpu/drm/i915/i915_guc_submission.c | 19 +++ drivers/gpu/drm/i915/i915_irq.c| 2 ++ drivers/gpu/drm/i915/intel_guc.h | 7 +++ 4 files changed, 56 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 51b59d5..14e0dcf 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2539,6 +2539,32 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data) return 0; } +static void i915_guc_log_info(struct seq_file *m, +struct drm_i915_private *dev_priv) +{ + struct intel_guc *guc = &dev_priv->guc; + + seq_printf(m, "\nGuC logging stats:\n"); + + seq_printf(m, "\tISR: flush count %10u, overflow count %8u\n", + guc->log.flush_count[GUC_ISR_LOG_BUFFER], + guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]); + + seq_printf(m, "\tDPC: flush count %10u, overflow count %8u\n", + guc->log.flush_count[GUC_DPC_LOG_BUFFER], + guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]); + + seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n", + guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER], + guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]); Why is the width for overflow only 8 chars and not 10 like for flush since both are u32? 
+ + seq_printf(m, "\tTotal flush interrupt count: %u\n", + guc->log.flush_interrupt_count); + + seq_printf(m, "\tCapture miss count: %u\n", + guc->log.capture_miss_count); +} + static void i915_guc_client_info(struct seq_file *m, struct drm_i915_private *dev_priv, struct i915_guc_client *client) @@ -2613,6 +2639,8 @@ static int i915_guc_info(struct seq_file *m, void *data) seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client); i915_guc_client_info(m, dev_priv, &client); + i915_guc_log_info(m, dev_priv); + /* Add more as required ... */ return 0; diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index cb9672b..1ca1866 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -913,6 +913,24 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) sizeof(struct guc_log_buffer_state)); buffer_size = log_buffer_state_local.size; + guc->log.flush_count[i] += log_buffer_state_local.flush_to_file; + if (log_buffer_state_local.buffer_full_cnt != + guc->log.prev_overflow_count[i]) { + guc->log.total_overflow_count[i] += + (log_buffer_state_local.buffer_full_cnt - +guc->log.prev_overflow_count[i]); + + if (log_buffer_state_local.buffer_full_cnt < + guc->log.prev_overflow_count[i]) { + /* buffer_full_cnt is a 4 bit counter */ + guc->log.total_overflow_count[i] += 16; + } + + guc->log.prev_overflow_count[i] = + log_buffer_state_local.buffer_full_cnt; + DRM_ERROR_RATELIMITED("GuC log buffer overflow\n"); + } + if (log_buffer_snapshot_state) { /* First copy the state structure in local buffer */ memcpy(log_buffer_snapshot_state, &log_buffer_state_local, @@ -953,6 +971,7 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) * getting consumed by User at a slow rate. */ DRM_ERROR_RATELIMITED("no sub-buffer to capture log buffer\n"); + guc->log.capture_miss_count++; } } diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
[Intel-gfx] ✗ Ro.CI.BAT: failure for series starting with [CI,01/31] drm/i915: Record the position of the start of the request
== Series Details == Series: series starting with [CI,01/31] drm/i915: Record the position of the start of the request URL : https://patchwork.freedesktop.org/series/11029/ State : failure == Summary == Series 11029v1 Series without cover letter http://patchwork.freedesktop.org/api/1.0/series/11029/revisions/1/mbox Test drv_module_reload_basic: pass -> SKIP (ro-hsw-i3-4010u) Test gem_exec_suspend: Subgroup basic-s3: pass -> DMESG-WARN (ro-bdw-i7-5600u) Test kms_cursor_legacy: Subgroup basic-cursor-vs-flip-varying-size: pass -> FAIL (ro-ilk1-i5-650) Subgroup basic-flip-vs-cursor-varying-size: fail -> PASS (ro-skl3-i5-6260u) pass -> FAIL (ro-bdw-i5-5250u) dmesg-fail -> PASS (fi-skl-i7-6700k) Test kms_pipe_crc_basic: Subgroup suspend-read-crc-pipe-b: dmesg-warn -> PASS (ro-bdw-i7-5600u) skip -> DMESG-WARN (ro-bdw-i5-5250u) fi-hsw-i7-4770k total:244 pass:222 dwarn:0 dfail:0 fail:0 skip:22 fi-kbl-qkkr total:244 pass:186 dwarn:29 dfail:0 fail:3 skip:26 fi-skl-i7-6700k total:244 pass:209 dwarn:4 dfail:1 fail:2 skip:28 fi-snb-i7-2600 total:244 pass:202 dwarn:0 dfail:0 fail:0 skip:42 ro-bdw-i5-5250u total:240 pass:218 dwarn:2 dfail:0 fail:2 skip:18 ro-bdw-i7-5600u total:240 pass:206 dwarn:1 dfail:0 fail:1 skip:32 ro-bsw-n3050 total:240 pass:194 dwarn:0 dfail:0 fail:4 skip:42 ro-byt-n2820 total:240 pass:197 dwarn:0 dfail:0 fail:3 skip:40 ro-hsw-i3-4010u total:240 pass:213 dwarn:0 dfail:0 fail:0 skip:27 ro-hsw-i7-4770r total:240 pass:185 dwarn:0 dfail:0 fail:0 skip:55 ro-ilk1-i5-650 total:235 pass:173 dwarn:0 dfail:0 fail:2 skip:60 ro-ivb-i7-3770 total:240 pass:205 dwarn:0 dfail:0 fail:0 skip:35 ro-ivb2-i7-3770 total:240 pass:209 dwarn:0 dfail:0 fail:0 skip:31 ro-skl3-i5-6260u total:240 pass:223 dwarn:0 dfail:0 fail:3 skip:14 Results at /archive/results/CI_IGT_test/RO_Patchwork_1855/ 9a79c0b drm-intel-nightly: 2016y-08m-12d-12h-12m-18s UTC integration manifest af56144 drm/i915: Record the RING_MODE register for post-mortem debugging 90ea1ef drm/i915: Only record active and 
pending requests upon a GPU hang 5a32c90 drm/i915: Print the batchbuffer offset next to BBADDR in error state dad1d48 drm/i915: Introduce i915_ggtt_offset() 1707d23 drm/i915: Track pinned VMA cf7576d drm/i915: Consolidate i915_vma_unpin_and_release() aeac2e6 drm/i915: Use VMA for wa_ctx tracking c938b0f drm/i915: Use VMA for render state page tracking 70a014e drm/i915: Use VMA as the primary tracker for semaphore page c10834c drm/i915/overlay: Use VMA as the primary tracker for images ccf275f drm/i915: Move common seqno reset to intel_engine_cs.c ec45caa drm/i915: Move common scratch allocation/destroy to intel_engine_cs.c 2a36ad5 drm/i915: Use VMA for scratch page tracking 1624cc0 drm/i915: Use VMA for ringbuffer tracking 4ce773a drm/i915: Move assertion for iomap access to i915_vma_pin_iomap fb0c322f drm/i915: Only change the context object's domain when binding 7c2a6f6 drm/i915: Use VMA as the primary object for context state 160017d drm/i915: Use VMA directly for checking tiling parameters ce4571b drm/i915: Convert fence computations to use vma directly 59e16d6 drm/i915: Track pinned vma inside guc 094e926 drm/i915: Add convenience wrappers for vma's object get/put 3015b87 drm/i915: Add fetch_and_zero() macro 15fbe90 drm/i915: Create a VMA for an object 6131737 drm/i915: Always set the vma->pages 9f0f991 drm/i915: Remove redundant WARN_ON from __i915_add_request() 1c06408 drm/i915: Reduce i915_gem_objects to only show object information fe2ce95 drm/i915: Focus debugfs/i915_gem_pinned to show only display pins c258579 drm/i915: Remove inactive/active list from debugfs be86468 drm/i915: Store the active context object on all engines upon error dc43be9 drm/i915: Reduce amount of duplicate buffer information captured on error 8f4ea2e drm/i915: Record the position of the start of the request ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
On 12/08/16 14:45, Goel, Akash wrote: On 8/12/2016 6:47 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Sagar Arun Kamble GuC ukernel sends an interrupt to Host to flush the log buffer and expects Host to correspondingly update the read pointer information in the state structure, once it has consumed the log buffer contents by copying them to a file or buffer. Even if Host couldn't copy the contents, it can still update the read pointer so that logging state is not disturbed on GuC side. v2: - Use a dedicated workqueue for handling flush interrupt. (Tvrtko) - Reduce the overall log buffer copying time by skipping the copy of crash buffer area for regular cases and copying only the state structure data in first page. v3: - Create a vmalloc mapping of log buffer. (Chris) - Cover the flush acknowledgment under rpm get & put.(Chris) - Revert the change of skipping the copy of crash dump area, as not really needed, will be covered by subsequent patch. v4: - Destroy the wq under the same condition in which it was created, pass dev_piv pointer instead of dev to newly added GuC function, add more comments & rename variable for clarity. (Tvrtko) Signed-off-by: Sagar Arun Kamble Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_drv.c| 14 +++ drivers/gpu/drm/i915/i915_guc_submission.c | 150 + drivers/gpu/drm/i915/i915_irq.c| 5 +- drivers/gpu/drm/i915/intel_guc.h | 3 + 4 files changed, 170 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 0fcd1c0..fc2da32 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv) if (dev_priv->hotplug.dp_wq == NULL) goto out_free_wq; +if (HAS_GUC_SCHED(dev_priv)) { This just reminded me that a previous patch had: +if (HAS_GUC_UCODE(dev)) +dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT; In the interrupt setup. 
I don't think there is a bug right now, but there is a disagreement between the two which would be good to resolve. This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED for correctness. I think. Sorry for inconsistency, Will use HAS_GUC_SCHED in the previous patch. As per Chris's comments will move the wq init/destroy to the GuC logging setup/teardown routines (guc_create_log_extras, guc_log_cleanup) You are fine with that ?. Yes thats OK I think. +/* Need a dedicated wq to process log buffer flush interrupts + * from GuC without much delay so as to avoid any loss of logs. + */ +dev_priv->guc.log.wq = +alloc_ordered_workqueue("i915-guc_log", 0); +if (dev_priv->guc.log.wq == NULL) +goto out_free_hotplug_dp_wq; +} + return 0; +out_free_hotplug_dp_wq: +destroy_workqueue(dev_priv->hotplug.dp_wq); out_free_wq: destroy_workqueue(dev_priv->wq); out_err: @@ -782,6 +794,8 @@ out_err: static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv) { +if (HAS_GUC_SCHED(dev_priv)) +destroy_workqueue(dev_priv->guc.log.wq); destroy_workqueue(dev_priv->hotplug.dp_wq); destroy_workqueue(dev_priv->wq); } diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index c7c679f..2635b67 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct intel_guc *guc, return host2guc_action(guc, data, ARRAY_SIZE(data)); } +static int host2guc_logbuffer_flush_complete(struct intel_guc *guc) +{ +u32 data[1]; + +data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE; + +return host2guc_action(guc, data, 1); +} + /* * Initialise, update, or clear doorbell data shared with the GuC * @@ -840,6 +849,127 @@ err: return NULL; } +static void guc_move_to_next_buf(struct intel_guc *guc) +{ +return; +} + +static void* guc_get_write_buffer(struct intel_guc *guc) +{ +return NULL; +} + +static void 
guc_read_update_log_buffer(struct intel_guc *guc) +{ +struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state; +struct guc_log_buffer_state log_buffer_state_local; +void *src_data_ptr, *dst_data_ptr; +u32 i, buffer_size; unsigned int i if you can be bothered. Fine, will do that for both i & buffer_size. buffer_size can match the type of log_buffer_state_local.size or use something else if more appropriate. But I remember that earlier, in one of the patches, you suggested using u32 as the type for some variables. Could you please share the guideline? Should u32, u64 be used when we are exactly sure of the range of the variable, like for variables containing register values?
Re: [Intel-gfx] [PATCH 09/20] drm/i915: New lock to serialize the Host2GuC actions
On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel With the addition of new Host2GuC actions related to GuC logging, a lock is needed to serialize them, as they can execute concurrently with each other and also with other existing actions. v2: Use mutex in place of spinlock to serialize, as sleep can happen while waiting for the action's response from GuC. (Tvrtko) Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++ drivers/gpu/drm/i915/intel_guc.h | 3 +++ 2 files changed, 6 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 1a2d648..cb9672b 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -88,6 +88,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len) return -EINVAL; intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL); + mutex_lock(&guc->action_lock); I would probably take the mutex before grabbing forcewake as a general rule. Not that I think it matters in this case, since we don't expect any contention on this one. 
dev_priv->guc.action_count += 1; dev_priv->guc.action_cmd = data[0]; @@ -126,6 +127,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len) } dev_priv->guc.action_status = status; + mutex_unlock(&guc->action_lock); intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL); return ret; @@ -1312,6 +1314,7 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv) return -ENOMEM; ida_init(&guc->ctx_ids); + mutex_init(&guc->action_lock); guc_create_log(guc); guc_create_ads(guc); diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h index 96ef7dc..e4ec8d8 100644 --- a/drivers/gpu/drm/i915/intel_guc.h +++ b/drivers/gpu/drm/i915/intel_guc.h @@ -156,6 +156,9 @@ struct intel_guc { uint64_t submissions[I915_NUM_ENGINES]; uint32_t last_seqno[I915_NUM_ENGINES]; + + /* To serialize the Host2GuC actions */ + struct mutex action_lock; }; /* intel_guc_loader.c */ With or without the mutex vs forcewake ordering change: Reviewed-by: Tvrtko Ursulin Regards, Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
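The ordering suggested in the review above (take the sleeping lock first, then grab forcewake, and release in the reverse order) can be sketched in user space. This is only an illustration under assumptions: struct fake_guc, forcewake_get()/forcewake_put() and send_action() are hypothetical stand-ins rather than driver code, a pthread mutex replaces the kernel's struct mutex, and the length limit is arbitrary.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for struct intel_guc; not the driver's layout. */
struct fake_guc {
	pthread_mutex_t action_lock; /* serializes host-to-GuC actions */
	int forcewake_refs;          /* models the forcewake reference count */
	uint32_t action_count;
	uint32_t action_cmd;
};

static void fake_guc_init(struct fake_guc *guc)
{
	pthread_mutex_init(&guc->action_lock, NULL);
	guc->forcewake_refs = 0;
	guc->action_count = 0;
	guc->action_cmd = 0;
}

/* Hypothetical forcewake helpers: just count references. */
static void forcewake_get(struct fake_guc *guc) { guc->forcewake_refs++; }
static void forcewake_put(struct fake_guc *guc) { guc->forcewake_refs--; }

/* Returns 0 on success, -1 on an invalid length (limit assumed). */
static int send_action(struct fake_guc *guc, const uint32_t *data, uint32_t len)
{
	if (len < 1 || len > 15)
		return -1;

	/* Lock first, so a blocked waiter is not holding forcewake. */
	pthread_mutex_lock(&guc->action_lock);
	forcewake_get(guc);

	guc->action_count += 1;
	guc->action_cmd = data[0];

	/* Release in the reverse order of acquisition. */
	forcewake_put(guc);
	pthread_mutex_unlock(&guc->action_lock);
	return 0;
}
```

As the review notes, the ordering probably does not matter for this lock since no contention is expected; the sketch only shows the general rule.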
Re: [Intel-gfx] [PATCH 08/20] drm/i915: Add a relay backed debugfs interface for capturing GuC logs
On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel Added a new debugfs interface, '/sys/kernel/debug/dri/guc_log', for the user to capture GuC firmware logs. The interface is implemented with the relay framework: the driver just uses the relay API to store snapshots of the GuC log buffer in the buffer managed by relay. A snapshot is taken when the GuC firmware sends a log buffer flush interrupt, and up to four snapshots can be stored in the relay buffer. The relay buffer is operated in a mode where it overwrites data not yet collected by the user. Besides the mmap method, through which the user can directly access the relay buffer contents, relay also supports the 'poll' method. Through a 'poll' call on the log file, the user learns whenever the driver takes a new snapshot of the log buffer, and so can run in tandem with the driver and capture the logs in a sustained/streaming manner, without any loss of data. v2: Defer the creation of relay channel & associated debugfs file, as debugfs setup is now done at the end of i915 Driver load. (Chris) v3: - Switch to no-overwrite mode for relay. - Fix the relay sub buffer switching sequence. v4: - Update i915 Kconfig to select RELAY config. (Tvrtko) - Log a message when there is no sub buffer available to capture the GuC log buffer. 
(Tvrtko) - Increase the number of relay sub buffers to 8 from 4, to have sufficient buffering for boot time logs Suggested-by: Chris Wilson Signed-off-by: Sourab Gupta Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/Kconfig | 1 + drivers/gpu/drm/i915/i915_drv.c| 2 + drivers/gpu/drm/i915/i915_guc_submission.c | 206 - drivers/gpu/drm/i915/intel_guc.h | 3 + 4 files changed, 209 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 7769e46..fc900d2 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -11,6 +11,7 @@ config DRM_I915 select DRM_KMS_HELPER select DRM_PANEL select DRM_MIPI_DSI + select RELAY # i915 depends on ACPI_VIDEO when ACPI is enabled # but for select to work, need to select ACPI_VIDEO's dependencies, ick select BACKLIGHT_LCD_SUPPORT if ACPI diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index fc2da32..cb8c943 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1145,6 +1145,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv) /* Reveal our presence to userspace */ if (drm_dev_register(dev, 0) == 0) { i915_debugfs_register(dev_priv); + i915_guc_register(dev_priv); i915_setup_sysfs(dev); } else DRM_ERROR("Failed to register driver for userspace access!\n"); @@ -1183,6 +1184,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv) intel_opregion_unregister(dev_priv); i915_teardown_sysfs(&dev_priv->drm); + i915_guc_unregister(dev_priv); i915_debugfs_unregister(dev_priv); drm_dev_unregister(&dev_priv->drm); diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 2635b67..1a2d648 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -23,6 +23,8 @@ */ #include #include +#include +#include #include "i915_drv.h" #include "intel_guc.h" @@ -851,12 +853,33 @@ err: static 
void guc_move_to_next_buf(struct intel_guc *guc) { - return; + /* Make sure the updates made in the sub buffer are visible when +* Consumer sees the following update to offset inside the sub buffer. +*/ + smp_wmb(); + + /* All data has been written, so now move the offset of sub buffer. */ + relay_reserve(guc->log.relay_chan, guc->log.obj->base.size); + + /* Switch to the next sub buffer */ + relay_flush(guc->log.relay_chan); } static void* guc_get_write_buffer(struct intel_guc *guc) { - return NULL; + /* FIXME: Cover the check under a lock ? */ Need to resolve before r-b in any case. + if (!guc->log.relay_chan) + return NULL; + + /* Just get the base address of a new sub buffer and copy data into it +* ourselves. NULL will be returned in no-overwrite mode, if all sub +* buffers are full. Could have used the relay_write() to indirectly +* copy the data, but that would have been bit convoluted, as we need to +* write to only certain locations inside a sub buffer which cannot be +* done without using relay_reserve() along with relay_write(). So its +* better to use relay_reserve() alone. +*/ + return relay_reserve(guc->log.relay_chan, 0); } static void guc_read_update
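The sub buffer handling described in the comments above (get the base of a free sub buffer, copy into it directly, then publish it, with NULL returned in no-overwrite mode when everything is full) can be modelled with a small user-space ring. This is a sketch under assumptions, not the relay API: struct toy_chan and the toy_* functions are hypothetical, the sizes are arbitrary, and a C11 release fence stands in for smp_wmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define N_SUBBUFS   8   /* assumed count, mirroring the "8 from 4" note */
#define SUBBUF_SIZE 64  /* arbitrary size for the sketch */

/* Hypothetical no-overwrite channel; not the kernel's struct rchan. */
struct toy_chan {
	unsigned char data[N_SUBBUFS][SUBBUF_SIZE];
	size_t produced; /* sub buffers published by the producer */
	size_t consumed; /* sub buffers released by the consumer */
};

/* Return the base of the next free sub buffer, or NULL if all are full. */
static void *toy_get_write_buffer(struct toy_chan *chan)
{
	if (chan->produced - chan->consumed >= N_SUBBUFS)
		return NULL;
	return chan->data[chan->produced % N_SUBBUFS];
}

/* Publish the sub buffer that was just filled. */
static void toy_move_to_next_buf(struct toy_chan *chan)
{
	/* Order the data writes before the index update the consumer sees. */
	atomic_thread_fence(memory_order_release);
	chan->produced++;
}

/* Consumer side: mark the oldest published sub buffer as read. */
static void toy_consume(struct toy_chan *chan)
{
	if (chan->consumed < chan->produced)
		chan->consumed++;
}

/* Copy one snapshot in; returns 0 on success, -1 if it had to be dropped. */
static int toy_capture(struct toy_chan *chan, const void *snap, size_t len)
{
	void *dst;

	if (len > SUBBUF_SIZE)
		return -1;
	dst = toy_get_write_buffer(chan);
	if (!dst)
		return -1; /* the driver logs a message in this case */
	memcpy(dst, snap, len);
	toy_move_to_next_buf(chan);
	return 0;
}
```

In this model the producer drops a snapshot rather than overwriting unread data, which is the no-overwrite behaviour the v3 changelog entry refers to.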
[Intel-gfx] [CI 20/31] drm/i915: Move common scratch allocation/destroy to intel_engine_cs.c
Since the scratch allocation and cleanup is shared by all engine submission backends, move it out of the legacy intel_ringbuffer.c and into the new home for common routines, intel_engine_cs.c Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/intel_engine_cs.c | 50 + drivers/gpu/drm/i915/intel_lrc.c| 1 - drivers/gpu/drm/i915/intel_ringbuffer.c | 50 - drivers/gpu/drm/i915/intel_ringbuffer.h | 4 +-- 4 files changed, 51 insertions(+), 54 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index 186c12d07f99..7104dec5e893 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -195,6 +195,54 @@ void intel_engine_setup_common(struct intel_engine_cs *engine) i915_gem_batch_pool_init(engine, &engine->batch_pool); } +int intel_engine_create_scratch(struct intel_engine_cs *engine, int size) +{ + struct drm_i915_gem_object *obj; + struct i915_vma *vma; + int ret; + + WARN_ON(engine->scratch); + + obj = i915_gem_object_create_stolen(&engine->i915->drm, size); + if (!obj) + obj = i915_gem_object_create(&engine->i915->drm, size); + if (IS_ERR(obj)) { + DRM_ERROR("Failed to allocate scratch page\n"); + return PTR_ERR(obj); + } + + vma = i915_vma_create(obj, &engine->i915->ggtt.base, NULL); + if (IS_ERR(vma)) { + ret = PTR_ERR(vma); + goto err_unref; + } + + ret = i915_vma_pin(vma, 0, 4096, PIN_GLOBAL | PIN_HIGH); + if (ret) + goto err_unref; + + engine->scratch = vma; + DRM_DEBUG_DRIVER("%s pipe control offset: 0x%08llx\n", +engine->name, vma->node.start); + return 0; + +err_unref: + i915_gem_object_put(obj); + return ret; +} + +static void intel_engine_cleanup_scratch(struct intel_engine_cs *engine) +{ + struct i915_vma *vma; + + vma = fetch_and_zero(&engine->scratch); + if (!vma) + return; + + i915_vma_unpin(vma); + i915_vma_put(vma); +} + /** * intel_engines_init_common - initialize cengine state which might require hw 
access * @engine: Engine to initialize. @@ -226,6 +274,8 @@ int intel_engine_init_common(struct intel_engine_cs *engine) */ void intel_engine_cleanup_common(struct intel_engine_cs *engine) { + intel_engine_cleanup_scratch(engine); + intel_engine_cleanup_cmd_parser(engine); intel_engine_fini_breadcrumbs(engine); i915_gem_batch_pool_fini(&engine->batch_pool); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 42999ba02152..56c904e2dc98 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1844,7 +1844,6 @@ int logical_render_ring_init(struct intel_engine_cs *engine) else engine->init_hw = gen8_init_render_ring; engine->init_context = gen8_init_rcs_context; - engine->cleanup = intel_engine_cleanup_scratch; engine->emit_flush = gen8_emit_flush_render; engine->emit_request = gen8_emit_request_render; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 7ce912f8d96c..c89aea55bc10 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -613,54 +613,6 @@ out: return ret; } -void intel_engine_cleanup_scratch(struct intel_engine_cs *engine) -{ - struct i915_vma *vma; - - vma = fetch_and_zero(&engine->scratch); - if (!vma) - return; - - i915_vma_unpin(vma); - i915_vma_put(vma); -} - -int intel_engine_create_scratch(struct intel_engine_cs *engine, int size) -{ - struct drm_i915_gem_object *obj; - struct i915_vma *vma; - int ret; - - WARN_ON(engine->scratch); - - obj = i915_gem_object_create_stolen(&engine->i915->drm, size); - if (!obj) - obj = i915_gem_object_create(&engine->i915->drm, size); - if (IS_ERR(obj)) { - DRM_ERROR("Failed to allocate scratch page\n"); - return PTR_ERR(obj); - } - - vma = i915_vma_create(obj, &engine->i915->ggtt.base, NULL); - if (IS_ERR(vma)) { - ret = PTR_ERR(vma); - goto err_unref; - } - - ret = i915_vma_pin(vma, 0, 4096, PIN_GLOBAL | PIN_HIGH); - if (ret) - goto err_unref; - - 
engine->scratch = vma; - DRM_DEBUG_DRIVER("%s pipe control offset: 0x%08llx\n", -engine->name, vma->node.start); - return 0; - -err_unref: - i915_gem_object_put(obj); - return ret; -} - static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req) { struct intel_r
[Intel-gfx] [CI 18/31] drm/i915: Use VMA for ringbuffer tracking
Use the GGTT VMA as the primary cookie for handing ring objects as the most common action upon the ring is mapping and unmapping which act upon the VMA itself. By restructuring the code to work with the ring VMA, we can shrink the code and remove a few cycles from context pinning. v2: Move the flush of the object back to before the first pin. We use the am-I-bound? query to only have to check the flush on the first bind and so avoid stalling on active rings. Lots of little renames and small hoops. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c| 2 +- drivers/gpu/drm/i915/i915_gpu_error.c | 4 +- drivers/gpu/drm/i915/i915_guc_submission.c | 16 +- drivers/gpu/drm/i915/intel_lrc.c | 17 +- drivers/gpu/drm/i915/intel_ringbuffer.c| 243 ++--- drivers/gpu/drm/i915/intel_ringbuffer.h| 14 +- 6 files changed, 139 insertions(+), 157 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index fcda4e7da127..2da37c196ef0 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -356,7 +356,7 @@ static int per_file_ctx_stats(int id, void *ptr, void *data) if (ctx->engine[n].state) per_file_stats(0, ctx->engine[n].state->obj, data); if (ctx->engine[n].ring) - per_file_stats(0, ctx->engine[n].ring->obj, data); + per_file_stats(0, ctx->engine[n].ring->vma->obj, data); } return 0; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 35394d393edc..4a19494a4f6f 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1128,12 +1128,12 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, ee->cpu_ring_tail = ring->tail; ee->ringbuffer = i915_error_ggtt_object_create(dev_priv, - ring->obj); + ring->vma->obj); } ee->hws_page = i915_error_ggtt_object_create(dev_priv, - engine->status_page.obj); + engine->status_page.vma->obj); ee->wa_ctx = 
i915_error_ggtt_object_create(dev_priv, engine->wa_ctx.obj); diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 4f0f173f9754..c40b92e212fa 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -343,7 +343,6 @@ static void guc_init_ctx_desc(struct intel_guc *guc, struct intel_context *ce = &ctx->engine[engine->id]; uint32_t guc_engine_id = engine->guc_id; struct guc_execlist_context *lrc = &desc.lrc[guc_engine_id]; - struct drm_i915_gem_object *obj; /* TODO: We have a design issue to be solved here. Only when we * receive the first batch, we know which engine is used by the @@ -358,17 +357,14 @@ static void guc_init_ctx_desc(struct intel_guc *guc, lrc->context_desc = lower_32_bits(ce->lrc_desc); /* The state page is after PPHWSP */ - gfx_addr = ce->state->node.start; - lrc->ring_lcra = gfx_addr + LRC_STATE_PN * PAGE_SIZE; + lrc->ring_lcra = + ce->state->node.start + LRC_STATE_PN * PAGE_SIZE; lrc->context_id = (client->ctx_index << GUC_ELC_CTXID_OFFSET) | (guc_engine_id << GUC_ELC_ENGINE_OFFSET); - obj = ce->ring->obj; - gfx_addr = i915_gem_obj_ggtt_offset(obj); - - lrc->ring_begin = gfx_addr; - lrc->ring_end = gfx_addr + obj->base.size - 1; - lrc->ring_next_free_location = gfx_addr; + lrc->ring_begin = ce->ring->vma->node.start; + lrc->ring_end = lrc->ring_begin + ce->ring->size - 1; + lrc->ring_next_free_location = lrc->ring_begin; lrc->ring_current_tail_pointer_value = 0; desc.engines_used |= (1 << guc_engine_id); @@ -943,7 +939,7 @@ static void guc_create_ads(struct intel_guc *guc) * to find it. */ engine = &dev_priv->engine[RCS]; - ads->golden_context_lrca = engine->status_page.gfx_addr; + ads->golden_context_lrca = engine->status_page.ggtt_offset; for_each_engine(engine, dev_priv) ads->eng_state_size[engine->guc_id] = intel_lr_context_size(engine); diff --git a/drivers/gpu/drm/i915/int
[Intel-gfx] [CI 26/31] drm/i915: Consolidate i915_vma_unpin_and_release()
In a few places, we repeat a call to clear a pointer to a vma whilst unpinning and releasing a reference to its owner. Refactor those into a common function. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_gtt.c| 12 drivers/gpu/drm/i915/i915_gem_gtt.h| 1 + drivers/gpu/drm/i915/i915_guc_submission.c | 21 - drivers/gpu/drm/i915/intel_engine_cs.c | 9 + drivers/gpu/drm/i915/intel_lrc.c | 9 + drivers/gpu/drm/i915/intel_ringbuffer.c| 8 +--- 6 files changed, 20 insertions(+), 40 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 738a474c5afa..d15eb1d71341 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -3674,3 +3674,15 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma) __i915_vma_pin(vma); return ptr; } + +void i915_vma_unpin_and_release(struct i915_vma **p_vma) +{ + struct i915_vma *vma; + + vma = fetch_and_zero(p_vma); + if (!vma) + return; + + i915_vma_unpin(vma); + i915_vma_put(vma); +} diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index a2691943a404..ec538fcc9c20 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -232,6 +232,7 @@ struct i915_vma * i915_vma_create(struct drm_i915_gem_object *obj, struct i915_address_space *vm, const struct i915_ggtt_view *view); +void i915_vma_unpin_and_release(struct i915_vma **p_vma); static inline bool i915_vma_is_ggtt(const struct i915_vma *vma) { diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index c40b92e212fa..e7dbc64ec1da 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -653,19 +653,6 @@ err: return vma; } -/** - * guc_release_vma() - Release gem object allocated for GuC usage - * @vma: gem obj to be released - */ -static void guc_release_vma(struct i915_vma *vma) -{ - if (!vma) - 
return; - - i915_vma_unpin(vma); - i915_vma_put(vma); -} - static void guc_client_free(struct drm_i915_private *dev_priv, struct i915_guc_client *client) @@ -690,7 +677,7 @@ guc_client_free(struct drm_i915_private *dev_priv, kunmap(kmap_to_page(client->client_base)); } - guc_release_vma(client->vma); + i915_vma_unpin_and_release(&client->vma); if (client->ctx_index != GUC_INVALID_CTX_ID) { guc_fini_ctx_desc(guc, client); @@ -1048,12 +1035,12 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv) { struct intel_guc *guc = &dev_priv->guc; - guc_release_vma(fetch_and_zero(&guc->ads_vma)); - guc_release_vma(fetch_and_zero(&guc->log_vma)); + i915_vma_unpin_and_release(&guc->ads_vma); + i915_vma_unpin_and_release(&guc->log_vma); if (guc->ctx_pool_vma) ida_destroy(&guc->ctx_ids); - guc_release_vma(fetch_and_zero(&guc->ctx_pool_vma)); + i915_vma_unpin_and_release(&guc->ctx_pool_vma); } /** diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index 573f642a74f8..f02d66bbec4b 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -279,14 +279,7 @@ err_unref: static void intel_engine_cleanup_scratch(struct intel_engine_cs *engine) { - struct i915_vma *vma; - - vma = fetch_and_zero(&engine->scratch); - if (!vma) - return; - - i915_vma_unpin(vma); - i915_vma_put(vma); + i915_vma_unpin_and_release(&engine->scratch); } /** diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 64cb04e63512..2673fb4f817b 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1193,14 +1193,7 @@ err: static void lrc_destroy_wa_ctx_obj(struct intel_engine_cs *engine) { - struct i915_vma *vma; - - vma = fetch_and_zero(&engine->wa_ctx.vma); - if (!vma) - return; - - i915_vma_unpin(vma); - i915_vma_put(vma); + i915_vma_unpin_and_release(&engine->wa_ctx.vma); } static int intel_init_workaround_bb(struct intel_engine_cs *engine) diff --git 
a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 30b066140b0c..65ef172e8761 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1257,14 +1257,8 @@ static int init_render_ring(struct intel_engine_cs *engine) static void render_ring_cleanup(struct intel_engine_cs *engine) { struct drm_i915_private *dev_priv = engine->i915; - struct i915_vma *vma; - - vma = fetch_a
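The pattern this patch consolidates (clear the owner's pointer, then unpin and drop the reference) can be shown in a self-contained form. The sketch below is a user-space analogue under assumptions: struct toy_vma, its counters and every toy_* function are hypothetical, and toy_fetch_and_zero() is a plain function standing in for the driver's fetch_and_zero() macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical refcounted, pinnable object; not the driver's struct i915_vma. */
struct toy_vma {
	int pin_count;
	int refcount;
};

static int toy_freed; /* counts freed objects, for demonstration only */

static struct toy_vma *toy_vma_create_pinned(void)
{
	struct toy_vma *vma = malloc(sizeof(*vma));

	if (!vma)
		return NULL;
	vma->pin_count = 1;
	vma->refcount = 1;
	return vma;
}

static void toy_vma_unpin(struct toy_vma *vma)
{
	vma->pin_count--;
}

static void toy_vma_put(struct toy_vma *vma)
{
	if (--vma->refcount == 0) {
		free(vma);
		toy_freed++;
	}
}

/* Read *slot and clear it, so the owner never keeps a stale pointer. */
static struct toy_vma *toy_fetch_and_zero(struct toy_vma **slot)
{
	struct toy_vma *vma = *slot;

	*slot = NULL;
	return vma;
}

/* One helper replaces the repeated fetch / NULL check / unpin / put sequence. */
static void toy_vma_unpin_and_release(struct toy_vma **p_vma)
{
	struct toy_vma *vma = toy_fetch_and_zero(p_vma);

	if (!vma)
		return;
	toy_vma_unpin(vma);
	toy_vma_put(vma);
}
```

Passing the address of the pointer is what makes a second call harmless: the slot is already NULL, so the helper returns early.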
[Intel-gfx] [CI 22/31] drm/i915/overlay: Use VMA as the primary tracker for images
Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/intel_overlay.c | 39 1 file changed, 22 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c index 90f3ab424e01..d930e3a4a9cd 100644 --- a/drivers/gpu/drm/i915/intel_overlay.c +++ b/drivers/gpu/drm/i915/intel_overlay.c @@ -171,8 +171,8 @@ struct overlay_registers { struct intel_overlay { struct drm_i915_private *i915; struct intel_crtc *crtc; - struct drm_i915_gem_object *vid_bo; - struct drm_i915_gem_object *old_vid_bo; + struct i915_vma *vma; + struct i915_vma *old_vma; bool active; bool pfit_active; u32 pfit_vscale_ratio; /* shifted-point number, (1<<12) == 1.0 */ @@ -317,15 +317,17 @@ static void intel_overlay_release_old_vid_tail(struct i915_gem_active *active, { struct intel_overlay *overlay = container_of(active, typeof(*overlay), last_flip); - struct drm_i915_gem_object *obj = overlay->old_vid_bo; + struct i915_vma *vma; - i915_gem_track_fb(obj, NULL, - INTEL_FRONTBUFFER_OVERLAY(overlay->crtc->pipe)); + vma = fetch_and_zero(&overlay->old_vma); + if (WARN_ON(!vma)) + return; - i915_gem_object_ggtt_unpin(obj); - i915_gem_object_put(obj); + i915_gem_track_fb(vma->obj, NULL, + INTEL_FRONTBUFFER_OVERLAY(overlay->crtc->pipe)); - overlay->old_vid_bo = NULL; + i915_gem_object_unpin_from_display_plane(vma->obj, &i915_ggtt_view_normal); + i915_vma_put(vma); } static void intel_overlay_off_tail(struct i915_gem_active *active, @@ -333,15 +335,15 @@ static void intel_overlay_off_tail(struct i915_gem_active *active, { struct intel_overlay *overlay = container_of(active, typeof(*overlay), last_flip); - struct drm_i915_gem_object *obj = overlay->vid_bo; + struct i915_vma *vma; /* never have the overlay hw on without showing a frame */ - if (WARN_ON(!obj)) + vma = fetch_and_zero(&overlay->vma); + if (WARN_ON(!vma)) return; - i915_gem_object_ggtt_unpin(obj); - i915_gem_object_put(obj); - overlay->vid_bo = NULL; + 
i915_gem_object_unpin_from_display_plane(vma->obj, &i915_ggtt_view_normal); + i915_vma_put(vma); overlay->crtc->overlay = NULL; overlay->crtc = NULL; @@ -421,7 +423,7 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay) /* Only wait if there is actually an old frame to release to * guarantee forward progress. */ - if (!overlay->old_vid_bo) + if (!overlay->old_vma) return 0; if (I915_READ(ISR) & I915_OVERLAY_PLANE_FLIP_PENDING_INTERRUPT) { @@ -744,6 +746,7 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay, struct drm_i915_private *dev_priv = overlay->i915; u32 swidth, swidthsw, sheight, ostride; enum pipe pipe = overlay->crtc->pipe; + struct i915_vma *vma; lockdep_assert_held(&dev_priv->drm.struct_mutex); WARN_ON(!drm_modeset_is_locked(&dev_priv->drm.mode_config.connection_mutex)); @@ -757,6 +760,8 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay, if (ret != 0) return ret; + vma = i915_gem_obj_to_ggtt_view(new_bo, &i915_ggtt_view_normal); + ret = i915_gem_object_put_fence(new_bo); if (ret) goto out_unpin; @@ -834,11 +839,11 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay, if (ret) goto out_unpin; - i915_gem_track_fb(overlay->vid_bo, new_bo, + i915_gem_track_fb(overlay->vma->obj, new_bo, INTEL_FRONTBUFFER_OVERLAY(pipe)); - overlay->old_vid_bo = overlay->vid_bo; - overlay->vid_bo = new_bo; + overlay->old_vma = overlay->vma; + overlay->vma = vma; intel_frontbuffer_flip(dev_priv, INTEL_FRONTBUFFER_OVERLAY(pipe)); -- 2.8.1
[Intel-gfx] [CI 23/31] drm/i915: Use VMA as the primary tracker for semaphore page
Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c | 2 +- drivers/gpu/drm/i915/i915_drv.h | 4 +-- drivers/gpu/drm/i915/i915_gpu_error.c | 16 - drivers/gpu/drm/i915/intel_engine_cs.c | 12 --- drivers/gpu/drm/i915/intel_ringbuffer.c | 60 +++-- drivers/gpu/drm/i915/intel_ringbuffer.h | 4 +-- 6 files changed, 55 insertions(+), 43 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 2da37c196ef0..fb483df1afd6 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -3145,7 +3145,7 @@ static int i915_semaphore_status(struct seq_file *m, void *unused) struct page *page; uint64_t *seqno; - page = i915_gem_object_get_page(dev_priv->semaphore_obj, 0); + page = i915_gem_object_get_page(dev_priv->semaphore->obj, 0); seqno = (uint64_t *)kmap_atomic(page); for_each_engine_id(engine, dev_priv, id) { diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 259425d99e17..50dc3613c61c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -733,7 +733,7 @@ struct drm_i915_error_state { u64 fence[I915_MAX_NUM_FENCES]; struct intel_overlay_error_state *overlay; struct intel_display_error_state *display; - struct drm_i915_error_object *semaphore_obj; + struct drm_i915_error_object *semaphore; struct drm_i915_error_engine { int engine_id; @@ -1750,7 +1750,7 @@ struct drm_i915_private { struct pci_dev *bridge_dev; struct i915_gem_context *kernel_context; struct intel_engine_cs engine[I915_NUM_ENGINES]; - struct drm_i915_gem_object *semaphore_obj; + struct i915_vma *semaphore; u32 next_seqno; struct drm_dma_handle *status_page_dmah; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index b80d2a6f56b3..da8aa86ad0c9 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -549,7 +549,7 @@ int 
i915_error_state_to_str(struct drm_i915_error_state_buf *m, } } - if ((obj = error->semaphore_obj)) { + if ((obj = error->semaphore)) { err_printf(m, "Semaphore page = 0x%08x\n", lower_32_bits(obj->gtt_offset)); for (elt = 0; elt < PAGE_SIZE/16; elt += 4) { @@ -640,7 +640,7 @@ static void i915_error_state_free(struct kref *error_ref) kfree(ee->waiters); } - i915_error_object_free(error->semaphore_obj); + i915_error_object_free(error->semaphore); for (i = 0; i < ARRAY_SIZE(error->active_bo); i++) kfree(error->active_bo[i]); @@ -876,7 +876,7 @@ static void gen8_record_semaphore_state(struct drm_i915_error_state *error, struct intel_engine_cs *to; enum intel_engine_id id; - if (!error->semaphore_obj) + if (!error->semaphore) return; for_each_engine_id(to, dev_priv, id) { @@ -889,7 +889,7 @@ static void gen8_record_semaphore_state(struct drm_i915_error_state *error, signal_offset = (GEN8_SIGNAL_OFFSET(engine, id) & (PAGE_SIZE - 1)) / 4; - tmp = error->semaphore_obj->pages[0]; + tmp = error->semaphore->pages[0]; idx = intel_engine_sync_index(engine, to); ee->semaphore_mboxes[idx] = tmp[signal_offset]; @@ -1061,11 +1061,9 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, struct drm_i915_gem_request *request; int i, count; - if (dev_priv->semaphore_obj) { - error->semaphore_obj = - i915_error_ggtt_object_create(dev_priv, - dev_priv->semaphore_obj); - } + error->semaphore = + i915_error_ggtt_object_create(dev_priv, + dev_priv->semaphore->obj); for (i = 0; i < I915_NUM_ENGINES; i++) { struct intel_engine_cs *engine = &dev_priv->engine[i]; diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index 829624571ca4..573f642a74f8 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -179,12 +179,16 @@ void intel_engine_init_seqno(struct intel_engine_cs *engine, u32 seqno) if (HAS_VEBOX(dev_priv)) I915_WRITE(RING_SYNC_2(engine->mmio_base), 0); } - if (dev_priv->semaphore_obj) 
{ - struct drm_i915_gem_object *obj = dev_priv->semaphore_obj; - struct page *page = i915_gem_object_get_dirty_page(obj, 0); -
[Intel-gfx] [CI 28/31] drm/i915: Introduce i915_ggtt_offset()
This little helper only exists to safely discard the upper unused 32bits of the general 64-bit VMA address - as we know that all Global GTT currently are less than 4GiB in size and so that the upper bits must be zero. In many places, we use a u32 for the global GTT offset and we want to document where we are discarding the full VMA offset. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c| 2 +- drivers/gpu/drm/i915/i915_drv.h| 2 +- drivers/gpu/drm/i915/i915_gem.c| 11 +-- drivers/gpu/drm/i915/i915_gem_context.c| 6 -- drivers/gpu/drm/i915/i915_gem_gtt.h| 9 + drivers/gpu/drm/i915/i915_guc_submission.c | 15 --- drivers/gpu/drm/i915/intel_display.c | 10 +++--- drivers/gpu/drm/i915/intel_engine_cs.c | 4 ++-- drivers/gpu/drm/i915/intel_fbdev.c | 6 +++--- drivers/gpu/drm/i915/intel_guc_loader.c| 6 +++--- drivers/gpu/drm/i915/intel_lrc.c | 20 +++- drivers/gpu/drm/i915/intel_overlay.c | 10 ++ drivers/gpu/drm/i915/intel_ringbuffer.c| 28 ++-- 13 files changed, 70 insertions(+), 59 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 21961304284e..82652ad28cd4 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2008,7 +2008,7 @@ static void i915_dump_lrc_obj(struct seq_file *m, if (vma->flags & I915_VMA_GLOBAL_BIND) seq_printf(m, "\tBound in GGTT at 0x%08x\n", - lower_32_bits(vma->node.start)); + i915_ggtt_offset(vma)); if (i915_gem_object_get_pages(vma->obj)) { seq_puts(m, "\tFailed to get pages for context object\n\n"); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index bbee45acedeb..bd58878de77b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3330,7 +3330,7 @@ static inline unsigned long i915_gem_object_ggtt_offset(struct drm_i915_gem_object *o, const struct i915_ggtt_view *view) { - return i915_gem_object_to_ggtt(o, view)->node.start; + return 
i915_ggtt_offset(i915_gem_object_to_ggtt(o, view)); } /* i915_gem_fence.c */ diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 07f7d3da5457..8bd2fa7644d5 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -758,7 +758,7 @@ i915_gem_gtt_pread(struct drm_device *dev, i915_gem_object_pin_pages(obj); } else { - node.start = vma->node.start; + node.start = i915_ggtt_offset(vma); node.allocated = false; ret = i915_gem_object_put_fence(obj); if (ret) @@ -1062,7 +1062,7 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915, i915_gem_object_pin_pages(obj); } else { - node.start = vma->node.start; + node.start = i915_ggtt_offset(vma); node.allocated = false; ret = i915_gem_object_put_fence(obj); if (ret) @@ -1703,7 +1703,7 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf) goto err_unpin; /* Finally, remap it using the new GTT offset */ - pfn = ggtt->mappable_base + vma->node.start; + pfn = ggtt->mappable_base + i915_ggtt_offset(vma); pfn >>= PAGE_SHIFT; if (unlikely(view.type == I915_GGTT_VIEW_PARTIAL)) { @@ -3750,10 +3750,9 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, WARN(i915_vma_is_pinned(vma), "bo is already pinned in ggtt with incorrect alignment:" -" offset=%08x %08x, req.alignment=%llx, req.map_and_fenceable=%d," +" offset=%08x, req.alignment=%llx, req.map_and_fenceable=%d," " obj->map_and_fenceable=%d\n", -upper_32_bits(vma->node.start), -lower_32_bits(vma->node.start), +i915_ggtt_offset(vma), alignment, !!(flags & PIN_MAPPABLE), obj->map_and_fenceable); diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index e566167d9441..98d2956f91f4 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -631,7 +631,8 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags) intel_ring_emit(ring, MI_NOOP); intel_ring_emit(ring, MI_SET_CONTEXT); - intel_ring_emit(ring, 
req->ctx->engine[RCS].state->node.start | flags); + intel_ring_emit(ring, + i915_ggtt_offset(req->ctx->engine[RCS].state) | flags);
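The body of the helper is not visible in the truncated diff above, so the following is only a guess at its shape based on the commit message: narrow the 64-bit VMA offset to 32 bits while checking that nothing is discarded. struct toy_node and the toy_* names are hypothetical, and plain assert() stands in for whatever debug check the driver uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical node with only the fields this sketch needs. */
struct toy_node {
	uint64_t start; /* 64-bit offset of the mapping */
	uint64_t size;  /* length of the mapping in bytes */
};

static uint32_t toy_upper_32_bits(uint64_t v) { return (uint32_t)(v >> 32); }
static uint32_t toy_lower_32_bits(uint64_t v) { return (uint32_t)v; }

/*
 * Narrow the offset to 32 bits. The checks document the assumption from
 * the commit message: the global GTT is below 4GiB, so the upper bits of
 * both the first and the last byte of the mapping must be zero.
 */
static uint32_t toy_ggtt_offset(const struct toy_node *node)
{
	assert(node->size > 0);
	assert(toy_upper_32_bits(node->start) == 0);
	assert(toy_upper_32_bits(node->start + node->size - 1) == 0);
	return toy_lower_32_bits(node->start);
}
```

Routing every truncation through one named function makes the places where the full offset is dropped easy to find, which is the stated purpose of the patch.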
[Intel-gfx] [CI 29/31] drm/i915: Print the batchbuffer offset next to BBADDR in error state
It is useful when looking at captured error states to check the recorded BBADDR register (the address of the last batchbuffer instruction loaded) against the expected offset of the batch buffer, and so do a quick check that (a) the capture is true or (b) HEAD hasn't wandered off into the badlands. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c | 25 - drivers/gpu/drm/i915/i915_drv.h | 3 +++ drivers/gpu/drm/i915/i915_gem_context.c | 4 drivers/gpu/drm/i915/i915_gem_request.c | 6 -- drivers/gpu/drm/i915/i915_gem_request.h | 3 --- drivers/gpu/drm/i915/i915_gpu_error.c | 28 +++- 6 files changed, 46 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 82652ad28cd4..61e12a0f08d4 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -460,6 +460,8 @@ static int i915_gem_object_info(struct seq_file *m, void* data) print_context_stats(m, dev_priv); list_for_each_entry_reverse(file, &dev->filelist, lhead) { struct file_stats stats; + struct drm_i915_file_private *file_priv = file->driver_priv; + struct drm_i915_gem_request *request; struct task_struct *task; memset(&stats, 0, sizeof(stats)); @@ -473,10 +475,17 @@ static int i915_gem_object_info(struct seq_file *m, void* data) * still alive (e.g. get_pid(current) => fork() => exit()). * Therefore, we need to protect this ->comm access using RCU. */ + mutex_lock(&dev->struct_mutex); + request = list_first_entry_or_null(&file_priv->mm.request_list, + struct drm_i915_gem_request, + client_list); rcu_read_lock(); - task = pid_task(file->pid, PIDTYPE_PID); + task = pid_task(request && request->ctx->pid ? + request->ctx->pid : file->pid, + PIDTYPE_PID); print_file_stats(m, task ? 
task->comm : "", stats); rcu_read_unlock(); + mutex_unlock(&dev->struct_mutex); } mutex_unlock(&dev->filelist_mutex); @@ -658,12 +667,11 @@ static int i915_gem_request_info(struct seq_file *m, void *data) seq_printf(m, "%s requests: %d\n", engine->name, count); list_for_each_entry(req, &engine->request_list, link) { + struct pid *pid = req->ctx->pid; struct task_struct *task; rcu_read_lock(); - task = NULL; - if (req->pid) - task = pid_task(req->pid, PIDTYPE_PID); + task = pid ? pid_task(pid, PIDTYPE_PID) : NULL; seq_printf(m, "%x @ %d: %s [%d]\n", req->fence.seqno, (int) (jiffies - req->emitted_jiffies), @@ -1952,18 +1960,17 @@ static int i915_context_status(struct seq_file *m, void *unused) list_for_each_entry(ctx, &dev_priv->context_list, link) { seq_printf(m, "HW context %u ", ctx->hw_id); - if (IS_ERR(ctx->file_priv)) { - seq_puts(m, "(deleted) "); - } else if (ctx->file_priv) { - struct pid *pid = ctx->file_priv->file->pid; + if (ctx->pid) { struct task_struct *task; - task = get_pid_task(pid, PIDTYPE_PID); + task = get_pid_task(ctx->pid, PIDTYPE_PID); if (task) { seq_printf(m, "(%s [%d]) ", task->comm, task->pid); put_task_struct(task); } + } else if (IS_ERR(ctx->file_priv)) { + seq_puts(m, "(deleted) "); } else { seq_puts(m, "(kernel) "); } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index bd58878de77b..bb7d8130dbfd 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -775,6 +775,7 @@ struct drm_i915_error_state { struct drm_i915_error_object { int page_count; u64 gtt_offset; + u64 gtt_size; u32 *pages[0]; } *ringbuffer, *batchbuffer, *wa_batchbuffer, *ctx, *hws_page; @@ -782,6 +783,7 @@ struct drm_i915_error_state { struct drm_i915_error_request { long jiffies; +
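As a rough illustration of the quick check described in the commit message, the userspace sketch below tests whether a recorded BBADDR falls inside the captured batch buffer. The names `toy_error_object` and `bbaddr_in_batch` are hypothetical and do not appear in the patch; only the `gtt_offset`/`gtt_size` pair mirrors the fields of `drm_i915_error_object`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the captured object; only the two fields
 * used here mirror drm_i915_error_object in the patch. */
struct toy_error_object {
	uint64_t gtt_offset;	/* start of the batch in the GTT */
	uint64_t gtt_size;	/* size of the batch in bytes */
};

/* Returns true if bbaddr lies within [gtt_offset, gtt_offset + gtt_size).
 * The subtraction form avoids overflow when offset + size wraps. */
bool bbaddr_in_batch(const struct toy_error_object *batch, uint64_t bbaddr)
{
	assert(batch != NULL);
	return bbaddr >= batch->gtt_offset &&
	       bbaddr - batch->gtt_offset < batch->gtt_size;
}
```

A BBADDR outside that range would suggest either a bad capture or a HEAD that has wandered away from the batch, which is the distinction the printed offset is meant to make visible.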
[Intel-gfx] [CI 24/31] drm/i915: Use VMA for render state page tracking
Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_render_state.c | 40 +++- drivers/gpu/drm/i915/i915_gem_render_state.h | 2 +- 2 files changed, 23 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c index 57fd767a2d79..95b7e9afd5f8 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.c +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c @@ -30,8 +30,7 @@ struct render_state { const struct intel_renderstate_rodata *rodata; - struct drm_i915_gem_object *obj; - u64 ggtt_offset; + struct i915_vma *vma; u32 aux_batch_size; u32 aux_batch_offset; }; @@ -73,7 +72,7 @@ render_state_get_rodata(const struct drm_i915_gem_request *req) static int render_state_setup(struct render_state *so) { - struct drm_device *dev = so->obj->base.dev; + struct drm_device *dev = so->vma->vm->dev; const struct intel_renderstate_rodata *rodata = so->rodata; const bool has_64bit_reloc = INTEL_GEN(dev) >= 8; unsigned int i = 0, reloc_index = 0; @@ -81,18 +80,18 @@ static int render_state_setup(struct render_state *so) u32 *d; int ret; - ret = i915_gem_object_set_to_cpu_domain(so->obj, true); + ret = i915_gem_object_set_to_cpu_domain(so->vma->obj, true); if (ret) return ret; - page = i915_gem_object_get_dirty_page(so->obj, 0); + page = i915_gem_object_get_dirty_page(so->vma->obj, 0); d = kmap(page); while (i < rodata->batch_items) { u32 s = rodata->batch[i]; if (i * 4 == rodata->reloc[reloc_index]) { - u64 r = s + so->ggtt_offset; + u64 r = s + so->vma->node.start; s = lower_32_bits(r); if (has_64bit_reloc) { if (i + 1 >= rodata->batch_items || @@ -154,7 +153,7 @@ static int render_state_setup(struct render_state *so) kunmap(page); - ret = i915_gem_object_set_to_gtt_domain(so->obj, false); + ret = i915_gem_object_set_to_gtt_domain(so->vma->obj, false); if (ret) return ret; @@ -175,6 +174,7 @@ err_out: int i915_gem_render_state_init(struct drm_i915_gem_request *req) { struct 
render_state so; + struct drm_i915_gem_object *obj; int ret; if (WARN_ON(req->engine->id != RCS)) @@ -187,21 +187,25 @@ int i915_gem_render_state_init(struct drm_i915_gem_request *req) if (so.rodata->batch_items * 4 > 4096) return -EINVAL; - so.obj = i915_gem_object_create(&req->i915->drm, 4096); - if (IS_ERR(so.obj)) - return PTR_ERR(so.obj); + obj = i915_gem_object_create(&req->i915->drm, 4096); + if (IS_ERR(obj)) + return PTR_ERR(obj); - ret = i915_gem_object_ggtt_pin(so.obj, NULL, 0, 0, 0); - if (ret) + so.vma = i915_vma_create(obj, &req->i915->ggtt.base, NULL); + if (IS_ERR(so.vma)) { + ret = PTR_ERR(so.vma); goto err_obj; + } - so.ggtt_offset = i915_gem_obj_ggtt_offset(so.obj); + ret = i915_vma_pin(so.vma, 0, 0, PIN_GLOBAL); + if (ret) + goto err_obj; ret = render_state_setup(&so); if (ret) goto err_unpin; - ret = req->engine->emit_bb_start(req, so.ggtt_offset, + ret = req->engine->emit_bb_start(req, so.vma->node.start, so.rodata->batch_items * 4, I915_DISPATCH_SECURE); if (ret) @@ -209,7 +213,7 @@ int i915_gem_render_state_init(struct drm_i915_gem_request *req) if (so.aux_batch_size > 8) { ret = req->engine->emit_bb_start(req, -(so.ggtt_offset + +(so.vma->node.start + so.aux_batch_offset), so.aux_batch_size, I915_DISPATCH_SECURE); @@ -217,10 +221,10 @@ int i915_gem_render_state_init(struct drm_i915_gem_request *req) goto err_unpin; } - i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), req, 0); + i915_vma_move_to_active(so.vma, req, 0); err_unpin: - i915_gem_object_ggtt_unpin(so.obj); + i915_vma_unpin(so.vma); err_obj: - i915_gem_object_put(so.obj); + i915_gem_object_put(obj); return ret; } diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.h b/drivers/gpu/drm/i915/i915_gem_render_state.h index c44fca8599bb..18cce3f06e9c 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.h +++ b/drivers/gpu/drm/i915/i915_gem_render_state.h @@ -24,7
[Intel-gfx] [CI 25/31] drm/i915: Use VMA for wa_ctx tracking
Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gpu_error.c | 2 +- drivers/gpu/drm/i915/intel_lrc.c| 58 ++--- drivers/gpu/drm/i915/intel_ringbuffer.h | 4 +-- 3 files changed, 35 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index da8aa86ad0c9..09219809488d 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1134,7 +1134,7 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, engine->status_page.vma->obj); ee->wa_ctx = i915_error_ggtt_object_create(dev_priv, - engine->wa_ctx.obj); + engine->wa_ctx.vma->obj); count = 0; list_for_each_entry(request, &engine->request_list, link) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 56c904e2dc98..64cb04e63512 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1165,45 +1165,51 @@ static int gen9_init_perctx_bb(struct intel_engine_cs *engine, static int lrc_setup_wa_ctx_obj(struct intel_engine_cs *engine, u32 size) { - int ret; + struct drm_i915_gem_object *obj; + struct i915_vma *vma; + int err; - engine->wa_ctx.obj = i915_gem_object_create(&engine->i915->drm, - PAGE_ALIGN(size)); - if (IS_ERR(engine->wa_ctx.obj)) { - DRM_DEBUG_DRIVER("alloc LRC WA ctx backing obj failed.\n"); - ret = PTR_ERR(engine->wa_ctx.obj); - engine->wa_ctx.obj = NULL; - return ret; - } + obj = i915_gem_object_create(&engine->i915->drm, PAGE_ALIGN(size)); + if (IS_ERR(obj)) + return PTR_ERR(obj); - ret = i915_gem_object_ggtt_pin(engine->wa_ctx.obj, NULL, - 0, PAGE_SIZE, PIN_HIGH); - if (ret) { - DRM_DEBUG_DRIVER("pin LRC WA ctx backing obj failed: %d\n", -ret); - i915_gem_object_put(engine->wa_ctx.obj); - return ret; + vma = i915_vma_create(obj, &engine->i915->ggtt.base, NULL); + if (IS_ERR(vma)) { + err = PTR_ERR(vma); + goto err; } + err = i915_vma_pin(vma, 0, PAGE_SIZE, PIN_GLOBAL | PIN_HIGH); + if 
(err) + goto err; + + engine->wa_ctx.vma = vma; return 0; + +err: + i915_gem_object_put(obj); + return err; } static void lrc_destroy_wa_ctx_obj(struct intel_engine_cs *engine) { - if (engine->wa_ctx.obj) { - i915_gem_object_ggtt_unpin(engine->wa_ctx.obj); - i915_gem_object_put(engine->wa_ctx.obj); - engine->wa_ctx.obj = NULL; - } + struct i915_vma *vma; + + vma = fetch_and_zero(&engine->wa_ctx.vma); + if (!vma) + return; + + i915_vma_unpin(vma); + i915_vma_put(vma); } static int intel_init_workaround_bb(struct intel_engine_cs *engine) { - int ret; + struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx; uint32_t *batch; uint32_t offset; struct page *page; - struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx; + int ret; WARN_ON(engine->id != RCS); @@ -1226,7 +1232,7 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine) return ret; } - page = i915_gem_object_get_dirty_page(wa_ctx->obj, 0); + page = i915_gem_object_get_dirty_page(wa_ctx->vma->obj, 0); batch = kmap_atomic(page); offset = 0; @@ -2019,9 +2025,9 @@ populate_lr_context(struct i915_gem_context *ctx, RING_INDIRECT_CTX(engine->mmio_base), 0); ASSIGN_CTX_REG(reg_state, CTX_RCS_INDIRECT_CTX_OFFSET, RING_INDIRECT_CTX_OFFSET(engine->mmio_base), 0); - if (engine->wa_ctx.obj) { + if (engine->wa_ctx.vma) { struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx; - uint32_t ggtt_offset = i915_gem_obj_ggtt_offset(wa_ctx->obj); + u32 ggtt_offset = wa_ctx->vma->node.start; reg_state[CTX_RCS_INDIRECT_CTX+1] = (ggtt_offset + wa_ctx->indirect_ctx.offset * sizeof(uint32_t)) | diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index cb40785e7677..e3777572c70e 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -123,12 +123,12 @@ struct drm_i915_reg_table; *an option for fu
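The teardown in lrc_destroy_wa_ctx_obj() relies on fetch_and_zero() to take the pointer and clear the field in one step, so a second call becomes a no-op. A minimal userspace sketch of that idiom follows, assuming a plain pointer field. The driver's macro is type-generic; `fetch_and_zero_ptr`, `toy_engine` and `toy_destroy` are hypothetical, pointer-only stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pointer-only analogue of the driver's fetch_and_zero():
 * return the old value of *slot and leave NULL behind. */
void *fetch_and_zero_ptr(void **slot)
{
	void *old;

	assert(slot != NULL);
	old = *slot;
	*slot = NULL;
	return old;
}

/* Toy engine with a release counter so a double release is observable. */
struct toy_engine {
	void *wa_vma;		/* stands in for engine->wa_ctx.vma */
	int release_count;
};

/* Idempotent teardown: the second call sees NULL and returns early. */
void toy_destroy(struct toy_engine *engine)
{
	void *vma = fetch_and_zero_ptr(&engine->wa_vma);

	if (!vma)
		return;

	engine->release_count++;	/* unpin and put would happen here */
}
```

Calling toy_destroy() twice on the same engine releases the resource only once, which is the property the NULL check in the patch depends on.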
[Intel-gfx] [CI 15/31] drm/i915: Use VMA as the primary object for context state
When working with contexts, we most frequently want the GGTT VMA for the context state, first and foremost. Since the object is available via the VMA, we need only then store the VMA. v2: Formatting tweaks to debugfs output, restored some comments removed in the next patch Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c| 34 drivers/gpu/drm/i915/i915_drv.h| 3 +- drivers/gpu/drm/i915/i915_gem_context.c| 51 +--- drivers/gpu/drm/i915/i915_gpu_error.c | 7 ++-- drivers/gpu/drm/i915/i915_guc_submission.c | 6 +-- drivers/gpu/drm/i915/intel_lrc.c | 64 +++--- drivers/gpu/drm/i915/intel_ringbuffer.c| 6 +-- 7 files changed, 86 insertions(+), 85 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 32d26b6c4bca..fcda4e7da127 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -354,7 +354,7 @@ static int per_file_ctx_stats(int id, void *ptr, void *data) for (n = 0; n < ARRAY_SIZE(ctx->engine); n++) { if (ctx->engine[n].state) - per_file_stats(0, ctx->engine[n].state, data); + per_file_stats(0, ctx->engine[n].state->obj, data); if (ctx->engine[n].ring) per_file_stats(0, ctx->engine[n].ring->obj, data); } @@ -1977,7 +1977,7 @@ static int i915_context_status(struct seq_file *m, void *unused) seq_printf(m, "%s: ", engine->name); seq_putc(m, ce->initialised ? 
'I' : 'i'); if (ce->state) - describe_obj(m, ce->state); + describe_obj(m, ce->state->obj); if (ce->ring) describe_ctx_ring(m, ce->ring); seq_putc(m, '\n'); @@ -1995,36 +1995,34 @@ static void i915_dump_lrc_obj(struct seq_file *m, struct i915_gem_context *ctx, struct intel_engine_cs *engine) { - struct drm_i915_gem_object *ctx_obj = ctx->engine[engine->id].state; + struct i915_vma *vma = ctx->engine[engine->id].state; struct page *page; - uint32_t *reg_state; int j; - unsigned long ggtt_offset = 0; seq_printf(m, "CONTEXT: %s %u\n", engine->name, ctx->hw_id); - if (ctx_obj == NULL) { - seq_puts(m, "\tNot allocated\n"); + if (!vma) { + seq_puts(m, "\tFake context\n"); return; } - if (!i915_gem_obj_ggtt_bound(ctx_obj)) - seq_puts(m, "\tNot bound in GGTT\n"); - else - ggtt_offset = i915_gem_obj_ggtt_offset(ctx_obj); + if (vma->flags & I915_VMA_GLOBAL_BIND) + seq_printf(m, "\tBound in GGTT at 0x%08x\n", + lower_32_bits(vma->node.start)); - if (i915_gem_object_get_pages(ctx_obj)) { - seq_puts(m, "\tFailed to get pages for context object\n"); + if (i915_gem_object_get_pages(vma->obj)) { + seq_puts(m, "\tFailed to get pages for context object\n\n"); return; } - page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN); - if (!WARN_ON(page == NULL)) { - reg_state = kmap_atomic(page); + page = i915_gem_object_get_page(vma->obj, LRC_STATE_PN); + if (page) { + u32 *reg_state = kmap_atomic(page); for (j = 0; j < 0x600 / sizeof(u32) / 4; j += 4) { - seq_printf(m, "\t[0x%08lx] 0x%08x 0x%08x 0x%08x 0x%08x\n", - ggtt_offset + 4096 + (j * 4), + seq_printf(m, + "\t[0x%04x] 0x%08x 0x%08x 0x%08x 0x%08x\n", + j * 4, reg_state[j], reg_state[j + 1], reg_state[j + 2], reg_state[j + 3]); } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 3285c8e2c87a..259425d99e17 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -893,9 +893,8 @@ struct i915_gem_context { u32 ggtt_alignment; struct intel_context { - struct 
drm_i915_gem_object *state; + struct i915_vma *state; struct intel_ring *ring; - struct i915_vma *lrc_vma; uint32_t *lrc_reg_state; u64 lrc_desc; int pin_count; diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 547caf26a6b9..3857ce097c84 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -155,7 +155
[Intel-gfx] [CI 27/31] drm/i915: Track pinned VMA
Treat the VMA as the primary struct responsible for tracking bindings into the GPU's VM. That is we want to treat the VMA returned after we pin an object into the VM as the cookie we hold and eventually release when unpinning. Doing so eliminates the ambiguity in pinning the object and then searching for the relevant pin later. v2: Joonas' stylistic nitpicks, a fun rebase. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c| 2 +- drivers/gpu/drm/i915/i915_drv.h| 60 ++-- drivers/gpu/drm/i915/i915_gem.c| 233 - drivers/gpu/drm/i915/i915_gem_execbuffer.c | 65 drivers/gpu/drm/i915/i915_gem_fence.c | 14 +- drivers/gpu/drm/i915/i915_gem_gtt.c| 74 + drivers/gpu/drm/i915/i915_gem_gtt.h| 14 -- drivers/gpu/drm/i915/i915_gem_request.c| 2 +- drivers/gpu/drm/i915/i915_gem_request.h| 2 +- drivers/gpu/drm/i915/i915_gem_stolen.c | 2 +- drivers/gpu/drm/i915/i915_gem_tiling.c | 2 +- drivers/gpu/drm/i915/i915_gpu_error.c | 58 +++ drivers/gpu/drm/i915/intel_display.c | 57 --- drivers/gpu/drm/i915/intel_drv.h | 5 +- drivers/gpu/drm/i915/intel_fbc.c | 2 +- drivers/gpu/drm/i915/intel_fbdev.c | 19 +-- drivers/gpu/drm/i915/intel_guc_loader.c| 21 +-- drivers/gpu/drm/i915/intel_overlay.c | 32 ++-- 18 files changed, 267 insertions(+), 397 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index fb483df1afd6..21961304284e 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -105,7 +105,7 @@ static char get_tiling_flag(struct drm_i915_gem_object *obj) static char get_global_flag(struct drm_i915_gem_object *obj) { - return i915_gem_obj_to_ggtt(obj) ? 'g' : ' '; + return i915_gem_object_to_ggtt(obj, NULL) ? 
'g' : ' '; } static char get_pin_mapped_flag(struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 50dc3613c61c..bbee45acedeb 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3075,7 +3075,7 @@ struct drm_i915_gem_object *i915_gem_object_create_from_data( void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file); void i915_gem_free_object(struct drm_gem_object *obj); -int __must_check +struct i915_vma * __must_check i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, const struct i915_ggtt_view *view, u64 size, @@ -3279,12 +3279,11 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write); int __must_check i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write); -int __must_check +struct i915_vma * __must_check i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, u32 alignment, const struct i915_ggtt_view *view); -void i915_gem_object_unpin_from_display_plane(struct drm_i915_gem_object *obj, - const struct i915_ggtt_view *view); +void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma); int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align); int i915_gem_open(struct drm_device *dev, struct drm_file *file); @@ -3304,63 +3303,34 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, struct dma_buf *i915_gem_prime_export(struct drm_device *dev, struct drm_gem_object *gem_obj, int flags); -u64 i915_gem_obj_ggtt_offset_view(struct drm_i915_gem_object *o, - const struct i915_ggtt_view *view); -u64 i915_gem_obj_offset(struct drm_i915_gem_object *o, - struct i915_address_space *vm); -static inline u64 -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o) -{ - return i915_gem_obj_ggtt_offset_view(o, &i915_ggtt_view_normal); -} - -bool i915_gem_obj_ggtt_bound_view(struct drm_i915_gem_object *o, - const struct i915_ggtt_view *view); 
-bool i915_gem_obj_bound(struct drm_i915_gem_object *o, - struct i915_address_space *vm); - struct i915_vma * i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, - struct i915_address_space *vm); -struct i915_vma * -i915_gem_obj_to_ggtt_view(struct drm_i915_gem_object *obj, - const struct i915_ggtt_view *view); +struct i915_address_space *vm, +const struct i915_ggtt_view *view); struct i915_vma * i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj, - struct i915_address_space *vm); -st
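The idea of treating the pinned VMA as the cookie can be sketched in userspace as below. This is only a toy model under stated assumptions: `toy_object`, `toy_vma`, `toy_pin` and `toy_unpin` are hypothetical names, the object has just two possible bindings, and failure is reported with NULL where the driver returns an error pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_object;

/* Hypothetical, heavily simplified VMA: one binding of an object. */
struct toy_vma {
	struct toy_object *obj;	/* backing object, reachable from the VMA */
	uint64_t start;		/* offset of this binding */
	unsigned int pin_count;
};

/* Toy object with two possible bindings (for example, two views). */
struct toy_object {
	struct toy_vma views[2];
};

/* Pin a specific view and hand back the VMA as the cookie. */
struct toy_vma *toy_pin(struct toy_object *obj, unsigned int view)
{
	struct toy_vma *vma;

	if (view >= 2)
		return NULL;

	vma = &obj->views[view];
	vma->obj = obj;
	vma->pin_count++;
	return vma;
}

/* Unpin takes the cookie, so there is no lookup and no ambiguity
 * about which binding is being released. */
void toy_unpin(struct toy_vma *vma)
{
	assert(vma != NULL && vma->pin_count > 0);
	vma->pin_count--;
}
```

With the object-based interface the unpin side has to search for "the" binding again; with the cookie, the caller releases exactly what it was given.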
[Intel-gfx] [CI 31/31] drm/i915: Record the RING_MODE register for post-mortem debugging
Just another useful register to inspect following a GPU hang. v2: Remove partial decoding of RING_MODE to userspace, be consistent and use GEN > 2 guards around RING_MODE everywhere. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gpu_error.c | 3 +++ drivers/gpu/drm/i915/intel_ringbuffer.c | 7 --- 3 files changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index bb7d8130dbfd..35caa9b2f36a 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -757,6 +757,7 @@ struct drm_i915_error_state { u32 tail; u32 head; u32 ctl; + u32 mode; u32 hws; u32 ipeir; u32 ipehr; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 6215c1bf79c8..cdf5464a0c39 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -236,6 +236,7 @@ static void error_print_engine(struct drm_i915_error_state_buf *m, err_printf(m, " HEAD: 0x%08x\n", ee->head); err_printf(m, " TAIL: 0x%08x\n", ee->tail); err_printf(m, " CTL: 0x%08x\n", ee->ctl); + err_printf(m, " MODE: 0x%08x\n", ee->mode); err_printf(m, " HWS: 0x%08x\n", ee->hws); err_printf(m, " ACTHD: 0x%08x %08x\n", (u32)(ee->acthd>>32), (u32)ee->acthd); @@ -1005,6 +1006,8 @@ static void error_record_engine_registers(struct drm_i915_error_state *error, ee->head = I915_READ_HEAD(engine); ee->tail = I915_READ_TAIL(engine); ee->ctl = I915_READ_CTL(engine); + if (INTEL_GEN(dev_priv) > 2) + ee->mode = I915_READ_MODE(engine); if (I915_NEED_GFX_HWS(dev_priv)) { i915_reg_t mmio; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index e3327a2ac6e1..fa22bd87bab0 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -498,7 +498,7 @@ static bool stop_ring(struct intel_engine_cs *engine) { struct drm_i915_private 
*dev_priv = engine->i915; - if (!IS_GEN2(dev_priv)) { + if (INTEL_GEN(dev_priv) > 2) { I915_WRITE_MODE(engine, _MASKED_BIT_ENABLE(STOP_RING)); if (intel_wait_for_register(dev_priv, RING_MI_MODE(engine->mmio_base), @@ -520,7 +520,7 @@ static bool stop_ring(struct intel_engine_cs *engine) I915_WRITE_HEAD(engine, 0); I915_WRITE_TAIL(engine, 0); - if (!IS_GEN2(dev_priv)) { + if (INTEL_GEN(dev_priv) > 2) { (void)I915_READ_CTL(engine); I915_WRITE_MODE(engine, _MASKED_BIT_DISABLE(STOP_RING)); } @@ -2142,7 +2142,8 @@ void intel_engine_cleanup(struct intel_engine_cs *engine) dev_priv = engine->i915; if (engine->buffer) { - WARN_ON(!IS_GEN2(dev_priv) && (I915_READ_MODE(engine) & MODE_IDLE) == 0); + WARN_ON(INTEL_GEN(dev_priv) > 2 && + (I915_READ_MODE(engine) & MODE_IDLE) == 0); intel_ring_unpin(engine->buffer); intel_ring_free(engine->buffer); -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
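The stop_ring() hunks above write RING_MODE through _MASKED_BIT_ENABLE()/_MASKED_BIT_DISABLE(). For readers unfamiliar with masked registers, here is a small userspace model. It assumes the usual convention that the upper 16 bits of the written value select which of the lower 16 bits may change; the function names are hypothetical and the bit used in the usage note is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Masked-register convention (assumed here): the upper 16 bits select
 * which of the lower 16 bits the write is allowed to change. */
uint32_t masked_bit_enable(uint32_t bit)
{
	return (bit << 16) | bit;
}

uint32_t masked_bit_disable(uint32_t bit)
{
	return bit << 16;
}

/* Apply a masked write to a toy register and return the new value. */
uint32_t masked_write(uint32_t reg, uint32_t value)
{
	uint32_t mask = value >> 16;

	assert((reg >> 16) == 0);	/* only the low 16 bits hold state */
	return (reg & ~mask) | (value & mask);
}
```

Starting from a register value of 0x0001, enabling bit 8 and then disabling it again leaves bit 0 untouched throughout, which is why such writes need no read-modify-write cycle.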
[Intel-gfx] [CI 30/31] drm/i915: Only record active and pending requests upon a GPU hang
There is no other state pertaining to the completed requests in the hang, other than what can be gleaned through the ringbuffer, so including the expired requests in the list of outstanding requests simply adds noise. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/i915_gpu_error.c | 110 +++--- 1 file changed, 62 insertions(+), 48 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 4c00d93396e6..6215c1bf79c8 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1060,12 +1060,69 @@ static void error_record_engine_registers(struct drm_i915_error_state *error, } } +static void engine_record_requests(struct intel_engine_cs *engine, + struct drm_i915_gem_request *first, + struct drm_i915_error_engine *ee) +{ + struct drm_i915_gem_request *request; + int count; + + count = 0; + request = first; + list_for_each_entry_from(request, &engine->request_list, link) + count += !!request->batch; + if (!count) + return; + + ee->requests = kcalloc(count, sizeof(*ee->requests), GFP_ATOMIC); + if (!ee->requests) + return; + + count = 0; + request = first; + list_for_each_entry_from(request, &engine->request_list, link) { + struct drm_i915_error_request *erq; + + if (!request->batch) + continue; + + if (count >= ee->num_requests) { + /* +* If the ring request list was changed in +* between the point where the error request +* list was created and dimensioned and this +* point then just exit early to avoid crashes. +* +* We don't need to communicate that the +* request list changed state during error +* state capture and that the error state is +* slightly incorrect as a consequence since we +* are typically only interested in the request +* list state at the point of error state +* capture, not in any changes happening during +* the capture. 
+*/ + break; + } + + erq = &ee->requests[count++]; + erq->seqno = request->fence.seqno; + erq->jiffies = request->emitted_jiffies; + erq->head = request->head; + erq->tail = request->tail; + + rcu_read_lock(); + erq->pid = request->ctx->pid ? pid_nr(request->ctx->pid) : 0; + rcu_read_unlock(); + } + ee->num_requests = count; +} + static void i915_gem_record_rings(struct drm_i915_private *dev_priv, struct drm_i915_error_state *error) { struct i915_ggtt *ggtt = &dev_priv->ggtt; - struct drm_i915_gem_request *request; - int i, count; + int i; error->semaphore = i915_error_object_create(dev_priv, dev_priv->semaphore); @@ -1073,6 +1130,7 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, for (i = 0; i < I915_NUM_ENGINES; i++) { struct intel_engine_cs *engine = &dev_priv->engine[i]; struct drm_i915_error_engine *ee = &error->engine[i]; + struct drm_i915_gem_request *request; ee->pid = -1; ee->engine_id = -1; @@ -1131,6 +1189,8 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, ee->cpu_ring_tail = ring->tail; ee->ringbuffer = i915_error_object_create(dev_priv, ring->vma); + + engine_record_requests(engine, request, ee); } ee->hws_page = @@ -1139,52 +1199,6 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, ee->wa_ctx = i915_error_object_create(dev_priv, engine->wa_ctx.vma); - - count = 0; - list_for_each_entry(request, &engine->request_list, link) - count++; - - ee->num_requests = count; - ee->requests = - kcalloc(count, sizeof(*ee->requests), GFP_ATOMIC); - if (!ee->requests) { - ee->num_requests = 0; - continue; - } - - count = 0; - list_for_each_entry(request, &engine->request_list, link) { - struct drm_i915_error_request *erq
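The count-then-fill pattern used by engine_record_requests() can be sketched in userspace as follows. This is a simplified model, not the driver code: `toy_request`, `toy_capture` and `toy_record_requests` are hypothetical, a singly linked list stands in for the engine's request list, and the fill loop is bounded by the size computed in the first pass.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified request: only the fields the sketch needs. */
struct toy_request {
	unsigned int seqno;
	int has_batch;			/* stands in for request->batch != NULL */
	struct toy_request *next;	/* singly linked, oldest first */
};

struct toy_capture {
	unsigned int *seqnos;
	int num_requests;
};

/* Record every request from `first` onwards that carries a batch.
 * Pass one sizes the array, pass two fills it, bounded by that size
 * in case the list grew between the two passes. */
void toy_record_requests(const struct toy_request *first,
			 struct toy_capture *out)
{
	const struct toy_request *rq;
	int count = 0, n = 0;

	assert(out != NULL);
	out->seqnos = NULL;
	out->num_requests = 0;

	for (rq = first; rq; rq = rq->next)
		count += !!rq->has_batch;
	if (!count)
		return;

	out->seqnos = calloc(count, sizeof(*out->seqnos));
	if (!out->seqnos)
		return;

	for (rq = first; rq && n < count; rq = rq->next) {
		if (!rq->has_batch)
			continue;
		out->seqnos[n++] = rq->seqno;
	}
	out->num_requests = n;
}
```

Passing the first incomplete request as `first` is what restricts the capture to active and pending work; anything before it has already retired and is skipped without being visited.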
[Intel-gfx] [CI 21/31] drm/i915: Move common seqno reset to intel_engine_cs.c
Since the intel_engine_init_seqno() is shared by all engine submission backends, move it out of the legacy intel_ringbuffer.c and into the new home for common routines, intel_engine_cs.c Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/intel_engine_cs.c | 42 + drivers/gpu/drm/i915/intel_ringbuffer.c | 42 - 2 files changed, 42 insertions(+), 42 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index 7104dec5e893..829624571ca4 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -161,6 +161,48 @@ cleanup: return ret; } +void intel_engine_init_seqno(struct intel_engine_cs *engine, u32 seqno) +{ + struct drm_i915_private *dev_priv = engine->i915; + + /* Our semaphore implementation is strictly monotonic (i.e. we proceed +* so long as the semaphore value in the register/page is greater +* than the sync value), so whenever we reset the seqno, +* so long as we reset the tracking semaphore value to 0, it will +* always be before the next request's seqno. If we don't reset +* the semaphore value, then when the seqno moves backwards all +* future waits will complete instantly (causing rendering corruption). 
+*/ + if (IS_GEN6(dev_priv) || IS_GEN7(dev_priv)) { + I915_WRITE(RING_SYNC_0(engine->mmio_base), 0); + I915_WRITE(RING_SYNC_1(engine->mmio_base), 0); + if (HAS_VEBOX(dev_priv)) + I915_WRITE(RING_SYNC_2(engine->mmio_base), 0); + } + if (dev_priv->semaphore_obj) { + struct drm_i915_gem_object *obj = dev_priv->semaphore_obj; + struct page *page = i915_gem_object_get_dirty_page(obj, 0); + void *semaphores = kmap(page); + memset(semaphores + GEN8_SEMAPHORE_OFFSET(engine->id, 0), + 0, I915_NUM_ENGINES * gen8_semaphore_seqno_size); + kunmap(page); + } + memset(engine->semaphore.sync_seqno, 0, + sizeof(engine->semaphore.sync_seqno)); + + intel_write_status_page(engine, I915_GEM_HWS_INDEX, seqno); + if (engine->irq_seqno_barrier) + engine->irq_seqno_barrier(engine); + engine->last_submitted_seqno = seqno; + + engine->hangcheck.seqno = seqno; + + /* After manually advancing the seqno, fake the interrupt in case +* there are any waiters for that seqno. +*/ + intel_engine_wakeup(engine); +} + void intel_engine_init_hangcheck(struct intel_engine_cs *engine) { memset(&engine->hangcheck, 0, sizeof(engine->hangcheck)); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index c89aea55bc10..6008d54b9152 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -2314,48 +2314,6 @@ int intel_ring_cacheline_align(struct drm_i915_gem_request *req) return 0; } -void intel_engine_init_seqno(struct intel_engine_cs *engine, u32 seqno) -{ - struct drm_i915_private *dev_priv = engine->i915; - - /* Our semaphore implementation is strictly monotonic (i.e. we proceed -* so long as the semaphore value in the register/page is greater -* than the sync value), so whenever we reset the seqno, -* so long as we reset the tracking semaphore value to 0, it will -* always be before the next request's seqno. 
If we don't reset -* the semaphore value, then when the seqno moves backwards all -* future waits will complete instantly (causing rendering corruption). -*/ - if (IS_GEN6(dev_priv) || IS_GEN7(dev_priv)) { - I915_WRITE(RING_SYNC_0(engine->mmio_base), 0); - I915_WRITE(RING_SYNC_1(engine->mmio_base), 0); - if (HAS_VEBOX(dev_priv)) - I915_WRITE(RING_SYNC_2(engine->mmio_base), 0); - } - if (dev_priv->semaphore_obj) { - struct drm_i915_gem_object *obj = dev_priv->semaphore_obj; - struct page *page = i915_gem_object_get_dirty_page(obj, 0); - void *semaphores = kmap(page); - memset(semaphores + GEN8_SEMAPHORE_OFFSET(engine->id, 0), - 0, I915_NUM_ENGINES * gen8_semaphore_seqno_size); - kunmap(page); - } - memset(engine->semaphore.sync_seqno, 0, - sizeof(engine->semaphore.sync_seqno)); - - intel_write_status_page(engine, I915_GEM_HWS_INDEX, seqno); - if (engine->irq_seqno_barrier) - engine->irq_seqno_barrier(engine); - engine->last_submitted_seqno = seqno; - - engine->hangcheck.seqno = seqno; - - /* After manually advancing the seqno, fake the interrupt in case -* there
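The comment in intel_engine_init_seqno() explains why the tracked semaphore value has to be cleared whenever the seqno is reset. A toy userspace model of that argument is given below. The names are hypothetical and the exact comparison performed by the hardware is an assumption; the point is only that a stale, large semaphore value satisfies every wait issued after the seqno has moved backwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy semaphore slot: the last value signalled by the other engine. */
struct toy_semaphore {
	uint32_t value;
};

/* A waiter proceeds once the signalled value has reached its target.
 * (The exact comparison the hardware uses is an assumption here.) */
bool toy_wait_complete(const struct toy_semaphore *sema, uint32_t wait_seqno)
{
	assert(sema != NULL);
	return sema->value >= wait_seqno;
}

/* Reset the seqno; optionally clear the tracked semaphore value too. */
void toy_reset_seqno(struct toy_semaphore *sema, uint32_t *next_seqno,
		     uint32_t seqno, bool clear_semaphore)
{
	*next_seqno = seqno + 1;
	if (clear_semaphore)
		sema->value = 0;
}
```

If the semaphore still holds 1000 and the seqno is reset to 10 without clearing it, a wait for seqno 11 completes immediately even though nothing has been signalled; clearing the value to 0 restores the expected ordering.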
[Intel-gfx] [CI 17/31] drm/i915: Move assertion for iomap access to i915_vma_pin_iomap
Access through the GTT requires the device to be awake. Ideally i915_vma_pin_iomap() is short-lived and the pinning demarcates the access through the iomap. This is not entirely true: we have a mixture of long-lived pins that exceed the wakelock (such as legacy ringbuffers) and short-lived pins that do live within the wakelock (such as execlist ringbuffers). Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++ drivers/gpu/drm/i915/intel_ringbuffer.c | 3 --- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 1bec50bd651b..738a474c5afa 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -3650,6 +3650,9 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma) { void __iomem *ptr; + /* Access through the GTT requires the device to be awake. */ + assert_rpm_wakelock_held(to_i915(vma->vm->dev)); + lockdep_assert_held(&vma->vm->dev->struct_mutex); if (WARN_ON(!vma->obj->map_and_fenceable)) return IO_ERR_PTR(-ENODEV); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 81dc69d1ff05..4a614e567353 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1966,9 +1966,6 @@ int intel_ring_pin(struct intel_ring *ring) if (ret) goto err_unpin; - /* Access through the GTT requires the device to be awake. */ - assert_rpm_wakelock_held(dev_priv); - addr = (void __force *) i915_vma_pin_iomap(i915_gem_obj_to_ggtt(obj)); if (IS_ERR(addr)) { -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
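Moving the check into the pin function means every caller is covered by a single test at the point where the mapping is handed out. The userspace sketch below models that placement. All names are hypothetical, and unlike the driver, which only warns when the wakelock is missing, this toy refuses the mapping so that the behaviour can be observed without aborting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device with a runtime-PM wakeref counter. */
struct toy_device {
	int wakeref_count;
	char aperture[64];	/* stands in for the GTT iomap */
};

void toy_rpm_get(struct toy_device *dev)
{
	dev->wakeref_count++;
}

void toy_rpm_put(struct toy_device *dev)
{
	dev->wakeref_count--;
}

/* Check at the pin site, once, for all callers. Returns NULL when no
 * wakeref is held; the driver would warn here instead. */
char *toy_pin_iomap(struct toy_device *dev)
{
	assert(dev != NULL);
	if (dev->wakeref_count <= 0)
		return NULL;
	return dev->aperture;
}
```

A caller that pins without first taking a wakeref gets caught at the pin itself, rather than relying on each call site (such as the ring pinning path) to carry its own copy of the assertion.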
[Intel-gfx] [CI 12/31] drm/i915: Track pinned vma inside guc
Since the guc allocates and pins an object into the GGTT for its usage, it is more natural to use that pinned VMA as our resource cookie. v2: Embrace naming tautology v3: Rewrite comments for guc_allocate_vma() Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c| 10 +- drivers/gpu/drm/i915/i915_gem_gtt.h| 6 ++ drivers/gpu/drm/i915/i915_guc_submission.c | 144 ++--- drivers/gpu/drm/i915/intel_guc.h | 9 +- drivers/gpu/drm/i915/intel_guc_loader.c| 7 +- 5 files changed, 90 insertions(+), 86 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index fd028953453d..32d26b6c4bca 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2526,15 +2526,15 @@ static int i915_guc_log_dump(struct seq_file *m, void *data) struct drm_info_node *node = m->private; struct drm_device *dev = node->minor->dev; struct drm_i915_private *dev_priv = to_i915(dev); - struct drm_i915_gem_object *log_obj = dev_priv->guc.log_obj; - u32 *log; + struct drm_i915_gem_object *obj; int i = 0, pg; - if (!log_obj) + if (!dev_priv->guc.log_vma) return 0; - for (pg = 0; pg < log_obj->base.size / PAGE_SIZE; pg++) { - log = kmap_atomic(i915_gem_object_get_page(log_obj, pg)); + obj = dev_priv->guc.log_vma->obj; + for (pg = 0; pg < obj->base.size / PAGE_SIZE; pg++) { + u32 *log = kmap_atomic(i915_gem_object_get_page(obj, pg)); for (i = 0; i < PAGE_SIZE / sizeof(u32); i += 4) seq_printf(m, "0x%08x 0x%08x 0x%08x 0x%08x\n", diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index f2769e01cc8c..a2691943a404 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -716,4 +716,10 @@ static inline void i915_vma_unpin_iomap(struct i915_vma *vma) i915_vma_unpin(vma); } +static inline struct page *i915_vma_first_page(struct i915_vma *vma) +{ + GEM_BUG_ON(!vma->pages); + return sg_page(vma->pages->sgl); +} + #endif diff 
--git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 6831321a9c8c..29de8cec1b58 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -183,7 +183,7 @@ static int guc_update_doorbell_id(struct intel_guc *guc, struct i915_guc_client *client, u16 new_id) { - struct sg_table *sg = guc->ctx_pool_obj->pages; + struct sg_table *sg = guc->ctx_pool_vma->pages; void *doorbell_bitmap = guc->doorbell_bitmap; struct guc_doorbell_info *doorbell; struct guc_context_desc desc; @@ -325,7 +325,6 @@ static void guc_init_proc_desc(struct intel_guc *guc, static void guc_init_ctx_desc(struct intel_guc *guc, struct i915_guc_client *client) { - struct drm_i915_gem_object *client_obj = client->client_obj; struct drm_i915_private *dev_priv = guc_to_i915(guc); struct intel_engine_cs *engine; struct i915_gem_context *ctx = client->owner; @@ -383,8 +382,8 @@ static void guc_init_ctx_desc(struct intel_guc *guc, * The doorbell, process descriptor, and workqueue are all parts * of the client object, which the GuC will reference via the GGTT */ - gfx_addr = i915_gem_obj_ggtt_offset(client_obj); - desc.db_trigger_phy = sg_dma_address(client_obj->pages->sgl) + + gfx_addr = client->vma->node.start; + desc.db_trigger_phy = sg_dma_address(client->vma->pages->sgl) + client->doorbell_offset; desc.db_trigger_cpu = (uintptr_t)client->client_base + client->doorbell_offset; @@ -400,7 +399,7 @@ static void guc_init_ctx_desc(struct intel_guc *guc, desc.desc_private = (uintptr_t)client; /* Pool context is pinned already */ - sg = guc->ctx_pool_obj->pages; + sg = guc->ctx_pool_vma->pages; sg_pcopy_from_buffer(sg->sgl, sg->nents, &desc, sizeof(desc), sizeof(desc) * client->ctx_index); } @@ -413,7 +412,7 @@ static void guc_fini_ctx_desc(struct intel_guc *guc, memset(&desc, 0, sizeof(desc)); - sg = guc->ctx_pool_obj->pages; + sg = guc->ctx_pool_vma->pages; sg_pcopy_from_buffer(sg->sgl, sg->nents, &desc, 
sizeof(desc), sizeof(desc) * client->ctx_index); } @@ -496,7 +495,7 @@ static void guc_add_workqueue_item(struct i915_guc_client *gc, /* WQ starts from the page after doorbell / process_desc */ wq_page = (wq_off + GUC_DB_SIZE) >> PAGE_SHIFT; wq_off &= PAGE_SIZE - 1; - base = kmap_atomic(i
[Intel-gfx] [CI 19/31] drm/i915: Use VMA for scratch page tracking
Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_context.c | 2 +- drivers/gpu/drm/i915/i915_gpu_error.c | 2 +- drivers/gpu/drm/i915/intel_display.c| 2 +- drivers/gpu/drm/i915/intel_lrc.c| 18 +-- drivers/gpu/drm/i915/intel_ringbuffer.c | 55 +++-- drivers/gpu/drm/i915/intel_ringbuffer.h | 10 ++ 6 files changed, 46 insertions(+), 43 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 824dfe14bcd0..e566167d9441 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -660,7 +660,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags) MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT); intel_ring_emit_reg(ring, last_reg); - intel_ring_emit(ring, engine->scratch.gtt_offset); + intel_ring_emit(ring, engine->scratch->node.start); intel_ring_emit(ring, MI_NOOP); } intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE); diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 4a19494a4f6f..b80d2a6f56b3 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1101,7 +1101,7 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, if (HAS_BROKEN_CS_TLB(dev_priv)) ee->wa_batchbuffer = i915_error_ggtt_object_create(dev_priv, - engine->scratch.obj); + engine->scratch->obj); if (request->ctx->engine[i].state) { ee->ctx = i915_error_ggtt_object_create(dev_priv, diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index c5c0c35d4f6e..2e7d03c5bf5c 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -11795,7 +11795,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev, intel_ring_emit(ring, MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT); intel_ring_emit_reg(ring, DERRMR); - intel_ring_emit(ring, req->engine->scratch.gtt_offset + 256); + 
intel_ring_emit(ring, req->engine->scratch->node.start + 256); if (IS_GEN8(dev)) { intel_ring_emit(ring, 0); intel_ring_emit(ring, MI_NOOP); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 73dd2f9e0547..42999ba02152 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -914,7 +914,7 @@ static inline int gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, wa_ctx_emit(batch, index, (MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT)); wa_ctx_emit_reg(batch, index, GEN8_L3SQCREG4); - wa_ctx_emit(batch, index, engine->scratch.gtt_offset + 256); + wa_ctx_emit(batch, index, engine->scratch->node.start + 256); wa_ctx_emit(batch, index, 0); wa_ctx_emit(batch, index, MI_LOAD_REGISTER_IMM(1)); @@ -932,7 +932,7 @@ static inline int gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, wa_ctx_emit(batch, index, (MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT)); wa_ctx_emit_reg(batch, index, GEN8_L3SQCREG4); - wa_ctx_emit(batch, index, engine->scratch.gtt_offset + 256); + wa_ctx_emit(batch, index, engine->scratch->node.start + 256); wa_ctx_emit(batch, index, 0); return index; @@ -993,7 +993,7 @@ static int gen8_init_indirectctx_bb(struct intel_engine_cs *engine, /* WaClearSlmSpaceAtContextSwitch:bdw,chv */ /* Actual scratch location is at 128 bytes offset */ - scratch_addr = engine->scratch.gtt_offset + 2*CACHELINE_BYTES; + scratch_addr = engine->scratch->node.start + 2 * CACHELINE_BYTES; wa_ctx_emit(batch, index, GFX_OP_PIPE_CONTROL(6)); wa_ctx_emit(batch, index, (PIPE_CONTROL_FLUSH_L3 | @@ -1072,8 +1072,8 @@ static int gen9_init_indirectctx_bb(struct intel_engine_cs *engine, /* WaClearSlmSpaceAtContextSwitch:kbl */ /* Actual scratch location is at 128 bytes offset */ if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_A0)) { - uint32_t scratch_addr - = engine->scratch.gtt_offset + 2*CACHELINE_BYTES; + u32 scratch_addr = + engine->scratch->node.start + 2
[Intel-gfx] [CI 13/31] drm/i915: Convert fence computations to use vma directly
Lookup the GGTT vma once for the object assigned to the fence, and then derive everything from that vma. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_fence.c | 55 +-- 1 file changed, 26 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_fence.c b/drivers/gpu/drm/i915/i915_gem_fence.c index 9e8173fe2a09..1d0f975c61f8 100644 --- a/drivers/gpu/drm/i915/i915_gem_fence.c +++ b/drivers/gpu/drm/i915/i915_gem_fence.c @@ -85,22 +85,19 @@ static void i965_write_fence_reg(struct drm_device *dev, int reg, POSTING_READ(fence_reg_lo); if (obj) { - u32 size = i915_gem_obj_ggtt_size(obj); + struct i915_vma *vma = i915_gem_obj_to_ggtt(obj); unsigned int tiling = i915_gem_object_get_tiling(obj); unsigned int stride = i915_gem_object_get_stride(obj); - uint64_t val; + u64 size = vma->node.size; + u32 row_size = stride * (tiling == I915_TILING_Y ? 32 : 8); + u64 val; /* Adjust fence size to match tiled area */ - if (tiling != I915_TILING_NONE) { - uint32_t row_size = stride * - (tiling == I915_TILING_Y ? 
32 : 8); - size = (size / row_size) * row_size; - } + size = rounddown(size, row_size); - val = (uint64_t)((i915_gem_obj_ggtt_offset(obj) + size - 4096) & -0xf000) << 32; - val |= i915_gem_obj_ggtt_offset(obj) & 0xf000; - val |= (uint64_t)((stride / 128) - 1) << fence_pitch_shift; + val = ((vma->node.start + size - 4096) & 0xf000) << 32; + val |= vma->node.start & 0xf000; + val |= (u64)((stride / 128) - 1) << fence_pitch_shift; if (tiling == I915_TILING_Y) val |= 1 << I965_FENCE_TILING_Y_SHIFT; val |= I965_FENCE_REG_VALID; @@ -123,17 +120,17 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg, u32 val; if (obj) { - u32 size = i915_gem_obj_ggtt_size(obj); + struct i915_vma *vma = i915_gem_obj_to_ggtt(obj); unsigned int tiling = i915_gem_object_get_tiling(obj); unsigned int stride = i915_gem_object_get_stride(obj); int pitch_val; int tile_width; - WARN((i915_gem_obj_ggtt_offset(obj) & ~I915_FENCE_START_MASK) || -(size & -size) != size || -(i915_gem_obj_ggtt_offset(obj) & (size - 1)), -"object 0x%08llx [fenceable? %d] not 1M or pot-size (0x%08x) aligned\n", -i915_gem_obj_ggtt_offset(obj), obj->map_and_fenceable, size); + WARN((vma->node.start & ~I915_FENCE_START_MASK) || +!is_power_of_2(vma->node.size) || +(vma->node.start & (vma->node.size - 1)), +"object 0x%08llx [fenceable? 
%d] not 1M or pot-size (0x%08llx) aligned\n", +vma->node.start, obj->map_and_fenceable, vma->node.size); if (tiling == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev)) tile_width = 128; @@ -144,10 +141,10 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg, pitch_val = stride / tile_width; pitch_val = ffs(pitch_val) - 1; - val = i915_gem_obj_ggtt_offset(obj); + val = vma->node.start; if (tiling == I915_TILING_Y) val |= 1 << I830_FENCE_TILING_Y_SHIFT; - val |= I915_FENCE_SIZE_BITS(size); + val |= I915_FENCE_SIZE_BITS(vma->node.size); val |= pitch_val << I830_FENCE_PITCH_SHIFT; val |= I830_FENCE_REG_VALID; } else @@ -161,27 +158,27 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, struct drm_i915_gem_object *obj) { struct drm_i915_private *dev_priv = to_i915(dev); - uint32_t val; + u32 val; if (obj) { - u32 size = i915_gem_obj_ggtt_size(obj); + struct i915_vma *vma = i915_gem_obj_to_ggtt(obj); unsigned int tiling = i915_gem_object_get_tiling(obj); unsigned int stride = i915_gem_object_get_stride(obj); - uint32_t pitch_val; + u32 pitch_val; - WARN((i915_gem_obj_ggtt_offset(obj) & ~I830_FENCE_START_MASK) || -(size & -size) != size || -(i915_gem_obj_ggtt_offset(obj) & (size - 1)), -"object 0x%08llx not 512K or pot-size 0x%08x aligned\n", -i915_gem_obj_ggtt_offset(obj), size); + WARN((vma->node.s
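The fence patch above swaps two open-coded size computations for helpers: (size / row_size) * row_size becomes rounddown(), and (size & -size) != size becomes !is_power_of_2(). The following is a minimal userspace sketch of those two idioms, not the kernel implementation. All sketch_* names are hypothetical stand-ins. One difference worth noting is that the old bit trick treats 0 as a power of two, while a helper written like is_power_of_2() does not.

```c
#include <stdbool.h>
#include <stdint.h>

/* The open-coded test the patch removes: true when size has at most
 * one bit set, which also accepts size == 0. */
static bool sketch_is_pot_old(uint64_t size)
{
	return (size & -size) == size;
}

/* Userspace stand-in modelled on the kernel's is_power_of_2():
 * exactly one bit set, so 0 is rejected. */
static bool sketch_is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Userspace stand-in for rounddown(x, y): the largest multiple of y
 * that does not exceed x. Same result as (x / y) * y. */
static uint64_t sketch_rounddown(uint64_t x, uint64_t y)
{
	return x - (x % y);
}

/* Row size as computed in the i965 hunk: a tile row spans 32 lines
 * for Y tiling and 8 lines otherwise. */
static uint32_t sketch_row_size(uint32_t stride, bool tiling_y)
{
	return stride * (tiling_y ? 32 : 8);
}
```

For example, with a 512-byte stride and X tiling the row size is 4096 bytes, and a 0x10000-byte node rounded down to a 0x3000-byte row would give 0xf000.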
[Intel-gfx] [CI 16/31] drm/i915: Only change the context object's domain when binding
We know that the only access to the context object is via the GPU, and the only time when it can be out of the GPU domain is when it is swapped out and unbound. Therefore we only need to clflush the object when binding, thus avoiding any potential stall on touching the domain on an active context. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_context.c | 19 +++ drivers/gpu/drm/i915/intel_ringbuffer.c | 4 2 files changed, 11 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 3857ce097c84..824dfe14bcd0 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -772,6 +772,13 @@ static int do_rcs_switch(struct drm_i915_gem_request *req) if (skip_rcs_switch(ppgtt, engine, to)) return 0; + /* Clear this page out of any CPU caches for coherent swap-in/out. */ + if (!(vma->flags & I915_VMA_GLOBAL_BIND)) { + ret = i915_gem_object_set_to_gtt_domain(vma->obj, false); + if (ret) + return ret; + } + /* Trying to pin first makes error handling easier. */ ret = i915_vma_pin(vma, 0, to->ggtt_alignment, PIN_GLOBAL); if (ret) @@ -786,18 +793,6 @@ static int do_rcs_switch(struct drm_i915_gem_request *req) */ from = engine->last_context; - /* -* Clear this page out of any CPU caches for coherent swap-in/out. Note -* that thanks to write = false in this call and us not setting any gpu -* write domains when putting a context object onto the active list -* (when switching away from it), this won't block. -* -* XXX: We need a real interface to do this instead of trickery. 
-*/ - ret = i915_gem_object_set_to_gtt_domain(vma->obj, false); - if (ret) - goto err; - if (needs_pd_load_pre(ppgtt, engine, to)) { /* Older GENs and non render rings still want the load first, * "PP_DCLV followed by PP_DIR_BASE register through Load diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 2318a27341c8..81dc69d1ff05 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -2092,6 +2092,10 @@ static int intel_ring_context_pin(struct i915_gem_context *ctx, return 0; if (ce->state) { + ret = i915_gem_object_set_to_gtt_domain(ce->state->obj, false); + if (ret) + goto error; + ret = i915_vma_pin(ce->state, 0, ctx->ggtt_alignment, PIN_GLOBAL | PIN_HIGH); if (ret) -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
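The idea in the commit message above (flush the context object only at the point it gets bound, not on every switch) can be sketched with a toy model. This is an illustration under stated assumptions, not the driver code: struct sketch_ctx_vma, its fields and sketch_pin_context() are hypothetical, and the flush counter merely stands in for the call to i915_gem_object_set_to_gtt_domain().

```c
#include <stdbool.h>

/* Hypothetical model of a context's state VMA. */
struct sketch_ctx_vma {
	bool globally_bound;	/* models vma->flags & I915_VMA_GLOBAL_BIND */
	int flush_count;	/* how many times we moved the object to the GTT domain */
	int pin_count;
};

/* Pin the context for use by the GPU. The CPU caches only need to be
 * flushed when the object is about to be bound; an already bound
 * context cannot have left the GPU domain. */
static int sketch_pin_context(struct sketch_ctx_vma *vma)
{
	if (!vma->globally_bound) {
		vma->flush_count++;		/* stands in for the clflush */
		vma->globally_bound = true;	/* pinning binds the VMA */
	}

	vma->pin_count++;
	return 0;
}

/* Drop one pin; the binding is kept so a later pin needs no flush. */
static void sketch_unpin_context(struct sketch_ctx_vma *vma)
{
	vma->pin_count--;
}
```

Pinning the same context repeatedly then costs one flush in total, which is the stall the patch is trying to avoid on an active context.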
[Intel-gfx] [CI 11/31] drm/i915: Add convenience wrappers for vma's object get/put
VMAs are not reference counted; they belong to the object and live until they are closed. However, if we want to use the VMA as a cookie and use it to keep the object alive, we want to hold onto a reference to the object for the lifetime of the VMA cookie. To facilitate this, add a couple of simple wrappers for managing the reference count on the object owning the VMA. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_drv.h| 12 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 ++-- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 855833a6306a..3285c8e2c87a 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2373,6 +2373,18 @@ i915_gem_object_get_stride(struct drm_i915_gem_object *obj) return obj->tiling_and_stride & STRIDE_MASK; } +static inline struct i915_vma *i915_vma_get(struct i915_vma *vma) +{ + i915_gem_object_get(vma->obj); + return vma; +} + +static inline void i915_vma_put(struct i915_vma *vma) +{ + lockdep_assert_held(&vma->vm->dev->struct_mutex); + i915_gem_object_put(vma->obj); +} + /* * Optimised SGL iterator for GEM objects */ diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index c8d13fea4b25..ced05878b405 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -271,7 +271,7 @@ static void eb_destroy(struct eb_vmas *eb) exec_list); list_del_init(&vma->exec_list); i915_gem_execbuffer_unreserve_vma(vma); - i915_gem_object_put(vma->obj); + i915_vma_put(vma); } kfree(eb); } @@ -900,7 +900,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, vma = list_first_entry(&eb->vmas, struct i915_vma, exec_list); list_del_init(&vma->exec_list); i915_gem_execbuffer_unreserve_vma(vma); - i915_gem_object_put(vma->obj); + i915_vma_put(vma); } mutex_unlock(&dev->struct_mutex); -- 2.8.1
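The wrapper pattern described in this patch (the VMA has no refcount of its own, so taking a reference on a VMA means taking one on its owning object) can be sketched in plain C. This is a toy model under stated assumptions: the sketch_* types and functions are hypothetical, a bare integer replaces the kernel's kref, and the struct_mutex assertion in i915_vma_put() is not modelled.

```c
#include <assert.h>

/* Hypothetical stand-in for drm_i915_gem_object. */
struct sketch_object {
	int refcount;
};

/* Hypothetical stand-in for i915_vma: it has no refcount of its own
 * and simply points back at the object that owns it. */
struct sketch_vma {
	struct sketch_object *obj;
};

static struct sketch_object *sketch_object_get(struct sketch_object *obj)
{
	obj->refcount++;
	return obj;
}

static void sketch_object_put(struct sketch_object *obj)
{
	assert(obj->refcount > 0);	/* a put without a matching get is a bug */
	obj->refcount--;
}

/* Holding the VMA as a cookie keeps the owning object alive. */
static struct sketch_vma *sketch_vma_get(struct sketch_vma *vma)
{
	sketch_object_get(vma->obj);
	return vma;
}

/* Release the reference taken by sketch_vma_get(). */
static void sketch_vma_put(struct sketch_vma *vma)
{
	sketch_object_put(vma->obj);
}
```

Returning the VMA from the get helper lets a caller write something like cookie = sketch_vma_get(vma) in one statement, which is the same convenience the real i915_vma_get() offers.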
[Intel-gfx] [CI 14/31] drm/i915: Use VMA directly for checking tiling parameters
v2: Rename functions to suit their more active role Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_tiling.c | 51 -- 1 file changed, 30 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c index f4b984de83b5..b2b0cb7199ac 100644 --- a/drivers/gpu/drm/i915/i915_gem_tiling.c +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c @@ -116,35 +116,46 @@ i915_tiling_ok(struct drm_device *dev, int stride, int size, int tiling_mode) return true; } -/* Is the current GTT allocation valid for the change in tiling? */ -static bool -i915_gem_object_fence_ok(struct drm_i915_gem_object *obj, int tiling_mode) +/* Make the current GTT allocation valid for the change in tiling. */ +static int +i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj, int tiling_mode) { struct drm_i915_private *dev_priv = to_i915(obj->base.dev); + struct i915_vma *vma; u32 size; if (tiling_mode == I915_TILING_NONE) - return true; + return 0; if (INTEL_GEN(dev_priv) >= 4) - return true; + return 0; + + vma = i915_gem_obj_to_ggtt(obj); + if (!vma) + return 0; + + if (!obj->map_and_fenceable) + return 0; if (IS_GEN3(dev_priv)) { - if (i915_gem_obj_ggtt_offset(obj) & ~I915_FENCE_START_MASK) - return false; + if (vma->node.start & ~I915_FENCE_START_MASK) + goto bad; } else { - if (i915_gem_obj_ggtt_offset(obj) & ~I830_FENCE_START_MASK) - return false; + if (vma->node.start & ~I830_FENCE_START_MASK) + goto bad; } size = i915_gem_get_ggtt_size(dev_priv, obj->base.size, tiling_mode); - if (i915_gem_obj_ggtt_size(obj) != size) - return false; + if (vma->node.size < size) + goto bad; - if (i915_gem_obj_ggtt_offset(obj) & (size - 1)) - return false; + if (vma->node.start & (size - 1)) + goto bad; - return true; + return 0; + +bad: + return i915_vma_unbind(vma); } /** @@ -168,7 +179,7 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, struct drm_i915_gem_set_tiling *args = data; struct 
drm_i915_private *dev_priv = to_i915(dev); struct drm_i915_gem_object *obj; - int ret = 0; + int err = 0; /* Make sure we don't cross-contaminate obj->tiling_and_stride */ BUILD_BUG_ON(I915_TILING_LAST & STRIDE_MASK); @@ -187,7 +198,7 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, mutex_lock(&dev->struct_mutex); if (obj->pin_display || obj->framebuffer_references) { - ret = -EBUSY; + err = -EBUSY; goto err; } @@ -234,11 +245,9 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, * has to also include the unfenced register the GPU uses * whilst executing a fenced command for an untiled object. */ - if (obj->map_and_fenceable && - !i915_gem_object_fence_ok(obj, args->tiling_mode)) - ret = i915_vma_unbind(i915_gem_obj_to_ggtt(obj)); - if (ret == 0) { + err = i915_gem_object_fence_prepare(obj, args->tiling_mode); + if (!err) { if (obj->pages && obj->madv == I915_MADV_WILLNEED && dev_priv->quirks & QUIRK_PIN_SWIZZLED_PAGES) { @@ -281,7 +290,7 @@ err: intel_runtime_pm_put(dev_priv); - return ret; + return err; } /** -- 2.8.1
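The checks that i915_gem_object_fence_prepare() applies to the GGTT node can be sketched as a standalone predicate. This is a simplified illustration, not the driver function: struct sketch_node and sketch_fence_fits() are hypothetical, the start mask is passed in by the caller, and the fence size is assumed to be a power of two. Note that the patch relaxes the size comparison from "must equal" to "must be at least" the required fence size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the drm_mm node embedded in an i915_vma. */
struct sketch_node {
	uint64_t start;	/* GGTT offset of the binding */
	uint64_t size;	/* size of the binding in bytes */
};

/*
 * Return true if the existing binding can be reused after a tiling
 * change. start_mask holds the address bits a fence register can encode,
 * fence_size is the (power-of-two) region the fence must cover.
 */
static bool sketch_fence_fits(const struct sketch_node *node,
			      uint64_t start_mask, uint64_t fence_size)
{
	/* The start address must be expressible in the fence register. */
	if (node->start & ~start_mask)
		return false;

	/* The binding must be at least as large as the fence region. */
	if (node->size < fence_size)
		return false;

	/* The start address must be aligned to the fence size. */
	if (node->start & (fence_size - 1))
		return false;

	return true;
}
```

In the patch, a binding that fails any of these tests is not reported as an error; the function jumps to its bad label and unbinds the VMA so that it can be rebound with a suitable placement.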
[Intel-gfx] [CI 01/31] drm/i915: Record the position of the start of the request
Not only does it make for good documentation and a debugging aid, but it is also vital for when we want to unwind requests - such as when throwing away an incomplete request. Signed-off-by: Chris Wilson Link: http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-2-git-send-email-arun.siluv...@linux.intel.com Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gem_request.c | 13 + drivers/gpu/drm/i915/i915_gpu_error.c | 6 -- 3 files changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index bf193ba1574e..b1017950087b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -557,6 +557,7 @@ struct drm_i915_error_state { struct drm_i915_error_request { long jiffies; u32 seqno; + u32 head; u32 tail; } *requests; diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index b764c1d440c8..8a9e9bfeea09 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -426,6 +426,13 @@ i915_gem_request_alloc(struct intel_engine_cs *engine, if (ret) goto err_ctx; + /* Record the position of the start of the request so that +* should we detect the updated seqno part-way through the +* GPU processing the request, we never over-estimate the +* position of the head. +*/ + req->head = req->ring->tail; + return req; err_ctx: @@ -500,8 +507,6 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches) trace_i915_gem_request_add(request); - request->head = request_start; - /* Seal the request and mark it as pending execution. Note that * we may inspect this state, without holding any locks, during * hangcheck.
Hence we apply the barrier to ensure that we do not @@ -514,10 +519,10 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches) list_add_tail(&request->link, &engine->request_list); list_add_tail(&request->ring_link, &ring->request_list); - /* Record the position of the start of the request so that + /* Record the position of the start of the breadcrumb so that * should we detect the updated seqno part-way through the * GPU processing the request, we never over-estimate the -* position of the head. +* position of the ring's HEAD. */ request->postfix = ring->tail; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index eecb87063c88..d54848f5f246 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -455,9 +455,10 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, dev_priv->engine[i].name, ee->num_requests); for (j = 0; j < ee->num_requests; j++) { - err_printf(m, " seqno 0x%08x, emitted %ld, tail 0x%08x\n", + err_printf(m, " seqno 0x%08x, emitted %ld, head 0x%08x, tail 0x%08x\n", ee->requests[j].seqno, ee->requests[j].jiffies, + ee->requests[j].head, ee->requests[j].tail); } } @@ -1205,7 +1206,8 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, erq = &ee->requests[count++]; erq->seqno = request->fence.seqno; erq->jiffies = request->emitted_jiffies; - erq->tail = request->postfix; + erq->head = request->head; + erq->tail = request->tail; } } } -- 2.8.1
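The three ring positions tracked per request after this patch (head at allocation, postfix at the start of the breadcrumb, tail at the end) can be sketched with a toy ring. This is an illustration under stated assumptions, not the driver code: the sketch_* names are hypothetical, ring wrap-around is ignored, and unwinding by resetting the ring tail to the request's head is an assumed use of the recorded value suggested by the commit message, not something shown in the diff.

```c
#include <stdint.h>

/* Hypothetical ring: only the software tail offset is modelled. */
struct sketch_ring {
	uint32_t tail;
};

/* Hypothetical request carrying the three recorded positions. */
struct sketch_request {
	struct sketch_ring *ring;
	uint32_t head;		/* where this request's commands begin */
	uint32_t postfix;	/* where the breadcrumb begins */
	uint32_t tail;		/* just past the end of the request */
};

/* Record the start of the request as soon as it is allocated. */
static void sketch_request_alloc(struct sketch_request *req,
				 struct sketch_ring *ring)
{
	req->ring = ring;
	req->head = ring->tail;
}

/* Emit some bytes of commands into the ring (no wrap handling). */
static void sketch_emit(struct sketch_ring *ring, uint32_t bytes)
{
	ring->tail += bytes;
}

/* Seal the request: note the breadcrumb start, emit it, note the end. */
static void sketch_request_add(struct sketch_request *req,
			       uint32_t breadcrumb_bytes)
{
	req->postfix = req->ring->tail;
	sketch_emit(req->ring, breadcrumb_bytes);
	req->tail = req->ring->tail;
}

/* Assumed unwind: throw away an incomplete request by rewinding the
 * ring to where the request started. */
static void sketch_request_unwind(struct sketch_request *req)
{
	req->ring->tail = req->head;
}
```

With head recorded at allocation rather than at add time, an incomplete request that never reaches the add step still knows where it began.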
[Intel-gfx] [CI 07/31] drm/i915: Remove redundant WARN_ON from __i915_add_request()
It's an outright programming error, so explode if it is ever hit.

Signed-off-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_gem_request.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 8a9e9bfeea09..4c5b7e104f2f 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -470,18 +470,12 @@ static void i915_gem_mark_busy(const struct intel_engine_cs *engine)
  */
 void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
 {
-	struct intel_engine_cs *engine;
-	struct intel_ring *ring;
+	struct intel_engine_cs *engine = request->engine;
+	struct intel_ring *ring = request->ring;
 	u32 request_start;
 	u32 reserved_tail;
 	int ret;
 
-	if (WARN_ON(!request))
-		return;
-
-	engine = request->engine;
-	ring = request->ring;
-
 	/*
 	 * To ensure that this call will not fail, space for its emissions
 	 * should already have been reserved in the ring buffer. Let the ring
-- 
2.8.1
[Intel-gfx] [CI 06/31] drm/i915: Reduce i915_gem_objects to only show object information
No longer is knowing how much of the GTT (both mappable aperture and beyond) relevant, and the output clutters the real information - that is how many objects are allocated and bound (and by who) so that we can quickly grasp if there is a leak. v2: Relent, and rename pinned to indicate display only. Since the display objects are semi-static and are of variable size, they are the interesting objects to watch over time for aperture leaking. The other pins are either static (such as the scratch page) or very short lived (such as execbuf) and not part of the precious GGTT. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 100 -- drivers/gpu/drm/i915/i915_drv.h | 249 +- drivers/gpu/drm/i915/i915_gpu_error.c | 15 ++ 3 files changed, 168 insertions(+), 196 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index c535c4c2f7af..fd028953453d 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -269,17 +269,6 @@ static int i915_gem_stolen_list_info(struct seq_file *m, void *data) return 0; } -#define count_objects(list, member) do { \ - list_for_each_entry(obj, list, member) { \ - size += i915_gem_obj_total_ggtt_size(obj); \ - ++count; \ - if (obj->map_and_fenceable) { \ - mappable_size += i915_gem_obj_ggtt_size(obj); \ - ++mappable_count; \ - } \ - } \ -} while (0) - struct file_stats { struct drm_i915_file_private *file_priv; unsigned long count; @@ -394,30 +383,16 @@ static void print_context_stats(struct seq_file *m, print_file_stats(m, "[k]contexts", stats); } -#define count_vmas(list, member) do { \ - list_for_each_entry(vma, list, member) { \ - size += i915_gem_obj_total_ggtt_size(vma->obj); \ - ++count; \ - if (vma->obj->map_and_fenceable) { \ - mappable_size += i915_gem_obj_ggtt_size(vma->obj); \ - ++mappable_count; \ - } \ - } \ -} while (0) - static int i915_gem_object_info(struct seq_file *m, void* data) { struct drm_info_node *node = m->private; struct 
drm_device *dev = node->minor->dev; struct drm_i915_private *dev_priv = to_i915(dev); struct i915_ggtt *ggtt = &dev_priv->ggtt; - u32 count, mappable_count, purgeable_count; - u64 size, mappable_size, purgeable_size; - unsigned long pin_mapped_count = 0, pin_mapped_purgeable_count = 0; - u64 pin_mapped_size = 0, pin_mapped_purgeable_size = 0; + u32 count, mapped_count, purgeable_count, dpy_count; + u64 size, mapped_size, purgeable_size, dpy_size; struct drm_i915_gem_object *obj; struct drm_file *file; - struct i915_vma *vma; int ret; ret = mutex_lock_interruptible(&dev->struct_mutex); @@ -428,70 +403,51 @@ static int i915_gem_object_info(struct seq_file *m, void* data) dev_priv->mm.object_count, dev_priv->mm.object_memory); - size = count = mappable_size = mappable_count = 0; - count_objects(&dev_priv->mm.bound_list, global_list); - seq_printf(m, "%u [%u] objects, %llu [%llu] bytes in gtt\n", - count, mappable_count, size, mappable_size); - - size = count = mappable_size = mappable_count = 0; - count_vmas(&ggtt->base.active_list, vm_link); - seq_printf(m, " %u [%u] active objects, %llu [%llu] bytes\n", - count, mappable_count, size, mappable_size); - - size = count = mappable_size = mappable_count = 0; - count_vmas(&ggtt->base.inactive_list, vm_link); - seq_printf(m, " %u [%u] inactive objects, %llu [%llu] bytes\n", - count, mappable_count, size, mappable_size); - size = count = purgeable_size = purgeable_count = 0; list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) { - size += obj->base.size, ++count; - if (obj->madv == I915_MADV_DONTNEED) - purgeable_size += obj->base.size, ++purgeable_count; + size += obj->base.size; + ++count; + + if (obj->madv == I915_MADV_DONTNEED) { + purgeable_size += obj->base.size; + ++purgeable_count; + } + if (obj->mapping) { - pin_mapped_count++; - pin_mapped_size += obj->base.size; - if (obj->pages_pin_count == 0) { - pin_mapped_purgeable_count++; - pin_mapped_purgeable_size += obj->base.size; - } + mapped_count++; + 
mapped_size += obj->base.size;
[Intel-gfx] [CI 08/31] drm/i915: Always set the vma->pages
Previously, we would only set the vma->pages pointer for GGTT entries. However, if we always set it, we can use it to prettify some code that may want to access the backing store associated with the VMA (as assigned to the VMA). Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem.c | 8 drivers/gpu/drm/i915/i915_gem_gtt.c | 30 ++ drivers/gpu/drm/i915/i915_gem_gtt.h | 3 +-- 3 files changed, 19 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 5566916870eb..8b1a74dbb870 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2859,12 +2859,12 @@ int i915_vma_unbind(struct i915_vma *vma) if (i915_vma_is_ggtt(vma)) { if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) { obj->map_and_fenceable = false; - } else if (vma->ggtt_view.pages) { - sg_free_table(vma->ggtt_view.pages); - kfree(vma->ggtt_view.pages); + } else if (vma->pages) { + sg_free_table(vma->pages); + kfree(vma->pages); } - vma->ggtt_view.pages = NULL; } + vma->pages = NULL; /* Since the unbound list is global, only move to that list if * no more VMAs exist. 
*/ diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index d876501694c6..9c178b0c40b5 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -170,11 +170,13 @@ static int ppgtt_bind_vma(struct i915_vma *vma, { u32 pte_flags = 0; + vma->pages = vma->obj->pages; + /* Currently applicable only to VLV */ if (vma->obj->gt_ro) pte_flags |= PTE_READ_ONLY; - vma->vm->insert_entries(vma->vm, vma->obj->pages, vma->node.start, + vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start, cache_level, pte_flags); return 0; @@ -2618,8 +2620,7 @@ static int ggtt_bind_vma(struct i915_vma *vma, if (obj->gt_ro) pte_flags |= PTE_READ_ONLY; - vma->vm->insert_entries(vma->vm, vma->ggtt_view.pages, - vma->node.start, + vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start, cache_level, pte_flags); /* @@ -2651,8 +2652,7 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma, if (flags & I915_VMA_GLOBAL_BIND) { vma->vm->insert_entries(vma->vm, - vma->ggtt_view.pages, - vma->node.start, + vma->pages, vma->node.start, cache_level, pte_flags); } @@ -2660,8 +2660,7 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma, struct i915_hw_ppgtt *appgtt = to_i915(vma->vm->dev)->mm.aliasing_ppgtt; appgtt->base.insert_entries(&appgtt->base, - vma->ggtt_view.pages, - vma->node.start, + vma->pages, vma->node.start, cache_level, pte_flags); } @@ -3557,28 +3556,27 @@ i915_get_ggtt_vma_pages(struct i915_vma *vma) { int ret = 0; - if (vma->ggtt_view.pages) + if (vma->pages) return 0; if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) - vma->ggtt_view.pages = vma->obj->pages; + vma->pages = vma->obj->pages; else if (vma->ggtt_view.type == I915_GGTT_VIEW_ROTATED) - vma->ggtt_view.pages = + vma->pages = intel_rotate_fb_obj_pages(&vma->ggtt_view.params.rotated, vma->obj); else if (vma->ggtt_view.type == I915_GGTT_VIEW_PARTIAL) - vma->ggtt_view.pages = - intel_partial_pages(&vma->ggtt_view, vma->obj); + vma->pages = 
intel_partial_pages(&vma->ggtt_view, vma->obj); else WARN_ONCE(1, "GGTT view %u not implemented!\n", vma->ggtt_view.type); - if (!vma->ggtt_view.pages) { + if (!vma->pages) { DRM_ERROR("Failed to get pages for GGTT view type %u!\n", vma->ggtt_view.type); ret = -EINVAL; - } else if (IS_ERR(vma->ggtt_view.pages)) { - ret = PTR_ERR(vma->ggtt_view.pages); - vma->ggtt_view.pages = NULL; + } else if (IS_ERR(vma->pages)) { + ret = PTR_ERR(vma->pages); + vma->pages = NULL; DRM_ERROR("Failed to get pages for VMA view type %
[Intel-gfx] [CI 03/31] drm/i915: Store the active context object on all engines upon error
With execlists, we have context objects everywhere, not just RCS. So
store them for post-mortem debugging. This also has a secondary effect
of removing one more unsafe list iteration with using preserved state
from the hanging request. And now we can cross-reference the request's
context state with that loaded by the GPU.

Signed-off-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 28
 1 file changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index a51c5422c1bd..f34e63eda178 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1043,28 +1043,6 @@ static void error_record_engine_registers(struct drm_i915_error_state *error,
 	}
 }
 
-static void i915_gem_record_active_context(struct intel_engine_cs *engine,
-					   struct drm_i915_error_state *error,
-					   struct drm_i915_error_engine *ee)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	struct drm_i915_gem_object *obj;
-
-	/* Currently render ring is the only HW context user */
-	if (engine->id != RCS || !error->ccid)
-		return;
-
-	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		if (!i915_gem_obj_ggtt_bound(obj))
-			continue;
-
-		if ((error->ccid & PAGE_MASK) == i915_gem_obj_ggtt_offset(obj)) {
-			ee->ctx = i915_error_ggtt_object_create(dev_priv, obj);
-			break;
-		}
-	}
-}
-
 static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
 				  struct drm_i915_error_state *error)
 {
@@ -1114,6 +1092,10 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
 				i915_error_ggtt_object_create(dev_priv,
 							      engine->scratch.obj);
 
+			ee->ctx =
+				i915_error_ggtt_object_create(dev_priv,
+							      request->ctx->engine[i].state);
+
 			if (request->pid) {
 				struct task_struct *task;
 
@@ -1144,8 +1126,6 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
 		ee->wa_ctx =
 			i915_error_ggtt_object_create(dev_priv,
 						      engine->wa_ctx.obj);
-		i915_gem_record_active_context(engine, error, ee);
-
 		count = 0;
 		list_for_each_entry(request, &engine->request_list, link)
 			count++;
-- 
2.8.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [CI 02/31] drm/i915: Reduce amount of duplicate buffer information captured on error
When capturing the error state, we do not need to know about every
address space - just those that are related to the error. We know which
context is active at the time, therefore we know which VM are implicated
in the error. We can then restrict the VM which we report to the
relevant subset.

v2: s/i/count_active/ (and similar)
    Rewrite label generation for "Buffers"

Signed-off-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_drv.h       |   9 +-
 drivers/gpu/drm/i915/i915_gpu_error.c | 224 +++---
 2 files changed, 105 insertions(+), 128 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b1017950087b..7eb911e47904 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -517,6 +517,7 @@ struct drm_i915_error_state {
 		int num_waiters;
 		int hangcheck_score;
 		enum intel_engine_hangcheck_action hangcheck_action;
+		struct i915_address_space *vm;
 		int num_requests;
 
 		/* our own tracking of ring head and tail */
@@ -587,17 +588,15 @@ struct drm_i915_error_state {
 		u32 read_domains;
 		u32 write_domain;
 		s32 fence_reg:I915_MAX_NUM_FENCE_BITS;
-		s32 pinned:2;
 		u32 tiling:2;
 		u32 dirty:1;
 		u32 purgeable:1;
 		u32 userptr:1;
 		s32 engine:4;
 		u32 cache_level:3;
-	} **active_bo, **pinned_bo;
-
-	u32 *active_bo_count, *pinned_bo_count;
-	u32 vm_count;
+	} *active_bo[I915_NUM_ENGINES], *pinned_bo;
+	u32 active_bo_count[I915_NUM_ENGINES], pinned_bo_count;
+	struct i915_address_space *active_vm[I915_NUM_ENGINES];
 };
 
 struct intel_connector;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index d54848f5f246..a51c5422c1bd 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -42,16 +42,6 @@ static const char *engine_str(int engine)
 	}
 }
 
-static const char *pin_flag(int pinned)
-{
-	if (pinned > 0)
-		return " P";
-	else if (pinned < 0)
-		return " p";
-	else
-		return "";
-}
-
 static const char *tiling_flag(int tiling)
 {
 	switch (tiling) {
@@ -189,7 +179,7 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m,
 {
 	int i;
 
-	err_printf(m, " %s [%d]:\n", name, count);
+	err_printf(m, "%s [%d]:\n", name, count);
 
 	while (count--) {
 		err_printf(m, "%08x_%08x %8u %02x %02x [ ",
@@ -202,7 +192,6 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m,
 			err_printf(m, "%02x ", err->rseqno[i]);
 
 		err_printf(m, "] %02x", err->wseqno);
-		err_puts(m, pin_flag(err->pinned));
 		err_puts(m, tiling_flag(err->tiling));
 		err_puts(m, dirty_flag(err->dirty));
 		err_puts(m, purgeable_flag(err->purgeable));
@@ -414,18 +403,33 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 			error_print_engine(m, &error->engine[i]);
 	}
 
-	for (i = 0; i < error->vm_count; i++) {
-		err_printf(m, "vm[%d]\n", i);
+	for (i = 0; i < ARRAY_SIZE(error->active_vm); i++) {
+		char buf[128];
+		int len, first = 1;
 
-		print_error_buffers(m, "Active",
+		if (!error->active_vm[i])
+			break;
+
+		len = scnprintf(buf, sizeof(buf), "Active[%d] (", i);
+		for (j = 0; j < ARRAY_SIZE(error->engine); j++) {
+			if (error->engine[j].vm != error->active_vm[i])
+				continue;
+
+			len += scnprintf(buf + len, sizeof(buf), "%s%s",
+					 first ? "" : ", ",
+					 dev_priv->engine[j].name);
+			first = 0;
+		}
+		scnprintf(buf + len, sizeof(buf), ")");
+		print_error_buffers(m, buf,
 				    error->active_bo[i],
 				    error->active_bo_count[i]);
-
-		print_error_buffers(m, "Pinned",
-				    error->pinned_bo[i],
-				    error->pinned_bo_count[i]);
 	}
 
+	print_error_buffers(m, "Pinned (global)",
+			    error->pinned_bo,
+			    error->pinned_bo_count);
+
 	for (i = 0; i < ARRAY_SIZE(error->engine); i++) {
 		struct drm_i915_error_engine *ee = &error->engine[i];
 
@@ -627,13 +631,10 @@ static void i915_error_state_free(struct kref *error_ref)
 	i915_error_object_free(error->semaphore_o
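The "Rewrite label generation" step in this patch builds a per-VM heading such as "Active[0] (...)" by appending engine names into a fixed buffer. Below is a minimal userspace sketch of that pattern, not the driver code: bounded_printf() is a hypothetical stand-in for the kernel's scnprintf() (it returns the number of characters actually stored), build_active_label() and its parameters are made up for illustration, and each append is given only the space that remains in the buffer.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's scnprintf():
 * returns the number of characters stored, excluding the NUL. */
static int bounded_printf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0) {
		buf[0] = '\0';
		return 0;
	}
	/* vsnprintf() reports the would-be length; clamp on truncation. */
	return (size_t)n >= size ? (int)(size - 1) : n;
}

/* Build "Active[<vm_index>] (name, name, ...)" from the engines whose
 * uses_vm[] flag is set. Returns the length of the resulting string. */
static int build_active_label(char *buf, size_t size, int vm_index,
			      const char *const *names,
			      const int *uses_vm, int count)
{
	int len, j, first = 1;

	len = bounded_printf(buf, size, "Active[%d] (", vm_index);
	for (j = 0; j < count; j++) {
		if (!uses_vm[j])
			continue;

		/* Pass the remaining space, not the full buffer size. */
		len += bounded_printf(buf + len, size - (size_t)len, "%s%s",
				      first ? "" : ", ", names[j]);
		first = 0;
	}
	len += bounded_printf(buf + len, size - (size_t)len, ")");

	return len;
}
```

Because the helper never reports more than it stored, len stays below size and a too-small buffer just yields a truncated, still NUL-terminated label.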
[Intel-gfx] [CI 04/31] drm/i915: Remove inactive/active list from debugfs
These two files (i915_gem_active, i915_gem_inactive) no longer give
pertinent information since active/inactive tracking is per-vm and so we
need the information per-vm. They are obsolete so remove them.

Signed-off-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_debugfs.c | 49 -
 1 file changed, 49 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c461072da142..4c08e2d23002 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -210,53 +210,6 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		seq_printf(m, " (frontbuffer: 0x%03x)", frontbuffer_bits);
 }
 
-static int i915_gem_object_list_info(struct seq_file *m, void *data)
-{
-	struct drm_info_node *node = m->private;
-	uintptr_t list = (uintptr_t) node->info_ent->data;
-	struct list_head *head;
-	struct drm_device *dev = node->minor->dev;
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct i915_ggtt *ggtt = &dev_priv->ggtt;
-	struct i915_vma *vma;
-	u64 total_obj_size, total_gtt_size;
-	int count, ret;
-
-	ret = mutex_lock_interruptible(&dev->struct_mutex);
-	if (ret)
-		return ret;
-
-	/* FIXME: the user of this interface might want more than just GGTT */
-	switch (list) {
-	case ACTIVE_LIST:
-		seq_puts(m, "Active:\n");
-		head = &ggtt->base.active_list;
-		break;
-	case INACTIVE_LIST:
-		seq_puts(m, "Inactive:\n");
-		head = &ggtt->base.inactive_list;
-		break;
-	default:
-		mutex_unlock(&dev->struct_mutex);
-		return -EINVAL;
-	}
-
-	total_obj_size = total_gtt_size = count = 0;
-	list_for_each_entry(vma, head, vm_link) {
-		seq_printf(m, " ");
-		describe_obj(m, vma->obj);
-		seq_printf(m, "\n");
-		total_obj_size += vma->obj->base.size;
-		total_gtt_size += vma->node.size;
-		count++;
-	}
-	mutex_unlock(&dev->struct_mutex);
-
-	seq_printf(m, "Total %d objects, %llu bytes, %llu GTT size\n",
-		   count, total_obj_size, total_gtt_size);
-	return 0;
-}
-
 static int
 obj_rank_by_stolen(void *priv, struct list_head *A, struct list_head *B)
 {
@@ -5376,8 +5329,6 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_gem_objects", i915_gem_object_info, 0},
 	{"i915_gem_gtt", i915_gem_gtt_info, 0},
 	{"i915_gem_pinned", i915_gem_gtt_info, 0, (void *) PINNED_LIST},
-	{"i915_gem_active", i915_gem_object_list_info, 0, (void *) ACTIVE_LIST},
-	{"i915_gem_inactive", i915_gem_object_list_info, 0, (void *) INACTIVE_LIST},
 	{"i915_gem_stolen", i915_gem_stolen_list_info },
 	{"i915_gem_pageflip", i915_gem_pageflip_info, 0},
 	{"i915_gem_request", i915_gem_request_info, 0},
-- 
2.8.1
[Intel-gfx] [CI 05/31] drm/i915: Focus debugfs/i915_gem_pinned to show only display pins
Only those objects pinned to the display have semi-permanent pins of a
global nature (other pins are transient within their local vm). Simplify
i915_gem_pinned to only show the pertinent information about the pinned
objects within the GGTT.

v2: i915_gem_gtt_info is still shared with debugfs/i915_gem_gtt, rename
i915_gem_pinned to i915_gem_pin_display to better reflect its contents

Signed-off-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_debugfs.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 4c08e2d23002..c535c4c2f7af 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -40,12 +40,6 @@
 #include
 #include "i915_drv.h"
 
-enum {
-	ACTIVE_LIST,
-	INACTIVE_LIST,
-	PINNED_LIST,
-};
-
 /* As the drm_debugfs_init() routines are called before dev->dev_private is
  * allocated we need to hook into the minor for release. */
 static int
@@ -537,8 +531,8 @@ static int i915_gem_gtt_info(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = m->private;
 	struct drm_device *dev = node->minor->dev;
-	uintptr_t list = (uintptr_t) node->info_ent->data;
 	struct drm_i915_private *dev_priv = to_i915(dev);
+	bool show_pin_display_only = !!data;
 	struct drm_i915_gem_object *obj;
 	u64 total_obj_size, total_gtt_size;
 	int count, ret;
@@ -549,7 +543,7 @@ static int i915_gem_gtt_info(struct seq_file *m, void *data)
 
 	total_obj_size = total_gtt_size = count = 0;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		if (list == PINNED_LIST && !i915_gem_obj_is_pinned(obj))
+		if (show_pin_display_only && !obj->pin_display)
 			continue;
 
 		seq_puts(m, " ");
@@ -5328,7 +5322,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_capabilities", i915_capabilities, 0},
 	{"i915_gem_objects", i915_gem_object_info, 0},
 	{"i915_gem_gtt", i915_gem_gtt_info, 0},
-	{"i915_gem_pinned", i915_gem_gtt_info, 0, (void *) PINNED_LIST},
+	{"i915_gem_pin_display", i915_gem_gtt_info, 0, (void *)1},
 	{"i915_gem_stolen", i915_gem_stolen_list_info },
 	{"i915_gem_pageflip", i915_gem_pageflip_info, 0},
 	{"i915_gem_request", i915_gem_request_info, 0},
-- 
2.8.1
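The patch keeps one show routine shared between two table entries and tells them apart by the opaque data pointer attached to each entry (NULL for the full listing, non-NULL for the display-pin filter). A self-contained sketch of that table-driven pattern follows. All names here (struct object, struct info_entry, count_objects, run_entry, the entry names) are hypothetical and only mirror the shape of the debugfs list; this is not the DRM debugfs API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a GEM object: only the field we filter on. */
struct object {
	const char *name;
	bool pin_display;
};

typedef int (*show_fn)(const struct object *objs, size_t n, void *data);

/* Hypothetical table entry: a name, a shared callback, an opaque flag. */
struct info_entry {
	const char *name;
	show_fn show;
	void *data;
};

/* Shared routine: a non-NULL data pointer means "display pins only". */
static int count_objects(const struct object *objs, size_t n, void *data)
{
	bool show_pin_display_only = !!data;
	int count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (show_pin_display_only && !objs[i].pin_display)
			continue;
		count++;
	}
	return count;
}

static const struct info_entry entries[] = {
	{ "gem_gtt", count_objects, NULL },
	{ "gem_pin_display", count_objects, (void *)1 },
};

/* Look an entry up by name and run it; returns -1 if there is none. */
static int run_entry(const char *name, const struct object *objs, size_t n)
{
	size_t i;

	for (i = 0; i < sizeof(entries) / sizeof(entries[0]); i++) {
		if (strcmp(entries[i].name, name) == 0)
			return entries[i].show(objs, n, entries[i].data);
	}
	return -1;
}
```

Reducing the per-entry data to a boolean is what lets the patch drop the ACTIVE_LIST/INACTIVE_LIST/PINNED_LIST enum, since only one non-default mode is left.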
[Intel-gfx] [CI 10/31] drm/i915: Add fetch_and_zero() macro
A simple little macro to clear a pointer and return the old value. This
is useful for writing

	value = *ptr;
	if (!value)
		return;

	*ptr = 0;
	...
	free(value);

in a slightly more concise form:

	value = fetch_and_zero(ptr);
	if (!value)
		return;

	...
	free(value);

with the idea that this establishes a pattern that may be extended for
atomic use (using xchg or cmpxchg) i.e. atomic_fetch_and_zero() and
similar to llist.

Signed-off-by: Chris Wilson
Cc: Joonas Lahtinen
Cc: Daniel Vetter
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_drv.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 25b1e6c010d5..855833a6306a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3920,4 +3920,10 @@ bool i915_memcpy_from_wc(void *dst, const void *src, unsigned long len);
 #define ptr_pack_bits(ptr, bits) \
 	((typeof(ptr))((unsigned long)(ptr) | (bits)))
 
+#define fetch_and_zero(ptr) ({ \
+	typeof(*ptr) __T = *(ptr); \
+	*(ptr) = (typeof(*ptr))0; \
+	__T;\
+})
+
 #endif
-- 
2.8.1
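For readers who want to try the pattern outside the kernel, here is a small userspace sketch. It assumes a compiler with the GNU C statement-expression and __typeof__ extensions (gcc or clang). The macro body follows the one in the patch, with the temporary renamed to fz_old to avoid a reserved identifier in user code. struct holder and holder_release() are hypothetical names used only for illustration.

```c
#include <stdlib.h>

/* Same shape as the macro in the patch: read the old value, clear the
 * location, and yield the old value as the expression's result. */
#define fetch_and_zero(ptr) ({				\
	__typeof__(*(ptr)) fz_old = *(ptr);		\
	*(ptr) = (__typeof__(*(ptr)))0;			\
	fz_old;						\
})

/* Hypothetical owner of a heap buffer. */
struct holder {
	char *buf;
};

/* Release the buffer at most once. Returns 1 if something was freed,
 * 0 if the holder was already empty. */
static int holder_release(struct holder *h)
{
	char *buf = fetch_and_zero(&h->buf);

	if (!buf)
		return 0;

	free(buf);
	return 1;
}
```

Calling holder_release() twice on the same holder frees the buffer on the first call and does nothing on the second, because the pointer was cleared in the same step that fetched it. As the commit message notes, this plain version is not atomic; making it so would need xchg or cmpxchg.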
[Intel-gfx] [CI 09/31] drm/i915: Create a VMA for an object
In many places, we wish to store the VMA in preference to the object
itself and so being able to create the persistent VMA is useful.

Signed-off-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 11 +++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  5 +
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 9c178b0c40b5..1bec50bd651b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3387,6 +3387,17 @@ __i915_gem_vma_create(struct drm_i915_gem_object *obj,
 }
 
 struct i915_vma *
+i915_vma_create(struct drm_i915_gem_object *obj,
+		struct i915_address_space *vm,
+		const struct i915_ggtt_view *view)
+{
+	GEM_BUG_ON(view && !i915_is_ggtt(vm));
+	GEM_BUG_ON(view ? i915_gem_obj_to_ggtt_view(obj, view) : i915_gem_obj_to_vma(obj, vm));
+
+	return __i915_gem_vma_create(obj, vm, view ?: &i915_ggtt_view_normal);
+}
+
+struct i915_vma *
 i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
 				  struct i915_address_space *vm)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index b580e8a013ce..f2769e01cc8c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -228,6 +228,11 @@ struct i915_vma {
 	struct drm_i915_gem_exec_object2 *exec_entry;
 };
 
+struct i915_vma *
+i915_vma_create(struct drm_i915_gem_object *obj,
+		struct i915_address_space *vm,
+		const struct i915_ggtt_view *view);
+
 static inline bool i915_vma_is_ggtt(const struct i915_vma *vma)
 {
 	return vma->flags & I915_VMA_GGTT;
-- 
2.8.1
Re: [Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
On 8/12/2016 6:47 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 07:25, akash.g...@intel.com wrote:
>> From: Sagar Arun Kamble
>>
>> GuC ukernel sends an interrupt to Host to flush the log buffer
>> and expects Host to correspondingly update the read pointer
>> information in the state structure, once it has consumed the
>> log buffer contents by copying them to a file or buffer.
>> Even if Host couldn't copy the contents, it can still update the
>> read pointer so that logging state is not disturbed on GuC side.
>>
>> v2:
>> - Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
>> - Reduce the overall log buffer copying time by skipping the copy of
>>   crash buffer area for regular cases and copying only the state
>>   structure data in first page.
>>
>> v3:
>> - Create a vmalloc mapping of log buffer. (Chris)
>> - Cover the flush acknowledgment under rpm get & put.(Chris)
>> - Revert the change of skipping the copy of crash dump area, as
>>   not really needed, will be covered by subsequent patch.
>>
>> v4:
>> - Destroy the wq under the same condition in which it was created,
>>   pass dev_piv pointer instead of dev to newly added GuC function,
>>   add more comments & rename variable for clarity. (Tvrtko)
>>
>> Signed-off-by: Sagar Arun Kamble
>> Signed-off-by: Akash Goel
>> ---
>>  drivers/gpu/drm/i915/i915_drv.c            |  14 +++
>>  drivers/gpu/drm/i915/i915_guc_submission.c | 150 +
>>  drivers/gpu/drm/i915/i915_irq.c            |   5 +-
>>  drivers/gpu/drm/i915/intel_guc.h           |   3 +
>>  4 files changed, 170 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>> index 0fcd1c0..fc2da32 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv)
>>      if (dev_priv->hotplug.dp_wq == NULL)
>>          goto out_free_wq;
>>
>> +    if (HAS_GUC_SCHED(dev_priv)) {
>
> This just reminded me that a previous patch had:
>
> +if (HAS_GUC_UCODE(dev))
> +    dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT;
>
> In the interrupt setup. I don't think there is a bug right now, but
> there is a disagreement between the two which would be good to resolve.
>
> This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED
> for correctness. I think.

Sorry for the inconsistency, will use HAS_GUC_SCHED in the previous patch.

As per Chris's comments, will move the wq init/destroy to the GuC logging
setup/teardown routines (guc_create_log_extras, guc_log_cleanup).
Are you fine with that?

>> +        /* Need a dedicated wq to process log buffer flush interrupts
>> +         * from GuC without much delay so as to avoid any loss of logs.
>> +         */
>> +        dev_priv->guc.log.wq =
>> +            alloc_ordered_workqueue("i915-guc_log", 0);
>> +        if (dev_priv->guc.log.wq == NULL)
>> +            goto out_free_hotplug_dp_wq;
>> +    }
>> +
>>      return 0;
>>
>> +out_free_hotplug_dp_wq:
>> +    destroy_workqueue(dev_priv->hotplug.dp_wq);
>>  out_free_wq:
>>      destroy_workqueue(dev_priv->wq);
>>  out_err:
>> @@ -782,6 +794,8 @@ out_err:
>>
>>  static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
>>  {
>> +    if (HAS_GUC_SCHED(dev_priv))
>> +        destroy_workqueue(dev_priv->guc.log.wq);
>>      destroy_workqueue(dev_priv->hotplug.dp_wq);
>>      destroy_workqueue(dev_priv->wq);
>>  }
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index c7c679f..2635b67 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct intel_guc *guc,
>>      return host2guc_action(guc, data, ARRAY_SIZE(data));
>>  }
>>
>> +static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
>> +{
>> +    u32 data[1];
>> +
>> +    data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
>> +
>> +    return host2guc_action(guc, data, 1);
>> +}
>> +
>>  /*
>>   * Initialise, update, or clear doorbell data shared with the GuC
>>   *
>> @@ -840,6 +849,127 @@ err:
>>      return NULL;
>>  }
>>
>> +static void guc_move_to_next_buf(struct intel_guc *guc)
>> +{
>> +    return;
>> +}
>> +
>> +static void* guc_get_write_buffer(struct intel_guc *guc)
>> +{
>> +    return NULL;
>> +}
>> +
>> +static void guc_read_update_log_buffer(struct intel_guc *guc)
>> +{
>> +    struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state;
>> +    struct guc_log_buffer_state log_buffer_state_local;
>> +    void *src_data_ptr, *dst_data_ptr;
>> +    u32 i, buffer_size;
>
> unsigned int i if you can be bothered.

Fine, will do that for both i & buffer_size.

But I remember that earlier, in one of the patches, you suggested using
u32 as the type for some variables. Could you please share the guideline?
Should u32, u64 be used when we are exactly sure of the range of the
variable, like for variables containing register values?

>> +
>> +    if (!guc->log.buf_addr)
>> +        return;
>
> Can it hit this? If yes, I think better disable GuC logging when pin map
> on the object fails rather than let it gen