Re: [Intel-gfx] [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
On Fri, Mar 10, 2017 at 12:42:34PM +0200, Mika Kuoppala wrote: > Chris Wilson writes: > > > On Fri, Mar 10, 2017 at 12:26:10PM +0200, Mika Kuoppala wrote: > >> Chris Wilson writes: > >> > >> > On Fri, Mar 10, 2017 at 10:04:16AM +, Chris Wilson wrote: > >> >> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote: > >> >> > Chris Wilson writes: > >> >> > > >> >> > > We only need to clflush those cachelines that we have validated to > >> >> > > be > >> >> > > read by the GPU. Userspace typically fills the batch length in > >> >> > > correctly, the exceptions tend to be explicit tests within igt. > >> >> > > > >> >> > > Signed-off-by: Chris Wilson > >> >> > > --- > >> >> > > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++- > >> >> > > 1 file changed, 2 insertions(+), 1 deletion(-) > >> >> > > > >> >> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c > >> >> > > b/drivers/gpu/drm/i915/i915_cmd_parser.c > >> >> > > index 21b1cd917d81..b9ce9a6881ea 100644 > >> >> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c > >> >> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c > >> >> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct > >> >> > > intel_engine_cs *engine, > >> >> > > } > >> >> > > > >> >> > > if (ret == 0 && needs_clflush_after) > >> >> > > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, > >> >> > > batch_len); > >> >> > > + drm_clflush_virt_range(shadow_batch_obj->mm.mapping, > >> >> > > + (void *)cmd - > >> >> > > shadow_batch_obj->mm.mapping); > >> >> > > >> >> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping) > >> >> > > >> >> > We get away as the wb mapping being zero but for correctness. > >> >> > >> >> The low bits of mm.mapping cannot change the cacheline and so doesn't > >> >> affect the clflush. Same as before. > >> > > >> > Although, not quite the same as before as before we didn't use a fixed > >> > end-point and so it could overshoot by a cacheline, even across a page > >> > boundary. > >> > >> And there could be PAGE_MASK worth of lowbits. > > > > There can? Wasn't anticipating adding a few thousand memory types, and > > then sharing a single cache. :-p > > #define ptr_mask_bits(ptr) ({ > \ > unsigned long __v = (unsigned long)(ptr); > \ > (typeof(ptr))(__v & PAGE_MASK); > \ > }) Yes, I didn't name that function very well. The low bits of obj->mm.mapping currently represent and index into the cache. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
Chris Wilson writes: > On Fri, Mar 10, 2017 at 12:26:10PM +0200, Mika Kuoppala wrote: >> Chris Wilson writes: >> >> > On Fri, Mar 10, 2017 at 10:04:16AM +, Chris Wilson wrote: >> >> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote: >> >> > Chris Wilson writes: >> >> > >> >> > > We only need to clflush those cachelines that we have validated to be >> >> > > read by the GPU. Userspace typically fills the batch length in >> >> > > correctly, the exceptions tend to be explicit tests within igt. >> >> > > >> >> > > Signed-off-by: Chris Wilson >> >> > > --- >> >> > > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++- >> >> > > 1 file changed, 2 insertions(+), 1 deletion(-) >> >> > > >> >> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c >> >> > > b/drivers/gpu/drm/i915/i915_cmd_parser.c >> >> > > index 21b1cd917d81..b9ce9a6881ea 100644 >> >> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c >> >> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c >> >> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct >> >> > > intel_engine_cs *engine, >> >> > > } >> >> > > >> >> > > if (ret == 0 && needs_clflush_after) >> >> > > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, >> >> > > batch_len); >> >> > > + drm_clflush_virt_range(shadow_batch_obj->mm.mapping, >> >> > > +(void *)cmd - >> >> > > shadow_batch_obj->mm.mapping); >> >> > >> >> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping) >> >> > >> >> > We get away as the wb mapping being zero but for correctness. >> >> >> >> The low bits of mm.mapping cannot change the cacheline and so doesn't >> >> affect the clflush. Same as before. >> > >> > Although, not quite the same as before as before we didn't use a fixed >> > end-point and so it could overshoot by a cacheline, even across a page >> > boundary. >> >> And there could be PAGE_MASK worth of lowbits. > > There can? Wasn't anticipating adding a few thousand memory types, and > then sharing a single cache. :-p #define ptr_mask_bits(ptr) ({ \ unsigned long __v = (unsigned long)(ptr); \ (typeof(ptr))(__v & PAGE_MASK); \ }) > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
On Fri, Mar 10, 2017 at 12:26:10PM +0200, Mika Kuoppala wrote: > Chris Wilson writes: > > > On Fri, Mar 10, 2017 at 10:04:16AM +, Chris Wilson wrote: > >> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote: > >> > Chris Wilson writes: > >> > > >> > > We only need to clflush those cachelines that we have validated to be > >> > > read by the GPU. Userspace typically fills the batch length in > >> > > correctly, the exceptions tend to be explicit tests within igt. > >> > > > >> > > Signed-off-by: Chris Wilson > >> > > --- > >> > > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++- > >> > > 1 file changed, 2 insertions(+), 1 deletion(-) > >> > > > >> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c > >> > > b/drivers/gpu/drm/i915/i915_cmd_parser.c > >> > > index 21b1cd917d81..b9ce9a6881ea 100644 > >> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c > >> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c > >> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct > >> > > intel_engine_cs *engine, > >> > >} > >> > > > >> > >if (ret == 0 && needs_clflush_after) > >> > > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, > >> > > batch_len); > >> > > + drm_clflush_virt_range(shadow_batch_obj->mm.mapping, > >> > > + (void *)cmd - > >> > > shadow_batch_obj->mm.mapping); > >> > > >> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping) > >> > > >> > We get away as the wb mapping being zero but for correctness. > >> > >> The low bits of mm.mapping cannot change the cacheline and so doesn't > >> affect the clflush. Same as before. > > > > Although, not quite the same as before as before we didn't use a fixed > > end-point and so it could overshoot by a cacheline, even across a page > > boundary. > > And there could be PAGE_MASK worth of lowbits. There can? Wasn't anticipating adding a few thousand memory types, and then sharing a single cache. :-p -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
Chris Wilson writes: > On Fri, Mar 10, 2017 at 10:04:16AM +, Chris Wilson wrote: >> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote: >> > Chris Wilson writes: >> > >> > > We only need to clflush those cachelines that we have validated to be >> > > read by the GPU. Userspace typically fills the batch length in >> > > correctly, the exceptions tend to be explicit tests within igt. >> > > >> > > Signed-off-by: Chris Wilson >> > > --- >> > > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++- >> > > 1 file changed, 2 insertions(+), 1 deletion(-) >> > > >> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c >> > > b/drivers/gpu/drm/i915/i915_cmd_parser.c >> > > index 21b1cd917d81..b9ce9a6881ea 100644 >> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c >> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c >> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs >> > > *engine, >> > > } >> > > >> > > if (ret == 0 && needs_clflush_after) >> > > -drm_clflush_virt_range(shadow_batch_obj->mm.mapping, >> > > batch_len); >> > > +drm_clflush_virt_range(shadow_batch_obj->mm.mapping, >> > > + (void *)cmd - >> > > shadow_batch_obj->mm.mapping); >> > >> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping) >> > >> > We get away as the wb mapping being zero but for correctness. >> >> The low bits of mm.mapping cannot change the cacheline and so doesn't >> affect the clflush. Same as before. > > Although, not quite the same as before as before we didn't use a fixed > end-point and so it could overshoot by a cacheline, even across a page > boundary. And there could be PAGE_MASK worth of lowbits. -Mika > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
On Fri, Mar 10, 2017 at 10:04:16AM +, Chris Wilson wrote: > On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote: > > Chris Wilson writes: > > > > > We only need to clflush those cachelines that we have validated to be > > > read by the GPU. Userspace typically fills the batch length in > > > correctly, the exceptions tend to be explicit tests within igt. > > > > > > Signed-off-by: Chris Wilson > > > --- > > > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++- > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c > > > b/drivers/gpu/drm/i915/i915_cmd_parser.c > > > index 21b1cd917d81..b9ce9a6881ea 100644 > > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c > > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c > > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs > > > *engine, > > > } > > > > > > if (ret == 0 && needs_clflush_after) > > > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len); > > > + drm_clflush_virt_range(shadow_batch_obj->mm.mapping, > > > +(void *)cmd - > > > shadow_batch_obj->mm.mapping); > > > > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping) > > > > We get away as the wb mapping being zero but for correctness. > > The low bits of mm.mapping cannot change the cacheline and so doesn't > affect the clflush. Same as before. Although, not quite the same as before as before we didn't use a fixed end-point and so it could overshoot by a cacheline, even across a page boundary. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote: > Chris Wilson writes: > > > We only need to clflush those cachelines that we have validated to be > > read by the GPU. Userspace typically fills the batch length in > > correctly, the exceptions tend to be explicit tests within igt. > > > > Signed-off-by: Chris Wilson > > --- > > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c > > b/drivers/gpu/drm/i915/i915_cmd_parser.c > > index 21b1cd917d81..b9ce9a6881ea 100644 > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs > > *engine, > > } > > > > if (ret == 0 && needs_clflush_after) > > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len); > > + drm_clflush_virt_range(shadow_batch_obj->mm.mapping, > > + (void *)cmd - > > shadow_batch_obj->mm.mapping); > > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping) > > We get away as the wb mapping being zero but for correctness. The low bits of mm.mapping cannot change the cacheline and so doesn't affect the clflush. Same as before. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
Chris Wilson writes: > We only need to clflush those cachelines that we have validated to be > read by the GPU. Userspace typically fills the batch length in > correctly, the exceptions tend to be explicit tests within igt. > > Signed-off-by: Chris Wilson > --- > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c > b/drivers/gpu/drm/i915/i915_cmd_parser.c > index 21b1cd917d81..b9ce9a6881ea 100644 > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs > *engine, > } > > if (ret == 0 && needs_clflush_after) > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len); > + drm_clflush_virt_range(shadow_batch_obj->mm.mapping, > +(void *)cmd - > shadow_batch_obj->mm.mapping); (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping) We get away as the wb mapping being zero but for correctness. -Mika > i915_gem_object_unpin_map(shadow_batch_obj); > > return ret; > -- > 2.11.0 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx