Re: [Intel-gfx] [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines

2017-03-10 Thread Chris Wilson
On Fri, Mar 10, 2017 at 12:42:34PM +0200, Mika Kuoppala wrote:
> Chris Wilson  writes:
> 
> > On Fri, Mar 10, 2017 at 12:26:10PM +0200, Mika Kuoppala wrote:
> >> Chris Wilson  writes:
> >> 
> >> > On Fri, Mar 10, 2017 at 10:04:16AM +, Chris Wilson wrote:
> >> >> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote:
> >> >> > Chris Wilson  writes:
> >> >> > 
> >> >> > > We only need to clflush those cachelines that we have validated to 
> >> >> > > be
> >> >> > > read by the GPU. Userspace typically fills the batch length in
> >> >> > > correctly, the exceptions tend to be explicit tests within igt.
> >> >> > >
> >> >> > > Signed-off-by: Chris Wilson 
> >> >> > > ---
> >> >> > >  drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
> >> >> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> >> >> > >
> >> >> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
> >> >> > > b/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> >> > > index 21b1cd917d81..b9ce9a6881ea 100644
> >> >> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> >> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> >> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct 
> >> >> > > intel_engine_cs *engine,
> >> >> > > }
> >> >> > >  
> >> >> > > if (ret == 0 && needs_clflush_after)
> >> >> > > -   drm_clflush_virt_range(shadow_batch_obj->mm.mapping, 
> >> >> > > batch_len);
> >> >> > > +   drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
> >> >> > > +  (void *)cmd - 
> >> >> > > shadow_batch_obj->mm.mapping);
> >> >> > 
> >> >> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)
> >> >> > 
> >> >> > We get away as the wb mapping being zero but for correctness.
> >> >> 
> >> >> The low bits of mm.mapping cannot change the cacheline and so doesn't
> >> >> affect the clflush. Same as before.
> >> >
> >> > Although, not quite the same as before as before we didn't use a fixed
> >> > end-point and so it could overshoot by a cacheline, even across a page
> >> > boundary.
> >> 
> >> And there could be PAGE_MASK worth of lowbits.
> >
> > There can? Wasn't anticipating adding a few thousand memory types, and
> > then sharing a single cache. :-p
> 
> #define ptr_mask_bits(ptr) ({
>  \
>  unsigned long __v = (unsigned long)(ptr);
>  \
>  (typeof(ptr))(__v & PAGE_MASK);
>  \
>  })

Yes, I didn't name that function very well. The low bits of
obj->mm.mapping currently represent and index into the cache.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines

2017-03-10 Thread Mika Kuoppala
Chris Wilson  writes:

> On Fri, Mar 10, 2017 at 12:26:10PM +0200, Mika Kuoppala wrote:
>> Chris Wilson  writes:
>> 
>> > On Fri, Mar 10, 2017 at 10:04:16AM +, Chris Wilson wrote:
>> >> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote:
>> >> > Chris Wilson  writes:
>> >> > 
>> >> > > We only need to clflush those cachelines that we have validated to be
>> >> > > read by the GPU. Userspace typically fills the batch length in
>> >> > > correctly, the exceptions tend to be explicit tests within igt.
>> >> > >
>> >> > > Signed-off-by: Chris Wilson 
>> >> > > ---
>> >> > >  drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
>> >> > >  1 file changed, 2 insertions(+), 1 deletion(-)
>> >> > >
>> >> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
>> >> > > b/drivers/gpu/drm/i915/i915_cmd_parser.c
>> >> > > index 21b1cd917d81..b9ce9a6881ea 100644
>> >> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
>> >> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
>> >> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct 
>> >> > > intel_engine_cs *engine,
>> >> > >   }
>> >> > >  
>> >> > >   if (ret == 0 && needs_clflush_after)
>> >> > > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, 
>> >> > > batch_len);
>> >> > > + drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
>> >> > > +(void *)cmd - 
>> >> > > shadow_batch_obj->mm.mapping);
>> >> > 
>> >> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)
>> >> > 
>> >> > We get away as the wb mapping being zero but for correctness.
>> >> 
>> >> The low bits of mm.mapping cannot change the cacheline and so doesn't
>> >> affect the clflush. Same as before.
>> >
>> > Although, not quite the same as before as before we didn't use a fixed
>> > end-point and so it could overshoot by a cacheline, even across a page
>> > boundary.
>> 
>> And there could be PAGE_MASK worth of lowbits.
>
> There can? Wasn't anticipating adding a few thousand memory types, and
> then sharing a single cache. :-p

#define ptr_mask_bits(ptr) ({
 \
 unsigned long __v = (unsigned long)(ptr);
 \
 (typeof(ptr))(__v & PAGE_MASK);
 \
 })

> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines

2017-03-10 Thread Chris Wilson
On Fri, Mar 10, 2017 at 12:26:10PM +0200, Mika Kuoppala wrote:
> Chris Wilson  writes:
> 
> > On Fri, Mar 10, 2017 at 10:04:16AM +, Chris Wilson wrote:
> >> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote:
> >> > Chris Wilson  writes:
> >> > 
> >> > > We only need to clflush those cachelines that we have validated to be
> >> > > read by the GPU. Userspace typically fills the batch length in
> >> > > correctly, the exceptions tend to be explicit tests within igt.
> >> > >
> >> > > Signed-off-by: Chris Wilson 
> >> > > ---
> >> > >  drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
> >> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> >> > >
> >> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
> >> > > b/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> > > index 21b1cd917d81..b9ce9a6881ea 100644
> >> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct 
> >> > > intel_engine_cs *engine,
> >> > >}
> >> > >  
> >> > >if (ret == 0 && needs_clflush_after)
> >> > > -  drm_clflush_virt_range(shadow_batch_obj->mm.mapping, 
> >> > > batch_len);
> >> > > +  drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
> >> > > + (void *)cmd - 
> >> > > shadow_batch_obj->mm.mapping);
> >> > 
> >> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)
> >> > 
> >> > We get away as the wb mapping being zero but for correctness.
> >> 
> >> The low bits of mm.mapping cannot change the cacheline and so doesn't
> >> affect the clflush. Same as before.
> >
> > Although, not quite the same as before as before we didn't use a fixed
> > end-point and so it could overshoot by a cacheline, even across a page
> > boundary.
> 
> And there could be PAGE_MASK worth of lowbits.

There can? Wasn't anticipating adding a few thousand memory types, and
then sharing a single cache. :-p
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines

2017-03-10 Thread Mika Kuoppala
Chris Wilson  writes:

> On Fri, Mar 10, 2017 at 10:04:16AM +, Chris Wilson wrote:
>> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote:
>> > Chris Wilson  writes:
>> > 
>> > > We only need to clflush those cachelines that we have validated to be
>> > > read by the GPU. Userspace typically fills the batch length in
>> > > correctly, the exceptions tend to be explicit tests within igt.
>> > >
>> > > Signed-off-by: Chris Wilson 
>> > > ---
>> > >  drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
>> > >  1 file changed, 2 insertions(+), 1 deletion(-)
>> > >
>> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
>> > > b/drivers/gpu/drm/i915/i915_cmd_parser.c
>> > > index 21b1cd917d81..b9ce9a6881ea 100644
>> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
>> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
>> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs 
>> > > *engine,
>> > >  }
>> > >  
>> > >  if (ret == 0 && needs_clflush_after)
>> > > -drm_clflush_virt_range(shadow_batch_obj->mm.mapping, 
>> > > batch_len);
>> > > +drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
>> > > +   (void *)cmd - 
>> > > shadow_batch_obj->mm.mapping);
>> > 
>> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)
>> > 
>> > We get away as the wb mapping being zero but for correctness.
>> 
>> The low bits of mm.mapping cannot change the cacheline and so doesn't
>> affect the clflush. Same as before.
>
> Although, not quite the same as before as before we didn't use a fixed
> end-point and so it could overshoot by a cacheline, even across a page
> boundary.

And there could be PAGE_MASK worth of lowbits.
-Mika

> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines

2017-03-10 Thread Chris Wilson
On Fri, Mar 10, 2017 at 10:04:16AM +, Chris Wilson wrote:
> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote:
> > Chris Wilson  writes:
> > 
> > > We only need to clflush those cachelines that we have validated to be
> > > read by the GPU. Userspace typically fills the batch length in
> > > correctly, the exceptions tend to be explicit tests within igt.
> > >
> > > Signed-off-by: Chris Wilson 
> > > ---
> > >  drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
> > > b/drivers/gpu/drm/i915/i915_cmd_parser.c
> > > index 21b1cd917d81..b9ce9a6881ea 100644
> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs 
> > > *engine,
> > >   }
> > >  
> > >   if (ret == 0 && needs_clflush_after)
> > > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
> > > + drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
> > > +(void *)cmd - 
> > > shadow_batch_obj->mm.mapping);
> > 
> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)
> > 
> > We get away as the wb mapping being zero but for correctness.
> 
> The low bits of mm.mapping cannot change the cacheline and so doesn't
> affect the clflush. Same as before.

Although, not quite the same as before as before we didn't use a fixed
end-point and so it could overshoot by a cacheline, even across a page
boundary.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines

2017-03-10 Thread Chris Wilson
On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote:
> Chris Wilson  writes:
> 
> > We only need to clflush those cachelines that we have validated to be
> > read by the GPU. Userspace typically fills the batch length in
> > correctly, the exceptions tend to be explicit tests within igt.
> >
> > Signed-off-by: Chris Wilson 
> > ---
> >  drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
> > b/drivers/gpu/drm/i915/i915_cmd_parser.c
> > index 21b1cd917d81..b9ce9a6881ea 100644
> > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs 
> > *engine,
> > }
> >  
> > if (ret == 0 && needs_clflush_after)
> > -   drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
> > +   drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
> > +  (void *)cmd - 
> > shadow_batch_obj->mm.mapping);
> 
> (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)
> 
> We get away as the wb mapping being zero but for correctness.

The low bits of mm.mapping cannot change the cacheline and so doesn't
affect the clflush. Same as before.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines

2017-03-10 Thread Mika Kuoppala
Chris Wilson  writes:

> We only need to clflush those cachelines that we have validated to be
> read by the GPU. Userspace typically fills the batch length in
> correctly, the exceptions tend to be explicit tests within igt.
>
> Signed-off-by: Chris Wilson 
> ---
>  drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
> b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 21b1cd917d81..b9ce9a6881ea 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs 
> *engine,
>   }
>  
>   if (ret == 0 && needs_clflush_after)
> - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
> + drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
> +(void *)cmd - 
> shadow_batch_obj->mm.mapping);

(void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)

We get away as the wb mapping being zero but for correctness.

-Mika

>   i915_gem_object_unpin_map(shadow_batch_obj);
>  
>   return ret;
> -- 
> 2.11.0
>
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx