subject:"\[Mesa\-dev\] \[PATCH\] i965 \: optimized bucket index calculation"

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

2017-09-12 Thread Marathe, Yogesh

>-Original Message-
>From: Matt Turner [mailto:matts...@gmail.com]
>
>On Tue, Sep 12, 2017 at 10:19 AM, Ian Romanick  wrote:
>> On 09/12/2017 02:40 AM, Marathe, Yogesh wrote:
>>> Hi Jason,
>>>
>>>
>>>
>>> On the asserts you’ve mentioned below, I assume we need to add them
>>> after ‘bufmgr->num_buckets++’ in add_bucket() as num_buckets could be
>>> 0 initially. Another clarification on ~1%, we meant approx. 1% there,
>>> that’s an improvement we saw in 3Dmark total not a degradation, we’ll
>>> correct it in commit msg.
>>
>> I think the problem is that there is insufficient information about
>> your data.  What we want to see in a commit message is something like:
>>
>> commit 5ae2de81c8350272c122ea38e6bb4c0a41d58921
>> Author: Kenneth Graunke 
>> Date:   Mon Aug 28 16:08:32 2017 -0700
>>
>> i965: Use BLORP for buffer object stall avoidance blits instead of BLT.
>>
>> Improves performance of GFXBench4 tests at 1024x768 on a Kabylake GT2:
>> - Manhattan 3.1 by 1.32134% +/- 0.322734% (n=8).
>> - Car Chase by 1.25607% +/- 0.291262% (n=5).
>>
>> Reviewed-by: Jason Ekstrand 
>>
>> The important bits are:
>>
>> - average improvement
>> - statistical deviation
>> - number of runs
>
>And for generating such data, we often use http://anholt.net/compare-perf/

Ok. Thanks Matt and Ian, we'll look at it and update commit msg accordingly.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

2017-09-12 Thread Matt Turner

On Tue, Sep 12, 2017 at 10:19 AM, Ian Romanick  wrote:
> On 09/12/2017 02:40 AM, Marathe, Yogesh wrote:
>> Hi Jason,
>>
>>
>>
>> On the asserts you’ve mentioned below, I assume we need to add them
>> after ‘bufmgr->num_buckets++’ in add_bucket() as num_buckets could be 0
>> initially. Another clarification on ~1%, we meant approx. 1% there,
>> that’s an improvement we saw in 3Dmark total not a degradation, we’ll
>> correct it in commit msg.
>
> I think the problem is that there is insufficient information about your
> data.  What we want to see in a commit message is something like:
>
> commit 5ae2de81c8350272c122ea38e6bb4c0a41d58921
> Author: Kenneth Graunke 
> Date:   Mon Aug 28 16:08:32 2017 -0700
>
> i965: Use BLORP for buffer object stall avoidance blits instead of BLT.
>
> Improves performance of GFXBench4 tests at 1024x768 on a Kabylake GT2:
> - Manhattan 3.1 by 1.32134% +/- 0.322734% (n=8).
> - Car Chase by 1.25607% +/- 0.291262% (n=5).
>
> Reviewed-by: Jason Ekstrand 
>
> The important bits are:
>
> - average improvement
> - statistical deviation
> - number of runs

And for generating such data, we often use http://anholt.net/compare-perf/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

2017-09-12 Thread Ian Romanick

On 09/12/2017 02:40 AM, Marathe, Yogesh wrote:
> Hi Jason,
> 
>  
> 
> On the asserts you’ve mentioned below, I assume we need to add them
> after ‘bufmgr->num_buckets++’ in add_bucket() as num_buckets could be 0
> initially. Another clarification on ~1%, we meant approx. 1% there,
> that’s an improvement we saw in 3Dmark total not a degradation, we’ll
> correct it in commit msg.

I think the problem is that there is insufficient information about your
data.  What we want to see in a commit message is something like:

commit 5ae2de81c8350272c122ea38e6bb4c0a41d58921
Author: Kenneth Graunke <kenn...@whitecape.org>
Date:   Mon Aug 28 16:08:32 2017 -0700

i965: Use BLORP for buffer object stall avoidance blits instead of BLT.

Improves performance of GFXBench4 tests at 1024x768 on a Kabylake GT2:
- Manhattan 3.1 by 1.32134% +/- 0.322734% (n=8).
- Car Chase by 1.25607% +/- 0.291262% (n=5).

Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>

The important bits are:

- average improvement
- statistical deviation
- number of runs
- platform
- name of specific benchmark

> Rest all review comments from you, Tapani and Emil are noted &
> implemented, we are working on running it through mesa CI/CTS and we
> should see a v2 for review after that.
> 
>  
> 
> Regards,
> 
> Yogesh.
> 
>  
> 
> *From:*mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] *On
> Behalf Of *Jason Ekstrand
> *Sent:* Friday, September 8, 2017 9:09 PM
> *To:* Muthukumar, Aravindan <aravindan.muthuku...@intel.com>
> *Cc:* mesa-dev@lists.freedesktop.org; J Karanje, Kedar
> <kedar.j.kara...@intel.com>
> *Subject:* Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation
> 
>  
> 
> In general, I'm very concerned about how this handles rounding
> behavior.  Almost everywhere, you round down when what you want to do is
> round up.  Also, as I said on IRC, I'd like to see some asserts in
> add_bucket so that we are sure this calculation is correct.  In
> particular, I'd like to see
> 
>  
> 
> assert(bucket_for_size(size) == >cache_bucket[i]);
> 
> assert(bucket_for_size(size - 2048) == >cache_bucket[i]);
> 
> assert(bucket_for_size(size + 1) != >cache_bucket[i]);
> 
>  
> 
> We need to check on both sides of size to be 100% sure we're doing our
> rounding correctly.
> 
>  
> 
> On Fri, Sep 8, 2017 at 1:11 AM, <aravindan.muthuku...@intel.com
> <mailto:aravindan.muthuku...@intel.com>> wrote:
> 
> From: Aravindan Muthukumar <aravindan.muthuku...@intel.com
> <mailto:aravindan.muthuku...@intel.com>>
> 
> Avoiding the loop which was running with O(n) complexity.
> Now the complexity has been reduced to O(1)
> 
> Tested with piglit.
> Slight performance improvement (~1%) in 3d mark.
> 
>  
> 
> Which 3dmark test?  Also, what's the error in that 1%?
> 
>  
> 
> Change-Id: Id099f1cd24ad5b691a69070eda79b8f4e9be39a6
> Signed-off-by: Aravindan Muthukumar <aravindan.muthuku...@intel.com
> <mailto:aravindan.muthuku...@intel.com>>
> Signed-off-by: Kedar Karanje <kedar.j.kara...@intel.com
> <mailto:kedar.j.kara...@intel.com>>
> Reviewed-by: Yogesh Marathe <yogesh.mara...@intel.com
> <mailto:yogesh.mara...@intel.com>>
> ---
>  src/mesa/drivers/dri/i965/brw_bufmgr.c | 48
> +-
>  1 file changed, 41 insertions(+), 7 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> index 5b4e784..18cb166 100644
> --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> @@ -87,6 +87,11 @@
> 
>  #define memclear(s) memset(, 0, sizeof(s))
> 
> +/* Macros for BO cache size */
> +#define CACHE_PAGE_SIZE4096
> 
>  
> 
> Just call this PAGE_SIZE
> 
>  
> 
> +#define PAGE_SIZE_SHIFT12
> +#define BO_CACHE_PAGE_SIZE (4 * CACHE_PAGE_SIZE)
> 
>  
> 
> I think I'd rather we just use 4 * PAGE_SIZE explicitly than have this
> extra #define.  I think it's making things harder to read and not easier.
> 
>  
> 
> +
>  #define FILE_DEBUG_FLAG DEBUG_BUFMGR
> 
>  static inline int
> @@ -181,19 +186,48 @@ bo_tile_pitch(struct brw_bufmgr *bufmgr,
> uint32_t pitch, uint32_t tiling)
> return ALIGN(pitch, tile_width);
>  }
> 
> +/*
> + * This functions is to find the correct bucket fit for the input size.
> + * This function works with O(1) complexity when the requested size
> + * was queried instead of itera

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

2017-09-12 Thread Ian Romanick

On 09/08/2017 08:38 AM, Jason Ekstrand wrote:
> In general, I'm very concerned about how this handles rounding
> behavior.  Almost everywhere, you round down when what you want to do is
> round up.  Also, as I said on IRC, I'd like to see some asserts in
> add_bucket so that we are sure this calculation is correct.  In
> particular, I'd like to see
> 
> assert(bucket_for_size(size) == >cache_bucket[i]);
> assert(bucket_for_size(size - 2048) == >cache_bucket[i]);
> assert(bucket_for_size(size + 1) != >cache_bucket[i]);
> 
> We need to check on both sides of size to be 100% sure we're doing our
> rounding correctly.
> 
> On Fri, Sep 8, 2017 at 1:11 AM,  > wrote:
> 
> From: Aravindan Muthukumar  >
> 
> Avoiding the loop which was running with O(n) complexity.
> Now the complexity has been reduced to O(1)
> 
> Tested with piglit.
> Slight performance improvement (~1%) in 3d mark.
> 
> 
> Which 3dmark test?  Also, what's the error in that 1%?

And, is that Atom or Core?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

2017-09-12 Thread Jason Ekstrand

On Tue, Sep 12, 2017 at 2:40 AM, Marathe, Yogesh <yogesh.mara...@intel.com>
wrote:

> Hi Jason,
>
>
>
> On the asserts you’ve mentioned below, I assume we need to add them after
> ‘bufmgr->num_buckets++’ in add_bucket() as num_buckets could be 0 initially.
>

Yes, because otherwise bucket_for_size will return NULL all the time.


> Another clarification on ~1%, we meant approx. 1% there, that’s an
> improvement we saw in 3Dmark total not a degradation, we’ll correct it in
> commit msg.
>
>
>
> Rest all review comments from you, Tapani and Emil are noted &
> implemented, we are working on running it through mesa CI/CTS and we should
> see a v2 for review after that.
>
>
>
> Regards,
>
> Yogesh.
>
>
>
> *From:* mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] *On
> Behalf Of *Jason Ekstrand
> *Sent:* Friday, September 8, 2017 9:09 PM
> *To:* Muthukumar, Aravindan <aravindan.muthuku...@intel.com>
> *Cc:* mesa-dev@lists.freedesktop.org; J Karanje, Kedar <
> kedar.j.kara...@intel.com>
> *Subject:* Re: [Mesa-dev] [PATCH] i965 : optimized bucket index
> calculation
>
>
>
> In general, I'm very concerned about how this handles rounding behavior.
> Almost everywhere, you round down when what you want to do is round up.
> Also, as I said on IRC, I'd like to see some asserts in add_bucket so that
> we are sure this calculation is correct.  In particular, I'd like to see
>
>
>
> assert(bucket_for_size(size) == >cache_bucket[i]);
>
> assert(bucket_for_size(size - 2048) == >cache_bucket[i]);
>
> assert(bucket_for_size(size + 1) != >cache_bucket[i]);
>
>
>
> We need to check on both sides of size to be 100% sure we're doing our
> rounding correctly.
>
>
>
> On Fri, Sep 8, 2017 at 1:11 AM, <aravindan.muthuku...@intel.com> wrote:
>
> From: Aravindan Muthukumar <aravindan.muthuku...@intel.com>
>
> Avoiding the loop which was running with O(n) complexity.
> Now the complexity has been reduced to O(1)
>
> Tested with piglit.
> Slight performance improvement (~1%) in 3d mark.
>
>
>
> Which 3dmark test?  Also, what's the error in that 1%?
>
>
>
> Change-Id: Id099f1cd24ad5b691a69070eda79b8f4e9be39a6
> Signed-off-by: Aravindan Muthukumar <aravindan.muthuku...@intel.com>
> Signed-off-by: Kedar Karanje <kedar.j.kara...@intel.com>
> Reviewed-by: Yogesh Marathe <yogesh.mara...@intel.com>
> ---
>  src/mesa/drivers/dri/i965/brw_bufmgr.c | 48
> +-
>  1 file changed, 41 insertions(+), 7 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> index 5b4e784..18cb166 100644
> --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> @@ -87,6 +87,11 @@
>
>  #define memclear(s) memset(, 0, sizeof(s))
>
> +/* Macros for BO cache size */
> +#define CACHE_PAGE_SIZE4096
>
>
>
> Just call this PAGE_SIZE
>
>
>
> +#define PAGE_SIZE_SHIFT12
> +#define BO_CACHE_PAGE_SIZE (4 * CACHE_PAGE_SIZE)
>
>
>
> I think I'd rather we just use 4 * PAGE_SIZE explicitly than have this
> extra #define.  I think it's making things harder to read and not easier.
>
>
>
> +
>  #define FILE_DEBUG_FLAG DEBUG_BUFMGR
>
>  static inline int
> @@ -181,19 +186,48 @@ bo_tile_pitch(struct brw_bufmgr *bufmgr, uint32_t
> pitch, uint32_t tiling)
> return ALIGN(pitch, tile_width);
>  }
>
> +/*
> + * This functions is to find the correct bucket fit for the input size.
> + * This function works with O(1) complexity when the requested size
> + * was queried instead of iterating the size through all the buckets.
> + */
>  static struct bo_cache_bucket *
>  bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size)
>  {
> -   int i;
> +   struct bo_cache_bucket *bucket = NULL;
> +   int x=0,index = -1;
> +   int row, col=0;
>
> -   for (i = 0; i < bufmgr->num_buckets; i++) {
> -  struct bo_cache_bucket *bucket = >cache_bucket[i];
> -  if (bucket->size >= size) {
> - return bucket;
> -  }
> +   /* condition for size less  than 4*4096 (4KB) page size */
> +   if(size < BO_CACHE_PAGE_SIZE){
>
>
>
> This should be "<="
>
>
>
> +  index = (size>>PAGE_SIZE_SHIFT)+((size%(1<<PAGE_SIZE_SHIFT)?1:0)
> )-1;
>
>
>
> I agree with tapani, that this can easily be an early return.
>
>
>
> I think we can also make this calculation a lot more clear:
>
>
>
> index = DIV_ROUND_UP(size, PAGE_SIZE) - 1;
>
>
>
> }
> +   else{
> +  /* When the si

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

2017-09-12 Thread Marathe, Yogesh

Hi Jason,

On the asserts you’ve mentioned below, I assume we need to add them after 
‘bufmgr->num_buckets++’ in add_bucket() as num_buckets could be 0 initially. 
Another clarification on ~1%, we meant approx. 1% there, that’s an improvement 
we saw in 3Dmark total not a degradation, we’ll correct it in commit msg.

Rest all review comments from you, Tapani and Emil are noted & implemented, we 
are working on running it through mesa CI/CTS and we should see a v2 for review 
after that.

Regards,
Yogesh.

From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf Of 
Jason Ekstrand
Sent: Friday, September 8, 2017 9:09 PM
To: Muthukumar, Aravindan <aravindan.muthuku...@intel.com>
Cc: mesa-dev@lists.freedesktop.org; J Karanje, Kedar <kedar.j.kara...@intel.com>
Subject: Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

In general, I'm very concerned about how this handles rounding behavior.  
Almost everywhere, you round down when what you want to do is round up.  Also, 
as I said on IRC, I'd like to see some asserts in add_bucket so that we are 
sure this calculation is correct.  In particular, I'd like to see

assert(bucket_for_size(size) == >cache_bucket[i]);
assert(bucket_for_size(size - 2048) == >cache_bucket[i]);
assert(bucket_for_size(size + 1) != >cache_bucket[i]);

We need to check on both sides of size to be 100% sure we're doing our rounding 
correctly.

On Fri, Sep 8, 2017 at 1:11 AM, 
<aravindan.muthuku...@intel.com<mailto:aravindan.muthuku...@intel.com>> wrote:
From: Aravindan Muthukumar 
<aravindan.muthuku...@intel.com<mailto:aravindan.muthuku...@intel.com>>

Avoiding the loop which was running with O(n) complexity.
Now the complexity has been reduced to O(1)

Tested with piglit.
Slight performance improvement (~1%) in 3d mark.

Which 3dmark test?  Also, what's the error in that 1%?

Change-Id: Id099f1cd24ad5b691a69070eda79b8f4e9be39a6
Signed-off-by: Aravindan Muthukumar 
<aravindan.muthuku...@intel.com<mailto:aravindan.muthuku...@intel.com>>
Signed-off-by: Kedar Karanje 
<kedar.j.kara...@intel.com<mailto:kedar.j.kara...@intel.com>>
Reviewed-by: Yogesh Marathe 
<yogesh.mara...@intel.com<mailto:yogesh.mara...@intel.com>>
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 48 +-
 1 file changed, 41 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 5b4e784..18cb166 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -87,6 +87,11 @@

 #define memclear(s) memset(, 0, sizeof(s))

+/* Macros for BO cache size */
+#define CACHE_PAGE_SIZE4096

Just call this PAGE_SIZE

+#define PAGE_SIZE_SHIFT12
+#define BO_CACHE_PAGE_SIZE (4 * CACHE_PAGE_SIZE)

I think I'd rather we just use 4 * PAGE_SIZE explicitly than have this extra 
#define.  I think it's making things harder to read and not easier.

+
 #define FILE_DEBUG_FLAG DEBUG_BUFMGR

 static inline int
@@ -181,19 +186,48 @@ bo_tile_pitch(struct brw_bufmgr *bufmgr, uint32_t pitch, 
uint32_t tiling)
return ALIGN(pitch, tile_width);
 }

+/*
+ * This functions is to find the correct bucket fit for the input size.
+ * This function works with O(1) complexity when the requested size
+ * was queried instead of iterating the size through all the buckets.
+ */
 static struct bo_cache_bucket *
 bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size)
 {
-   int i;
+   struct bo_cache_bucket *bucket = NULL;
+   int x=0,index = -1;
+   int row, col=0;

-   for (i = 0; i < bufmgr->num_buckets; i++) {
-  struct bo_cache_bucket *bucket = >cache_bucket[i];
-  if (bucket->size >= size) {
- return bucket;
-  }
+   /* condition for size less  than 4*4096 (4KB) page size */
+   if(size < BO_CACHE_PAGE_SIZE){

This should be "<="

+  index = (size>>PAGE_SIZE_SHIFT)+((size%(1<<PAGE_SIZE_SHIFT)?1:0))-1;

I agree with tapani, that this can easily be an early return.

I think we can also make this calculation a lot more clear:

index = DIV_ROUND_UP(size, PAGE_SIZE) - 1;

}
+   else{
+  /* When the size is more than 4*4096, the logic follows a matrix method
+   * where the index will be searched using Arithmetico-Geometric 
progression.
+   * So the given size will be divided by 4096 & the index will be traced 
out.
+   */
+  x = size>>PAGE_SIZE_SHIFT;

This rounds down not up like you want.

-   return NULL;
+  /* Find the row using Geometric Progression. The highest bit set will 
give
+   * the row number. num = a * r^(n-1) where num = size a = 4 r = 2
+   */
+  row = 31 - __builtin_clz(x>>1);
+
+ /* Find the column using AP but using the row value
+  * calculated using GP.
+  */
+  col =((x-(1<<(row+1)))/(1<<(row-1)))+1;
+  col += (size%(1<&l

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

2017-09-08 Thread Jason Ekstrand

In general, I'm very concerned about how this handles rounding behavior.
Almost everywhere, you round down when what you want to do is round up.
Also, as I said on IRC, I'd like to see some asserts in add_bucket so that
we are sure this calculation is correct.  In particular, I'd like to see

assert(bucket_for_size(size) == >cache_bucket[i]);
assert(bucket_for_size(size - 2048) == >cache_bucket[i]);
assert(bucket_for_size(size + 1) != >cache_bucket[i]);

We need to check on both sides of size to be 100% sure we're doing our
rounding correctly.

On Fri, Sep 8, 2017 at 1:11 AM,  wrote:

> From: Aravindan Muthukumar 
>
> Avoiding the loop which was running with O(n) complexity.
> Now the complexity has been reduced to O(1)
>
> Tested with piglit.
> Slight performance improvement (~1%) in 3d mark.
>

Which 3dmark test?  Also, what's the error in that 1%?


> Change-Id: Id099f1cd24ad5b691a69070eda79b8f4e9be39a6
> Signed-off-by: Aravindan Muthukumar 
> Signed-off-by: Kedar Karanje 
> Reviewed-by: Yogesh Marathe 
> ---
>  src/mesa/drivers/dri/i965/brw_bufmgr.c | 48
> +-
>  1 file changed, 41 insertions(+), 7 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> index 5b4e784..18cb166 100644
> --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> @@ -87,6 +87,11 @@
>
>  #define memclear(s) memset(, 0, sizeof(s))
>
> +/* Macros for BO cache size */
> +#define CACHE_PAGE_SIZE4096
>

Just call this PAGE_SIZE


> +#define PAGE_SIZE_SHIFT12
> +#define BO_CACHE_PAGE_SIZE (4 * CACHE_PAGE_SIZE)
>

I think I'd rather we just use 4 * PAGE_SIZE explicitly than have this
extra #define.  I think it's making things harder to read and not easier.


> +
>  #define FILE_DEBUG_FLAG DEBUG_BUFMGR
>
>  static inline int
> @@ -181,19 +186,48 @@ bo_tile_pitch(struct brw_bufmgr *bufmgr, uint32_t
> pitch, uint32_t tiling)
> return ALIGN(pitch, tile_width);
>  }
>
> +/*
> + * This functions is to find the correct bucket fit for the input size.
> + * This function works with O(1) complexity when the requested size
> + * was queried instead of iterating the size through all the buckets.
> + */
>  static struct bo_cache_bucket *
>  bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size)
>  {
> -   int i;
> +   struct bo_cache_bucket *bucket = NULL;
> +   int x=0,index = -1;
> +   int row, col=0;
>
> -   for (i = 0; i < bufmgr->num_buckets; i++) {
> -  struct bo_cache_bucket *bucket = >cache_bucket[i];
> -  if (bucket->size >= size) {
> - return bucket;
> -  }
> +   /* condition for size less  than 4*4096 (4KB) page size */
> +   if(size < BO_CACHE_PAGE_SIZE){
>

This should be "<="


> +  index = (size>>PAGE_SIZE_SHIFT)+((size%(1< )-1;
>

I agree with tapani, that this can easily be an early return.

I think we can also make this calculation a lot more clear:

index = DIV_ROUND_UP(size, PAGE_SIZE) - 1;


> }
> +   else{
> +  /* When the size is more than 4*4096, the logic follows a matrix
> method
> +   * where the index will be searched using Arithmetico-Geometric
> progression.
> +   * So the given size will be divided by 4096 & the index will be
> traced out.
> +   */
> +  x = size>>PAGE_SIZE_SHIFT;
>

This rounds down not up like you want.


> -   return NULL;
> +  /* Find the row using Geometric Progression. The highest bit set
> will give
> +   * the row number. num = a * r^(n-1) where num = size a = 4 r = 2
> +   */
> +  row = 31 - __builtin_clz(x>>1);
> +
> + /* Find the column using AP but using the row value
> +  * calculated using GP.
> +  */
> +  col =((x-(1<<(row+1)))/(1<<(row-1)))+1;
> +  col += (size%(1< +
> +  /* Finding the index value using calculated row and col number */
> +  index = ((row-1)<<2) + col + 2;
>

I think this can probably also be a lot simpler.  How about something like
this:

pages = DIV_ROUND_UP(size, PAGE_SIZE);
/* Steal this from anv_allocator.c */
pages_log2 = ilog2_round_up(pages);
row = pages_log2 - 1;
col = DIV_ROUND_UP(pages, (1 << (pages_log2 - 2)));
index = row * 4 + col;

+   }
> +
> +   /* Checking the error condition */
> +   bucket = (index >= 0 && index < bufmgr->num_buckets)?(
> >cache_bucket[index]):NULL;
> +   return bucket;
>  }
>
>  int
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

2017-09-08 Thread Emil Velikov

On 8 September 2017 at 12:06, Tapani Pälli  wrote:
> Hi;
>
> Won't comment on the algorithm itself but coding style should be fixed to
> follow style used in the rest of the file (some documentation on style
> available here: https://www.mesa3d.org/codingstyle.html).
>
Even without that document, the commit is very inconsistent with
surrounding coding style.

Kedar, please explain what algorithm you've used and why, in the commit message.
Also please get in touch with Mark Janes (Cc'd, janesma on IRC) to run
it through the Jenkins CI.
It covers all the

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

2017-09-08 Thread Tapani Pälli


Hi;

Won't comment on the algorithm itself but coding style should be fixed 
to follow style used in the rest of the file (some documentation on 
style available here: https://www.mesa3d.org/codingstyle.html).


On 09/08/2017 11:11 AM, aravindan.muthuku...@intel.com wrote:

From: Aravindan Muthukumar 

Avoiding the loop which was running with O(n) complexity.
Now the complexity has been reduced to O(1)

Tested with piglit.
Slight performance improvement (~1%) in 3d mark.

Change-Id: Id099f1cd24ad5b691a69070eda79b8f4e9be39a6
Signed-off-by: Aravindan Muthukumar 
Signed-off-by: Kedar Karanje 
Reviewed-by: Yogesh Marathe 
---
  src/mesa/drivers/dri/i965/brw_bufmgr.c | 48 +-
  1 file changed, 41 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 5b4e784..18cb166 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -87,6 +87,11 @@
  
  #define memclear(s) memset(, 0, sizeof(s))
  
+/* Macros for BO cache size */

+#define CACHE_PAGE_SIZE4096
+#define PAGE_SIZE_SHIFT12
+#define BO_CACHE_PAGE_SIZE (4 * CACHE_PAGE_SIZE)
+
  #define FILE_DEBUG_FLAG DEBUG_BUFMGR
  
  static inline int

@@ -181,19 +186,48 @@ bo_tile_pitch(struct brw_bufmgr *bufmgr, uint32_t pitch, 
uint32_t tiling)
 return ALIGN(pitch, tile_width);
  }
  
+/*

+ * This functions is to find the correct bucket fit for the input size.
+ * This function works with O(1) complexity when the requested size
+ * was queried instead of iterating the size through all the buckets.
+ */
  static struct bo_cache_bucket *
  bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size)
  {
-   int i;
+   struct bo_cache_bucket *bucket = NULL;


no need to initialize, I guess no harm though


+   int x=0,index = -1;


space before and after operator 'x = 0', space after ',':

'int x = 0, index = -1;'


+   int row, col=0;


space before and after operator


-   for (i = 0; i < bufmgr->num_buckets; i++) {
-  struct bo_cache_bucket *bucket = >cache_bucket[i];
-  if (bucket->size >= size) {
- return bucket;
-  }
+   /* condition for size less  than 4*4096 (4KB) page size */
+   if(size < BO_CACHE_PAGE_SIZE){


if (size ...) {


+  index = (size>>PAGE_SIZE_SHIFT)+((size%(1<>PAGE_SIZE_SHIFT;


space before and after operator

  
-   return NULL;

+  /* Find the row using Geometric Progression. The highest bit set will 
give
+   * the row number. num = a * r^(n-1) where num = size a = 4 r = 2
+   */
+  row = 31 - __builtin_clz(x>>1);
+
+ /* Find the column using AP but using the row value
+  * calculated using GP.
+  */
+  col =((x-(1<<(row+1)))/(1<<(row-1)))+1;


space before and after operators


+  col += (size%(1<= 0 && index < 
bufmgr->num_buckets)?(>cache_bucket[index]):NULL;


this won't fit to one line (exceeds ~80 char limit)

   bucket = (index >= 0 && index < bufmgr->num_buckets) ?
  (>cache_bucket[index]) : NULL;


+   return bucket;
  }
  
  int



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965 : optimized bucket index calculation

2017-09-08 Thread aravindan . muthukumar

From: Aravindan Muthukumar 

Avoiding the loop which was running with O(n) complexity.
Now the complexity has been reduced to O(1)

Tested with piglit.
Slight performance improvement (~1%) in 3d mark.

Change-Id: Id099f1cd24ad5b691a69070eda79b8f4e9be39a6
Signed-off-by: Aravindan Muthukumar 
Signed-off-by: Kedar Karanje 
Reviewed-by: Yogesh Marathe 
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 48 +-
 1 file changed, 41 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 5b4e784..18cb166 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -87,6 +87,11 @@
 
 #define memclear(s) memset(, 0, sizeof(s))
 
+/* Macros for BO cache size */
+#define CACHE_PAGE_SIZE4096
+#define PAGE_SIZE_SHIFT12
+#define BO_CACHE_PAGE_SIZE (4 * CACHE_PAGE_SIZE)
+
 #define FILE_DEBUG_FLAG DEBUG_BUFMGR
 
 static inline int
@@ -181,19 +186,48 @@ bo_tile_pitch(struct brw_bufmgr *bufmgr, uint32_t pitch, 
uint32_t tiling)
return ALIGN(pitch, tile_width);
 }
 
+/*
+ * This functions is to find the correct bucket fit for the input size.
+ * This function works with O(1) complexity when the requested size
+ * was queried instead of iterating the size through all the buckets.
+ */
 static struct bo_cache_bucket *
 bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size)
 {
-   int i;
+   struct bo_cache_bucket *bucket = NULL;
+   int x=0,index = -1;
+   int row, col=0;
 
-   for (i = 0; i < bufmgr->num_buckets; i++) {
-  struct bo_cache_bucket *bucket = >cache_bucket[i];
-  if (bucket->size >= size) {
- return bucket;
-  }
+   /* condition for size less  than 4*4096 (4KB) page size */
+   if(size < BO_CACHE_PAGE_SIZE){
+  index = (size>>PAGE_SIZE_SHIFT)+((size%(1<>PAGE_SIZE_SHIFT;
 
-   return NULL;
+  /* Find the row using Geometric Progression. The highest bit set will 
give
+   * the row number. num = a * r^(n-1) where num = size a = 4 r = 2
+   */
+  row = 31 - __builtin_clz(x>>1);
+
+ /* Find the column using AP but using the row value
+  * calculated using GP.
+  */
+  col =((x-(1<<(row+1)))/(1<<(row-1)))+1;
+  col += (size%(1<= 0 && index < 
bufmgr->num_buckets)?(>cache_bucket[index]):NULL;
+   return bucket;
 }
 
 int
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

[Mesa-dev] [PATCH] i965 : optimized bucket index calculation

10 matches

Site Navigation

Mail list logo

Footer information