On 09/12/2017 02:40 AM, Marathe, Yogesh wrote: > Hi Jason, > > > > On the asserts you’ve mentioned below, I assume we need to add them > after ‘bufmgr->num_buckets++’ in add_bucket() as num_buckets could be 0 > initially. Another clarification on ~1%, we meant approx. 1% there, > that’s an improvement we saw in 3Dmark total not a degradation, we’ll > correct it in commit msg.
I think the problem is that there is insufficient information about your data. What we want to see in a commit message is something like: commit 5ae2de81c8350272c122ea38e6bb4c0a41d58921 Author: Kenneth Graunke <kenn...@whitecape.org> Date: Mon Aug 28 16:08:32 2017 -0700 i965: Use BLORP for buffer object stall avoidance blits instead of BLT. Improves performance of GFXBench4 tests at 1024x768 on a Kabylake GT2: - Manhattan 3.1 by 1.32134% +/- 0.322734% (n=8). - Car Chase by 1.25607% +/- 0.291262% (n=5). Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net> The important bits are: - average improvement - statistical deviation - number of runs - platform - name of specific benchmark > Rest all review comments from you, Tapani and Emil are noted & > implemented, we are working on running it through mesa CI/CTS and we > should see a v2 for review after that. > > > > Regards, > > Yogesh. > > > > *From:*mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] *On > Behalf Of *Jason Ekstrand > *Sent:* Friday, September 8, 2017 9:09 PM > *To:* Muthukumar, Aravindan <aravindan.muthuku...@intel.com> > *Cc:* mesa-dev@lists.freedesktop.org; J Karanje, Kedar > <kedar.j.kara...@intel.com> > *Subject:* Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation > > > > In general, I'm very concerned about how this handles rounding > behavior. Almost everywhere, you round down when what you want to do is > round up. Also, as I said on IRC, I'd like to see some asserts in > add_bucket so that we are sure this calculation is correct. In > particular, I'd like to see > > > > assert(bucket_for_size(size) == &bufmgr->cache_bucket[i]); > > assert(bucket_for_size(size - 2048) == &bufmgr->cache_bucket[i]); > > assert(bucket_for_size(size + 1) != &bufmgr->cache_bucket[i]); > > > > We need to check on both sides of size to be 100% sure we're doing our > rounding correctly. > > > > On Fri, Sep 8, 2017 at 1:11 AM, <aravindan.muthuku...@intel.com > <mailto:aravindan.muthuku...@intel.com>> wrote: > > From: Aravindan Muthukumar <aravindan.muthuku...@intel.com > <mailto:aravindan.muthuku...@intel.com>> > > Avoiding the loop which was running with O(n) complexity. > Now the complexity has been reduced to O(1) > > Tested with piglit. > Slight performance improvement (~1%) in 3d mark. > > > > Which 3dmark test? Also, what's the error in that 1%? > > > > Change-Id: Id099f1cd24ad5b691a69070eda79b8f4e9be39a6 > Signed-off-by: Aravindan Muthukumar <aravindan.muthuku...@intel.com > <mailto:aravindan.muthuku...@intel.com>> > Signed-off-by: Kedar Karanje <kedar.j.kara...@intel.com > <mailto:kedar.j.kara...@intel.com>> > Reviewed-by: Yogesh Marathe <yogesh.mara...@intel.com > <mailto:yogesh.mara...@intel.com>> > --- > src/mesa/drivers/dri/i965/brw_bufmgr.c | 48 > +++++++++++++++++++++++++++++----- > 1 file changed, 41 insertions(+), 7 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c > b/src/mesa/drivers/dri/i965/brw_bufmgr.c > index 5b4e784..18cb166 100644 > --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c > +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c > @@ -87,6 +87,11 @@ > > #define memclear(s) memset(&s, 0, sizeof(s)) > > +/* Macros for BO cache size */ > +#define CACHE_PAGE_SIZE 4096 > > > > Just call this PAGE_SIZE > > > > +#define PAGE_SIZE_SHIFT 12 > +#define BO_CACHE_PAGE_SIZE (4 * CACHE_PAGE_SIZE) > > > > I think I'd rather we just use 4 * PAGE_SIZE explicitly than have this > extra #define. I think it's making things harder to read and not easier. > > > > + > #define FILE_DEBUG_FLAG DEBUG_BUFMGR > > static inline int > @@ -181,19 +186,48 @@ bo_tile_pitch(struct brw_bufmgr *bufmgr, > uint32_t pitch, uint32_t tiling) > return ALIGN(pitch, tile_width); > } > > +/* > + * This functions is to find the correct bucket fit for the input size. > + * This function works with O(1) complexity when the requested size > + * was queried instead of iterating the size through all the buckets. > + */ > static struct bo_cache_bucket * > bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size) > { > - int i; > + struct bo_cache_bucket *bucket = NULL; > + int x=0,index = -1; > + int row, col=0; > > - for (i = 0; i < bufmgr->num_buckets; i++) { > - struct bo_cache_bucket *bucket = &bufmgr->cache_bucket[i]; > - if (bucket->size >= size) { > - return bucket; > - } > + /* condition for size less than 4*4096 (4KB) page size */ > + if(size < BO_CACHE_PAGE_SIZE){ > > > > This should be "<=" > > > > + index = > (size>>PAGE_SIZE_SHIFT)+((size%(1<<PAGE_SIZE_SHIFT)?1:0))-1; > > > > I agree with tapani, that this can easily be an early return. > > > > I think we can also make this calculation a lot more clear: > > > > index = DIV_ROUND_UP(size, PAGE_SIZE) - 1; > > > > } > + else{ > + /* When the size is more than 4*4096, the logic follows a > matrix method > + * where the index will be searched using > Arithmetico-Geometric progression. > + * So the given size will be divided by 4096 & the index will > be traced out. > + */ > + x = size>>PAGE_SIZE_SHIFT; > > > > This rounds down not up like you want. > > > > - return NULL; > + /* Find the row using Geometric Progression. The highest bit > set will give > + * the row number. num = a * r^(n-1) where num = size a = 4 r = 2 > + */ > + row = 31 - __builtin_clz(x>>1); > + > + /* Find the column using AP but using the row value > + * calculated using GP. > + */ > + col =((x-(1<<(row+1)))/(1<<(row-1)))+1; > + col += (size%(1<<PAGE_SIZE_SHIFT<<(row-1)))?1:0; > + > + /* Finding the index value using calculated row and col number */ > + index = ((row-1)<<2) + col + 2; > > > > I think this can probably also be a lot simpler. How about something > like this: > > > > pages = DIV_ROUND_UP(size, PAGE_SIZE); > > /* Steal this from anv_allocator.c */ > > pages_log2 = ilog2_round_up(pages); > > row = pages_log2 - 1; > > col = DIV_ROUND_UP(pages, (1 << (pages_log2 - 2))); > > index = row * 4 + col; > > > > + } > + > + /* Checking the error condition */ > + bucket = (index >= 0 && index < > bufmgr->num_buckets)?(&bufmgr->cache_bucket[index]):NULL; > + return bucket; > } > > int > -- > 2.7.4 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org <mailto:mesa-dev@lists.freedesktop.org> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > > > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev