Hi,

On 02.05.2018 02:25, James Xiong wrote:
From: "Xiong, James" <james.xi...@intel.com>

With the current implementation, brw_bufmgr may round up a request
size to the next bucket size, resulting in 25% more memory allocated
in the worst scenario. For example:
Request size    Actual size
32KB+1Byte      40KB
.
8MB+1Byte       10MB
.
96MB+1Byte      112MB
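
The rounding in the table above can be reproduced with a small sketch. It is not taken from brw_bufmgr.c: the bucket layout (each power of two plus three evenly spaced steps up to the next one) is an assumption inferred from the example sizes, and the 4 KB page size and the function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size


def align_to_page(size: int) -> int:
    """Round a request up to the next multiple of the page size."""
    return (size + PAGE_SIZE - 1) // PAGE_SIZE * PAGE_SIZE


def round_to_bucket(size: int) -> int:
    """Round a request up to the next bucket size.

    Assumed bucket layout: every power of two, plus the 1/4, 2/4 and 3/4
    steps between it and the next power of two.
    """
    if size <= 4 * PAGE_SIZE:
        # Assumption: the smallest buckets are one page apart.
        return align_to_page(size)
    base = 1 << (size.bit_length() - 1)  # largest power of two <= size
    step = base // 4
    steps = -(-(size - base) // step)    # ceiling division
    return base + steps * step


if __name__ == "__main__":
    KB, MB = 1024, 1024 * 1024
    for request in (32 * KB + 1, 8 * MB + 1, 96 * MB + 1):
        print(request, round_to_bucket(request), align_to_page(request))
```

Under these assumptions a request of 32KB+1Byte lands in the 40KB bucket, while page alignment would give 36KB.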
This series aligns the buffer size up to a page instead of a bucket
size to improve memory allocation efficiency. Performance is almost
the same with Basemark ES3, GfxBench4 and 5:

Basemark ES3
          score                      peak memory allocation
before     after      diff     before     after      diff
21.537462  21.888784  1.61%    419766272  408809472  -10956800
19.566198  19.763429  1.00%

What memory are you measuring:

* VmSize (not that relevant unless you're running out of address space)?

* Private_Dirty (listed in /proc/PID/smaps and e.g. by the "smem" tool [1])?

* total of allocation sizes used by Mesa?

Or something else?

In general, unused memory (allocated but never written) isn't much of a problem; only dirty (written) memory is. The kernel maps all unused memory to a single zero page, so unused memory takes only a few bytes of RAM for the page table entries (required for tracking the allocated pages).
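
As a rough illustration of measuring dirty memory, here is a minimal sketch that sums the Private_Dirty fields of a process from /proc/PID/smaps. The function names are hypothetical, it works only on Linux, and it is not how "smem" or the script in [1] is implemented.

```python
import re

# smaps lists one "Private_Dirty:   <n> kB" line per mapping.
_PRIVATE_DIRTY = re.compile(r"Private_Dirty:\s+(\d+)\s+kB")


def private_dirty_kb(smaps_text: str) -> int:
    """Sum all Private_Dirty values (in KB) found in smaps-formatted text."""
    total = 0
    for line in smaps_text.splitlines():
        match = _PRIVATE_DIRTY.match(line)
        if match:
            total += int(match.group(1))
    return total


def private_dirty_of_pid(pid: int) -> int:
    """Read /proc/<pid>/smaps and return the total private dirty memory in KB."""
    with open(f"/proc/{pid}/smaps") as smaps:
        return private_dirty_kb(smaps.read())
```

Sampling private_dirty_of_pid() periodically while the benchmark runs and keeping the maximum would give a peak value that can be compared before and after the series.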


GfxBench 4.0
                 score                                  peak memory
                 before          after           diff   before    after     diff
gl_4             564.6052246094  565.2348632813  0.11%  578490368 550199296 -28291072
gl_4_off         727.0440063477  703.5833129883  -3.33% 629501952 598216704 -31285248
gl_manhattan     1053.4223632813 1057.3690185547 0.37%  449568768 421134336 -28434432
gl_trex          2708.0656738281 2699.2646484375 -0.33% 130076672 125042688 -5033984
gl_alu2          1207.1490478516 1212.2220458984 0.42%  55496704  55029760  -466944
gl_driver2       103.0383071899  103.5478439331  0.49%  13107200  12980224  -126976
gl_manhattan_off 1703.4780273438 1736.9074707031 1.92%  490016768 456548352 -33468416
gl_trex_off      2951.6809082031 3058.5422363281 3.49%  157511680 152260608 -5251072
gl_alu2_off      2604.0903320313 2626.2524414063 0.84%  86130688  85483520  -647168
gl_driver2_off   204.0173187256  207.0510101318  1.47%  40869888  40615936  -253952

You're missing information on:
* which platform you did the testing on (affects variance),
* how many test rounds you ran, and
* what your variance is

-> I don't know whether your numbers are just random noise.
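
To make the point about noise concrete, here is a minimal sketch of summarizing several test rounds. The function names and the comparison rule are assumptions for illustration; the rule is a crude rule of thumb, not a proper statistical test.

```python
from statistics import mean, stdev


def summarize(runs):
    """Return (mean, sample standard deviation) for the scores of one setup."""
    spread = stdev(runs) if len(runs) > 1 else 0.0
    return mean(runs), spread


def looks_like_noise(before, after):
    """Crude check: is the change in the mean within the combined spread?"""
    mean_before, spread_before = summarize(before)
    mean_after, spread_after = summarize(after)
    return abs(mean_after - mean_before) <= spread_before + spread_after


if __name__ == "__main__":
    # Placeholder scores, NOT measurements from this series.
    before = [100.0, 102.0, 98.0]
    after = [101.0, 103.0, 99.0]
    print(summarize(before), summarize(after), looks_like_noise(before, after))
```

Reporting the number of rounds and the spread next to each average lets readers judge differences such as -3.33% or 3.49% themselves.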


Memory is allocated in pages from the kernel, so there's no point in showing its usage in bytes. Please use KBs; that's more readable.

(Because of randomness, e.g. interactions with the windowing system, there can also be some variance in process memory usage, which may
also be useful to report.)

Because of variance, you don't need that many decimals for the scores. Removing the extra ones makes the data a bit more readable too.
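
A small sketch of what such a report line could look like, with memory in KB and scores cut to one decimal. The helper names and the output layout are hypothetical; the input values are the first Basemark ES3 row from the cover letter.

```python
def to_kb(size_in_bytes: int) -> int:
    """Convert a byte count to whole KB."""
    return size_in_bytes // 1024


def format_result(name, score_before, score_after, mem_before, mem_after):
    """Format one result with one-decimal scores and memory in KB."""
    kb_before, kb_after = to_kb(mem_before), to_kb(mem_after)
    return (
        f"{name}: score {score_before:.1f} -> {score_after:.1f}, "
        f"peak memory {kb_before} KB -> {kb_after} KB "
        f"({kb_after - kb_before:+d} KB)"
    )


if __name__ == "__main__":
    print(format_result("Basemark ES3", 21.537462, 21.888784,
                        419766272, 408809472))
    # prints: Basemark ES3: score 21.5 -> 21.9, peak memory 409928 KB -> 399228 KB (-10700 KB)
```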


        - Eero

[1] "smem" is a Python-based tool available at least in Debian.
If you want something simpler, e.g. a shell script that works with
minimal shells like Busybox, you can use this:
https://github.com/maemo-tools-old/sp-memusage/blob/master/scripts/mem-smaps-private


GfxBench 5.0
             score            peak memory
           before  after  before      after       diff
gl_5       259     259    1137549312  1038286848  -99262464
gl_5_off   297     297    1170853888  1071357952  -99495936

Xiong, James (4):
   i965/drm: Reorganize code for the next patch
   i965/drm: Round down buffer size and calculate the bucket index
   i965/drm: Searching for a cached buffer for reuse
   i965/drm: Purge the bucket when its cached buffer is evicted

  src/mesa/drivers/dri/i965/brw_bufmgr.c | 139 ++++++++++++++++++---------------
  src/util/list.h                        |   5 ++
  2 files changed, 79 insertions(+), 65 deletions(-)


_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
