On Wed, Dec 14, 2016 at 3:01 AM, Savolainen, Petri (Nokia - FI/Espoo)
<petri.savolai...@nokia-bell-labs.com> wrote:
>
>
>> -----Original Message-----
>> From: Bill Fischofer [mailto:bill.fischo...@linaro.org]
>> Sent: Tuesday, December 13, 2016 4:00 PM
>> To: Savolainen, Petri (Nokia - FI/Espoo) <petri.savolainen@nokia-bell-labs.com>
>> Cc: Maxim Uvarov <maxim.uva...@linaro.org>; lng-odp-forward <lng-o...@lists.linaro.org>
>> Subject: Re: [lng-odp] sizeof(pool_table_t) = 272M
>>
>> On Tue, Dec 13, 2016 at 3:23 AM, Savolainen, Petri (Nokia - FI/Espoo)
>> <petri.savolai...@nokia-bell-labs.com> wrote:
>> >
>> >
>> >> -----Original Message-----
>> >> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Maxim Uvarov
>> >> Sent: Monday, December 12, 2016 6:45 PM
>> >> To: lng-odp-forward <lng-odp@lists.linaro.org>
>> >> Subject: [lng-odp] sizeof(pool_table_t) = 272M
>> >>
>> >> The latest pool change set made pool_table_t huge. That is due to:
>> >>
>> >> #define CONFIG_POOL_MAX_NUM (1 * 1024 * 1024)
>> >>
>> >> If, for example, it's
>> >>
>> >> #define CONFIG_POOL_MAX_NUM (4096)
>> >>
>> >> then pool_table_t is 18M.
>> >>
>> >> Allocation happens here:
>> >>
>> >> int odp_pool_init_global(void)
>> >> {
>> >>         shm = odp_shm_reserve("_odp_pool_table",
>> >>                               sizeof(pool_table_t),
>> >>                               ODP_CACHE_LINE_SIZE, 0);
>> >>         ...
>> >>
>> >> Because of the latest changes in api-next for process mode and drivers,
>> >> a mapping file is created in the /tmp/ directory. So for each process
>> >> there is a 272M file. I think that is too big and can be a problem when
>> >> testing on small machines.
>> >>
>> >> 1. Can we lower the maximum number of entries in the pool?
>> >
>> >
>> > This is why those are config options. We can trade off capability against
>> > memory usage. Both of these affect the memory usage:
>> >
>> > /*
>> >  * Maximum number of pools
>> >  */
>> > #define ODP_CONFIG_POOLS 64
>> >
>> > /*
>> >  * Maximum number of events in a pool
>> >  */
>> > #define CONFIG_POOL_MAX_NUM (1 * 1024 * 1024)
>> >
>> >
>> > If you cut both in half, the memory usage drops to one fourth...
>> >
>> >
>> >>
>> >> 2. Can we remove that table from here completely and make it dynamic in
>> >> odp_pool_create()?
>> >
>> >
>> > Not really, it would be unnecessarily complicated compared to the gain.
>> > For example, I have 8 GB of memory on my laptop, 128 GB on my server, and
>> > even a Raspberry Pi has 1 GB. The laptop would run one or two instances,
>> > the server a few (up to 6), the Raspberry Pi only one.
>> >
>> > The current memory consumption would be a problem only for a Raspberry Pi
>> > user, but it would be very easy for him/her to scale down this (and other)
>> > config options to match the system resources. The default ODP configuration
>> > is targeted at a higher end than single-core, single-interface, single-GB
>> > memory systems.
>> >
>> >
>> > Anyway, I think at least ODP_CONFIG_POOLS could be decreased, maybe even
>> > halved. Typically, our tests use only one or a few pools. The max event
>> > count could also be decreased, but it should remain a power of two. It
>> > could also be halved, as most of our tests create only 8k packets per pool.
>>
>> ODP_CONFIG_POOLS used to be 16 and was increased to 64 as a result of
>> complaints that 16 was too small. The size of the pool being created
>> is specified as part of the odp_pool_param_t struct passed to
>> odp_pool_create(). While rounding this up to powers of two is
>> reasonable, there's no need to over-reserve storage that will never be
>> used.
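
(For reference, a minimal sketch of how these two config options multiply
into a table of roughly the reported size. The layouts below are illustrative
only, assuming one 32-bit handle per event slot; the real odp-linux pool_t
carries additional metadata.)

#include <stdint.h>
#include <stdio.h>

#define ODP_CONFIG_POOLS     64
#define CONFIG_POOL_MAX_NUM  (1 * 1024 * 1024)

/* Ring sized for the worst case: one 32-bit entry per possible event */
typedef struct {
        uint32_t data[CONFIG_POOL_MAX_NUM];
} pool_ring_t;

typedef struct {
        /* name, params, locks, etc. omitted in this sketch */
        pool_ring_t ring;
} pool_t;

typedef struct {
        pool_t pool[ODP_CONFIG_POOLS];
} pool_table_t;

int main(void)
{
        /* 64 pools * 1M entries * 4 B = 256 MB of rings alone; per-pool
         * metadata pushes the shm reservation toward the reported 272 MB. */
        printf("ring per pool: %zu MB\n", sizeof(pool_ring_t) >> 20);
        printf("pool_table_t:  %zu MB\n", sizeof(pool_table_t) >> 20);
        return 0;
}

Halving both options would shrink the ring portion to 64 MB (32 pools * 512k
entries * 4 B), which is the "drops to one fourth" arithmetic above.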
>> Again, this seems to be an implementation bug. It was never the
>> intent that the max architectural limits somehow become the minimum
>> memory requirements needed to support any pool.
>
> This fixed-size memory is used for the pool global data structure, which
> includes the rings. It's not for the buffer space; that's allocated
> dynamically. A fixed global structure saves indirections, enables compiler
> optimizations and is more robust (no need to allocate ring space when pools
> are created and destroyed). If the rings were separated out, it would
> increase the number of shm allocs and would not save a considerable amount
> of memory (depending on the config values). E.g. with 512k events per pool,
> each pool needs a 2 MB ring, and with 32 pools that's 64 MB total, which is
> not a lot. Also, when hugepages are 2 MB sized, you would not save memory
> per pool, as in practice all ring allocations would be rounded up to a 2 MB
> huge page allocation.
>
> My point is that the easiest way forward is to select suitable default
> values for these configs and consider an implementation change only when
> someone actually reports a real issue. 272M is a large amount, but so far
> it has only been speculated to be an issue. If we cut the defaults a bit,
> they are still large enough (so that nobody notices) and the memory
> consumption is small enough (if someone cares).
>
> These should be OK for most users ...
>
> ODP_CONFIG_POOLS 32
> CONFIG_POOL_MAX_NUM (512 * 1024)
>
> => 64 MB for rings
>
> ... or even a smaller number of events should be more than enough for most
> users:
>
> CONFIG_POOL_MAX_NUM (256 * 1024)
>
> => 32 MB for rings
>
> -Petri
>
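
(The ring arithmetic for these suggested defaults can be reproduced with the
same 4-bytes-per-entry assumption as in the sketch above; it is an assumption,
not taken from the actual implementation.)

#include <stdint.h>
#include <stdio.h>

/* Total ring memory in MB for a given configuration (illustrative only). */
static size_t ring_mem_mb(size_t num_pools, size_t max_events)
{
        return (num_pools * max_events * sizeof(uint32_t)) >> 20;
}

int main(void)
{
        printf("%zu MB\n", ring_mem_mb(32, 512 * 1024));  /* => 64 MB */
        printf("%zu MB\n", ring_mem_mb(32, 256 * 1024));  /* => 32 MB */
        return 0;
}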
I see two ways to approach this:

1. Make the current pool_ring_t indirect. That is, change pool_t to have a
   pool_ring_t *ring instead of including the ring as part of the pool_t,
   and reserve the ring when that pool_t is first used (sketched below).
   Yes, this means there is an indirection when accessing the global ring,
   but the local cache should satisfy most alloc/free requests, so this
   should be infrequent.

2. Don't use a single large pool_table_t but rather break it up into
   multiple segments and allocate them only when the number of pools grows
   into a new segment.

Either of these approaches would allow memory usage to be proportional to
the actual number of pools in use rather than the theoretical maximum.
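
A rough sketch of option 1, assuming the ring is reserved as its own shm
block the first time a pool slot is used. The names (pool_ring_t,
pool_ring_reserve) and the exact fields are illustrative, not the actual
odp-linux code; odp_shm_reserve()/odp_shm_addr() are the same APIs already
used in odp_pool_init_global(), and the pool table is assumed to be
zero-initialized at global init.

#include <odp_api.h>

#define CONFIG_POOL_MAX_NUM  (512 * 1024)   /* per the suggested default */

typedef struct {
        uint32_t data[CONFIG_POOL_MAX_NUM];
} pool_ring_t;

typedef struct {
        odp_shm_t    ring_shm;
        pool_ring_t *ring;      /* NULL until this pool slot is first used */
        /* ... rest of pool_t unchanged ... */
} pool_t;

/* Called from odp_pool_create() before the pool is put into service */
static int pool_ring_reserve(pool_t *pool, const char *name)
{
        if (pool->ring != NULL)
                return 0;       /* ring already reserved for this slot */

        pool->ring_shm = odp_shm_reserve(name, sizeof(pool_ring_t),
                                         ODP_CACHE_LINE_SIZE, 0);
        if (pool->ring_shm == ODP_SHM_INVALID)
                return -1;

        pool->ring = odp_shm_addr(pool->ring_shm);
        return 0;
}

With this, pool_table_t shrinks to ODP_CONFIG_POOLS small headers, and shared
memory grows only with the pools actually created; the extra pointer
dereference is off the fast path as long as the per-thread local cache
handles most alloc/free requests.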