On Mon, May 30, 2016 at 12:18:37PM +0100, Brian Foley wrote:
> On Mon, May 30, 2016 at 12:40:49PM +0200, Iustin Pop wrote:
> > On Thu, May 26, 2016 at 03:56:10PM +0100, Ganeti Development List wrote:
> > > This is exercised by the luxi QueryInstances call when the sinst_cnt and
> > > sint_list fields are used. This uses a lot of CPU and does a lot of
> > > short-lived heap allocation on clusters with many instances.
> > > 
> > > The reimplementation allocates fewer temporary values, and does fewer
> > > object lookups by UUID. The net effect is to reduce heap use from ~3.2GB
> > > to ~1.5GB, and CPU use from ~1200ms to ~770ms in a test harness using
> > > a config with 1000 DRBD instances on 80 nodes.
> > 
> > Hi Brian & all,
> > 
> > I was surprised to see that much heap use and wall time for this code,
> > so I looked at this yesterday (thanks for the config file/test
> > hardness!).
> > 
> > Profiling shows that overall, the allocation for this test harness is
> > split half-half between config loading (which is a separate issue) and
> > the UTF8.fromString calls for converting between the disk IDs (stored as
> > String in the instance) and the keys for configDisks (which are
> > ByteStrings).
> >
> > Writing a simple 20-line hack to change the instance disks to be
> > ByteStrings shows that the runtime of just "map (snd . getNodeInstances
> > cfg) all_nodes" goes from (on my machine) 200ms to ~45 ms., i.e. a 5×
> > decrease in runtime, with the getNodeInstances doing very few
> > allocations.
> >
> > Is there a reason not to store UUIDs everywhere as ByteStrings? My
> > quick and dirty patch seems to pass all unittests.
> 
> That's a huge improvement, and it would be great if you could get it to
> work. The problem I had is that there's a lot of places that use
> getItem'/getItem, and these currently require a String arg (because they
> can take a name, a name prefix, or a UUID), so you end up doing a lot of
> conversions back from ByteString to String. :(

Yes, saw that, and from a cursory look I believe it can be worked
around. Testing overall performance might be difficult though. Are there
nowadays large-scale performance tests? Or just individual benchmarks?

> Note also that there's a caveat with ByteStrings: they use pinned memory
> and therefore can cause lots of heap fragmentation, and I suspect this
> might be an issue in our daemons.
> 
> Data.ByteString.Short is more compact and doesn't used pinned memory. I
> wonder if this could be used instead.

That's interesting, I didn't know that. Although many of the Container
types (or all?) already use ByteString as keys, so I suspect there are
already problems with it.

OK, will look into this, and see what I can come up with.

thanks!
iustin

Reply via email to