On Mon, May 30, 2016 at 12:18:37PM +0100, Brian Foley wrote: > On Mon, May 30, 2016 at 12:40:49PM +0200, Iustin Pop wrote: > > On Thu, May 26, 2016 at 03:56:10PM +0100, Ganeti Development List wrote: > > > This is exercised by the luxi QueryInstances call when the sinst_cnt and > > > sint_list fields are used. This uses a lot of CPU and does a lot of > > > short-lived heap allocation on clusters with many instances. > > > > > > The reimplementation allocates fewer temporary values, and does fewer > > > object lookups by UUID. The net effect is to reduce heap use from ~3.2GB > > > to ~1.5GB, and CPU use from ~1200ms to ~770ms in a test harness using > > > a config with 1000 DRBD instances on 80 nodes. > > > > Hi Brian & all, > > > > I was surprised to see that much heap use and wall time for this code, > > so I looked at this yesterday (thanks for the config file/test > > hardness!). > > > > Profiling shows that overall, the allocation for this test harness is > > split half-half between config loading (which is a separate issue) and > > the UTF8.fromString calls for converting between the disk IDs (stored as > > String in the instance) and the keys for configDisks (which are > > ByteStrings). > > > > Writing a simple 20-line hack to change the instance disks to be > > ByteStrings shows that the runtime of just "map (snd . getNodeInstances > > cfg) all_nodes" goes from (on my machine) 200ms to ~45 ms., i.e. a 5× > > decrease in runtime, with the getNodeInstances doing very few > > allocations. > > > > Is there a reason not to store UUIDs everywhere as ByteStrings? My > > quick and dirty patch seems to pass all unittests. > > That's a huge improvement, and it would be great if you could get it to > work. The problem I had is that there's a lot of places that use > getItem'/getItem, and these currently require a String arg (because they > can take a name, a name prefix, or a UUID), so you end up doing a lot of > conversions back from ByteString to String. :(
Yes, saw that, and from a cursory look I believe it can be worked around. Testing overall performance might be difficult though. Are there nowadays large-scale performance tests? Or just individual benchmarks? > Note also that there's a caveat with ByteStrings: they use pinned memory > and therefore can cause lots of heap fragmentation, and I suspect this > might be an issue in our daemons. > > Data.ByteString.Short is more compact and doesn't used pinned memory. I > wonder if this could be used instead. That's interesting, I didn't know that. Although many of the Container types (or all?) already use ByteString as keys, so I suspect there are already problems with it. OK, will look into this, and see what I can come up with. thanks! iustin
