I’m also a little surprised at the halving… what values were being stored? I suppose in theory many values are relatively “small”, memory-wise, so the overhead of also storing the key could roughly double the memory use, but if you’re storing large values, I wouldn’t expect the savings to be as dramatic. What were your data values (types, and length distribution for strings)?
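For what it's worth, here is a rough sketch of the POJO-vs-map comparison Michael suggested. Everything in it is made up for illustration (the `Pojo` class, the three properties, the object count), and heap readings via `Runtime` are only approximate since `System.gc()` is just a hint, so treat the numbers as a ballpark rather than a measurement.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical test class with three "small" properties; not part of Cayenne.
class Pojo {
    final String name;
    final Integer count;
    final Boolean active;

    Pojo(String name, Integer count, Boolean active) {
        this.name = name;
        this.count = count;
        this.active = active;
    }
}

class FootprintSketch {

    // Same values as buildMaps, stored in plain fields.
    static List<Pojo> buildPojos(int n) {
        List<Pojo> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(new Pojo("name" + i, i, i % 2 == 0));
        }
        return list;
    }

    // Same values as buildPojos, stored in one HashMap per object.
    static List<Map<String, Object>> buildMaps(int n) {
        List<Map<String, Object>> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            Map<String, Object> m = new HashMap<>();
            m.put("name", "name" + i);
            m.put("count", i);
            m.put("active", i % 2 == 0);
            list.add(m);
        }
        return list;
    }

    // Approximate used heap; System.gc() is only a request to the JVM.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        int n = 100_000; // arbitrary object count for the sketch
        long base = usedHeap();
        List<Pojo> pojos = buildPojos(n);
        long afterPojos = usedHeap();
        List<Map<String, Object>> maps = buildMaps(n);
        long afterMaps = usedHeap();

        System.out.printf("POJOs: ~%d bytes/object%n", (afterPojos - base) / n);
        System.out.printf("Maps:  ~%d bytes/object%n", (afterMaps - afterPojos) / n);
        // Touch both lists so they stay reachable until after the last reading.
        System.out.println(pojos.size() + maps.size());
    }
}
```

Note that the key strings here are shared literals, so the per-object cost on the map side comes from the HashMap itself, its table, and one entry node per property rather than from the keys. With larger values (long strings, say) that fixed overhead should become a smaller fraction of the total.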
Thanks!

Robert

> On Jun 10, 2017, at 6:49 AM, Michael Gentry <blackn...@gmail.com> wrote:
>
> Hi John,
>
> I'm a little surprised that map-based storage is over 2x worse in memory
> consumption. I'm wondering if there is more going on here than storage of
> the property values. Would it be simple enough to adapt your test case to
> compare a list of POJOs vs a list of maps and see what the memory footprint
> and difference is that way?
>
> I personally was thinking the big improvement for using fields directly is
> the speed improvement. I didn't think the memory consumption difference
> would be that dramatic.
>
> Thanks,
>
> mrg
>
>
> On Fri, Jun 9, 2017 at 10:55 AM, John Huss <johnth...@gmail.com> wrote:
>
>> I did some experimenting recently to see if changes to the way data is
>> stored in Cayenne objects could reduce the amount of memory they consume.
>>
>> I chose to use separate fields for each property instead of a HashMap
>> (which is what CayenneDataObject uses). The results were very affirming.
>> For my test of loading 10,000 objects from every table in my database I got
>> it to use about *half the memory* of the default class (from 921 MB
>> down to 431 MB).
>>
>> I know there has been some discussion already about addressing this topic
>> for the next major release, so I thought I'd throw in some observations /
>> questions here.
>>
>> For my implementation I subclassed CayenneDataObject because in previous
>> experience I found implementing a replacement to be much more difficult and
>> subject to more bugs due to the less frequently used code path that
>> PersistentObject and its descriptors take you down. My apps rely on
>> things that are sort of specific to CayenneDataObject like Validating.
>>
>> So one question is how we should be addressing the need that people may
>> have to create their own data classes. Right now I believe the recommended
>> path is to subclass PersistentObject, but I'm not convinced that that is a
>> viable solution without wholesale copying most of CayenneDataObject into
>> your subclass. I'd rather see a fuller base class (in addition to keeping
>> PersistentObject around) that includes all of CayenneDataObject except the
>> property storage (HashMap).
>>
>> For my implementation I had to modify CayenneDataObject, but only slightly
>> to avoid creating the HashMap which I wasn't using. However, because the class
>> isn't really intended for customization this map is referenced in multiple
>> methods that can't easily be overridden to change the way things are
>> stored.
>>
>> Another approach might be to ask why anyone should need to customize the
>> way data is stored in the objects if we can just use the best solution
>> possible in the first place? I can't imagine a more efficient
>> representation than fields. However, fields present difficulties for the
>> use case where you aren't generating unique classes for your model but just
>> rely on the generic class. In theory this could be addressed via runtime
>> code generation or something else, but that would be quite a change.
>>
>> So I'm looking forward to discussing this and toward the future.
>>
>> John
>>
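
To make the two storage approaches in John's message concrete, here is a minimal sketch of map-backed versus field-backed property storage that still allows access by property name. The class names, the `readProperty`/`writeProperty` methods, and the two properties are hypothetical and deliberately do not use the Cayenne API; this is not John's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical map-backed object: every property lives in a per-object HashMap.
class MapBackedArtist {
    private final Map<String, Object> values = new HashMap<>();

    Object readProperty(String propName) {
        return values.get(propName);
    }

    void writeProperty(String propName, Object value) {
        values.put(propName, value);
    }
}

// Hypothetical field-backed object: one field per property, with a switch
// so generic code can still read and write properties by name.
class FieldBackedArtist {
    private String name;
    private Integer birthYear;

    Object readProperty(String propName) {
        switch (propName) {
            case "name":
                return name;
            case "birthYear":
                return birthYear;
            default:
                throw new IllegalArgumentException("Unknown property: " + propName);
        }
    }

    void writeProperty(String propName, Object value) {
        switch (propName) {
            case "name":
                name = (String) value;
                break;
            case "birthYear":
                birthYear = (Integer) value;
                break;
            default:
                throw new IllegalArgumentException("Unknown property: " + propName);
        }
    }
}
```

The field-backed version needs a class per entity, which is the difficulty John raises for the generic-class use case: without generated classes there is nowhere to declare the fields.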