I’m also a little surprised at the halving… what were the values being stored?
I suppose in theory many values are relatively “small”, memory-wise, so the
overhead of also storing the key could roughly double the memory use, but if
you’re storing large values, I wouldn’t expect the difference to be as
dramatic. What were your data values (types, and the length distribution for
strings)?

Thanks!

Robert

> On Jun 10, 2017, at 6:49 AM, Michael Gentry <blackn...@gmail.com> wrote:
> 
> Hi John,
> 
> I'm a little surprised that map-based storage is over 2x worse in memory
> consumption.  I'm wondering if there is more going on here than storage of
> the property values.  Would it be simple enough to adapt your test case to
> compare a list of POJOs vs a list of maps and see what the difference in
> memory footprint is that way?
> 
> I personally was thinking the big improvement from using fields directly
> would be speed.  I didn't think the memory consumption difference
> would be that dramatic.
> 
> Thanks,
> 
> mrg
> 
> 
> On Fri, Jun 9, 2017 at 10:55 AM, John Huss <johnth...@gmail.com> wrote:
> 
>> I did some experimenting recently to see if changes to the way data is
>> stored in Cayenne objects could reduce the amount of memory they consume.
>> 
>> I chose to use separate fields for each property instead of a HashMap
>> (which is what CayenneDataObject uses).  The results were very affirming.
>> For my test of loading 10,000 objects from every table in my database I got
>> it to use about *half the memory* of the default class (from 921 MB
>> down to 431 MB).
>> 
>> I know there has been some discussion already about addressing this topic
>> for the next major release, so I thought I'd throw in some observations /
>> questions here.
>> 
>> For my implementation I subclassed CayenneDataObject because in previous
>> experience I found implementing a replacement to be much more difficult and
>> subject to more bugs due to the less frequently used code path that
>> PersistentObject and its descriptors take you down.  My apps rely on
>> things that are sort of specific to CayenneDataObject like Validating.
>> 
>> So one question is how we should be addressing the need that people may
>> have to create their own data classes. Right now I believe the recommended
>> path is to subclass PersistentObject, but I'm not convinced that that is a
>> viable solution without wholesale copying most of CayenneDataObject into
>> your subclass.  I'd rather see a fuller base class (in addition to keeping
>> PersistentObject around) that includes all of CayenneDataObject except the
>> property storage (HashMap).
>> 
>> For my implementation I had to modify CayenneDataObject, but only slightly,
>> to avoid creating the HashMap, which I wasn't using. However, because the
>> class isn't really intended for customization, this map is referenced in
>> multiple methods that can't easily be overridden to change the way things
>> are stored.
>> 
>> Another approach might be to ask why anyone should need to customize the
>> way data is stored in the objects if we can just use the best solution
>> possible in the first place?  I can't imagine a more efficient
>> representation than fields.  However, fields present difficulties for the
>> use case where you aren't generating unique classes for your model but just
>> rely on the generic class.  In theory this could be addressed via runtime
>> code generation or something else, but that would be quite a change.
>> 
>> So I'm looking forward to discussing this and toward the future.
>> 
>> John
>> 

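The comparison suggested in the thread (a list of POJOs vs a list of maps) could be sketched roughly as below. This is only an illustration, not code from Cayenne or from John's test: the class names FieldRecord and MapRecord, the three properties, and the usedHeap() helper are all hypothetical, and the heap figures it prints are approximate because System.gc() is only a hint to the JVM.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StorageFootprint {

    /** Hypothetical field-based record: one Java field per property. */
    public static final class FieldRecord {
        private final long id;
        private final String name;
        private final int quantity;

        public FieldRecord(long id, String name, int quantity) {
            this.id = id;
            this.name = name;
            this.quantity = quantity;
        }

        /** Generic access by property name; primitives are boxed on return. */
        public Object readProperty(String key) {
            switch (key) {
                case "id": return id;
                case "name": return name;
                case "quantity": return quantity;
                default: return null;
            }
        }
    }

    /** Hypothetical map-based record: every property lives in a HashMap. */
    public static final class MapRecord {
        private final Map<String, Object> values = new HashMap<>();

        public MapRecord(long id, String name, int quantity) {
            values.put("id", id);             // boxed to Long
            values.put("name", name);
            values.put("quantity", quantity); // boxed to Integer
        }

        public Object readProperty(String key) {
            return values.get(key);
        }
    }

    /** Rough estimate of used heap in bytes; the GC request is best effort. */
    public static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        int count = 100_000;

        long before = usedHeap();
        List<FieldRecord> fields = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            fields.add(new FieldRecord(i, "name-" + i, i % 100));
        }
        long afterFields = usedHeap();

        List<MapRecord> maps = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            maps.add(new MapRecord(i, "name-" + i, i % 100));
        }
        long afterMaps = usedHeap();

        System.out.printf("field-based: ~%d KB%n", (afterFields - before) / 1024);
        System.out.printf("map-based:   ~%d KB%n", (afterMaps - afterFields) / 1024);
        // Touch both lists after measuring so they stay reachable until here.
        System.out.println(fields.size() + maps.size() + " records held");
    }
}
```

Running it with different value types and string lengths would also speak to the question about how much of the difference depends on the size of the stored values.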