That makes much more sense! That'll teach me to sleep-read. Well, probably not. :-)
These are pretty nice improvements overall. When is 4.1 coming out? :-)

Thanks,

mrg

On Wed, Jul 5, 2017 at 1:21 PM, Andrus Adamchik <and...@objectstyle.org> wrote:

> > I'm wondering if you
> > inadvertently switched old vs new in the performance section? (Since the
> > new, on the right, is always slower.)
>
> The benchmark unit is millions of ops per second. So a bigger value is
> better/faster (kind of like RPM in a car).
>
> Andrus
>
> > On Jul 5, 2017, at 7:31 PM, Michael Gentry <blackn...@gmail.com> wrote:
> >
> > Hi Nikita,
> >
> > I saw the pull request and was taking a glance at it, so thanks for
> > following up with an e-mail.
> >
> > The memory improvement looks quite nice, but I'm wondering if you
> > inadvertently switched old vs new in the performance section? (Since the
> > new, on the right, is always slower.)
> >
> > Thanks,
> >
> > mrg
> >
> >
> > On Wed, Jul 5, 2017 at 10:19 AM, Nikita Timofeev <ntimof...@objectstyle.com>
> > wrote:
> >
> >> Hi all,
> >>
> >> I've run some additional benchmarks for field-based classes inspired
> >> by John and they were so promising that I've moved on
> >> to the implementation.
> >>
> >> So here is a pull request for you to review [1].
> >> Here [2] you can see what the new generated classes will look like.
> >>
> >> For me there are no visible downsides to this solution: both
> >> memory usage and speed are improved.
> >> All tests are clean and the only minor incompatibility out there
> >> is in the HOLLOW state, which no longer resets an object's values [3]
> >> (though this can be implemented as well, I'm just
> >> not sure this is really needed).
> >>
> >> P.S. Here are some raw numbers from my benchmarks.
> >> I'm giving absolute numbers, but really only their relation is important.
> >> Results for the old version are on the left, for the new version on the right.
> >>
> >> Memory usage:
> >> ==============
> >> 1. 10,000 small objects
> >>    (int, Date and String ~ 20 chars)
> >> >>> 6MB vs 2.5MB <<<
> >>
> >> 2. 10,000 objects with big values
> >>    (int, Date and String ~ 1K chars)
> >>    Actually, for the same classes (same number of fields)
> >>    there will be just a constant difference,
> >>    so this is only to give an idea of what to expect in different cases.
> >> >>> 24.5MB vs 21MB <<<
> >>
> >> Performance:
> >> ==============
> >> (numbers are in millions of ops per sec, measured with a JMH benchmark)
> >> 1. Getter:
> >> >>> 107 vs 177 <<<
> >>
> >> 2. Setter:
> >>    Not so impressive, as the Cayenne stack took most of the
> >>    time here to process the graph diff, but the new methods are still better.
> >> >>> 12.5 vs 14.5 <<<
> >>
> >> 3. readPropertyDirectly:
> >> >>> 152 vs 248 <<<
> >>
> >> 4. writePropertyDirectly:
> >>    This is a map.put() vs switch(String) battle,
> >>    and the map is definitely losing it :)
> >> >>> 126 vs 582 <<<
> >>
> >> [1] https://github.com/apache/cayenne/pull/235
> >> [2] https://github.com/stariy95/cayenne/blob/544aae0866e8fb1712f07f00794ea3263a4c95b5/cayenne-server/src/test/java/org/apache/cayenne/testdo/testmap/auto/_Artist.java
> >> [3] https://github.com/stariy95/cayenne/blob/544aae0866e8fb1712f07f00794ea3263a4c95b5/cayenne-server/src/test/java/org/apache/cayenne/access/DataContextExtrasIT.java#L144
> >>
> >> On Wed, Jun 21, 2017 at 10:20 PM, John Huss <johnth...@gmail.com> wrote:
> >>> I was surprised by the difference in memory too, but this is a small diff
> >>> (apart from the newly generated readPropertyDirectly/writePropertyDirectly
> >>> methods) so there isn't anything else going on. My unverified assumption
> >>> about HashMap is that it doubles in size each time it resizes, so entities
> >>> with more fields could cause more waste. For example an entity with 65
> >>> fields would have 63 empty array slots (ignoring fill factor). So the
> >>> exact savings may vary.
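
The HashMap arithmetic in the message above can be checked with a small model. This is only a sketch under stated assumptions: the map is created with the JDK defaults (initial capacity 16, load factor 0.75) and at least one entry is stored. How CayenneDataObject actually constructs its map has not been checked here, and `HashMapSlots`, `tableSize` and `emptySlots` are hypothetical names, not Cayenne or JDK API.

```java
// Hypothetical helper (not part of Cayenne or the JDK): models how a
// default-constructed java.util.HashMap grows its bucket array.
final class HashMapSlots {
    static final int DEFAULT_CAPACITY = 16;   // HashMap default initial capacity
    static final double LOAD_FACTOR = 0.75;   // HashMap default load factor

    /** Bucket-array length after putting `entries` (>= 1) distinct keys. */
    static int tableSize(int entries) {
        int capacity = DEFAULT_CAPACITY;
        // The table doubles once the entry count exceeds capacity * loadFactor.
        while (entries > capacity * LOAD_FACTOR) {
            capacity *= 2;
        }
        return capacity;
    }

    /** Table length minus entry count: a lower bound on slots holding no entry. */
    static int emptySlots(int entries) {
        return tableSize(entries) - entries;
    }

    public static void main(String[] args) {
        for (int fields : new int[] {3, 12, 13, 65}) {
            System.out.printf("%d fields -> table of %d slots, at least %d empty%n",
                    fields, tableSize(fields), emptySlots(fields));
        }
    }
}
```

Under these assumptions an entity with 65 fields ends up with a 128-slot table, so 63 slots hold no entry, which agrees with the figure quoted above even when the load factor is taken into account. Collisions can only increase the number of empty slots.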
> >>>
> >>> On Sat, Jun 17, 2017 at 1:01 AM Robert Zeigler <robert.zeig...@roxanemy.com>
> >>> wrote:
> >>>
> >>>> I’m also a little surprised at the 1/2-ing… what were the values being
> >>>> stored? I suppose in theory, many values are relatively “small”,
> >>>> memory-wise, so having the overhead of also storing the key could ~double
> >>>> the memory use, but if you’re storing large values, I wouldn’t expect the
> >>>> utilization to drop as dramatically. What were your data values (type and
> >>>> length distribution for strings)?
> >>>>
> >>>> Thanks!
> >>>>
> >>>> Robert
> >>>>
> >>>>> On Jun 10, 2017, at 6:49 AM, Michael Gentry <blackn...@gmail.com> wrote:
> >>>>>
> >>>>> Hi John,
> >>>>>
> >>>>> I'm a little surprised that map-based storage is over 2x worse in memory
> >>>>> consumption. I'm wondering if there is more going on here than storage of
> >>>>> the property values. Would it be simple enough to adapt your test case to
> >>>>> compare a list of POJOs vs a list of maps and see what the memory footprint
> >>>>> and difference is that way?
> >>>>>
> >>>>> I personally was thinking the big improvement for using fields directly is
> >>>>> the speed improvement. I didn't think the memory consumption difference
> >>>>> would be that dramatic.
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> mrg
> >>>>>
> >>>>>
> >>>>> On Fri, Jun 9, 2017 at 10:55 AM, John Huss <johnth...@gmail.com> wrote:
> >>>>>
> >>>>>> I did some experimenting recently to see if changes to the way data is
> >>>>>> stored in Cayenne objects could reduce the amount of memory they consume.
> >>>>>>
> >>>>>> I chose to use separate fields for each property instead of a HashMap
> >>>>>> (which is what CayenneDataObject uses). The results were very affirming.
> >>>>>> For my test of loading 10,000 objects from every table in my database I got
> >>>>>> it to use about *half the memory* of the default class (from 921 MB
> >>>>>> down to 431 MB).
> >>>>>>
> >>>>>> I know there has been some discussion already about addressing this topic
> >>>>>> for the next major release, so I thought I'd throw in some observations /
> >>>>>> questions here.
> >>>>>>
> >>>>>> For my implementation I subclassed CayenneDataObject because in previous
> >>>>>> experience I found implementing a replacement to be much more difficult and
> >>>>>> subject to more bugs due to the less frequently used code path that
> >>>>>> PersistentObject and its descriptors take you down. My apps rely on
> >>>>>> things that are sort of specific to CayenneDataObject like Validating.
> >>>>>>
> >>>>>> So one question is how we should be addressing the need that people may
> >>>>>> have to create their own data classes. Right now I believe the recommended
> >>>>>> path is to subclass PersistentObject, but I'm not convinced that that is a
> >>>>>> viable solution without wholesale copying most of CayenneDataObject into
> >>>>>> your subclass. I'd rather see a fuller base class (in addition to keeping
> >>>>>> PersistentObject around) that includes all of CayenneDataObject except the
> >>>>>> property storage (HashMap).
> >>>>>>
> >>>>>> For my implementation I had to modify CayenneDataObject, but only slightly
> >>>>>> to avoid creating the HashMap which I wasn't using. However, because the class
> >>>>>> isn't really intended for customization this map is referenced in multiple
> >>>>>> methods that can't easily be overridden to change the way things are
> >>>>>> stored.
> >>>>>>
> >>>>>> Another approach might be to ask why anyone should need to customize the
> >>>>>> way data is stored in the objects if we can just use the best solution
> >>>>>> possible in the first place? I can't imagine a more efficient
> >>>>>> representation than fields. However, fields present difficulties for the
> >>>>>> use case where you aren't generating unique classes for your model but just
> >>>>>> rely on the generic class. In theory this could be addressed via runtime
> >>>>>> code generation or something else, but that would be quite a change.
> >>>>>>
> >>>>>> So I'm looking forward to discussing this and toward the future.
> >>>>>>
> >>>>>> John
> >>>>>>
> >>>>
> >>>>
> >>
> >>
> >>
> >> --
> >> Best regards,
> >> Nikita Timofeev
> >>
> >
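
For readers skimming the thread, the "map.put() vs switch(String)" comparison can be pictured roughly as follows. This is a minimal sketch, not Cayenne code: `MapBackedArtist`, `FieldBackedArtist` and `StorageDemo` are hypothetical classes, and the two properties are assumed for illustration. Only the method names `readPropertyDirectly` and `writePropertyDirectly` come from the thread; the real generated class is the `_Artist.java` linked as [2].

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for map-based storage: every property lives in a HashMap.
final class MapBackedArtist {
    private final Map<String, Object> values = new HashMap<>();

    Object readPropertyDirectly(String name) {
        return values.get(name);
    }

    void writePropertyDirectly(String name, Object value) {
        values.put(name, value);
    }
}

// Hypothetical stand-in for field-based storage: one field per property,
// with a switch on the property name for generic access.
final class FieldBackedArtist {
    private String artistName;   // assumed property, for illustration only
    private Date dateOfBirth;    // assumed property, for illustration only

    Object readPropertyDirectly(String name) {
        if (name == null) {
            return null;         // switch on a null String would throw
        }
        switch (name) {
            case "artistName":  return artistName;
            case "dateOfBirth": return dateOfBirth;
            default:            return null;
        }
    }

    void writePropertyDirectly(String name, Object value) {
        if (name == null) {
            return;
        }
        switch (name) {
            case "artistName":  artistName = (String) value; break;
            case "dateOfBirth": dateOfBirth = (Date) value;  break;
            default: throw new IllegalArgumentException("Unknown property: " + name);
        }
    }
}

final class StorageDemo {
    public static void main(String[] args) {
        FieldBackedArtist byField = new FieldBackedArtist();
        byField.writePropertyDirectly("artistName", "Dali");

        MapBackedArtist byMap = new MapBackedArtist();
        byMap.writePropertyDirectly("artistName", "Dali");

        System.out.println(byField.readPropertyDirectly("artistName"));
        System.out.println(byMap.readPropertyDirectly("artistName"));
    }
}
```

The field-backed variant stores no map, no entry objects and no keys per instance, which is the kind of saving the memory numbers in the thread point to. The sketch makes no claim about speed; the throughput figures quoted above come from Nikita's JMH runs, not from this code.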