Is there a convenient setting for disabling the cache completely, as David mentioned he did?
On Sat, 2015-03-21 at 21:39 -0400, Ron Wheeler wrote:
> I agree with Adrian that caching should be a sysadmin choice.
>
> I would also caution that measuring cache performance during testing is not a very useful activity. Testing tends to test one use case once and move on to the next.
> In production, users tend to do the same thing over and over.
> Testing might fill a shopping cart a few times and do a lot of other administrative functions as many times. In real life, shopping carts are filled much more frequently than catalog updates (one hopes). Using performance numbers from functional testing will be misleading.
>
> The other message that I get from David's discussion is that caching built by professional caching experts (Database developers as he mentioned) worked better than caching systems built by application developers.
> It is likely that ehcache and the database built-in caching functions will outperform caching systems built by OFBiz developers and will handle the main cases better and will handle edge cases properly. They will probably integrate better and be easier to configure at run-time or during deployment. They will also be easier to tune by the system administrator.
>
> I understand that Adrian needs to fix this quickly. I suppose that caching could be eliminated to solve the problem while a better solution is implemented.
>
> Do we know what it will take to add enough ehcache to make the system perform adequately to meet current requirements?
>
> Ron
>
>
> On 21/03/2015 6:22 AM, Adrian Crum wrote:
> > I will try to say it again, but differently.
> >
> > If I am a developer, I am not aware of the subtleties of caching various entities. Entity cache settings will be determined during staging. So, I write my code as if everything will be cached - leaving the door open for a sysadmin to configure caching during staging.
> >
> > During staging, a sysadmin can start off with caching disabled, and then switch on caching for various entities while performance tests are being run. After some time, the sysadmin will have cache settings that provide optimal throughput. Does that mean ALL entities are cached? No, only the ones that need to be.
> >
> > The point I'm trying to make is this: The decision to cache or not should be made by a sysadmin, not by a developer.
> >
> > Adrian Crum
> > Sandglass Software
> > www.sandglass-software.com
> >
> > On 3/21/2015 10:08 AM, Scott Gray wrote:
> >>> My preference is to make ALL Delegator calls use the cache.
> >>
> >> Perhaps I misunderstood the above sentence? I responded because I don't think caching everything is a good idea
> >>
> >> On 21 Mar 2015 20:41, "Adrian Crum" <adrian.c...@sandglass-software.com> wrote:
> >>>
> >>> Thanks for the info David! I agree 100% with everything you said.
> >>>
> >>> There may be some misunderstanding about my advice. I suggested that caching should be configured in the settings file, I did not suggest that everything should be cached all the time.
> >>>
> >>> Like you said, JMeter tests can reveal what needs to be cached, and a sysadmin can fine-tune performance by tweaking the cache settings. The problem I mentioned is this: A sysadmin can't improve performance by caching a particular entity if a developer has hard-coded it not to be cached.
> >>>
> >>> Btw, I removed the complicated condition checking in the condition cache because it didn't work. Not only was the system spending a lot of time evaluating long lists of values (each value having a potentially long list of conditions), at the end of the evaluation the result was always a cache miss.
> >>>
> >>>
> >>>
> >>> Adrian Crum
> >>> Sandglass Software
> >>> www.sandglass-software.com
> >>>
> >>> On 3/20/2015 9:22 PM, David E.
> >>> Jones wrote:
> >>>>
> >>>>
> >>>> Stepping back a little, some history and theory of the entity cache might be helpful.
> >>>>
> >>>> The original intent of the entity cache was a simple way to keep frequently used values/records closer to the code that uses them, ie in the application server. One real world example of this is the goal to be able to render ecommerce catalog and product pages without hitting the database.
> >>>>
> >>>> Over time the entity caching was made more complex to handle more caching scenarios, but still left to the developer to determine if caching is appropriate for the code they are writing.
> >>>>
> >>>> In theory is it possible to write an entity cache that can be used 100% of the time? IMO the answer is NO. This is almost possible for single record caching, with the cache ultimately becoming an in-memory relational database running on the app server (with full transaction support, etc)... but for List caching it totally kills the whole concept. The current entity cache keeps lists of results by the query condition used to get those results and this is very different from what a database does, and makes things rather messy and inefficient outside simple use cases.
> >>>>
> >>>> On top of these big functional issues (which are deal killers IMO), there is also the performance issue. The point, or intent at least, of the entity cache is to improve performance. As the cache gets more complex the performance will suffer, and because of the whole concept of caching results by queries the performance will be WORSE than the DB performance for the same queries in most cases.
> >>>> Databases are quite fast and efficient, and we'll never be able to reproduce their ability to scale and search in something like an in-memory entity cache, especially not considering the massive redundancy and overhead of caching lists of values by condition.
> >>>>
> >>>> As an example of this in the real world: on a large OFBiz project I worked on that finished last year we went into production with the entity cache turned OFF, completely DISABLED. Why? When doing load testing on a whim one of the guys decided to try it without the entity cache enabled, and the body of JMeter tests that exercised a few dozen of the most common user paths through the system actually ran FASTER. The database (MySQL in this case) was hit over the network, but responded quickly enough to make things work quite well for the various find queries, and FAR faster for updates, especially creates. This project was one of the higher volume projects I'm aware of for OFBiz, at peaks handling sustained processing of around 10 orders per second (36,000 per hour), with some short term peaks much higher, closer to 20-30 orders per second... and longer term peaks hitting over 200k orders in one day (North America only, daytime, around a 12 hour window).
> >>>>
> >>>> I found this to be curious so looked into it a bit more and the main performance culprit was updates, ESPECIALLY creates on any entity that has an active list cache. Auto-clearing that cache requires running the condition for each cache entry on the record to see if it matches, and if it does then it is cleared. This could be made more efficient by expanding the reverse index concept to index all values of fields in conditions...
> >>>> though that would be fairly complex to implement because of the wide variety of conditions that CAN be performed on fields, and even more so when they are combined with other logic... especially NOTs and ORs. This could potentially increase performance, but would again add yet more complexity and overhead.
> >>>>
> >>>> To turn this dilemma into a nightmare, consider caching view-entities. In general as systems scale if you ever have to iterate over stuff your performance is going to get hit REALLY hard compared to indexed and other less than n operations.
> >>>>
> >>>> The main lesson from the story: caching, especially list caching, should ONLY be done in limited cases when the ratio of reads to writes is VERY high, and more particularly the ratio of reads to creates. When considering whether to use a cache this should be considered carefully, because records are sometimes updated from places that developers are unaware of, sometimes at surprising volumes. For example, it might seem great (and help a lot in dev and lower scale testing) to cache inventory information for viewing on a category screen, but always go to the DB to avoid stale data on a product detail screen and when adding to cart. The problem is that with high order volumes the inventory data is pretty much constantly being updated, so the caches are constantly... SLOWLY... being cleared as InventoryDetail records are created for reservations and issuances.
> >>>>
> >>>> To turn this nightmare into a deal killer, consider multiple application servers and the need for either a (SLOW) distributed cache or (SLOW) distributed cache clearing. These have to go over the network anyway, so might as well go to the database!
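
A minimal Java sketch of the auto-clearing cost David describes above: lists are cached by the condition that produced them, so every create has to run each cached condition against the new record. This is an illustration under stated assumptions, not OFBiz code. `ListCacheSketch`, `CachedList`, `putList` and `clearListsMatching` are hypothetical names, and a plain `Predicate` stands in for an entity condition.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names are OFBiz classes.
class ListCacheSketch {
    // One cached find result: the condition that produced it plus the rows it returned.
    static final class CachedList {
        final Predicate<Map<String, Object>> condition;
        final List<Map<String, Object>> rows;

        CachedList(Predicate<Map<String, Object>> condition, List<Map<String, Object>> rows) {
            this.condition = condition;
            this.rows = rows;
        }
    }

    private final Map<String, CachedList> listsByConditionKey = new HashMap<>();
    long conditionEvaluations = 0;

    void putList(String conditionKey, Predicate<Map<String, Object>> condition,
                 List<Map<String, Object>> rows) {
        listsByConditionKey.put(conditionKey, new CachedList(condition, rows));
    }

    int size() {
        return listsByConditionKey.size();
    }

    // Called on every create: each cached condition is evaluated against the new
    // record, and any list the record would belong to is dropped. The work grows
    // linearly with the number of cached lists, whether or not anything matches.
    int clearListsMatching(Map<String, Object> newRecord) {
        int cleared = 0;
        Iterator<CachedList> it = listsByConditionKey.values().iterator();
        while (it.hasNext()) {
            CachedList cached = it.next();
            conditionEvaluations++;
            if (cached.condition.test(newRecord)) {
                it.remove();
                cleared++;
            }
        }
        return cleared;
    }

    public static void main(String[] args) {
        ListCacheSketch cache = new ListCacheSketch();
        for (int i = 0; i < 1000; i++) {
            final String productId = "P" + i;
            cache.putList("productId=" + productId,
                    row -> productId.equals(row.get("productId")),
                    new ArrayList<>());
        }
        Map<String, Object> created = new HashMap<>();
        created.put("productId", "P42");
        int cleared = cache.clearListsMatching(created);
        System.out.println("cleared=" + cleared + " evaluations=" + cache.conditionEvaluations);
    }
}
```

With 1000 cached lists, the single create in `main` clears one list but evaluates all 1000 conditions. Indexing field values would avoid the scan for simple equality conditions, which is the reverse-index idea mentioned in the message, but it does not extend easily to NOTs and ORs.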
> >>>>
> >>>> In the case above where we decided to NOT use the entity cache at all the tests were run on one really beefy server showing that disabling the cache was faster. When we ran it in a cluster of just 2 servers with direct DCC (the best case scenario for a distributed cache) we not only saw a big performance hit, but also got various run-time errors from stale data.
> >>>>
> >>>> I really don't see how anyone could back the concept of caching all finds by default... you don't even have to imagine edge cases, just consider the problems ALREADY being faced with more limited caching and how often the entity cache simply isn't a good solution.
> >>>>
> >>>> As for improving the entity caching in OFBiz, there are some concepts in Moqui that might be useful:
> >>>>
> >>>> 1. add a cache attribute to the entity definition with true, false, and never options; true and false being defaults that can be overridden by code, and never being an absolute (OFBiz does have this option IIRC); this would default to false, true being a useful setting for common things like Enumeration, StatusItem, etc, etc
> >>>>
> >>>> 2. add general support in the entity engine find methods for a "for update" parameter, and if true don't cache (and pass this on to the DB to lock the record(s) being queried), also making the value mutable
> >>>>
> >>>> 3. a write-through per-transaction cache; you can do some really cool stuff with this, avoiding most database hits during a transaction until the end when the changes are dumped to the DB; the Moqui implementation of this concept even looks for cached records that any find condition would require to get results and does the query in-memory, not having to go to the database at all...
> >>>> and for other queries augments the results with values in the cache
> >>>>
> >>>> The whole concept of a write-through cache that is limited to the scope of a single transaction shows some of the issues you would run into even if trying to make the entity cache transactional. Especially with more complex finds it just falls apart. The current Moqui implementation handles quite a bit, but there are various things that I've run into testing it with real-world business services that are either a REAL pain to handle (so I haven't yet, but it is conceptually possible) or that I simply can't think of any good way to handle... and for those you simply can't use the write-through cache.
> >>>>
> >>>> There are some notes in the code for this, and some code/comments to more thoroughly communicate this concept, in this class in Moqui:
> >>>>
> >>>> https://github.com/moqui/moqui/blob/master/framework/src/main/groovy/org/moqui/impl/context/TransactionCache.groovy
> >>>>
> >>>> I should also say that my motivation to handle every edge case even for this write-through cache is limited... yes there is room for improvement handling more scenarios, but how big will the performance increase ACTUALLY be for them? The efforts on this so far have been based on profiling results and making sure there is a significant difference (which there is for many services in Mantle Business Artifacts, though I haven't even come close to testing all of them this way).
> >>>>
> >>>> The same concept would apply to a read-only entity cache... some things might be possible to support, but would NOT improve performance making them a moot point.
> >>>>
> >>>> I don't know if I've written enough to convince everyone listening that even attempting a universal read-only entity cache is a useless idea...
> >>>> I'm sure some will still like the idea. If anyone gets into it and wants to try it out in their own branch of OFBiz, great... knock yourself out (probably literally...). But PLEASE no one ever commit something like this to the primary branch in the repo... not EVER.
> >>>>
> >>>> The whole idea that the OFBiz entity cache has had more limited ability to handle different scenarios in the past than it does now is not an argument of any sort supporting the idea of taking the entity cache to the ultimate possible end... which theoretically isn't even that far from where it is now.
> >>>>
> >>>> To apply a more useful standard the arguments should be for a _useful_ objective, which means increasing performance. I guarantee an always used find cache will NOT increase performance, it will kill it dead and cause infinite concurrency headaches in the process.
> >>>>
> >>>> -David
> >>>>
> >>>>
> >>>>> On 19 Mar 2015, at 10:46, Adrian Crum <adrian.c...@sandglass-software.com> wrote:
> >>>>>
> >>>>> The translation to English is not good, but I think I understand what you are saying.
> >>>>>
> >>>>> The entity values in the cache MUST be immutable - because multiple threads share the values. To do otherwise would require complicated synchronization code in GenericValue (which would cause blocking and hurt performance).
> >>>>>
> >>>>> When I first started working on the entity cache issues, it appeared to me that mutable entity values may have been in the original design (to enable a write-through cache). That is my guess - I am not sure. At some time, the entity values in the cache were made immutable, but the change was incomplete - some cached entity values were immutable and others were not. That is one of the things I fixed - I made sure ALL entity values coming from the cache are immutable.
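
The immutability rule Adrian describes can be sketched in a few lines of Java. This is only an illustration of the idea and makes no claim about the real GenericValue API: `ImmutableValueSketch`, `Value`, `set` and `mutableCopy` are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a cached entity value; not the real GenericValue API.
class ImmutableValueSketch {
    static final class Value {
        private final Map<String, Object> fields;
        private final boolean mutable;

        Value(Map<String, Object> fields, boolean mutable) {
            this.fields = new HashMap<>(fields); // defensive copy, callers keep no handle
            this.mutable = mutable;
        }

        Object get(String name) {
            return fields.get(name);
        }

        // Values handed out by the cache are shared between threads, so writes are refused.
        void set(String name, Object value) {
            if (!mutable) {
                throw new IllegalStateException("cached value is read-only; copy it first");
            }
            fields.put(name, value);
        }

        // The one extra line a caller needs before modifying a value from the cache.
        Value mutableCopy() {
            return new Value(fields, true);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> initial = new HashMap<>();
        initial.put("partyId", "10000");
        initial.put("statusId", "PARTY_ENABLED");
        Value cached = new Value(initial, false);

        try {
            cached.set("statusId", "PARTY_DISABLED");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        Value copy = cached.mutableCopy();
        copy.set("statusId", "PARTY_DISABLED");
        System.out.println("cached=" + cached.get("statusId") + " copy=" + copy.get("statusId"));
    }
}
```

Because the shared instance never changes, readers need no synchronization, and only the code path that actually modifies a value pays for the copy.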
> >>>>>
> >>>>> One way we can eliminate the additional complication of cloning immutable entity values is to wrap the List in a custom Iterator implementation that automatically clones elements as they are retrieved from the List. The drawback is the performance hit - because you would be cloning values that might not get modified. I think it is more efficient to clone an entity value only when you intend to modify it.
> >>>>>
> >>>>> Adrian Crum
> >>>>> Sandglass Software
> >>>>> www.sandglass-software.com
> >>>>>
> >>>>> On 3/19/2015 4:19 PM, Nicolas Malin wrote:
> >>>>>>
> >>>>>> On 18/03/2015 13:16, Adrian Crum wrote:
> >>>>>>>
> >>>>>>> If you code Delegator calls to avoid the cache, then there is no way for a sysadmin to configure the caching behavior - that bit of code will ALWAYS make a database call.
> >>>>>>>
> >>>>>>> If you make all Delegator calls use the cache, then there is an additional complication that will add a bit more code: the GenericValue instances retrieved from the cache are immutable - if you want to modify them, then you will have to clone them. So, this approach can produce an additional line of code.
> >>>>>>
> >>>>>>
> >>>>>> I don't see any logical reason why we need to keep a GenericValue that came from the cache as immutable. In the big picture, a developer gives information on cache or not only if he wants to force cache use during his process. As OFBiz manages by default transaction, timezone, locale, auto-matching or others.
> >>>>>> The entity engine would work with sysadmin cache tuning.
> >>>>>>
> >>>>>> As an example, delegator.find("Party", "partyId", partyId) uses the default parameter from cache.properties, and after that the store on a cached GenericValue is the delegator's problem.
> >>>>>> I see a simple test like that :
> >>>>>> if (genericValue came from cache) {
> >>>>>>     if (value is already done) {
> >>>>>>         getFromDataBase
> >>>>>>         update Value
> >>>>>>     }
> >>>>>>     else refuse (or not I have a doubt :) )
> >>>>>> }
> >>>>>> store
> >>>>>>
> >>>>>>
> >>>>>> Nicolas
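
One possible reading of Nicolas's pseudocode, as a small Java sketch. The assumptions are loud here: an in-memory map stands in for the database, `StoreCachedValueSketch`, `Value`, `insert`, `row` and `store` are hypothetical names rather than Delegator methods, and the "refuse" branch is left out because the original message is itself unsure about it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the map below stands in for the database and
// none of these names are OFBiz Delegator methods.
class StoreCachedValueSketch {
    static final class Value {
        final Map<String, Object> fields;
        final boolean fromCache;

        Value(Map<String, Object> fields, boolean fromCache) {
            this.fields = new HashMap<>(fields);
            this.fromCache = fromCache;
        }
    }

    private final Map<String, Map<String, Object>> database = new HashMap<>();

    void insert(String pk, Map<String, Object> row) {
        database.put(pk, new HashMap<>(row));
    }

    Map<String, Object> row(String pk) {
        return database.get(pk);
    }

    // If the value came from the cache it may be stale, so the current row is
    // re-read ("getFromDataBase"), the changes are applied on top ("update Value"),
    // and the result is written back ("store").
    void store(String pk, Value value, Map<String, Object> changes) {
        Map<String, Object> target;
        if (value.fromCache) {
            target = new HashMap<>(database.get(pk));
        } else {
            target = new HashMap<>(value.fields);
        }
        target.putAll(changes);
        database.put(pk, target);
    }

    public static void main(String[] args) {
        StoreCachedValueSketch delegator = new StoreCachedValueSketch();
        Map<String, Object> original = new HashMap<>();
        original.put("partyId", "10000");
        original.put("description", "old");
        delegator.insert("10000", original);

        // A copy handed out by the cache before the row changed in the database.
        Value cached = new Value(original, true);
        Map<String, Object> newer = new HashMap<>(original);
        newer.put("description", "new");
        delegator.insert("10000", newer);

        Map<String, Object> changes = new HashMap<>();
        changes.put("statusId", "PARTY_DISABLED");
        delegator.store("10000", cached, changes);
        System.out.println(delegator.row("10000"));
    }
}
```

In this reading the stale `description` held by the cached copy does not overwrite the newer database row. The price is an extra read on every store of a cached value, which is the kind of hidden database hit discussed elsewhere in the thread.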