On 20/03/2015 at 23:37, Michael Brohl wrote:
David,

wow, quite an interesting read!

Thank you for sharing some historical design insights for the entity cache and the valuable findings from using (or not using) the entity cache in a real-life production scenario.

I haven't had the time to dig deeper into the proposed entity cache changes, but I had the feeling that this would be quite a challenging change which would have to be very well thought out and would need some thorough testing.

As Scott said, without the ability to compare the two possible implementations 
(old and new) it's a risky thing. David's feedback proves it.

Jacques


Regards,

Michael


On 20.03.15 at 22:22, David E. Jones wrote:
Stepping back a little, some history and theory of the entity cache might be 
helpful.

The original intent of the entity cache was a simple way to keep frequently used values/records closer to the code that uses them, i.e. in the application server. One real world example of this is the goal to be able to render ecommerce catalog and product pages without hitting the database.

Over time the entity caching was made more complex to handle more caching scenarios, but still left to the developer to determine if caching is appropriate for the code they are writing.

In theory is it possible to write an entity cache that can be used 100% of the time? IMO the answer is NO. This is almost possible for single record caching, with the cache ultimately becoming an in-memory relational database running on the app server (with full transaction support, etc)... but for List caching it totally kills the whole concept. The current entity cache keeps lists of results by the query condition used to get those results and this is very different from what a database does, and makes things rather messy and inefficient outside simple use cases.
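To make the list-caching problem concrete, here is a minimal hypothetical sketch (not the actual OFBiz code) of a cache keyed by the query condition. Each distinct condition stores its own full result list, so overlapping queries are cached redundantly; a database, by contrast, serves both from the same indexed rows.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a list cache keyed by query condition.
// Every distinct condition gets its own full copy of the results.
class ConditionKeyedListCache {
    private final Map<String, List<Map<String, Object>>> byCondition = new ConcurrentHashMap<>();

    void put(String conditionKey, List<Map<String, Object>> results) {
        byCondition.put(conditionKey, results);
    }

    List<Map<String, Object>> get(String conditionKey) {
        return byCondition.get(conditionKey);
    }

    // Overlapping conditions are separate entries: redundant storage
    // that a database's shared indexes avoid entirely.
    int distinctEntries() {
        return byCondition.size();
    }
}
```

For example, "statusId=APPROVED" and "statusId=APPROVED AND typeId=SALES_ORDER" become two entirely separate cache entries even though the second result set is a subset of the first.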

On top of these big functional issues (which are deal killers IMO), there is also the performance issue. The point, or intent at least, of the entity cache is to improve performance. As the cache gets more complex the performance will suffer, and because of the whole concept of caching results by queries the performance will be WORSE than the DB performance for the same queries in most cases. Databases are quite fast and efficient, and we'll never be able to reproduce their ability to scale and search in something like an in-memory entity cache, especially not considering the massive redundancy and overhead of caching lists of values by condition.

As an example of this in the real world: on a large OFBiz project I worked on that finished last year we went into production with the entity cache turned OFF, completely DISABLED. Why? When doing load testing, on a whim one of the guys decided to try it without the entity cache enabled, and the body of JMeter tests that exercised a few dozen of the most common user paths through the system actually ran FASTER. The database (MySQL in this case) was hit over the network, but responded quickly enough to make things work quite well for the various find queries, and FAR faster for updates, especially creates. This project was one of the higher volume projects I'm aware of for OFBiz, at peaks handling sustained processing of around 10 orders per second (36,000 per hour), with some short term peaks much higher, closer to 20-30 orders per second... and longer term peaks hitting over 200k orders in one day (North America only, daytime, around a 12-hour window).

I found this to be curious so I looked into it a bit more, and the main performance culprit was updates, ESPECIALLY creates on any entity that has an active list cache. Auto-clearing that cache requires running the condition for each cache entry against the record to see if it matches, and if it does then that entry is cleared. This could be made more efficient by expanding the reverse index concept to index all values of fields in conditions... though that would be fairly complex to implement because of the wide variety of conditions that CAN be performed on fields, and even more so when they are combined with other logic... especially NOTs and ORs. This could potentially increase performance, but would again add yet more complexity and overhead.
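The auto-clearing cost described above can be sketched as follows (hypothetical code, not OFBiz's actual reverse-index implementation): every create forces a scan over all cached conditions for that entity, evaluating each one against the new record.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical sketch of why creates are expensive with an active list
// cache: each create re-evaluates every cached condition against the new
// record, an O(number of cached conditions) scan per create.
class AutoClearingListCache {
    // cached condition -> cached result list
    private final Map<Predicate<Map<String, Object>>, List<Map<String, Object>>> cache =
            new ConcurrentHashMap<>();

    void put(Predicate<Map<String, Object>> condition, List<Map<String, Object>> results) {
        cache.put(condition, results);
    }

    // Called on every create: any cached list whose condition matches the
    // new record may now be stale, so it is dropped.
    int onCreate(Map<String, Object> newRecord) {
        int before = cache.size();
        cache.entrySet().removeIf(e -> e.getKey().test(newRecord));
        return before - cache.size();
    }

    int size() {
        return cache.size();
    }
}
```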

To turn this dilemma into a nightmare, consider caching view-entities. In general as systems scale, if you ever have to iterate over stuff your performance is going to get hit REALLY hard compared to indexed and other sub-linear (less than O(n)) operations.

The main lesson from the story: caching, especially list caching, should ONLY be done in limited cases when the ratio of reads to writes is VERY high, and more particularly the ratio of reads to creates. When considering whether to use a cache this should be weighed carefully, because records are sometimes updated from places that developers are unaware of, sometimes at surprising volumes. For example, it might seem great (and help a lot in dev and lower scale testing) to cache inventory information for viewing on a category screen, but always go to the DB to avoid stale data on a product detail screen and when adding to cart. The problem is that with high order volumes the inventory data is pretty much constantly being updated, so the caches are constantly... SLOWLY... being cleared as InventoryDetail records are created for reservations and issuances.

To turn this nightmare into a deal killer, consider multiple application servers and the need for either a (SLOW) distributed cache or (SLOW) distributed cache clearing. These have to go over the network anyway, so might as well go to the database!

In the case above where we decided to NOT use the entity cache at all the tests were run on one really beefy server showing that disabling the cache was faster. When we ran it in a cluster of just 2 servers with direct DCC (the best case scenario for a distributed cache) we not only saw a big performance hit, but also got various run-time errors from stale data.

I really don't see how anyone could back the concept of caching all finds by default... you don't even have to imagine edge cases, just consider the problems ALREADY being faced with more limited caching and how often the entity cache simply isn't a good solution.

As for improving the entity caching in OFBiz, there are some concepts in Moqui 
that might be useful:

1. add a cache attribute to the entity definition with true, false, and never options; true and false being defaults that can be overridden by code, and never being an absolute (OFBiz does have this option IIRC); this would default to false, true being a useful setting for common things like Enumeration, StatusItem, etc, etc

2. add general support in the entity engine find methods for a "for update" parameter, and if true don't cache (and pass this on to the DB to lock the record(s) being queried), also making the value mutable

3. a write-through per-transaction cache; you can do some really cool stuff with this, avoiding most database hits during a transaction until the end when the changes are dumped to the DB; the Moqui implementation of this concept even looks for cached records that any find condition would require to get results and does the query in-memory, not having to go to the database at all... and for other queries augments the results with values in the cache
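As a rough illustration of point 3, a write-through per-transaction cache might look like the sketch below; the class and method names are illustrative only, not Moqui's actual TransactionCache API. Writes inside one transaction accumulate in memory, reads see the transaction's own writes first, and the database only sees the changes when the transaction commits.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical sketch of a write-through per-transaction cache.
class TransactionWriteCache {
    private final Map<String, Map<String, Object>> pending = new HashMap<>();
    private final Function<String, Map<String, Object>> dbLookup; // fallback read path

    TransactionWriteCache(Function<String, Map<String, Object>> dbLookup) {
        this.dbLookup = dbLookup;
    }

    // Writes stay in memory until commit; no DB hit here.
    void createOrUpdate(String pk, Map<String, Object> value) {
        pending.put(pk, value);
    }

    // Reads see this transaction's own writes first, then fall back to the DB.
    Map<String, Object> findOne(String pk) {
        Map<String, Object> v = pending.get(pk);
        return v != null ? v : dbLookup.apply(pk);
    }

    // At commit time the buffered changes are flushed to the DB in one pass.
    void commit(BiConsumer<String, Map<String, Object>> dbWrite) {
        pending.forEach(dbWrite);
        pending.clear();
    }
}
```

The hard part, as described below, is not this simple single-record case but making arbitrary find conditions see the pending in-memory changes correctly.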

The whole concept of a write-through cache that is limited to the scope of a single transaction shows some of the issues you would run into even if trying to make the entity cache transactional. Especially with more complex finds it just falls apart. The current Moqui implementation handles quite a bit, but there are various things that I've run into testing it with real-world business services that are either a REAL pain to handle (so I haven't yet, but it is conceptually possible) or that I simply can't think of any good way to handle... and for those you simply can't use the write-through cache.

There are some notes in the code for this, and some code/comments to more 
thoroughly communicate this concept, in this class in Moqui:

https://github.com/moqui/moqui/blob/master/framework/src/main/groovy/org/moqui/impl/context/TransactionCache.groovy

I should also say that my motivation to handle every edge case even for this write-through cache is limited... yes there is room for improvement handling more scenarios, but how big will the performance increase ACTUALLY be for them? The efforts on this so far have been based on profiling results and making sure there is a significant difference (which there is for many services in Mantle Business Artifacts, though I haven't even come close to testing all of them this way).

The same concept would apply to a read-only entity cache... some things might be possible to support, but would NOT improve performance, making them a moot point.

I don't know if I've written enough to convince everyone listening that even attempting a universal read-only entity cache is a useless idea... I'm sure some will still like the idea. If anyone gets into it and wants to try it out in their own branch of OFBiz, great... knock yourself out (probably literally...). But PLEASE no one ever commit something like this to the primary branch in the repo... not EVER.

The whole idea that the OFBiz entity cache has had more limited ability to handle different scenarios in the past than it does now is not an argument of any sort supporting the idea of taking the entity cache to the ultimate possible end... which theoretically isn't even that far from where it is now.

To apply a more useful standard the arguments should be for a _useful_ objective, which means increasing performance. I guarantee an always used find cache will NOT increase performance, it will kill it dead and cause infinite concurrency headaches in the process.

-David



