Joining in to provide our backgrounds: I'd be happy to help here too, since I have a fairly solid background in using and developing caching solutions, though mostly in the Java world (expertise in GemFire and Coherence, and developing the GridGain distributed cache).
Renat Akhmerov @ Mirantis Inc.

On 23 Jan 2014, at 18:38, Joshua Harlow <harlo...@yahoo-inc.com> wrote:

> Same here; I've done pretty big memcache (and similar technologies) scale
> caching & invalidations at Y! before, so I'm here to help…
>
> From: Morgan Fainberg <m...@metacloud.com>
> Reply-To: "OpenStack Development Mailing List (not for usage questions)"
> <openstack-dev@lists.openstack.org>
> Date: Thursday, January 23, 2014 at 4:17 PM
> To: "OpenStack Development Mailing List (not for usage questions)"
> <openstack-dev@lists.openstack.org>
> Subject: Re: [openstack-dev] [oslo] memoizer aka cache
>
> Yes! There is a reason Keystone has a very small footprint of
> caching/invalidation done so far: it really needs to be correct when it
> comes to proper invalidation logic. I am happy to offer some help in
> determining the logic for caching/invalidation with dogpile.cache in mind
> as we get it into oslo and make it available for all to use.
>
> --Morgan
>
> On Thu, Jan 23, 2014 at 2:54 PM, Joshua Harlow <harlo...@yahoo-inc.com> wrote:
>> Sure, I'm not cancelling cases of conscious usage, but we need to be
>> careful here and make sure it's really appropriate. Caching and
>> invalidation techniques are right up there in terms of problems that
>> appear easy and simple to do initially, but doing them correctly is
>> really, really hard (especially at any kind of scale).
>>
>> -Josh
>>
>> On 1/23/14, 1:35 PM, "Renat Akhmerov" <rakhme...@mirantis.com> wrote:
>>>
>>> On 23 Jan 2014, at 08:41, Joshua Harlow <harlo...@yahoo-inc.com> wrote:
>>>
>>>> So to me memoizing is typically a premature optimization in a lot of
>>>> cases.
>>>> And doing it incorrectly leads to overfilling the Python process's
>>>> memory (your global dict will have objects in it that can't be
>>>> garbage collected, and with enough keys+values being stored it will
>>>> act just like a memory leak; basically it acts as a new GC root
>>>> object, in a way), or to more cache invalidation races and
>>>> inconsistencies than just recomputing the initial value…
>>>
>>> I agree with your concerns here. At the same time, I think this
>>> thinking shouldn't cancel cases of conscious usage of caching
>>> techniques. A decent cache implementation would help to solve lots of
>>> performance problems (which eventually become a concern for any
>>> project).
>>>
>>>> Overall though, there are a few caching libraries I've seen being
>>>> used, any of which could be used for memoization:
>>>>
>>>> - https://github.com/openstack/oslo-incubator/tree/master/openstack/common/cache
>>>> - https://github.com/openstack/oslo-incubator/blob/master/openstack/common/memorycache.py
>>>
>>> I looked at the code. I have lots of questions about the
>>> implementation (like cache eviction policies, and whether or not it
>>> works well with green threads), but I think that's a subject for a
>>> separate discussion. Could you please share your experience of using
>>> it? Were there specific problems that you could point to? Maybe they
>>> are already described somewhere?
>>>
>>>> - dogpile.cache @ https://pypi.python.org/pypi/dogpile.cache
>>>
>>> This one looks really interesting in terms of its claimed feature
>>> set. It seems to be compatible with Python 2.7; I'm not sure about
>>> 2.6. As above, it would be great if you could tell us about your
>>> experience with it.
>>>
>>>> I am personally wary of using them for memoization; for what
>>>> expensive method calls do u see the complexity of this being useful?
>>>> I didn't think that many method calls being done in OpenStack
>>>> warranted the complexity added by doing this (premature optimization
>>>> is the root of all evil...). Do u have data showing where it would
>>>> be applicable/beneficial?
>>>
>>> I believe there's a great number of use cases, like caching db
>>> objects or, more generally, caching any heavy objects involving
>>> interprocess communication. For instance, API clients may cache
>>> objects that are known to be immutable on the server side.
>>>
>>>> Sent from my really tiny device...
>>>>
>>>>> On Jan 23, 2014, at 8:19 AM, "Shawn Hartsock" <harts...@acm.org> wrote:
>>>>>
>>>>> I would like to have us adopt a memoizing caching library of some
>>>>> kind for use with OpenStack projects. I have no strong preference
>>>>> at this time and I would like suggestions on what to use.
>>>>>
>>>>> I have seen a number of patches where people have begun to
>>>>> implement their own caches in dictionaries. This typically confuses
>>>>> the code and mixes issues of correctness and performance.
>>>>>
>>>>> Here's an example. We start with:
>>>>>
>>>>>     def my_thing_method(some_args):
>>>>>         # do expensive work
>>>>>         return value
>>>>>
>>>>> ... but a performance problem is detected... maybe the method is
>>>>> called 15 times in 10 seconds but then not again for 5 minutes, and
>>>>> the return value can only logically change every minute or two...
>>>>> so we end up with:
>>>>>
>>>>>     _GLOBAL_THING_CACHE = {}
>>>>>
>>>>>     def my_thing_method(some_args):
>>>>>         key = key_from(some_args)
>>>>>         if key in _GLOBAL_THING_CACHE:
>>>>>             return _GLOBAL_THING_CACHE[key]
>>>>>         else:
>>>>>             # do expensive work
>>>>>             _GLOBAL_THING_CACHE[key] = value
>>>>>             return value
>>>>>
>>>>> ... which is all well and good...
>>>>> but now, as a maintenance programmer,
>>>>> I need to comprehend the cache mechanism, know when cached values
>>>>> are invalidated, and, if I need to debug the "do expensive work"
>>>>> part, I need to tease out some test that prevents the cache from
>>>>> being hit. Plus I've introduced a new global variable. We love
>>>>> globals, right?
>>>>>
>>>>> I would like us to be able to say:
>>>>>
>>>>>     @memoize(seconds=10)
>>>>>     def my_thing_method(some_args):
>>>>>         # do expensive work
>>>>>         return value
>>>>>
>>>>> ... where we're clearly addressing the performance issue by
>>>>> introducing a cache and limiting its possible impact to 10 seconds,
>>>>> which allows for the idea that "do expensive work" makes network
>>>>> calls to systems that may change state outside of this Python
>>>>> process.
>>>>>
>>>>> I'd like to see this done because I would like to have a place to
>>>>> point developers to during reviews... to say: use "common/memoizer"
>>>>> or use "Bob's awesome memoizer", because Bob has worked out all the
>>>>> cache problems already and you can just use it instead of worrying
>>>>> about introducing new bugs by building your own cache.
>>>>>
>>>>> Does this make sense? I'd love to contribute something... but I
>>>>> wanted to understand why this state of affairs has persisted for a
>>>>> number of years... is there something I'm missing?
>>>>>
>>>>> --
>>>>> # Shawn.Hartsock - twitter: @hartsock - plus.google.com/+ShawnHartsock
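[Editor's note] The `@memoize(seconds=10)` decorator Shawn proposes above could be sketched roughly as follows. This is only a minimal illustration, not an existing oslo or dogpile.cache API: the `memoize` name and the tuple-of-positional-args keying are assumptions, and it deliberately ignores the hard parts the thread warns about (bounded size, invalidation, and thread/greenthread safety).

```python
import functools
import time


def memoize(seconds):
    """Cache a function's return value and reuse it for `seconds` seconds."""
    def decorator(func):
        cache = {}  # key -> (expires_at, value); note: unbounded, see caveats

        @functools.wraps(func)
        def wrapper(*args):
            key = args  # hashable positional args only, for simplicity
            now = time.time()
            hit = cache.get(key)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh: skip the expensive work
            value = func(*args)
            cache[key] = (now + seconds, value)
            return value
        return wrapper
    return decorator


calls = []  # records how often the "expensive work" actually runs

@memoize(seconds=10)
def my_thing_method(x):
    calls.append(x)  # stands in for the expensive work
    return x * 2

my_thing_method(3)
my_thing_method(3)  # second call is served from the cache
```

Even this tiny sketch still has Joshua's problem: the per-function dict grows without bound, so a production version would also need eviction and concurrency guards, which is roughly the ground dogpile.cache already covers.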
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
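[Editor's note] Joshua's point that an unbounded module-level dict "acts just like a memory leak" is usually addressed by bounding the cache and evicting old entries. A minimal LRU sketch follows; the `BoundedCache` name and API are hypothetical and not taken from any of the libraries discussed above.

```python
import collections


class BoundedCache(object):
    """A small LRU cache: once full, the least-recently-used entry is
    evicted, so the cache cannot grow without limit like a bare
    module-level dict acting as a permanent GC root."""

    def __init__(self, max_entries=128):
        self.max_entries = max_entries
        self._data = collections.OrderedDict()

    def get(self, key, default=None):
        try:
            value = self._data.pop(key)  # remove, then re-insert below
        except KeyError:
            return default
        self._data[key] = value  # re-insert to mark as most recently used
        return value

    def put(self, key, value):
        self._data.pop(key, None)  # refresh position if key already present
        self._data[key] = value
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # drop least-recently-used entry


cache = BoundedCache(max_entries=2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')     # touch 'a', so 'b' becomes least recently used
cache.put('c', 3)  # over capacity: evicts 'b'
```

This bounds memory but does nothing for the other hard problem in the thread, invalidation: entries can still be stale until they are evicted or overwritten.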