Yes! There is a reason Keystone has only a very small footprint of caching/invalidation so far: it really needs to be correct when it comes to proper invalidation logic. I am happy to offer some help in determining the logic for caching/invalidation with dogpile.cache in mind as we get it into oslo and available for all to use.
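For anyone who hasn't looked at dogpile.cache yet, the basic shape is roughly the sketch below. This is illustrative only: the in-memory backend, the 60-second expiration and the get_project() function are placeholders, not how Keystone actually wires it up, and none of the hard invalidation decisions are shown.

    from dogpile.cache import make_region

    # Memory backend purely for illustration; a real deployment would
    # usually point the region at memcached, Redis, or similar.
    region = make_region().configure(
        'dogpile.cache.memory',
        expiration_time=60,  # seconds before a cached value goes stale
    )


    @region.cache_on_arguments()
    def get_project(project_id):
        # ... expensive lookup goes here ...
        return {'id': project_id}


    # The hard part: every write path that changes the underlying data
    # has to remember to drop the cached entry for those arguments.
    get_project.invalidate('some-project-id')

The region takes care of the "dogpile" locking so only one caller regenerates an expired value at a time; deciding when to invalidate is still entirely on the application, and that is the part that has to be correct.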
--Morgan

On Thu, Jan 23, 2014 at 2:54 PM, Joshua Harlow <[email protected]> wrote:

> Sure, not cancelling cases of conscious usage, but we need to be careful
> here and make sure it's really appropriate. Caching and invalidation
> techniques are right up there in terms of problems that appear easy and
> simple to initially do/use, but doing it correctly is really, really hard
> (especially at any type of scale).
>
> -Josh
>
> On 1/23/14, 1:35 PM, "Renat Akhmerov" <[email protected]> wrote:
>
> > On 23 Jan 2014, at 08:41, Joshua Harlow <[email protected]> wrote:
> >
> >> So to me memoizing is typically a premature optimization in a lot of
> >> cases. And doing it incorrectly leads to overfilling the Python
> >> process's memory (your global dict will have objects in it that can't
> >> be garbage collected, and with enough keys+values being stored it will
> >> act just like a memory leak; basically it acts as a new GC root object
> >> in a way) or more cache invalidation races/inconsistencies than just
> >> recomputing the initial value...
> >
> > I agree with your concerns here. At the same time, I think this thinking
> > shouldn't cancel cases of conscious usage of caching techniques. A decent
> > cache implementation would help to solve lots of performance problems
> > (which eventually become a concern for any project).
> >
> >> Overall though there are a few caching libraries I've seen being used,
> >> any of which could be used for memoization.
> >>
> >> - https://github.com/openstack/oslo-incubator/tree/master/openstack/common/cache
> >> - https://github.com/openstack/oslo-incubator/blob/master/openstack/common/memorycache.py
> >
> > I looked at the code. I have lots of questions about the implementation
> > (like cache eviction policies, and whether or not it works well with
> > green threads, but I think that's a subject for a separate discussion).
> > Could you please share your experience of using it? Were there specific
> > problems that you could point to? Maybe they are already described
> > somewhere?
> >
> >> - dogpile.cache @ https://pypi.python.org/pypi/dogpile.cache
> >
> > This one looks really interesting in terms of the claimed feature set. It
> > seems to be compatible with Python 2.7, not sure about 2.6. As above, it
> > would be cool if you told us about your experience with it.
> >
> >> I am personally wary of using them for memoization; what expensive
> >> method calls do you see where this complexity would be useful? I didn't
> >> think that many method calls being done in OpenStack warranted the
> >> complexity added by doing this (premature optimization is the root of
> >> all evil...). Do you have data showing where it would be
> >> applicable/beneficial?
> >
> > I believe there's a great deal of use cases like caching db objects or,
> > more generally, caching any heavy objects involving interprocess
> > communication. For instance, API clients may be caching objects that are
> > known to be immutable on the server side.
> >
> >> Sent from my really tiny device...
> >>
> >>> On Jan 23, 2014, at 8:19 AM, "Shawn Hartsock" <[email protected]> wrote:
> >>>
> >>> I would like to have us adopt a memoizing caching library of some kind
> >>> for use with OpenStack projects. I have no strong preference at this
> >>> time and I would like suggestions on what to use.
> >>>
> >>> I have seen a number of patches where people have begun to implement
> >>> their own caches in dictionaries. This typically confuses the code and
> >>> mixes issues of correctness and performance in code.
> >>>
> >>> Here's an example:
> >>>
> >>> We start with:
> >>>
> >>>     def my_thing_method(some_args):
> >>>         # do expensive work
> >>>         return value
> >>>
> >>> ... but a performance problem is detected... maybe the method is
> >>> called 15 times in 10 seconds but then not again for 5 minutes, and the
> >>> return value can only logically change every minute or two... so we
> >>> end up with ...
> >>>
> >>>     _GLOBAL_THING_CACHE = {}
> >>>
> >>>     def my_thing_method(some_args):
> >>>         key = key_from(some_args)
> >>>         if key in _GLOBAL_THING_CACHE:
> >>>             return _GLOBAL_THING_CACHE[key]
> >>>         else:
> >>>             # do expensive work
> >>>             _GLOBAL_THING_CACHE[key] = value
> >>>             return value
> >>>
> >>> ... which is all well and good... but now as a maintenance programmer
> >>> I need to comprehend the cache mechanism, when cached values are
> >>> invalidated, and if I need to debug the "do expensive work" part I
> >>> need to tease out some test that prevents the cache from being hit.
> >>> Plus I've introduced a new global variable. We love globals, right?
> >>>
> >>> I would like us to be able to say:
> >>>
> >>>     @memoize(seconds=10)
> >>>     def my_thing_method(some_args):
> >>>         # do expensive work
> >>>         return value
> >>>
> >>> ... where we're clearly addressing the performance issue by
> >>> introducing a cache and limiting its possible impact to 10 seconds,
> >>> which allows for the idea that "do expensive work" makes network calls
> >>> to systems that may change state outside of this Python process.
> >>>
> >>> I'd like to see this done because I would like to have a place to
> >>> point developers to during reviews... to say: use "common/memoizer" or
> >>> use "Bob's awesome memoizer", because Bob has worked out all the cache
> >>> problems already and you can just use it instead of worrying about
> >>> introducing new bugs by building your own cache.
> >>>
> >>> Does this make sense? I'd love to contribute something... but I wanted
> >>> to understand why this state of affairs has persisted for a number of
> >>> years... is there something I'm missing?
> >>>
> >>> --
> >>> # Shawn.Hartsock - twitter: @hartsock - plus.google.com/+ShawnHartsock
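P.S. To make the @memoize(seconds=N) idea above concrete, a minimal TTL-style memoizer could look roughly like the sketch below. The names and behavior are illustrative only (no size bound, positional hashable arguments only), so the warning about unbounded global caches still applies to anything of this shape:

    import functools
    import time


    def memoize(seconds):
        """Cache return values keyed on positional args, reused for `seconds`."""
        def decorator(func):
            cache = {}  # args tuple -> (timestamp, value); grows without bound

            @functools.wraps(func)
            def wrapper(*args):
                now = time.time()
                entry = cache.get(args)
                if entry is not None and now - entry[0] < seconds:
                    return entry[1]  # still fresh, reuse the cached value
                value = func(*args)
                cache[args] = (now, value)
                return value

            return wrapper
        return decorator


    @memoize(seconds=10)
    def my_thing_method(some_args):
        # do expensive work
        return some_args

Anything we actually adopt needs the parts this sketch skips: an eviction/size policy, a story for invalidation, and behavior that is sane under green threads.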
_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
