Is the implementation of lru_cache too opaque to poke into without an
existing method? Or to write a quick monkey-patch?

Sorry for not checking myself, but the ability to do that kind of thing is
one of the great things about a dynamic open source language.
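Something along these lines, maybe. Completely untested sketch, and to be
clear, neither debug_lru_cache nor cache_export is a real functools API;
they're made-up names illustrating the proposal:

```python
import functools

def debug_lru_cache(maxsize=128):
    """Hypothetical wrapper, not a real functools API."""
    def decorator(func):
        cached = functools.lru_cache(maxsize=maxsize)(func)
        # Shadow record of key -> result. It never evicts, so it can
        # drift from the real LRU cache once entries start falling out.
        seen = {}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Our own key format; functools' real key is an
            # implementation detail we can't rely on.
            key = (args, tuple(sorted(kwargs.items())))
            result = cached(*args, **kwargs)
            seen[key] = result
            return result

        # Shallow copy, so mutating the export can't touch the record.
        wrapper.cache_export = lambda: dict(seen)
        wrapper.cache_info = cached.cache_info
        wrapper.cache_clear = cached.cache_clear
        return wrapper
    return decorator

@debug_lru_cache()
def square(x):
    return x * x

square(3)
square(4)
print(square.cache_export())  # {((3,), ()): 9, ((4,), ()): 16}
```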

-CHB

On Tue, Jan 12, 2021 at 9:04 AM Steven D'Aprano <st...@pearwood.info> wrote:

> On Tue, Jan 12, 2021 at 04:32:14PM +0200, Serhiy Storchaka wrote:
> > 12.01.21 12:02, Steven D'Aprano wrote:
>
> > > I propose a method:
> > >
> > >     @functools.lru_cache()
> > >     def function(arg):
> > >         ...
> > >
> > >     function.cache_export()
> > >
> > > that returns a dictionary {arg: value} representing the cache. It
> > > wouldn't be the cache itself, just a shallow copy of the cache data.
> >
> > What if the function supports multiple arguments (including passed by
> > keyword)? Note that internal representation of the key is an
> > implementation detail, so you need to invent and specify some new
> > representation. For example return a list of tuples (args, kwargs,
> result).
>
> Sure. That's a detail that can be worked out once we agree that this is
> a useful feature.
>
>
> > Depending on the implementation, getting the list of all arguments can
> > have larger than linear complexity.
>
> I don't mind. Efficiency is not a priority for this. This is an
> introspection feature for development and debugging, not a feature for
> production. I don't expect it to be called in tight loops. I expect to
> use it from the REPL while I am debugging my code.
>
> I might have to rethink if it was exponentially slow, but O(n log n)
> like sorting would be fine; I'd even consider O(n**2) acceptable, with a
> documentation note that exporting large caches may be slow.
>
>
> > Other cache implementations can contain additional information: the
> > number of hits for every value, times. Are you interested in getting
> > that information too, or in ignoring it?
>
> No.
>
>
> > Currently the cache is thread-safe in CPython, but getting all arguments
> > and values may not be (or we will need to add a synchronization overhead
> > for every call of the cached function).
>
> Can you explain further why the cached function needs additional
> synchronisation overhead?
>
> I am quite happy for exporting to be thread-unsafe, so long as it
> doesn't crash. Don't export the cache if it is being updated from
> another thread, or you might get inaccurate results.
>
> To be clear:
>
> - If you export the cache from one thread while another thread is
>   reading the cache, I expect that would be safe.
>
> - If you export the cache from one thread while another thread is
>   *modifying* the cache, I expect that the only promise we make is
>   that there shouldn't be a segfault.
>
>
>
> > And finally, what is your use case? Is it important enough to justify
> > the cost?
>
> I think so or I wouldn't have asked :-)
>
> There shouldn't be much (or any?) runtime cost on the cache except for
> the presence of an additional method. The exported data is just a
> snapshot, it doesn't have to be a view of the cache. Changing the
> exported snapshot will not change the cache.
>
> My use-case is debugging functions that are using an LRU cache,
> specifically complex recursive functions. I have some functions where:
>
>     f(N)
>
> ends up calling itself many times, but not in any obvious pattern:
>
>     f(N-1), f(N-2), f(N-5), f(N-7), f(N-12), f(N-15), f(N-22), ...
>
> for example. So each call to f() could make dozens of recursive calls,
> if N is big enough, and there are gaps in the calls.
>
> I was having trouble with the function, and couldn't tell if the right
> arguments were going into the cache. What I wanted to do was peek at
> the cache and see which keys were ending up in the cache and compare
> that to what I expected.
>
> I did end up getting the function working, but I think it would have been
> much easier if I could have seen what was inside the cache and how the
> cache was changing from one call to the next.
>
> So this is why I don't care about performance (within reason). My use
> case is interactive debugging.
>
>
> --
> Steve
> _______________________________________________
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/EV2W2DMXSONPHUYXGQD5HK3BIUTFIEVU/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
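For context, the public introspection available today really does stop at
counts: cache_info() reports hits and misses, but nothing about which keys
are cached. A quick check with a recursive cached function:

```python
import functools

@functools.lru_cache(maxsize=None)
def f(n):
    """Memoised Fibonacci, a stand-in for Steven's recursive functions."""
    if n < 2:
        return n
    return f(n - 1) + f(n - 2)

f(10)
# cache_info() gives only aggregate counts:
print(f.cache_info())
# CacheInfo(hits=8, misses=11, maxsize=None, currsize=11)
# There is no public way from here to list the cached arguments 0..10,
# which is the gap the proposed cache_export() would fill.
```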
-- 
Christopher Barker, PhD (Chris)

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython