Raymond Hettinger <raymond.hettin...@gmail.com> added the comment:
Thanks, I see what you're trying to do now:

1) Given a slow function
2) that takes a complex argument
   2a) that includes a hashable unique identifier
   2b) and some unhashable data
3) Cache the function result using only the unique identifier

The lru_cache() currently can't be used directly because all the function arguments must be hashable.

The proposed solution:

1) Write a helper function
   1a) that has the same signature as the original function
   1b) that returns only the hashable unique identifier
2) With a single @decorator application, connect
   2a) the original function
   2b) the helper function
   2c) and the lru_cache logic

A few areas of concern come to mind:

* People have come to expect cached calls to be very cheap, but it is easy to write input transformations that aren't cheap (e.g. looping over all the inputs as in your example, or converting entire mutable structures to immutable ones).

* While key functions are relatively well understood, elsewhere a key function gets called only once per element.  Here, the lru_cache() would call the key function on every call, even when the arguments are identical.  This will surprise some users.

* The helper function's signature needs to exactly match the wrapped function's.  Changes would need to be made in both places.

* It would be hard to debug if the helper function's return values ever stop being unique.  For example, if the timestamps start getting rounded to the nearest second, they will sporadically become non-unique.

* The lru_cache() signature makes it awkward to add more arguments.  That is why your examples had to explicitly specify a maxsize of 128 even though 128 is the default.

* API simplicity was an early design goal.  I already made a mistake by accepting the "typed" argument, which is almost never used but regularly causes confusion and affects learnability.

* The use case is predicated on having a large unhashable dataset accompanied by a hashable identifier that is assumed to be unique.
This probably isn't common enough to warrant an API extension.  Out of curiosity, what are you doing now without the proposed extension?

As a first try, I would likely write a dataclass to be explicit about the types and about which fields are used in hashing and equality testing:

    @dataclass(unsafe_hash=True)
    class ItemsList:
        unique_id: float
        data: dict = field(hash=False, compare=False)

I expect that dataclasses like this will emerge as the standard solution whenever people need a mapping or dict to work with keys that have a mix of hashable and unhashable components.  This will work with the lru_cache(), dict(), defaultdict(), ChainMap(), set(), frozenset(), etc.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41220>
_______________________________________
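A minimal sketch of how such a dataclass composes with lru_cache() directly (the `total` function and the field values are made up for illustration):

```python
from dataclasses import dataclass, field
from functools import lru_cache

@dataclass(unsafe_hash=True)
class ItemsList:
    unique_id: float
    data: dict = field(hash=False, compare=False)   # excluded from hash and ==

calls = 0

@lru_cache(maxsize=128)
def total(items):
    """Pretend-slow function; caching sees only unique_id via hash/eq."""
    global calls
    calls += 1
    return sum(items.data.values())

a = ItemsList(1.25, {"x": 1, "y": 2})
b = ItemsList(1.25, {"x": 5, "y": 6})   # same id, different unhashable data
first = total(a)    # miss: computes 1 + 2 = 3
second = total(b)   # hit: b == a and hash(b) == hash(a), so a's result is reused
```

The second call returning the result computed from `a` also demonstrates the assumption this approach rests on: the identifier really must be unique, or stale results come back silently.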