Raymond Hettinger <[email protected]> added the comment:
I've left this open for a week to collect comments. I concur with Felipe that
this should be the responsibility of the caller. And I concur with Jim that
the use cases are likely too rare to warrant shoehorning in the extra
functionality.
For my part, I'm concerned that this would be bug bait. The result of the
function would need to be uniquely associated with a particular output,
independent of the contents of the mutable arguments. I don't think it is good
design for function calls to be disregarding the part of the input that was
originally used to compute the output — in a way that notion conflict with the
core idea of the of the lru_cache giving the same outputs for the same inputs.
Likewise, debugging and exception handing would be complicated by having two
underlying user functions. Exceptions could arise from three sources: the
make_key() function, the cached function, and the lru cache internals.
Further, if the make_key() function turns out not to be unique, the ensuring
bugs would be hard to find. In the OP's later example, if we get two distinct
inputs which happen to have the same timestamp, the error would pass without
warning and would be hard to reproduce. Similarly, the make_key() function has
to assume that the mutable portion of the data hasn't mutated, but a key
feature of mutable data is that it can mutate. That too would result in a hard
to find bug.
Also, while the OP has a specific use case in mind, we can't know in advance
how others will use make_key(). It could be misused to convert mutable data to
immutable data, possibly resulting in a cached function that is slower than the
original (caching mean() for example). Or the make_key() function could had
side-effects. The OP's original "my_list[0]" example showed yet another way
that bugs could arise. It is problematic that that particular bug might not
have been caught by a test suite that didn't know to call the function twice
with a common first argument but with differing subsequent arguments.
Another possible bug trap is that it would make it easier to accidentally cache
mutable function results without getting a warning. For example:
@lru_cache(128, make_key = lambda target, size, uniq_id, data: uniq_id)
def search(target, size, uniq_id, data):
i = data.index(target)
return data[i : i+size] # <== bug because this is mutable if data
is mutable
I think the suggestion was a neat idea (and thought provoking), but I think we
should decline because 1) the use case is uncommon 2) because it makes the API
harder to understand 3) because it is bug bait and 4) because the
responsibility should probably be with the caller.
Thank you for the suggestion. It was inspired. Don't be discouraged — many of
my proposals don't get accepted as well. Please continue to submit suggestions.
P.S. make_key() shouldn't be called a key-function so that we don't make that
term ambiguous. Elsewhere key-functions are all about comparison logic rather
than hashing. They can be created by functools.cmp_to_key(), they tend to be
used only once per data element, and the same function tends to be inoperable
with all the tools that accept key-functions. So the term make_key() was
likely the better terminology. Perhaps get_unique_id() would have been even
less ambiguous.
----------
resolution: -> rejected
stage: patch review -> resolved
status: open -> closed
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue41220>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com