In my view this method is like "add" or "subtract". Even if you accurately document the wrong behavior, you are still wrong and you should still fix it.
Also the converse of Hyrum's law applies, always. (Hyrum's law: eventually some user depends on every behavior, even if it is private, so every change is breaking. Converse: eventually every change is breaking so every bugfix is a judgment call about the pro/con, even if it is public) A new method that has better behavior is an OK workaround, and deprecate the old. But it is unnecessary IMO. We focus too much on "breaking" changes that are easy to understand versus those that are important (dependency issues, performance problems, state changes). Kenn On Mon, Feb 28, 2022 at 2:34 AM Jan Lukavský <[email protected]> wrote: > Agree that this might be an issue, though somewhat limited, because I > (strong emphasize) *suppose* that users skipping multiple releases do care > to test the upgrade appropriately. We can wait for more releases between > deprecating and removing the API. But I don't think adding a different > version of the API is any better, we should have a way to fix design bugs > without introducing additional APIs. > > Jan > On 2/26/22 01:34, Luke Cwik wrote: > > I have seen enough customers skip 10+ minor versions when updating to pick > up a bug fix. > > On Fri, Feb 25, 2022 at 3:00 PM Brian Hulette <[email protected]> wrote: > >> Is there a number of versions to wait that would make this acceptable? >> >> I'm curious if there's general guidance here. >> >> On Fri, Feb 25, 2022 at 1:30 PM Luke Cwik <[email protected]> wrote: >> >>> There are enough people that skip many versions so they would get the >>> changed semantics without realizing. >>> >>> On Mon, Feb 21, 2022 at 12:49 AM Jan Lukavský <[email protected]> wrote: >>> >>>> I think we cannot change (documented) semantics without any notice. >>>> That could cause issues for upgraded pipelines (without proper testing, but >>>> can we rely on that?). I'd therefore suggest to: >>>> >>>> a) deprecate computeIfAbsent() >>>> >>>> b) create compute() as a new method and temporary replacement for >>>> computeIfAbsent() >>>> >>>> c) after several (3, 4?) releases remove computeIfAbsent() >>>> >>>> d) after next several releases return it back with changed semantics >>>> >>>> This way the unexpected change would affect only users skipping many >>>> releases. >>>> >>>> Jan >>>> On 2/17/22 19:08, Kenneth Knowles wrote: >>>> >>>> I think this is a design & documentation bug. The behavior does not >>>> make sense. Deviating from Map#computeIfAbsent cannot really be fixed by >>>> documentation in bold, even though that is definitely better than nothing. >>>> Fixing this should not be considered a breaking change. It should be >>>> identical to Map#computeIfAbsent but deferred, hence a ReadableState. >>>> >>>> Kenn >>>> >>>> On Wed, Feb 16, 2022 at 2:12 PM Luke Cwik <[email protected]> wrote: >>>> >>>>> The least we could do is add documentation to the current >>>>> MapState.computeIfAbsent API in bold that this differs from Java's >>>>> Map#computeIfAbsent since it returns the prior value and not the new >>>>> value. >>>>> >>>>> You can add a computeIfAbsent2(...) and deprecate the old one >>>>> specifying the difference in the API. >>>>> >>>>> Adding compute() would also be nice but would add complexity for users >>>>> who only want the computeIfAbsent2() implementation. >>>>> >>>>> >>>>> >>>>> On Wed, Feb 16, 2022 at 1:01 AM Jan Lukavský <[email protected]> wrote: >>>>> >>>>>> That behavior is documented, so if that is a bug, I suppose it is a >>>>>> design bug [1]. I'm not sure how to fix that, because we cannot simply >>>>>> change the semantics without changing API, that could result in user-code >>>>>> bugs. If I understand it correctly, the only reason for not returning the >>>>>> actual value, but a ReadableState is to enable calling readLater() to >>>>>> batch >>>>>> multiple gets to single call. That makes sense and is sort of consistent >>>>>> with the other usages of state in general. But makes it impossible to >>>>>> return the actual value V instead of ReadableState<V>. >>>>>> >>>>>> Maybe we could deprecate 'computeIfAbsent()' and introduce >>>>>> 'compute()' with the JDK-compliant semantics? It is not the same, though, >>>>>> but should be more practical than the current version of >>>>>> 'computeIfAbsent', >>>>>> which enforces various hacks to actually retrieve the computed (that is >>>>>> the >>>>>> current!) value. >>>>>> >>>>>> WDYT? >>>>>> >>>>>> Jan >>>>>> >>>>>> [1] >>>>>> https://beam.apache.org/releases/javadoc/2.36.0/org/apache/beam/sdk/state/MapState.html#computeIfAbsent-K-java.util.function.Function >>>>>> - >>>>>> On 2/16/22 02:10, Reuven Lax wrote: >>>>>> >>>>>> Sounds like a bug, probably related to the bug mentioned in that >>>>>> thread. >>>>>> >>>>>> On Fri, Feb 11, 2022 at 8:58 AM <[email protected]> wrote: >>>>>> >>>>>>> Yes, exactly, it returns null. The Java contract is to return the >>>>>>> existing value or the new (computed value) instead. >>>>>>> >>>>>>> Dne 11. 2. 2022 16:24 napsal uživatel Reuven Lax <[email protected]>: >>>>>>> >>>>>>> I'm not sure I understand what you mean. Do you mean it always >>>>>>> returns null? Since computeIfAbsent doesn't modify the value unless it >>>>>>> is >>>>>>> not present, I'm not sure what the "old value" means, unless you mean >>>>>>> that >>>>>>> it returns null. >>>>>>> >>>>>>> On Fri, Feb 11, 2022 at 4:54 AM Jan Lukavský <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I just found another weirdness in the MapState API - method >>>>>>> computeIfAbsent returns the _old_ value associated with the key. >>>>>>> That is >>>>>>> inconsistent with Java API and frankly with what users should expect >>>>>>> when using this method. The semantics should be "retrieve current >>>>>>> value, >>>>>>> if not present associate the key with this one and return me the >>>>>>> current >>>>>>> value". Is this behavior intentional? What is the motivation? In >>>>>>> combination with the previous thread [1] that states, that the value >>>>>>> is >>>>>>> not updated until read() of the state returned from computeIfAbsent >>>>>>> is >>>>>>> called this seems to force users to either fetch the state twice or >>>>>>> do >>>>>>> various workarounds. >>>>>>> >>>>>>> Jan >>>>>>> >>>>>>> [1] https://lists.apache.org/thread/mp979twlgm76gjg014f5r0k2p0nlcj2q >>>>>>> >>>>>>> >>>>>>>
