On 04/25/2014 06:13 PM, Peter Levart wrote:
Hi Paul,

On 04/25/2014 02:12 PM, Paul Sandoz wrote:
On Apr 23, 2014, at 11:06 AM, Peter Levart<[email protected]>  wrote:

Hi,

I propose a patch for:

    https://bugs.openjdk.java.net/browse/JDK-8040892

+1, nice use of putIfAbsent. Just double checking, did you run the stream unit 
tests?

Yes, all 60 /java/util/stream jtreg tests pass.

I think the key step was to separate the map merge and accumulate functions for
the toMap* methods that do not accept a value merge function. Now that this is
the case, one could still use merge, but that would be less efficient than
putIfAbsent, since putIfAbsent does not require a lambda capturing the key.


That fix caused me to think that we should also update the toMap* methods that
accept a value merge function, for the case when the merge function returns
null, as removing an entry would be most undesirable.

Yes, removing doesn't make sense. The outcome depends on how many times a key occurs. For example, if the merge function always returns null, then duplicate keys result in the entry being absent, triplicate keys result in the entry (the last one merged) being present, quadruple keys result in the entry being absent again, etc ...
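
The alternating behaviour described above can be reproduced with a plain HashMap. This is only a minimal sketch: the class name NullMergeDemo and the helper presentAfter are made up for illustration and are not part of the patch.

```java
import java.util.HashMap;
import java.util.Map;

class NullMergeDemo {
    // Hypothetical helper: merges `occurrences` values under the same key using a
    // merge function that always returns null, then reports whether the key is present.
    static boolean presentAfter(int occurrences) {
        Map<String, Integer> map = new HashMap<>();
        for (int i = 0; i < occurrences; i++) {
            // Map.merge inserts the value if the key is absent; otherwise it applies
            // the function and removes the entry when the function returns null.
            map.merge("k", i, (oldValue, newValue) -> null);
        }
        return map.containsKey("k");
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 4; n++) {
            System.out.println(n + " occurrence(s): present=" + presentAfter(n));
        }
    }
}
```

With an odd number of occurrences the key ends up present, and with an even number it ends up absent, which is why a null return cannot be given a consistent meaning here.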

Hi Paul,

To make any sense of a null return from the mergeFunction, which could mean that an entry with that key is absent from the resulting map, such a Collector would have to keep some more state: the collecting map (which is also returned at the end) and the set of removed keys. For example:


    public static <T, K, U, M extends Map<K, U>>
    Collector<T, ?, M> toMap(Function<? super T, ? extends K> keyMapper,
                             Function<? super T, ? extends U> valueMapper,
                             BinaryOperator<U> mergeFunction,
                             Supplier<M> mapSupplier) {

        class State {
            final M map = mapSupplier.get();
            final Set<K> removedKeys = new HashSet<>();
            M map() { return map; }
        }

        BiConsumer<State, T> accumulator = (state, element) -> {
            K k = keyMapper.apply(element);
            if (!state.removedKeys.contains(k)) {
                U u = state.map.merge(k, valueMapper.apply(element), mergeFunction);
                if (u == null) state.removedKeys.add(k);
            }
        };

        BinaryOperator<State> merger = (state1, state2) -> {
            state1.map.keySet().removeAll(state2.removedKeys);
            state2.map.keySet().removeAll(state1.removedKeys);
            state1.removedKeys.addAll(state2.removedKeys);
            for (Map.Entry<K,U> e : state2.map.entrySet()) {
                U u = state1.map.merge(e.getKey(), e.getValue(), mergeFunction);
                if (u == null) state1.removedKeys.add(e.getKey());
            }
            return state1;
        };

        return new CollectorImpl<>(State::new, accumulator, merger, State::map, CH_ID);
    }


...but the concurrent variant would be tricky to implement.

Regards, Peter



A simple fix would be to throw an NPE if the merge function returns null. That
works if it is acceptable to induce the side-effect of a removal, which seems
reasonable given that no partial result would be returned anyway when a null
check is performed on the result of the merge function. Otherwise the merge
function requires wrapping, e.g. mergeFunction.andThen(Objects::requireNonNull).

I think that allowing a null return from mergeFunction to remove the entry from the (maybe intermediate) map is just confusing. Throwing an NPE is the most sensible thing we can do here.
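
As a rough illustration of the wrapping variant mentioned above, the merge function can be composed with Objects::requireNonNull before it is handed to Map.merge. The class NonNullMergeDemo and its helpers are hypothetical names used only for this sketch. It assumes a HashMap, where the exception escapes from the remapping function before the map is modified.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

class NonNullMergeDemo {
    // Hypothetical helper: wraps a merge function so that a null result
    // raises NullPointerException instead of being returned to Map.merge.
    static <U> BiFunction<U, U, U> nonNull(BinaryOperator<U> mergeFunction) {
        return mergeFunction.andThen(Objects::requireNonNull);
    }

    // Returns true if merging a second value under an existing key throws an NPE.
    static boolean mergeThrows(Map<String, Integer> map, BinaryOperator<Integer> mergeFunction) {
        try {
            map.merge("k", 2, nonNull(mergeFunction));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("k", 1);
        System.out.println("threw NPE: " + mergeThrows(map, (a, b) -> null));
        System.out.println("map after failed merge: " + map);
    }
}
```

With the wrapper in place, the null check fires inside the remapping function, so for a HashMap the existing entry is still there after the failed merge.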

Regards, Peter

Paul.


Here's the webrev:

http://cr.openjdk.java.net/~plevart/jdk9-dev/Collectors.duplicateKey/webrev.02/

There has already been a discussion on the lambda-dev about this bug:

http://mail.openjdk.java.net/pipermail/lambda-dev/2014-April/012005.html


Initially, a fix was proposed that used a private RuntimeException subclass thrown
by the merger function to transport the values being merged out of the
Map.merge() method to an outer context, where they could be combined with the key to
produce a friendlier "duplicate key" message. It turns out that there is an
alternative approach that doesn't require exception acrobatics: using Map.putIfAbsent()
instead of Map.merge(). I checked both methods in the Map defaults and in the HashMap
implementation for possible semantic differences. The same behaviour can be achieved by
requiring that the value passed to Map.putIfAbsent(key, value) is non-null and throwing
an NPE if it isn't.
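
The putIfAbsent approach described above could look roughly like the following accumulator. This is a sketch under assumptions, not the code from the webrev: the class UniqueKeysDemo, the method uniqueKeysAccumulator and the exact message wording are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.BiConsumer;
import java.util.function.Function;

class UniqueKeysDemo {
    // Hypothetical accumulator: inserts each element's key/value pair and fails
    // with a message naming the key and both values when the key already exists.
    static <T, K, V> BiConsumer<Map<K, V>, T> uniqueKeysAccumulator(
            Function<? super T, ? extends K> keyMapper,
            Function<? super T, ? extends V> valueMapper) {
        return (map, element) -> {
            K k = keyMapper.apply(element);
            // Reject null values up front, matching what Map.merge would do.
            V v = Objects.requireNonNull(valueMapper.apply(element));
            // putIfAbsent returns the existing value, so the key, the old value
            // and the new value are all in scope without a capturing lambda.
            V old = map.putIfAbsent(k, v);
            if (old != null) {
                throw new IllegalStateException(
                    "Duplicate key " + k + " (attempted merging values " + old + " and " + v + ")");
            }
        };
    }

    public static void main(String[] args) {
        Map<Integer, String> byLength = new HashMap<>();
        BiConsumer<Map<Integer, String>, String> acc =
            UniqueKeysDemo.<String, Integer, String>uniqueKeysAccumulator(String::length, s -> s);
        acc.accept(byLength, "a");
        acc.accept(byLength, "bb");
        try {
            acc.accept(byLength, "cc"); // same length as "bb", so the key collides
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the key is available at the call site, no exception is needed to carry values out of Map.merge().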


Regards, Peter

