On 10/8/15 7:39 PM, Paul Benedict wrote:
I don't think the statement "Creates an unmodifiable set containing X elements"
is always true. Since sets cannot have duplicates, passing in X elements may
give you fewer than that, based on equality. I think the Set docs should say
"...X possible elements if unique". Wordsmith something better if you can, of
course.

Fair point. I probably had the idea of throwing an exception for duplicate elements when I wrote the "Creates an unmodifiable set containing X elements" lines. If the policy were to silently ignore duplicates, applying the usual dose of spec weasel wording might result in something like:

    "Creates an unmodifiable set from the X elements passed as arguments."

Then again, one of the advantages of rejecting duplicates is that the resulting collection's size equals the number of arguments passed. This also applies to the varargs cases; the resulting size equals the varargs array length. This not only provides clear semantics to the programmer, but it also lets the implementation preallocate internal arrays knowing that they'll have exactly the right size.
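
A minimal sketch of that strict policy, assuming a hypothetical helper named StrictSets.of (it is not part of the draft API, and the draft spec is currently silent on duplicates). Because a duplicate is rejected, the backing HashSet can be sized from the argument count up front:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not the JEP 269 implementation.
final class StrictSets {
    private StrictSets() {}

    @SafeVarargs
    static <E> Set<E> of(E... elements) {
        // Capacity chosen so that elements.length entries fit without a resize
        // at HashSet's default load factor of 0.75.
        Set<E> set = new HashSet<>((int) (elements.length / 0.75f) + 1);
        for (E e : elements) {
            // HashSet.add returns false when an equal element is already present.
            if (!set.add(e)) {
                throw new IllegalArgumentException("duplicate element: " + e);
            }
        }
        // On success, set.size() == elements.length.
        return Collections.unmodifiableSet(set);
    }
}
```

With this policy, StrictSets.of("a", "b", "c") has size 3, while StrictSets.of("a", "b", "a") throws IllegalArgumentException instead of quietly returning a two-element set.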

s'marks





Cheers,
Paul

On Thu, Oct 8, 2015 at 6:39 PM, Stuart Marks <stuart.ma...@oracle.com> wrote:

    Hi all,

    Please review and comment on this draft API for JEP 269, Convenience
    Collection Factories. For this review I'd like to focus on the API, and set
    aside implementation issues and discussion for later.


    JEP:

    http://openjdk.java.net/jeps/269

    javadoc:

    http://cr.openjdk.java.net/~smarks/reviews/jep269/api.20151008.mod/

    specdiff:

    http://cr.openjdk.java.net/~smarks/reviews/jep269/api.20151008.specdiff/overview-summary.html


    Most of the API is pretty straightforward, with fixed-arg and varargs "of()"
    factories for List, Set, ArrayList, and HashSet; and with fixed-arg "of()"
    factories and varargs "ofEntries()" factories for Map and HashMap.

    There are a few issues on which I'd like to solicit discussion.

    1. Number of fixed arg overloads.

    I've somewhat arbitrarily provided up to 5 fixed-arg overloads for the lists
    and sets, and up to 8 pairs for the fixed-arg map factories. The rationale
    for 8 pairs is that there are 8 primitives, and various language processing
    tools often have maps for the primitive types. (But such tools also often
    need to handle the Void type, which exceeds the limit of 8. So this might
    need to change if we want to follow this rationale.)

    I also note that Guava's immutable factories provide 11 fixed-arg overloads
    for list, 5 for set, and 5 pairs for map. I'd be curious as to the rationale
    for this, and whether it also would apply to the JDK.
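
The shape of the fixed-arg plus varargs overloads discussed in item 1 can be sketched roughly as follows. The class name ListFactories and the method bodies are assumptions made for illustration (only two fixed-arg overloads are shown, and this is not the draft's implementation). The point is that short argument lists resolve to a fixed-arg overload, so no array is created at the call site, while longer lists fall through to the varargs overload:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of the overload shape; not the draft's implementation.
final class ListFactories {
    private ListFactories() {}

    // Fixed-arg overloads: no varargs array is created at the call site.
    static <E> List<E> of(E e1) {
        return Collections.singletonList(e1);
    }

    static <E> List<E> of(E e1, E e2) {
        return Collections.unmodifiableList(Arrays.asList(e1, e2));
    }

    // Varargs fallback for longer argument lists.
    @SafeVarargs
    static <E> List<E> of(E... elements) {
        // Copy so that later writes to the caller's array are not visible.
        return Collections.unmodifiableList(new ArrayList<>(Arrays.asList(elements)));
    }
}
```

Where to stop adding fixed-arg overloads (5, 8, or 11) is then a trade-off between API surface and how often callers fall through to the varargs path.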

    2. Other concrete collection factories.

    I've chosen to provide factories for the concrete collections ArrayList,
    HashSet, and HashMap, since those seem to be the most commonly used. Is
    there a need to provide factories for other concrete collections, such as
    LinkedHashMap?

    3. Duplicate handling.

    My current thinking is for the Set and Map factories to throw
    IllegalArgumentException if a duplicate element or key is detected. The
    current draft specification is silent on this point. It needs to be
    specified, one way or another.

    The rationale for throwing an exception is that if these factories are used
    in a "literal like" fashion, then having a duplicate is almost certainly a
    programming error. Consider this example:

         Map<String,TypeUse> m = Map.ofEntries(
             entry("CDATA",       CBuiltinLeafInfo.NORMALIZED_STRING),
             entry("ENTITY",      CBuiltinLeafInfo.TOKEN),
             entry("ENTITIES",    CBuiltinLeafInfo.STRING.makeCollection()),
             entry("ENUMERATION", CBuiltinLeafInfo.STRING.makeCollection()),
             entry("NMTOKEN",     CBuiltinLeafInfo.TOKEN),
             entry("NMTOKENS",    CBuiltinLeafInfo.STRING.makeCollection()),
             entry("ID",          CBuiltinLeafInfo.ID),
             entry("IDREF",       CBuiltinLeafInfo.IDREF),
             entry("IDREFS",
                       TypeUseFactory.makeCollection(CBuiltinLeafInfo.IDREF)),
             entry("ENUMERATION", CBuiltinLeafInfo.TOKEN));

    (derived from [1])

    If duplicates were silently ignored, this might result in hard-to-spot
    errors.

    There's also the matter of which value ends up being used in the case of
    duplicate map keys, and whether this should be specified. A fairly obvious
    policy would be "last one wins" but I'm reluctant to specify that, as it
    starts to place unnecessary constraints on implementations. However, the
    alternative of leaving it unspecified is also unpalatable.

    I'm aware that very few programming systems with similar constructs will
    signal an error on duplicate elements. Python, Ruby, Groovy, Scala, and Perl
    all seem to allow duplicates in maps or equivalent, apparently with a
    last-wins policy. (Though sometimes it's hard to tell if the policy is
    specified.)

    The only system I've been able to find that explicitly rejects duplicates is
    Clojure, and this policy isn't without controversy. [2] The main rationale
    is to prevent programming errors.

    There is a Python bug [3] where it was proposed that duplicates in a dict
    should raise an error or warning, also in order to catch programming errors.
    The request was rejected, not necessarily because it was a bad idea, but
    primarily because it would be a backward incompatible change.

    The easiest thing to do would simply be to require last-wins, since "everybody
    else is doing it" ... but that doesn't mean it's right. Since we're
    introducing a new API here, there is no compatibility issue. Throwing an
    exception for duplicates seems like a good way to prevent a certain class of
    programming errors.
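
To make the two policies concrete, here is a minimal sketch that contrasts them for map entries. The class DuplicatePolicies and its methods lastWins, rejectDuplicates, and entry are hypothetical names introduced for illustration; they are not part of the draft API, and plain strings stand in for the TypeUse values of the earlier example:

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the two duplicate-key policies; not the draft's implementation.
final class DuplicatePolicies {
    private DuplicatePolicies() {}

    // Small helper so call sites read like the entry(...) example above.
    static <K, V> Map.Entry<K, V> entry(K key, V value) {
        return new SimpleImmutableEntry<>(key, value);
    }

    // "Last one wins": a later entry silently replaces an earlier one.
    @SafeVarargs
    static <K, V> Map<K, V> lastWins(Map.Entry<K, V>... entries) {
        Map<K, V> map = new HashMap<>();
        for (Map.Entry<K, V> e : entries) {
            map.put(e.getKey(), e.getValue());
        }
        return Collections.unmodifiableMap(map);
    }

    // Strict: a repeated key is reported as an error.
    @SafeVarargs
    static <K, V> Map<K, V> rejectDuplicates(Map.Entry<K, V>... entries) {
        Map<K, V> map = new HashMap<>();
        for (Map.Entry<K, V> e : entries) {
            if (map.containsKey(e.getKey())) {
                throw new IllegalArgumentException("duplicate key: " + e.getKey());
            }
            map.put(e.getKey(), e.getValue());
        }
        return Collections.unmodifiableMap(map);
    }
}
```

Given entry("ENUMERATION", "STRING") followed by entry("ENUMERATION", "TOKEN"), lastWins returns a one-entry map holding "TOKEN" with no indication that anything was dropped, whereas rejectDuplicates throws IllegalArgumentException naming the repeated key.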

    What do people think?

    s'marks

    [1] http://hg.openjdk.java.net/jdk8/jdk8/jaxws/file/d03dd22762db/src/share/jaxws_classes/com/sun/tools/internal/xjc/reader/dtd/TDTDReader.java#l420

    [2] http://dev.clojure.org/display/design/Allow+duplicate+map+keys+and+set+elements

    [3] https://bugs.python.org/issue16385

