Indeed so.  Some people are having a hard time making the shift away from treating `instanceof` as purely a type query.  This is one of the costs of the "lump vs. split" decision we made in the surface syntax when we first added patterns: we had a choice between picking a new keyword (e.g., "matches") and extending instanceof to support patterns.  That choice had pros and cons on both sides of the ledger, and having made it, we now have to live with whatever cons it carried.  (FTR, I am completely convinced that this was the better choice, but that doesn't make the cons go away.)

In C#, they spelled their `instanceof` operator `is`, and in that world, the "this is only about types" interpretation is less entrenched, for purely syntactic reasons; both

    if (anObject is String)

and

    if (aFloat is int)

seem pretty natural.  It will take some time for people to reprogram this accidental association, but once they do, it won't be a problem.


On 9/12/2025 2:01 PM, Archie Cobbs wrote:
This is just an aside (I agree with everything Brian has said).

I think a lot of the "uncomfortableness" comes from a simple mental model adjustment that needs to occur.

If "42 instanceof float" feels normal to you, then no adjustment is needed. But I think for some people it feels funny.

Why? Because those people tend to read "x instanceof Y" as "the type of x is some subtype of Y".

Why is that a problem? Well, what is "the type of x"? For reference types, there is the compile-time type and the runtime type, and instanceof is a way to ask about the runtime type of something that the code only knows by its compile-time type. So far, so good.

But with primitives, there is no distinction between compile-time type and runtime type. An "int" is always an ordered sequence of 32 bits.

So applying the traditional understanding of "instanceof" leads you to this: "42 instanceof float" is obviously false, because the statement "int is some subtype of float" is false.

So, int is not a subtype of float - but some ints are representable as floats. The latter property is what matters here.

So perhaps when we talk about this feature, we should start by first telling everyone to replace any notion they might have that "x instanceof Y" means "type of x is some subtype of Y" with "the value x is representable as a Y". After that, the waters should become much smoother.
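
To make that concrete, here is a minimal sketch of the "representable as" reading (assuming a JDK with the primitive patterns of JEP 455 enabled):

    int i = 42;
    float f = 0.5f;

    if (i instanceof float g) { ... }  // matches: 42 is exactly representable as a float
    if (f instanceof int j) { ... }    // no match: no int has the value 0.5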

-Archie


On Fri, Sep 12, 2025 at 12:31 PM Brian Goetz <brian.go...@oracle.com> wrote:



    On 9/11/2025 11:55 AM, Brian Goetz wrote:

    I explicitly asked you to answer a question on the semantics when
    these conversions were NOT involved before we moved on to these,
    because I think you are conflating two separate things, and in
    order to get past just repeating "but...lossy!", we have to
    separate them.

    I see Remi has lost interest in this discussion, so I will play
    both parts of the dialogue on his behalf now.

    Let's start with the easy ones:

        Object p = "foo";                   // widen String to Object
        Object q = Integer.valueOf(3);      // widen Integer to Object
        ...
        if (p instanceof String s) { ... }  // yes it is
        if (q instanceof String s) { ... }  // no it isn't

    We can widen String and Integer to Object; we can safely narrow p
    back to String, but we can't do so for q, because it is "outside
    the range" of references to String (which embeds in "references
    to Object".)   Does this make sense so far?

    Remi: It has always been this way.

    OK, now let's do int and long.

        long small = 3;
        long big = Long.MAX_VALUE;

        if (small instanceof int a) { ... }  // yes, it is
        if (big instanceof int b) { ... }    // no, it isn't

    What these questions are asking is: can I safely narrow these
    longs to int, just like the above.  In the first case, I can --
    just like with String and Object.  In the second, I can't -- just
    like with Integer and Object. Do we agree these are the same?
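
    (As a hand-rolled sketch of what the pattern is asking -- not the
    actual implementation -- the long matches `int` exactly when
    casting down and back loses nothing:)

        if ((long) (int) small == small) { ... }  // true: 3 survives the round trip
        if ((long) (int) big == big) { ... }      // false: the cast truncates Long.MAX_VALUE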

    Remi: Yes Socrates, I believe they are.

    OK, now let's do int and double.

        double zero = 0;
        double pi = 3.14d;

        if (zero instanceof int i) { ... }
        if (pi instanceof int i) { ... }

    Same thing!  The first exactly encodes a number that is
    representable in int (i.e., could have arisen from widening an
    int to double); the second does not.

    Remi: It could be no other way, Socrates.

    But we can replace double with float in the previous example, and
    nothing changes:

        float zero = 0;
        float pi = 3.14f;

        if (zero instanceof int i) { ... }
        if (pi instanceof int i) { ... }

    Here, we are asking a sound question: does the number encoded in
    this `float` exactly represent an `int`?  For `zero`, the answer
    is "of course"; for `pi`, the answer is "obviously not."

    Remi: Yes, Socrates, that is clearly evident.


    I'm now entering guessing territory here, but I'm pretty sure I
    understand what's making you uncomfortable.  But, despite the
    clickbaity headline and misplaced claims of wrongness, this really
    has nothing to do with pattern matching at all!  It has to do with
    the existing regrettable treatment of some conversions (which we
    well understood are problematic, and are working on), and the fact
    that by its very duality, pattern matching _exposes_ the
    inconsistency that while most "implicit" conversions (more
    precisely, those allowed in method and assignment context) are
    required to be "safe", we allow several lossy conversions in these
    contexts too (int -> float, long -> float, long -> double).
    (The argument then makes the leap that "because this exposes a
    seeming inconsistency, the new feature must be wrong, and should
    be changed."  But it is neither fair nor beneficial to blame the
    son, who is actually doing it right, for the sins of the fathers.)
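
    (To see that inconsistency in isolation -- no patterns involved --
    note that all of the following compile without a cast today, and
    each of the first three silently changes the value:)

        float f = 16_777_217;       // int -> float: nearest float is 16_777_216
        float g = Long.MAX_VALUE;   // long -> float: lossy
        double d = Long.MAX_VALUE;  // long -> double: lossy
        // int i = 3.0f;            // by contrast, float -> int does not compile without a cast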

    In other words, you see these two cases as somehow so different
    that we should roll back several years of progress just to avoid
    acknowledging the inconsistency:

        float f = 0;
        if (f instanceof int i) { ... }

    and

        float g = 200_000_007;  // lossy
        if (g instanceof int i) { ... }

    because in the first case, we are merely "recovering" the int-ness
    of something that was an int all along, but in the second case, we
    are _throwing away_ some of the int value and then trying to
    recover it, but without the knowledge that a previous lossy
    operation happened.
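
    (Concretely, assuming the JEP 455 preview semantics: the loss
    already happened at the assignment, so the pattern match on g
    still succeeds -- it just cannot give you back the 7:)

        float g = 200_000_007;           // actually stores 2.0E8f; the trailing 7 is gone
        if (g instanceof int i) { ... }  // matches, with i == 200_000_000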

    But blaming pattern matching is blaming the messenger.  Your beef
    is with the assignment to g: that we allow a lossy assignment
    without at least an explicit cast.  And this does seem
    "inconsistent"!  We generally go out of our way to avoid lossy
    conversions in assignments and method invocation (by allowing
    widening conversions but not narrowing ones) -- except that the
    conversion int -> float is considered (mystifyingly) a "widening
    conversion."

    Except there's no mystery.  This was a compromise made in 1995,
    born of a desire for Java not to seem "too surprising" to C
    programmers.  So this conversion and two of its irresponsible
    friends were characterized as "widenings", when in fact they are
    not.

    If I could go back to 1995 and argue against this, I would.  But I
    can't do that.  What I can do is to acknowledge that this was a
    regrettable misuse of the term widening, and not extrapolate from
    this behavior.  Can we change the language to disallow these
    conversions in assignment and method context?  Unlikely; that
    would break way too much code.  We can, though, clarify the
    terminology in the JLS, start to issue warnings for these lossy
    implicit conversions, and instead encourage an explicit cast to
    emphasize the "convert to float, dammit" intention of the
    programmer (which is what we plan to do; we just haven't finalized
    the plan yet.)  But again, this has little to do with the proper
    semantics of pattern matching; it is just that pattern matching is
    the mirror that reveals past mistakes more clearly.  It would be
    stupid to extrapolate this mistake forward into pattern matching --
    that would make things both more confusing and more complicated.
    Instead, we admit our past mistakes and improve the language as
    best we can.  In this case, there was a pretty good answer,
    despite the legacy warts.

    Before I close this thread, I need to reiterate that a seeming
    inconsistency is not necessarily evidence that the _most recent
    move_ is a mistake.  So next time, if you see an inconsistency,
    instead of thumping your shoe on the table and crying "mistake!
    mistake!", you could ask these questions instead:

     - This seems like an inconsistency, but is it really?
     - If this is an inconsistency, does that mean that one case or
    the other is a mistake?
     - If there is a mistake, can it be fixed?  If not, should we
    consider changing course to avoid confusion, or are we better off
    living with a small inconsistency to get a greater benefit?

    These are the kinds of questions that language designers grapple
    with every day.




--
Archie L. Cobbs
