Re: [External] : Re: Primitive type patterns

Brian Goetz Thu, 03 Mar 2022 08:10:00 -0800

I'm in agreement on not adding new contexts but I had the opposite
impression here.  Doesn't "having it do range checking" require a new
context as this is different from what assignment contexts allow
today?  Or is it the case that regular, non-match assignment must be
total with no left over that allows them to use the same context
despite not being able to do the dynamic range check?  As this
sentence shows, I'm confused on how dynamic range checking fits in the
existing assignment context.

Or are we suggesting that assignment allows:

byte b = new Long(5);

to succeed if we can unbox + meet the dynamic range check?  I'm
clearly confused here.


At a meta level, the alignment target is:
   - given a target type `T`
   - given an expression `e : E`

then:

- being able to statically determine whether `T t` matches `e` should
be equivalent to whether the assignment `T t = e` is valid under the
existing 5.2 rules.

That is to say, the existing 5.2 rules may look like a bag of ad-hoc,
two-for-one-on-tuesday rules, but really, they will be revealed to be
the set of conversions that are consistent with statically determining
whether `T t` matches `e : E`.  Most of these rules involve only T and E
(e.g., widening primitive conversion), but one of them is about ranges,
which we can only statically assess when `e` is a constant.

Conversions like unboxing or casting are burdened by the fact that they
have to be total, which means the "does it fit" / "if so, do it" / "if
not, do something else (truncate, throw, etc)" all have to be crammed
into a single operation.  What pattern matching does is extract the
"does it fit, and if so do it" part into a more primitive operation,
from which other operations can be composed.

Is it accurate to say this is less reusing assignment context and more
completely replacing it with a new pattern context from which
assignment can be built on top of?

Yes!  Ideally, this is one of those "jack up the house and provide a
solid foundation" moves.

At some level, what I'm proposing is all spec-shuffling; we'll either
say "a widening primitive conversion is allowed in assignment context",
or we'll say that primitive `P p` matches any primitive type Q that can
be widened to P.  We'll end up with a similar number of rules, but we
might be able to "shake the box" to make them settle to a lower energy
state, and be able to define (whether we explicitly do so or not)
assignment context to support "all the cases where the LHS, viewed as a
type pattern, are exhaustive on the RHS, potentially with remainder, and
throws if remainder is encountered."  (That's what unboxing does; throws
when remainder is encountered.)

Ok. So maybe I'm not confused.  We'd allow the `byte b = new Long(5);`
code to compile and throw not only on a failed unbox, but also on a
dynamic range check failure.


No ;)

Today, we would disallow this assignment because it is not an unboxing
followed by a primitive widening.  (The opposite, long l = new Byte(3),
would be allowed today, except that we took away these constructors so
you have to use valueOf.)  We would only allow a narrowing if the RHS
were a constant, like "5", in which case the compiler would statically
evaluate the range check and narrow 5 to byte.
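
As a minimal sketch of today's rules (the class and method names are
made up for illustration), the following compiles with current Java;
the rejected forms are left as comments:

```java
final class AssignmentToday {
    // Constant narrowing: 5 is a compile-time constant that fits in byte,
    // so the compiler does the range check statically.
    static byte constantNarrowing() {
        byte b = 5;
        return b;
    }

    // Unboxing followed by a widening primitive conversion: Byte -> byte -> long.
    static long unboxThenWiden() {
        long l = Byte.valueOf((byte) 3);
        return l;
    }

    // Rejected today (compile-time errors):
    //   byte b2 = Long.valueOf(5);       // not unboxing followed by widening
    //   int i = 5; byte b3 = i;          // i is not a constant expression

    public static void main(String[] args) {
        System.out.println(constantNarrowing());
        System.out.println(unboxThenWiden());
    }
}
```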

Tomorrow, the assignment would be the same; assignment works based on
"statically determined to match", and we can only statically determine
the range check if we know the target value, i.e., it's a constant.  But,
if you *asked*, then you can get a dynamic range check:


    if (anInt matches byte b) // we get a range check here

The reason we don't do that with assignment is we don't know what to do
if it doesn't match.  But if it's in a conditional context (if or
switch), then the programmer is going to tell us what to do if it
doesn't match.
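
To make the "ask first" idea concrete, here is a rough sketch in
today's Java.  The helper `matchByte` is a hypothetical stand-in for
`anInt matches byte b`, not proposed syntax or API; it assumes the
range check amounts to "the int survives a round trip through byte":

```java
import java.util.OptionalInt;

final class ByteRangeCheck {
    // Hypothetical stand-in for `anInt matches byte b`: yields the value
    // only when the int round-trips through byte without loss.
    static OptionalInt matchByte(int anInt) {
        return (anInt == (byte) anInt) ? OptionalInt.of(anInt) : OptionalInt.empty();
    }

    // The conditional context: the caller says what happens on no match.
    static String describe(int anInt) {
        OptionalInt m = matchByte(anInt);
        if (m.isPresent()) {
            byte b = (byte) m.getAsInt(); // safe: the range check already succeeded
            return "byte " + b;
        }
        return "not a byte: " + anInt;
    }

    public static void main(String[] args) {
        System.out.println(describe(100));
        System.out.println(describe(300));
    }
}
```

The cast inside the matching branch can never truncate, because it only
runs after the check has succeeded.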

If we took this "dynamic hook" behaviour to the limit, what other new
capabilities does it unlock?  Is this the place to connect other
user-supplied conversion operations as well?  Maybe I'm running too
far with this idea but it seems like this could be laying the
groundwork for other interesting behaviours.  Am I way off in the
weeds here?

Not entirely in the weeds.  The problem with assignment, casting, and
all of those things is that they have to be total; when you say "x = y"
then the guarantee is that *something* got assigned to x.  Now, we are
already cheating a bit, because `x = y` allows unboxing, and unboxing
can throw.  (Sounds like remainder rejection!)  Now, imagine we had an
"assign or else" construct (with static types A and B):


    a := (b, e)

then this would mean

    if (b matches A aa)
        a = aa
    else
        a = e  // and maybe e is really a function of b

In the case of unboxing conversions, our existing assignment works kind
of like:


    a := (b, throw new NPE)

because we'd try to match, and if it fails, evaluate the second
component, which throws.

Obviously I'm not suggesting we tinker with assignment in this way, but
the point is: pattern matching gives you a chance to stop and say:
"don't do it yet, but if you did it, would it work?"
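
Purely as an illustration, the "assign or else" shape can be modelled
as a library method in today's Java.  Everything here is hypothetical
(`assignOrElse`, `unbox`, and the use of `Class.isInstance` as the
"matches" test); it is a sketch of the idea, not a proposed feature:

```java
import java.util.function.Function;

final class AssignOrElse {
    // Hypothetical model of `a := (b, e)`: if b matches type A, use it;
    // otherwise evaluate the fallback, which may depend on b or may throw.
    static <A> A assignOrElse(Object b, Class<A> type, Function<Object, A> orElse) {
        if (type.isInstance(b)) {   // false for null, so null is "remainder"
            return type.cast(b);
        }
        return orElse.apply(b);
    }

    // Models unboxing as `a := (b, throw new NPE)`.
    static int unbox(Integer boxed) {
        return assignOrElse(boxed, Integer.class, x -> {
            throw new NullPointerException("unboxing null");
        });
    }

    public static void main(String[] args) {
        System.out.println(unbox(7));
        System.out.println(assignOrElse("not a number", Integer.class, o -> -1));
    }
}
```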

Intuitively, the behaviour you propose is kind of what we want - all
the possible byte cases end up in the byte case and we don't need to
adapt the long case to handle those that would have fit in a byte.
I'm slightly concerned that this changes Java's historical approach
and may lead to surprises when refactoring existing code that treats
unbox(Long) one way and unbox(Short) another.  Will users be confused
when the unbox(Long) in the short right range ends up in a case that
was only intended for unbox(Short)?  I'm having a hard time finding an
example that would trip on this but my lack of imagination isn't
definitive =)

I'm worried about this too.  We examined it briefly, and ran away, when
we were thinking about constant patterns, specifically:

      Object o = ...
      switch (o) {
          case 0: ...
          default: ...
      }

What would this mean?  What I wouldn't want it to mean is "match Long 0,
Integer 0, Short 0, Byte 0, Character 0"; that feels like it is over the
line for "magic".  (Note that this is about defining what the _constant
pattern_ means, not the primitive type pattern.) I think it's probably
reasonable to say this is a type error; 0 is applicable to primitive
numerics and their boxes, but not to Number or Object.  I think that is
consistent with what I'm suggesting about primitive type patterns, but
I'd have to think about it more.

Object o =...
switch(o) {
     case (long)0: ...  // can we say this?  Probably not
     case long l && l == 0: // otherwise this would become the way to
                            // catch most of the constant 0 cases
     default: ....
}

I'm starting to think the constant pattern will feel less like magic
once the dynamic range checking becomes commonplace.

Probably can't say `case (long) 0`, but you can say `case 0L`.  Though we
don't have suffixes for all the types.

One reason this is especially undesirable is that one of the forms of
let-bind is a let-bind *expression*:

      let P = p, Q = q
      in <expression>

which is useful for pulling out subexpressions and binding them to a
variable, but for which the scope of that variable is limited.  If
refactoring from:

Possible typo in the example.  Attempted to fix:

      int x = stuff;
      m(f(x));

to

      m(let x = stuff in f(x))
      // x no longer in scope here

Not sure I follow this example.  I'm not sure why introducing a new
variable in this scope is useful.

Two reasons: narrower scope for locals, and turning statements into
expressions.

A common expression with redundant subexpressions is "last 3 characters
of string":


    last3 = s.substring(s.length() - 3, s.length())

We can refactor to

    int sLen = s.length();
    last3 = s.substring(sLen - 3, sLen);

but some people dislike this because now the rest of the scope is
"polluted" with a garbage variable.  A let expression narrows the scope
of sLen:


    last3 = let sLen = s.length()
                in s.substring(sLen - 3, sLen);

This becomes more important when we want to use the result in, say, a
method call; now we have to unroll the declaration of any helper
statements (e.g., `int sLen = s.length()`) to outside the method call.
A similar thing happens when we want to create an object, mutate it, and
return it; this often requires statements, but a let expression turns it
back into an expression.
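
For comparison, the scoping effect can be approximated in today's Java
with a lambda.  The `let` helper below is a hypothetical name, not an
existing API, and the sketch boxes the int, which a real let expression
would presumably not need to do:

```java
import java.util.function.Function;

final class LetSketch {
    // Hypothetical helper: binds `value` to the lambda parameter, whose
    // scope ends with the lambda body.
    static <T, R> R let(T value, Function<T, R> body) {
        return body.apply(value);
    }

    // "Last 3 characters of string" as a single expression; sLen is not
    // visible outside the lambda.
    static String last3(String s) {
        return let(s.length(), sLen -> s.substring(sLen - 3, sLen));
    }

    public static void main(String[] args) {
        System.out.println(last3("pattern"));
    }
}
```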
