Re: [External] : Pattern assignment statements (extracted from: Primitive type patterns and conversions)

John Rose Wed, 03 Mar 2021 23:18:07 -0800

On Mar 3, 2021, at 8:09 AM, Brian Goetz <brian.go...@oracle.com> wrote:
> 
>> the whole story about initializing a local variable with an array is weird,
>>   int data[] = {1, 2, 3};
>> compiles but
>>   int data[];
>>   data = {1, 2, 3};
>> does not.
> 
> True, and I hate it.  This is related to the C-style array decl; for the 
> first year it was really important that Java seem friendly to C developers, 
> and this was one of the compromises.  For the rest of time, it is just an 
> annoying irregularity.


Here are some thoughts on array handling
and patterns.

Basically, looking at pattern assignment, in
the case of arrays, prompts me to think that
the more rewarding goal is pattern declarations,
not pattern assignments.  So I’ll try to compare
and contrast those, for arrays and other kinds of
patterns.

Backing up…

First, I agree with Brian that “int data[] = …”
is horrible; we should have only allowed
“int[] data = …” and taken the hit early.

But the nested-braces notation is good even if you never
learned C.  As everyone knows, the array declaration syntax
uses the declaration type as context for decoding the
nested initializer expressions.  This allows the programmer
to provide the array contents without re-asserting the
declared array type when the array instance is created.
The “stuff in braces” array initializer notation is
also uniformly available in expressions of the form
“new T[]{ … }”, with the “new T[]” header providing
exactly the same context to the “{…}”, as in the
array declaration case.
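
To make the two contexts concrete, here is a minimal sketch in present-day Java. The class and method names are made up for illustration; the only point is that the bare braces need either the declared type or a "new int[]" header in front of them.

```java
import java.util.Arrays;

final class ArrayInitializerDemo {
    // Declaration form: the declared type int[] gives the braces their meaning.
    static int[] declared() {
        int[] data = {1, 2, 3};
        return data;
    }

    // Assignment form: a bare { ... } is rejected here, so the "new int[]"
    // header has to supply the context the declaration no longer provides.
    static int[] assigned() {
        int[] data;
        // data = {1, 2, 3};        // does not compile
        data = new int[] {1, 2, 3}; // compiles
        return data;
    }

    public static void main(String[] args) {
        // Both forms build arrays with the same contents.
        System.out.println(Arrays.equals(declared(), assigned()));
    }
}
```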

(Bit of context:  Bill Joy and I added the “new T[]{ … }”
expression syntax in 1.1, so I’m partial to the “stuff in
braces” notation, and I also think it can be extended
and rationalized even further.)

If the language required explicitly typed array
initializers, we’d have to write stuff like this:

int[][] as = new int[][]{ new int[]{ 1, 2 }, new int[]{ 3 } };

We would surely tire of it and add some project-coin-style
sugar like today’s syntax:

int[][] as = { { 1, 2 }, { 3 } };

The “var” feature similarly allows users to state a
controlling type (or factory) in just one place.

var as = new int[][] { { 1, 2 }, { 3 } };
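
As a quick check that the three spellings above really are interchangeable, the following sketch builds the same nested array each way and compares the results. The class and method names are illustrative only.

```java
import java.util.Arrays;

final class NestedInitializerDemo {
    // Every level restates its type.
    static int[][] fullyExplicit() {
        return new int[][] { new int[] { 1, 2 }, new int[] { 3 } };
    }

    // The declared type int[][] is context for all nested braces.
    static int[][] declarationSugar() {
        int[][] as = { { 1, 2 }, { 3 } };
        return as;
    }

    // The "new int[][]" header is context for the braces and for "var".
    static int[][] varWithHeader() {
        var as = new int[][] { { 1, 2 }, { 3 } };
        return as;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.deepEquals(fullyExplicit(), declarationSugar())
                && Arrays.deepEquals(declarationSugar(), varWithHeader()));
    }
}
```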

(Here’s a factory example:
var as = List.of(List.of(1, 2), List.of(3));

It’s an intriguing problem to cross-generalize
“stuff in braces” syntax to factories and to
non-array types.  But I digress.  We can revisit
when/if we do construction expressions.)

OK, so as Remi says, we don’t allow:

int[][] as;
as = { { 1, 2 }, { 3 } };

Nor, in fact, do we allow:

var as;
as = new int[][] { { 1, 2 }, { 3 } };

I think the two cases are parallel.  Because there is
contextual type information in the two halves
(declaration and assignment) that communicates
from one half to the other (from int[][] as to
the nested exprs, or from the new int[][] expr
to the var as), you can’t break up the declaration
without breaking up the communication.

So, in the case of int[][] as = {…}, the communication
is a sort of very strong, explicit target typing, where
the quasi-expression {…} is evaluated in the context
of, not only an inferred type (as in a hypothetical
“f({…})”) but an explicitly declared type (“int[][]”
before “as”).  Such an explicit header type deserves
to be recycled as context to the rest of the declaration,
if it is useful there, and indeed it is.

(In “var as = new int[][]{…}” the communication is,
well, source typing, or whatever is the opposite
of target typing.  The source type here is very
strong and explicit, but it could also be “var as
= List.of(…)” in which case the source type is
implicit, inferred as a result of type checking.)
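
A small sketch of the two flavors of source typing, using only today's Java (10 or later, for "var"). The class and method names are invented for the example; the return types spell out what "var" infers in each case.

```java
import java.util.List;

final class SourceTypingDemo {
    // Explicit source type: the "new int[][]" header names the type outright.
    static int[][] explicitSource() {
        // var as;  // does not compile: there is no initializer to infer from
        var as = new int[][] { { 1, 2 }, { 3 } };
        return as;
    }

    // Implicit source type: inferred from type-checking the factory calls.
    static List<List<Integer>> inferredSource() {
        var as = List.of(List.of(1, 2), List.of(3));
        return as;
    }

    public static void main(String[] args) {
        System.out.println(explicitSource()[1][0]);
        System.out.println(inferredSource().get(0));
    }
}
```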

Do arrays scale upward toward user-defined literals
or templated construction expressions?  Actually,
I think we are pretty close.  We have spoken at various
times about an explicitly typed head followed by
an implicitly typed tail for such things, and I think
“new int[][] {…}” is a good “bellwether” example.

And if that’s true, then the duality between
“var as = new int[][]{ … }” and the stylistically different
but semantically equivalent “int[][] as = { … }” is also
a reusable concept:  Allowing the type-rich head to be
either a declaration type in a declaration or else an
expression-prefix in an expression seems almost a
forced move.  (Precedents for the type-rich expression
head would be a cast or a wrapping function call as with
lambdas, or ad hoc syntax as with array or object creation
expressions.  Newer ideas will surely follow.)

When I say this, I’m *not* painting a bikeshed for
the templated expressions; the type-rich head doesn’t
have to be “new T” or “(T)” per se, nor does the “tail
stuff” have to use curly braces.  We have spent lots of
whiteboard time over the years (almost a decade)
talking about specific bikeshed colors for this,
or perhaps a whole rainbow of user selectable colors.
But there’s no point in reviewing all that now.

I do think that our experiences with target typing
in pattern matching will help us do something similar
with construction expressions (if we go there).  The
reason is that a construction expression is pretty much
“merely” an arrow-reversed deconstruction pattern.
(Specifically, the dynamic data flow is reversed.  The
static flow is probably still left to right.)  So all the
same information is there, just flowing around in
a somewhat different direction.

Going back to the thread topic, of pattern assignments,
I think you get the clearest notations when you use
pattern assignment within the context of a declaration,
exactly because you have the most possible “communication”
between the head type of the declaration and whatever
type information is in the tail.  So I’m not surprised
that breaking a matching declaration into a data-free
head and a separate assignment of data to a bare name
doesn’t always work.  In fact, I’m surprised it works
as well as it does.

One more thought:  Deconstruction is the same as
construction, except for data flow direction.  In
deconstruction, data flows from a pre-existing
target object to its extracted components.  In
construction, data flows from (injected?)
components to a (newly created) target object.

It is very desirable, IMO, for the “two directions”
to look and feel somehow similar, in their notations.
For arrays, this suggests that if construction
looks like this:

int[][] output = { inputs… };
var output = new int[][]{ inputs… };

then a deconstruction should look something like this:

int[][] {outputs…} = input;
//maybe: var {outputs…} = (int[][]) input;

I’m saying “something like” NOT “exactly like”!

For example, the construction “new Box(x)” is only
“something” like the pattern “Box(var x)”.  And
yet their similarity suggests what is true, that
they do similar jobs.  (One reverses the other.)
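
The Box pairing can be tried directly in Java 21 or later, where record patterns are a final feature. This is only a sketch: the Box record and the helper names are assumptions made for the example, not anything defined in this thread.

```java
final class BoxDemo {
    // Illustrative record; its canonical constructor and its record
    // pattern are the two "directions" being compared.
    record Box(int x) {}

    // Construction: data flows from the component into a new Box.
    static Object construct(int x) {
        return new Box(x);
    }

    // Deconstruction: data flows from an existing Box out to x.
    static int deconstruct(Object o) {
        if (o instanceof Box(var x)) {
            return x;
        }
        throw new IllegalArgumentException("not a Box: " + o);
    }

    public static void main(String[] args) {
        System.out.println(deconstruct(construct(42)));
    }
}
```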

For a standalone assignment that deconstructs
an array, a turned-around array creation expression
could make sense, but it’s really ugly:

new int[][] {outputs…} = input;  //yuck

(Compared with deconstructing declarations,
the standalone assignment syntax feels like a
bridge too far, frankly.)
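
Java has no array patterns today, so whichever notation is chosen, the work it stands for currently has to be written by hand. The sketch below is one plausible reading of a hypothetical "int[][] {first, second} = input;". Treating the pattern as requiring exactly two rows is an assumption, as are all the names.

```java
final class ArrayDeconstructionSketch {
    // Hand-written stand-in for the hypothetical declaration
    // "int[][] {first, second} = input;".
    static int[] rowLengths(Object input) {
        // Type test plus an (assumed) exact-length test.
        if (input instanceof int[][] rows && rows.length == 2) {
            int[] first = rows[0];   // the bindings the pattern would declare
            int[] second = rows[1];
            return new int[] { first.length, second.length };
        }
        throw new IllegalArgumentException("expected an int[][] with exactly two rows");
    }

    public static void main(String[] args) {
        int[] lengths = rowLengths(new int[][] { { 1, 2 }, { 3 } });
        System.out.println(lengths[0] + " " + lengths[1]);
    }
}
```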

And for factories, maybe:

var output = List.of(inputs…);
  //<= reverses =>
List of(outputs…) = input;
//maybe: var of(outputs…) = (List) input;

And for partial extraction methods, maybe:

var outputMap = m.with(k, inputVal);
  //<= reverses =>
Map<> with(k)(outputVal) = inputMap;  //ignore m

Their ugly cousins, the standalone assignments,
seem to want to take this ground:

List.of(outputs…) = input;
Map<>.with(k)(outputVal) = inputMap;

But they shouldn’t, I think.  There should be a token
somewhere that says, “yes, I do want to assign
to stuff, not declare stuff”.  Straw man:

assign int[][] {outputs…} = input;
assign List.of(outputs…) = input;
assign Map<>.with(k)(outputVal) = inputMap;

The extra syntax is needed if we privilege the
convention to declare binding names as needed,
rather than to rummage around and assign to
them as needed.

A better syntax (IMO) would be to mark each
pattern binding variable in such a way that
if unmarked, it is a newly bound variable, and
if marked, it is assigned to a pre-existing variable
(which must be in scope).

int[][] {assign output, assign output2} = input;
assign List.of(assign output, assign output2) = input;
assign Map<>.with(k)(assign outputVal) = inputMap;

This has two benefits:  1. It’s clear which variables are
getting assigned to (and therefore require attention
to non-local declarations, from surrounding code).
2. You can mix assignments and bindings in the
same pattern.
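
For comparison, here is roughly what mixing the two looks like with the record patterns in Java 21 or later: every pattern variable is a fresh binding, so assigning to a pre-existing variable takes an explicit copy. The Pair record and the other names are hypothetical. The copy stands in for what a per-variable "assign" marker would express; the marker itself is not valid Java.

```java
final class AssignVersusBindDemo {
    // Illustrative record with two components.
    record Pair(int left, int right) {}

    // Rough equivalent of a hypothetical "Pair(assign output, var output2) = input;".
    static int[] extract(Object input) {
        int output = 0; // pre-existing variable, declared before the match
        if (input instanceof Pair(var left, var output2)) {
            output = left; // manual copy standing in for "assign output"
            return new int[] { output, output2 };
        }
        throw new IllegalArgumentException("not a Pair: " + input);
    }

    public static void main(String[] args) {
        int[] result = extract(new Pair(7, 8));
        System.out.println(result[0] + " " + result[1]);
    }
}
```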

— John

P.S. Another point, only slightly related: If we add
syntax support for combined declarations of
*frozen* arrays we will run into the limits of
the compact array notation, and I think there
will be some pressure to make it more flexible.

To explain, this works OK:

var as = new int[] { 1, 2, 3 }.freeze();
var as = Arrays.freeze(new int[] { 1, 2, 3 });

This doesn’t:

var as = new int[][] { { 1, 2 }, { 3 } }.freeze();

because (subtly) the sub-arrays are mutable.
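
Frozen arrays and a "freeze()" method do not exist in Java today, so the following is only an analogy. It uses the unmodifiable list returned by List.of as the "frozen" outer container and ordinary arrays as the still-mutable elements. The class and method names are made up.

```java
import java.util.List;

final class ShallowImmutabilityDemo {
    // Outer container is unmodifiable; the int[] elements are not.
    static List<int[]> shallow() {
        return List.of(new int[] { 1, 2 }, new int[] { 3 });
    }

    // True if replacing an element of the outer list is refused.
    static boolean outerRejectsWrites(List<int[]> rows) {
        try {
            rows.set(0, new int[] { 9 });
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<int[]> rows = shallow();
        System.out.println(outerRejectsWrites(rows));
        rows.get(0)[0] = 99; // the inner array accepts the write
        System.out.println(rows.get(0)[0]);
    }
}
```

The outer write is refused while the inner one goes through, which is the same gap that a single trailing freeze on a nested initializer would leave.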

Nor does this, although I have seen it used
informally:

int[] as = { 1, 2, 3 }.freeze();

Even if that is rationalized somehow, this has
the same ambiguity problem with mutable
subarrays:

int[][] as = { { 1, 2 }, { 3 } }.freeze();

Happily, we can start playing with the frozen
arrays themselves before we start cooking up
sugar for them.  After trying out use cases we’ll
have a better feel for what sugar we want to add.
In the early days it might sometimes look as
bad as this:

var as = new int[][]{ new int[]{ 1, 2 }.freeze(),
                      new int[]{ 3 }.freeze() }.freeze();

(Or worse, if we are scrupulous about avoiding
double copies and use an ArrayBuilder helper.
But, one step at a time…)

The connection between patterns per se and frozen
arrays is quite simple:  An array deconstruction 
pattern works exactly the same on a mutable and on a
frozen array.  The above is really about the limitations
of compact array initializers.  Compact notations are
inherently difficult to adjust in meaning, at least
while preserving compactness.
