Re: Factory methods & the language model

Dan Smith Thu, 09 Sep 2021 11:01:18 -0700

To clarify a point that I left out: this discussion assumes a pretty fixed JVM 
feature: a factory method is a static method with a special name, invoked via 
invokestatic, and possibly subject to certain constraints about the 
descriptor/enclosing class. I'm not proposing any changes to that basic 
approach, although choices we make for the Java language & tools _might_ 
influence the set of constraints we choose to impose in JVMS.

> On Sep 9, 2021, at 10:15 AM, Dan Heidinga <heidi...@redhat.com> wrote:
> 
> On Thu, Sep 9, 2021 at 10:24 AM Dan Smith <daniel.sm...@oracle.com> wrote:
>> 
>> JEP 401 includes special JVM factory methods, spelled <new> (or, 
>> alternatively, <init> with a non-void return), which are needed as a 
>> standardized way to encode the Java language's primitive class constructors.
>> 
>> We have a lot of flexibility in how much we restrict use of these methods. 
>> Too many restrictions seem arbitrary and incoherent from the JVM's point of 
>> view; but too few restrictions risk untested corner cases, unfortunate 
>> compatibility obligations, and difficulties mapping back to the Java 
>> language model.
>> 
>> Expanding on that last one: for tools that operate with a Java language 
>> model, there are essentially three strategies for dealing with factory 
>> methods outside of the core primitive class construction use case:
>> 
>> 1) Have the JVM reject them
> 
> This gives us the maximum flexibility to expand factories in the
> future and lets us concentrate on the inline types use cases.  Seems
> like a pretty safe fallback position on factories.

Yeah. Seems a little... lacking in vision to impose this restriction on class 
files of all languages, but it also avoids over-committing.

> 
>> 2) Ignore them
> 
> I strongly dislike this.  If javac were to ignore them, and just not
> generate them, they are effectively dead code.

Dead to the Java language and tools, but perhaps a useful way to compile a 
Scala feature or something?

>  It'd be much clearer
> to users if javac flagged them as such and refused to compile unless
> they were deleted.  If javac ignores them, we still need an answer on
> what the JVM does with them - reject them?  load them but prevent them
> from being invoked?  drop them when loading the classfile?  This seems
> like it collapses back to option 1.

The JVM semantics are clean and wouldn't change: if you want to use a factory, 
invoke it with invokestatic. It's just that the Java language wouldn't provide 
any mechanism to do so (because <new> or <init> aren't legal Java method names).

Ignoring does feel a bit like the feature is incomplete or something, but this 
sort of behavior does show up from time to time where Java and the JVM aren't 
perfectly in sync. For example:
- If there are two fields with the same name, one of them is effectively 
invisible
- If there are two methods with the same params and different returns, they're 
considered overloads that are impossible to disambiguate
- If there's a stray <clinit> method in an interface (before we outlawed this), 
javac either filters it out or treats it as a normal method, but anyway you 
can't call it because of its name
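
To make the second bullet concrete, here is a small sketch of a class file shape the JVM accepts but the Java language cannot express directly: two methods with the same name and parameters but different return types. It relies on javac's own bridge methods rather than a hand-assembled class file, so it only approximates the case above. The names ReturnOverloadDemo, Greeting, and countGetMethods are made up for illustration.

```java
import java.lang.reflect.Method;
import java.util.function.Supplier;

public class ReturnOverloadDemo {
    // Source declares one method, String get(). javac also emits a synthetic
    // bridge method, Object get(), to implement the erased Supplier.get().
    static class Greeting implements Supplier<String> {
        @Override
        public String get() {
            return "hello";
        }
    }

    // Counts the zero-argument methods named "get" present in the class file.
    static long countGetMethods() {
        long count = 0;
        for (Method m : Greeting.class.getDeclaredMethods()) {
            if (m.getName().equals("get") && m.getParameterCount() == 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (Method m : Greeting.class.getDeclaredMethods()) {
            System.out.println(m.getReturnType().getSimpleName()
                    + " " + m.getName() + "() bridge=" + m.isBridge());
        }
    }
}
```

Bridge methods are marked synthetic, so javac knows to hide them. The case in the bullet is the unmarked version of the same shape, where the language has no rule for choosing between the two.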

>> 3) Expand the model to include them
> 
> How much expanding does the model need?  We had originally modeled the
> <new> factory methods as regular static methods and only gave them the
> specialized name to make them easy to detect, to deal with withfield
> being limited to the nest, and to allow reflective operations like
> Class::getConstructor() and Class::newInstance() to identify the
> inline type "constructors".  Am I forgetting a case?

Talking here about expanding the *language* model in some way so that factory 
methods appearing in non-primitive classes and interfaces can somehow be 
recognized or invoked. (1) and (2) are reasonable options, too, but here I'm 
exploring other approaches that go beyond rejecting or ignoring.

>> 3) Or we can allow javac to view factory methods in any class as 
>> constructors. A few complications:
>> 
>>    - Constructors of non-final classes have both 'new Foo()' and 'super()' 
>> entry points; factories only support the first. So we either need to 
>> validate that a matching pair of <new> and <init> exist, or expand the 
>> language to model factories independently from constructors.
> 
> I don't think we want to touch the "new/dup/<init>" sequence and
> trying to allow factories to operate in that delicate dance would be a
> mistake.  Factories, beyond the inline types uses, give us a chance to
> encapsulate the "new/dup/<init>" dance and present a cleaner model.
> We shouldn't attempt to mix the two.

Not sure which direction you're going here?

One stance we could take: new/dup/<init> is fine for identity classes, we're 
not going to do anything different.

Another stance we could take: new/dup/<init> is painful, let's try to migrate 
to a different convention where factory methods encapsulate new/dup/<init>, and 
clients just call the factory.

I'm saying if we take the latter stance, there's a problem in that constructors 
would then be compiled down to factory methods *and* (for super calls) <init> 
methods, and we might need some validation to ensure they are aligned.
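
A rough Java-level sketch of that alignment concern, under stated assumptions: the static method make stands in for a <new> factory (that name can't be written in Java source), and Base and Derived are hypothetical classes. It shows the two entry points a single source constructor would have to serve.

```java
public class FactoryAlignmentDemo {
    static class Base {
        final int value;

        // Compiles to <init>(I)V. Still needed for the super(...) entry point.
        Base(int value) {
            this.value = value;
        }

        // Hypothetical stand-in for the factory: its body is the
        // new/dup/invokespecial <init> sequence, hidden from clients.
        static Base make(int value) {
            return new Base(value);
        }
    }

    static class Derived extends Base {
        Derived(int value) {
            // invokespecial Base.<init>; a factory cannot serve this call.
            super(value);
        }
    }

    public static void main(String[] args) {
        Base viaFactory = Base.make(1);   // client-side entry point
        Base viaSuper = new Derived(1);   // subclass entry point
        System.out.println(viaFactory.value == viaSuper.value);
    }
}
```

Here javac wrote both methods, so they agree by construction. A class file from another source could pair a factory with an <init> that does something different, or omit one of them, which is where the validation question comes from.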

>>    - The language expects instance creation expressions to create fresh 
>> instances. We need to either validate this behavior (does the factory look 
>> like "new/dup/<init>"?) or relax the language semantics (perhaps this is in 
>> the grey area of mixed binaries?)
>> 
> 
> Only the invokestatic bytecode should be used to invoke a factory.
> Classes can have both factories and constructors, but they serve
> different purposes and only overlap due to reflective operations.
> Keeping them completely separate at the bytecode level is cleanest.

Sure, it's nice at the JVM level to treat them as independent features. But 
that doesn't match the Java language, so there's a mismatch to work out (either 
by changing the language, or restricting the VM, or having javac ignore code 
shapes that don't match).

> 
>>    - Factories can appear in abstract classes and interfaces. Again, are we 
>> willing to change the language model to support these use cases? Perhaps to 
>> even allow their declaration?
> 
> This makes sense.  Factories are just static methods with a special
> name.  A factory on an abstract class or interface makes sense if the
> concrete implementations are all package-private (sealed?) so users
> only reference the one public abstract class.

Yep, could be a useful feature. Is it one we could actually see implementing? 
TBD...
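
For reference, the pattern Dan H. describes can already be approximated today with an ordinary named static method. This is only a sketch of the idea: Shape, Square, and square are hypothetical names, and a real <new> factory on an interface would additionally need some language syntax to declare and invoke it.

```java
public class InterfaceFactoryDemo {
    interface Shape {
        double area();

        // Ordinary static method playing the role of a factory on an interface.
        static Shape square(double side) {
            return new Square(side);
        }
    }

    // Hidden implementation: clients only ever name Shape.
    private static final class Square implements Shape {
        private final double side;

        Square(double side) {
            this.side = side;
        }

        @Override
        public double area() {
            return side * side;
        }
    }

    public static void main(String[] args) {
        Shape s = Shape.square(3.0);
        System.out.println(s.area());
    }
}
```

The open question in the thread is whether 'new Shape(...)' itself could be allowed to resolve to such a factory, which the language model currently forbids for interfaces and abstract classes.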

>>    - If a factory method has a mismatched return type (declared in Foo, but 
>> returns a Bar), are we willing to support a type system where the type of a 
>> factory invocation is not the type of the class to which the factory belongs?
>> 
> 
> I thought we needed this capability for anonymous inline classes as
> they can't name themselves in the return type of the factory.

We concluded that "need" is too strong a word here. It's a corner case that can 
be handled without using the factory method feature.

>  And I
> don't see a problem with it as long as we don't touch the new/dup/init
> dance.  Is there another problem here I'm not seeing?

Clients like the Java language will expect the return type to match, and will 
have to work around the issue if it doesn't (again, with any of these 
strategies: reject as malformed, ignore, or expand the language to allow it).

Specifically, even if we limit the feature to primitive classes, if a primitive 
class can have a factory that returns something other than the primitive 
class's type, javac needs to decide what to do about that.

>> There are probably limits to what we're willing to do with (3), which pushes 
>> at least some cases into the (1) or (2) buckets.
>> 
>> So, my question: what should we expect from (3), now and in the foreseeable 
>> future? And for the cases that fall outside of it, should we fall back to 
>> (1), (2), or a mixture of both?
>> 
> 
> (1), limiting to inline types, is the easiest and safest option while
> allowing the most flexibility to change in the future.
> 
> For (3), it seems like all the complexity goes away if we don't try to
> make factories == constructors at the bytecode level.  Am I missing
> something that would force us to do so?

No, it's not necessarily about JVM bytecode constraints. It's about how javac 
interprets whatever class files are thrown at it.

But you're right, if we limit the feature in the JVM to the minimal needs of 
the Java language (in a primitive class, matching return type), we can avoid 
these issues.
