Re: User model: terminology

Kevin Bourrillion Wed, 04 May 2022 11:28:17 -0700

My favorite kind of thread...

At the risk of inducing groans, a reminder that much of my own terminology
backstory is found in Data in Java Programs
<https://docs.google.com/document/d/1J-a_K87P-R3TscD4uW2Qsbt5BlBR_7uX_BekwJ5BLSE/preview>,
and that when there are places we disagree below, the disagreement is
probably highlighted by something in *that* document first.

Since for all we know it might be out of step with how typical Java devs
really think, I'll just mention that it's at least been well-received by
reddit
<https://www.reddit.com/r/java/search/?q=%22data%20in%20java%20programs%22>
twice (if reddit can find something to complain about, they usually do!).

On Wed, May 4, 2022 at 8:05 AM Brian Goetz <brian.go...@oracle.com> wrote:

> Currently, we have primitives and classes/references, where primitives have
> box/wrapper reference companions.  The original goal of Bucket 3 was to
> model primitive/box pairs.  We have tentatively been calling these
> "primitives", but there are good arguments why we should not overload this
> term.
>
> We have tentatively assigned the phrase "value class" to all identity-free
> classes, but it is also possible we can use value to describe what we've
> been calling primitives, and use something else (identity-free,
> non-identity) to describe the bigger family.
>
> So, in our search for how to stack the user model, we should bear in mind
> that names that have been tentatively assigned to one thing might be a
> better fit for something else (e.g., the "new primitives").  We are looking
> for:
>
>  - A term for all non-identity classes.  (Previously, all classes had
> identity.)
>

The term applies to the objects first and foremost. The object either has
identity or does not.

What *is* identity? I'll claim it's exactly like an ordinary immutable
field-based property, with one special provision: it is *always*
auto-assigned to be unique, and thus can never be copied. That feels to me
like it tells the whole story. So the difference between these kinds of
objects is exactly a "with identity" / "without identity" distinction, and
as we know from interface naming ("HasFoo"), it is often impossible to turn
that into adjective form.
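
To make the "auto-assigned, never copied" reading of identity concrete, here is a minimal sketch in today's Java. The class `Point` and the helper names are hypothetical, chosen only for illustration. Two objects can agree on every field-based property and still differ in identity, and the only way to "share" an identity is to share the reference itself.

```java
import java.util.Objects;

class IdentityDemo {
    // Hypothetical class with ordinary, immutable, field-based state.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Point)) {
                return false;
            }
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    // Compares the field-based properties only.
    static boolean sameState(Point a, Point b) {
        return a.equals(b);
    }

    // Compares identity: true only if both variables refer to one object.
    static boolean sameIdentity(Point a, Point b) {
        return a == b;
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2); // same fields, freshly assigned identity
        Point alias = a;           // same reference, hence same identity

        System.out.println(sameState(a, b));        // true
        System.out.println(sameIdentity(a, b));     // false
        System.out.println(sameIdentity(a, alias)); // true
    }
}
```

Nothing in `Point` asks for identity; it gets it by default, which is the "backward default" discussed next.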

The second complication here is the backward default. *Having* identity is
actually the special property! I do think we should lean into that. Part of
upgrading your code to be "Java 21-modern" (or whatever) really should be
marking all your classes that you really *want* to have identity and
letting the rest lose it. The terms that feel right are "identity object"
and "class that produces identity objects" shortened to "identity class".

For the most part I think we'll end up talking about "identity classes" and
"classes in general", and more rarely needing to refer to "classes without
identity" or "non-identity classes". So I think it's okay to let them use
"A \ B"-style terminology as I've done here. (I furthermore still think
it's okay to have an IdentityObject interface but no ValueObject interface,
as the latter doesn't really embody additional client-facing capabilities.)

This is one of at least four examples of backward defaults in the language.
We are either stuck with painful/awkward terminology choices in all of
them, or we could pursue the idea of letting source files declare their
language level, upon which the problem vanishes.

>  - A term for what we've been calling atomicity: that instances cannot
> appear to be torn, even when published under race.  (Previously, all
> classes had this property.)
>

I think the term we really need is this one's negation. You never need to
(or can) mention it with identity classes; with the rest you can use it to
opt into more risk. The English words that come to mind are the synonyms of
"fragile": <https://www.thesaurus.com/browse/fragile>.

>  - A term for those non-identity classes which do not _require_ a
> reference.  These must have a valid zero, and give rise to two types, what
> we've been calling the "ref" and "val" projections.
>

I think we need to name the *type* first before the class. Today we have

1. primitive types (the values are the instances)
2. reference types (the values are references to the instances)

But this isn't the *heart* of what it means to be "primitive"; it just
happens to be true of primitives so far. And sure, we'll certainly explain
all of this *partly* by saying these types are "primitive-LIKE". But what
is the quality that they and true primitive types have in common? It's "the
values are the instances", so this can either lead to "value type" or go
back to "inline/direct/immediate type".
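
A small sketch of that distinction as it exists in Java today, assuming a hypothetical mutable `Counter` class used only to make the difference observable. Copying a variable of primitive type copies the instance itself; copying a variable of reference type copies only the reference, so both variables see the same instance.

```java
class ValueVsReferenceDemo {
    // Hypothetical mutable class, present only so that aliasing is visible.
    static final class Counter {
        int count;
    }

    // Primitive type: the value is the instance, so the copy is independent.
    static int copyThenIncrementInt(int original) {
        int copy = original; // copies the value itself
        copy++;
        return original;     // the original is unaffected
    }

    // Reference type: the value is a reference, so both names share one instance.
    static int aliasThenIncrement() {
        Counter first = new Counter();
        Counter second = first; // copies the reference, not the instance
        second.count++;
        return first.count;     // the change is visible through 'first'
    }

    public static void main(String[] args) {
        System.out.println(copyThenIncrementInt(5)); // 5
        System.out.println(aliasThenIncrement());    // 1
    }
}
```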

At this moment I like both "value type" and "inline type" well enough.
Value *is* overloaded, to be sure, because of "value semantics" (aka why
AutoValue is called AutoValue). But the connection is strong enough imho. I
can delve deep into this topic if desired.

Then, back to your question, what is the name for a *class* that *also*
gives rise to a value type -- a "valuable class"?

>  - A term for what we've been calling the "ref" and "val" projections.
>

Note I think we should only invoke the concept of "projection" once we get
into type variables. Otherwise we simply have two types for one class. (And
the reason for that is very solid / easy to defend, just by appealing to
how we would have preferred int and Integer to work.) I would just call them
the reference type and the (name debated just above) type, simple as that.
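
As a rough illustration of "two types for one class", the existing int/Integer pair can stand in for the idea. This is only an analogy in current Java, not proposed syntax, and the class and method names are hypothetical. The reference type admits null and defaults to it, while the other type has no null and defaults to its zero.

```java
class TwoTypesDemo {
    // The non-reference type: array elements start at the zero value.
    static int defaultIntElement() {
        int[] values = new int[1];
        return values[0];
    }

    // The reference type: array elements start as null.
    static Integer defaultIntegerElement() {
        Integer[] boxes = new Integer[1];
        return boxes[0];
    }

    public static void main(String[] args) {
        System.out.println(defaultIntElement());     // 0
        System.out.println(defaultIntegerElement()); // null
    }
}
```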

> Let's start with _terms_, not _declaration syntax_.
>

Yes, and even the term we like 2nd best for a thing can still be useful in
the documentation of that thing.

-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kev...@google.com
