Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

Rowan Tommins [IMSoP] Thu, 04 Apr 2024 15:29:52 -0700

On 03/04/2024 00:01, Ilija Tovilo wrote:

> Data classes are classes with a single additional
> zend_class_entry.ce_flags flag. So unless customized, they behave as
> classes. This way, we have the option to tweak any behavior we would
> like, but we don't need to.
>
> Of course, this will still require an analysis of what behavior we
> might want to tweak.

Regardless of the implementation, there are a lot of interactions we will want to consider; and we will have to keep considering new ones as we add to the language. For instance, the Property Hooks RFC would probably have needed a section on "Interaction with Data Classes".

On the other hand, maybe having two types of objects to consider each time is better than having to consider combinations of lots of small features.



On a practical note, a few things I've already thought of to consider:

- Can a data class have readonly properties (or be marked "readonly data class")? If so, how will they behave?

- Can you explicitly use the "clone" keyword with an instance of a data class? Does it make any difference?

- Tied into that: can you implement __clone(), and when will it be called?

- If you implement __set(), will copy-on-write be triggered before it's called?

- Can you implement __destruct()? Will it ever be called?

> Consider this example, which would work with the current approach:
>
> $shapes[0]->position->zero!();

I find this concise example confusing, and I think there are a few things to unpack here...



Firstly, there's putting a data object in an array:

$numbers = [ new Number(42) ];
$cow = $numbers;
$cow[0]->increment!();
assert($numbers !== $cow);

This is fairly clearly equivalent to this:

$numbers = [ 42 ];
$cow = $numbers;
$cow[0]++;
assert($numbers !== $cow);

CoW is triggered on the array for both, because ++ and ->increment!() are both clearly modifications.
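For contrast, here is a minimal sketch that runs on PHP as it exists today (8.0 or later), with no data-class syntax. The Number class and its increment() method are a hypothetical stand-in written as a normal class. It shows the integer case separating the array, and the normal-object case leaving both arrays pointing at the same object, which is the behaviour a data class would change:

```php
<?php
// Hypothetical stand-in: a normal class, not a data class.
final class Number
{
    public function __construct(public int $value) {}

    public function increment(): void
    {
        $this->value++;
    }
}

// Integers in an array: the write separates $cow from $numbers.
$numbers = [42];
$cow = $numbers;
$cow[0]++;
var_dump($numbers !== $cow);    // the arrays now differ

// Normal objects in an array: both arrays hold the same object handle,
// so the method call is visible through both arrays.
$objects = [new Number(42)];
$shared = $objects;
$shared[0]->increment();
var_dump($objects === $shared); // still identical
var_dump($objects[0]->value);   // the change is visible via $objects too
```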



Second, there's putting a data object into another data object:

$shape = new Shape(new Position(42,42));
$cow = $shape;
$cow->position->zero!();
assert($shape !== $cow);

This is slightly less obvious, because it presumably depends on the definition of Shape. Assuming Position is a data class:

- If Shape is a normal class, changing the value of $cow->position just happens in place, and the assertion fails

- If Shape is a readonly class (or position is a readonly property on a normal class), changing the value of $cow->position shouldn't be allowed, so this will presumably give an error

- If Shape is a data class, changing the value of $shape->position implies a "mutation" of $shape itself, so we get a separation before anything is modified, and the assertion passes

Unlike in the array case, this behaviour can't be resolved until you know the run-time type of $shape.
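The three cases can be approximated in current PHP (8.1 or later), as a rough sketch only. The assumptions: Position is written as a plain class, the zero!() call is emulated by assigning a fresh Position to the property, FrozenShape is a hypothetical name for the readonly variant, and the implicit separation a data class would perform is written out as an explicit clone:

```php
<?php
final class Position
{
    public function __construct(public int $x, public int $y) {}
}

final class Shape
{
    public function __construct(public Position $position) {}
}

// Hypothetical variant of Shape with a readonly property.
final class FrozenShape
{
    public function __construct(public readonly Position $position) {}
}

// Case 1, normal class: $cow and $shape are the same object,
// so the change happens in place and `$shape !== $cow` would fail.
$shape = new Shape(new Position(42, 42));
$cow = $shape;
$cow->position = new Position(0, 0);
var_dump($shape === $cow);

// Case 2, readonly property: replacing the value throws an Error.
$frozen = new FrozenShape(new Position(42, 42));
$rejected = false;
try {
    $frozen->position = new Position(0, 0);
} catch (Error $e) {
    $rejected = true;
}
var_dump($rejected);

// Case 3, stand-in for a data class: separate first, then modify.
$original = new Shape(new Position(42, 42));
$copy = clone $original;              // explicit here; implicit in the proposal
$copy->position = new Position(0, 0);
var_dump($original !== $copy);
var_dump($original->position->x);     // the original is untouched
```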



Now, back to your example:

$shapes = [ new Shape(new Position(42,42)) ];
$cow = $shapes;
$shapes[0]->position->zero!();
assert($cow !== $shapes);

This combines the two, meaning that now we can't know whether to separate the array until we know (at run-time) whether Shape is a normal class or a data class.

But once that is known, the whole of "->position->zero!()" is a modification to $shapes[0], so we need to separate $shapes.

> Without such a class-wide marker, you'll need to remember to add the
> special syntax exactly where applicable.
>
> $shapes![0]!->position!->zero();

The array access doesn't need any special marker, because there's no ambiguity. The ambiguous call is the reference to ->position: in your current proposal, this represents a modification *if Shape is a data class, and is itself being modified*. My suggestion (or really, thought experiment) was that it would represent a modification *if it has a ! in the call*.


So if Shape is a readonly class:

$shapes[0]->position->!zero();
// Error: attempting to modify readonly property Shape::$position

$shapes[0]->!position->!zero();
// OK; an optimised version of:
$shapes[0] = clone $shapes[0] with [
    'position' =>  (clone $shapes[0]->position with ['x'=>0,'y'=>0])
];

If ->! is only allowed if the RHS is either a readonly property or a mutating method, then this can be reasoned about statically: it will either error, or cause a CoW separation of $shapes. It also allows classes to mix aspects of "data class" and "normal class" behaviour, which might or might not be a good idea.
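For comparison, the unoptimised expansion can be approximated in PHP as it exists today (8.1 or later) using readonly properties and hand-written "wither" methods. This is only a sketch: zeroed() and withPosition() are hypothetical helper names, and the "clone ... with" form is replaced by constructing new instances:

```php
<?php
final class Position
{
    public function __construct(
        public readonly int $x,
        public readonly int $y,
    ) {}

    // Hypothetical helper: returns a zeroed copy instead of mutating.
    public function zeroed(): self
    {
        return new self(0, 0);
    }
}

final class Shape
{
    public function __construct(public readonly Position $position) {}

    // Hypothetical helper: returns a copy with a different position.
    public function withPosition(Position $position): self
    {
        return new self($position);
    }
}

$shapes = [new Shape(new Position(42, 42))];
$cow = $shapes;

// Replace the element rather than mutate it; writing to $shapes[0]
// is an ordinary array write, so $shapes is separated from $cow.
$shapes[0] = $shapes[0]->withPosition($shapes[0]->position->zeroed());

var_dump($cow[0]->position->x);    // old value, unchanged
var_dump($shapes[0]->position->x); // zeroed
```

Here the separation of $shapes is visible at the call site, because the array element is explicitly reassigned.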

This is mostly just a thought experiment, but I am a bit concerned that code like this is going to be confusingly ambiguous:


$item->shape->position->zero!();

What is going to be CoW cloned, and what is going to be modified in place? I can't actually know without knowing the definition behind both $item and $item->shape. It might even vary depending on input.



Regards,

--
Rowan Tommins
[IMSoP]
