=== and array-refs

David Green Tue, 15 Aug 2006 13:51:49 -0700

On 8/14/06, Smylers wrote:

David Green writes:

 I guess my problem is that [1,2] *feels* like it should === [1,2].
 You can explain that there's this mutable object stuff going on, and I
 can follow that (sort of...), but it seems like an implementation
 detail leaking out.

The currently defined behaviour seems intuitive to me, from a starting point of Perl 5.

But is Perl 5 the best place to start? It's something many of us are used to, but that doesn't mean it's the best solution conceptually, even if it was the most reasonable way to implement it in P5.

The reason I think it's an implementation wart is that an array -- thought of as a single, self-contained lump -- is different from a reference or pointer to some other variable. Old versions of Perl always eagerly exploded arrays, so there was no way to refer to an array as a whole; put two arrays together and P5 (or P4, etc.) thinks it's just one big array or list. Then when references were introduced, "array-refs" provided a way to encapsulate arrays so we could work with them as single lumps. It's not the most elegant solution, but being able to nest data structures at all was a tremendous benefit, and it was backwards-compatible.

P6 doesn't have to be that backwards-compatible -- it already isn't. P6 more naturally treats arrays as lumps; this may or may not be *implemented* using references as in P5, but it doesn't have to -- or at least, it doesn't have to *look* as though that's how it's doing it. Conceptually, an array consisting only of constant literals, like (1,2,3), isn't referring to anything, so it doesn't need to behave that way.

The difference between:
  my $new = [EMAIL PROTECTED];
and:
  my $new = [EMAIL PROTECTED];
is that the second one is a copy; square brackets always create a new anonymous array rather than merely referring to an existing one, and that's the same thing that's happening here. Think of square brackets as meaning something like Array->new and each one is obviously distinct.

I agree that [EMAIL PROTECTED] should be distinct from [EMAIL PROTECTED] -- in the former case, we're deliberately taking a reference to the @orig variable. What I don't like is that [EMAIL PROTECTED] is distinct from [EMAIL PROTECTED] -- sure, I'm doing something similar to Array->new(1,2) followed by another Array->new(1,2), but I still want them to be the same, just as I want Str->new("foo") to be the same as Str->new("foo"). They're just constants, they should compare equally regardless of how I created them. (And arrays should work a lot like strings, because at some conceptual level, a string is an array [of characters].)
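
To make the distinction concrete, here is a minimal Perl 5 sketch of the two kinds of "sameness" being argued about: comparing references (which container?) versus comparing contents (which values?). The helper same_contents is hypothetical, not a built-in, and it only handles flat arrays of plain scalars; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true when two array refs hold the same
# sequence of plain scalars (flat arrays only, compared as strings).
sub same_contents {
    my ($left, $right) = @_;
    return 0 unless @$left == @$right;          # lengths must match
    for my $i (0 .. $#$left) {
        return 0 unless $left->[$i] eq $right->[$i];
    }
    return 1;
}

my @orig  = (1, 2);
my $alias = \@orig;     # reference to the existing array
my $copy  = [@orig];    # new anonymous array holding the same elements

# Numeric == on references compares addresses, i.e. container identity.
my $alias_is_orig = ($alias == \@orig) ? 1 : 0;
my $copy_is_orig  = ($copy  == \@orig) ? 1 : 0;
my $copy_matches  = same_contents($copy, \@orig);

# Two literal array constructors: distinct containers, equal contents.
my $lit_same_ref      = ([1, 2] == [1, 2]) ? 1 : 0;
my $lit_same_contents = same_contents([1, 2], [1, 2]);

print "alias is orig:        $alias_is_orig\n";
print "copy is orig:         $copy_is_orig\n";
print "copy matches orig:    $copy_matches\n";
print "[1,2] same container: $lit_same_ref\n";
print "[1,2] same contents:  $lit_same_contents\n";
```

The alias compares identical to the original and the copy does not, which is the part I agree with. The last two lines are the part I'm complaining about: the literals have equal contents but are still reported as different things when only the container is compared.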

 > And I feel this way because [1,2] looks like it should be platonically
 unique.
I'd say that C< (1, 2) > looks like that. But C< [1, 2] > looks like it's its own thing that won't be equal to another one.

Except [1,2] can look like (1,2) in P6 because it automatically refs/derefs stuff so that things Just Work. That's good, because you shouldn't have to be referencing arrays yourself (hence my point above about an array conceptually being a single lump). But if we're going to hide the [implementational] distinction in some places, we should hide it everywhere.

Actually, my point isn't even about arrays per se; that's just the implementation/practical side of it. You can refer to a scalar constant too:

        perl -e 'print \1, \1'
        SCALAR(0x8104980)SCALAR(0x810806c)

They're different because the *references* are different, but I don't care about that. A reference to a constant value is kind of pointless, because the value isn't going to change. References to *variables* are useful, because you never know what value that variable might have, and refs give you a pointer to the current value of the variable at any time.
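
As a small Perl 5 sketch of that point (variable names are made up for illustration): the two references below are separate things to compare, but what they point at is the same constant value. Whether the two addresses differ is up to the implementation, so the script prints that result rather than assuming it.

```perl
use strict;
use warnings;

my $x = \1;   # reference to a literal 1
my $y = \1;   # another reference to a literal 1

# Numeric == on the references compares their addresses.
my $same_ref = ($x == $y) ? 1 : 0;

# Dereferencing first compares the values they point at.
my $same_value = ($$x == $$y) ? 1 : 0;

print "same reference: $same_ref\n";
print "same value:     $same_value\n";
```

The value comparison always comes out true; the reference comparison only tells you something about where the interpreter happened to put each constant.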

The fact that it's even possible to take a reference to a literal is kind of funny, really; but since in P5 you had to be explicit about (de)referencing, it didn't hurt, and you could maybe even find some cute ways to take advantage of it (such as an easy way to get unique IDs out of the str/numification of a ref?). P6 just lets you gloss over certain ref/deref distinctions that in a perfect world wouldn't have existed in the first place.

Leibniz's "identity of indiscernibles" is a perfectly practical principle to pursue in programming. Now [EMAIL PROTECTED] may be discernible from [EMAIL PROTECTED] or [1, @orig] from [1, @other], but \1 is completely the same as \1 in all ways -- all ways except for being able to get a representation of its memory location. And that's not anything about "1", that's a bit of metadata about the reference itself -- something that definitely is based on the implementation.

(I can imagine some other implementation where in a ridiculous attempt to optimise for minimal memory footprint, everything with a value of 1 points to the same address. When I say "$a=1; $a++", $a first points to 0x1234567, and when I increment it, I don't change the bits in that location, instead $a changes to point to address 0x3456789, where my unique 2 value is stored. Then the only way to differentiate \1 from \1 is to generate some arbitrary unique ID. Which would be silly.)

Anyway, I hope I'm making sense about why \1 !=== \1, etc. seems a bit unnatural to me.



-David
