Re: return of copies vs references

Larry Wall Wed, 16 Mar 2005 23:49:10 -0800

On Wed, Mar 16, 2005 at 09:49:47PM -0800, Darren Duncan wrote:
: I need some clarification on the semantics of subroutine or method 
: return statements, regarding whether copies or references are 
: returned.  It will help me in my p6ification of p5 code.
: 
: Say I had a class with 3 private attributes, named [$:foo, @:bar, 
: %:baz], and I was making an explicit accessor for returning the full 
: values of each.


I am assuming you're talking about read-only accessors, not rw accessors.

: Take these 3 example method statements:
: 
:   return $self.foo;
:   return $self.bar;
:   return $self.baz;

Those would have to be:

    return $self.:foo;
    return $self.:bar;
    return $self.:baz;

or

    return $:foo;
    return @:bar;
    return %:baz;

: For each of the above cases, is a copy of or a reference to the 
: attribute returned?

Perl 5 always makes a copy of return values, but that just turns
out to not matter for references, since a copy of a reference is as
good as the original reference.  Perl 5 also propagates scalar/list
context into subs.  For $:foo it doesn't matter--it always behaves
as a scalar value even in list context.  In list context, @:bar and
%:baz should probably return copies of their values much like they
do in Perl 5 (or more likely, some kind of lazy COW reference that
can lazily interpolate into the surrounding lazy list context).
Whether $self.:bar and $self.:baz should behave the same is an
interesting question.  They *look* scalar, so maybe they should imply
reference return, and you'd have to say

    return $self.:bar[];
    return $self.:baz{};

to get the equivalent of

    return @:bar;
    return %:baz;

But bare

    return $self.:bar;
    return $self.:baz;

would be equivalent to:

    return \@:bar;
    return \%:baz;

But I could argue it the other way too.
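
To make the Perl 5 baseline mentioned above concrete, here is a minimal
sketch.  The sub names bar_copy and bar_ref are made up for illustration;
nothing here is proposed Perl 6 syntax.

```perl
use strict;
use warnings;

my @bar = (1, 2, 3);

# Hypothetical accessor: in list context the caller receives copies.
sub bar_copy { return @bar }

# Hypothetical accessor: hands out a hard reference to the real array.
sub bar_ref  { return \@bar }

my @copy = bar_copy();
$copy[0] = 99;             # changes only the caller's copy
print "@bar\n";            # 1 2 3

my $ref = bar_ref();
$ref->[0] = 99;            # writes through to the original array
print "@bar\n";            # 99 2 3
```

The reference itself is copied on return too, but the copy still points
at the same array, which is why the second write is visible.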

: For each, will the calling code be able to 
: modify $obj's attributes by modifying the return values, or not?

The caller can modify the value only if an explicit ref is returned (or
the accessor is marked "rw").

Where we seem to differ from Perl 5 is that in scalar context, a bare
array or hash automatically enreferences itself rather than returning
some kind of size.  So in scalar context, it would seem that

    return @:bar;
    return %:baz;

and

    return $self.:bar;
    return $self.:baz;

are equivalent to:

    return \@:bar;
    return \%:baz;

(Again, $:foo is never a problem unless it's already a reference.)
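
For contrast, this is roughly what the Perl 5 behavior being departed
from looks like.  It is only a sketch, and the accessor name bar is an
assumption for the example.

```perl
use strict;
use warnings;

my @bar = (10, 20, 30);

sub bar { return @bar }    # hypothetical accessor

my $n    = bar();          # scalar context: the element count
my @list = bar();          # list context: a copy of the elements
my $ref  = \@bar;          # in Perl 5 a reference must be taken explicitly

print "$n\n";              # 3
print scalar(@list), "\n"; # 3
print ref($ref), "\n";     # ARRAY
```

Under the interpretation discussed here, the first call would instead
yield something like $ref without the caller asking for it.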

So the issue is whether this interpretation will encourage people to
accidentally return references to things they didn't want to give write
access to.  On the other hand, making the private methods context
sensitive doesn't actually seem to fix this particular problem, but just
pushes it down one level into the implicit accessor.  Maybe we need to
work something up where references returned from read-only accessors are
always COW references.  If we assume that [...] is lazy when it can be,
then that would be saying that scalar context forces

    return @:bar;

to mean

    return [@:bar];

and you'd have to write an explicit

    return \@:bar;

to get around that.  But that seems kind of hacky and special-casey.

On the other hand, there are going to be strong cultural forces
discouraging people from writing such accessors in the first place,
so maybe we just go ahead and let people return hard refs in scalar
context on the assumption they know what they're doing.  I suspect
that most actual accessors to arrays and hashes will just look like
ordinary getter and setter methods with extra args for subscripts, or
will return an explicit proxy if they want to behave like an lvalue.
And in either of those cases, you don't try to return the entire
array or hash.  So maybe we should settle for the clean but slightly
dangerous semantics here.
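
As a rough Perl 5 sketch of the "getter and setter with extra args for
subscripts" style, assuming a made-up class Thing with a made-up array
attribute bar:

```perl
package Thing;             # hypothetical class, for illustration only
use strict;
use warnings;

sub new {
    my ($class, @items) = @_;
    return bless { bar => [@items] }, $class;
}

# Getter: takes the subscript and returns a single element.
sub bar {
    my ($self, $i) = @_;
    return $self->{bar}[$i];
}

# Setter: takes the subscript and the new value.
sub set_bar {
    my ($self, $i, $value) = @_;
    $self->{bar}[$i] = $value;
    return $self;
}

package main;

my $obj = Thing->new(1, 2, 3);
$obj->set_bar(1, 42);
print $obj->bar(1), "\n";  # 42
```

Neither method ever hands the whole array to the caller, so the question
of copy versus reference does not come up.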

Except that we've defined default read-only accessors that would,
under the "clean" rules, give people automatic access to arrays and
hashes if called in scalar context.  So I think we really only have
three options here for the public accessors:

    Don't autogenerate accessors at all for arrays and hashes.
    Generate array and hash accessors that refuse to work in scalar context.
    Generate array and hash accessors that autocopy in scalar context.

Of those three, the last seems the most friendly.
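
In Perl 5 terms, the third option could be approximated by hand with
wantarray.  This is a sketch under that assumption, not what a generated
accessor would look like; the class Thing and attribute bar are invented
for the example.

```perl
package Thing;             # hypothetical class, for illustration only
use strict;
use warnings;

sub new {
    my ($class, @items) = @_;
    return bless { bar => [@items] }, $class;
}

# Read-only accessor that copies in both contexts.
sub bar {
    my ($self) = @_;
    return wantarray
        ? @{ $self->{bar} }       # list context: copies of the elements
        : [ @{ $self->{bar} } ];  # scalar context: ref to a fresh copy
}

package main;

my $obj  = Thing->new(1, 2, 3);
my $copy = $obj->bar;      # scalar context
push @$copy, 4;            # modifies only the copy
my @now  = $obj->bar;      # list context
print scalar(@now), "\n";  # 3
```

The copy here is eager; a COW reference would defer the work until the
caller actually writes.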

: Going further, what is the exact syntax for each type of attribute to 
: specify whether a copy or a reference is returned?
: 
: In Perl 5, with the latter two, the difference was a "return $bar" vs 
: "return [EMAIL PROTECTED]" for reference vs copy.  I would like that Perl 6 
: is also at least as clearly disambiguated.

If we go "dwimmy" rather than "clean", and assume private array and
hash accessors always return refs, then these return refs from public
accessors:

    return \$self.:foo;         # in any context
    return $self.:bar;          # in any context
    return $self.:baz;          # in any context
    return \$:foo;              # in any context
    return \@:bar;              # in any context
    return \%:baz;              # in any context

and these return copies:

    return $self.:foo;          # in any context
    return $self.:bar[];        # in list context
    return $self.:baz{};        # in list context
    return $:foo;               # in any context
    return @:bar;               # in any context
    return %:baz;               # in any context

That's actually simpler than the tables I made up when I was believing in
the "clean" solution.  What it essentially boils down to is that return
from a public accessor always forces list context on its arguments even
if the accessor was called in scalar context (in which case a ref to
a COW list is stored in the scalar).

I'm wondering if we should just up and say that "return" always forces
list context on its arguments.  It would simplify some things, and
complicate others...

: Note that specifying this in the attribute definition isn't 
: appropriate, since an attribute could just as easily be an array of 
: arrays, or hash of hashes, and I am returning an inner array or hash 
: that I either do or don't want to be modifiable by calling code.

Yes, even with the return context hack, it'd only go one level down.
I think at some point we have to rely on people not to write stupid
accessors, and rely on culture to enforce that.  I really, really
want to avoid falling into C++ const hell.

Maybe we can define some kind of deepcow operator for the nested cases.
Or maybe there's some other solution I don't see yet, or that someone has
already told me about and I've forgotten...
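
The "one level down" problem is easy to show in Perl 5.  In this sketch
Storable's dclone stands in for the effect of a deep copy; it copies
eagerly and is not copy-on-write, so it is only an approximation of what
a deepcow operator might do.  The variable names are made up.

```perl
use strict;
use warnings;
use Storable qw(dclone);

my @matrix = ([1, 2], [3, 4]);

my @shallow = @matrix;        # copies the outer array only
$shallow[0][0] = 99;          # the inner arrays are still shared
print $matrix[0][0], "\n";    # 99

my $deep = dclone(\@matrix);  # eager deep copy of the whole structure
$deep->[1][0] = -1;           # does not touch the original
print $matrix[1][0], "\n";    # 3
```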

: Separate question, just to confirm, I assume that plain '=' always does a 
: copy?

Yes.  And with := it's mandatory for "is copy" parameters, optional
for "is constant" parameters, and prohibited for "is rw" parameters.

Hopefully if values come back COW from a routine, we can avoid a
duplicate copy on the assignment.

: Thank you for any clarification.

Well, I don't know if I clarified it that much, but you're welcome
to the mud too.  :-)

Larry
