Re: guidelines for parameter types

Ali Çehreli Tue, 18 Dec 2012 09:25:21 -0800

On 12/18/2012 04:51 AM, Dan wrote:
> On Tuesday, 18 December 2012 at 06:34:55 UTC, Ali Çehreli wrote:
>> I don't think this is well known at all. :) I have thought about these
>> myself and came up with some guidelines at http://ddili.org/ders/d.en
>
> Thanks - I will study it. I see that you have covered also in, out,
> inout, lazy, scope, and shared, so that should keep me busy for a while.


For convenience, here are the chapters and guidelines that are relevant:

1) Immutability:

  http://ddili.org/ders/d.en/const_and_immutable.html

Quoting:

* As a general rule, prefer immutable variables over mutable
  ones.

* Define constant values as enum if their values can be
  calculated at compile time. For example, the constant value of
  seconds per minute can be an enum:

        enum int secondsPerMinute = 60;

* There is no need to specify the type explicitly if it can be
  inferred from the right hand side:

        enum secondsPerMinute = 60;

* Consider the hidden cost of enum arrays and enum associative
  arrays. Define them as immutable variables if the arrays are
  large and they are used more than once in the program.

* Specify variables as immutable if their values will never
  change but cannot be known at compile time. Again, the type can
  be inferred:

        immutable guess = read_int("What is your guess");

* If a function does not modify a parameter, specify that
  parameter as const. This would allow both mutable and immutable
  variables to be passed as arguments:

    void foo(const char[] s)
    {
        // ...
    }

    void main()
    {
        char[] mutableString;
        string immutableString;

        foo(mutableString);      // ← compiles
        foo(immutableString);    // ← compiles
    }

* Following from the previous guideline, consider that const
  parameters cannot be passed to functions taking immutable. See
  the section titled "Should a parameter be const or immutable?"
  above.

* If the function modifies a parameter, leave that parameter as
  mutable (const or immutable would not allow modifications
  anyway):

    import std.stdio;

    void reverse(dchar[] s)
    {
        foreach (i; 0 .. s.length / 2) {
            immutable temp = s[i];
            s[i] = s[$ - 1 - i];
            s[$ - 1 - i] = temp;
        }
    }

    void main()
    {
        dchar[] salutation = "hello"d.dup;
        reverse(salutation);
        writeln(salutation);
    }

    The output:

    olleh
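
To make the const-versus-immutable guideline above concrete, here is a
minimal sketch. The function names takesConst and takesImmutable are
made up for this illustration and do not come from the book:

```d
import std.stdio;

// Hypothetical helpers, named only for this illustration.
size_t takesImmutable(immutable(char)[] s)
{
    return s.length;
}

size_t takesConst(const(char)[] s)
{
    // A const slice may refer to mutable data, so it cannot be passed
    // directly to a function that demands immutable data.
    static assert(!__traits(compiles, takesImmutable(s)));

    // An explicit immutable copy satisfies the requirement.
    return takesImmutable(s.idup);
}

void main()
{
    char[] mutableString = "hello".dup;
    string immutableString = "world!";

    writeln(takesConst(mutableString));      // accepts mutable data
    writeln(takesConst(immutableString));    // accepts immutable data
}
```

The const parameter accepts both kinds of arguments, but it has to pay
for an .idup before it can call anything that requires immutable.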


2) const ref Parameters and const Member Functions:

  http://ddili.org/ders/d.en/const_member_functions.html

Quoting:

* To give the guarantee that a parameter is not modified by the
  function, mark that parameter as in, const, or const ref.

* Mark member functions that do not modify the object as const:

    struct TimeOfDay
    {
    // ...
        string toString() const
        {
            return format("%02s:%02s", hour, minute);
        }
    }

 This would make the struct (or class) more useful by removing an
 unnecessary limitation. The examples in the rest of the book
 will observe this guideline.
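
As a rough illustration of the limitation that const removes, here is a
self-contained sketch. The hour and minute members and the addMinute
function are assumptions added for this example:

```d
import std.format : format;
import std.stdio : writeln;

struct TimeOfDay
{
    int hour;
    int minute;

    // const: promises not to modify the object, so it can be
    // called on mutable, const, and immutable objects alike.
    string toString() const
    {
        return format("%02s:%02s", hour, minute);
    }

    // Not const: modifies the object, so it requires a mutable one.
    // (Illustration only; it ignores the hour rollover.)
    void addMinute()
    {
        minute = (minute + 1) % 60;
    }
}

void main()
{
    immutable lunch = TimeOfDay(12, 34);
    writeln(lunch.toString());

    // A non-const member function cannot be called on an immutable object.
    static assert(!__traits(compiles, lunch.addMinute()));
}
```

Without const on toString(), even printing an immutable TimeOfDay would
be rejected by the compiler.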


3) Constructor and Other Special Functions:

  http://ddili.org/ders/d.en/special_functions.html

Quoting:

Immutability of constructor parameters

  In the Immutability chapter we have seen that it is not easy to
  decide whether parameters of reference types should be defined
  as const or immutable. Although the same considerations apply
  for constructor parameters as well, immutable is usually a
  better choice for constructor parameters.

  The reason is, it is common to assign the parameters to members
  to be used at a later time. When a parameter is not immutable,
  there is no guarantee that the original variable will not
  change by the time the member gets used.

>> I don't know how practical it is but it would be nice if the price of
>> copying an object could be considered by the compiler, not by the
>> programmer.
>
> I agree - would be nice if compiler could do it but if it tried some
> would just not be happy about the choices, no matter what.
>
>>
>> According to D's philosophy structs don't have identities. If I pass a
>> struct by-value, the compiler should pick the fastest method.
>>
>
> Even if there is a postblit? Maybe that would work, but say your object
> were a reference counting type. If the compiler decided to pass by ref
> sneakily for performance gain when you think it is by value that might
> be a problem. Maybe not, though, as long as you know how it works. I
> have seen that literal structs passed to a function will not call the
> postblit - but Johnathan says this was a bug in the way the compiler
> classifies literals.

I am also keeping in mind that struct objects are supposed to be treated
as simple values without identities:


  http://dlang.org/struct.html

Quoting:

  A struct is defined to not have an identity; that is, the
  implementation is free to make bit copies of the struct as
  convenient.

>> That's sensible. (In practice though, it is rarely done in C++. For
>> example, if V is int and v is not intended to be modified, it is still
>> passed in as 'V v'.)
>>
>
> Absolutely. I read somewhere it was pedantic to do such things. Then I
> read some other articles that touted the benefit, even on an int,
> because the reader of (void foo(const int x) {...} ) knows x will/should
> not change, so it has clearer intentions for future maintainers.

Yeah. In C++, it is funny that all of my local variables are const as
much as possible, but all of the by-value parameters are left non-const.
I think part of the reason is that top-level const is seen as leaking an
implementation detail to the signature. It also has the potential to
confuse newer users.


>> That makes a difference whether V is a value type or not. (It is not
>> clear whether you mean V is a value type.) Otherwise, e.g.
>> immutable(char[]) v has a legitimate meaning: The function requires
>> that the caller provides immutable data.
>
> When is 'immutable(char[]) v' preferable to 'const(char[]) v'? If you
> select 'const(char[]) v' instead, your function will not mutate v and if
> it is generally a useful function it will even accept 'char[]' that *is*
> mutable. I agree with the meaning you suggest, but under what
> circumstances is it important to a function to know that v is immutable
> as opposed to simply const?

Yes, const(char)[] is more welcoming as you state. On the other hand,
immutable is a requirement on the user: The function demands immutable
data. This may be so if that string should be used later unchanged.
Imagine a constructor takes the file name as 'string' (i.e.
immutable(char)[]). Then the object is assured that the file name can be
used later and it will be the same as when the object has been constructed.

Assuming that the object (or a function) needs the string to not change
ever, let's enumerate the cases:

If the function signature is const(char)[], the function must make an
.idup of it because it cannot rely on the user not changing it.

If the function signature is immutable(char)[], then the function is
leaking out an implementation detail: It is communicating the fact to
the user, saying "I need an immutable string, if you have one, great; if
not, *you* make an immutable copy to give me." By that analysis, I see
'string' parameters as an optimization: Yes, immutable data is
needed. If the user already has it, the immutable copy is elided.
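
The two cases can be sketched side by side as follows. CopyingLogger and
DemandingLogger are hypothetical types invented for this illustration:

```d
import std.stdio;

// Hypothetical types for illustration; neither is from a real library.
struct CopyingLogger
{
    string fileName;

    // Welcoming signature: accepts mutable, const, and immutable data,
    // but must always copy because the caller may change the original.
    this(const(char)[] fileName)
    {
        this.fileName = fileName.idup;
    }
}

struct DemandingLogger
{
    string fileName;

    // Demanding signature: the caller must provide immutable data,
    // which can then be stored without a copy.
    this(string fileName)
    {
        this.fileName = fileName;
    }
}

void main()
{
    char[] name = "a.log".dup;

    auto copying = CopyingLogger(name);
    auto demanding = DemandingLogger(name.idup);   // the caller makes the copy

    name[0] = 'b';    // the caller changes the original later

    writeln(copying.fileName);      // still "a.log"
    writeln(demanding.fileName);    // still "a.log"

    string literal = "c.log";
    auto noCopy = DemandingLogger(literal);
    assert(noCopy.fileName.ptr is literal.ptr);    // no copy was made
}
```

Both objects are protected from the later modification; the difference
is only in who pays for the copy and whether it can be skipped.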

A solution that I have for the above is to make the function a template,
and use a 'static if' to decide whether the object was mutable, and make
an immutable copy if needed:


import std.stdio;

auto immutableOf(T)(T param)
{
    // The elements are already immutable: use the data as is.
    static if (is(typeof(param[0]) == immutable)) {
        return param;

    } else {
        // Mutable (or const) elements: make an immutable copy.
        writeln("Duplicating mutable " ~ T.stringof);
        return param.idup;
    }
}

void foo(T)(T s)
{
    immutable imm_s = immutableOf(s);
    writefln("s.ptr: %s, imm_s.ptr: %s", s.ptr, imm_s.ptr);
}

void main()
{
    char[] m = "hello".dup;
    immutable(char)[] s = "world";

    foo(m);
    foo(s);
}

The output shows that an immutable copy is made only when the user's
data was mutable to begin with:


Duplicating mutable char[]
s.ptr: 7F6E216E8FD0, imm_s.ptr: 7F6E216E8FC0
s.ptr: 482240, imm_s.ptr: 482240

The above works but obviously is very cumbersome.

There is a similar analysis for return value types: Why should I ever
return a string from a function that produces one? Why restrict my
users? I should return char[] so that they can further modify it if they
want to.

Later I learned that mutable return values of pure functions can be
automatically converted to immutable; so yes, it makes more sense to
return a mutable char[]:


char[] foo() pure     // <-- returns mutable
{
    char[] result;
    return result;
}

void main()
{
    char[] m = foo();  // <-- works
    string s = foo();  // <-- works
}

>> | ref immutable(V) v | No need - restrictive with no benefit|
>> | | over 'ref const(V) v' |
>>
>> I still has a different meaning: You must have an immutable V and I
>> need a reference to it. It may be that the identity of the object is
>> important and that the function would store a reference to it.
>>
>
> This may be a use-case for it. You want to store a reference to v and
> save it for later - so immutable is preferred over const. I may be
> mistaken but I thought the thread on 'rvalue references' talks about
> taking away the rights to take the address of any ref parameter:
> http://forum.dlang.org/post/4f863629.6000...@erdani.com

I am behind with my reading. I remember that thread but I must study it
again. :)


>> Again, if the function demands immutable(V), which may be null, then
>> it actually has some use.
>
> I agree - I just don't know yet when a function would demand
> 'immutable(V)' over 'const(V)'.

It makes sense only for by-reference I think. At the risk of repeating
myself, the function wants to store a file name to be used later.


>> | T t | T is primitive, dynamic array, or assoc |
>> | | array (i.e. cheap/shallow copies). For |
>> | | generic code no knowledge of COW or |
>> | | cheapness so prefer 'ref T t' |
>>
>> I am not sure about that last guideline. I think we should simply type
>> T and the compiler does its magic. I don't know how practical my hope is.
>>
>> Besides, we don't know whether T is primitive or not. It can be
>> anything. If T is int, 'ref T t' could actually be slower due to the
>> pointer indirection due to ref.
>
> Agreed. In a separate thread
> http://forum.dlang.org/thread/opufykfxwkkjchqcw...@forum.dlang.org I
> included some timings of passing a struct as 'in S', 'in ref S', and
> 'const ref S'. The very small sizes, matching up to sizes of primitives,
> showed litte if any benefit of by value over ref. Maybe the
> test/benchmark was flawed?

I must read that too. :)

I wonder whether the compiler applied optimizations and was able to keep
lots of stuff in registers. If the code is complex enough perhaps then
by-value may be faster. (?)


> But for big sizes, the by reference clearly
> won by a large margin. The problem with template code is you don't have
> any knowledge and the cost of 'by value' is unbounded, whereas
> difference between 'int t' and 'ref const(int) t' might be small.

Right. I hope others bring their experiences. We must understand these
details. :)
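
For template code, one existing tool worth keeping in mind is 'auto ref'.
Here is a minimal sketch; Big and passedByRef are hypothetical names made
up for this example:

```d
import std.stdio;

struct Big
{
    int[256] payload;
}

// Hypothetical function for illustration. With 'auto ref', each
// instantiation takes lvalue arguments by reference and rvalue
// arguments by value.
bool passedByRef(T)(auto ref const(T) t)
{
    return __traits(isRef, t);
}

void main()
{
    Big big;
    int i = 42;

    writeln(passedByRef(big));      // lvalue: by reference
    writeln(passedByRef(Big()));    // rvalue: by value
    writeln(passedByRef(i));        // lvalue: by reference, even for int
    writeln(passedByRef(42));       // rvalue: by value
}
```

Note that the choice is made by whether the argument is an lvalue, not
by how expensive the copy would be, so an int variable is still passed
by reference. It bounds the cost for large types but it is not the
"compiler picks the fastest method" that I am hoping for.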

I was fortunate enough to meet with deadalnix and Denis Koroskin last
week. I told deadalnix about this very topic and how important it is to
have a talk on this at DConf. He said he might be willing to give that
talk. (Unless of course you make your submission for DConf 2013 first. ;) )

Ali
