Container templates and constancy of elements

2012-03-14 Thread Stewart Gordon
With some containers, such as lists and trees (binary search trees aside), it doesn't 
matter if the elements can change state, since the data structure remains well-formed.


However, with others, such as sets and AAs, it's desirable if the keys don't mutate.  Any 
operations on them won't work right if there are keys in the wrong hash slots or out of 
order (depending on the underlying data structure) because they have changed since being 
put into the container.  Of course, this doesn't apply to _values_ in an AA, since these 
can safely mutate.


In Java and D1, it seems you just have to go on trust if you want your container to accept 
key types other than strict value types.  But can we do better in D2 - short of forcing 
the key type to be immutable, which would hinder the container's usefulness?


But it seems D2 has taken one step in that general direction by automatically 
tail-consting the key type of an AA.  But it doesn't stop modifications to the key through 
mutable references to the data.  And if the key is of class type, you can still modify the 
object, since D2 conflates constancy of an object reference with constancy of the object 
itself.  This is in itself silly.  Really, D should have tail-const built in for stuff 
like this.  OK, so there's Rebindable, but I've found it to be a PITA when trying to do 
generic programming with it, as well as leading to this AA key anomaly because it isn't a 
built-in feature.



I suppose there are a few possibilities for what constancy to apply to elements of a data 
structure to which changes might affect the structure's integrity:


(a) force the type to be tail-const if it's an array, otherwise don't add any 
constancy
(b) force the type to be tail-const if it's an array, or fully const if it's a 
class
(c) force the type to be tail-immutable if it's an array, otherwise don't add 
any constancy
(d) force the type to be tail-immutable if it's an array, or fully immutable if 
it's a class
(e) don't add any constancy, but just rely on trust


Currently, AAs implement (a).  (d) guarantees that changes to the data that mess up the 
data structure will never happen, at least if the programmer doesn't bypass the const 
system.  (e) is the route D1 and Java are forced to take.  Am I right in thinking that 
sets and maps in the C++ STL take the same path?



Anyway, what are people's thoughts on the right way to go here?

Stewart.


Re: Container templates and constancy of elements

2012-03-14 Thread H. S. Teoh
On Wed, Mar 14, 2012 at 06:51:54PM +, Stewart Gordon wrote:
> With some containers, such as lists and trees (binary search trees
> aside), it doesn't matter if the elements can change state, since
> the data structure remains well-formed.
> 
> However, with others, such as sets and AAs, it's desirable if the
> keys don't mutate.  Any operations on them won't work right if there
> are keys in the wrong hash slots or out of order (depending on the
> underlying data structure) because they have changed since being put
> into the container.  Of course, this doesn't apply to _values_ in an
> AA, since these can safely mutate.

IMO, AA keys should be *implicitly* immutable. That is, when you declare
something like:

int[ubyte[]] x;

then the key type of x should be immutable(ubyte)[], not ubyte[].

Otherwise, it just doesn't make any sense, and causes several of the
AA-related bugs currently on the bugtracker.


> In Java and D1, it seems you just have to go on trust if you want
> your container to accept key types other than strict value types.
> But can we do better in D2 - short of forcing the key type to be
> immutable, which would hinder the container's usefulness?
> 
> But it seems D2 has taken one step in that general direction by
> automatically tail-consting the key type of an AA.  But it doesn't
> stop modifications to the key through mutable references to the
> data.

Exactly. AA keys must be immutable, no matter what. Of course, to
interact nicely with existing code, methods like .opIndex or .get should
also accept mutable keys for lookup purposes, and .idup them when a new
entry needs to be created.


> And if the key is of class type, you can still modify the object,
> since D2 conflates constancy of an object reference with constancy of
> the object itself.  This is in itself silly.  Really, D should have
> tail-const built in for stuff like this.

This is a major PITA. Especially if you're trying to be const-correct in
your code, then it leads to nonsense like being unable to traverse a
linked list inside a const method, because the iteration pointer itself
is const (whereas it's the *data* it's pointing to that's const) so you
can't overwrite the loop variable. However, you *can* recursively search
the list.

I find this to be a major WAT in D.


[...]
> I suppose there are a few possibilities for what constancy to apply
> to elements of a data structure to which changes might affect the
> structure's integrity:
> 
> (a) force the type to be tail-const if it's an array, otherwise don't add any 
> constancy
> (b) force the type to be tail-const if it's an array, or fully const if it's 
> a class
> (c) force the type to be tail-immutable if it's an array, otherwise don't add 
> any constancy
> (d) force the type to be tail-immutable if it's an array, or fully immutable 
> if it's a class
> (e) don't add any constancy, but just rely on trust
> 
> 
> Currently, AAs implement (a).

Which is prone to bugs.


> (d) guarantees that changes to the data that mess up the data
> structure will never happen, at least if the programmer doesn't bypass
> the const system.
[...]

I believe this is the best way to go. Well, at least for AA's this is
needed. Otherwise there will always be the possibility that AA's will
malfunction when the key changes behind the container's back.


T

-- 
Just because you survived after you did it, doesn't mean it wasn't stupid!


Re: Container templates and constancy of elements

2012-03-14 Thread Ali Çehreli

On 03/14/2012 12:24 PM, H. S. Teoh wrote:
> On Wed, Mar 14, 2012 at 06:51:54PM +, Stewart Gordon wrote:
>> With some containers, such as lists and trees (binary search trees
>> aside), it doesn't matter if the elements can change state, since
>> the data structure remains well-formed.
>>
>> However, with others, such as sets and AAs, it's desirable if the
>> keys don't mutate.  Any operations on them won't work right if there
>> are keys in the wrong hash slots or out of order (depending on the
>> underlying data structure) because they have changed since being put
>> into the container.  Of course, this doesn't apply to _values_ in an
>> AA, since these can safely mutate.
>
> IMO, AA keys should be *implicitly* immutable. That is, when you declare
> something like:
>
>int[ubyte[]] x;
>
> then the key type of x should be immutable(ubyte)[], not ubyte[].

Yes, the internally stored copy of the key should be immutable(ubyte[]). 
Please note the difference from yours, but I guess that extra protection 
is just to protect the developers from themselves by avoiding mutating 
the key in the implementation.


But the conceptual key type of the AA should be const(ubyte[]). The 
reason is, a const parameter is welcoming: It says "I accept both 
mutable and immutable as arguments". On the other hand, an immutable 
parameter is restricting: It says "I demand only immutable as argument".


The following opIndex accepts  keys of five different types of immutability:

class SingleElementAA
{
immutable(uint[]) key;
int value;

ref int opIndex(const(uint[]) key)
{
return value;
}
}

void main()
{
auto aa = new SingleElementAA();

uint[] mutable;
immutable(uint)[] element_immutable;
immutable(uint[]) all_immutable;
enum uint[] enum_immutable = [ 1 ];

aa[mutable] = 42;
aa[element_immutable] = 43;
aa[all_immutable] = 44;
aa[enum_immutable] = 45;
aa[[0, 1]] = 42;  // literal key
}

Ali



Re: Container templates and constancy of elements

2012-03-14 Thread Steven Schveighoffer
On Wed, 14 Mar 2012 14:51:54 -0400, Stewart Gordon   
wrote:


With some containers, such as lists and trees (binary search trees  
aside), it doesn't matter if the elements can change state, since the  
data structure remains well-formed.


However, with others, such as sets and AAs, it's desirable if the keys  
don't mutate.  Any operations on them won't work right if there are keys  
in the wrong hash slots or out of order (depending on the underlying  
data structure) because they have changed since being put into the  
container.  Of course, this doesn't apply to _values_ in an AA, since  
these can safely mutate.


In Java and D1, it seems you just have to go on trust if you want your  
container to accept key types other than strict value types.  But can we  
do better in D2 - short of forcing the key type to be immutable, which  
would hinder the container's usefulness?


But it seems D2 has taken one step in that general direction by  
automatically tail-consting the key type of an AA.  But it doesn't stop  
modifications to the key through mutable references to the data.  And if  
the key is of class type, you can still modify the object, since D2  
conflates constancy of an object reference with constancy of the object  
itself.  This is in itself silly.  Really, D should have tail-const  
built in for stuff like this.  OK, so there's Rebindable, but I've found  
it to be a PITA when trying to do generic programming with it, as well  
as leading to this AA key anomaly because it isn't a built-in feature.



I suppose there are a few possibilities for what constancy to apply to  
elements of a data structure to which changes might affect the  
structure's integrity:


(a) force the type to be tail-const if it's an array, otherwise don't  
add any constancy
(b) force the type to be tail-const if it's an array, or fully const if  
it's a class
(c) force the type to be tail-immutable if it's an array, otherwise  
don't add any constancy
(d) force the type to be tail-immutable if it's an array, or fully  
immutable if it's a class

(e) don't add any constancy, but just rely on trust


Currently, AAs implement (a).  (d) guarantees that changes to the data  
that mess up the data structure will never happen, at least if the  
programmer doesn't bypass the const system.  (e) is the route D1 and  
Java are forced to take.  Am I right in thinking that sets and maps in  
the C++ STL take the same path?



Anyway, what are people's thoughts on the right way to go here?


IMO, @safe code should only allow d, @system/trusted should allow e.

And the compiler shouldn't be "helping" you by adding const annotations.   
That is:


int[char[]] aa;

this line should either compile, and the type should be int[char[]] aa, or  
it should not compile.


D is supposed to be able to do "at your own risk" bare-metal types of  
things.  This seems like it unnecessarily limits code.


-Steve


Re: Container templates and constancy of elements

2012-03-14 Thread Stewart Gordon

On 14/03/2012 19:24, H. S. Teoh wrote:


Exactly. AA keys must be immutable, no matter what. Of course, to
interact nicely with existing code, methods like .opIndex or .get should
also accept mutable keys for lookup purposes, and .idup them when a new
entry needs to be created.


This would rely on class authors making sure they define .idup.  We would also need a deep 
version of .idup for array-of-array and array-of-class types.




(d) guarantees that changes to the data that mess up the data
structure will never happen, at least if the programmer doesn't bypass
the const system.

[...]

I believe this is the best way to go. Well, at least for AA's this is
needed. Otherwise there will always be the possibility that AA's will
malfunction when the key changes behind the container's back.


Thinking about it, if we're going to go this far, maybe we could just make the key type 
fully immutable whatever it is.  This would enable the key variable in a foreach to be ref 
for efficiency (especially useful if it's a struct type) and still prevent changing of the 
key through it.


But this precludes implementing hash slots as arrays, at least if you want to be able to 
delete elements.  I might have to rethink my strategy here as well.  A linked list would 
get around it but increase the memory allocation overhead - not sure if this is worth 
worrying about.


Stewart.