Re: Decision on container design

2011-02-02 Thread dsimcha
I see your point.  Some of the stuff I've posted lately (RandAA, 
CsvRange) is in a usable but admittedly rough state, and was written 
partially for myself and partially for Phobos.


If this came off as take-it-or-leave-it, this was unintended. These 
posts were supposed to be informal requests for comment.  My main goal 
in these posts was to gauge whether there's enough interest, whether the 
design was good enough that it's worth polishing up for Phobos, and 
whether I missed any key features.  If there was sufficient interest I 
had every intention of polishing/fleshing out these libraries.


On 2/1/2011 11:00 AM, Andrei Alexandrescu wrote:

On 1/29/11 3:36 PM, dsimcha wrote:

I've uploaded the documentation to
http://cis.jhu.edu/~dsimcha/randaasealed.html and mentioned it again on
the mailing list. The documentation is pretty sparse because
interface-wise it's just a standard hash table. More generally, though,
are we still interested in sealed/ref counted containers?


Sorry for being slow in continuing this thread.

Regarding the general issue that someone makes an informal proposal
(either here, as a DIP, or on the Phobos mailing list), followed by a
thundering silence: I believe that a good technique is to formalize the
proposal review process, which has been a homerun for Boost. The
disadvantage of that is that almost without exception this is very
taxing to library submitters. This means the submitter must put a lot of
thought and a lot of work into motivating, polishing, and documenting an
artifact without any guarantee that it would lead to inclusion in the
target library. I've seen very, VERY elaborate Boost submissions fail -
literally months of work gone to waste.

I'm not sure how to save people from doing work up front in hope of an
uncertain outcome in the future.

I do know what does _not_ work: the take it or leave it approach: Hey,
I have this code for abstraction XYZ that I extracted from a project of
mine and I think it may be of general interest. It's at
http://site.com/path/to/code.{d,html}. It needs polishing here and
there, it's largely undocumented, but I'm sure the ideas shine through.
Eh?

The doc at http://cis.jhu.edu/~dsimcha/randaasealed.html is somewhere in
between. It is clear you have a good understanding of sealing and hash
containers, but let me ask you this - if you wanted to sell this to
someone, what would you do? Probably you'd show some relevant benchmarks
putting the built-in hashes to shame. Maybe you'd have some good
examples - yes, we know it's a hash, but it doesn't hurt to see some
code pulp over there. Maybe you'd explain sealing and discuss its
relative advantages and disadvantages (which have not yet been
documented anywhere - a great opportunity). Maybe you'd even show some
numbers showing how sealing does as well/better/worse than reference
leaking.

This is a good amount of upfront work for little promise. Again, I don't
know yet how to optimize for minimizing that. What I did see works on
Boost is a request for interest in the form of a discussion (usually
with NO source, only USAGE examples) asking if there are people who are
interested in such a notion. In this particular case, you'd need numbers
to make a strong case which means that code must be already written. For
something like e.g. how about a Finite State Automaton library?
perhaps upfront code wouldn't be necessary for gauging interest.


Andrei





Re: Decision on container design

2011-02-02 Thread dsimcha
BTW, I thought the case for a sealed hash table with an optimized memory 
allocation scheme was self-evident to most heavy D users, given how the 
builtin hash table leaks memory/fragments the heap (I think it's a 
little of both) and destroys multithreaded performance.


On 2/1/2011 11:00 AM, Andrei Alexandrescu wrote:

On 1/29/11 3:36 PM, dsimcha wrote:

I've uploaded the documentation to
http://cis.jhu.edu/~dsimcha/randaasealed.html and mentioned it again on
the mailing list. The documentation is pretty sparse because
interface-wise it's just a standard hash table. More generally, though,
are we still interested in sealed/ref counted containers?


Sorry for being slow in continuing this thread.

Regarding the general issue that someone makes an informal proposal
(either here, as a DIP, or on the Phobos mailing list), followed by a
thundering silence: I believe that a good technique is to formalize the
proposal review process, which has been a homerun for Boost. The
disadvantage of that is that almost without exception this is very
taxing to library submitters. This means the submitter must put a lot of
thought and a lot of work into motivating, polishing, and documenting an
artifact without any guarantee that it would lead to inclusion in the
target library. I've seen very, VERY elaborate Boost submissions fail -
literally months of work gone to waste.

I'm not sure how to save people from doing work up front in hope of an
uncertain outcome in the future.

I do know what does _not_ work: the take it or leave it approach: Hey,
I have this code for abstraction XYZ that I extracted from a project of
mine and I think it may be of general interest. It's at
http://site.com/path/to/code.{d,html}. It needs polishing here and
there, it's largely undocumented, but I'm sure the ideas shine through.
Eh?

The doc at http://cis.jhu.edu/~dsimcha/randaasealed.html is somewhere in
between. It is clear you have a good understanding of sealing and hash
containers, but let me ask you this - if you wanted to sell this to
someone, what would you do? Probably you'd show some relevant benchmarks
putting the built-in hashes to shame. Maybe you'd have some good
examples - yes, we know it's a hash, but it doesn't hurt to see some
code pulp over there. Maybe you'd explain sealing and discuss its
relative advantages and disadvantages (which have not yet been
documented anywhere - a great opportunity). Maybe you'd even show some
numbers showing how sealing does as well/better/worse than reference
leaking.

This is a good amount of upfront work for little promise. Again, I don't
know yet how to optimize for minimizing that. What I did see works on
Boost is a request for interest in the form of a discussion (usually
with NO source, only USAGE examples) asking if there are people who are
interested in such a notion. In this particular case, you'd need numbers
to make a strong case which means that code must be already written. For
something like e.g. how about a Finite State Automaton library?
perhaps upfront code wouldn't be necessary for gauging interest.


Andrei





Re: Decision on container design

2011-02-02 Thread Andrei Alexandrescu

On 2/2/11 8:56 AM, dsimcha wrote:

BTW, I thought the case for a sealed hash table with an optimized memory
allocation scheme was self-evident to most heavy D users, given how the
builtin hash table leaks memory/fragments the heap (I think it's a
little of both) and destroys multithreaded performance.


Perhaps a very good argument could be made with some benchmarks that 
show how your table is better than others in casual use and in heavy use.


Andrei


Re: Decision on container design

2011-02-01 Thread Andrei Alexandrescu

On 1/29/11 5:01 AM, Simen kjaeraas wrote:

Tomek Sowiński j...@ask.me wrote:


Michel Fortin napisał:


 Is there anything implementation specific in the outer struct that
provides
 ref semantics to Impl? If not, Container could be generic,
parametrized by
 Impl type.

You could provide an implementation-specific version of some functions
as an optimization. For instance there is no need to create the Impl
when asking for the length, if the pointer is null, length is zero.
Typically, const function can be implemented in the outward container
with a shortcut checking for null.


I think the reference struct can still be orthogonal to the container.

struct Ref(Impl)
{
private Impl* _impl;
ref Impl impl() @property
{
if (!impl) impl = new Impl;
return *impl;
}

static if (hasLength!Impl)
{
auto length() @property
{
return impl ? impl.length : 0;
}
}

alias impl this;
}


Now, other functions may also exploit such a shortcut. How do you plan
to implement that?

Actually, by adopting another idiom, it can be done:

struct Ref(Impl) {
private Impl* _impl;
@property ref Impl impl() {
return *(_impl = (_impl ? _impl : new Impl));
}
alias impl this;

auto opDispatch(string name, T...)(T args) {
mixin(return impl.~name~(_impl,args););
}
}

struct ExampleImpl {
static int length(ExampleImpl* that) {
return that ? *that.actualLength : 0;
}
int actualLength( ) {
return 42;
}
}

Of course, I did not say it was a good idiom. :p


I do something similar with RefCounted. There are problems - you need to 
know in advance which functions you can implement on a null container 
(empty and length are obvious candidates, but there could be others).


Andrei



Re: Decision on container design

2011-02-01 Thread Andrei Alexandrescu

On 1/29/11 3:36 PM, dsimcha wrote:

I've uploaded the documentation to
http://cis.jhu.edu/~dsimcha/randaasealed.html and mentioned it again on
the mailing list. The documentation is pretty sparse because
interface-wise it's just a standard hash table. More generally, though,
are we still interested in sealed/ref counted containers?


Sorry for being slow in continuing this thread.

Regarding the general issue that someone makes an informal proposal 
(either here, as a DIP, or on the Phobos mailing list), followed by a 
thundering silence: I believe that a good technique is to formalize the 
proposal review process, which has been a homerun for Boost. The 
disadvantage of that is that almost without exception this is very 
taxing to library submitters. This means the submitter must put a lot of 
thought and a lot of work into motivating, polishing, and documenting an 
artifact without any guarantee that it would lead to inclusion in the 
target library. I've seen very, VERY elaborate Boost submissions fail - 
literally months of work gone to waste.


I'm not sure how to save people from doing work up front in hope of an 
uncertain outcome in the future.


I do know what does _not_ work: the take it or leave it approach: Hey, 
I have this code for abstraction XYZ that I extracted from a project of 
mine and I think it may be of general interest. It's at 
http://site.com/path/to/code.{d,html}. It needs polishing here and 
there, it's largely undocumented, but I'm sure the ideas shine through. 
Eh?


The doc at http://cis.jhu.edu/~dsimcha/randaasealed.html is somewhere in 
between. It is clear you have a good understanding of sealing and hash 
containers, but let me ask you this - if you wanted to sell this to 
someone, what would you do? Probably you'd show some relevant benchmarks 
putting the built-in hashes to shame. Maybe you'd have some good 
examples - yes, we know it's a hash, but it doesn't hurt to see some 
code pulp over there. Maybe you'd explain sealing and discuss its 
relative advantages and disadvantages (which have not yet been 
documented anywhere - a great opportunity). Maybe you'd even show some 
numbers showing how sealing does as well/better/worse than reference 
leaking.


This is a good amount of upfront work for little promise. Again, I don't 
know yet how to optimize for minimizing that. What I did see works on 
Boost is a request for interest in the form of a discussion (usually 
with NO source, only USAGE examples) asking if there are people who are 
interested in such a notion. In this particular case, you'd need numbers 
to make a strong case which means that code must be already written. For 
something like e.g. how about a Finite State Automaton library? 
perhaps upfront code wouldn't be necessary for gauging interest.



Andrei



Re: Decision on container design

2011-02-01 Thread Andrei Alexandrescu

On 1/28/11 8:12 PM, Michel Fortin wrote:

On 2011-01-28 20:10:06 -0500, Denis Koroskin 2kor...@gmail.com said:


Unfortunately, this design has big issues:


void fill(Appender appender)
{
appender.put(hello);
appender.put(world);
}

void test()
{
Appenderstring appender;
fill(appender); // Appender is supposed to have reference semantics
assert(appender.length != 0); // fails!
}

Asserting above fails because at the time you pass appender object to
the fill method it isn't initialized yet (lazy initialization). As
such, a null is passed, creating an instance at first appending, but
the result isn't seen to the caller.


That's indeed a problem. I don't think it's a fatal flaw however, given
that the idiom already exists in AAs.

That said, the nice thing about my proposal is that you can easily reuse
the Impl to create a new container to build a new container wrapper with
the semantics you like with no loss of efficiency.

As for the case of Appender... personally in the case above I'd be
tempted to use Appender.Impl directly (value semantics) and make fill
take a 'ref'. There's no point in having an extra heap allocation,
especially if you're calling test() in a loop or if there's a good
chance fill() has nothing to append to it.

That's the issue with containers. The optimal semantics always change
depending on the use case.


Yep, yep, I found myself wrestling with the same issues. All good 
points. On one hand containers are a target for optimization because 
many will use them. On the other hand you'd want to have reasonably 
simple and idiomatic code in the container implementation because you 
want people to understand them easily and also to write their own. I 
thought for a while of a layered approach in which you'd have both the 
value and the sealed reference version of a container... it's just too 
much aggravation.



An explicit initialization is needed to work around this design issue.
The worst thing is that in many cases it would work fine (you might
have already initialized it indirectly) but sometimes you get
unexpected result. I got hit by this in past, and it wasn't easy to
trace down.

As such, I strongly believe containers either need to have copy
semantics, or be classes. However, copy semantics contradicts with the
cheap copy ctor idiom because you need to copy all the elements from
source container.


Personally, I'm really concerned by the case where you have a container
of containers. Class semantics make things really complicated as you
always have to initialize everything in the container explicitly; value
semantics makes things semantically easier but quite inefficient as
moving elements inside of the outermost container implies copying the
containers. Making containers auto-initialize themselves on first use
solves the case where containers are references-types; making containers
capable of using move semantics solves the problem for value-type
containers.


Neither values nor references are perfect indeed. For example, someone 
mentioned, hey, in STL I write set vectordouble  and it Just 
Works(tm). On the other hand, if you swap the two names it still seems 
to work but it's awfully inefficient (something that may trip even 
experienced developers).



Andrei


Re: Decision on container design

2011-02-01 Thread Michel Fortin
On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu 
seewebsiteforem...@erdani.org said:



On 1/28/11 8:12 PM, Michel Fortin wrote:

On 2011-01-28 20:10:06 -0500, Denis Koroskin 2kor...@gmail.com said:


Unfortunately, this design has big issues:


void fill(Appender appender)
{
appender.put(hello);
appender.put(world);
}

void test()
{
Appenderstring appender;
fill(appender); // Appender is supposed to have reference semantics
assert(appender.length != 0); // fails!
}

Asserting above fails because at the time you pass appender object to
the fill method it isn't initialized yet (lazy initialization). As
such, a null is passed, creating an instance at first appending, but
the result isn't seen to the caller.


That's indeed a problem. I don't think it's a fatal flaw however, given
that the idiom already exists in AAs.

That said, the nice thing about my proposal is that you can easily reuse
the Impl to create a new container to build a new container wrapper with
the semantics you like with no loss of efficiency.

As for the case of Appender... personally in the case above I'd be
tempted to use Appender.Impl directly (value semantics) and make fill
take a 'ref'. There's no point in having an extra heap allocation,
especially if you're calling test() in a loop or if there's a good
chance fill() has nothing to append to it.

That's the issue with containers. The optimal semantics always change
depending on the use case.


Yep, yep, I found myself wrestling with the same issues. All good 
points. On one hand containers are a target for optimization because 
many will use them. On the other hand you'd want to have reasonably 
simple and idiomatic code in the container implementation because you 
want people to understand them easily and also to write their own. I 
thought for a while of a layered approach in which you'd have both the 
value and the sealed reference version of a container... it's just too 
much aggravation.


But are you not just pushing the aggravation elsewhere? If I need a by 
value container for some reason (performance or semantics) I'll have to 
write my own, and likely others will write their own too.


Using classes for containers is just marginally better than making them 
by-value structs: you can use 'new' with a by-value struct if you want 
it to behave as a class-like by-reference container:


struct Container {
...
}

auto c = new Container();

The only noticeable difference from a class container is that now c is 
now a Container*.




Personally, I'm really concerned by the case where you have a container
of containers. Class semantics make things really complicated as you
always have to initialize everything in the container explicitly; value
semantics makes things semantically easier but quite inefficient as
moving elements inside of the outermost container implies copying the
containers. Making containers auto-initialize themselves on first use
solves the case where containers are references-types; making containers
capable of using move semantics solves the problem for value-type
containers.


Neither values nor references are perfect indeed. For example, someone 
mentioned, hey, in STL I write set vectordouble  and it Just 
Works(tm). On the other hand, if you swap the two names it still seems 
to work but it's awfully inefficient (something that may trip even 
experienced developers).


Isn't that solved by C++0x, using move semantics in swap?

--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/



Re: Decision on container design

2011-02-01 Thread Andrei Alexandrescu

On 2/1/11 10:44 AM, Michel Fortin wrote:

On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu
seewebsiteforem...@erdani.org said:


On 1/28/11 8:12 PM, Michel Fortin wrote:

On 2011-01-28 20:10:06 -0500, Denis Koroskin 2kor...@gmail.com said:


Unfortunately, this design has big issues:


void fill(Appender appender)
{
appender.put(hello);
appender.put(world);
}

void test()
{
Appenderstring appender;
fill(appender); // Appender is supposed to have reference semantics
assert(appender.length != 0); // fails!
}

Asserting above fails because at the time you pass appender object to
the fill method it isn't initialized yet (lazy initialization). As
such, a null is passed, creating an instance at first appending, but
the result isn't seen to the caller.


That's indeed a problem. I don't think it's a fatal flaw however, given
that the idiom already exists in AAs.

That said, the nice thing about my proposal is that you can easily reuse
the Impl to create a new container to build a new container wrapper with
the semantics you like with no loss of efficiency.

As for the case of Appender... personally in the case above I'd be
tempted to use Appender.Impl directly (value semantics) and make fill
take a 'ref'. There's no point in having an extra heap allocation,
especially if you're calling test() in a loop or if there's a good
chance fill() has nothing to append to it.

That's the issue with containers. The optimal semantics always change
depending on the use case.


Yep, yep, I found myself wrestling with the same issues. All good
points. On one hand containers are a target for optimization because
many will use them. On the other hand you'd want to have reasonably
simple and idiomatic code in the container implementation because you
want people to understand them easily and also to write their own. I
thought for a while of a layered approach in which you'd have both the
value and the sealed reference version of a container... it's just too
much aggravation.


But are you not just pushing the aggravation elsewhere? If I need a by
value container for some reason (performance or semantics) I'll have to
write my own, and likely others will write their own too.


If semantics are the primary concern, you could (and in fact Phobos 
could) provide a Value!C template that automatically calls dup in 
this(this) etc.


For performance I agree there is stuff that class containers leave on 
the table.



Using classes for containers is just marginally better than making them
by-value structs: you can use 'new' with a by-value struct if you want
it to behave as a class-like by-reference container:

struct Container {
...
}

auto c = new Container();

The only noticeable difference from a class container is that now c is
now a Container*.


Well one problem now is that if you have a Container* you don't know 
whether it's dynamically allocated or the address of some 
stack-allocated object. This is pretty big; a major issue that I believe 
C++ has is that you can seldom reason modularly about functions because 
C++ makes it impossible to represent reference semantics with 
local/remote/shared/no ownership without resorting to convention.


A better solution is to define something like

auto c = new Classify!Container;

which transforms a value into a class object.

With this, the question becomes a matter of choosing the right default: 
do we want values most of the time and occasional references, or vice 
versa? I think most of the time you need references, as witnessed by the 
many ''s out there in code working on STL containers.



Personally, I'm really concerned by the case where you have a container
of containers. Class semantics make things really complicated as you
always have to initialize everything in the container explicitly; value
semantics makes things semantically easier but quite inefficient as
moving elements inside of the outermost container implies copying the
containers. Making containers auto-initialize themselves on first use
solves the case where containers are references-types; making containers
capable of using move semantics solves the problem for value-type
containers.


Neither values nor references are perfect indeed. For example, someone
mentioned, hey, in STL I write set vectordouble  and it Just
Works(tm). On the other hand, if you swap the two names it still seems
to work but it's awfully inefficient (something that may trip even
experienced developers).


Isn't that solved by C++0x, using move semantics in swap?


This particular incarnation yes, but that doesn't automatically fix user 
code that forgets the cost of copying. But that took a large language 
change. My point was that values by default is not automatically a good 
choice.



Andrei


Re: Decision on container design

2011-02-01 Thread Steven Schveighoffer
On Tue, 01 Feb 2011 11:44:36 -0500, Michel Fortin  
michel.for...@michelf.com wrote:


On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu  
seewebsiteforem...@erdani.org said:



On 1/28/11 8:12 PM, Michel Fortin wrote:
On 2011-01-28 20:10:06 -0500, Denis Koroskin 2kor...@gmail.com  
said:



Unfortunately, this design has big issues:
  void fill(Appender appender)
{
appender.put(hello);
appender.put(world);
}
 void test()
{
Appenderstring appender;
fill(appender); // Appender is supposed to have reference semantics
assert(appender.length != 0); // fails!
}
 Asserting above fails because at the time you pass appender object to
the fill method it isn't initialized yet (lazy initialization). As
such, a null is passed, creating an instance at first appending, but
the result isn't seen to the caller.
 That's indeed a problem. I don't think it's a fatal flaw however,  
given

that the idiom already exists in AAs.
 That said, the nice thing about my proposal is that you can easily  
reuse
the Impl to create a new container to build a new container wrapper  
with

the semantics you like with no loss of efficiency.
 As for the case of Appender... personally in the case above I'd be
tempted to use Appender.Impl directly (value semantics) and make fill
take a 'ref'. There's no point in having an extra heap allocation,
especially if you're calling test() in a loop or if there's a good
chance fill() has nothing to append to it.


I've been thinking of making Appender.Impl public or at least its own  
type.  A stack-based appender makes a lot of sense when you are using it  
temporarily to build an array.


But array-based containers really are in a separate class from node-based  
containers.  It's tempting to conflate the two because they are both  
'containers', but arrays allow many more optimizations/features that  
node-based containers simply can't do.


 Yep, yep, I found myself wrestling with the same issues. All good  
points. On one hand containers are a target for optimization because  
many will use them. On the other hand you'd want to have reasonably  
simple and idiomatic code in the container implementation because you  
want people to understand them easily and also to write their own. I  
thought for a while of a layered approach in which you'd have both the  
value and the sealed reference version of a container... it's just too  
much aggravation.


But are you not just pushing the aggravation elsewhere? If I need a by  
value container for some reason (performance or semantics) I'll have to  
write my own, and likely others will write their own too.


foo(container.dup) ; // value semantics

I'm sure some template guru can make a wrapper type for this for the rare  
occasions that you need it.  We can work on solving the  
auto-initialization issue (where a nested container 'just works'), I  
think there are ways to do it.


If that still doesn't help for your issues, then writing your own may be  
the only valid option.




Using classes for containers is just marginally better than making them  
by-value structs: you can use 'new' with a by-value struct if you want  
it to behave as a class-like by-reference container:


struct Container {
...
}

auto c = new Container();

The only noticeable difference from a class container is that now c is  
now a Container*.


And doesn't get cleaned up by the GC properly.  Plus, each member call  
must check if the container is 'instantiated', since we can have no  
default ctors.


Yes, it's a trade-off, and I think by far class-based containers win for  
the common case.



Personally, I'm really concerned by the case where you have a container
of containers. Class semantics make things really complicated as you
always have to initialize everything in the container explicitly; value
semantics makes things semantically easier but quite inefficient as
moving elements inside of the outermost container implies copying the
containers. Making containers auto-initialize themselves on first use
solves the case where containers are references-types; making  
containers

capable of using move semantics solves the problem for value-type
containers.
 Neither values nor references are perfect indeed. For example, someone  
mentioned, hey, in STL I write set vectordouble  and it Just  
Works(tm). On the other hand, if you swap the two names it still seems  
to work but it's awfully inefficient (something that may trip even  
experienced developers).


Isn't that solved by C++0x, using move semantics in swap?



swap isn't the problem.

foreach(s; myVectorSet)
{
   // if s is by value, it must be copied for each iteration in the loop
}

-Steve


Re: Decision on container design

2011-02-01 Thread Jonathan M Davis
On Tuesday 01 February 2011 09:07:55 Andrei Alexandrescu wrote:
 On 2/1/11 10:44 AM, Michel Fortin wrote:
  On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu
  
  seewebsiteforem...@erdani.org said:
  On 1/28/11 8:12 PM, Michel Fortin wrote:
  On 2011-01-28 20:10:06 -0500, Denis Koroskin 2kor...@gmail.com said:
  Unfortunately, this design has big issues:
  
  
  void fill(Appender appender)
  {
  appender.put(hello);
  appender.put(world);
  }
  
  void test()
  {
  Appenderstring appender;
  fill(appender); // Appender is supposed to have reference semantics
  assert(appender.length != 0); // fails!
  }
  
  Asserting above fails because at the time you pass appender object to
  the fill method it isn't initialized yet (lazy initialization). As
  such, a null is passed, creating an instance at first appending, but
  the result isn't seen to the caller.
  
  That's indeed a problem. I don't think it's a fatal flaw however, given
  that the idiom already exists in AAs.
  
  That said, the nice thing about my proposal is that you can easily
  reuse the Impl to create a new container to build a new container
  wrapper with the semantics you like with no loss of efficiency.
  
  As for the case of Appender... personally in the case above I'd be
  tempted to use Appender.Impl directly (value semantics) and make fill
  take a 'ref'. There's no point in having an extra heap allocation,
  especially if you're calling test() in a loop or if there's a good
  chance fill() has nothing to append to it.
  
  That's the issue with containers. The optimal semantics always change
  depending on the use case.
  
  Yep, yep, I found myself wrestling with the same issues. All good
  points. On one hand containers are a target for optimization because
  many will use them. On the other hand you'd want to have reasonably
  simple and idiomatic code in the container implementation because you
  want people to understand them easily and also to write their own. I
  thought for a while of a layered approach in which you'd have both the
  value and the sealed reference version of a container... it's just too
  much aggravation.
  
  But are you not just pushing the aggravation elsewhere? If I need a by
  value container for some reason (performance or semantics) I'll have to
  write my own, and likely others will write their own too.
 
 If semantics are the primary concern, you could (and in fact Phobos
 could) provide a Value!C template that automatically calls dup in
 this(this) etc.
 
 For performance I agree there is stuff that class containers leave on
 the table.
 
  Using classes for containers is just marginally better than making them
  by-value structs: you can use 'new' with a by-value struct if you want
  it to behave as a class-like by-reference container:
  
  struct Container {
  ...
  }
  
  auto c = new Container();
  
  The only noticeable difference from a class container is that now c is
  now a Container*.
 
 Well one problem now is that if you have a Container* you don't know
 whether it's dynamically allocated or the address of some
 stack-allocated object. This is pretty big; a major issue that I believe
 C++ has is that you can seldom reason modularly about functions because
 C++ makes it impossible to represent reference semantics with
 local/remote/shared/no ownership without resorting to convention.
 
 A better solution is to define something like
 
 auto c = new Classify!Container;
 
 which transforms a value into a class object.
 
 With this, the question becomes a matter of choosing the right default:
 do we want values most of the time and occasional references, or vice
 versa? I think most of the time you need references, as witnessed by the
 many ''s out there in code working on STL containers.

Java implements containers as classes (not that it really has any other 
choice), 
so all containers in Java have reference semantics, and I've _never_ found that 
to be a problem. I do think that there are rare cases where it makes sense for 
a 
container to be a value type, but I really do think that it's a rare case for 
the average programmer. On the other hand, having containers be value types in 
C++ is _frequently_ a problem.

- Jonathan M Davis


Re: Decision on container design

2011-02-01 Thread spir

On 02/01/2011 05:00 PM, Andrei Alexandrescu wrote:

Regarding the general issue that someone makes an informal proposal (either
here, as a DIP, or on the Phobos mailing list), followed by a thundering
silence: I believe that a good technique is to formalize the proposal review
process, which has been a homerun for Boost. The disadvantage of that is that
almost without exception this is very taxing to library submitters. This means
the submitter must put a lot of thought and a lot of work into motivating,
polishing, and documenting an artifact without any guarantee that it would lead
to inclusion in the target library. I've seen very, VERY elaborate Boost
submissions fail - literally months of work gone to waste.


An alternative, or a complementary approach, may be to delegate part of your 
responsability. In this case, I'm thinking at a pool of people which mission 
would be to obviously show interest (or lack of) for proposals made on the 
mailing list --whatever their advancement, formality, code quality... This 
would provide a valuable indicator while removing some load from your 
shoulders, I guess. Such people may be chosen by cooptation.
Note this approach is not exclusive of formal  heavy adoption process like 
Boost's; instead it can be a complementary or preliminary way of judging 
interest for proposals.


A similar principle may indeed be used for other purpose: specification  
evolution of D-the-language, implementation  bug removal of the reference 
compiler,...


Denis
--
_
vita es estrany
spir.wikidot.com



Re: Decision on container design

2011-02-01 Thread bearophile
Andrei:

 A better solution is to define something like
 
 auto c = new Classify!Container;
 
 which transforms a value into a class object.
 
 With this, the question becomes a matter of choosing the right default: 
 do we want values most of the time and occasional references, or vice 
 versa? I think most of the time you need references, as witnessed by the 
 many ''s out there in code working on STL containers.

I agree that most times a reference is better. This brings back the need for a 
very good (efficient, syntactically readable, and even if not safe, able to 
spot most common errors, and RAII-safe) way to allocate class instances 
in-place, on the stack or inside another struct/class.

Bye,
bearophile


Re: Decision on container design

2011-02-01 Thread Simen kjaeraas

Andrei Alexandrescu seewebsiteforem...@erdani.org wrote:

I do something similar with RefCounted. There are problems - you need to  
know in advance which functions you can implement on a null container  
(empty and length are obvious candidates, but there could be others).


Static functions can safely be called. Hence, this template:


import std.traits;

template isStaticFunc(T, string fn) {
enum isStaticFunc = is(typeof({
mixin(alias T.~fn~ func;);
ParameterTypeTuple!func args;
mixin(T.~fn~(args););
}));
}


Now, Ref can look like this:


struct Ref(Impl) {
private Impl* _impl;
@property ref Impl impl() {
return *(_impl = (_impl ? _impl : new Impl));
}
//alias impl this; // Apparently, alias this takes precedence over  
opDispatch, so
   // using both doesn't work. Well, unless you put  
opDispatch in Impl.


auto opDispatch(string name, T...)(T args) if ( isStaticFunc!(Impl,  
name)) {

mixin(return impl.~name~(_impl,args););
}
}

struct ExampleImpl {
static int length(ExampleImpl* that) {
return that ? that.actualLength : 0;
}
int actualLength( ) {
return 42;
}
}


import std.stdio;

void main( ) {
Ref!ExampleImpl a;

writeln( a.length );
}


--
Simen


Re: Decision on container design

2011-02-01 Thread Michel Fortin
On 2011-02-01 12:07:55 -0500, Andrei Alexandrescu 
seewebsiteforem...@erdani.org said:


With this, the question becomes a matter of choosing the right default: 
do we want values most of the time and occasional references, or vice 
versa? I think most of the time you need references, as witnessed by 
the many ''s out there in code working on STL containers.


What exactly is most of the time? In C++, you pass containers by '' 
for function parameters, using '' elsewhere is rare.


One thing I proposed some time ago to address this problem (and to 
which no one replied) was this:


ref struct Container { ... } // new ref struct concept

void func(Container c) {
// c is implicitly a ref Container
}

Container a; // by value
func(a); // implicitly passed by ref

Containers would be stored by value, but always passed by ref in 
functions parameters.


--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/



Re: Decision on container design

2011-02-01 Thread Simon Buerger

On 01.02.2011 18:08, Steven Schveighoffer wrote:

On Tue, 01 Feb 2011 11:44:36 -0500, Michel Fortin
michel.for...@michelf.com wrote:


On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu
seewebsiteforem...@erdani.org said:


On 1/28/11 8:12 PM, Michel Fortin wrote:

On 2011-01-28 20:10:06 -0500, Denis Koroskin 2kor...@gmail.com
said:


Unfortunately, this design has big issues:
void fill(Appender appender)
{
appender.put(hello);
appender.put(world);
}
void test()
{
Appenderstring appender;
fill(appender); // Appender is supposed to have reference semantics
assert(appender.length != 0); // fails!
}
Asserting above fails because at the time you pass appender
object to
the fill method it isn't initialized yet (lazy initialization). As
such, a null is passed, creating an instance at first appending, but
the result isn't seen to the caller.

That's indeed a problem. I don't think it's a fatal flaw however,
given
that the idiom already exists in AAs.
That said, the nice thing about my proposal is that you can easily
reuse
the Impl to create a new container to build a new container
wrapper with
the semantics you like with no loss of efficiency.
As for the case of Appender... personally in the case above I'd be
tempted to use Appender.Impl directly (value semantics) and make fill
take a 'ref'. There's no point in having an extra heap allocation,
especially if you're calling test() in a loop or if there's a good
chance fill() has nothing to append to it.


I've been thinking of making Appender.Impl public or at least its own
type. A stack-based appender makes a lot of sense when you are using
it temporarily to build an array.

But array-based containers really are in a separate class from
node-based containers. It's tempting to conflate the two because they
are both 'containers', but arrays allow many more
optimizations/features that node-based containers simply can't do.


Yep, yep, I found myself wrestling with the same issues. All good
points. On one hand containers are a target for optimization
because many will use them. On the other hand you'd want to have
reasonably simple and idiomatic code in the container
implementation because you want people to understand them easily
and also to write their own. I thought for a while of a layered
approach in which you'd have both the value and the sealed
reference version of a container... it's just too much aggravation.


But are you not just pushing the aggravation elsewhere? If I need a
by value container for some reason (performance or semantics) I'll
have to write my own, and likely others will write their own too.


foo(container.dup) ; // value semantics

I'm sure some template guru can make a wrapper type for this for the
rare occasions that you need it. We can work on solving the
auto-initialization issue (where a nested container 'just works'), I
think there are ways to do it.

If that still doesn't help for your issues, then writing your own may
be the only valid option.



Using classes for containers is just marginally better than making
them by-value structs: you can use 'new' with a by-value struct if
you want it to behave as a class-like by-reference container:

struct Container {
...
}

auto c = new Container();

The only noticeable difference from a class container is that now c
is now a Container*.


And doesn't get cleaned up by the GC properly. Plus, each member call
must check if the container is 'instantiated', since we can have no
default ctors.

Yes, it's a trade-off, and I think by far class-based containers win
for the common case.


Personally, I'm really concerned by the case where you have a
container
of containers. Class semantics make things really complicated as you
always have to initialize everything in the container explicitly;
value
semantics makes things semantically easier but quite inefficient as
moving elements inside of the outermost container implies copying the
containers. Making containers auto-initialize themselves on first use
solves the case where containers are references-types; making
containers
capable of using move semantics solves the problem for value-type
containers.

Neither values nor references are perfect indeed. For example,
someone mentioned, hey, in STL I write set vectordouble  and it
Just Works(tm). On the other hand, if you swap the two names it
still seems to work but it's awfully inefficient (something that
may trip even experienced developers).


Isn't that solved by C++0x, using move semantics in swap?



swap isn't the problem.

foreach(s; myVectorSet)
{
// if s is by value, it must be copied for each iteration in the loop
}


 -Steve

Just to note: the correct solution for the last problem is

foreach(const ref s; myVectorSet)

which is working in current D. In a more value-based language you may 
even want to default to const ref for foreach-loop-values, and even 
for function-parameters. I suggested that a while ago, but wasn't 
liked much for D, for good reasons.


- Krox



Re: Decision on container design

2011-02-01 Thread Simon Buerger

On 01.02.2011 20:01, Michel Fortin wrote:

On 2011-02-01 12:07:55 -0500, Andrei Alexandrescu
seewebsiteforem...@erdani.org said:


With this, the question becomes a matter of choosing the right
default: do we want values most of the time and occasional
references, or vice versa? I think most of the time you need
references, as witnessed by the many ''s out there in code working
on STL containers.


What exactly is most of the time? In C++, you pass containers by ''
for function parameters, using '' elsewhere is rare.

One thing I proposed some time ago to address this problem (and to
which no one replied) was this:

ref struct Container { ... } // new ref struct concept

void func(Container c) {
// c is implicitly a ref Container
}

Container a; // by value
func(a); // implicitly passed by ref

Containers would be stored by value, but always passed by ref in
functions parameters.



Thats would not be per-value as most understand it.

Container globalC;
func(c);

void func(Container paramC)
{
c.add(42);  // modifies globalC in reference-semantics
// leaves globalC as it was in value-semantics
}


Your Idea would be somehow truly value-based if you default not only 
to ref but to const ref, because then the function would not be 
able to alter globalC. But making parameters default-const was not 
considered the right way for D.


- Krox


Re: Decision on container design

2011-01-31 Thread Steven Schveighoffer
On Fri, 28 Jan 2011 18:32:28 -0500, Michel Fortin  
michel.for...@michelf.com wrote:


On 2011-01-28 17:09:08 -0500, Andrei Alexandrescu  
seewebsiteforem...@erdani.org said:



On 1/28/11 3:05 PM, Michel Fortin wrote:

Not my preferred choices (especially #1), but having containers in
Phobos will certainly be an improvement over not having them. So go  
ahead!
 Well if you brought forth some strong argument I'm all ears. What I  
see for now is that struct containers are just difficult to implement  
and need to have carefully explained semantics, whereas a lot of people  
know how classes behave from day one.


We already argument this over and over in the past. First, I totally  
acknowledge that C++ style containers have a problem: they make it  
easier to copy the content than pass it by reference. On the other side  
of the spectrum, I think that class semantics makes it too easy to have  
null dereferences, it's easy to get lost when you have a container of  
containers.


I have some experience with containers having class-style semantics: in  
Objective-C, I ended up creating a set of macro-like functions which I  
use to initialize containers whenever I use them in case they are null.  
And I had to do more of these utility functions to handle a particular  
data structure of mine which is a dictionary of arrays of objects. In  
C++, I'd have declared this as a map string, vector Object   and  
be done with it; no need for special care initializing each vector, so  
much easier than in Objective-C.


I see value in this idiom, and I have ideas on how to solve this with  
class-based containers.  Essentially, I think we need a way to construct a  
container of containers where the owner container is the one who has  
class semantics, and the sub-containers only provide access through the  
owner.  I haven't fleshed out how this will work, but it solves another  
really important problem that dcollections currently has --  
over-allocation.


The issue is, if you have a nested container (say a map of strings to  
linked-lists), the map container pre-allocates a page of elements, to ease  
stress on the GC.  But so does each linked list!  However, since the lists  
all are part of the map, they could potentially share the same allocator.   
If you have few elements in each list, this could significantly reduce the  
over-allocation of memory.


-Steve


Re: Decision on container design

2011-01-31 Thread Steven Schveighoffer

On Sat, 29 Jan 2011 12:09:00 -0500, Simon Buerger k...@gmx.net wrote:


On 28.01.2011 19:31, Andrei Alexandrescu wrote:

1. Containers will be classes.

2. Most of the methods in existing containers will be final. It's up
to the container to make a method final or not.

3. Containers and their ranges decide whether they give away
references to their objects. Sealing is a great idea but it makes
everybody's life too complicated. I'll defer sealing to future
improvements in the language and/or the reflection subsystem.

4. Containers will assume that objects are cheap to copy so they won't
worry about moving primitives.


Not perfectly what I would like, but a reasonable choice, and most  
important to actually have a mature container-lib. But there are other  
choices remaining: what containers will be there and what will they be  
called? My suggestion is


* Set, MulitSet, Map, MultiMap (hash-table based)
* OrderedSet, OrderedMultiSet, OrderedMap, OrderedMultiMap (tree-based)
* Sequence (like stl-Deque. the name is just more intuitive. Funny  
enough, the stl-deque implemenation has nothing to do with a doubly  
linked list)
* Array (like stl-vector. I think vector is a kinda strange name, but  
that may be only my impression)

* List (linked list)

* Stack/Queue/PriorityQueue should be done on top of an other class,  
with a impl-template-param, like the stl-ones


Things to note:
* container should be named with respect to their use, not the  
implementation. HashSet is a bad name, because the user shouldnt care  
about the implemenation.


* unordered sets are used more often than ordered. So it should be  
Set/OrderedSet, and not UnorderedSet/Set (also, the first one is two  
characters less typing *g*)


* opEqual should work between different types of Sets (or Maps, or  
sequences). Nothing wrong with comparing an ordered to an unordered one,  
or a list to an array.


http://www.dsource.org/projects/dcollections

-Steve


Re: Decision on container design

2011-01-31 Thread Simon Buerger

On 31.01.2011 17:53, Steven Schveighoffer wrote:

http://www.dsource.org/projects/dcollections

-Steve


Well, seems not bad on a quick look. But source is updated 2 years 
ago, so I doubt it would compile with current dmd. Anyway, the topic 
here is the std-behaviour of the std-lib. But sure, always nice to 
have custom alternatives.


Krox


Re: Decision on container design

2011-01-31 Thread David Nadlinger

On 1/31/11 6:48 PM, Simon Buerger wrote:

Well, seems not bad on a quick look. But source is updated 2 years ago,…


http://www.dsource.org/projects/dcollections/browser/branches/d2 – four 
months are not quite as long as two years. Oh, and Steve is the very 
author of dcollections. ;)


David


Re: Decision on container design

2011-01-31 Thread Steven Schveighoffer

On Mon, 31 Jan 2011 12:48:06 -0500, Simon Buerger k...@gmx.net wrote:


On 31.01.2011 17:53, Steven Schveighoffer wrote:

http://www.dsource.org/projects/dcollections

-Steve


Well, seems not bad on a quick look. But source is updated 2 years ago,  
so I doubt it would compile with current dmd. Anyway, the topic here is  
the std-behaviour of the std-lib. But sure, always nice to have custom  
alternatives.


latest meaningful change was 4 months ago:  
http://www.dsource.org/projects/dcollections/changeset/102


It should compile on the latest DMD (if not, file a ticket).  BTW, it  
changes very little mostly because I haven't had many complaints about it  
(maybe not a good sign?) and I haven't had much opportunity to use it, as  
my day job does not allow using D unfortunately.


It was proposed as a possibility for the std lib, but Andrei and I  
couldn't come to an agreement on what the collections should look like.   
Ironically, his latest decision moves std.container closer to dcollections  
in design.


However, it should be relatively compatible with phobos' container lib (in  
fact RedBlackTree is a direct port of dcollections' version).


-Steve


Re: Decision on container design

2011-01-31 Thread Simon Buerger
Okay, my fault. Didnt realize you were the author, and the project is 
still active. The 2 years came from here: 
http://www.dsource.org/projects/dcollections/browser/trunk/dcollections. 
I thought, that trunk was the most recent version. Added a bookmark, 
and will definitely take a closer look later. Thx for mentioning.



On 31.01.2011 19:09, Steven Schveighoffer wrote:

On Mon, 31 Jan 2011 12:48:06 -0500, Simon Buerger k...@gmx.net wrote:


On 31.01.2011 17:53, Steven Schveighoffer wrote:

http://www.dsource.org/projects/dcollections

-Steve


Well, seems not bad on a quick look. But source is updated 2 years
ago, so I doubt it would compile with current dmd. Anyway, the topic
here is the std-behaviour of the std-lib. But sure, always nice to
have custom alternatives.


latest meaningful change was 4 months ago:
http://www.dsource.org/projects/dcollections/changeset/102

It should compile on the latest DMD (if not, file a ticket). BTW, it
changes very little mostly because I haven't had many complaints about
it (maybe not a good sign?) and I haven't had much opportunity to use
it, as my day job does not allow using D unfortunately.

It was proposed as a possibility for the std lib, but Andrei and I
couldn't come to an agreement on what the collections should look
like. Ironically, his latest decision moves std.container closer to
dcollections in design.

However, it should be relatively compatible with phobos' container lib
(in fact RedBlackTree is a direct port of dcollections' version).

-Steve




Re: Decision on container design

2011-01-31 Thread Steven Schveighoffer

On Mon, 31 Jan 2011 13:36:57 -0500, Simon Buerger k...@gmx.net wrote:

Okay, my fault. Didnt realize you were the author, and the project is  
still active. The 2 years came from here:  
http://www.dsource.org/projects/dcollections/browser/trunk/dcollections.  
I thought, that trunk was the most recent version. Added a bookmark,  
and will definitely take a closer look later. Thx for mentioning.


Sorry about that, I have had at least 3 people be confused at why trunk  
wasn't the D2 version.  I really need to swap those (make the 1.x version  
the branch).  Unfortunately, it will screw up the D1 documentation, since  
it's auto-generated.  But that should probably be moot since the D1  
version isn't changing...


-Steve


Re: Decision on container design

2011-01-30 Thread spir

On 01/29/2011 06:09 PM, Simon Buerger wrote:

Things to note:
* container should be named with respect to their use, not the implementation.
HashSet is a bad name, because the user shouldnt care about the implemenation.

* unordered sets are used more often than ordered. So it should be
Set/OrderedSet, and not UnorderedSet/Set (also, the first one is two
characters less typing *g*)

* opEqual should work between different types of Sets (or Maps, or sequences).
Nothing wrong with comparing an ordered to an unordered one, or a list to an
array.


+++ on all those points

Denis
--
_
vita es estrany
spir.wikidot.com



Re: Decision on container design

2011-01-29 Thread Tomek Sowiński
Michel Fortin napisał:

  Is there anything implementation specific in the outer struct that provides
  ref semantics to Impl? If not, Container could be generic, parametrized by
  Impl type.  
 
 You could provide an implementation-specific version of some functions 
 as an optimization. For instance there is no need to create the Impl 
 when asking for the length, if the pointer is null, length is zero. 
 Typically, const function can be implemented in the outward container 
 with a shortcut checking for null.

I think the reference struct can still be orthogonal to the container.

struct Ref(Impl)
{
private Impl* _impl;
ref Impl impl() @property
{
if (!impl) impl = new Impl;
return *impl;
}

static if (hasLength!Impl)
{
auto length() @property
{
return impl ? impl.length : 0;
}
}

alias impl this;
}

Reusability lightens the burden of the container's author (less fuss for user 
implementations) and somewhat standardizes containers as they all must exhibit 
a certain API with certain semantics to be able to fit into Ref.

The downside is that the syntax for the most common case (ref semantics) is a 
little nosier than for value-like behavior.

-- 
Tomek



Re: Decision on container design

2011-01-29 Thread Simen kjaeraas

Tomek Sowiński j...@ask.me wrote:


Michel Fortin napisał:

 Is there anything implementation specific in the outer struct that  
provides
 ref semantics to Impl? If not, Container could be generic,  
parametrized by

 Impl type.

You could provide an implementation-specific version of some functions
as an optimization. For instance there is no need to create the Impl
when asking for the length, if the pointer is null, length is zero.
Typically, const function can be implemented in the outward container
with a shortcut checking for null.


I think the reference struct can still be orthogonal to the container.

struct Ref(Impl)
{
private Impl* _impl;
ref Impl impl() @property
{
if (!impl) impl = new Impl;
return *impl;
}

static if (hasLength!Impl)
{
auto length() @property
{
return impl ? impl.length : 0;
}
}

alias impl this;
}


Now, other functions may also exploit such a shortcut. How do you plan
to implement that?

Actually, by adopting another idiom, it can be done:

struct Ref(Impl) {
private Impl* _impl;
@property ref Impl impl() {
return *(_impl = (_impl ? _impl : new Impl));
}
alias impl this;

auto opDispatch(string name, T...)(T args) {
mixin(return impl.~name~(_impl,args););
}
}

struct ExampleImpl {
static int length(ExampleImpl* that) {
return that ? *that.actualLength : 0;
}
int actualLength( ) {
return 42;
}
}

Of course, I did not say it was a good idiom. :p

--
Simen


Re: Decision on container design

2011-01-29 Thread Tomek Sowiński
Michel Fortin napisał:

 As for the case of Appender... personally in the case above I'd be 
 tempted to use Appender.Impl directly (value semantics) and make fill 
 take a 'ref'. There's no point in having an extra heap allocation, 
 especially if you're calling test() in a loop or if there's a good 
 chance fill() has nothing to append to it.

Or take an output range.

-- 
Tomek



Re: Decision on container design

2011-01-29 Thread Tomek Sowiński
bearophile napisał:

 This page:
 http://www.jroller.com/scolebourne/entry/the_next_big_jvm_language1
 
 A quotation:
 
 3) Everything is a monitor. In Java and the JVM, every object is a monitor, 
 meaning that you can synchronize on any
 object. This is incredibly wasteful at the JVM level. Senior JVM guys have 
 indicated large percentage improvements
 in JVM space and performance if we removed the requirement that every object 
 can be synchronized on. (Instead, you
 would have specific classes like Java 5 Lock)
 
 I have read similar comments in various other places.
 
 What about creating a @nomonitor annotation, for D2 classes to not create a 
 monitor for specific classes annotated
 with it? This may reduce some class overhead.

Better just remove it, it's not used often. Besides, there are different locks, 
one size doesn't fit all.

-- 
Tomek



Re: Decision on container design

2011-01-29 Thread SHOO

(2011/01/29 3:31), Andrei Alexandrescu wrote:

Today after work I plan to start making one pass through std.container.
After having thought of things for a long time, my conclusions are as
follows:

1. Containers will be classes.

2. Most of the methods in existing containers will be final. It's up to
the container to make a method final or not.

3. Containers and their ranges decide whether they give away references
to their objects. Sealing is a great idea but it makes everybody's life
too complicated. I'll defer sealing to future improvements in the
language and/or the reflection subsystem.

4. Containers will assume that objects are cheap to copy so they won't
worry about moving primitives.

Any showstoppers, please share.


Andrei


1, 2
I think that it is a good policy.
There is hardly the superiority that a containers are structs. :-)

In the first place I had doubt towards treating structs as classes 
(OOP). When I use it by such a way, the lack of the default constructor 
is fatal.


3, 4
If speed is accompanied, the sealing up is useful, but if it is late, 
there are many unnecessary cases. Because it is thought that a lot of 
repetition access to the element and life cycles occurs in the case of a 
container, handling it by the element unit should be high-performance. 
In addition, for usability, a thing of simple behavior is preferable 
rather than a complicated thing.
However, on the other hand, the design that wrong use is impossible 
should be considered enough.


Re: Decision on container design

2011-01-29 Thread Simon Buerger

On 28.01.2011 19:31, Andrei Alexandrescu wrote:

1. Containers will be classes.

2. Most of the methods in existing containers will be final. It's up
to the container to make a method final or not.

3. Containers and their ranges decide whether they give away
references to their objects. Sealing is a great idea but it makes
everybody's life too complicated. I'll defer sealing to future
improvements in the language and/or the reflection subsystem.

4. Containers will assume that objects are cheap to copy so they won't
worry about moving primitives.


Not perfectly what I would like, but a reasonable choice, and most 
important to actually have a mature container-lib. But there are other 
choices remaining: what containers will be there and what will they be 
called? My suggestion is


* Set, MulitSet, Map, MultiMap (hash-table based)
* OrderedSet, OrderedMultiSet, OrderedMap, OrderedMultiMap (tree-based)
* Sequence (like stl-Deque. the name is just more intuitive. Funny 
enough, the stl-deque implemenation has nothing to do with a doubly 
linked list)
* Array (like stl-vector. I think vector is a kinda strange name, 
but that may be only my impression)

* List (linked list)

* Stack/Queue/PriorityQueue should be done on top of an other class, 
with a impl-template-param, like the stl-ones


Things to note:
* container should be named with respect to their use, not the 
implementation. HashSet is a bad name, because the user shouldnt 
care about the implemenation.


* unordered sets are used more often than ordered. So it should be 
Set/OrderedSet, and not UnorderedSet/Set (also, the first one is 
two characters less typing *g*)


* opEqual should work between different types of Sets (or Maps, or 
sequences). Nothing wrong with comparing an ordered to an unordered 
one, or a list to an array.


just my 2 cents,
Krox


Re: Decision on container design

2011-01-29 Thread bearophile
Simon Buerger:

 * container should be named with respect to their use, not the 
 implementation. HashSet is a bad name, because the user shouldnt 
 care about the implemenation.

This is good for a language like Python, or even Java, but for a system 
language I want to know the algorithms I'm using under the hood. So I am not 
sure.

Bye,
bearophile


Re: Decision on container design

2011-01-29 Thread KennyTM~

On Jan 30, 11 01:09, Simon Buerger wrote:

On 28.01.2011 19:31, Andrei Alexandrescu wrote:

1. Containers will be classes.

2. Most of the methods in existing containers will be final. It's up
to the container to make a method final or not.

3. Containers and their ranges decide whether they give away
references to their objects. Sealing is a great idea but it makes
everybody's life too complicated. I'll defer sealing to future
improvements in the language and/or the reflection subsystem.

4. Containers will assume that objects are cheap to copy so they won't
worry about moving primitives.


Not perfectly what I would like, but a reasonable choice, and most
important to actually have a mature container-lib. But there are other
choices remaining: what containers will be there and what will they be
called? My suggestion is

* Set, MulitSet, Map, MultiMap (hash-table based)
* OrderedSet, OrderedMultiSet, OrderedMap, OrderedMultiMap (tree-based)
* Sequence (like stl-Deque. the name is just more intuitive. Funny
enough, the stl-deque implemenation has nothing to do with a doubly
linked list)


A 'deque' just mean a double-ended queue, not necessarily doubly linked 
list. I don't like the name Sequence since it doesn't specify the 
container can be modified from both ends. 'Deque' is just fine.



* Array (like stl-vector. I think vector is a kinda strange name, but
that may be only my impression)
* List (linked list)

* Stack/Queue/PriorityQueue should be done on top of an other class,
with a impl-template-param, like the stl-ones

Things to note:
* container should be named with respect to their use, not the
implementation. HashSet is a bad name, because the user shouldnt care
about the implemenation.



Except that there is already a std.container.RedBlackTree named in this 
fashion :).



* unordered sets are used more often than ordered. So it should be
Set/OrderedSet, and not UnorderedSet/Set (also, the first one is two
characters less typing *g*)

* opEqual should work between different types of Sets (or Maps, or
sequences). Nothing wrong with comparing an ordered to an unordered one,
or a list to an array.



This breaks transitivity of '==' unnecessarily. I don't see the benefit 
of comparing two different kinds of containers. If you really need to 
compare them, use std.algorithm.equal.



just my 2 cents,
Krox







Re: Decision on container design

2011-01-29 Thread Jim
bearophile Wrote:

 Simon Buerger:
 
  * container should be named with respect to their use, not the 
  implementation. HashSet is a bad name, because the user shouldnt 
  care about the implemenation.
 
 This is good for a language like Python, or even Java, but for a system 
 language I want to know the algorithms I'm using under the hood. So I am not 
 sure.

There could be an interface named Set and various implementations of it, e.g. 
HashSet. I think this would be a good principle in general for all abstract 
collection types.


Re: Decision on container design

2011-01-29 Thread dsimcha
In general I actually like this idea, but can we still have sealed 
ref-counted arrays and associative arrays somewhere in Phobos (not 
necessarily any more esoteric containers than these and not necessarily 
in std.container, maybe in a new module called std.refcounted or 
something)?  For huge arrays/AAs I think ref counting is an important 
optimization, especially with the GC in its current state but to a 
lesser extent even when we get a better GC.  GC tends to be slow 
compared to manual memory management in cases where most allocated 
blocks are large.


Also, I like reference counting for huge arrays/AAs enough that I 
submitted a ref counted hash table implementation to the Phobos list for 
review, which noone seemed to notice.  Please comment.


On 1/28/2011 1:31 PM, Andrei Alexandrescu wrote:

Today after work I plan to start making one pass through std.container.
After having thought of things for a long time, my conclusions are as
follows:

1. Containers will be classes.

2. Most of the methods in existing containers will be final. It's up to
the container to make a method final or not.

3. Containers and their ranges decide whether they give away references
to their objects. Sealing is a great idea but it makes everybody's life
too complicated. I'll defer sealing to future improvements in the
language and/or the reflection subsystem.

4. Containers will assume that objects are cheap to copy so they won't
worry about moving primitives.

Any showstoppers, please share.


Andrei




Re: Decision on container design

2011-01-29 Thread Andrei Alexandrescu

On 01/29/2011 12:54 PM, dsimcha wrote:

Also, I like reference counting for huge arrays/AAs enough that I
submitted a ref counted hash table implementation to the Phobos list for
review, which noone seemed to notice. Please comment.


Where is it?

Andrei


Re: Decision on container design

2011-01-29 Thread dsimcha

http://www.dsource.org/projects/aa/browser/trunk/randAA/randaasealed.d

On 1/29/2011 2:18 PM, Andrei Alexandrescu wrote:

On 01/29/2011 12:54 PM, dsimcha wrote:

Also, I like reference counting for huge arrays/AAs enough that I
submitted a ref counted hash table implementation to the Phobos list for
review, which noone seemed to notice. Please comment.


Where is it?

Andrei




Re: Decision on container design

2011-01-29 Thread Andrei Alexandrescu

On 01/29/2011 12:44 PM, Jim wrote:

bearophile Wrote:


Simon Buerger:


* container should be named with respect to their use, not the
implementation. HashSet is a bad name, because the user shouldnt
care about the implemenation.


This is good for a language like Python, or even Java, but for a system 
language I want to know the algorithms I'm using under the hood. So I am not 
sure.


There could be an interface named Set and various implementations of it, e.g. 
HashSet. I think this would be a good principle in general for all abstract 
collection types.


Thus approach has a few advantages but a lot of serious issues. I 
decided to not define a hierarchy for std.container.


Andrei


Re: Decision on container design

2011-01-29 Thread Andrei Alexandrescu

On 01/29/2011 01:29 PM, dsimcha wrote:

http://www.dsource.org/projects/aa/browser/trunk/randAA/randaasealed.d

On 1/29/2011 2:18 PM, Andrei Alexandrescu wrote:

On 01/29/2011 12:54 PM, dsimcha wrote:

Also, I like reference counting for huge arrays/AAs enough that I
submitted a ref counted hash table implementation to the Phobos list for
review, which noone seemed to notice. Please comment.


Where is it?

Andrei




Well to create a proposal you'd need to generate the documentation and 
put it somewhere and write a note to this group motivating the poposal 
and asking for a review manager.


Andrei


Re: Decision on container design

2011-01-29 Thread dsimcha
I've uploaded the documentation to 
http://cis.jhu.edu/~dsimcha/randaasealed.html and mentioned it again on 
the mailing list.  The documentation is pretty sparse because 
interface-wise it's just a standard hash table.  More generally, though, 
are we still interested in sealed/ref counted containers?


On 1/29/2011 2:52 PM, Andrei Alexandrescu wrote:

On 01/29/2011 01:29 PM, dsimcha wrote:

http://www.dsource.org/projects/aa/browser/trunk/randAA/randaasealed.d

On 1/29/2011 2:18 PM, Andrei Alexandrescu wrote:

On 01/29/2011 12:54 PM, dsimcha wrote:

Also, I like reference counting for huge arrays/AAs enough that I
submitted a ref counted hash table implementation to the Phobos list
for
review, which noone seemed to notice. Please comment.


Where is it?

Andrei




Well to create a proposal you'd need to generate the documentation and
put it somewhere and write a note to this group motivating the poposal
and asking for a review manager.

Andrei




Re: Decision on container design

2011-01-29 Thread Jonathan M Davis
On Friday 28 January 2011 10:31:58 Andrei Alexandrescu wrote:
 Today after work I plan to start making one pass through std.container.
 After having thought of things for a long time, my conclusions are as
 follows:
 
 1. Containers will be classes.
 
 2. Most of the methods in existing containers will be final. It's up to
 the container to make a method final or not.
 
 3. Containers and their ranges decide whether they give away references
 to their objects. Sealing is a great idea but it makes everybody's life
 too complicated. I'll defer sealing to future improvements in the
 language and/or the reflection subsystem.
 
 4. Containers will assume that objects are cheap to copy so they won't
 worry about moving primitives.
 
 Any showstoppers, please share.

Overall, I think definitely support this approach. I definitely think that 
containers should be reference types, and I'm not sure that I care much whether 
they end up being structs or classes. My one concern would be the ability to 
inline the container's function calls, and marking them as final fixes that 
problem. I'm not sure that I know enough about the sealing issue to comment on 
it, but if what we're doing doesn't preclude having it later, then that sounds 
like it's a good compromise. And I definitely think that make it so that 
containers don't have to worry about moving primitives is the way to go.

So, I'm essentially 100% behind this, and I don't see any real issues with it. 
Containers is one of the places that Phobos is generally behind on and which 
really needs to be taken care if we really want people to be using D and 
Phobos. 
So, it's great to see that progress is being made with regards to containers.

- Jonathan M Davis


Re: Decision on container design

2011-01-29 Thread Jonathan M Davis
On Saturday 29 January 2011 09:09:00 Simon Buerger wrote:
 * container should be named with respect to their use, not the
 implementation. HashSet is a bad name, because the user shouldnt
 care about the implemenation.

It was decided previously that containers in std.container would specifically 
be 
named based on the type of data structure that they are rather than what 
they're 
used for. Hence why the newest container to std.container is called 
RedBlackTree. If you want to call it TreeSet or something similar, then all you 
have to do is alias it. So, it's easy to get names based on usage if that's 
what 
you want, but that's not the naming scheme that std.container is using.

- Jonathan M Davis


Decision on container design

2011-01-28 Thread Andrei Alexandrescu
Today after work I plan to start making one pass through std.container. 
After having thought of things for a long time, my conclusions are as 
follows:


1. Containers will be classes.

2. Most of the methods in existing containers will be final. It's up to 
the container to make a method final or not.


3. Containers and their ranges decide whether they give away references 
to their objects. Sealing is a great idea but it makes everybody's life 
too complicated. I'll defer sealing to future improvements in the 
language and/or the reflection subsystem.


4. Containers will assume that objects are cheap to copy so they won't 
worry about moving primitives.


Any showstoppers, please share.


Andrei


Re: Decision on container design

2011-01-28 Thread bearophile
Andrei:

As far as I understand the implications, I agree with your conclusions.
Other people will be free to create an alternative D2 library like this one:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html
but a standard library can't be too much hard to use.

Bye,
bearophile


Re: Decision on container design

2011-01-28 Thread Michel Fortin
On 2011-01-28 13:31:58 -0500, Andrei Alexandrescu 
seewebsiteforem...@erdani.org said:


Today after work I plan to start making one pass through std.container. 
After having thought of things for a long time, my conclusions are as 
follows:


1. Containers will be classes.

2. Most of the methods in existing containers will be final. It's up to 
the container to make a method final or not.


3. Containers and their ranges decide whether they give away references 
to their objects. Sealing is a great idea but it makes everybody's life 
too complicated. I'll defer sealing to future improvements in the 
language and/or the reflection subsystem.


4. Containers will assume that objects are cheap to copy so they won't 
worry about moving primitives.


Any showstoppers, please share.


Not my preferred choices (especially #1), but having containers in 
Phobos will certainly be an improvement over not having them. So go 
ahead!


About #4, it'd be nice to have the containers use move semantics when 
possible even if they fallback to (cheap) copy semantic when move isn't 
available. That way, if you have a type which is moveable but not 
copyable you can still put it in a container. Does that makes sense?



--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/



Re: Decision on container design

2011-01-28 Thread Andrei Alexandrescu

On 1/28/11 3:05 PM, Michel Fortin wrote:

On 2011-01-28 13:31:58 -0500, Andrei Alexandrescu
seewebsiteforem...@erdani.org said:


Today after work I plan to start making one pass through
std.container. After having thought of things for a long time, my
conclusions are as follows:

1. Containers will be classes.

2. Most of the methods in existing containers will be final. It's up
to the container to make a method final or not.

3. Containers and their ranges decide whether they give away
references to their objects. Sealing is a great idea but it makes
everybody's life too complicated. I'll defer sealing to future
improvements in the language and/or the reflection subsystem.

4. Containers will assume that objects are cheap to copy so they won't
worry about moving primitives.

Any showstoppers, please share.


Not my preferred choices (especially #1), but having containers in
Phobos will certainly be an improvement over not having them. So go ahead!


Well if you brought forth some strong argument I'm all ears. What I see 
for now is that struct containers are just difficult to implement and 
need to have carefully explained semantics, whereas a lot of people know 
how classes behave from day one.



About #4, it'd be nice to have the containers use move semantics when
possible even if they fallback to (cheap) copy semantic when move isn't
available. That way, if you have a type which is moveable but not
copyable you can still put it in a container. Does that makes sense?


That's what I did up until now. It is tantamount to defining a bunch of 
methods (aliases or not) that add to the interface that the user must 
absorb, but that are seldom useful. It just seems that the entire move 
paraphernalia doesn't lift its weight.



Andrei



Re: Decision on container design

2011-01-28 Thread Michel Fortin
On 2011-01-28 17:09:08 -0500, Andrei Alexandrescu 
seewebsiteforem...@erdani.org said:



On 1/28/11 3:05 PM, Michel Fortin wrote:

Not my preferred choices (especially #1), but having containers in
Phobos will certainly be an improvement over not having them. So go ahead!


Well if you brought forth some strong argument I'm all ears. What I see 
for now is that struct containers are just difficult to implement and 
need to have carefully explained semantics, whereas a lot of people 
know how classes behave from day one.


We already argument this over and over in the past. First, I totally 
acknowledge that C++ style containers have a problem: they make it 
easier to copy the content than pass it by reference. On the other side 
of the spectrum, I think that class semantics makes it too easy to have 
null dereferences, it's easy to get lost when you have a container of 
containers.


I have some experience with containers having class-style semantics: in 
Objective-C, I ended up creating a set of macro-like functions which I 
use to initialize containers whenever I use them in case they are null. 
And I had to do more of these utility functions to handle a particular 
data structure of mine which is a dictionary of arrays of objects. In 
C++, I'd have declared this as a map string, vector Object   and 
be done with it; no need for special care initializing each vector, so 
much easier than in Objective-C.


I agree that defining structs to have reference semantics as you have 
done is complicated. But I like the lazy initialization, and we have a 
precedent for that with AAs (ideally, AAs would be a compatible 
container too). Can't we just use the GC instead of reference counting? 
I'd make things much easier. Here is a implementation:


struct Container
{
struct Impl { ... }

private Impl* _impl;
ref Impl impl() @property
{
if (!impl) impl = new Impl;
return *impl;
}

alias impl this;
}

I also believe reference semantics are not to be used everywhere, even 
though they're good most of the time. I'd like to have a way to bypass 
it and get a value-semantic container. With the above, it's easy as 
long as you keep Container.Impl public:


void main() {
Container  lazyHeapAllocatedContainer;
Container.Impl stackAllocatedContainer;
}

void MyObject {
Container.Impl listOfObjects;
}



About #4, it'd be nice to have the containers use move semantics when
possible even if they fallback to (cheap) copy semantic when move isn't
available. That way, if you have a type which is moveable but not
copyable you can still put it in a container. Does that makes sense?


That's what I did up until now. It is tantamount to defining a bunch of 
methods (aliases or not) that add to the interface that the user must 
absorb, but that are seldom useful. It just seems that the entire move 
paraphernalia doesn't lift its weight.


But could we limit this to say that only containers that can return 
elements by ref? Perhaps that won't help. You know the problem better 
than me, I don't really have anything more to say.



--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/



Re: Decision on container design

2011-01-28 Thread bearophile
Michel Fortin:

 On the other side 
 of the spectrum, I think that class semantics makes it too easy to have 
 null dereferences,

That's why I have asked for a @ suffix syntax+semantics for notnull reference 
types in D :-)


 I agree that defining structs to have reference semantics as you have 
 done is complicated. But I like the lazy initialization, and we have a 
 precedent for that with AAs

Built-in AAs are currently broken and in need to be fixed:

import std.stdio: writeln;
void foo(int[int] aa, int n) {
aa[n] = n;
}
void main() {
int[int] a;
foo(a, 0);
writeln(a);
a[1] = 1;
foo(a, 2);
writeln(a);
}

Bye,
bearophile


Re: Decision on container design

2011-01-28 Thread Tomek Sowiński
Michel Fortin napisał:

 We already argument this over and over in the past. First, I totally 
 acknowledge that C++ style containers have a problem: they make it 
 easier to copy the content than pass it by reference. On the other side 
 of the spectrum, I think that class semantics makes it too easy to have 
 null dereferences, it's easy to get lost when you have a container of 
 containers.
 
 I have some experience with containers having class-style semantics: in 
 Objective-C, I ended up creating a set of macro-like functions which I 
 use to initialize containers whenever I use them in case they are null. 
 And I had to do more of these utility functions to handle a particular 
 data structure of mine which is a dictionary of arrays of objects. In 
 C++, I'd have declared this as a map string, vector Object   and 
 be done with it; no need for special care initializing each vector, so 
 much easier than in Objective-C.
 
 I agree that defining structs to have reference semantics as you have 
 done is complicated. But I like the lazy initialization, and we have a 
 precedent for that with AAs (ideally, AAs would be a compatible 
 container too). Can't we just use the GC instead of reference counting? 
 I'd make things much easier. Here is a implementation:
 
   struct Container
   {
   struct Impl { ... }
 
   private Impl* _impl;
   ref Impl impl() @property
   {
   if (!impl) impl = new Impl;
   return *impl;
   }
   
   alias impl this;
   }
 
 I also believe reference semantics are not to be used everywhere, even 
 though they're good most of the time. I'd like to have a way to bypass 
 it and get a value-semantic container. With the above, it's easy as 
 long as you keep Container.Impl public:
 
   void main() {
   Container  lazyHeapAllocatedContainer;
   Container.Impl stackAllocatedContainer;
   }
 
   void MyObject {
   Container.Impl listOfObjects;
   }

Is there anything implementation specific in the outer struct that provides ref 
semantics to Impl? If not, Container could be generic, parametrized by Impl 
type.

Overall, I think a value-like implementation in a referency wrapper is a 
clear-cut idiom, bringing order to otherwise messy struct-implemented 
ref-semantics. Do you know of a existing collection library that exploits this 
idea?

-- 
Tomek



non-ref null arrays [was: Re: Decision on container design]

2011-01-28 Thread spir

On 01/29/2011 01:01 AM, bearophile wrote:

Built-in AAs are currently broken and in need to be fixed:

import std.stdio: writeln;
void foo(int[int] aa, int n) {
 aa[n] = n;
}
void main() {
 int[int] a;
 foo(a, 0);
 writeln(a);
 a[1] = 1;
 foo(a, 2);
 writeln(a);
}

Bye,
bearophile


Variation on the theme:

void foo(int[int] aa, int n) {
aa[n] = n;
}
void foo(int[] a, int n) {
a ~= n;
}
void bar(ref int[int] aa, int n) {
aa[n] = n;
}
void bar(ref int[] a, int n) {
a ~= n;
}

unittest {
int[int] aa;
foo(aa, 3);
writeln(aa.length);

int[] a;
foo(a, 3);
writeln(a.length);

int[int] bb;
bar(bb, 3);
writeln(bb.length);

int[] b;
bar(b, 3);
writeln(b.length);
}

Denis
--
_
vita es estrany
spir.wikidot.com



Re: Decision on container design

2011-01-28 Thread bearophile
Andrei:

 1. Containers will be classes.

This page:
http://www.jroller.com/scolebourne/entry/the_next_big_jvm_language1

A quotation:

3) Everything is a monitor. In Java and the JVM, every object is a monitor, 
meaning that you can synchronize on any object. This is incredibly wasteful at 
the JVM level. Senior JVM guys have indicated large percentage improvements in 
JVM space and performance if we removed the requirement that every object can 
be synchronized on. (Instead, you would have specific classes like Java 5 
Lock)

I have read similar comments in various other places.

What about creating a @nomonitor annotation, for D2 classes to not create a 
monitor for specific classes annotated with it? This may reduce some class 
overhead.

Bye,
bearophile


Re: Decision on container design

2011-01-28 Thread Denis Koroskin
On Sat, 29 Jan 2011 02:32:28 +0300, Michel Fortin  
michel.for...@michelf.com wrote:


On 2011-01-28 17:09:08 -0500, Andrei Alexandrescu  
seewebsiteforem...@erdani.org said:



On 1/28/11 3:05 PM, Michel Fortin wrote:

Not my preferred choices (especially #1), but having containers in
Phobos will certainly be an improvement over not having them. So go  
ahead!
 Well if you brought forth some strong argument I'm all ears. What I  
see for now is that struct containers are just difficult to implement  
and need to have carefully explained semantics, whereas a lot of people  
know how classes behave from day one.


We already argument this over and over in the past. First, I totally  
acknowledge that C++ style containers have a problem: they make it  
easier to copy the content than pass it by reference. On the other side  
of the spectrum, I think that class semantics makes it too easy to have  
null dereferences, it's easy to get lost when you have a container of  
containers.


I have some experience with containers having class-style semantics: in  
Objective-C, I ended up creating a set of macro-like functions which I  
use to initialize containers whenever I use them in case they are null.  
And I had to do more of these utility functions to handle a particular  
data structure of mine which is a dictionary of arrays of objects. In  
C++, I'd have declared this as a map string, vector Object   and  
be done with it; no need for special care initializing each vector, so  
much easier than in Objective-C.


I agree that defining structs to have reference semantics as you have  
done is complicated. But I like the lazy initialization, and we have a  
precedent for that with AAs (ideally, AAs would be a compatible  
container too). Can't we just use the GC instead of reference counting?  
I'd make things much easier. Here is a implementation:


struct Container
{
struct Impl { ... }

private Impl* _impl;
ref Impl impl() @property
{
if (!impl) impl = new Impl;
return *impl;
}

alias impl this;
}



Unfortunately, this design has big issues:


void fill(Appender appender)
{
appender.put(hello);
appender.put(world);
}

void test()
{
Appenderstring appender;
fill(appender); // Appender is supposed to have reference semantics
assert(appender.length != 0); // fails!
}

Asserting above fails because at the time you pass appender object to the  
fill method it isn't initialized yet (lazy initialization). As such, a  
null is passed, creating an instance at first appending, but the result  
isn't seen to the caller.


An explicit initialization is needed to work around this design issue. The  
worst thing is that in many cases it would work fine (you might have  
already initialized it indirectly) but sometimes you get unexpected  
result. I got hit by this in past, and it wasn't easy to trace down.


As such, I strongly believe containers either need to have copy semantics,  
or be classes. However, copy semantics contradicts with the cheap copy  
ctor idiom because you need to copy all the elements from source  
container.


I also believe reference semantics are not to be used everywhere, even  
though they're good most of the time. I'd like to have a way to bypass  
it and get a value-semantic container. With the above, it's easy as long  
as you keep Container.Impl public:


void main() {
Container  lazyHeapAllocatedContainer;
Container.Impl stackAllocatedContainer;
}

void MyObject {
Container.Impl listOfObjects;
}



About #4, it'd be nice to have the containers use move semantics when
possible even if they fallback to (cheap) copy semantic when move isn't
available. That way, if you have a type which is moveable but not
copyable you can still put it in a container. Does that makes sense?
 That's what I did up until now. It is tantamount to defining a bunch  
of methods (aliases or not) that add to the interface that the user  
must absorb, but that are seldom useful. It just seems that the entire  
move paraphernalia doesn't lift its weight.


But could we limit this to say that only containers that can return  
elements by ref? Perhaps that won't help. You know the problem better  
than me, I don't really have anything more to say.






--
Using Opera's revolutionary email client: http://www.opera.com/mail/


Re: Decision on container design

2011-01-28 Thread Michel Fortin

On 2011-01-28 20:10:06 -0500, Denis Koroskin 2kor...@gmail.com said:


Unfortunately, this design has big issues:


void fill(Appender appender)
{
 appender.put(hello);
 appender.put(world);
}

void test()
{
 Appenderstring appender;
 fill(appender); // Appender is supposed to have reference semantics
 assert(appender.length != 0); // fails!
}

Asserting above fails because at the time you pass appender object to 
the  fill method it isn't initialized yet (lazy initialization). As 
such, a  null is passed, creating an instance at first appending, but 
the result  isn't seen to the caller.


That's indeed a problem. I don't think it's a fatal flaw however, given 
that the idiom already exists in AAs.


That said, the nice thing about my proposal is that you can easily 
reuse the Impl to create a new container to build a new container 
wrapper with the semantics you like with no loss of efficiency.


As for the case of Appender... personally in the case above I'd be 
tempted to use Appender.Impl directly (value semantics) and make fill 
take a 'ref'. There's no point in having an extra heap allocation, 
especially if you're calling test() in a loop or if there's a good 
chance fill() has nothing to append to it.


That's the issue with containers. The optimal semantics always change 
depending on the use case.



An explicit initialization is needed to work around this design issue. 
The  worst thing is that in many cases it would work fine (you might 
have  already initialized it indirectly) but sometimes you get 
unexpected  result. I got hit by this in past, and it wasn't easy to 
trace down.


As such, I strongly believe containers either need to have copy 
semantics,  or be classes. However, copy semantics contradicts with the 
cheap copy  ctor idiom because you need to copy all the elements from 
source  container.


Personally, I'm really concerned by the case where you have a container 
of containers. Class semantics make things really complicated as you 
always have to initialize everything in the container explicitly; value 
semantics makes things semantically easier but quite inefficient as 
moving elements inside of the outermost container implies copying the 
containers. Making containers auto-initialize themselves on first use 
solves the case where containers are references-types; making 
containers capable of using move semantics solves the problem for 
value-type containers.



--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/



Re: Decision on container design

2011-01-28 Thread Michel Fortin

On 2011-01-28 19:00:02 -0500, Tomek Sowiński j...@ask.me said:


Michel Fortin napisał:


We already argument this over and over in the past. First, I totally
acknowledge that C++ style containers have a problem: they make it
easier to copy the content than pass it by reference. On the other side
of the spectrum, I think that class semantics makes it too easy to have
null dereferences, it's easy to get lost when you have a container of
containers.

I have some experience with containers having class-style semantics: in
Objective-C, I ended up creating a set of macro-like functions which I
use to initialize containers whenever I use them in case they are null.
And I had to do more of these utility functions to handle a particular
data structure of mine which is a dictionary of arrays of objects. In
C++, I'd have declared this as a map string, vector Object   and
be done with it; no need for special care initializing each vector, so
much easier than in Objective-C.

I agree that defining structs to have reference semantics as you have
done is complicated. But I like the lazy initialization, and we have a
precedent for that with AAs (ideally, AAs would be a compatible
container too). Can't we just use the GC instead of reference counting?
I'd make things much easier. Here is a implementation:

struct Container
{
struct Impl { ... }

private Impl* _impl;
ref Impl impl() @property
{
if (!impl) impl = new Impl;
return *impl;
}

alias impl this;
}

I also believe reference semantics are not to be used everywhere, even
though they're good most of the time. I'd like to have a way to bypass
it and get a value-semantic container. With the above, it's easy as
long as you keep Container.Impl public:

void main() {
Container  lazyHeapAllocatedContainer;
Container.Impl stackAllocatedContainer;
}

void MyObject {
Container.Impl listOfObjects;
}


Is there anything implementation specific in the outer struct that provides
ref semantics to Impl? If not, Container could be generic, parametrized by
Impl type.


You could provide an implementation-specific version of some functions 
as an optimization. For instance there is no need to create the Impl 
when asking for the length, if the pointer is null, length is zero. 
Typically, const function can be implemented in the outward container 
with a shortcut checking for null.



Overall, I think a value-like implementation in a referency wrapper is 
a clear-cut idiom, bringing order to otherwise messy struct-implemented 
ref-semantics. Do you know of a existing collection library that 
exploits this idea?


No. Only associative arrays in D do that, that I know of.

--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/