Re: std.container update

Andrei Alexandrescu Thu, 27 May 2010 07:25:33 -0700

On 05/27/2010 08:23 AM, Steven Schveighoffer wrote:

You're making a number of great points in the four posts starting with
this. Since they sort of augment one another, let me quote and answer
them all here.

On Thu, 27 May 2010 00:36:24 -0400, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:

I just implemented a singly-linked list type to illustrate the
container abstraction.

http://erdani.com/d/phobos/std_container.html
http://erdani.com/d/phobos/container.d

Please let me know of how you find it. There are a few refinements
 and ancillary functions to define, but those are pretty clear
given the rest.


Well, one thing I don't like about it is that you cannot do O(1)
insertion or removal. insertBefore requires O(n) complexity and
insertAfter requires O(m) complexity. Removal will require O(n)
complexity, even though it doesn't exist right now.

Fast removal and insertion are key to having a linked list. I'd
suggest modifying the range so it uses a Node ** instead, which
points to the _next element of the previous node of the one currently
being iterated, or _root if it's the first node in the list.

-Steve


The performance profile of SList is what one would expect for a
straightforward singly-linked list implemented with nodes, pointer to
root, and the such. You are making a good point that adding one 1-2
extra indirections would lead to improvements of certain primitives.
There have been various experiments with STL-like slist implementations
that dwell on such ideas. See e.g.
http://manpages.ubuntu.com/manpages/lucid/man3/std::forward_list.3cxx.html
with functions like before_begin() in addition to begin(), and also
insert_after() instead of insert().

It has become clear to STL implementors that adding hidden indirections
puts forward_list at a disadvantage compared to the straightforward
lists against which they compete. I want to avoid that by abiding to the
straight storage strategy and deal with its consequences.

Second post:

In fact, I'd say that it would be critical to split the insert and
remove functions to slow (O(n) or greater) versions and fast( O(lg(n)
or less)) functions.  Some algorithms may depend on fast insertion
and removal, such as mergesort on a linked list.


This is a good idea, which generalizes linearRemove(). I'd suspected if
there's one place to distinguish linear from better, then there ought to
be more.

From here we have a fork in the road:

a) Tighten complexity requirements for e.g. insertBefore(Range, Stuff)
and let containers that can't implement them just not have them.

b) Define linearInsertBefore(Range, Stuff) etc.

I'm leaning towards (a) because a list can insert virtually anywhere by
only using insertAfter(Take!Range, Stuff) (which I haven't defined yet

but I will, and which also takes care of the complexity issue) andinsertFront(Stuff).


Third post:

Actualy mergesort could not be implemented without some sort of way
to restructure the list without allocating new nodes, requiring
knowledge of the link structure.  I'm not sure it could be
generic... I'll think about this a bit.


You are entirely right. A mergesort that relies on link shuffling needs
to have intimate knowledge of the node structure. It would most likely
be a member of the list.

Fourth post:

I have another general question in general about these collections.

Ranges fix a very bad problem with iterators -- non-matching begin
and end iterators.  For example, passing a begin from one list and
and end from another.


Right.

However, all your range functions such as remove, insertBefore, etc.
are called via the container type, using a range as an argument.  But
there can be no static verification that a range actually is part of
a given container.  Should there be some requirement that a container
validates that a range is part of the container before performing the
operation?  This is the case for dcollections.

In your current implementation of SList, verification seems
incidental for insertBefore because you must find the previous node
from the root anyways (and that will throw on enforce if it cannot
find it).  But insertAfter does not, so something like this will
compile, run, and result in strange behavior:

auto sl1 = make(SList!int, 1, 2, 3, 4, 5); // I may not have this
correct, but it's not important auto sl2 = make(SList!int, 6, 7, 8,
9, 10);

auto r1 = sl1[]; r1.popFront(); r1.popFront();

auto r2 = take(r1, 2); sl2.insertAfter(r2, 666); // actually inserts
into sl1


It's a good question.

My current view of the matter is to maximize library flexibility andversatility. Following the adage that you can build safe on fast but notthe other way around, I'd go with the following philosophy: checking ofrange membership to their collections should be made only to the extentof ensuring memory safety. Beyond that, if efficiency is in tension withverifiability (as is the case with your example above), leave it to theprogrammer or the supra-structure to correctly pair collections withtheir ranges.


Sounds good?


Andrei

Re: std.container update

Reply via email to