Re: std.container & ranges

Steven Schveighoffer Wed, 02 Nov 2011 06:15:33 -0700

On Wed, 02 Nov 2011 08:40:19 -0400, Ary Manzana <[email protected]> wrote:

On 11/2/11 8:48 AM, Steven Schveighoffer wrote:
The basic response to this is, when dealing with containers generically
(that is, you know you have a container, but you don't know what type),
the "remove this element" operation is not necessarily a good primitive
to have.

Simply because from the myriad of containers, only some can implement
this operation efficiently. Java embeds this operation in the interface,
which means any interface you have to a container could potentially use
O(n) time to remove that element. Such an innocuous piece of syntax
*should* have a cost if it's not efficient IMO.

BTW, the original question doesn't provide enough information to say
"remove this element." Even in Java, if you aren't using the default
comparison, you must use a comparator method to determine which one to
remove. If cell.x == x && cell.y == y *is* the comparison operator for
the type, then the syntax gets much simpler, because you don't need to
pass a specialized comparison function.

In dcollections, removing a specific element (using the default
comparison operator for that element) on a *fast lookup* container is as
simple as:

container.remove(container.find(x));

This removes the element x if it's found. However, this is not defined
for containers which use O(n) time to search (such as a linked list); for
those you must use std.algorithm.find:

container.remove(find(container[], x).begin);

Should work, and takes O(n) time.

-Steve
I don't really understand what's wrong with inefficient methods. You can have inefficient methods that are convenient, like removing an element by the default comparison, or giving it a delegate to match the element(s) to remove.
You profile your application. Is that method the bottle-neck? If so, you change it to a more efficient one. If not, you are happy you had that method there, performing in an inefficient way, but which doesn't matter that much compared to, say, opening an SQL connection.
Programmers want to program, fast. They have schedules, they need to deliver. They don't need to always find the best solution. They can find a compromise between "working" and "fast", move on, and later profile and worry about what matters most.
Programmers don't want to fight with the language or think "Oh, so to remove an element I need to use this operation and combine it with that one and with that other one"...


Or use the right container for the job?

Where it really comes into play is generic programming.

Let's say I write some algorithm that removes certain elements from a container:


void removeElements(C, T)(C c, T[] t...)
{
   foreach(x; t)
     c.remove(x); // remove each requested element in turn
}

What's the complexity of this algorithm? For a HashSet, for instance, it will be O(n) where n is the number of elements to remove.

But for an ArrayList, it will be O(n*m) where m is the number of elements in c.

So what we get is an algorithm whose complexity depends on the complexity of an operation that varies *widely*. What you end up with is things like sorting algorithms which are O(n^2) or even worse.

What omitting those methods from the containers that don't support it does is allow you to make *predictably efficient* generic algorithms. For example, you know sorting is going to be at most O(n lg n). I don't care how beautiful the syntax is, a quadratic sort is epic fail, and should be avoided at all costs.
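
To make that concrete, here is a rough sketch of how a generic removeElements could refuse containers without a fast removal primitive. It is written against current Phobos (std.container) rather than dcollections, and it assumes that having a removeKey member is a fair stand-in for "fast lookup"; that choice of test is mine for illustration, not anything either library prescribes.

```d
import std.algorithm.comparison : equal;
import std.container.dlist : DList;
import std.container.rbtree : redBlackTree;

/// Removes each of `items` from `c` and returns how many were removed.
/// The constraint only admits containers that offer `removeKey`, so for a
/// RedBlackTree the total cost is O(n lg m) for n items and m elements.
size_t removeElements(C, T)(C c, T[] items...)
if (is(typeof(C.init.removeKey(T.init))))
{
    size_t removed;
    foreach (x; items)
        removed += c.removeKey(x); // removeKey returns the number removed
    return removed;
}

unittest
{
    auto tree = redBlackTree(1, 2, 3, 4, 5);
    assert(removeElements(tree, 2, 4, 9) == 2); // 9 is not in the tree
    assert(equal(tree[], [1, 3, 5]));

    // A linked list has no removeKey, so the call is rejected at compile time.
    static assert(!__traits(compiles, removeElements(DList!int(1, 2, 3), 2)));
}
```

With this shape the algorithm cannot silently degrade to O(n*m): handing it a linked list is a compile error instead of a slow program.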

std.container tries to allow having these inefficient methods by changing the name (so algorithms can still claim efficiency by not using those names). My stance is this makes the API overly complex for very little gain. If something is inefficient, either don't use it, or use the range interface via std.algorithm. Why should I write a linear search algorithm when std.algorithm already has done this?
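
For reference, a minimal sketch of what the range-based route looks like with a linked list in current Phobos. DList, std.algorithm's find and the linearRemove name are real Phobos pieces, but the removeFirst helper is hypothetical and only there to show the two steps together.

```d
import std.algorithm.comparison : equal;
import std.algorithm.searching : find;
import std.container.dlist : DList;
import std.range : take;

/// Hypothetical helper: removes the first element equal to `value`.
/// Returns true if an element was removed.
bool removeFirst(T)(ref DList!T list, T value)
{
    auto found = list[].find(value);  // O(n) linear search from std.algorithm
    if (found.empty)
        return false;
    list.linearRemove(found.take(1)); // the name flags the non-constant cost
    return true;
}

unittest
{
    auto list = DList!int(1, 2, 3, 2);
    assert(removeFirst(list, 2));
    assert(equal(list[], [1, 3, 2])); // only the first 2 is gone
    assert(!removeFirst(list, 7));    // nothing to remove
}
```

The search is spelled out as find, so the O(n) cost is visible at the call site instead of hiding behind a remove(x) that looks cheap.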


-Steve
