On Wed, 02 Nov 2011 08:40:19 -0400, Ary Manzana <[email protected]> wrote:

On 11/2/11 8:48 AM, Steven Schveighoffer wrote:

The basic response to this is, when dealing with containers generically
(that is, you know you have a container, but you don't know what type),
the "remove this element" operation is not necessarily a good primitive
to have.

Simply because from the myriad of containers, only some can implement
this operation efficiently. Java embeds this operation in the interface,
which means any interface you have to a container could potentially use
O(n) time to remove that element. Such an innocuous piece of syntax
*should* have a cost if it's not efficient IMO.

BTW, the original question doesn't provide enough information to say
"remove this element." Even in Java, if you aren't using the default
comparison, you must use a comparator method to determine which one to
remove. If cell.x == x && cell.y == y *is* the comparison operator for
the type, then the syntax gets much simpler, because you don't need to
pass a specialized comparison function.

In dcollections, removing a specific element (using the default
comparison operator for that element) on a *fast lookup* container is as
simple as:

container.remove(container.find(x));

Which removes the element x if it's found. However, this is not defined
for containers which use O(n) time to search (such as linked list), you
must use std.algorithm.find for that:

container.remove(find(container[], x).begin);

Should work, and takes O(n) time.

-Steve

I don't really understand what's wrong with inefficient methods. You can have inefficient methods that are convenient, like removing an element by the default comparison, or giving it a delegate to match the element(s) to remove.

You profile your application. Is that method the bottle-neck? If so, you change it to a more efficient one. If not, you are happy you had that method there, performing in an inefficient way, but which doesn't matter that much compared to, say, opening an SQL connection.

Programmers want to program, fast. They have schedules, they need to deliver. They don't need to always find the best solution. They can find a compromise between "working" and "fast", move on, and later profile and worry about what matters most.

Programmers don't want to fight with the language or think "Oh, so to remove an element I need to use this operation and combine it with that one and with that other one"...

Or use the right container for the job?

Where it really comes into play is generic programming.

Let's say I write some algorithm that removes certain elements from a container:

removeElements(C, T)(C c, T t[]...)
{
   foreach(x; t)
     c.remove(t);
}

What's the complexity of this algorithm? For a HashSet, for instance, it will be O(n) where n is the number of elements to remove.

But for an ArrayList, it will be O(n*m) where m is the number of elements in c.

So what we get is, an algorithm whose complexity depends on the complexity of an operation that varies *widely*. What you end up with is things like sorting algorithms which are O(n^2) or even worse.

What omitting those methods from the containers that don't support it does is allow you to make *predictably efficient* generic algorithms. For example, you know sorting is going to be at most O(nlgn). I don't care how beautiful the syntax is, a quadratic sort is epic fail, and should be avoided at all costs.

std.container tries to allow having these inefficient methods by changing the name (so algorithms can still claim efficiency by not using those names). My stance is this makes the API overly complex for very little gain. If something is inefficient, either don't use it, or use the range interface via std.algorithm. Why should I write a linear search algorithm when std.algorithm already has done this?

-Steve

Reply via email to