Jonathan M Davis wrote:
On Thursday, October 07, 2010 14:44:50 Stanislav Blinov wrote:
But that is not a matter of library interface isn't it? It's a matter of
algorithm/container choice. It's not the push_back that was slow in the
end, it was std::vector (yes, that's arguable, but the point is that
container defines rules for its methods, not vice-versa).
Except that when you're dealing with generic code which has to deal with
multiple container types (like std.algorithm), you _need_ certain complexity
guarantees about an operation since it could happen on any container that it's
given. Using operator in in an algorithm could be perfectly reasonable if it had
O(1) or O(log n) complexity but be completely unreasonable if it had O(n)
complexity. So, the algorithm then is reasonable with some containers and not
others when if in were restricted to O(1) or O(log n), it could be used by the
algorithm without worrying that a container would be used with it which would
make it an order of complexity greater, or worse.
If it were strictly a matter of writing code which directly used a particular
algorithm and container type, then you could know what the complexities of the
algorithm and operations on the container are, but once you're dealing with
generic code, operations need to have complexity guarantees regardless of their
container, or the complexity of the algorithm will vary with the container type.
And if that algorithm is used in yet another algorithm, then it gets that much
worse. It can quickly become easy to bury the place where what you're trying to
do becomes an order of complexity greater, because the container that you
selected was an order of complexity greater than other containers on an
operation that an algorithm is using buried in code somewhere.
- Jonathan M Davis
Yet still, generality ends at some point. You can't devise every
possible algorithm for any possible types and have it have set-in-stone
complexity independently of types.
Take std.range.popFrontN(). It's generic, and it's used in other
algorithms. Yet it has O(1) complexity for ranges that support slicing,
and O(n) for those that do not. But you don't take into account
complexity of slicing operation, or complexity of stepping through the
range.
Or std.algorithm.find(). It's basically O(n), but then again, when using
it, one should also consider complexity of used predicate.
Just the same happened in the case Andrei described: the algorithm was
O(n) judging from the description (performing n insertions into
container). But the container itself blew it up into quadratic time
because of it's own insertion algorithm.
What I mean is you'll always have algorithms that will perform
differently for different containers, and you'll always have to choose
containers that best fit your needs. Generic code is not only about
efficiency: you'd at least expect it to work for different types.
Replacing vector with list in Andrei's case would probably solve the
problem at the cost of losing random access (together with contiguous
storage). Which means it'd also require changing code if random access
was in use somewhere. Having a generic random access function
at(Container,index) (offering complexity that varies with container
used: e.g. O(1) for arrays, O(n) for lists) would save you maintenance.