In further performance debugging, I discovered that vertex removal is also an O(|V|) operation. This makes sense given the fact that accessing predecessors is O(|V|), because the predecessor links have to be removed in order to properly remove a vertex. Stepping back, however, I still think it's a bad default for a generic mutable graph library.

The hack I came up with for this one is even less elegant that the predecessors issue. I added a set of removed vertices to my wrapper graph type, and changed the wrapper vertex removal function to remove all the relevant edges and put the vertex into the removed set. Vertex member/iter/fold operations need to check the removed set to determine whether a vertex is actually in the graph or not. Periodically, when the set of removed vertices has gotten large, I make a new graph and copy in just the "real" vertices from the old graph.

If the library were to include a version that maintains back links, it would incur the space overhead and (small in my opinion) time overhead of my predecessors wrapper hack. It could dispense with the vertex removal hack, though, because the back links could be deleted directly.

Ben


On Aug 10, 2009, at 5:19 AM, kche...@math.carleton.ca wrote:

Perhaps something like that in ConcreteBidirectional
be implemented for general Digraph so predecessors
can be accessed in O(1)?  If I am not mistaken, that
will double the storage and running time of most of
the operations.  This implementation could be added
as an additional variant without modifying existing
code.

Kevin Cheung.

From: ri...@happyleptic.org
What you're asking is similar to the problem of finding the
predecessor
of an arbitrary node in a singly-linked-list. You have no option but
to
scan the whole list to find its predecessor.  If you had a
doubly-linked-list, predecessor lookups would work easily, but that's
a
different data structure, with much more overhead.

Much more overhead, really ?
So this is for performance reasons that all functionnal languages
promote singly-linked lists, while for instance in Linux every list
is implemented with a doubly linked list for purely ideological reasons
?

:-)

Yes indeed, much more overhead. But the source is not the fact you
have to maintain backlinks, but their impact on the GC.
With a GC, any modification on existing values has a cost, since you
have to keep track of them independently of the value itself.
Since linux has no GC, using doubly linked lists has only a very
limited cost, mostly related to the extra space needed.
By the way, BSD uses lots of singly-linked lists, probably because it
comes from a time when there was not so much memory.

Jacques Garrigue


_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Reply via email to