Thanks, Tamás, for your answer. It's much clearer now .
On Thu, Feb 27, 2014 at 1:51 AM, Tamás Nepusz <[email protected]> wrote: > Hi, > > > I was wondering what's the rationale behind not returning a reference > object to newly created vertices and edges in Python (or at least an index)? > References to newly created vertices and edges are not returned because > there is actually no new vertex/edge object created when you do the > addition. The core of igraph is written in C, so adding a vertex involves > increasing a variable that stores the number of vertices, plus extending a > few index vectors; similarly, adding an edge involves extending some > vectors and recreating some indices. The Vertex and Edge objects you see in > Python are actually dead simple proxies that store a reference to the graph > they are a part of, plus the index of the vertex or edge that they > represent. Creating such a proxy object just for the sake of returning it > is an overhead, especially considering that they will be thrown away > immediately. > > You are right that we could at least return the index of the newly created > edge or vertex; we do not do this because igraph simply allocates the next > available vertex or edge ID. So, if you have n vertices and add one more, > the newly created vertex will have ID=n (and the same for edges). > > > If I am reading a large graph file (special-format, let's assume), > checking if a vertex (or an edge between two vertices) has been already > created is really problematic. > How about using g.are_connected(source, target)? This method automatically > accepts vertex names and converts them to vertex IDs. > > > I am assuming that "find" takes O(n) time where n is number of nodes. So > this is an O(n*n) time operation. > > True, but most methods accept vertex names in place of vertex IDs, and > they use an internal dictionary that maps vertex names back to vertex IDs > to do the conversion. So, for instance, using > g.vs.find(name="whatever").degree() is O(n), but using g.degree("whatever") > is O(1). This is because the find() method cannot assume that the VertexSeq > it is called on represents the entire vertex set (it might represent a > subset or a permutation of it) so we cannot use a plain dictionary lookup. > > For what it's worth, my goal is to allow the usage of vertex names in > place of vertex IDs for every igraph method that accepts vertices or lists > of vertices. Some of the methods have slipped through the cracks, so for > instance g.get_eid("A", "B") does not work yet (even though > g.are_connected("A", "B") does). If you find a method that should accept > vertex names but does not accept it yet, please file a bug report at > https://github.com/igraph/igraph and I'll fix it. > > By the way, if you are constructing a large graph with repeated calls to > add_vertex() and add_edge(), then you will probably encounter another > bottleneck as your graph grows. igraph's data structures are optimized for > static graphs (i.e. graphs where vertex and edge mutations are far less > frequent than queries). Every time you add an edge, some internal data > structures have to be rebuilt from scratch, which effectively means that it > is almost as fast to add, say, thousands of edges at the same time as the > addition of a single edge. In most cases, it is way faster to create the > edge list of your graph in advance and then add all the edges at once. > > Alternatively, you may take a look at the Graph.DictList() constructor, > which takes two iterables, one for the vertices and one for the edges. Both > iterables should yield dictionaries; keys of the dictionaries are converted > to vertex/edge attribute names and the corresponding values are converted > to vertex/edge attribute values. The "source" and "target" keys in the > dictionaries corresponding to edges are automatically used to look up the > source and target vertex efficiently, and Graph.DictList() is also capable > of "batching" edge additions in larger chunks to alleviate the overhead of > repeated calls to add_edges(). > > All the best, > T. > > > _______________________________________________ > igraph-help mailing list > [email protected] > https://lists.nongnu.org/mailman/listinfo/igraph-help >
_______________________________________________ igraph-help mailing list [email protected] https://lists.nongnu.org/mailman/listinfo/igraph-help
