Thanks for the comprehensive answer, helped a lot! Tim
On Sat, May 10, 2014 at 12:57 PM, Tamás Nepusz <[email protected]> wrote: > Hi, > > > - find node object based on keyword attribute > Okay, this is a bit undocumented and must be improved for the next > version, but here’s the trick: the “name” vertex attribute is indexed > g.vs.find() is the right way to go but keep in mind that it has O(n) > complexity since node attributes are not indexed -- except the “name” > attribute, which is indexed. igraph will take advantage of this if you > invoke g.vs.find() with a positional argument only *and* the type of this > positional argument is string. (Integers won’t work because g.vs.find(X) > will return the node with internal ID equal to X). So, this is what I have > on my machine: > > In [1]: g=Graph.Barabasi(100000) > In [2]: g.vs["name"] = map(str, range(100000)) > In [3]: g.vs["another_attribute"] = map(str, range(100000)) > In [4]: %timeit g.vs.find(another_attribute="99999") > 100 loops, best of 3: 17.8 ms per loop > In [5]: %timeit g.vs.find(name="99999") > 100 loops, best of 3: 18.6 ms per loop > > but: > > In [6]: %timeit g.vs.find("99999") > 100000 loops, best of 3: 2.13 us per loop > > The first issue here is that this behaviour is undocumented; the second is > that it should work even if you specify g.vs.find(name=“99999”) since the > two operations are essentially equivalent. An additional advantage of using > a “name” attribute with type string is that you can use these names > wherever igraph expects a node or node ID, _without_ having to look up the > node explicitly using find(). However, using the “name” attribute comes at > a cost since its internal index has to be invalidated every time you modify > the graph and then rebuilt the next time you try to look up a node by name. > > > - add new node to graph (add_vertex) [since this returns none I need to > > "find" the new node obj] > Here you can take advantage of the fact that the newly added node always > receives the first available integer node ID, so its ID will be > g.vcount()-1 after the addition. This eliminates the need for .find() since > you can simply use g.add_edge(paper_id_node, g.vcount()-1). > > > Currently this is very slow. Any tips? > As Gabor said: igraph’s data structure is optimized for static graphs, so > adding edges one by one is slower than adding them in batches. Maybe it is > faster to do something like this: > > id_gen = UniqueIdGenerator() > edgelist = [] > for paper_id_1, paper_id_2 in whatever_generator_generates_your_edges: > edgelist.append((id_gen[paper_id_1], id_gen[paper_id_2])) > g = Graph(edgelist) > g.vs[“name”] = id_gen.values() > > -- > T. >
_______________________________________________ igraph-help mailing list [email protected] https://lists.nongnu.org/mailman/listinfo/igraph-help
