Ah, that point about the attributes is interesting. I hadn't realized that that was pluggable. It makes sense now that you describe it, though.
After poking at my problem a bit more, I realized that having the graph persist may not be worth the effort it would take to implement. Thanks for the pointers! -Cale On Wed Feb 25 2015 at 4:22:53 PM Tamas Nepusz <[email protected]> wrote: > > Thanks, Tamas. I'll add a little more color to what I'm doing. > Thanks for the clarification, it is a bit clearer now. > > > The advantage I see of memory mapping is that if we can get to the point > > where a particular igraph_t (and all of the data it references) is backed > > by a memory mapped file, then loading it doesn't involve parsing at all. > Getting all the memory that igraph_t refers to either directly or > indirectly > is a bit complicated, especially when igraph is embedded in a higher level > language. Let me explain it. igraph_t itself is basically six growable > vectors > (to store the edge list and the first and second level indexes of the edge > list) plus a pointer to another chunk of memory that is used by the > attribute > handler to store the graph/vertex/edge attributes. Attribute handling is > decoupled from the "core" igraph library to allow different igraph > interfaces > (R, Python, Ruby etc) to bring their own implementations of the attribute > handler -- this is how we ensure that the Python interface can store > arbitrary > Python objects as attributes while the R interface can store arbitrary > R objects. Mapping the six vectors to a file is probably doable with a bit > of > hacking -- however, mapping the memory that is used for storing the > attributes > is probably impossible without hacking the host language itself. For > instance, > in the Python interface, the pointer in igraph_t points to an array > containing > three Python dicts; one for the graph attributes, one for the vertex > attributes > and one for the edge attributes. Each of these dicts in turn probably > refer to > other Python objects (the keys and values of the dicts), but these cannot > be > memory-mapped easily because the Python interpreter is responsible for > managing > the memory that they occupy. > > So, *if* you don't use graph, vertex and edge attributes at all, then > *maybe* > it is doable, although I'm not sure about the amount of hacking involved. > Theoretically, igraph_t is an opaque data type for all parts of the igraph > library except a single file: type_indexededgelist.c. The idea is that you > could implement a, say, type_mmapped_edgelist.c, replace > type_indexededgelist.c > with it, recompile igraph and then you get the same library with a > different > storage method for graphs. The truth is that no one ever really tried it > (as > far as I know). > > > Also, I don't think it's the case that "you have to load the entire graph > > into memory at some point". If using a memory mapped file, I think you'd > be > > able to rely on the os to only have the subset of pages resident that you > > actually need to run the particular graph operations. > True -- I was sort of assuming that the analysis you would like to perform > would need the entire graph sooner or later anyway, so you would be just > spreading out the cost of loading the graph across a longer time interval. > > T. > > _______________________________________________ > igraph-help mailing list > [email protected] > https://lists.nongnu.org/mailman/listinfo/igraph-help >
_______________________________________________ igraph-help mailing list [email protected] https://lists.nongnu.org/mailman/listinfo/igraph-help
