Hi,

This could be a bug in the pickle implementation (not in igraph, but in
Python itself):

https://stackoverflow.com/questions/31468117/python-3-can-pickle-handle-byte-objects-larger-than-4gb
https://bugs.python.org/issue24658

The workaround is to pickle the object into a bytes string, and then write
that string into a file in chunks smaller than 2^31 bytes.

However, note that pickle is not a terribly efficient format. Since it
needs to support serializing an arbitrary set of Python objects that may
link to each other and form cycles in any conceivable configuration, it has
to do a lot of extra bookkeeping so that object cycles and objects embedded
within themselves do not trip up the implementation. That's why the memory
usage rockets up to 35 GB during pickling.

If you only have a name and one additional attribute for each vertex, you
could potentially gain some speed (and cut down on the memory usage) if you
brew your own custom format. For instance, you could get the edge list and
the two vertex attributes, stuff them into a Python dict, and then save the
dict in JSON format:

    import json

    def graph_as_json(graph):
        # Plain lists and dicts only, so the result is JSON-serializable
        return {
            "vertices": {
                "name": graph.vs["name"],
                "pt": graph.vs["pt"]
            },
            "edges": graph.get_edgelist()
        }

    with open("output.json", "w") as fp:
        json.dump(graph_as_json(graph), fp)

You could also use gzip.open() instead of open() to compress the saved data
on-the-fly; open the file in text mode ("wt") in that case, because
json.dump() writes str rather than bytes. You'll also need a json_as_graph()
function to perform the conversion in the opposite direction.

T.

On Tue, Apr 18, 2017 at 9:25 PM, Nick Eubank <nickeub...@gmail.com> wrote:

> Hello all,
>
> I'm trying to pickle a very large graph (23 million vertices, 152 million
> edges, two vertex attributes), but keep getting an `OSError: [Errno 22]
> Invalid argument` error. However, I think that's erroneous, as if I
> subsample the graph and save with exact same code I have no problems.
> Here's the traceback:
>
> g.summary()
> Out[8]: 'IGRAPH UN-- 23331862 152099394 -- \n+ attr: name (v), pt (v)'
>
> g.write_pickle(fname='graphs/with_inferred/vz_inferred3_sms{}_voz{}.pkl'.format(sms, voz))
> Traceback (most recent call last):
>
>   File "<ipython-input-9-6b5409a79251>", line 1, in <module>
>     g.write_pickle(fname='graphs/with_inferred/vz_inferred3_sms{}_voz{}.pkl'.format(sms, voz))
>
>   File "/Users/Nick/anaconda/lib/python3.5/site-packages/igraph/__init__.py", line 1778, in write_pickle
>     result=pickle.dump(self, fname, version)
>
> OSError: [Errno 22] Invalid argument
>
> g=g.vs[range(3331862)].subgraph()
>
> g.write_pickle(fname='graphs/with_inferred/vz_inferred3_sms{}_voz{}.pkl'.format(sms, voz))
>
> [success]
>
> The graph takes up about 10gb in memory, and the pickle command expands
> Python's memory footprint to about 35gb before the exception gets thrown,
> but I'm on a machine with 80gb ram, so that's not the constraint.
>
> Any suggestions as to what might be going on / is there a work around for
> saving?
>
> Thanks!
>
> Nick
>
> _______________________________________________
> igraph-help mailing list
> igraph-help@nongnu.org
> https://lists.nongnu.org/mailman/listinfo/igraph-help
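
To make the chunked-write workaround mentioned at the top a bit more concrete, here is a minimal sketch using only the standard library. The function names (dump_in_chunks, load_in_chunks) and the MAX_CHUNK constant are made up for illustration; they are not part of igraph or pickle. It assumes the whole pickle fits in memory as a single bytes object, which seems to hold on your machine given the 35 GB peak.

```python
import pickle

# Assumption: staying just below 2**31 bytes per write/read call avoids the
# OS-level limit discussed in the linked bug report.
MAX_CHUNK = 2**31 - 1


def dump_in_chunks(obj, path, chunk_size=MAX_CHUNK):
    """Pickle obj in memory, then write the bytes to path in slices."""
    data = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    view = memoryview(data)  # slicing a memoryview does not copy the data
    with open(path, "wb") as fp:
        for start in range(0, len(view), chunk_size):
            fp.write(view[start:start + chunk_size])


def load_in_chunks(path, chunk_size=MAX_CHUNK):
    """Read path back in slices and unpickle the reassembled bytes."""
    buf = bytearray()
    with open(path, "rb") as fp:
        while True:
            chunk = fp.read(chunk_size)
            if not chunk:
                break
            buf += chunk
    return pickle.loads(buf)
```

Since write_pickle() itself goes through pickle.dump(), calling dump_in_chunks(g, "graph.pkl") on the graph object should produce an equivalent file, but I have not tried it on a graph of this size.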
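
For the JSON route, the reverse conversion and the gzip variant could look roughly like the sketch below. The helper names save_graph_dict and load_graph_dict are hypothetical, and json_as_graph() assumes the dict layout produced by graph_as_json() above plus an undirected graph (the saved dict does not record directedness, so adjust that if needed). The igraph import sits inside json_as_graph() so that the two I/O helpers work with the standard library alone.

```python
import gzip
import json


def save_graph_dict(data, path):
    """Write the dict from graph_as_json() as gzip-compressed JSON."""
    # Text mode ("wt") is required because json.dump() writes str, not bytes.
    with gzip.open(path, "wt", encoding="utf-8") as fp:
        json.dump(data, fp)


def load_graph_dict(path):
    """Read a gzip-compressed JSON file back into a plain dict."""
    with gzip.open(path, "rt", encoding="utf-8") as fp:
        return json.load(fp)


def json_as_graph(data):
    """Rebuild an igraph Graph from the dict layout of graph_as_json()."""
    from igraph import Graph  # imported here so the helpers above need no igraph

    names = data["vertices"]["name"]
    # JSON turns the (source, target) tuples into lists; convert them back.
    edges = [tuple(edge) for edge in data["edges"]]
    # Assumption: the graph is undirected, as in the summary ("UN--").
    graph = Graph(n=len(names), edges=edges, directed=False)
    for attr, values in data["vertices"].items():
        graph.vs[attr] = values
    return graph
```

Usage would then be save_graph_dict(graph_as_json(graph), "output.json.gz") to save and json_as_graph(load_graph_dict("output.json.gz")) to load.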