I think that Avro and protobufs are the current best options for large data assets like this.
On Fri, Sep 16, 2011 at 2:44 PM, Jake Mannix <jake.man...@gmail.com> wrote:

> Can I vote for whichever one isn't based on XML? :)
>
> I really can't imagine encoding a 10-billion node graph in XML. Or rather,
> I can, and I'm skeeeeeered.
>
> On Fri, Sep 16, 2011 at 1:02 PM, Grant Ingersoll <gsing...@apache.org> wrote:
>
> > I'm going to write a converter to dump out clusters and their points to a
> > graph structure so they can be displayed.
> >
> > Gephi (and others) supports a myriad of formats:
> > http://gephi.org/users/supported-graph-formats/
> >
> > * GEXF
> > * GDF
> > * GML
> > * GraphML
> > * Pajek NET
> > * GraphViz DOT
> > * CSV
> > * UCINET DL
> > * Tulip TPL
> > * Netdraw VNA
> > * Spreadsheet
> >
> > Anyone have strong opinions on a format? CSV is the simplest, obviously,
> > but doesn't support attributes like GraphML. I'm inclined to use GraphML or
> > CSV.
> >
> > -Grant
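
To make the CSV vs. GraphML trade-off concrete, here is a rough sketch (not the actual converter) that dumps the same toy clustering both ways using only the Python standard library. The `CLUSTERS` data, the function names, and the `kind` and `weight` attributes are all made up for illustration; the CSV output is a bare edge list with nowhere to put the weights, while the GraphML output carries them as declared attributes.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical toy data: cluster id -> list of (point id, weight) pairs.
# Each point belongs to exactly one cluster here, so no node is emitted twice.
CLUSTERS = {
    "c0": [("p1", 0.9), ("p2", 0.7)],
    "c1": [("p3", 0.8)],
}

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def to_csv_edges(clusters):
    """Return a plain edge list: one 'cluster,point' row per membership.

    The weight is dropped because a bare edge list has no attribute slot.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for cluster_id, points in clusters.items():
        for point_id, _weight in points:
            writer.writerow([cluster_id, point_id])
    return buf.getvalue()


def to_graphml(clusters):
    """Return a GraphML string with a 'kind' node attribute and a 'weight' edge attribute."""
    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    # GraphML declares each attribute once with a <key> element.
    ET.SubElement(root, "key", {"id": "kind", "for": "node",
                                "attr.name": "kind", "attr.type": "string"})
    ET.SubElement(root, "key", {"id": "weight", "for": "edge",
                                "attr.name": "weight", "attr.type": "double"})
    graph = ET.SubElement(root, "graph", id="G", edgedefault="undirected")
    for cluster_id, points in clusters.items():
        cluster_node = ET.SubElement(graph, "node", id=cluster_id)
        ET.SubElement(cluster_node, "data", key="kind").text = "cluster"
        for point_id, weight in points:
            point_node = ET.SubElement(graph, "node", id=point_id)
            ET.SubElement(point_node, "data", key="kind").text = "point"
            edge = ET.SubElement(graph, "edge", source=cluster_id, target=point_id)
            ET.SubElement(edge, "data", key="weight").text = str(weight)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_csv_edges(CLUSTERS))
    print(to_graphml(CLUSTERS))
```

Even on this tiny input the XML is several times larger than the CSV, which is the size concern raised above for very large graphs.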