Yeah, if you’re just worried about statistics, maybe you can do sampling (do single-pair paths from 100 random nodes and you get an idea of what percentage of nodes have what distribution of neighbors in a given distance).
Matei On Mar 26, 2014, at 5:55 PM, Ryan Compton <compton.r...@gmail.com> wrote: > Much thanks, I suspected this would be difficult. I was hoping to > generate some "4 degrees of separation"-like statistics. Looks like > I'll just have to work with a subset of my graph. > > On Wed, Mar 26, 2014 at 5:20 PM, Matei Zaharia <matei.zaha...@gmail.com> > wrote: >> All-pairs distances is tricky for a large graph because you need O(V^2) >> storage. Do you want to just quickly query the distance between two >> vertices? In that case you can do single-source shortest paths, which I >> believe exists in GraphX, or at least is very quick to implement on top of >> its Pregel API. If your graph is small enough that storing all-pairs is >> feasible, you can probably run this as an iterative algorithm: >> http://en.wikipedia.org/wiki/Floyd–Warshall_algorithm, though I haven’t >> tried it. It may be tough to do with GraphX. >> >> Matei >> >> On Mar 26, 2014, at 3:51 PM, Ryan Compton <compton.r...@gmail.com> wrote: >> >>> To clarify: I don't need the actual paths, just the distances. >>> >>> On Wed, Mar 26, 2014 at 3:04 PM, Ryan Compton <compton.r...@gmail.com> >>> wrote: >>>> No idea how feasible this is. Has anyone done it? >>