Re: D graph library -- update

John Colvin Thu, 11 Jul 2013 04:26:29 -0700

On Thursday, 11 July 2013 at 10:25:40 UTC, Joseph Rushton Wakeling wrote:

On 07/11/2013 02:24 AM, Joseph Rushton Wakeling wrote:

I know, and it's coming. :-) The main memory-related issues will probably not show up in a situation like this where all we're doing is storing the graph data, but in the case where algorithms are being performed on the data.

For comparison, I performed the same tests but with a 10_000 node graph. Here we see similar memory use, but igraph outperforms dgraph by a factor of nearly 10 even with the insert of nodes one at a time.

Profiling shows that the time difference is accounted for by the sorting algorithm used in the indexEdges() method:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 93.12    110.01   110.01    20000     0.01     0.01
_D6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda4VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ12schwartzSortMFAmZS6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv222__T11SortedRangeTAmS1986dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda4VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ11__lambda165Z11SortedRange561__T13quickSortImplS5066dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda4VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ12schwartzSortMFAmZS6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv222__T11SortedRangeTAmS1986dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda4VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ11__lambda165Z11SortedRange11__lambda168TS3std5range14__T3ZipTAmTAmZ3ZipZ13quickSortImplMFS3std5range14__T3ZipTAmTAmZ3ZipZv
  3.88    114.59     4.58    20000     0.00     0.00
_D6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda2VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ12schwartzSortMFAmZS6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv222__T11SortedRangeTAmS1986dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda2VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ11__lambda165Z11SortedRange561__T13quickSortImplS5066dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda2VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ12schwartzSortMFAmZS6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv222__T11SortedRangeTAmS1986dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda2VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ11__lambda165Z11SortedRange11__lambda168TS3std5range14__T3ZipTAmTAmZ3ZipZ13quickSortImplMFS3std5range14__T3ZipTAmTAmZ3ZipZv
  1.81    116.73     2.14 1124043790     0.00     0.00
_D3std5range14__T3ZipTAmTAmZ3Zip7opIndexMFNaNbNfmZS3std8typecons14__T5TupleTmTmZ5Tuple
  0.59    117.42     0.70 203164131     0.00     0.00
_D3std9algorithm43__T6swapAtTS3std5range14__T3ZipTAmTAmZ3ZipZ6swapAtFNaNbNfS3std5range14__T3ZipTAmTAmZ3ZipmmZv
  0.42    117.92     0.50    20000     0.00     0.01
_D6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv

By default, schwartzSort uses SwapStrategy.unstable, which means quicksort is used as the sorting mechanism. If we replace this with SwapStrategy.stable or SwapStrategy.semistable, TimSort is used instead, and this dramatically cuts the running time -- from almost 2 minutes to under 3 seconds (compared to 13 seconds for igraph with one-by-one edge addition), and under 2 if ldmd2 is used for compiling.
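
As a rough illustration of the change described above, here is a minimal sketch of passing SwapStrategy.stable to schwartzSort. The function name sortedEdgeIndex and the sample head array are hypothetical stand-ins, not taken from dgraph's indexEdges():

```d
import std.algorithm : schwartzSort, SwapStrategy;
import std.array : array;
import std.range : iota;

/// Hypothetical helper (not dgraph's code): returns the edge indices
/// 0 .. head.length ordered by the head node of each edge.
size_t[] sortedEdgeIndex(size_t[] head)
{
    size_t[] index = iota(head.length).array;

    // The default call would be
    //     index.schwartzSort!(i => head[i]);
    // which uses SwapStrategy.unstable. Naming the strategy explicitly
    // requires spelling out the default comparison "a < b" as well.
    index.schwartzSort!(i => head[i], "a < b", SwapStrategy.stable);
    return index;
}

void main()
{
    size_t[] head = [3, 1, 2, 1, 0, 3];
    // A stable sort keeps edges with equal head nodes in their original order.
    assert(sortedEdgeIndex(head) == [4, 1, 3, 2, 0, 5]);
}
```

Whether the same speed-up appears will depend on the data; the figures quoted above come from an index that is re-sorted after every single edge insertion, so the input is already almost sorted each time.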

This comes at a cost of increasing memory usage from about 3.7 MB (almost identical to igraph) to 5.6 MB.

Probably the best way to secure optimal performance without the memory hit is to use an insertion sort (or maybe a smoothsort?). I guess that Timsort would be best to use in the case of adding multiple edges in one go, unless no edges at all have been added before, in which case quicksort would probably be optimal; though quicksort would probably remain best if memory management is the priority.
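
To make the insertion-sort idea concrete, here is one possible sketch: when a single edge is added, its index is placed directly into the already-sorted index array using a binary search, so no full re-sort and no extra buffer is needed. This is a binary-insertion variant chosen for illustration; the name insertSorted and the head array are assumptions, not part of dgraph:

```d
import std.array : insertInPlace;

/// Hypothetical helper (not dgraph's code): `index` holds edge indices
/// already sorted by head[...]; insert the new edge `e` at its sorted position.
void insertSorted(ref size_t[] index, size_t[] head, size_t e)
{
    // Binary search for the first position whose key is greater than
    // head[e], so that equal keys stay in insertion order.
    size_t lo = 0, hi = index.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (head[index[mid]] <= head[e])
            lo = mid + 1;
        else
            hi = mid;
    }
    // Shifts the tail of the array by one element and writes e at lo.
    index.insertInPlace(lo, e);
}

void main()
{
    size_t[] newHeads = [3, 1, 2, 1, 0, 3];
    size_t[] head, index;
    foreach (h; newHeads)
    {
        head ~= h;                                // add one edge
        insertSorted(index, head, head.length - 1); // keep the index sorted
    }
    assert(index == [4, 1, 3, 2, 0, 5]);
}
```

Each insertion still moves O(n) elements in the worst case, so this only pays off for one-at-a-time additions; for bulk additions a single sort at the end is likely the better trade-off, as suggested above.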

So, the new D code is still competitive with igraph, but needs some smoothing around the edges (quite literally! :-).


This is very promising. Great work!
