[graph-tool] Re: All Pairs Shortest Path with limitations
Hi Aleks! I thank you for your long answer! > def wrap(node): > return shortest_distance(g=G, source=node, target=node, > weights=G.edge_properties['weight']) > > res = multiprocessing.Pool().map(wrap, random.choices(V, N)) > do_some_processing_on_result(res)... > > This is not optimal due to various overheads, however from python its the > best you can get. So you basically recommend full APSP manually? In terms of run-time, this is the worst you can get divided by the number of cores - or the worst nightmare of an hpc admin. Just out of curiosity: Even on my small test-graph it's ~ 1200 times slower than full shortest_distance(). I'll have a look at the other resources! Cheers, Stephan Von: Aleksandar Trifunovic Gesendet: Donnerstag, 25. November 2021 01:16 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations Hi Stephan, On Wed, Nov 24, 2021 at 5:58 PM Monecke, Stephan wrote: > > Hi Aleks! > > > The multiprocessing module is what I'm currently using as a workaround until > something other pops up - I just throw it into a mp.Pool. > > For the initial debugging / get-to-know phase, so to say. This is the first time I hear that someone uses multiprocessing for debugging. Usually its the other way around :) > > I don't know what exactly you've put into the multiprocessing module but I > don't see how this could be of help in my case. You requested multi-threaded shortest_distance and wanted to compute multiple paths and then get the mean dist? So my suggestion was to throw Pool on shortest_distance(g=G, source=V\B, target=V\B, weights=G.edge_properties['weight'], samples=N) which is trivial to do e.g. def wrap(node): return shortest_distance(g=G, source=node, target=node, weights=G.edge_properties['weight']) res = multiprocessing.Pool().map(wrap, random.choices(V, N)) do_some_processing_on_result(res)... This is not optimal due to various overheads, however from python its the best you can get. But if you know a bit of c++ and want to optimize some more, graph-tool already uses openmp so making a parallel loop over vertices should be easy. Tiago keeps the code very readable and everything is logical so finding a place to make your modification should take much time: https://git.skewed.de/count0/graph-tool/-/blob/master/src/graph/topology/graph_distance.cc > I'm running a metropolis algorithm that runs hours on a very small graph. > Each step depends on the previous. On the graphs I'm actually - or at least - > aiming for, it would probably run at least over a week. > > > I also have a large parameter space and would like to try something like > lipo* but same here: Just 100 iterations already approach 2 weeks on the tiny > graph and I can't multiprocess it. You can for sure parallelize parameter opt with dlib. Check the official docs (http://dlib.net/dlib/global_optimization/global_function_search_abstract.h.html#global_function_search) however its already done in other frameworks: listed by lipo readme: https://optuna.readthedocs.io/en/stable/ I've tested this one (~30 parameters): https://github.com/facebookresearch/nevergrad Good luck, Aleks > Cheers, > Stephan > > *https://pypi.org/project/lipo/ > > > Von: Aleksandar Trifunovic > Gesendet: Mittwoch, 24. November 2021 16:44 > An: Main discussion list for the graph-tool project > Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations > > Hi Stephan, > > I've used graph tool to compute a lot of shortest paths and used a > dumb solution of just throwing it into multiprocessing module from > python (with some lru_cache tricks). If you have enough memory and you > don't call the main procedure often (seems to be the case?) then the > overhead of multiprocessing becomes insignificant. A benefit is that > this is pretty easy to code and scales perfectly to more cores (until > you run out of memory). > > Best, > Aleks > > On Wed, Nov 24, 2021 at 3:32 PM Monecke, Stephan > wrote: > > > > Hi Tiago and the list, > > > > > > > > my algorithm is still _infeasibly_ slow and > 90 % of the run-time results > > from shortest_path(). > > > > > > Do you have any idea on how this could be sped up? > > > > > > I have the graph G, the set of all nodes V and a sub-set B ⊂ V. > > > > > > I want to calculate or approximate the mean shortest distance between all > > pairs V\B x V\B but on the complete graph - or in other words, calculate > > the mean of all pairs shortest distance ( mean(shortest_distance()) ) but > > exclude n∈B from being source
[graph-tool] Re: All Pairs Shortest Path with limitations
Hi Aleks! The multiprocessing module is what I'm currently using as a workaround until something other pops up - I just throw it into a mp.Pool. For the initial debugging / get-to-know phase, so to say. I don't know what exactly you've put into the multiprocessing module but I don't see how this could be of help in my case. I'm running a metropolis algorithm that runs hours on a very small graph. Each step depends on the previous. On the graphs I'm actually - or at least - aiming for, it would probably run at least over a week. I also have a large parameter space and would like to try something like lipo* but same here: Just 100 iterations already approach 2 weeks on the tiny graph and I can't multiprocess it. Cheers, Stephan *https://pypi.org/project/lipo/ Von: Aleksandar Trifunovic Gesendet: Mittwoch, 24. November 2021 16:44 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations Hi Stephan, I've used graph tool to compute a lot of shortest paths and used a dumb solution of just throwing it into multiprocessing module from python (with some lru_cache tricks). If you have enough memory and you don't call the main procedure often (seems to be the case?) then the overhead of multiprocessing becomes insignificant. A benefit is that this is pretty easy to code and scales perfectly to more cores (until you run out of memory). Best, Aleks On Wed, Nov 24, 2021 at 3:32 PM Monecke, Stephan wrote: > > Hi Tiago and the list, > > > > my algorithm is still _infeasibly_ slow and > 90 % of the run-time results > from shortest_path(). > > > Do you have any idea on how this could be sped up? > > > I have the graph G, the set of all nodes V and a sub-set B ⊂ V. > > > I want to calculate or approximate the mean shortest distance between all > pairs V\B x V\B but on the complete graph - or in other words, calculate the > mean of all pairs shortest distance ( mean(shortest_distance()) ) but exclude > n∈B from being source or target. > > Sub-sampling would be okay as well. > > > I currently don't see a way to do this fast with graph-tool besides writing a > custom function but that probably exceeds my capacities. Am I missing > something? Do you have any ideas? > > > Ideal would be something like a multi-threaded shortest_distance(g=G, > source=V\B, target=V\B, weights=G.edge_properties['weight'], samples=N) with > N pairs random sub-sampling and source / target being lists. > > > > Thanks a ton! > > Stephan > > > Von: Monecke, Stephan > Gesendet: Donnerstag, 18. November 2021 14:59 > An: Main discussion list for the graph-tool project > Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations > > This works like a charm! > > With it, I'm down to 228 ms ± 1.31 ms from 3.6 s ( 82.9 ms ± 2.45 ms for > shortest_distance(G) ). > > It's still very very expensive as it's 2.75 times APSP but approaching > reasonability. The costly remaining part is the manual B x B.* > > I tried to speed it up by the approximation to limit G to B since I know that > B is connected but graph-tool segfaults right away (I wrote another ticket > for that). :/ > > If anyone else has any ideas on how to speed this up, I'd be very grateful! > > > Cheers, > Stephan > > - - - o - - - > > *Since the graph is directed, we have the following: Say V is the set of all > vertices and B a subset of V I want to exclude as sources / targets from > APSP. Then the set of points is > (V\B) x (V\B) = [ (V x V) \ [ (V x B) + (B x V) ] ] U (B x B). > > Or in graph-tool terms: > - (V x V): shortest_distance(G) > - (B x V): shortest_distance(G, b, _) for b in B > - (V x B): shortest_distance(GraphView(G, reversed=True), b, _) for b in B > - (B x B): shortest_distance(G, u, v) for u,v in B x B > > > > > Von: Stephan Monecke > Gesendet: Mittwoch, 17. November 2021 20:24 > An: Main discussion list for the graph-tool project > Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations > > 🥳 I didn't know that existed! > > Thanks a ton, I'm really looking forward to test it out tomorrow! > > Cheers, > Stephan > -- > Sent from my mobile device. Please excuse my brevity. > > Am 17. November 2021 19:27:37 MEZ schrieb Tiago de Paula Peixoto > : > >Am 17.11.21 um 19:22 schrieb Monecke, Stephan: > >> The main reason is that shortest_distance() has a "single source, all > >> targets" mode but no "single target, all sources" mode (the reverse) - as > >> soon as "so
[graph-tool] Re: All Pairs Shortest Path with limitations
Hi Tiago and the list, my algorithm is still _infeasibly_ slow and > 90 % of the run-time results from shortest_path(). Do you have any idea on how this could be sped up? I have the graph G, the set of all nodes V and a sub-set B ⊂ V. I want to calculate or approximate the mean shortest distance between all pairs V\B x V\B but on the complete graph - or in other words, calculate the mean of all pairs shortest distance ( mean(shortest_distance()) ) but exclude n∈B from being source or target. Sub-sampling would be okay as well. I currently don't see a way to do this fast with graph-tool besides writing a custom function but that probably exceeds my capacities. Am I missing something? Do you have any ideas? Ideal would be something like a multi-threaded shortest_distance(g=G, source=V\B, target=V\B, weights=G.edge_properties['weight'], samples=N) with N pairs random sub-sampling and source / target being lists. Thanks a ton! Stephan ________ Von: Monecke, Stephan Gesendet: Donnerstag, 18. November 2021 14:59 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations This works like a charm! With it, I'm down to 228 ms ± 1.31 ms from 3.6 s ( 82.9 ms ± 2.45 ms for shortest_distance(G) ). It's still very very expensive as it's 2.75 times APSP but approaching reasonability. The costly remaining part is the manual B x B.* I tried to speed it up by the approximation to limit G to B since I know that B is connected but graph-tool segfaults right away (I wrote another ticket for that). :/ If anyone else has any ideas on how to speed this up, I'd be very grateful! Cheers, Stephan - - - o - - - *Since the graph is directed, we have the following: Say V is the set of all vertices and B a subset of V I want to exclude as sources / targets from APSP. Then the set of points is (V\B) x (V\B) = [ (V x V) \ [ (V x B) + (B x V) ] ] U (B x B). Or in graph-tool terms: - (V x V): shortest_distance(G) - (B x V): shortest_distance(G, b, _) for b in B - (V x B): shortest_distance(GraphView(G, reversed=True), b, _) for b in B - (B x B): shortest_distance(G, u, v) for u,v in B x B Von: Stephan Monecke Gesendet: Mittwoch, 17. November 2021 20:24 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations 🥳 I didn't know that existed! Thanks a ton, I'm really looking forward to test it out tomorrow! Cheers, Stephan -- Sent from my mobile device. Please excuse my brevity. Am 17. November 2021 19:27:37 MEZ schrieb Tiago de Paula Peixoto : >Am 17.11.21 um 19:22 schrieb Monecke, Stephan: >> The main reason is that shortest_distance() has a "single source, all >> targets" mode but no "single target, all sources" mode (the reverse) - as >> soon as "source" is empty, it defaults to APSP, neglecting "target". >> > >Of course it does. Just reverse the graph with: > > GraphView(g, reversed=True) > > ___ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-le...@skewed.de ___ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-le...@skewed.de ___ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-le...@skewed.de
[graph-tool] Re: All Pairs Shortest Path with limitations
This works like a charm! With it, I'm down to 228 ms ± 1.31 ms from 3.6 s ( 82.9 ms ± 2.45 ms for shortest_distance(G) ). It's still very very expensive as it's 2.75 times APSP but approaching reasonability. The costly remaining part is the manual B x B.* I tried to speed it up by the approximation to limit G to B since I know that B is connected but graph-tool segfaults right away (I wrote another ticket for that). :/ If anyone else has any ideas on how to speed this up, I'd be very grateful! Cheers, Stephan - - - o - - - *Since the graph is directed, we have the following: Say V is the set of all vertices and B a subset of V I want to exclude as sources / targets from APSP. Then the set of points is (V\B) x (V\B) = [ (V x V) \ [ (V x B) + (B x V) ] ] U (B x B). Or in graph-tool terms: - (V x V): shortest_distance(G) - (B x V): shortest_distance(G, b, _) for b in B - (V x B): shortest_distance(GraphView(G, reversed=True), b, _) for b in B - (B x B): shortest_distance(G, u, v) for u,v in B x B Von: Stephan Monecke Gesendet: Mittwoch, 17. November 2021 20:24 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations 🥳 I didn't know that existed! Thanks a ton, I'm really looking forward to test it out tomorrow! Cheers, Stephan -- Sent from my mobile device. Please excuse my brevity. Am 17. November 2021 19:27:37 MEZ schrieb Tiago de Paula Peixoto : >Am 17.11.21 um 19:22 schrieb Monecke, Stephan: >> The main reason is that shortest_distance() has a "single source, all >> targets" mode but no "single target, all sources" mode (the reverse) - as >> soon as "source" is empty, it defaults to APSP, neglecting "target". >> > >Of course it does. Just reverse the graph with: > > GraphView(g, reversed=True) > > ___ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-le...@skewed.de ___ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-le...@skewed.de
[graph-tool] Re: All Pairs Shortest Path with limitations
Hi Aleks! I thank you for your answer! If already tried that but have not been successful speed-wise. A full APSP of a very small test graph already takes around 200 ms. The fastest I could get as described below runs in about 3.6 s. :/ The main reason is that shortest_distance() has a "single source, all targets" mode but no "single target, all sources" mode (the reverse) - as soon as "source" is empty, it defaults to APSP, neglecting "target". Hence I have to do this part manually using shortest_distance(G, u, v).* My bottleneck is the APSP calculation. Cheers, Stephan *Since the graph is directed, we have the following: Say V is the set of all vertices and B a subset of V I want to exclude as sources / targets from APSP. Then the set of points is [ (V x V) \ [ (V x B) + (B x V) ] ] + (B x B). Or in graph-tool terms: - (V x V): shortest_distance(G) - (V x B): shortest_distance(G, v, b) for b in B for v in V (horrible run-time) - (B x V): shortest_distance(G, b, _) for b in B - (B x B): shortest_distance(G, u, v) for u,v in B x B Von: Aleksandar Trifunovic Gesendet: Mittwoch, 17. November 2021 17:54 An: Main discussion list for the graph-tool project Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations Hi Stephan, If the number of start-end pairs is not high you could compute all shortest paths and then remove the ones you are not interested in. Best, Aleks On Wed, Nov 17, 2021 at 3:55 PM Monecke, Stephan wrote: > > Hi together, > > > > I have a question on how you would tackle the following: > > > I have a graph and want to calculate all possible shortest paths but exclude > a few nodes as starting / stopping positions. > > > I can not filter out those vertices, since paths should be allowed to pass > them - just not start or terminate there. > > > Calculating all possible shortest paths manually is computationally > infeasible. > > > > Thanks a lot! > > > Stephan > > ___ > graph-tool mailing list -- graph-tool@skewed.de > To unsubscribe send an email to graph-tool-le...@skewed.de ___ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-le...@skewed.de ___ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-le...@skewed.de
[graph-tool] All Pairs Shortest Path with limitations
Hi together, I have a question on how you would tackle the following: I have a graph and want to calculate all possible shortest paths but exclude a few nodes as starting / stopping positions. I can not filter out those vertices, since paths should be allowed to pass them - just not start or terminate there. Calculating all possible shortest paths manually is computationally infeasible. Thanks a lot! Stephan ___ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-le...@skewed.de
[graph-tool] Vertex pointer not updated on graph filtering
Hi together! I just noticed that vertices that had been saved within an external list are not being updated when filtering the graph: >>> G.set_edge_filter( G.edge_properties[ 'e_isbus'], inverted=True) >>> G.set_vertex_filter(G.vertex_properties['v_isbus'], inverted=True) >>> print( [ int(n) for n in node.all_neighbors() ] ) [101, 22, 265, 496, 518, 22, 265, 496, 101, 518] >>> print( [ int(n) for n in G.vertex( int(node) ).all_neighbors() ] ) [101, 22, 265, 22, 265, 101] Did I miss something or is this a bug? Thanks a lot for any ideas! Stephan ___ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-le...@skewed.de
[graph-tool] Re: Core dump
I bisected manually until I found the problem: shortest_path() segfaults when the vertex is not present. - - MWE - - import graph_tool.all as gt gt.shortest_path(gt.Graph(), 1, 2) Segmentation fault (core dumped) - - o - - I wrote a ticket for that. Thanks for your help! Von: Tiago de Paula Peixoto Gesendet: Donnerstag, 11. November 2021 11:05 An: graph-tool@skewed.de Betreff: [graph-tool] Re: Core dump Am 11.11.21 um 10:47 schrieb Monecke, Stephan: > Any ideas on how I can debug this? > Yes: try to isolate the problem by constructing a minimal, self-contained program that reproduces the crash. -- Tiago de Paula Peixoto ___ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-le...@skewed.de ___ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-le...@skewed.de
[graph-tool] Core dump
Hi together, I have an algorithm whose results incl. the graph I save to disk at the end. When I load it again, on some vertices, shortest_distance() crashes python: ``` munmap_chunk(): invalid pointer Aborted (core dumped) ``` Any ideas on how I can debug this? All the property maps are internal and my custom vertex lists are converted to int before pickling and restored using gt.Graph().vertex() afterwards. Thank a lot for ideas! Cheers, Stephan ___ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-le...@skewed.de
[graph-tool] Re: Speed up graph_tool.topology.shortest_distance
On my test graph with 4201 nodes and 9683 edges I already tried a quick test with both: I need ~ 70 random (source, target) node pairs to approximate the real mean of all pairs shortest path with an error of about 1 %. This is about 40 % of the run-time of all pairs shortest path. Using single source all targets shortest path I need to sample at least 6 random sources to approximate the real mean of all pairs shortest path with an error of about 1 %. This is about 45-50 % of the run-time of all pairs shortest path. So, although single source all targets shortest path is waay more efficient in it's computation than shortest_path manually it apparently is not a good candidate for subsampling and effectively doing worse. Von: Tiago de Paula Peixoto Gesendet: Dienstag, 26. Oktober 2021 16:52 An: graph-tool@skewed.de Betreff: [graph-tool] Re: Speed up graph_tool.topology.shortest_distance Am 26.10.21 um 16:23 schrieb Tiago de Paula Peixoto: > It is possible to pass a single source but multiple targets. It is possible to pass also a single source and no target, in which case the function computes the distances to all other vertices. By sub-sampling the source vertices you can avoid the "overhead" you were referring to. -- Tiago de Paula Peixoto ___ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-le...@skewed.de ___ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-le...@skewed.de
[graph-tool] Speed up graph_tool.topology.shortest_distance
Hi! I use graph_tool.topology.shortest_distance for an all pairs shortest path calculation, what is the main run-time footprint of my algorithm and way to large. How would you speed it up / tackle this? I tried to sub-sample with manual source, target pairs but that's terribly inefficient since it does not use the internal bookkeeping. Nice would be graph_tool.topology.shortest_distance(G, U, V), where U and V are lists of same length with sources / targets but that's not implemented. Thank a lot in advance! Stephan ___ graph-tool mailing list -- graph-tool@skewed.de To unsubscribe send an email to graph-tool-le...@skewed.de