[graph-tool] Re: All Pairs Shortest Path with limitations

2021-11-26 Thread Monecke, Stephan
Hi Aleks!


I thank you for your long answer!

> def wrap(node):
>    return shortest_distance(g=G,  source=node, target=node,
> weights=G.edge_properties['weight'])
>
> res = multiprocessing.Pool().map(wrap, random.choices(V, N))
> do_some_processing_on_result(res)...
>
> This is not optimal due to various overheads, however from python its the 
> best you can get.

So you basically recommend full APSP manually?

In terms of run-time, this is the worst you can get divided by the number of 
cores - or the worst nightmare of an hpc admin.

Just out of curiosity: Even on my small test-graph it's ~ 1200 times slower 
than full shortest_distance().


I'll have a look at the other resources!


Cheers,
Stephan


Von: Aleksandar Trifunovic 
Gesendet: Donnerstag, 25. November 2021 01:16
An: Main discussion list for the graph-tool project
Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
    
Hi Stephan,

On Wed, Nov 24, 2021 at 5:58 PM Monecke, Stephan
 wrote:
>
> Hi Aleks!
>
>
> The multiprocessing module is what I'm currently using as a workaround until 
> something other pops up - I just throw it into a mp.Pool.
>
> For the initial debugging / get-to-know phase, so to say.

This is the first time I hear that someone uses multiprocessing for
debugging. Usually its the other way around :)

>
> I don't know what exactly you've put into the multiprocessing module but I 
> don't see how this could be of help in my case.

You requested multi-threaded shortest_distance and wanted to compute
multiple paths and then get the mean dist?
So my suggestion was to throw Pool on shortest_distance(g=G,
source=V\B, target=V\B, weights=G.edge_properties['weight'],
samples=N) which is trivial to do e.g.

def wrap(node):
    return shortest_distance(g=G,  source=node, target=node,
weights=G.edge_properties['weight'])

res = multiprocessing.Pool().map(wrap, random.choices(V, N))
do_some_processing_on_result(res)...

This is not optimal due to various overheads, however from python its
the best you can get.
But if you know a bit of c++ and want to optimize some more,
graph-tool already uses openmp so making a parallel loop over vertices
should be easy. Tiago keeps the code very readable and everything is
logical so finding a place to make your modification should take much
time:  
https://git.skewed.de/count0/graph-tool/-/blob/master/src/graph/topology/graph_distance.cc

> I'm running a metropolis algorithm that runs hours on a very small graph. 
> Each step depends on the previous. On the graphs I'm actually - or at least - 
> aiming for, it would probably run at least over a week.
>
>
> I also have a large parameter space and would like to try something like 
> lipo* but same here: Just 100 iterations already approach 2 weeks on the tiny 
> graph and I can't multiprocess it.

You can for sure parallelize parameter opt with dlib. Check the
official docs 
(http://dlib.net/dlib/global_optimization/global_function_search_abstract.h.html#global_function_search)
however its already done in other frameworks:
listed by lipo readme: https://optuna.readthedocs.io/en/stable/
I've tested this one (~30 parameters):
https://github.com/facebookresearch/nevergrad

Good luck,
Aleks

> Cheers,
> Stephan
>
> *https://pypi.org/project/lipo/
>
> 
> Von: Aleksandar Trifunovic 
> Gesendet: Mittwoch, 24. November 2021 16:44
> An: Main discussion list for the graph-tool project
> Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
>
> Hi Stephan,
>
> I've used graph tool to compute a lot of shortest paths and used a
> dumb solution of just throwing it into multiprocessing module from
> python (with some lru_cache tricks). If you have enough memory and you
> don't call the main procedure often (seems to be the case?) then the
> overhead of multiprocessing becomes insignificant. A benefit is that
> this is pretty easy to code and scales perfectly to more cores (until
> you run out of memory).
>
> Best,
> Aleks
>
> On Wed, Nov 24, 2021 at 3:32 PM Monecke, Stephan
>  wrote:
> >
> > Hi Tiago and the list,
> >
> >
> >
> > my algorithm is still _infeasibly_ slow and > 90 % of the run-time results 
> > from shortest_path().
> >
> >
> > Do you have any idea on how this could be sped up?
> >
> >
> > I have the graph G, the set of all nodes V and a sub-set B ⊂ V.
> >
> >
> > I want to calculate or approximate the mean shortest distance between all 
> > pairs V\B x V\B but on the complete graph - or in other words, calculate 
> > the mean of all pairs shortest distance ( mean(shortest_distance()) ) but 
> > exclude n∈B from being source 

[graph-tool] Re: All Pairs Shortest Path with limitations

2021-11-24 Thread Monecke, Stephan
Hi Aleks!


The multiprocessing module is what I'm currently using as a workaround until 
something other pops up - I just throw it into a mp.Pool.

For the initial debugging / get-to-know phase, so to say.


I don't know what exactly you've put into the multiprocessing module but I 
don't see how this could be of help in my case.


I'm running a metropolis algorithm that runs hours on a very small graph. Each 
step depends on the previous. On the graphs I'm actually - or at least - aiming 
for, it would probably run at least over a week.

I also have a large parameter space and would like to try something like lipo* 
but same here: Just 100 iterations already approach 2 weeks on the tiny graph 
and I can't multiprocess it.

Cheers,
Stephan

*https://pypi.org/project/lipo/


Von: Aleksandar Trifunovic 
Gesendet: Mittwoch, 24. November 2021 16:44
An: Main discussion list for the graph-tool project
Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations

Hi Stephan,

I've used graph tool to compute a lot of shortest paths and used a
dumb solution of just throwing it into multiprocessing module from
python (with some lru_cache tricks). If you have enough memory and you
don't call the main procedure often (seems to be the case?) then the
overhead of multiprocessing becomes insignificant. A benefit is that
this is pretty easy to code and scales perfectly to more cores (until
you run out of memory).

Best,
Aleks

On Wed, Nov 24, 2021 at 3:32 PM Monecke, Stephan
 wrote:
>
> Hi Tiago and the list,
>
>
>
> my algorithm is still _infeasibly_ slow and > 90 % of the run-time results 
> from shortest_path().
>
>
> Do you have any idea on how this could be sped up?
>
>
> I have the graph G, the set of all nodes V and a sub-set B ⊂ V.
>
>
> I want to calculate or approximate the mean shortest distance between all 
> pairs V\B x V\B but on the complete graph - or in other words, calculate the 
> mean of all pairs shortest distance ( mean(shortest_distance()) ) but exclude 
> n∈B from being source or target.
>
> Sub-sampling would be okay as well.
>
>
> I currently don't see a way to do this fast with graph-tool besides writing a 
> custom function but that probably exceeds my capacities. Am I missing 
> something? Do you have any ideas?
>
>
> Ideal would be something like a multi-threaded shortest_distance(g=G, 
> source=V\B, target=V\B, weights=G.edge_properties['weight'], samples=N) with 
> N pairs random sub-sampling and source / target being lists.
>
>
>
> Thanks a ton!
>
> Stephan
>
> 
> Von: Monecke, Stephan 
> Gesendet: Donnerstag, 18. November 2021 14:59
> An: Main discussion list for the graph-tool project
> Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
>
> This works like a charm!
>
> With it, I'm down to 228 ms ± 1.31 ms from 3.6 s ( 82.9 ms ± 2.45 ms for 
> shortest_distance(G) ).
>
> It's still very very expensive as it's 2.75 times APSP but approaching 
> reasonability. The costly remaining part is the manual B x B.*
>
> I tried to speed it up by the approximation to limit G to B since I know that 
> B is connected but graph-tool segfaults right away (I wrote another ticket 
> for that). :/
>
> If anyone else has any ideas on how to speed this up, I'd be very grateful!
>
>
> Cheers,
> Stephan
>
> - - - o - - -
>
> *Since  the graph is directed, we have the following: Say V is the set of all 
> vertices and B a subset of V I want to exclude as sources / targets from 
> APSP. Then the set of points is
> (V\B) x (V\B) = [ (V x V) \ [ (V x B) + (B x V) ] ] U (B x B).
>
> Or in graph-tool terms:
>  - (V x V): shortest_distance(G)
>  - (B x V): shortest_distance(G, b, _) for b in B
>  - (V x B): shortest_distance(GraphView(G, reversed=True), b, _) for b in B
>  - (B x B): shortest_distance(G, u, v) for u,v in B x B
>
>
>
>
> Von: Stephan Monecke 
> Gesendet: Mittwoch, 17. November 2021 20:24
> An: Main discussion list for the graph-tool project
> Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
>
> 🥳 I didn't know that existed!
>
> Thanks a ton, I'm really looking forward to test it out tomorrow!
>
> Cheers,
> Stephan
> --
> Sent from my mobile device. Please excuse my brevity.
>
> Am 17. November 2021 19:27:37 MEZ schrieb Tiago de Paula Peixoto 
> :
> >Am 17.11.21 um 19:22 schrieb Monecke, Stephan:
> >> The main reason is that shortest_distance() has a "single source, all 
> >> targets" mode but no "single target, all sources" mode (the reverse) - as 
> >> soon as "so

[graph-tool] Re: All Pairs Shortest Path with limitations

2021-11-24 Thread Monecke, Stephan
Hi Tiago and the list,



my algorithm is still _infeasibly_ slow and > 90 % of the run-time results from 
shortest_path().


Do you have any idea on how this could be sped up?


I have the graph G, the set of all nodes V and a sub-set B ⊂ V.


I want to calculate or approximate the mean shortest distance between all pairs 
V\B x V\B but on the complete graph - or in other words, calculate the mean of 
all pairs shortest distance ( mean(shortest_distance()) ) but exclude n∈B from 
being source or target.

Sub-sampling would be okay as well.


I currently don't see a way to do this fast with graph-tool besides writing a 
custom function but that probably exceeds my capacities. Am I missing 
something? Do you have any ideas?


Ideal would be something like a multi-threaded shortest_distance(g=G, 
source=V\B, target=V\B, weights=G.edge_properties['weight'], samples=N) with N 
pairs random sub-sampling and source / target being lists.


Thanks a ton!

Stephan

________
Von: Monecke, Stephan 
Gesendet: Donnerstag, 18. November 2021 14:59
An: Main discussion list for the graph-tool project
Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations

This works like a charm!

With it, I'm down to 228 ms ± 1.31 ms from 3.6 s ( 82.9 ms ± 2.45 ms for 
shortest_distance(G) ).

It's still very very expensive as it's 2.75 times APSP but approaching 
reasonability. The costly remaining part is the manual B x B.*

I tried to speed it up by the approximation to limit G to B since I know that B 
is connected but graph-tool segfaults right away (I wrote another ticket for 
that). :/

If anyone else has any ideas on how to speed this up, I'd be very grateful!


Cheers,
Stephan

- - - o - - -

*Since  the graph is directed, we have the following: Say V is the set of all 
vertices and B a subset of V I want to exclude as sources / targets from APSP. 
Then the set of points is
(V\B) x (V\B) = [ (V x V) \ [ (V x B) + (B x V) ] ] U (B x B).

Or in graph-tool terms:
 - (V x V): shortest_distance(G)
 - (B x V): shortest_distance(G, b, _) for b in B
 - (V x B): shortest_distance(GraphView(G, reversed=True), b, _) for b in B
 - (B x B): shortest_distance(G, u, v) for u,v in B x B




Von: Stephan Monecke 
Gesendet: Mittwoch, 17. November 2021 20:24
An: Main discussion list for the graph-tool project
Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations

🥳 I didn't know that existed!

Thanks a ton, I'm really looking forward to test it out tomorrow!

Cheers,
Stephan
--
Sent from my mobile device. Please excuse my brevity.

Am 17. November 2021 19:27:37 MEZ schrieb Tiago de Paula Peixoto 
:
>Am 17.11.21 um 19:22 schrieb Monecke, Stephan:
>> The main reason is that shortest_distance() has a "single source, all 
>> targets" mode but no "single target, all sources" mode (the reverse) - as 
>> soon as "source" is empty, it defaults to APSP, neglecting "target".
>>
>
>Of course it does. Just reverse the graph with:
>
> GraphView(g, reversed=True)
>
>
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de

___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de


[graph-tool] Re: All Pairs Shortest Path with limitations

2021-11-18 Thread Monecke, Stephan
This works like a charm!

With it, I'm down to 228 ms ± 1.31 ms from 3.6 s ( 82.9 ms ± 2.45 ms for 
shortest_distance(G) ).

It's still very very expensive as it's 2.75 times APSP but approaching 
reasonability. The costly remaining part is the manual B x B.*

I tried to speed it up by the approximation to limit G to B since I know that B 
is connected but graph-tool segfaults right away (I wrote another ticket for 
that). :/

If anyone else has any ideas on how to speed this up, I'd be very grateful!


Cheers,
Stephan

- - - o - - -

*Since  the graph is directed, we have the following: Say V is the set of all 
vertices and B a subset of V I want to exclude as sources / targets from APSP. 
Then the set of points is 
(V\B) x (V\B) = [ (V x V) \ [ (V x B) + (B x V) ] ] U (B x B).

Or in graph-tool terms: 
 - (V x V): shortest_distance(G)
 - (B x V): shortest_distance(G, b, _) for b in B
 - (V x B): shortest_distance(GraphView(G, reversed=True), b, _) for b in B 
 - (B x B): shortest_distance(G, u, v) for u,v in B x B




Von: Stephan Monecke 
Gesendet: Mittwoch, 17. November 2021 20:24
An: Main discussion list for the graph-tool project
Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
    
🥳 I didn't know that existed!

Thanks a ton, I'm really looking forward to test it out tomorrow!

Cheers,
Stephan
-- 
Sent from my mobile device. Please excuse my brevity.

Am 17. November 2021 19:27:37 MEZ schrieb Tiago de Paula Peixoto 
:
>Am 17.11.21 um 19:22 schrieb Monecke, Stephan:
>> The main reason is that shortest_distance() has a "single source, all 
>> targets" mode but no "single target, all sources" mode (the reverse) - as 
>> soon as "source" is empty, it defaults to APSP, neglecting "target".
>> 
>
>Of course it does. Just reverse the graph with:
>
> GraphView(g, reversed=True)
>
>
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de
 
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de


[graph-tool] Re: All Pairs Shortest Path with limitations

2021-11-17 Thread Monecke, Stephan
Hi Aleks!



I thank you for your answer! If already tried that but have not been successful 
speed-wise.


A full APSP of a very small test graph already takes around 200 ms.

The fastest I could get as described below runs in about 3.6 s. :/


The main reason is that shortest_distance() has a "single source, all targets" 
mode but no "single target, all sources" mode (the reverse) - as soon as 
"source" is empty, it defaults to APSP, neglecting "target".

Hence I have to do this part manually using shortest_distance(G, u, v).*


My bottleneck is the APSP calculation.



Cheers,


Stephan



*Since the graph is directed, we have the following: Say V is the set of all 
vertices and B a subset of V I want to exclude as sources / targets from APSP. 
Then the set of points is [ (V x V) \ [ (V x B) + (B x V) ] ] + (B x B).

Or in graph-tool terms:
 - (V x V): shortest_distance(G)

 - (V x B): shortest_distance(G, v, b) for b in B for v in V (horrible run-time)

 - (B x V): shortest_distance(G, b, _) for b in B

 - (B x B): shortest_distance(G, u, v) for u,v in B x B


Von: Aleksandar Trifunovic 
Gesendet: Mittwoch, 17. November 2021 17:54
An: Main discussion list for the graph-tool project
Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations

Hi Stephan,

If the number of start-end pairs is not high you could compute all
shortest paths and then remove the ones you are not interested in.

Best,
Aleks


On Wed, Nov 17, 2021 at 3:55 PM Monecke, Stephan
 wrote:
>
> Hi together,
>
>
>
> I have a question on how you would tackle the following:
>
>
> I have a graph and want to calculate all possible shortest paths but exclude 
> a few nodes as starting / stopping positions.
>
>
> I can not filter out those vertices, since paths should be allowed to pass 
> them -  just not start or terminate there.
>
>
> Calculating all possible shortest paths manually is computationally 
> infeasible.
>
>
>
> Thanks a lot!
>
>
> Stephan
>
> ___
> graph-tool mailing list -- graph-tool@skewed.de
> To unsubscribe send an email to graph-tool-le...@skewed.de
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de


[graph-tool] All Pairs Shortest Path with limitations

2021-11-17 Thread Monecke, Stephan
Hi together,



I have a question on how you would tackle the following:


I have a graph and want to calculate all possible shortest paths but exclude a 
few nodes as starting / stopping positions.


I can not filter out those vertices, since paths should be allowed to pass them 
-  just not start or terminate there.


Calculating all possible shortest paths manually is computationally infeasible.



Thanks a lot!


Stephan
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de


[graph-tool] Vertex pointer not updated on graph filtering

2021-11-15 Thread Monecke, Stephan
Hi together!



I just noticed that vertices that had been saved within an external list are 
not being updated when filtering the graph:


>>> G.set_edge_filter(  G.edge_properties[  'e_isbus'], inverted=True)
>>> G.set_vertex_filter(G.vertex_properties['v_isbus'], inverted=True)

>>> print( [ int(n) for n in node.all_neighbors() ] )
[101, 22, 265, 496, 518, 22, 265, 496, 101, 518]

>>> print( [ int(n) for n in G.vertex( int(node) ).all_neighbors() ] )
[101, 22, 265, 22, 265, 101]


Did I miss something or is this a bug?



Thanks a lot for any ideas!



Stephan
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de


[graph-tool] Re: Core dump

2021-11-11 Thread Monecke, Stephan
I bisected manually until I found the problem: shortest_path() segfaults when 
the vertex is not present.


- - MWE - -


import graph_tool.all as gt
gt.shortest_path(gt.Graph(), 1, 2)
Segmentation fault (core dumped)

- - o - -


I wrote a ticket for that.


Thanks for your help!


Von: Tiago de Paula Peixoto 
Gesendet: Donnerstag, 11. November 2021 11:05
An: graph-tool@skewed.de
Betreff: [graph-tool] Re: Core dump

Am 11.11.21 um 10:47 schrieb Monecke, Stephan:
> Any ideas on how I can debug this?
>

Yes: try to isolate the problem by constructing a minimal,
self-contained program that reproduces the crash.

--
Tiago de Paula Peixoto 
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de


[graph-tool] Core dump

2021-11-11 Thread Monecke, Stephan
Hi together,


I have an algorithm whose results incl. the graph I save to disk at the end.
When I load it again, on some vertices, shortest_distance() crashes python:

```
munmap_chunk(): invalid pointer
Aborted (core dumped)
```


Any ideas on how I can debug this?


All the property maps are internal and my custom vertex lists are converted to 
int before pickling and restored using gt.Graph().vertex() afterwards.



Thank a lot for ideas!



Cheers,


Stephan
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de


[graph-tool] Re: Speed up graph_tool.topology.shortest_distance

2021-10-27 Thread Monecke, Stephan
On my test graph with 4201 nodes and 9683 edges I already tried a quick test 
with both:


I need ~ 70 random (source, target) node pairs to approximate the real mean of 
all pairs shortest path with an error of about 1 %.

This is about 40 % of the run-time of all pairs shortest path.


Using single source all targets shortest path I need to sample at least 6 
random sources to approximate the real mean of all pairs shortest path with an 
error of about 1 %.

This is about 45-50 % of the run-time of all pairs shortest path.


So, although single source all targets shortest path is waay more efficient in 
it's computation than shortest_path manually it apparently is not a good 
candidate for subsampling and effectively doing worse.


Von: Tiago de Paula Peixoto 
Gesendet: Dienstag, 26. Oktober 2021 16:52
An: graph-tool@skewed.de
Betreff: [graph-tool] Re: Speed up graph_tool.topology.shortest_distance

Am 26.10.21 um 16:23 schrieb Tiago de Paula Peixoto:
> It is possible to pass a single source but multiple targets.

It is possible to pass also a single source and no target, in which case
the function computes the distances to all other vertices.

By sub-sampling the source vertices you can avoid the "overhead" you
were referring to.

--
Tiago de Paula Peixoto 
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de


[graph-tool] Speed up graph_tool.topology.shortest_distance

2021-10-26 Thread Monecke, Stephan
Hi!



I use graph_tool.topology.shortest_distance for an all pairs shortest path 
calculation, what is the main run-time footprint of my algorithm and way to 
large.


How would you speed it up / tackle this?


I tried to sub-sample with manual source, target pairs but that's terribly 
inefficient since it does not use the internal bookkeeping.

Nice would be graph_tool.topology.shortest_distance(G, U, V), where U and V are 
lists of same length with sources / targets but that's not implemented.



Thank a lot in advance!


Stephan

___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de