Hi Stephan,

On Wed, Nov 24, 2021 at 5:58 PM Monecke, Stephan
<[email protected]> wrote:
>
> Hi Aleks!
>
>
> The multiprocessing module is what I'm currently using as a workaround until 
> something other pops up - I just throw it into a mp.Pool.
>
> For the initial debugging / get-to-know phase, so to say.

This is the first time I hear that someone uses multiprocessing for
debugging. Usually its the other way around :)

>
> I don't know what exactly you've put into the multiprocessing module but I 
> don't see how this could be of help in my case.

You requested multi-threaded shortest_distance and wanted to compute
multiple paths and then get the mean dist?
So my suggestion was to throw Pool on shortest_distance(g=G,
source=V\B, target=V\B, weights=G.edge_properties['weight'],
samples=N) which is trivial to do e.g.

def wrap(node):
    return shortest_distance(g=G,  source=node, target=node,
weights=G.edge_properties['weight'])

res = multiprocessing.Pool().map(wrap, random.choices(V, N))
do_some_processing_on_result(res)...

This is not optimal due to various overheads, however from python its
the best you can get.
But if you know a bit of c++ and want to optimize some more,
graph-tool already uses openmp so making a parallel loop over vertices
should be easy. Tiago keeps the code very readable and everything is
logical so finding a place to make your modification should take much
time: 
https://git.skewed.de/count0/graph-tool/-/blob/master/src/graph/topology/graph_distance.cc

> I'm running a metropolis algorithm that runs hours on a very small graph. 
> Each step depends on the previous. On the graphs I'm actually - or at least - 
> aiming for, it would probably run at least over a week.
>
>
> I also have a large parameter space and would like to try something like 
> lipo* but same here: Just 100 iterations already approach 2 weeks on the tiny 
> graph and I can't multiprocess it.

You can for sure parallelize parameter opt with dlib. Check the
official docs 
(http://dlib.net/dlib/global_optimization/global_function_search_abstract.h.html#global_function_search)
however its already done in other frameworks:
listed by lipo readme: https://optuna.readthedocs.io/en/stable/
I've tested this one (~30 parameters):
https://github.com/facebookresearch/nevergrad

Good luck,
Aleks

> Cheers,
> Stephan
>
> *https://pypi.org/project/lipo/
>
> ________________________________
> Von: Aleksandar Trifunovic <[email protected]>
> Gesendet: Mittwoch, 24. November 2021 16:44
> An: Main discussion list for the graph-tool project
> Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
>
> Hi Stephan,
>
> I've used graph tool to compute a lot of shortest paths and used a
> dumb solution of just throwing it into multiprocessing module from
> python (with some lru_cache tricks). If you have enough memory and you
> don't call the main procedure often (seems to be the case?) then the
> overhead of multiprocessing becomes insignificant. A benefit is that
> this is pretty easy to code and scales perfectly to more cores (until
> you run out of memory).
>
> Best,
> Aleks
>
> On Wed, Nov 24, 2021 at 3:32 PM Monecke, Stephan
> <[email protected]> wrote:
> >
> > Hi Tiago and the list,
> >
> >
> >
> > my algorithm is still _infeasibly_ slow and > 90 % of the run-time results 
> > from shortest_path().
> >
> >
> > Do you have any idea on how this could be sped up?
> >
> >
> > I have the graph G, the set of all nodes V and a sub-set B ⊂ V.
> >
> >
> > I want to calculate or approximate the mean shortest distance between all 
> > pairs V\B x V\B but on the complete graph - or in other words, calculate 
> > the mean of all pairs shortest distance ( mean(shortest_distance()) ) but 
> > exclude n∈B from being source or target.
> >
> > Sub-sampling would be okay as well.
> >
> >
> > I currently don't see a way to do this fast with graph-tool besides writing 
> > a custom function but that probably exceeds my capacities. Am I missing 
> > something? Do you have any ideas?
> >
> >
> > Ideal would be something like a multi-threaded shortest_distance(g=G, 
> > source=V\B, target=V\B, weights=G.edge_properties['weight'], samples=N) 
> > with N pairs random sub-sampling and source / target being lists.
> >
> >
> >
> > Thanks a ton!
> >
> > Stephan
> >
> > ________________________________
> > Von: Monecke, Stephan <[email protected]>
> > Gesendet: Donnerstag, 18. November 2021 14:59
> > An: Main discussion list for the graph-tool project
> > Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
> >
> > This works like a charm!
> >
> > With it, I'm down to 228 ms ± 1.31 ms from 3.6 s ( 82.9 ms ± 2.45 ms for 
> > shortest_distance(G) ).
> >
> > It's still very very expensive as it's 2.75 times APSP but approaching 
> > reasonability. The costly remaining part is the manual B x B.*
> >
> > I tried to speed it up by the approximation to limit G to B since I know 
> > that B is connected but graph-tool segfaults right away (I wrote another 
> > ticket for that). :/
> >
> > If anyone else has any ideas on how to speed this up, I'd be very grateful!
> >
> >
> > Cheers,
> > Stephan
> >
> > - - - o - - -
> >
> > *Since  the graph is directed, we have the following: Say V is the set of 
> > all vertices and B a subset of V I want to exclude as sources / targets 
> > from APSP. Then the set of points is
> > (V\B) x (V\B) = [ (V x V) \ [ (V x B) + (B x V) ] ] U (B x B).
> >
> > Or in graph-tool terms:
> >  - (V x V): shortest_distance(G)
> >  - (B x V): shortest_distance(G, b, _) for b in B
> >  - (V x B): shortest_distance(GraphView(G, reversed=True), b, _) for b in B
> >  - (B x B): shortest_distance(G, u, v) for u,v in B x B
> >
> >
> >
> >
> > Von: Stephan Monecke <[email protected]>
> > Gesendet: Mittwoch, 17. November 2021 20:24
> > An: Main discussion list for the graph-tool project
> > Betreff: [graph-tool] Re: All Pairs Shortest Path with limitations
> >
> > 🥳 I didn't know that existed!
> >
> > Thanks a ton, I'm really looking forward to test it out tomorrow!
> >
> > Cheers,
> > Stephan
> > --
> > Sent from my mobile device. Please excuse my brevity.
> >
> > Am 17. November 2021 19:27:37 MEZ schrieb Tiago de Paula Peixoto 
> > <[email protected]>:
> > >Am 17.11.21 um 19:22 schrieb Monecke, Stephan:
> > >> The main reason is that shortest_distance() has a "single source, all 
> > >> targets" mode but no "single target, all sources" mode (the reverse) - 
> > >> as soon as "source" is empty, it defaults to APSP, neglecting "target".
> > >>
> > >
> > >Of course it does. Just reverse the graph with:
> > >
> > >     GraphView(g, reversed=True)
> > >
> > >
> > _______________________________________________
> > graph-tool mailing list -- [email protected]
> > To unsubscribe send an email to [email protected]
> >
> > _______________________________________________
> > graph-tool mailing list -- [email protected]
> > To unsubscribe send an email to [email protected]
> > _______________________________________________
> > graph-tool mailing list -- [email protected]
> > To unsubscribe send an email to [email protected]
> _______________________________________________
> graph-tool mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> _______________________________________________
> graph-tool mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
_______________________________________________
graph-tool mailing list -- [email protected]
To unsubscribe send an email to [email protected]

Reply via email to