[graph-tool] Re: memory handling when using graph-tool's MCMCs and multiprocessing on large networks

2021-10-22 Thread Sam G
thanks for your reply

``It is clear to you that if you run 10 processes in parallel you will need 10 
times more memory, right?``

this is clear to me. however, 2GB * 10 ~= 20GB, and the machine has 260GB of memory, 
so unless the algorithm is creating 10 copies of the graph within each iteration, 
it should be well within bounds

ideally the parallelization would not make a copy of the entire graph. since 
the edges are fixed, all each worker needs to produce is a vertex property. i 
haven't figured out how to do that yet, and was hoping someone might have tips on 
how to approach it in graph-tool. there are ways to share objects such as data 
frames and lists across processes, but i am not sure how to do the same with 
graph-tool graphs
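here is a minimal sketch of the pattern i have in mind, assuming the "fork" start 
method (the linux default), so the module-level graph is shared with the workers 
copy-on-write rather than re-loaded or pickled per process. the file name is a 
placeholder, and each worker returns only a plain numpy copy of the block labels, 
from which a vertex property can be rebuilt in the parent:

```
import multiprocessing as mp
import numpy as np
import graph_tool.all as gt

# loaded once at import time; forked workers see this object copy-on-write
g = gt.load_graph("large_graph", fmt="graphml")  # placeholder file name

def fit_sbm_labels():
    # fit the SBM and hand back only a plain array of block labels,
    # not the PropertyMap (which is tied to the Graph object)
    state = gt.minimize_blockmodel_dl(g)
    return np.array(state.get_blocks().a)

if __name__ == "__main__":
    ctx = mp.get_context("fork")  # assumes a platform where fork is available
    with ctx.Pool(10) as pool:
        jobs = [pool.apply_async(fit_sbm_labels) for _ in range(10)]
        label_arrays = [j.get() for j in jobs]

    # rebuild vertex properties in the parent if needed
    blocks = [g.new_vp("int", vals=a) for a in label_arrays]
```

(each worker would also need its own random seed for the chains to actually 
differ, as in my earlier question about identical entropies.)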

```Maybe you can reproduce the problem for a smaller graph, or show how 
much memory you are actually using for a single process. It would also 
be important to tell us what version of graph-tool you are running.```

will do this soon
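in the meantime, here is roughly how i plan to measure the single-process 
footprint, using only the standard library's resource module to report peak 
resident memory after one fit (the file name is a placeholder, and on linux 
ru_maxrss is reported in KiB, so this is an approximation):

```
import resource
import graph_tool.all as gt

g = gt.load_graph("large_graph", fmt="graphml")  # placeholder file name

state = gt.minimize_blockmodel_dl(g)

# peak resident set size of this process; on linux ru_maxrss is in KiB
peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(f"peak RSS after one fit: {peak_kib / 1024**2:.2f} GiB")
print("graph-tool version:", gt.__version__)
```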


[graph-tool] Re: memory handling when using graph-tool's MCMCs and multiprocessing on large networks

2021-10-21 Thread Sam G
thanks for your reply.

here's an example running minimize_blockmodel_dl() 10 times on 10 cores. when i 
run this on a large network (2GB, 2M vertices, 20M edges) i get a MemoryError. 

```
import graph_tool.all as gt
import multiprocessing as mp
import numpy as np

# load the large graph once at module level
g = gt.load_graph("large_graph", fmt="graphml")

N_iter = 10
N_core = 10

def fit_sbm():
    # fit the SBM and return the block membership property map
    state = gt.minimize_blockmodel_dl(g)
    b = state.get_blocks()
    return b

def _parallel_sbm(iter=N_iter):
    pool = mp.Pool(N_core)
    future_res = [pool.apply_async(fit_sbm) for m in range(iter)]
    res = [f.get() for f in future_res]
    return res

def parallel_fit_sbm(iter=N_iter):
    results = _parallel_sbm(iter)
    return results

results = parallel_fit_sbm()
```
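one thing i am not sure about is the multiprocessing start method: with "spawn" 
(the default on macos and windows) each worker re-imports the module and would 
load its own copy of the 2GB graph, whereas with "fork" (the linux default) the 
parent's copy is shared copy-on-write. a quick sanity check along these lines:

```
import multiprocessing as mp

# report which start method the pool will use; with "spawn" every worker
# re-imports the module and re-runs gt.load_graph(), duplicating the graph
print(mp.get_start_method())

# force fork explicitly where it is available (an assumption: posix platform)
mp.set_start_method("fork", force=True)
```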



[graph-tool] memory handling when using graph-tool's MCMCs and multiprocessing on large networks

2021-10-21 Thread Sam G
hi,

i was wondering if anyone had any tips on conserving memory when running 
graph-tool MCMCs on large networks in parallel.

i have a large network (2GB), and about 260GB of memory, and was surprised to 
receive a MemoryError when i ran 10 MCMC chains in parallel using 
multiprocessing. my impression was that the MCMC is relatively light on memory.

cheers,
-sam


[graph-tool] running minimize_blockmodel_dl() in parallel

2021-10-04 Thread Sam G
hi,

here is a simple example where i run minimize_blockmodel_dl() 10 times in 
parallel using multiprocessing and collect the entropy. when i run this, i get 
the same value of entropy every single time.

```
import multiprocessing as mp
import numpy as np
import time
import graph_tool.all as gt

# load graph
g = gt.collection.data["celegansneural"]

N_iter = 10

def get_sbm_entropy():
    np.random.seed()
    state = gt.minimize_blockmodel_dl(g)
    return state.entropy()

def _parallel_mc(iter=N_iter):
    pool = mp.Pool(10)

    future_res = [pool.apply_async(get_sbm_entropy) for _ in range(iter)]
    res = [f.get() for f in future_res]

    return res

def parallel_monte_carlo(iter=N_iter):
    entropies = _parallel_mc(iter)

    return entropies

parallel_monte_carlo()
```

result: [8331.810102822546, 8331.810102822546, 8331.810102822546, 
8331.810102822546, 8331.810102822546, 8331.810102822546, 8331.810102822546, 
8331.810102822546, 8331.810102822546, 8331.810102822546]

ultimately i would like to use this to keep the entropy as well as the block 
membership vector for each iteration.
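for what it's worth, here is a sketch of what i am aiming for, under two 
assumptions on my part: that np.random.seed() only reseeds numpy's generator 
while graph-tool draws from its own RNG (seeded via gt.seed_rng()), which would 
explain the identical entropies under fork, and that passing a distinct seed to 
each task is enough to make the chains differ. each worker returns the entropy 
together with a plain numpy copy of the block membership vector:

```
import multiprocessing as mp
import numpy as np
import graph_tool.all as gt

g = gt.collection.data["celegansneural"]

def get_sbm_fit(seed):
    # reseed graph-tool's own RNG in this worker; np.random.seed() alone
    # does not affect it
    gt.seed_rng(seed)
    state = gt.minimize_blockmodel_dl(g)
    # return the entropy together with a plain copy of the block labels
    return state.entropy(), np.array(state.get_blocks().a)

if __name__ == "__main__":
    seeds = range(1, 11)  # one distinct seed per chain
    with mp.Pool(10) as pool:
        fits = pool.map(get_sbm_fit, seeds)
    entropies = [e for e, _ in fits]
    blocks = [b for _, b in fits]
```

the seed handling is just a guess on my part, so corrections are welcome.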

any ideas?

cheers,
-sam
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de