Re: [julia-users] Initializing a SharedArray Memory Error

2014-12-11 Thread benFranklin
I have noticed that these remote references can't be fetched:

fetch(zeroMatrix.refs[1]) 

the driver process just waits forever, so I'm thinking that the remotecall_wait() in 
https://github.com/JuliaLang/julia/blob/f3c355115ab02868ac644a5561b788fc16738443/base/sharedarray.jl#L96
exits before it should. Any ideas?
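
A quick way to check which of the per-worker refs never became ready (a diagnostic 
sketch only, written against Julia 0.3's RemoteRef API and assuming its where field; 
note that isready itself asks the owning worker, so it can also block if that worker 
is truly wedged):

for (i, r) in enumerate(zeroMatrix.refs)
    # isready(r) is true once the worker has stored a value in the ref,
    # i.e. once that worker's part of the init has actually completed
    println("ref $i on worker $(r.where): ready = $(isready(r))")
end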

On Wednesday, 10 December 2014 13:47:19 UTC-5, benFranklin wrote:
>
> I think you are right about some references not being released yet:
>
> If I change the while loop to include your way of replacing every 
> reference, the put! never actually gets executed; it just waits:
>
> while true
> zeroMatrix = 
> SharedArray(Float64,(nQ,nQ,3,nQ,nQ,nQ),pids=workers(), init = x->inF(x,nQ))
> println("ran!")
>
> for i = 1:length(zeroMatrix.refs) 
> put!(zeroMatrix.refs[i], 1) 
> end 
> @everywhere gc()
>end
> ran!
> 
>
> Runs once and stalls, after C-c:
>
>
> ^CERROR: interrupt
>  in process_events at /usr/bin/../lib64/julia/sys.so
>  in wait at /usr/bin/../lib64/julia/sys.so (repeats 2 times)
>  in wait_full at /usr/bin/../lib64/julia/sys.so
> 
> After C-d
>
> julia> 
>
> WARNING: Forcibly interrupting busy workers
> error in running finalizer: InterruptException()
> error in running finalizer: InterruptException()
> WARNING: Unable to terminate all workers
> [...]
>
>
> It seems that after the init function not all workers are "done". I'll see if 
> there's something weird with that part, but since the SharedArray is being 
> returned, I don't see any reason why that should be the case.
>
>
>
> On Wednesday, 10 December 2014 05:19:55 UTC-5, Tim Holy wrote:
>>
>> After your gc() it should be able to be unmapped, see 
>>
>> https://github.com/JuliaLang/julia/blob/f3c355115ab02868ac644a5561b788fc16738443/base/mmap.jl#L113
>>  
>>
>> My guess is something in the parallel architecture is holding a 
>> reference. 
>> Have you tried going at this systematically from the internal 
>> representation 
>> of the SharedArray? For example, I might consider trying to put! new 
>> stuff in 
>> zeroMatrix.refs: 
>>
>> for i = 1:length(zeroMatrix.refs) 
>> put!(zeroMatrix.refs[i], 1) 
>> end 
>>
>> before calling gc(). I don't know if this will work, but it's where I'd 
>> start 
>> experimenting. 
>>
>> If you can fix this, please do submit a pull request. 
>>
>> Best, 
>> --Tim 
>>
>> On Tuesday, December 09, 2014 08:06:10 PM ele...@gmail.com wrote: 
>> > On Wednesday, December 10, 2014 12:28:29 PM UTC+10, benFranklin wrote: 
>> > > I've made a small example of the memory problems I've been running into. I 
>> > > can't find a way to deallocate a SharedArray, 
>> > 
>> > Someone more expert might find it, but I can't see anywhere that the 
>> > mmapped memory is unmapped. 
>> > 
>> > > if the code below runs once, it means the computer has enough memory to 
>> > > run this. If I can properly deallocate the memory I should be able to do it 
>> > > again, however, I run out of memory. Am I misunderstanding something about 
>> > > garbage collection in Julia? 
>> > > 
>> > > Thanks for your attention 
>> > > 
>> > > Code: 
>> > > 
>> > > @everywhere nQ = 60 
>> > > 
>> > > @everywhere function inF(x::SharedArray,nQ::Int64) 
>> > > 
>> > > number = myid()-1; 
>> > > targetLength = nQ*nQ*3 
>> > > 
>> > > startN = floor((number-1)*targetLength/nworkers()) + 1 
>> > > endN = floor(number*targetLength/nworkers()) 
>> > > 
>> > > myIndexes = int64(startN:endN) 
>> > > for j in myIndexes 
>> > > inds = ind2sub((nQ,nQ,nQ),j) 
>> > > x[inds[1],inds[2],inds[3],:,:,:] = rand(nQ,nQ,nQ) 
>> > > end 
>> > > 
>> > > end 
>> > > 
>> > > while true 
>> > > zeroMatrix = SharedArray(Float64,(nQ,nQ,3,nQ,nQ,nQ),pids=workers(), init = 
>> > > x->inF(x,nQ)) 
>> > > println("ran!") 
>> > > @everywhere zeroMatrix = 1 
>> > > @everywhere gc() 
>> > > end 
>> > > 
>> > > On Monday, 8 December 2014 23:43:03 UTC-5, Isaiah wrote: 
>> > >> Hopefully you will get an answer on pmap from someone more familiar with 
>> > >> the parallel stuff, but: have you tried splitting the init step? (see the 
>> > >> example in the manual for how to init an array in chunks done by different 
>> > >> workers). Just guessing though: I'm not sure if/how those will be 
>> > >> serialized if each worker is contending for the whole array. 
>> > >> 
>> > >> On Fri, Dec 5, 2014 at 4:23 PM, benFranklin  wrote: 
>> > >>> Hi all, I'm trying to figure out how to best initialize a SharedArray, 
>> > >>> using a C function to fill it up that computes a huge matrix in parts, and 
>> > >>> all comments are appreciated. To summarise: Is A, making an empty shared 
>> > >>> array, computing the matrix in parallel using pmap and then filling it up 
>> > >>> serially, better than using B, computing in parallel and storing in one 
>> > >>> step by using an init function in the SharedArray declaration? 
>> > >>> 

Re: [julia-users] Initializing a SharedArray Memory Error

2014-12-10 Thread benFranklin
I think you are right about some references not being released yet:

If I change the while loop to include your way of replacing every reference, 
the put! never actually gets executed; it just waits:

while true
    zeroMatrix = SharedArray(Float64, (nQ,nQ,3,nQ,nQ,nQ), pids=workers(), init = x->inF(x,nQ))
    println("ran!")

    for i = 1:length(zeroMatrix.refs)
        put!(zeroMatrix.refs[i], 1)
    end
    @everywhere gc()
end
ran!


Runs once and stalls, after C-c:


^CERROR: interrupt
 in process_events at /usr/bin/../lib64/julia/sys.so
 in wait at /usr/bin/../lib64/julia/sys.so (repeats 2 times)
 in wait_full at /usr/bin/../lib64/julia/sys.so

After C-d

julia> 

WARNING: Forcibly interrupting busy workers
error in running finalizer: InterruptException()
error in running finalizer: InterruptException()
WARNING: Unable to terminate all workers
[...]


It seems that after the init function not all workers are "done". I'll see if 
there's something weird with that part, but since the SharedArray is being 
returned, I don't see any reason why that should be the case.
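
One thing worth checking here (a sketch only, assuming Julia 0.3's RemoteRef 
semantics, where put! on a ref that already holds a value waits for it to be 
emptied first): if the refs still hold the values stored during construction, 
the put! loop above would stall even though every worker finished its init.

for (i, r) in enumerate(zeroMatrix.refs)
    if isready(r)     # the ref still holds a value from the construction/init step
        take!(r)      # empty it, so the following put! cannot block on a full ref
    end
    put!(r, 1)
end
@everywhere gc()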



On Wednesday, 10 December 2014 05:19:55 UTC-5, Tim Holy wrote:
>
> After your gc() it should be able to be unmapped, see 
>
> https://github.com/JuliaLang/julia/blob/f3c355115ab02868ac644a5561b788fc16738443/base/mmap.jl#L113
>  
>
> My guess is something in the parallel architecture is holding a reference. 
> Have you tried going at this systematically from the internal 
> representation 
> of the SharedArray? For example, I might consider trying to put! new stuff 
> in 
> zeroMatrix.refs: 
>
> for i = 1:length(zeroMatrix.refs) 
> put!(zeroMatrix.refs[i], 1) 
> end 
>
> before calling gc(). I don't know if this will work, but it's where I'd 
> start 
> experimenting. 
>
> If you can fix this, please do submit a pull request. 
>
> Best, 
> --Tim 
>
> On Tuesday, December 09, 2014 08:06:10 PM ele...@gmail.com wrote: 
> > On Wednesday, December 10, 2014 12:28:29 PM UTC+10, benFranklin wrote: 
> > > I've made a small example of the memory problems I've been running into. I 
> > > can't find a way to deallocate a SharedArray, 
> > 
> > Someone more expert might find it, but I can't see anywhere that the 
> > mmapped memory is unmapped. 
> > 
> > > if the code below runs once, it means the computer has enough memory to 
> > > run this. If I can properly deallocate the memory I should be able to do it 
> > > again, however, I run out of memory. Am I misunderstanding something about 
> > > garbage collection in Julia? 
> > > 
> > > Thanks for your attention 
> > > 
> > > Code: 
> > > 
> > > @everywhere nQ = 60 
> > > 
> > > @everywhere function inF(x::SharedArray,nQ::Int64) 
> > > 
> > > number = myid()-1; 
> > > targetLength = nQ*nQ*3 
> > > 
> > > startN = floor((number-1)*targetLength/nworkers()) + 1 
> > > endN = floor(number*targetLength/nworkers()) 
> > > 
> > > myIndexes = int64(startN:endN) 
> > > for j in myIndexes 
> > > inds = ind2sub((nQ,nQ,nQ),j) 
> > > x[inds[1],inds[2],inds[3],:,:,:] = rand(nQ,nQ,nQ) 
> > > end 
> > > 
> > > end 
> > > 
> > > while true 
> > > zeroMatrix = SharedArray(Float64,(nQ,nQ,3,nQ,nQ,nQ),pids=workers(), init = 
> > > x->inF(x,nQ)) 
> > > println("ran!") 
> > > @everywhere zeroMatrix = 1 
> > > @everywhere gc() 
> > > end 
> > > 
> > > On Monday, 8 December 2014 23:43:03 UTC-5, Isaiah wrote: 
> > >> Hopefully you will get an answer on pmap from someone more familiar with 
> > >> the parallel stuff, but: have you tried splitting the init step? (see the 
> > >> example in the manual for how to init an array in chunks done by different 
> > >> workers). Just guessing though: I'm not sure if/how those will be 
> > >> serialized if each worker is contending for the whole array. 
> > >> 
> > >> On Fri, Dec 5, 2014 at 4:23 PM, benFranklin  wrote: 
> > >>> Hi all, I'm trying to figure out how to best initialize a SharedArray, 
> > >>> using a C function to fill it up that computes a huge matrix in parts, and 
> > >>> all comments are appreciated. To summarise: Is A, making an empty shared 
> > >>> array, computing the matrix in parallel using pmap and then filling it up 
> > >>> serially, better than using B, computing in parallel and storing in one 
> > >>> step by using an init function in the SharedArray declaration? 
> > >>> 
> > >>> The difference tends to be that B uses a lot more memory, each process 
> > >>> using the exact same amount of memory. However it is much faster than A, as 
> > >>> the copy step takes longer than the computation, but in A most of the 
> > >>> memory usage is in one process, using less memory overall. 
> > >>> 
> > >>> Any tips on how to do this better? Also, this pmap is how I'm handling 
> > >>> more complex parallelizations in Julia. Any comments on that approach? 
> > >>> 
> > >>> Thanks a lot! 
> > >>> 
> > >>> Best, 
> > >>>

Re: [julia-users] Initializing a SharedArray Memory Error

2014-12-10 Thread Tim Holy
After your gc() it should be able to be unmapped, see 
https://github.com/JuliaLang/julia/blob/f3c355115ab02868ac644a5561b788fc16738443/base/mmap.jl#L113

My guess is something in the parallel architecture is holding a reference. 
Have you tried going at this systematically from the internal representation 
of the SharedArray? For example, I might consider trying to put! new stuff in 
zeroMatrix.refs:

for i = 1:length(zeroMatrix.refs)
    put!(zeroMatrix.refs[i], 1)
end

before calling gc(). I don't know if this will work, but it's where I'd start 
experimenting.
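
To see whether the segment actually disappears after the gc(), one rough check 
(a sketch only, Linux-specific, since the segments backing a SharedArray are 
created with shm_open and show up under /dev/shm there) is:

run(`df -h /dev/shm`)   # before: shows the tmpfs space taken by the shared segments
@everywhere gc()
run(`df -h /dev/shm`)   # after: usage should drop back down if the mmap finalizer ran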

If you can fix this, please do submit a pull request.

Best,
--Tim

On Tuesday, December 09, 2014 08:06:10 PM ele...@gmail.com wrote:
> On Wednesday, December 10, 2014 12:28:29 PM UTC+10, benFranklin wrote:
> > I've made a small example of the memory problems I've been running into. I
> > can't find a way to deallocate a SharedArray,
> 
> Someone more expert might find it, but I can't see anywhere that the
> mmapped memory is unmapped.
> 
> > if the code below runs once, it means the computer has enough memory to
> > run this. If I can properly deallocate the memory I should be able to do
> > it
> > again, however, I run out of memory. Am I misunderstanding something about
> > garbage collection in Julia?
> > 
> > Thanks for your attention
> > 
> > Code:
> > 
> > @everywhere nQ = 60
> > 
> > @everywhere function inF(x::SharedArray,nQ::Int64)
> > 
> > number = myid()-1;
> > targetLength = nQ*nQ*3
> > 
> > startN = floor((number-1)*targetLength/nworkers()) + 1
> > endN = floor(number*targetLength/nworkers())
> > 
> > myIndexes = int64(startN:endN)
> > for j in myIndexes
> > inds = ind2sub((nQ,nQ,nQ),j)
> > x[inds[1],inds[2],inds[3],:,:,:] = rand(nQ,nQ,nQ)
> > end
> > 
> > 
> > end
> > 
> > while true
> > zeroMatrix = SharedArray(Float64,(nQ,nQ,3,nQ,nQ,nQ),pids=workers(), init =
> > x->inF(x,nQ))
> > println("ran!")
> > @everywhere zeroMatrix = 1
> > @everywhere gc()
> > end
> > 
> > On Monday, 8 December 2014 23:43:03 UTC-5, Isaiah wrote:
> >> Hopefully you will get an answer on pmap from someone more familiar with
> >> the parallel stuff, but: have you tried splitting the init step? (see the
> >> example in the manual for how to init an array in chunks done by
> >> different
> >> workers). Just guessing though: I'm not sure if/how those will be
> >> serialized if each worker is contending for the whole array.
> >> 
> >> On Fri, Dec 5, 2014 at 4:23 PM, benFranklin  wrote:
> >>> Hi all, I'm trying to figure out how to best initialize a SharedArray,
> >>> using a C function to fill it up that computes a huge matrix in parts,
> >>> and
> >>> all comments are appreciated. To summarise: Is A, making an empty shared
> >>> array, computing the matrix in parallel using pmap and then filling it
> >>> up
> >>> serially, better than using B, computing in parallel and storing in one
> >>> step by using an init function in the SharedArray declaration?
> >>> 
> >>> 
> >>> The difference tends to be that B uses a lot more memory, each process
> >>> using the exact same amount of memory. However it is much faster than A,
> >>> as
> >>> the copy step takes longer than the computation, but in A most of the
> >>> memory usage is in one process, using less memory overall.
> >>> 
> >>> Any tips on how to do this better? Also, this pmap is how I'm handling
> > >>> more complex parallelizations in Julia. Any comments on that approach? 
> >>> 
> >>> Thanks a lot!
> >>> 
> >>> Best,
> >>> Ben
> >>> 
> >>> 
> >>> CODE A:
> >>> 
> >>> Is this, making an empty shared array, computing the matrix in parallel
> >>> and then filling it up serially:
> >>> 
> >>> function findZeroDividends(model::ModelPrivate)
> >>> 
> >>> nW = length(model.vW)
> >>> nZ = length(model.vZ)
> >>> nK = length(model.vK)
> >>> nQ = length(model.vQ)
> >>> 
> >>>  zeroMatrix = SharedArray(Float64,(nW,nZ,nK,nQ,nQ,nQ),pids=workers())
> >>> 
> >>> input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ,  k in
> >>> 1:nK];
> >>> results = pmap(findZeroInC,input);
> >>> 
> >>> for w in 1:nW
> >>> for z in 1:nZ
> >>> for k in 1:nK
> >>> 
> >>> zeroMatrix[w,z,k,:,:,:] = results[w + nW*((z-1) + nZ*(k-1))]
> >>> 
> >>>  end
> >>> 
> >>> end
> >>> end
> >>> 
> >>> return zeroMatrix
> >>> end
> >>> 
> >>> ___
> >>> 
> >>> CODE B:
> >>> 
> >>> Better than these two:
> >>> 
> >>> function
> >>> start(x::SharedArray,nW::Int64,nZ::Int64,nK::Int64,model::ModelPrivate)
> >>> 
> >>> for j in myid()-1:nworkers():(nW*nZ*nK)
> >>> inds = ind2sub((nW,nZ,nK),j)
> >>> x[inds[1],inds[2],inds[3],:,:,:]
> >>> =findZeroInC(stateFindZeroK(inds[1],inds[2],inds[3],model))
> >>> end
> >>> 
> >>> x
> >>> 
> >>> end
> >>> 
> >>> function findZeroDividendsSmart(model::ModelPrivate)
> >>> 
> >>> nW = length(model.vW)
> >>> nZ = length(model.vZ)
> >>> nK = length(model.vK)
> >>> nQ = length(model.vQ)
> >>> 
> >>> #input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ,  k in
> >>> 1:nK];
> >>> #results = pmap(f

Re: [julia-users] Initializing a SharedArray Memory Error

2014-12-09 Thread elextr


On Wednesday, December 10, 2014 12:28:29 PM UTC+10, benFranklin wrote:
>
> I've made a small example of the memory problems I've been running into. I 
> can't find a way to deallocate a SharedArray,
>

Someone more expert might find it, but I can't see anywhere that the 
mmapped memory is unmapped.

 

> if the code below runs once, it means the computer has enough memory to 
> run this. If I can properly deallocate the memory I should be able to do it 
> again, however, I run out of memory. Am I misunderstanding something about 
> garbage collection in Julia?
>
> Thanks for your attention
>
> Code: 
>
> @everywhere nQ = 60
>
> @everywhere function inF(x::SharedArray,nQ::Int64)
>
> number = myid()-1;
> targetLength = nQ*nQ*3
>
> startN = floor((number-1)*targetLength/nworkers()) + 1
> endN = floor(number*targetLength/nworkers())
>
> myIndexes = int64(startN:endN)
> for j in myIndexes
> inds = ind2sub((nQ,nQ,nQ),j)
> x[inds[1],inds[2],inds[3],:,:,:] = rand(nQ,nQ,nQ)
> end
>
>
> end
>
> while true
> zeroMatrix = SharedArray(Float64,(nQ,nQ,3,nQ,nQ,nQ),pids=workers(), init = 
> x->inF(x,nQ))
> println("ran!")
> @everywhere zeroMatrix = 1
> @everywhere gc()
> end
>
> On Monday, 8 December 2014 23:43:03 UTC-5, Isaiah wrote:
>>
>> Hopefully you will get an answer on pmap from someone more familiar with 
>> the parallel stuff, but: have you tried splitting the init step? (see the 
>> example in the manual for how to init an array in chunks done by different 
>> workers). Just guessing though: I'm not sure if/how those will be 
>> serialized if each worker is contending for the whole array.
>>
>> On Fri, Dec 5, 2014 at 4:23 PM, benFranklin  wrote:
>>
>>> Hi all, I'm trying to figure out how to best initialize a SharedArray, 
>>> using a C function to fill it up that computes a huge matrix in parts, and 
>>> all comments are appreciated. To summarise: Is A, making an empty shared 
>>> array, computing the matrix in parallel using pmap and then filling it up 
>>> serially, better than using B, computing in parallel and storing in one 
>>> step by using an init function in the SharedArray declaration?
>>>
>>>
>>> The difference tends to be that B uses a lot more memory, each process 
>>> using the exact same amount of memory. However it is much faster than A, as 
>>> the copy step takes longer than the computation, but in A most of the 
>>> memory usage is in one process, using less memory overall.
>>>
>>> Any tips on how to do this better? Also, this pmap is how I'm handling 
>>> more complex parallelizations in Julia. Any comments on that approach?
>>>
>>> Thanks a lot!
>>>
>>> Best,
>>> Ben
>>>
>>>
>>> CODE A:
>>>
>>> Is this, making an empty shared array, computing the matrix in parallel 
>>> and then filling it up serially:
>>>
>>> function findZeroDividends(model::ModelPrivate)
>>>
>>> nW = length(model.vW)
>>> nZ = length(model.vZ)
>>> nK = length(model.vK)
>>> nQ = length(model.vQ)
>>>  zeroMatrix = SharedArray(Float64,(nW,nZ,nK,nQ,nQ,nQ),pids=workers())
>>>
>>> input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ,  k in 
>>> 1:nK];
>>> results = pmap(findZeroInC,input);
>>>
>>> for w in 1:nW
>>> for z in 1:nZ
>>> for k in 1:nK
>>>
>>> zeroMatrix[w,z,k,:,:,:] = results[w + nW*((z-1) + nZ*(k-1))]
>>>  end
>>> end
>>> end
>>>
>>> return zeroMatrix
>>> end
>>>
>>> ___
>>>
>>> CODE B:
>>>
>>> Better than these two:
>>>
>>> function 
>>> start(x::SharedArray,nW::Int64,nZ::Int64,nK::Int64,model::ModelPrivate)
>>>
>>> for j in myid()-1:nworkers():(nW*nZ*nK)
>>> inds = ind2sub((nW,nZ,nK),j)
>>> x[inds[1],inds[2],inds[3],:,:,:] 
>>> =findZeroInC(stateFindZeroK(inds[1],inds[2],inds[3],model))
>>> end
>>>
>>> x
>>>
>>> end
>>>
>>> function findZeroDividendsSmart(model::ModelPrivate)
>>>
>>> nW = length(model.vW)
>>> nZ = length(model.vZ)
>>> nK = length(model.vK)
>>> nQ = length(model.vQ)
>>>
>>> #input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ,  k in 
>>> 1:nK];
>>> #results = pmap(findZeroInC,input);
>>>
>>> zeroMatrix = SharedArray(Float64,(nW,nZ,nK,nQ,nQ,nQ),pids=workers(), 
>>> init = x->start(x,nW,nZ,nK,model) )
>>>
>>> return zeroMatrix
>>> end
>>>
>>> 
>>>
>>> The C function being called is inside this wrapper and returns the 
>>> pointer to  double *capitalChoices = (double 
>>> *)malloc(sizeof(double)*nQ*nQ*nQ);
>>>
>>> function findZeroInC(state::stateFindZeroK)
>>>
>>> w = state.wealth
>>> z = state.z
>>> k = state.k
>>> model = state.model
>>>
>>> #findZeroInC(double wealth, int z,int k,  double theta, double delta, 
>>>  double* vK,
>>> # int nK, double* vQ, int nQ, double* transition, double betaGov)
>>>
>>> nQ = length(model.vQ)
>>>
>>> t = ccall((:findZeroInC,"findP.so"), 
>>> Ptr{Float64},(Float64,Int64,Int64,Float64,Float64,Ptr{Float64},Int64,Ptr{Float64},Int64,Ptr{Float64},Float64),
>>>
>>> model.vW[w],z-1,k-1,model.theta,model.delta,model.vK,length(model.vK),model.vQ,nQ,model.transition,model.betaGov)
>>> if t == C_NULL
>>> error("N

Re: [julia-users] Initializing a SharedArray Memory Error

2014-12-09 Thread benFranklin
I've made a small example of the memory problems I've been running into. I 
can't find a way to deallocate a SharedArray. If the code below runs once, 
it means the computer has enough memory to run it, and if I could properly 
deallocate the memory I should be able to run it again; however, I run out 
of memory. Am I misunderstanding something about garbage collection in 
Julia?

Thanks for your attention

Code: 

@everywhere nQ = 60

@everywhere function inF(x::SharedArray, nQ::Int64)
    number = myid() - 1
    targetLength = nQ*nQ*3

    startN = floor((number-1)*targetLength/nworkers()) + 1
    endN = floor(number*targetLength/nworkers())

    myIndexes = int64(startN:endN)
    for j in myIndexes
        inds = ind2sub((nQ,nQ,nQ), j)
        x[inds[1], inds[2], inds[3], :, :, :] = rand(nQ,nQ,nQ)
    end
end

while true
    zeroMatrix = SharedArray(Float64, (nQ,nQ,3,nQ,nQ,nQ), pids=workers(), init = x->inF(x,nQ))
    println("ran!")
    @everywhere zeroMatrix = 1
    @everywhere gc()
end
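
To see where the memory actually piles up between iterations, one rough check 
(a sketch only, Linux-specific, and assuming the usual 4096-byte page size) is 
to report each process's resident set size right after the gc():

# resident-set size of the current process in MB, read from /proc/self/statm
@everywhere rss_mb() = int(split(readall("/proc/self/statm"))[2]) * 4096 / 1024^2

# driver plus every worker, in pid order
println([remotecall_fetch(p, rss_mb) for p in procs()])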

On Monday, 8 December 2014 23:43:03 UTC-5, Isaiah wrote:
>
> Hopefully you will get an answer on pmap from someone more familiar with 
> the parallel stuff, but: have you tried splitting the init step? (see the 
> example in the manual for how to init an array in chunks done by different 
> workers). Just guessing though: I'm not sure if/how those will be 
> serialized if each worker is contending for the whole array.
>
> On Fri, Dec 5, 2014 at 4:23 PM, benFranklin  wrote:
>
>> Hi all, I'm trying to figure out how to best initialize a SharedArray, 
>> using a C function to fill it up that computes a huge matrix in parts, and 
>> all comments are appreciated. To summarise: Is A, making an empty shared 
>> array, computing the matrix in parallel using pmap and then filling it up 
>> serially, better than using B, computing in parallel and storing in one 
>> step by using an init function in the SharedArray declaration?
>>
>>
>> The difference tends to be that B uses a lot more memory, each process 
>> using the exact same amount of memory. However it is much faster than A, as 
>> the copy step takes longer than the computation, but in A most of the 
>> memory usage is in one process, using less memory overall.
>>
>> Any tips on how to do this better? Also, this pmap is how I'm handling 
>> more complex parallelizations in Julia. Any comments on that approach?
>>
>> Thanks a lot!
>>
>> Best,
>> Ben
>>
>>
>> CODE A:
>>
>> Is this, making an empty shared array, computing the matrix in parallel 
>> and then filling it up serially:
>>
>> function findZeroDividends(model::ModelPrivate)
>>
>> nW = length(model.vW)
>> nZ = length(model.vZ)
>> nK = length(model.vK)
>> nQ = length(model.vQ)
>>  zeroMatrix = SharedArray(Float64,(nW,nZ,nK,nQ,nQ,nQ),pids=workers())
>>
>> input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ,  k in 
>> 1:nK];
>> results = pmap(findZeroInC,input);
>>
>> for w in 1:nW
>> for z in 1:nZ
>> for k in 1:nK
>>
>> zeroMatrix[w,z,k,:,:,:] = results[w + nW*((z-1) + nZ*(k-1))]
>>  end
>> end
>> end
>>
>> return zeroMatrix
>> end
>>
>> ___
>>
>> CODE B:
>>
>> Better than these two:
>>
>> function 
>> start(x::SharedArray,nW::Int64,nZ::Int64,nK::Int64,model::ModelPrivate)
>>
>> for j in myid()-1:nworkers():(nW*nZ*nK)
>> inds = ind2sub((nW,nZ,nK),j)
>> x[inds[1],inds[2],inds[3],:,:,:] 
>> =findZeroInC(stateFindZeroK(inds[1],inds[2],inds[3],model))
>> end
>>
>> x
>>
>> end
>>
>> function findZeroDividendsSmart(model::ModelPrivate)
>>
>> nW = length(model.vW)
>> nZ = length(model.vZ)
>> nK = length(model.vK)
>> nQ = length(model.vQ)
>>
>> #input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ,  k in 
>> 1:nK];
>> #results = pmap(findZeroInC,input);
>>
>> zeroMatrix = SharedArray(Float64,(nW,nZ,nK,nQ,nQ,nQ),pids=workers(), init 
>> = x->start(x,nW,nZ,nK,model) )
>>
>> return zeroMatrix
>> end
>>
>> 
>>
>> The C function being called is inside this wrapper and returns the 
>> pointer to  double *capitalChoices = (double 
>> *)malloc(sizeof(double)*nQ*nQ*nQ);
>>
>> function findZeroInC(state::stateFindZeroK)
>>
>> w = state.wealth
>> z = state.z
>> k = state.k
>> model = state.model
>>
>> #findZeroInC(double wealth, int z,int k,  double theta, double delta, 
>>  double* vK,
>> # int nK, double* vQ, int nQ, double* transition, double betaGov)
>>
>> nQ = length(model.vQ)
>>
>> t = ccall((:findZeroInC,"findP.so"), 
>> Ptr{Float64},(Float64,Int64,Int64,Float64,Float64,Ptr{Float64},Int64,Ptr{Float64},Int64,Ptr{Float64},Float64),
>>
>> model.vW[w],z-1,k-1,model.theta,model.delta,model.vK,length(model.vK),model.vQ,nQ,model.transition,model.betaGov)
>> if t == C_NULL
>> error("NULL")
>> end
>>
>> return pointer_to_array(t,(nQ,nQ,nQ),true)
>>
>> end
>>
>>
>> 
>>
>>
>>
>

Re: [julia-users] Initializing a SharedArray Memory Error

2014-12-08 Thread Isaiah Norton
Hopefully you will get an answer on pmap from someone more familiar with
the parallel stuff, but: have you tried splitting the init step? (see the
example in the manual for how to init an array in chunks done by different
workers). Just guessing though: I'm not sure if/how those will be
serialized if each worker is contending for the whole array.
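
For reference, a minimal sketch of the chunked-init pattern the manual describes, 
using localindexes from Julia 0.3's SharedArray API so that each worker only writes 
its own block of linear indices (the shape and the rand() fill here are placeholders, 
not the poster's actual computation):

@everywhere function fill_chunk!(S::SharedArray)
    # localindexes(S) gives each worker a disjoint range of linear indices
    for i in localindexes(S)
        S[i] = rand()
    end
    S
end

S = SharedArray(Float64, (60, 60, 3), pids=workers(), init=fill_chunk!)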

On Fri, Dec 5, 2014 at 4:23 PM, benFranklin  wrote:

> Hi all, I'm trying to figure out how to best initialize a SharedArray,
> using a C function to fill it up that computes a huge matrix in parts, and
> all comments are appreciated. To summarise: Is A, making an empty shared
> array, computing the matrix in parallel using pmap and then filling it up
> serially, better than using B, computing in parallel and storing in one
> step by using an init function in the SharedArray declaration?
>
>
> The difference tends to be that B uses a lot more memory, each process
> using the exact same amount of memory. However it is much faster than A, as
> the copy step takes longer than the computation, but in A most of the
> memory usage is in one process, using less memory overall.
>
> Any tips on how to do this better? Also, this pmap is how I'm handling
> more complex parallelizations in Julia. Any comments on that approach?
>
> Thanks a lot!
>
> Best,
> Ben
>
>
> CODE A:
>
> Is this, making an empty shared array, computing the matrix in parallel
> and then filling it up serially:
>
> function findZeroDividends(model::ModelPrivate)
>
> nW = length(model.vW)
> nZ = length(model.vZ)
> nK = length(model.vK)
> nQ = length(model.vQ)
>  zeroMatrix = SharedArray(Float64,(nW,nZ,nK,nQ,nQ,nQ),pids=workers())
>
> input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ,  k in 1:nK];
> results = pmap(findZeroInC,input);
>
> for w in 1:nW
> for z in 1:nZ
> for k in 1:nK
>
> zeroMatrix[w,z,k,:,:,:] = results[w + nW*((z-1) + nZ*(k-1))]
>  end
> end
> end
>
> return zeroMatrix
> end
>
> ___
>
> CODE B:
>
> Better than these two:
>
> function
> start(x::SharedArray,nW::Int64,nZ::Int64,nK::Int64,model::ModelPrivate)
>
> for j in myid()-1:nworkers():(nW*nZ*nK)
> inds = ind2sub((nW,nZ,nK),j)
> x[inds[1],inds[2],inds[3],:,:,:]
> =findZeroInC(stateFindZeroK(inds[1],inds[2],inds[3],model))
> end
>
> x
>
> end
>
> function findZeroDividendsSmart(model::ModelPrivate)
>
> nW = length(model.vW)
> nZ = length(model.vZ)
> nK = length(model.vK)
> nQ = length(model.vQ)
>
> #input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ,  k in
> 1:nK];
> #results = pmap(findZeroInC,input);
>
> zeroMatrix = SharedArray(Float64,(nW,nZ,nK,nQ,nQ,nQ),pids=workers(), init
> = x->start(x,nW,nZ,nK,model) )
>
> return zeroMatrix
> end
>
> 
>
> The C function being called is inside this wrapper and returns the pointer
> to  double *capitalChoices = (double *)malloc(sizeof(double)*nQ*nQ*nQ);
>
> function findZeroInC(state::stateFindZeroK)
>
> w = state.wealth
> z = state.z
> k = state.k
> model = state.model
>
> #findZeroInC(double wealth, int z,int k,  double theta, double delta,
>  double* vK,
> # int nK, double* vQ, int nQ, double* transition, double betaGov)
>
> nQ = length(model.vQ)
>
> t = ccall((:findZeroInC,"findP.so"),
> Ptr{Float64},(Float64,Int64,Int64,Float64,Float64,Ptr{Float64},Int64,Ptr{Float64},Int64,Ptr{Float64},Float64),
>
> model.vW[w],z-1,k-1,model.theta,model.delta,model.vK,length(model.vK),model.vQ,nQ,model.transition,model.betaGov)
> if t == C_NULL
> error("NULL")
> end
>
> return pointer_to_array(t,(nQ,nQ,nQ),true)
>
> end
>
>
> 
>
>
>


[julia-users] Initializing a SharedArray Memory Error

2014-12-05 Thread benFranklin
Hi all, I'm trying to figure out how best to initialize a SharedArray using a 
C function that computes a huge matrix in parts; all comments are appreciated. 
To summarise: is A (making an empty shared array, computing the matrix in 
parallel using pmap, and then filling it up serially) better than B (computing 
in parallel and storing in one step by using an init function in the 
SharedArray declaration)?


The difference tends to be that B uses a lot more memory, with each process 
using the exact same amount. However, B is much faster than A, since in A the 
copy step takes longer than the computation; on the other hand, in A most of 
the memory usage sits in one process, and less memory is used overall.

Any tips on how to do this better? Also, this pmap pattern is how I'm handling 
more complex parallelizations in Julia. Any comments on that approach?

Thanks a lot!

Best,
Ben


CODE A:

Is this, making an empty shared array, computing the matrix in parallel and 
then filling it up serially:

function findZeroDividends(model::ModelPrivate)
    nW = length(model.vW)
    nZ = length(model.vZ)
    nK = length(model.vK)
    nQ = length(model.vQ)

    zeroMatrix = SharedArray(Float64, (nW,nZ,nK,nQ,nQ,nQ), pids=workers())

    input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ, k in 1:nK]
    results = pmap(findZeroInC, input)

    for w in 1:nW
        for z in 1:nZ
            for k in 1:nK
                zeroMatrix[w,z,k,:,:,:] = results[w + nW*((z-1) + nZ*(k-1))]
            end
        end
    end

    return zeroMatrix
end

___

CODE B:

Better than these two:

function start(x::SharedArray, nW::Int64, nZ::Int64, nK::Int64, model::ModelPrivate)
    for j in myid()-1:nworkers():(nW*nZ*nK)
        inds = ind2sub((nW,nZ,nK), j)
        x[inds[1],inds[2],inds[3],:,:,:] = findZeroInC(stateFindZeroK(inds[1],inds[2],inds[3],model))
    end
    x
end

function findZeroDividendsSmart(model::ModelPrivate)
    nW = length(model.vW)
    nZ = length(model.vZ)
    nK = length(model.vK)
    nQ = length(model.vQ)

    #input = [stateFindZeroK(w,z,k,model) for w in 1:nW, z in 1:nZ, k in 1:nK];
    #results = pmap(findZeroInC,input);

    zeroMatrix = SharedArray(Float64, (nW,nZ,nK,nQ,nQ,nQ), pids=workers(), init = x->start(x,nW,nZ,nK,model))

    return zeroMatrix
end



The C function being called sits inside this wrapper; it returns the pointer 
allocated in C as double *capitalChoices = (double *)malloc(sizeof(double)*nQ*nQ*nQ);

function findZeroInC(state::stateFindZeroK)
    w = state.wealth
    z = state.z
    k = state.k
    model = state.model

    # C prototype:
    # findZeroInC(double wealth, int z, int k, double theta, double delta, double* vK,
    #             int nK, double* vQ, int nQ, double* transition, double betaGov)

    nQ = length(model.vQ)

    t = ccall((:findZeroInC, "findP.so"),
              Ptr{Float64},
              (Float64, Int64, Int64, Float64, Float64, Ptr{Float64}, Int64, Ptr{Float64}, Int64, Ptr{Float64}, Float64),
              model.vW[w], z-1, k-1, model.theta, model.delta, model.vK, length(model.vK), model.vQ, nQ, model.transition, model.betaGov)
    if t == C_NULL
        error("NULL")
    end

    return pointer_to_array(t, (nQ,nQ,nQ), true)
end
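
One note on the last line: the final true argument to pointer_to_array tells Julia 
to take ownership of the buffer, so the garbage collector is expected to call free 
on the C allocation once the returned array becomes unreachable. A minimal sketch 
of that pattern in isolation (using Base.c_malloc from Julia 0.3 in place of the 
findP.so call):

p = convert(Ptr{Float64}, Base.c_malloc(sizeof(Float64) * 10))
A = pointer_to_array(p, (10,), true)   # own = true: Julia will free(p) when A is collected
fill!(A, 0.0)                          # use it like any other Array; no explicit free needed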