Re: [julia-users] Re: eigs fails silently

2016-08-06 Thread Madeleine Udell
Oh gosh! It's terrible that we're trying to factor the second matrix. It's
quite common that the second matrix in a generalized eigenvalue problem is
not positive definite.

If we can't get this fixed soon, I'm going to have to resort to Matlab for
my current project...!

On Sat, Aug 6, 2016 at 1:53 PM, Ralph Smith <smith.ralp...@gmail.com> wrote:

> The 0.5 behavior is the same as Fortran, and is technically correct.
> ARPACK cannot create more (orthogonal) Krylov vectors than the rank of the
> matrix A,
> so your example needs the additional keyword ncv=2. This doesn't seem to
> be adequately documented in arpack-ng either. The 0.4 behavior looks
> unreliable.
>
> On Saturday, August 6, 2016 at 2:03:21 AM UTC-4, Madeleine Udell wrote:
>>
>> Actually: I'm not sure we should chalk the error up to ARPACK. Julia 0.4
>> is able to produce a (correct, I think) answer to
>>
>> eigs(A, C)
>>
>> but 0.5 gives an ARPACK error. I don't suppose ARPACK changed between
>> Julia versions...!?
>>
>> On Sat, Aug 6, 2016 at 1:54 AM, Madeleine Udell <madelei...@gmail.com>
>> wrote:
>>
>>> Andreas, thanks for the investigation. I'll use 0.5 for now, and hope
>>> the real problems I encounter are within the capabilities of ARPACK.
>>>
>>> It's embarrassing to be bested by Matlab...
>>>
>>> On Fri, Aug 5, 2016 at 9:23 PM, Andreas Noack <andreasno...@gmail.com>
>>> wrote:
>>>
>>>> I've looked a bit into this. I believe there is a bug in the Julia
>>>> wrappers on 0.4. The good news is that the bug appears to be fixed on 0.5.
>>>> The bad news is that ARPACK seems to have a hard time with the problem.
>>>> I get
>>>>
>>>> julia> eigs(A,C,nev = 1, which = :LR)[1]
>>>> ERROR: Base.LinAlg.ARPACKException("unspecified ARPACK error: -")
>>>>  in aupd_wrapper(::Type{T}, ::Base.LinAlg.#matvecA!#67{Array{Float64,2}}, ::Base.LinAlg.##64#71{Array{Float64,2}}, ::Base.LinAlg.##65#72, ::Int64, ::Bool, ::Bool, ::String, ::Int64, ::Int64, ::String, ::Float64, ::Int64, ::Int64, ::Array{Float64,1}) at ./linalg/arpack.jl:53
>>>>  in #_eigs#60(::Int64, ::Int64, ::Symbol, ::Float64, ::Int64, ::Void, ::Array{Float64,1}, ::Bool, ::Base.LinAlg.#_eigs, ::Array{Float64,2}, ::Array{Float64,2}) at ./linalg/arnoldi.jl:271
>>>>  in (::Base.LinAlg.#kw##_eigs)(::Array{Any,1}, ::Base.LinAlg.#_eigs, ::Array{Float64,2}, ::Array{Float64,2}) at ./:0
>>>>  in #eigs#54(::Array{Any,1}, ::Function, ::Array{Float64,2}, ::Array{Float64,2}) at ./linalg/arnoldi.jl:80
>>>>  in (::Base.LinAlg.#kw##eigs)(::Array{Any,1}, ::Base.LinAlg.#eigs, ::Array{Float64,2}, ::Array{Float64,2}) at ./:0
>>>>
>>>> and since SciPy ends up with the same conclusion, I conclude that the
>>>> issue is ARPACK. Matlab is doing something else, since it is able to
>>>> handle this problem.
>>>>
>>>> Given that 0.5 is almost released, I won't spend more time on the issue
>>>> on 0.4. Though, if anybody is able to figure out what is going on, please
>>>> let us know.
>>>>
>>>> On Friday, August 5, 2016 at 8:47:26 AM UTC-4, Madeleine Udell wrote:
>>>>>
>>>>> Setting `which=:LR, nev=1` does not return the generalized eigenvalue
>>>>> with the largest real part, and does not give a warning or error:
>>>>>
>>>>> n = 10
>>>>> C = eye(n)
>>>>> A = zeros(n,n)
>>>>> A[1] = 100
>>>>> A[end] = -100
>>>>> @assert eigs(A, C, nev=1, which=:LR)[1][1] == maximum(eigs(A, C)[1])
>>>>>
>>>>> Am I expected to set nev greater than the number of eigenvalues I
>>>>> truly desire, based on my intuition as a numerical analyst? Or has eigs
>>>>> broken its implicit guarantee?
>>>>>
>>>>>
>>>
>>>
>>> --
>>> Madeleine Udell
>>> Assistant Professor, Operations Research and Information Engineering
>>> Cornell University
>>> https://people.orie.cornell.edu/mru8/
>>> (415) 729-4115
>>>
>>
>>
>>
>> --
>> Madeleine Udell
>> Assistant Professor, Operations Research and Information Engineering
>> Cornell University
>> https://people.orie.cornell.edu/mru8/
>> (415) 729-4115
>>
>


-- 
Madeleine Udell
Assistant Professor, Operations Research and Information Engineering
Cornell University
https://people.orie.cornell.edu/mru8/
(415) 729-4115


Re: [julia-users] Re: eigs fails silently

2016-08-06 Thread Madeleine Udell
Actually: I'm not sure we should chalk the error up to ARPACK. Julia 0.4 is
able to produce a (correct, I think) answer to

eigs(A, C)

but 0.5 gives an ARPACK error. I don't suppose ARPACK changed between Julia
versions...!?

On Sat, Aug 6, 2016 at 1:54 AM, Madeleine Udell <madeleine.ud...@gmail.com>
wrote:

> Andreas, thanks for the investigation. I'll use 0.5 for now, and hope the
> real problems I encounter are within the capabilities of ARPACK.
>
> It's embarrassing to be bested by Matlab...
>
> On Fri, Aug 5, 2016 at 9:23 PM, Andreas Noack <
> andreasnoackjen...@gmail.com> wrote:
>
>> I've looked a bit into this. I believe there is a bug in the Julia
>> wrappers on 0.4. The good news is that the bug appears to be fixed on 0.5.
>> The bad news is that ARPACK seems to have a hard time with the problem. I get
>>
>> julia> eigs(A,C,nev = 1, which = :LR)[1]
>> ERROR: Base.LinAlg.ARPACKException("unspecified ARPACK error: -")
>>  in aupd_wrapper(::Type{T}, ::Base.LinAlg.#matvecA!#67{Array{Float64,2}}, ::Base.LinAlg.##64#71{Array{Float64,2}}, ::Base.LinAlg.##65#72, ::Int64, ::Bool, ::Bool, ::String, ::Int64, ::Int64, ::String, ::Float64, ::Int64, ::Int64, ::Array{Float64,1}) at ./linalg/arpack.jl:53
>>  in #_eigs#60(::Int64, ::Int64, ::Symbol, ::Float64, ::Int64, ::Void, ::Array{Float64,1}, ::Bool, ::Base.LinAlg.#_eigs, ::Array{Float64,2}, ::Array{Float64,2}) at ./linalg/arnoldi.jl:271
>>  in (::Base.LinAlg.#kw##_eigs)(::Array{Any,1}, ::Base.LinAlg.#_eigs, ::Array{Float64,2}, ::Array{Float64,2}) at ./:0
>>  in #eigs#54(::Array{Any,1}, ::Function, ::Array{Float64,2}, ::Array{Float64,2}) at ./linalg/arnoldi.jl:80
>>  in (::Base.LinAlg.#kw##eigs)(::Array{Any,1}, ::Base.LinAlg.#eigs, ::Array{Float64,2}, ::Array{Float64,2}) at ./:0
>>
>> and since SciPy ends up with the same conclusion, I conclude that the
>> issue is ARPACK. Matlab is doing something else, since it is able to
>> handle this problem.
>>
>> Given that 0.5 is almost released, I won't spend more time on the issue
>> on 0.4. Though, if anybody is able to figure out what is going on, please
>> let us know.
>>
>> On Friday, August 5, 2016 at 8:47:26 AM UTC-4, Madeleine Udell wrote:
>>>
>>> Setting `which=:LR, nev=1` does not return the generalized eigenvalue
>>> with the largest real part, and does not give a warning or error:
>>>
>>> n = 10
>>> C = eye(n)
>>> A = zeros(n,n)
>>> A[1] = 100
>>> A[end] = -100
>>> @assert eigs(A, C, nev=1, which=:LR)[1][1] == maximum(eigs(A, C)[1])
>>>
>>> Am I expected to set nev greater than the number of eigenvalues I truly
>>> desire, based on my intuition as a numerical analyst? Or has eigs broken
>>> its implicit guarantee?
>>>
>>>
>
>
> --
> Madeleine Udell
> Assistant Professor, Operations Research and Information Engineering
> Cornell University
> https://people.orie.cornell.edu/mru8/
> (415) 729-4115
>



-- 
Madeleine Udell
Assistant Professor, Operations Research and Information Engineering
Cornell University
https://people.orie.cornell.edu/mru8/
(415) 729-4115


Re: [julia-users] Re: eigs fails silently

2016-08-05 Thread Madeleine Udell
Andreas, thanks for the investigation. I'll use 0.5 for now, and hope the
real problems I encounter are within the capabilities of ARPACK.

It's embarrassing to be bested by Matlab...

On Fri, Aug 5, 2016 at 9:23 PM, Andreas Noack <andreasnoackjen...@gmail.com>
wrote:

> I've looked a bit into this. I believe there is a bug in the Julia
> wrappers on 0.4. The good news is that the bug appears to be fixed on 0.5.
> The bad news is that ARPACK seems to have a hard time with the problem. I get
>
> julia> eigs(A,C,nev = 1, which = :LR)[1]
> ERROR: Base.LinAlg.ARPACKException("unspecified ARPACK error: -")
>  in aupd_wrapper(::Type{T}, ::Base.LinAlg.#matvecA!#67{Array{Float64,2}}, ::Base.LinAlg.##64#71{Array{Float64,2}}, ::Base.LinAlg.##65#72, ::Int64, ::Bool, ::Bool, ::String, ::Int64, ::Int64, ::String, ::Float64, ::Int64, ::Int64, ::Array{Float64,1}) at ./linalg/arpack.jl:53
>  in #_eigs#60(::Int64, ::Int64, ::Symbol, ::Float64, ::Int64, ::Void, ::Array{Float64,1}, ::Bool, ::Base.LinAlg.#_eigs, ::Array{Float64,2}, ::Array{Float64,2}) at ./linalg/arnoldi.jl:271
>  in (::Base.LinAlg.#kw##_eigs)(::Array{Any,1}, ::Base.LinAlg.#_eigs, ::Array{Float64,2}, ::Array{Float64,2}) at ./:0
>  in #eigs#54(::Array{Any,1}, ::Function, ::Array{Float64,2}, ::Array{Float64,2}) at ./linalg/arnoldi.jl:80
>  in (::Base.LinAlg.#kw##eigs)(::Array{Any,1}, ::Base.LinAlg.#eigs, ::Array{Float64,2}, ::Array{Float64,2}) at ./:0
>
> and since SciPy ends up with the same conclusion, I conclude that the issue
> is ARPACK. Matlab is doing something else, since it is able to handle
> this problem.
>
> Given that 0.5 is almost released, I won't spend more time on the issue on
> 0.4. Though, if anybody is able to figure out what is going on, please let
> us know.
>
> On Friday, August 5, 2016 at 8:47:26 AM UTC-4, Madeleine Udell wrote:
>>
>> Setting `which=:LR, nev=1` does not return the generalized eigenvalue
>> with the largest real part, and does not give a warning or error:
>>
>> n = 10
>> C = eye(n)
>> A = zeros(n,n)
>> A[1] = 100
>> A[end] = -100
>> @assert eigs(A, C, nev=1, which=:LR)[1][1] == maximum(eigs(A, C)[1])
>>
>> Am I expected to set nev greater than the number of eigenvalues I truly
>> desire, based on my intuition as a numerical analyst? Or has eigs broken
>> its implicit guarantee?
>>
>>


-- 
Madeleine Udell
Assistant Professor, Operations Research and Information Engineering
Cornell University
https://people.orie.cornell.edu/mru8/
(415) 729-4115


[julia-users] eigs fails silently

2016-08-05 Thread Madeleine Udell
Setting `which=:LR, nev=1` does not return the generalized eigenvalue with 
the largest real part, and does not give a warning or error:

n = 10
C = eye(n)
A = zeros(n,n)
A[1] = 100
A[end] = -100
@assert eigs(A, C, nev=1, which=:LR)[1][1] == maximum(eigs(A, C)[1])

Am I expected to set nev greater than the number of eigenvalues I truly 
desire, based on my intuition as a numerical analyst? Or has eigs broken 
its implicit guarantee?
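
A minimal sketch of the workaround Ralph Smith describes in his reply above
(untested; it simply mirrors his suggestion): A here has rank 2, and ARPACK
cannot build more orthogonal Krylov vectors than the rank of A, so the
Krylov basis size must be capped explicitly with the ncv keyword:

n = 10
C = eye(n)
A = zeros(n,n)
A[1] = 100     # A[1,1]
A[end] = -100  # A[n,n]
vals, = eigs(A, C, nev=1, ncv=2, which=:LR)
vals           # expected: [100.0]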



[julia-users] Inherit from AbstractMatrix

2016-07-14 Thread Madeleine Udell
Convex.jl defines an abstract type AbstractExpr, from which all important 
concrete types in the package are descended. We've defined a bunch of 
functions that allow users to treat AbstractExprs as though they were 
matrices: indexing, hcat, vcat, multiplication, addition, subtraction, etc. 
Can we use these to automatically get more advanced functions, like (for 
example) kron?

The code for kron 
uses only indexing and multiplication operators. But I'm not sure how to 
make AbstractExpr (and its subtypes) subtype AbstractMatrix{Float64} or 
subtype AbstractArray{Float64,2} to make this transition seamless.

Simply defining

abstract AbstractExpr <: AbstractArray{Float64,2} 

results in a method error: (Variable is defined as a subtype of 
AbstractExpr)

julia> kron(A,B)
ERROR: MethodError: `kron` has no method matching kron(::Array{Float64,2}, 
::Convex.Variable)
Closest candidates are:
  kron(::Any, ::Any, ::Any, ::Any...)
  kron{T,S}(::Array{T,2}, ::Array{S,2})
  kron(::Union{Array{T,1},Array{T,2}}, ::Number)

1) Is it possible to have Convex's AbstractExpr inherit from AbstractArray? 
What other methods would we need to implement to make this work?
2) Is it a terrible idea, for reasons I haven't thought of?

Thanks!
Madeleine
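
For concreteness, here is a minimal sketch of the informal AbstractArray
interface (hypothetical ScaledEye type; Julia 0.4-era syntax): size and
getindex are the methods that generic indexing-based code needs. Note that
the MethodError above shows kron's candidates only accept Array, so a thin
forwarding method that densifies the operand may still be needed:

immutable ScaledEye <: AbstractMatrix{Float64}
    n::Int
    c::Float64
end
Base.size(S::ScaledEye) = (S.n, S.n)
Base.getindex(S::ScaledEye, i::Int, j::Int) = i == j ? S.c : 0.0

# densify via size/getindex, then use the built-in Array kron
Base.kron(A::Matrix, B::ScaledEye) =
    kron(A, Float64[B[i,j] for i=1:size(B,1), j=1:size(B,2)])

kron(eye(2), ScaledEye(3, 2.0))  # 6x6 Matrix{Float64}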


[julia-users] Re: a'*b and svds for custom operators

2016-04-22 Thread Madeleine Udell
Thanks Greg! These recommendations worked great.

Dominique, thanks for pointing out your operator infrastructure! Your lazy 
evaluation protocol is exactly what I need...

Madeleine

On Thursday, April 21, 2016 at 6:36:57 AM UTC-7, Dominique Orban wrote:
>
> I have a whole operator infrastructure here: 
> https://github.com/JuliaOptimizers/LinearOperators.jl (develop branch) 
> but I've never tried SVD. It should work with normest & co. as all that's 
> required is matvecs.
>
> On Wednesday, April 20, 2016 at 9:17:32 PM UTC-4, Madeleine Udell wrote:
>>
>> Hi, 
>>
>> I'm trying to define my own custom operator type that will allow me to 
>> implement my own * and '* operations for use inside eigenvalue or singular 
>> value routines like eigs and svds. But I'm having trouble making this work.
>>
>> Here's a simple example reimplementing matrix multiplication, with 
>> questions sprinkled as comments in the code.
>>
>> ---
>>
>> import Base: *, Ac_mul_b, size
>> # Ac_mul_b doesn't seem to live in base; where does it live?
>>
>> type MyOperator
>> A::Array{Float64}
>> end
>>
>> o = MyOperator(randn(5,4))
>>
>> *(o::MyOperator, v::AbstractVector) = (println("using custom * method"); 
>> o.A*v)
>> o*rand(4) # this works
>>
>> Ac_mul_b(o::MyOperator, v::AbstractVector) = (println("using custom '* 
>> method"); o.A'*v)
>> o'*rand(5) # this doesn't work; instead, it calls ctranspose(o)*v, which 
>> is not what I want
>>
>> size(o::MyOperator) = size(o.A)
>> size(o::MyOperator, i::Int) = size(o)[i]
>>
>> svds(o, nsv=1)
>> # this doesn't work; svds requires a `zero` method for the parametrized 
>> type of my operator. If I know the type will always be Float64, what's the 
>> easiest way to tell this to svds?
>>
>> ---
>>
>> Questions, collected:
>> 1. Ac_mul_b doesn't seem to live in base; where does it live? Or should I 
>> be extending a different function? This causes o'*v to call 
>> ctranspose(o)*v, which is not what I want. (The default ctranspose for new 
>> types seems to be the identity.)
>> 2. svds requires a `zero` method for the parametrized type of my 
>> operator. If I know the type will always be Float64, what's the easiest way 
>> to tell this to svds? Here's the eigs/svds 
>> <https://github.com/JuliaLang/julia/blob/master/base/linalg/arnoldi.jl> 
>> code, but I haven't been able to track down enough information on 
>> AbstractMatrices and parametrized types to make this work.
>> 3. Any other methods I should implement for my operator?
>>
>> Thanks!
>> Madeleine
>>
>

[julia-users] a'*b and svds for custom operators

2016-04-20 Thread Madeleine Udell
Hi, 

I'm trying to define my own custom operator type that will allow me to 
implement my own * and '* operations for use inside eigenvalue or singular 
value routines like eigs and svds. But I'm having trouble making this work.

Here's a simple example reimplementing matrix multiplication, with 
questions sprinkled as comments in the code.

---

import Base: *, Ac_mul_b, size
# Ac_mul_b doesn't seem to live in base; where does it live?

type MyOperator
A::Array{Float64}
end

o = MyOperator(randn(5,4))

*(o::MyOperator, v::AbstractVector) = (println("using custom * method"); 
o.A*v)
o*rand(4) # this works

Ac_mul_b(o::MyOperator, v::AbstractVector) = (println("using custom '* 
method"); o.A'*v)
o'*rand(5) # this doesn't work; instead, it calls ctranspose(o)*v, which is 
not what I want

size(o::MyOperator) = size(o.A)
size(o::MyOperator, i::Int) = size(o)[i]

svds(o, nsv=1)
# this doesn't work; svds requires a `zero` method for the parametrized 
type of my operator. If I know the type will always be Float64, what's the 
easiest way to tell this to svds?

---

Questions, collected:
1. Ac_mul_b doesn't seem to live in base; where does it live? Or should I 
be extending a different function? This causes o'*v to call 
ctranspose(o)*v, which is not what I want. (The default ctranspose for new 
types seems to be the identity.)
2. svds requires a `zero` method for the parametrized type of my operator. 
If I know the type will always be Float64, what's the easiest way to tell 
this to svds? Here's the eigs/svds 
<https://github.com/JuliaLang/julia/blob/master/base/linalg/arnoldi.jl> 
code, but I haven't been able to track down enough information on 
AbstractMatrices and parametrized types to make this work.
3. Any other methods I should implement for my operator?

Thanks!
Madeleine
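
For readers who land here: a hedged sketch of the 0.4-era answer, as best I
can reconstruct it. The Base generic is spelled Ac_mul_B (capital B), and
o'*v lowers to Ac_mul_B(o, v); an eltype method also helps generic routines
allocate workspaces. This addresses question 1; it is not a tested answer
to the svds question:

import Base: *, Ac_mul_B, size, eltype

type MyOperator
    A::Matrix{Float64}
end

*(o::MyOperator, v::AbstractVector) = o.A * v
Ac_mul_B(o::MyOperator, v::AbstractVector) = o.A' * v  # o'*v dispatches here
size(o::MyOperator) = size(o.A)
size(o::MyOperator, i::Int) = size(o)[i]
eltype(o::MyOperator) = Float64

o = MyOperator(randn(5,4))
o' * rand(5)  # calls the custom Ac_mul_B method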


Re: [julia-users] Memory management in Julia

2016-02-01 Thread Madeleine Udell
This is on a Mac; we've got a variety of function calls giving errors, some
with probability ~.5 and some every time we've run them.

On Mon, Feb 1, 2016 at 10:51 AM, Yichao Yu <yyc1...@gmail.com> wrote:

> On Mon, Feb 1, 2016 at 1:39 PM, Madeleine Udell
> <madeleine.ud...@gmail.com> wrote:
> > Hi all,
> >
> > I'm running into some memory management issues: in particular, a malloc
> > error that claims I am modifying an object after freeing it: see this
> > question. The error is,
> >
> > julia(9849,0x7fff705d0300) malloc: *** error for object 0x7f96a332f408:
> > incorrect checksum for freed object - object was probably modified after
> > being freed. *** set a breakpoint in malloc_error_break to debug
> >
> > I'm not sure how to debug it: what's the best way to search for code that
> > might be modifying an object after freeing it in Julia? (For example, I
> > don't know what or where malloc_error_break is.)
>
> Which platform is this and how repeatable is it?
>
> >
> > Thanks!
> > Madeleine
>



-- 
Madeleine Udell
Postdoctoral Fellow at the Center for the Mathematics of Information
California Institute of Technology
https://courses2.cit.cornell.edu/mru8
(415) 729-4115


Re: [julia-users] Memory management in Julia

2016-02-01 Thread Madeleine Udell
Fantastic! That helped me track down the problem and it's working now.
On Feb 1, 2016 11:10 AM, "Matt Bauman" <mbau...@gmail.com> wrote:

> Can you reproduce it if you run julia with the --check-bounds=yes command
> line argument?
>
> On Monday, February 1, 2016 at 1:55:25 PM UTC-5, Madeleine Udell wrote:
>>
>> This is on a Mac; we've got a variety of function calls giving errors,
>> some with probability ~.5 and some every time we've run them.
>>
>> On Mon, Feb 1, 2016 at 10:51 AM, Yichao Yu <yyc...@gmail.com> wrote:
>>
>>> On Mon, Feb 1, 2016 at 1:39 PM, Madeleine Udell
>>> <madelei...@gmail.com> wrote:
>>> > Hi all,
>>> >
>>> > I'm running into some memory management issues: in particular, a malloc
>>> > error that claims I am modifying an object after freeing it: see this
>>> > question. The error is,
>>> >
>>> > julia(9849,0x7fff705d0300) malloc: *** error for object 0x7f96a332f408:
>>> > incorrect checksum for freed object - object was probably modified
>>> after
>>> > being freed. *** set a breakpoint in malloc_error_break to debug
>>> >
>>> > I'm not sure how to debug it: what's the best way to search for code
>>> that
>>> > might be modifying an object after freeing it in Julia? (For example, I
>>> > don't know what or where malloc_error_break is.)
>>>
>>> Which platform is this and how repeatable is it?
>>>
>>> >
>>> > Thanks!
>>> > Madeleine
>>>
>>
>>
>>
>> --
>> Madeleine Udell
>> Postdoctoral Fellow at the Center for the Mathematics of Information
>> California Institute of Technology
>> https://courses2.cit.cornell.edu/mru8
>> (415) 729-4115
>>
>
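
For readers hitting similar errors, a hedged reproduction sketch of why the
--check-bounds=yes suggestion above helps (hypothetical script name): the
flag forces bounds checks even inside @inbounds blocks, so silent memory
corruption becomes an immediate BoundsError at the offending line:

# repro.jl -- run as: julia --check-bounds=yes repro.jl
function bad!(x)
    @inbounds for i = 1:length(x)+1  # off-by-one: writes past the end
        x[i] = 0.0
    end
end
bad!(zeros(3))  # BoundsError with the flag; possible corruption without it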


[julia-users] Re: malloc error?

2016-01-29 Thread Madeleine Udell
A bit more information: the error seems to be occurring in the function
`fit!` in algorithms/proxgrad.jl
<https://github.com/madeleineudell/LowRankModels.jl/blob/dataframe-ux/src/algorithms/proxgrad.jl>.
There's almost no calling of C functions, so very little chance for memory
management problems: the only possible problems I can imagine are in

1) gemm! (eg line 59
<https://github.com/madeleineudell/LowRankModels.jl/blob/dataframe-ux/src/algorithms/proxgrad.jl#L59>
)
2) accessing out-of-bounds entries via an ArrayView (eg line 84
<https://github.com/madeleineudell/LowRankModels.jl/blob/dataframe-ux/src/algorithms/proxgrad.jl#L84>
)
3) accessing a garbage-collected variable (maybe accessing an array entry
via an array view?)

I've checked sizes of matrices via a bunch of @assert statements, and the
error occurs nondeterministically, which leads me to think that something
like 3) is happening.

Any ideas how we've managed to modify an object after freeing it?
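
Along the lines of hypothesis 2), a hedged debugging sketch (hypothetical
index): ArrayViews-style views skip bounds checks, so an out-of-range index
corrupts memory instead of raising an error. Temporarily asserting the
index range before each suspect access can localize the bug:

using ArrayViews
A = rand(5,5)
v = view(A, :, 2)
i = 6  # suppose some computed index overshoots
@assert 1 <= i <= length(v) "index $i out of bounds for view of length $(length(v))"
v[i]   # only reached once the assertion passes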

On Fri, Jan 29, 2016 at 10:14 AM, Nandana Sengupta <nandana@gmail.com>
wrote:

> I'm using the Julia package LowRankModels (
> https://github.com/madeleineudell/LowRankModels.jl/tree/dataframe-ux) and
> coming across a malloc error when making a single innocuous change to my
> code.
>
>
> For instance in the following sample code, changing a single parameter
> (the rank k) from 10 to 40 makes the model go from running smoothly to
> producing a malloc error.
>
>
> Would really appreciate any pointers towards what might be going on /any
> tips to debug this error. Details and code below. Thanks!
>
>
> #
>
>
> Wording of error: "julia(9849,0x7fff705d0300) malloc: *** error for object
> 0x7f96a332f408: incorrect checksum for freed object - object was probably
> modified after being freed.
>
> *** set a breakpoint in malloc_error_break to debug"
>
>
> Julia Version: 0.4.1
>
>
> Branch of LowRankModels: dataframe-ux
>
>
> Link to Data:
> https://dl.dropboxusercontent.com/u/24399038/GSS2014cleanestCV10.csv
>
>
> Julia code that reproduces the error:
>
>
>
> ##
>
>
> using DataFrames
>
> # branch of LowRankModels found at
> https://github.com/NandanaSengupta/LowRankModels.jl/tree/dataframe-ux
>
> using LowRankModels
>
>
> ### loading data table
>
> df = readtable("GSS2014cleanestCV10.csv");
>
> # eliminate first (id) column
>
> df1 = df[:, 2:size(df)[2] ];
>
>
>
> # vector of datatypes -- 3 types of columns: real, categorical and ordinal
>
> datatypes = Array(Symbol, size(df1)[2])
>
> datatypes[1:23] = :real
>
> datatypes[24:54] = :ord
>
> datatypes[55:size(df1)[2]] = :cat
>
>
>
>
> ## run GLRM AND cross_validate with rank k = 10
> ##
>
>  Runs without any error
>
>
> glrm_10 = GLRM(df1, 10, datatypes)
>
>
> srand(10)
>
> t1, t2, t3, t4 = cross_validate(glrm_10, nfolds = 5, params = Params(),
> init = init_svd!);
>
>
>
> ###### run GLRM AND cross_validate with rank k = 40
> ##
>
>  malloc error on cross_validate
>
>
> glrm_40 = GLRM(df1, 40, datatypes) #, prob_scale = false)
>
>
> srand(40)
>
> t1, t2, t3, t4 = cross_validate(glrm_40, nfolds = 5, params = Params(),
> init = init_svd!);
>



-- 
Madeleine Udell
Postdoctoral Fellow at the Center for the Mathematics of Information
California Institute of Technology
https://courses2.cit.cornell.edu/mru8
(415) 729-4115


Re: [julia-users] Defining a function in different modules

2015-08-06 Thread Madeleine Udell
Thanks, David! Requiring StatsBase and importing StatsBase.fit! fixes the
problem.

Of course that doesn't prevent other authors from writing other packages I
don't know about with overlapping names. But I take it from previous posts
on this thread that on 0.4 `using` multiple modules that export the same
function results in an error rather than a silent overwrite; which is at
least easier to debug.

On Thu, Aug 6, 2015 at 6:28 PM, David Gold david.gol...@gmail.com wrote:

 One thing I believe you could do is actually require StatsBase for
 LowRankModels and extend StatsBase.fit! in your LowRankModels module. As
 long as there are no method signature conflicts, I think this should work
 fine (although you might end up with some new ambiguity warnings to deal
 with).

 Ideally such a generic verb as `fit!` would be defined in a package even
 more generic than StatsBase, like ModelBase or something, and both
 StatsBase and LowRankModels (as well as any other package that uses the
 verb in a similar spirit) would require ModelBase (or whatever) and extend
 `ModelBase.fit!`.


 On Thursday, August 6, 2015 at 6:23:14 PM UTC-7, David Gold wrote:

 Madeleine,

 You are experiencing these problems in 0.3, correct?

 On Thursday, August 6, 2015 at 3:46:49 PM UTC-7, Madeleine Udell wrote:

 I've been running into major user problems because of the silent
 overwriting of functions from one module by functions from another. My
 LowRankModels module interacts closely with the DataFrames module, which
 imports a bunch of stuff from StatsBase. Internally, I'm careful to use
 `import` rather than `using`; but users often want to do `using
 LowRankModels, DataFrames`. Depending on the order of the using statements,
 either the LowRankModels.fit!(g::GLRM) function is overwritten, or the
 StatsBase.fit!(obj::StatisticalModel) function is. This is *extremely*
 confusing for users...

 On the other hand, I *like* the fact that I'm reusing the name `fit!`,
 since the (semantic) sense in which LowRankModels fits a GLRM is similar to
 the sense in which StatsBase fits statistical models, and so I *don't* want
 to change the name...

 What's a good solution?

 On Thursday, April 30, 2015 at 2:26:19 PM UTC-7, Scott Jones wrote:

 Maybe because it seems that a lot of the major packages have been put
 into Base, so it isn't a problem, as MA Laforge pointed out, leading to
 Base being incredibly large,
 with stuff that means Julia's MIT license doesn't mean all that much,
 because it includes GPL code by default...

 Scott

 On Thursday, April 30, 2015 at 5:03:52 PM UTC-4, Stefan Karpinski wrote:

 On Wed, Apr 29, 2015 at 9:08 PM, Scott Jones scott.pa...@gmail.com
 wrote:

 Your restrictions are making it very hard to develop easy to use APIs
 that make sense for the people using them…

 That’s why so many people have been bringing this issue up…


 Not a single person who maintains a major Julia package has complained
 about this. Which doesn't mean that there can't possibly be an issue here,
 but it seems to strongly suggest that this is one of those concerns that
 initially appears dire, when coming from a particular programming
 background, but which dissipates once one acclimatizes to the multiple
 dispatch mindset – in particular the idea that one generic function =
 one verb concept.

 ...

 ...




-- 
Madeleine Udell
Postdoctoral Fellow at the Center for the Mathematics of Information
California Institute of Technology
www.stanford.edu/~udell
(415) 729-4115


Re: [julia-users] Defining a function in different modules

2015-08-06 Thread Madeleine Udell
I've been running into major user problems because of the silent 
overwriting of functions from one module by functions from another. My 
LowRankModels module interacts closely with the DataFrames module, which 
imports a bunch of stuff from StatsBase. Internally, I'm careful to use 
`import` rather than `using`; but users often want to do `using 
LowRankModels, DataFrames`. Depending on the order of the using statements, 
either the LowRankModels.fit!(g::GLRM) function is overwritten, or the 
StatsBase.fit!(obj::StatisticalModel) function is. This is *extremely* 
confusing for users... 

On the other hand, I *like* the fact that I'm reusing the name `fit!`, 
since the (semantic) sense in which LowRankModels fits a GLRM is similar to 
the sense in which StatsBase fits statistical models, and so I *don't* want 
to change the name...

What's a good solution?

On Thursday, April 30, 2015 at 2:26:19 PM UTC-7, Scott Jones wrote:

 Maybe because it seems that a lot of the major packages have been put into 
 Base, so it isn't a problem, as MA Laforge pointed out, leading to Base 
 being incredibly large,
 with stuff that means Julia's MIT license doesn't mean all that much, 
 because it includes GPL code by default...

 Scott

 On Thursday, April 30, 2015 at 5:03:52 PM UTC-4, Stefan Karpinski wrote:

 On Wed, Apr 29, 2015 at 9:08 PM, Scott Jones scott.pa...@gmail.com 
 wrote:

 Your restrictions are making it very hard to develop easy to use APIs that 
 make sense for the people using them…

 That’s why so many people have been bringing this issue up…


 Not a single person who maintains a major Julia package has complained 
 about this. Which doesn't mean that there can't possibly be an issue here, 
 but it seems to strongly suggest that this is one of those concerns that 
 initially appears dire, when coming from a particular programming 
 background, but which dissipates once one acclimatizes to the multiple 
 dispatch mindset – in particular the idea that one generic function = 
 one verb concept.

 ...



[julia-users] Announcing Convex.jl - Convex Optimization in Julia

2015-02-04 Thread Madeleine Udell


Convex.jl (https://github.com/JuliaOpt/Convex.jl) is a Julia library for 
mathematical programming that makes it easy to formulate and fast to solve 
nonlinear convex optimization problems. Convex.jl is a member of the 
JuliaOpt umbrella group (https://github.com/JuliaOpt) and can use (nearly) 
any solver that complies with the MathProgBase interface, including Mosek 
(https://github.com/JuliaOpt/Mosek.jl), Gurobi 
(https://github.com/JuliaOpt/gurobi.jl), ECOS 
(https://github.com/JuliaOpt/ECOS.jl), SCS 
(https://github.com/JuliaOpt/SCS.jl), and GLPK 
(https://github.com/JuliaOpt/GLPK.jl).

Here's a quick example of code that solves a non-negative least-squares 
problem.

using Convex

# Generate random problem data
m = 4;  n = 5
A = randn(m, n); b = randn(m, 1)

# Create a (column vector) variable of size n x 1.
x = Variable(n)

# The problem is to minimize ||Ax - b||^2 subject to x >= 0
problem = minimize(sum_squares(A * x - b), [x >= 0])

solve!(problem)

We could instead solve a robust approximation problem by replacing 
sum_squares(A * x - b) by sum(norm(A * x - b, 1)) or sum(huber(A * x - b)); 
it's that easy.
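
After solve!, the solution can be inspected; a short usage sketch (field
names as I recall them from Convex.jl of this era):

solve!(problem)
problem.status  # e.g. :Optimal
problem.optval  # optimal objective value
x.value         # value of the variable x at the solution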

Convex.jl (https://github.com/JuliaOpt/Convex.jl) is different from JuMP 
(https://github.com/JuliaOpt/JuMP.jl) in that it allows (and prioritizes) 
linear algebraic and functional constructions in objectives and constraints 
(like max(x,y) <= A*z). Under the hood, it converts problems to a standard 
conic form, which requires (and certifies) that the problem is convex, and 
guarantees global optimality of the resulting solution.


Re: [julia-users] Broadcasting variables

2014-12-11 Thread Madeleine Udell
Amit and Blake, thanks for all your advice. I've managed to cobble together
a shared memory version of LowRankModels.jl, using the workarounds we
devised above. In case you're interested, it's at

https://github.com/madeleineudell/LowRankModels.jl/blob/master/src/shareglrm.jl

and you can run it using eg

julia -p 3 LowRankModels/examples/sharetest.jl

There's a significant overhead, but it's faster than the serial version for
large problem sizes. Any advice for reducing the overhead would be much
appreciated.

However, in that package I'm seeing some unexpected behavior: occasionally
it seems that some of the processes have not finished their jobs at the end
of an @everywhere block, although looking at the code for @everywhere I see
it's wrapped in a @sync already. Is there something else I can use to
synchronize (ie wait for completion of all) the processes?
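
A hedged sketch of one way to wait explicitly on every worker
(do_local_work is a hypothetical stand-in for the per-worker body):
remotecall_wait blocks until the remote call finishes, and the enclosing
@sync waits on all the @async tasks that issue the calls:

@sync for p in workers()
    @async remotecall_wait(p, () -> (do_local_work(); nothing))
end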

On Tue, Dec 2, 2014 at 12:21 AM, Amit Murthy amit.mur...@gmail.com wrote:

 Issue - https://github.com/JuliaLang/julia/issues/9219

 On Tue, Dec 2, 2014 at 10:04 AM, Amit Murthy amit.mur...@gmail.com
 wrote:

 From the documentation - Modules in Julia are separate global variable
 workspaces.

 So what is happening is that the anonymous function in remotecall(i,
 x->(global X=x; nothing), localX) creates X as a module global.

 The following works:

 module ParallelStuff
 export doparallelstuff

 function doparallelstuff(m = 10, n = 20)
 # initialize variables
 localX = Base.shmem_rand(m; pids=procs())
 localY = Base.shmem_rand(n; pids=procs())
 localf = [x->i+sum(x) for i=1:m]
 localg = [x->i+sum(x) for i=1:n]

 # broadcast variables to all worker processes (thanks to Amit Murthy
 for suggesting this syntax)
 @sync begin
 for i in procs(localX)
 remotecall(i, x->(global X=x; nothing), localX)
 remotecall(i, x->(global Y=x; nothing), localY)
 remotecall(i, x->(global f=x; nothing), localf)
 remotecall(i, x->(global g=x; nothing), localg)
 end
 end

 # compute
 for iteration=1:1
 @everywhere begin
 X=ParallelStuff.X
 Y=ParallelStuff.Y
 f=ParallelStuff.f
 g=ParallelStuff.g
 for i=localindexes(X)
 X[i] = f[i](Y)
 end
 for j=localindexes(Y)
 Y[j] = g[j](X)
 end
 end
 end
 end

 end #module


 While remotecall, @everywhere, etc. run under Main, the fact that the
 closure variables refer to module ParallelStuff is pretty confusing.
 I think we need a better way to handle this.


 On Tue, Dec 2, 2014 at 4:58 AM, Madeleine Udell 
 madeleine.ud...@gmail.com wrote:

 Thanks to Blake and Amit for some excellent suggestions! Both strategies
 work fine when embedded in functions, but not when those functions are
 embedded in modules. For example, the following throws an error:

 @everywhere include("ParallelStuff.jl")
 @everywhere using ParallelStuff
 doparallelstuff()

 when ParallelStuff.jl contains the following code:

 module ParallelStuff
 export doparallelstuff

 function doparallelstuff(m = 10, n = 20)
 # initialize variables
 localX = Base.shmem_rand(m; pids=procs())
 localY = Base.shmem_rand(n; pids=procs())
 localf = [x->i+sum(x) for i=1:m]
 localg = [x->i+sum(x) for i=1:n]

 # broadcast variables to all worker processes (thanks to Amit Murthy
 for suggesting this syntax)
 @sync begin
 for i in procs(localX)
 remotecall(i, x->(global const X=x; nothing), localX)
 remotecall(i, x->(global const Y=x; nothing), localY)
 remotecall(i, x->(global const f=x; nothing), localf)
 remotecall(i, x->(global const g=x; nothing), localg)
 end
 end

 # compute
 for iteration=1:1
 @everywhere for i=localindexes(X)
 X[i] = f[i](Y)
 end
 @everywhere for j=localindexes(Y)
 Y[j] = g[j](X)
 end
 end
 end

 end #module

 On 3 processes (julia -p 3), the error is as follows:

 exception on 1: exception on 2: exception on 3: ERROR: X not defined
  in anonymous at no file
  in eval at
 /Users/vagrant/tmp/julia-packaging/osx10.7+/julia-master/base/sysimg.jl:7
  in anonymous at multi.jl:1310
  in run_work_thunk at multi.jl:621
  in run_work_thunk at multi.jl:630
  in anonymous at task.jl:6
 ERROR: X not defined
  in anonymous at no file
  in eval at
 /Users/vagrant/tmp/julia-packaging/osx10.7+/julia-master/base/sysimg.jl:7
  in anonymous at multi.jl:1310
  in anonymous at multi.jl:848
  in run_work_thunk at multi.jl:621
  in run_work_thunk at multi.jl:630
  in anonymous at task.jl:6
 ERROR: X not defined
  in anonymous at no file
  in eval at
 /Users/vagrant/tmp/julia-packaging/osx10.7+/julia-master/base/sysimg.jl:7
  in anonymous at multi.jl:1310
  in anonymous at multi.jl:848
  in run_work_thunk at multi.jl:621
  in run_work_thunk at multi.jl:630
  in anonymous at task.jl

Re: [julia-users] Broadcasting variables

2014-12-01 Thread Madeleine Udell
Thanks to Blake and Amit for some excellent suggestions! Both strategies
work fine when embedded in functions, but not when those functions are
embedded in modules. For example, the following throws an error:

@everywhere include("ParallelStuff.jl")
@everywhere using ParallelStuff
doparallelstuff()

when ParallelStuff.jl contains the following code:

module ParallelStuff
export doparallelstuff

function doparallelstuff(m = 10, n = 20)
# initialize variables
localX = Base.shmem_rand(m; pids=procs())
localY = Base.shmem_rand(n; pids=procs())
localf = [x->i+sum(x) for i=1:m]
localg = [x->i+sum(x) for i=1:n]

# broadcast variables to all worker processes (thanks to Amit Murthy
for suggesting this syntax)
@sync begin
for i in procs(localX)
remotecall(i, x->(global const X=x; nothing), localX)
remotecall(i, x->(global const Y=x; nothing), localY)
remotecall(i, x->(global const f=x; nothing), localf)
remotecall(i, x->(global const g=x; nothing), localg)
end
end

# compute
for iteration=1:1
@everywhere for i=localindexes(X)
X[i] = f[i](Y)
end
@everywhere for j=localindexes(Y)
Y[j] = g[j](X)
end
end
end

end #module

On 3 processes (julia -p 3), the error is as follows:

exception on 1: exception on 2: exception on 3: ERROR: X not defined
 in anonymous at no file
 in eval at
/Users/vagrant/tmp/julia-packaging/osx10.7+/julia-master/base/sysimg.jl:7
 in anonymous at multi.jl:1310
 in run_work_thunk at multi.jl:621
 in run_work_thunk at multi.jl:630
 in anonymous at task.jl:6
ERROR: X not defined
 in anonymous at no file
 in eval at
/Users/vagrant/tmp/julia-packaging/osx10.7+/julia-master/base/sysimg.jl:7
 in anonymous at multi.jl:1310
 in anonymous at multi.jl:848
 in run_work_thunk at multi.jl:621
 in run_work_thunk at multi.jl:630
 in anonymous at task.jl:6
ERROR: X not defined
 in anonymous at no file
 in eval at
/Users/vagrant/tmp/julia-packaging/osx10.7+/julia-master/base/sysimg.jl:7
 in anonymous at multi.jl:1310
 in anonymous at multi.jl:848
 in run_work_thunk at multi.jl:621
 in run_work_thunk at multi.jl:630
 in anonymous at task.jl:6
exception on exception on 2: 1: ERROR: Y not defined
 in anonymous at no file
 in eval at
/Users/vagrant/tmp/julia-packaging/osx10.7+/julia-master/base/sysimg.jl:7
 in anonymous at multi.jl:1310
 in anonymous at multi.jl:848
 in run_work_thunk at multi.jl:621
 in run_work_thunk at multi.jl:630
 in anonymous at task.jl:6
ERROR: Y not defined
 in anonymous at no file
 in eval at
/Users/vagrant/tmp/julia-packaging/osx10.7+/julia-master/base/sysimg.jl:7
 in anonymous at multi.jl:1310
 in run_work_thunk at multi.jl:621
 in run_work_thunk at multi.jl:630
 in anonymous at task.jl:6
exception on 3: ERROR: Y not defined
 in anonymous at no file
 in eval at
/Users/vagrant/tmp/julia-packaging/osx10.7+/julia-master/base/sysimg.jl:7
 in anonymous at multi.jl:1310
 in anonymous at multi.jl:848
 in run_work_thunk at multi.jl:621
 in run_work_thunk at multi.jl:630
 in anonymous at task.jl:6

For comparison, the non-modularized version works:

function doparallelstuff(m = 10, n = 20)
# initialize variables
localX = Base.shmem_rand(m; pids=procs())
localY = Base.shmem_rand(n; pids=procs())
localf = [x->i+sum(x) for i=1:m]
localg = [x->i+sum(x) for i=1:n]

# broadcast variables to all worker processes (thanks to Amit Murthy
for suggesting this syntax)
@sync begin
for i in procs(localX)
remotecall(i, x->(global const X=x; nothing), localX)
remotecall(i, x->(global const Y=x; nothing), localY)
remotecall(i, x->(global const f=x; nothing), localf)
remotecall(i, x->(global const g=x; nothing), localg)
end
end

# compute
for iteration=1:1
@everywhere for i=localindexes(X)
X[i] = f[i](Y)
end
@everywhere for j=localindexes(Y)
Y[j] = g[j](X)
end
end
end

doparallelstuff()

On Mon, Nov 24, 2014 at 11:24 AM, Blake Johnson blakejohnso...@gmail.com
wrote:

 I use this macro to send variables to remote processes:

 macro sendvar(proc, x)
 quote
 rr = RemoteRef()
 put!(rr, $x)
 remotecall($proc, (rr)->begin
 global $(esc(x))
 $(esc(x)) = fetch(rr)
 end, rr)
 end
 end

 Though the solution above looks a little simpler.

 --Blake

 On Sunday, November 23, 2014 1:30:49 AM UTC-5, Amit Murthy wrote:

 From the description of Base.localize_vars - 'wrap an expression in let
 a=a,b=b,... for each var it references'

 Though that does not seem to be the only(?) issue here

 On Sun, Nov 23, 2014 at 11:52 AM, Madeleine Udell madelei...@gmail.com
 wrote:

 Thanks! This is extremely helpful.

 Can you tell me more about what localize_vars does?

 On Sat, Nov 22, 2014 at 9:11 PM, Amit Murthy amit@gmail.com wrote:

 This works:

 function

Re: [julia-users] Broadcasting variables

2014-11-22 Thread Madeleine Udell
The code block I posted before works, but throws an error when embedded in
a function: ERROR: X not defined (in first line of @parallel). Why am I
getting this error when I'm *assigning to* X?

function doparallelstuff(m = 10, n = 20)
# initialize variables
localX = Base.shmem_rand(m)
localY = Base.shmem_rand(n)
localf = [x->i+sum(x) for i=1:m]
localg = [x->i+sum(x) for i=1:n]

# broadcast variables to all worker processes
@parallel for i=workers()
global X = localX
global Y = localY
global f = localf
global g = localg
end
# give variables same name on master
X,Y,f,g = localX,localY,localf,localg

# compute
for iteration=1:1
@everywhere for i=localindexes(X)
X[i] = f[i](Y)
end
@everywhere for j=localindexes(Y)
Y[j] = g[j](X)
end
end
end

doparallelstuff()

On Fri, Nov 21, 2014 at 5:13 PM, Madeleine Udell madeleine.ud...@gmail.com
wrote:

 My experiments with parallelism also occur in focused blocks; I think
 that's a sign that it's not yet as user friendly as it could be.

 Here's a solution to the problem I posed that's simple to use: @parallel +
 global can be used to broadcast a variable, while @everywhere can be used
 to do a computation on local data (ie, without resending the data). I'm not
 sure how to do the variable renaming programmatically, though.

 # initialize variables
 m,n = 10,20
 localX = Base.shmem_rand(m)
 localY = Base.shmem_rand(n)
 localf = [x->i+sum(x) for i=1:m]
 localg = [x->i+sum(x) for i=1:n]

 # broadcast variables to all worker processes
 @parallel for i=workers()
 global X = localX
 global Y = localY
 global f = localf
 global g = localg
 end
 # give variables same name on master
 X,Y,f,g = localX,localY,localf,localg

 # compute
 for iteration=1:10
 @everywhere for i=localindexes(X)
 X[i] = f[i](Y)
 end
 @everywhere for j=localindexes(Y)
 Y[j] = g[j](X)
 end
 end

 On Fri, Nov 21, 2014 at 11:14 AM, Tim Holy tim.h...@gmail.com wrote:

 My experiments with parallelism tend to occur in focused blocks, and I
 haven't
 done it in quite a while. So I doubt I can help much. But in general I
 suspect
 you're encountering these problems because much of the IPC goes through
 thunks, and so a lot of stuff gets reclaimed when execution is done.

 If I were experimenting, I'd start by trying to create RemoteRef()s and
 put!()ing my variables into them. Then perhaps you might be able to fetch them
 from other processes. Not sure that will work, but it seems to be worth a
 try.

 HTH,
 --Tim

 On Thursday, November 20, 2014 08:20:19 PM Madeleine Udell wrote:
  I'm trying to use parallelism in julia for a task with a structure that
 I
  think is quite pervasive. It looks like this:
 
  # broadcast lists of functions f and g to all processes so they're
  available everywhere
  # create shared arrays X,Y on all processes so they're available
 everywhere
  for iteration=1:1000
  @parallel for i=1:size(X)
  X[i] = f[i](Y)
  end
  @parallel for j=1:size(Y)
  Y[j] = g[j](X)
  end
  end
 
  I'm having trouble making this work, and I'm not sure where to dig
 around
  to find a solution. Here are the difficulties I've encountered:
 
  * @parallel doesn't allow me to create persistent variables on each
  process; ie, the following results in an error.
 
  s = Base.shmem_rand(12,3)
  @parallel for i=1:nprocs() m,n = size(s) end
  @parallel for i=1:nprocs() println(m) end
 
  * @everywhere does allow me to create persistent variables on each
 process,
  but doesn't send any data at all, including the variables I need in
 order
  to define new variables. Eg the following is an error: s is a shared
 array,
  but the variable (ie pointer to) s is apparently not shared.
  s = Base.shmem_rand(12,3)
  @everywhere m,n = size(s)
 
  Here are the kinds of questions I'd like to see protocode for:
  * How can I broadcast a variable so that it is available and persistent
 on
  every process?
  * How can I create a reference to the same shared array s that is
  accessible from every process?
  * How can I send a command to be performed in parallel, specifying which
  variables should be sent to the relevant processes and which should be
  looked up in the local namespace?
 
  Note that everything I ask above is not specific to shared arrays; the
 same
  constructs would also be extremely useful in the distributed case.
 
  --
 
  An interesting partial solution is the following:
  funcs! = Function[x->x[:] = x+k for k=1:3]
  d = drand(3,12)
  let funcs! = funcs!
@sync @parallel for k in 1:3
  funcs![myid()-1](localpart(d))
end
  end
 
  Here, I'm not sure why the let statement is necessary to send funcs!,
 since
  d is sent automatically.
 
  -
 
  Thanks!
  Madeleine




 --
 Madeleine Udell
 PhD Candidate in Computational and Mathematical Engineering
 Stanford University

Re: [julia-users] Broadcasting variables

2014-11-22 Thread Madeleine Udell
Thanks! This is extremely helpful.

Can you tell me more about what localize_vars does?

On Sat, Nov 22, 2014 at 9:11 PM, Amit Murthy amit.mur...@gmail.com wrote:

 This works:

 function doparallelstuff(m = 10, n = 20)
 # initialize variables
 localX = Base.shmem_rand(m; pids=procs())
 localY = Base.shmem_rand(n; pids=procs())
 localf = [x->i+sum(x) for i=1:m]
 localg = [x->i+sum(x) for i=1:n]

 # broadcast variables to all worker processes
 @sync begin
 for i in procs(localX)
 remotecall(i, x->(global X; X=x; nothing), localX)
 remotecall(i, x->(global Y; Y=x; nothing), localY)
 remotecall(i, x->(global f; f=x; nothing), localf)
 remotecall(i, x->(global g; g=x; nothing), localg)
 end
 end

 # compute
 for iteration=1:1
 @everywhere for i=localindexes(X)
 X[i] = f[i](Y)
 end
 @everywhere for j=localindexes(Y)
 Y[j] = g[j](X)
 end
 end
 end

 doparallelstuff()

 Though I would have expected broadcast of variables to be possible with
 just
 @everywhere X=localX
 and so on 


 Looks like @everywhere does not call localize_vars.  I don't know if this
 is by design or just an oversight. I would have expected it to do so. Will
 file an issue on github.



 On Sun, Nov 23, 2014 at 8:24 AM, Madeleine Udell 
 madeleine.ud...@gmail.com wrote:

 The code block I posted before works, but throws an error when embedded
 in a function: ERROR: X not defined (in first line of @parallel). Why am
 I getting this error when I'm *assigning to* X?

 function doparallelstuff(m = 10, n = 20)
 # initialize variables
 localX = Base.shmem_rand(m)
 localY = Base.shmem_rand(n)
 localf = [x->i+sum(x) for i=1:m]
 localg = [x->i+sum(x) for i=1:n]

 # broadcast variables to all worker processes
 @parallel for i=workers()
 global X = localX
 global Y = localY
 global f = localf
 global g = localg
 end
 # give variables same name on master
 X,Y,f,g = localX,localY,localf,localg

 # compute
 for iteration=1:1
 @everywhere for i=localindexes(X)
 X[i] = f[i](Y)
 end
 @everywhere for j=localindexes(Y)
 Y[j] = g[j](X)
 end
 end
 end

 doparallelstuff()

 On Fri, Nov 21, 2014 at 5:13 PM, Madeleine Udell 
 madeleine.ud...@gmail.com wrote:

 My experiments with parallelism also occur in focused blocks; I think
 that's a sign that it's not yet as user friendly as it could be.

 Here's a solution to the problem I posed that's simple to use: @parallel
 + global can be used to broadcast a variable, while @everywhere can be used
 to do a computation on local data (ie, without resending the data). I'm not
 sure how to do the variable renaming programmatically, though.

 # initialize variables
 m,n = 10,20
 localX = Base.shmem_rand(m)
 localY = Base.shmem_rand(n)
 localf = [x->i+sum(x) for i=1:m]
 localg = [x->i+sum(x) for i=1:n]

 # broadcast variables to all worker processes
 @parallel for i=workers()
 global X = localX
 global Y = localY
 global f = localf
 global g = localg
 end
 # give variables same name on master
 X,Y,f,g = localX,localY,localf,localg

 # compute
 for iteration=1:10
 @everywhere for i=localindexes(X)
 X[i] = f[i](Y)
 end
 @everywhere for j=localindexes(Y)
 Y[j] = g[j](X)
 end
 end

 On Fri, Nov 21, 2014 at 11:14 AM, Tim Holy tim.h...@gmail.com wrote:

 My experiments with parallelism tend to occur in focused blocks, and I
 haven't
 done it in quite a while. So I doubt I can help much. But in general I
 suspect
 you're encountering these problems because much of the IPC goes through
 thunks, and so a lot of stuff gets reclaimed when execution is done.

 If I were experimenting, I'd start by trying to create RemoteRef()s and
 put!()ing my variables into them. Then perhaps you might be able to fetch
 them
 from other processes. Not sure that will work, but it seems to be worth
 a try.

 HTH,
 --Tim

 On Thursday, November 20, 2014 08:20:19 PM Madeleine Udell wrote:
  I'm trying to use parallelism in julia for a task with a structure
 that I
  think is quite pervasive. It looks like this:
 
  # broadcast lists of functions f and g to all processes so they're
  available everywhere
  # create shared arrays X,Y on all processes so they're available
 everywhere
  for iteration=1:1000
  @parallel for i=1:size(X)
  X[i] = f[i](Y)
  end
  @parallel for j=1:size(Y)
  Y[j] = g[j](X)
  end
  end
 
  I'm having trouble making this work, and I'm not sure where to dig
 around
  to find a solution. Here are the difficulties I've encountered:
 
  * @parallel doesn't allow me to create persistent variables on each
  process; ie, the following results in an error.
 
  s = Base.shmem_rand(12,3)
  @parallel for i=1:nprocs() m,n = size(s) end
  @parallel for i=1:nprocs() println(m) end

Re: [julia-users] Broadcasting variables

2014-11-22 Thread Madeleine Udell
Yes, I read the code, but I'm not sure I understand what the let statement
is doing. Is it trying to redefine the scope of the variable, or to create a
new variable with the same value but over a different scope? How does the
let statement interact with the namespaces of the various processes?
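
A small self-contained sketch of the let-binding semantics in question (no
processes involved): let x = x creates a new local binding initialized from
the current value, so a closure captures that binding rather than referring
to the global by name. That is also why shipping such a closure to a worker
carries the value along instead of looking up a global that is undefined
there:

x = 1
f = let x = x    # new local binding, same value
    () -> x + 1  # the closure captures the let-bound x
end
x = 100          # rebinding the global does not affect f
f()              # still returns 2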

On Sat, Nov 22, 2014 at 10:30 PM, Amit Murthy amit.mur...@gmail.com wrote:

 From the description of Base.localize_vars - 'wrap an expression in let
 a=a,b=b,... for each var it references'

 Though that does not seem to be the only(?) issue here

 On Sun, Nov 23, 2014 at 11:52 AM, Madeleine Udell 
 madeleine.ud...@gmail.com wrote:

 Thanks! This is extremely helpful.

 Can you tell me more about what localize_vars does?

 On Sat, Nov 22, 2014 at 9:11 PM, Amit Murthy amit.mur...@gmail.com
 wrote:

 This works:

 function doparallelstuff(m = 10, n = 20)
 # initialize variables
 localX = Base.shmem_rand(m; pids=procs())
 localY = Base.shmem_rand(n; pids=procs())
 localf = [x->i+sum(x) for i=1:m]
 localg = [x->i+sum(x) for i=1:n]

 # broadcast variables to all worker processes
 @sync begin
 for i in procs(localX)
 remotecall(i, x->(global X; X=x; nothing), localX)
 remotecall(i, x->(global Y; Y=x; nothing), localY)
 remotecall(i, x->(global f; f=x; nothing), localf)
 remotecall(i, x->(global g; g=x; nothing), localg)
 end
 end

 # compute
 for iteration=1:1
 @everywhere for i=localindexes(X)
 X[i] = f[i](Y)
 end
 @everywhere for j=localindexes(Y)
 Y[j] = g[j](X)
 end
 end
 end

 doparallelstuff()

 Though I would have expected broadcast of variables to be possible with
 just
 @everywhere X=localX
 and so on 


 Looks like @everywhere does not call localize_vars.  I don't know if
 this is by design or just an oversight. I would have expected it to do so.
 Will file an issue on github.



 On Sun, Nov 23, 2014 at 8:24 AM, Madeleine Udell 
 madeleine.ud...@gmail.com wrote:

 The code block I posted before works, but throws an error when embedded
 in a function: ERROR: X not defined (in first line of @parallel). Why am
 I getting this error when I'm *assigning to* X?

 function doparallelstuff(m = 10, n = 20)
 # initialize variables
 localX = Base.shmem_rand(m)
 localY = Base.shmem_rand(n)
 localf = [x->i+sum(x) for i=1:m]
 localg = [x->i+sum(x) for i=1:n]

 # broadcast variables to all worker processes
 @parallel for i=workers()
 global X = localX
 global Y = localY
 global f = localf
 global g = localg
 end
 # give variables same name on master
 X,Y,f,g = localX,localY,localf,localg

 # compute
 for iteration=1:1
 @everywhere for i=localindexes(X)
 X[i] = f[i](Y)
 end
 @everywhere for j=localindexes(Y)
 Y[j] = g[j](X)
 end
 end
 end

 doparallelstuff()

 On Fri, Nov 21, 2014 at 5:13 PM, Madeleine Udell 
 madeleine.ud...@gmail.com wrote:

 My experiments with parallelism also occur in focused blocks; I think
 that's a sign that it's not yet as user friendly as it could be.

 Here's a solution to the problem I posed that's simple to use:
 @parallel + global can be used to broadcast a variable, while @everywhere
 can be used to do a computation on local data (ie, without resending the
 data). I'm not sure how to do the variable renaming programmatically,
 though.

 # initialize variables
 m,n = 10,20
 localX = Base.shmem_rand(m)
 localY = Base.shmem_rand(n)
 localf = [x->i+sum(x) for i=1:m]
 localg = [x->i+sum(x) for i=1:n]

 # broadcast variables to all worker processes
 @parallel for i=workers()
 global X = localX
 global Y = localY
 global f = localf
 global g = localg
 end
 # give variables same name on master
 X,Y,f,g = localX,localY,localf,localg

 # compute
 for iteration=1:10
 @everywhere for i=localindexes(X)
 X[i] = f[i](Y)
 end
 @everywhere for j=localindexes(Y)
 Y[j] = g[j](X)
 end
 end

 On Fri, Nov 21, 2014 at 11:14 AM, Tim Holy tim.h...@gmail.com wrote:

 My experiments with parallelism tend to occur in focused blocks, and
 I haven't
 done it in quite a while. So I doubt I can help much. But in general
 I suspect
 you're encountering these problems because much of the IPC goes
 through
 thunks, and so a lot of stuff gets reclaimed when execution is done.

 If I were experimenting, I'd start by trying to create RemoteRef()s
 and put!()ing my variables into them. Then perhaps you might be able to fetch
 them
 from other processes. Not sure that will work, but it seems to be
 worth a try.

 HTH,
 --Tim

 On Thursday, November 20, 2014 08:20:19 PM Madeleine Udell wrote:
  I'm trying to use parallelism in julia for a task with a structure
 that I
  think is quite pervasive. It looks like this:
 
  # broadcast lists of functions f and g to all processes so they're
  available

Re: [julia-users] Broadcasting variables

2014-11-21 Thread Madeleine Udell
My experiments with parallelism also occur in focused blocks; I think
that's a sign that it's not yet as user friendly as it could be.

Here's a solution to the problem I posed that's simple to use: @parallel +
global can be used to broadcast a variable, while @everywhere can be used
to do a computation on local data (ie, without resending the data). I'm not
sure how to do the variable renaming programmatically, though.

# initialize variables
m,n = 10,20
localX = Base.shmem_rand(m)
localY = Base.shmem_rand(n)
localf = [x->i+sum(x) for i=1:m]
localg = [x->i+sum(x) for i=1:n]

# broadcast variables to all worker processes
@parallel for i=workers()
global X = localX
global Y = localY
global f = localf
global g = localg
end
# give variables same name on master
X,Y,f,g = localX,localY,localf,localg

# compute
for iteration=1:10
@everywhere for i=localindexes(X)
X[i] = f[i](Y)
end
@everywhere for j=localindexes(Y)
Y[j] = g[j](X)
end
end

On Fri, Nov 21, 2014 at 11:14 AM, Tim Holy tim.h...@gmail.com wrote:

 My experiments with parallelism tend to occur in focused blocks, and I
 haven't
 done it in quite a while. So I doubt I can help much. But in general I
 suspect
 you're encountering these problems because much of the IPC goes through
 thunks, and so a lot of stuff gets reclaimed when execution is done.

 If I were experimenting, I'd start by trying to create RemoteRef()s and
 put!()ing my variables into them. Then perhaps you might be able to fetch them
 from other processes. Not sure that will work, but it seems to be worth a
 try.

 HTH,
 --Tim

 On Thursday, November 20, 2014 08:20:19 PM Madeleine Udell wrote:
  I'm trying to use parallelism in julia for a task with a structure that I
  think is quite pervasive. It looks like this:
 
  # broadcast lists of functions f and g to all processes so they're
  available everywhere
  # create shared arrays X,Y on all processes so they're available
 everywhere
  for iteration=1:1000
  @parallel for i=1:size(X)
  X[i] = f[i](Y)
  end
  @parallel for j=1:size(Y)
  Y[j] = g[j](X)
  end
  end
 
  I'm having trouble making this work, and I'm not sure where to dig around
  to find a solution. Here are the difficulties I've encountered:
 
  * @parallel doesn't allow me to create persistent variables on each
  process; ie, the following results in an error.
 
  s = Base.shmem_rand(12,3)
  @parallel for i=1:nprocs() m,n = size(s) end
  @parallel for i=1:nprocs() println(m) end
 
  * @everywhere does allow me to create persistent variables on each
 process,
  but doesn't send any data at all, including the variables I need in order
  to define new variables. Eg the following is an error: s is a shared
 array,
  but the variable (ie pointer to) s is apparently not shared.
  s = Base.shmem_rand(12,3)
  @everywhere m,n = size(s)
 
  Here are the kinds of questions I'd like to see protocode for:
  * How can I broadcast a variable so that it is available and persistent
 on
  every process?
  * How can I create a reference to the same shared array s that is
  accessible from every process?
  * How can I send a command to be performed in parallel, specifying which
  variables should be sent to the relevant processes and which should be
  looked up in the local namespace?
 
  Note that everything I ask above is not specific to shared arrays; the
 same
  constructs would also be extremely useful in the distributed case.
 
  --
 
  An interesting partial solution is the following:
  funcs! = Function[x->x[:] = x+k for k=1:3]
  d = drand(3,12)
  let funcs! = funcs!
@sync @parallel for k in 1:3
  funcs![myid()-1](localpart(d))
end
  end
 
  Here, I'm not sure why the let statement is necessary to send funcs!,
 since
  d is sent automatically.
 
  -
 
  Thanks!
  Madeleine




-- 
Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell


[julia-users] Broadcasting variables

2014-11-20 Thread Madeleine Udell
I'm trying to use parallelism in julia for a task with a structure that I 
think is quite pervasive. It looks like this:

# broadcast lists of functions f and g to all processes so they're 
available everywhere
# create shared arrays X,Y on all processes so they're available everywhere
for iteration=1:1000
@parallel for i=1:length(X)
X[i] = f[i](Y)
end
@parallel for j=1:length(Y)
Y[j] = g[j](X)
end
end

I'm having trouble making this work, and I'm not sure where to dig around 
to find a solution. Here are the difficulties I've encountered:

* @parallel doesn't allow me to create persistent variables on each 
process; ie, the following results in an error.

s = Base.shmem_rand(12,3)
@parallel for i=1:nprocs() m,n = size(s) end
@parallel for i=1:nprocs() println(m) end

* @everywhere does allow me to create persistent variables on each process, 
but doesn't send any data at all, including the variables I need in order 
to define new variables. Eg the following is an error: s is a shared array, 
but the variable (ie pointer to) s is apparently not shared.
s = Base.shmem_rand(12,3)
@everywhere m,n = size(s)

Here are the kinds of questions I'd like to see protocode for:
* How can I broadcast a variable so that it is available and persistent on 
every process?
* How can I create a reference to the same shared array s that is 
accessible from every process? 
* How can I send a command to be performed in parallel, specifying which 
variables should be sent to the relevant processes and which should be 
looked up in the local namespace?

Note that everything I ask above is not specific to shared arrays; the same 
constructs would also be extremely useful in the distributed case.

--

An interesting partial solution is the following:
funcs! = Function[x->x[:] = x+k for k=1:3]
d = drand(3,12)
let funcs! = funcs!
  @sync @parallel for k in 1:3
funcs![myid()-1](localpart(d))
  end
end

Here, I'm not sure why the let statement is necessary to send funcs!, since 
d is sent automatically.
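
My best guess, for the archives (unverified): the body of the @parallel
loop becomes a closure that is serialized to the workers. A local variable
introduced by let is captured inside that closure and shipped with it,
whereas the global funcs! would have to be looked up on the worker, where
it doesn't exist; d is a DArray, so its local parts already live on the
workers. The same trick in isolation:

bigthing = rand(10^6)
let bigthing = bigthing   # rebind as a local so the closure captures the value
    @sync @parallel for k in 1:3
        println(sum(bigthing))   # bigthing travels with the closure
    end
end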

-

Thanks!
Madeleine


[julia-users] Option types in julia?

2014-03-08 Thread Madeleine Udell
Hi,

I'd like to allow a function to take arguments that consist of arrays of a 
type I've defined, or empty arrays. In many languages, one can do this 
using option types; is there an analogue in julia?

Concretely, I'd like to define T so that the last typeassert here will be 
true:

julia> type MyType
           value
       end

julia> T=Union(MyType,None)
MyType (constructor with 1 method)

julia> [MyType(4)]::Array{T}
1-element Array{MyType,1}:
 MyType(4)

julia> []::Array{T}
ERROR: type: typeassert: expected Array{MyType,N}, got Array{None,1}
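
(A possible workaround, sketched in the same syntax: build the empty array
with an explicit element type, so the assert sees Array{MyType,1} rather
than Array{None,1}.)

julia> Array(MyType, 0)::Array{T}
0-element Array{MyType,1}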

Thanks!
Madeleine


Re: [julia-users] SharedArray arithmetic

2014-02-20 Thread Madeleine Udell
Thanks for the reminder! I'm still making the transition to devectorized 
thinking...


Re: [julia-users] SharedArray arithmetic

2014-02-19 Thread Madeleine Udell
Tim, thanks for the explanation! It looks like

b = Base.shmem_randn(10)
b[:] *= 2

should work without allocating new memory for normal and shared
arrays, and is slightly more concise.
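
A quick way to check that the buffer really is reused (a sketch, assuming
colon indexing works on SharedArrays here as it does on Arrays; b.s is the
field holding the backing array, as in my original message quoted below):

julia> b = Base.shmem_randn(10);

julia> p = pointer(b.s);    # address of the backing buffer

julia> b[:] *= 2;           # parses as b[:] = b[:]*2, which writes back in place

julia> pointer(b.s) == p    # same buffer, unlike b *= 2
true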

On Wed, Feb 19, 2014 at 12:11 PM, Tim Holy tim.h...@gmail.com wrote:
 Unfortunately, I'm pretty sure you can't do what you're hoping for. Consider:

 julia> A = rand(2,2)
 2x2 Array{Float64,2}:
  0.63259   0.109017
  0.667425  0.112597

 julia> pointer(A)
 Ptr{Float64} @0x0446f6a0

 julia> A *= 2
 2x2 Array{Float64,2}:
  1.26518  0.218033
  1.33485  0.225195

 julia> pointer(A)
 Ptr{Float64} @0x05696650

 I think A *= 2 gets translated by the parser into A = A*2. There are some old
 posts that do a better job than I can of explaining why this is important.

 So I think your only bet is to do something like this:

 for i = 1:length(A)
 A[i] *= 2
 end

 which does work in-place.

 --Tim

 On Wednesday, February 19, 2014 11:26:48 AM Madeleine Udell wrote:
 Currently in-place arithmetic operations turn shared arrays into normal
 ones: eg

 b = Base.shmem_randn(10)
 b *= 2

 results in b becoming a (not-shared) array. I'd like to overload these
 in-place arithmetic operations so that instead of writing

 typeof(b)==SharedArray? b.s *= 2 : b *= 2

 I can just write

 b *= 2

 and know that the shared status of my array will be preserved. How might
 one overload these operations?

 (Note: I don't want c = b + b to automatically create a shared array c,
 since that allocates new shared memory.)



-- 
Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell


Re: [julia-users] Re: Parallel sparse matrix vector multiplication

2014-02-13 Thread Madeleine Udell
Thanks for all the advice, everyone. I've just finished a parallel sparse
matrix vector multiplication library written in straight julia for shared
memory machines, using Amit Murthy's SharedArrays. You can find it at
https://github.com/madeleineudell/ParallelSparseMatMul.jl

For a matrix with 1E9 nonzeros, its performance on 6 processors is the same
as MKL, and it retains linear parallel scaling up to the highest thread
count I've tested (40 threads).

(serial julia)
julia> @time S'*x;
elapsed time: 17.63098703 seconds (8410364 bytes allocated)

(mkl, 6 threads)
julia> @time Base.LinAlg.SparseBLAS.cscmv!(transa,alpha,matdescra,A,x,beta,y);
elapsed time: 4.334168257 seconds (48 bytes allocated)

(ParallelSparseMatMul, 6 threads)
julia> @time y = A'*x;
elapsed time: 4.18557078 seconds (1858156 bytes allocated)





On Tue, Feb 11, 2014 at 1:44 PM, Madeleine Udell
madeleine.ud...@gmail.comwrote:

 Thanks, Jake. It seems like I'm still not able to go above 6 threads, but
 that may be deep in my MKL install.

 I've begun implementing a shared sparse matrix class in native Julia that
 seems to scale better. The code is attached to this email.  How can I
 overload A'*b for my SharedBilinearOperator class, so that I can call
 At_mul_B on a stored copy of A', rather than computing A' each time I want
 to multiply by it? ie can I write something like

 ('*)(L::SharedBilinearOperator,x) = At_mul_B(L.A,x)

 ?


 On Tue, Feb 11, 2014 at 8:40 AM, Jake Bolewski jakebolew...@gmail.comwrote:

 Hey Madeleine,

 First I would check that your global environment variables
 (http://software.intel.com/sites/products/documentation/hpc/mkl/mkl_userguide_lnx/GUID-0DE6A77B-00E0-4ED6-9CAE-52FCF49E5623.htm)
 are set up correctly.  If you want to set up the number of threads
 programmatically:

_   _ _(_)_ |  A fresh approach to technical computing
   (_) | (_) (_)|  Documentation: http://docs.julialang.org
_ _   _| |_  __ _   |  Type help() to list help topics
   | | | | | | |/ _` |  |
   | | |_| | | | (_| |  |  Version 0.3.0-prerelease+1470 (2014-02-08 16:23
 UTC)
  _/ |\__'_|_|_|\__'_|  |  Commit 596d5c4* (3 days old master)
 |__/   |  x86_64-unknown-linux-gnu

 julia> Base.blas_vendor()
 :mkl

 julia> Base.blas_set_num_threads(12)


 You can see the relevant code here:
 https://github.com/JuliaLang/julia/blob/8ac5a7f7fff1c54a768c7bc9ae85cf053d310f42/base/util.jl#L293
 .
 It's always worth a quick search through the code base to figure out what
 is going on behind the scenes.

 Hope that helps!
 Jake

 On Tuesday, February 11, 2014 12:34:56 AM UTC-5, Madeleine Udell wrote:

 Jake, thanks for the reference
 (http://software.intel.com/en-us/forums/topic/294954);
 I have 32 hyperthreaded cores, so there's definitely something else going
 on to limit me to 6 in addition to not exploiting hyperthreading.



 Perhaps I need to call something like omp_set_num_threads()? But there
 doesn't seem to be a function by this name in the libmkl_rt library.

  julia> ccall((:omp_set_num_threads,Base.libblas_name),Ptr{Void},(
 Uint8,),32)
 ERROR: ccall: could not find function omp_set_num_threads in library
 libmkl_rt
  in anonymous at no file


 On Mon, Feb 10, 2014 at 4:05 PM, Jake Bolewski jakebo...@gmail.comwrote:

 Are those hyper-threaded cores?  If so, check Intel MKL's
 documentation on hyper-threading.

 -Best
 Jake

 On Monday, February 10, 2014 6:38:50 PM UTC-5, Madeleine Udell wrote:

 It looks like only 6 threads are being used when I use mkl from julia.
 If I do blas_set_num_threads(n), then using top, I see julia is
 running at min(n,6)*100% cpu. Any idea why this would be, or how to
 get mkl to use more threads? I'm not sure if this is an issue in julia
 or with my installation of mkl.

 On Mon, Feb 10, 2014 at 2:09 PM, Andreas Noack Jensen
 andreasno...@gmail.com wrote:
  You are welcome. Good to hear that it worked.
 
 
  2014-02-10 22:35 GMT+01:00 Madeleine Udell madelei...@gmail.com:
 
  fantastic, thank you. I now see Base.libblas_name=libmkl_rt, and
 am able
  to run the following test successfully:
 
  transa = 'N'::Base.LinAlg.BlasChar # multiply by A, not A'
  matdescra = "GXXF" # G = general, X = ignored, F = one-based indexing
  m,n = 50,100
  A = sprand(m,n,.01)
  y = zeros(m)
  x = rand(n)
  alpha = 1.0
  beta = 1.0
 
  Base.LinAlg.SparseBLAS.cscmv!(transa,alpha,matdescra,A,x,beta,y)
  y_builtin = A*x
 
  julia> y==y_builtin
  true
 
 
  On Mon, Feb 10, 2014 at 12:08 PM, Andreas Noack Jensen
  andreasno...@gmail.com wrote:
 
  Hi Madeleine
 
  When compiling julia with MKL you'll have to do make cleanall as
 well as
  rebuild ARPACK and Suitesparse, i.e. make -C distclean-arpack and
 make -C
  distclean-suitesparse. It is also easier to create a Make.user
 where you set
  USE_MKL=1 and MKLROOT to the location of your MKL library files,
 e.g.
  /opt/intel/mkl. The arguments are explained here
 
 
  http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/GUID-C2EE93F0-B573-4538-A994

Re: [julia-users] Re: Parallel sparse matrix vector multiplication

2014-02-13 Thread Madeleine Udell
Thanks, Jiahao! It looks like you've already made a great dent in the
iterative solvers wishlist. I'm planning on using the
ParallelSparseMatMul library along with some iterative solvers
(possibly just LSQR) to implement iterative solvers for nonnegative
least squares, lasso, elastic net, etc using ADMM. It would be nice to
ensure that eg shared arrays stay shared in IterativeSolvers to ensure
it works well with parallel matrix multiplication.

On Thu, Feb 13, 2014 at 5:05 PM, Jiahao Chen jia...@mit.edu wrote:
 Fantastic work!

 I've been meaning to get back to work on IterativeSolvers...

 Thanks,

 Jiahao Chen, PhD
 Staff Research Scientist
 MIT Computer Science and Artificial Intelligence Laboratory



-- 
Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell


Re: [julia-users] Re: Parallel sparse matrix vector multiplication

2014-02-11 Thread Madeleine Udell
Thanks, Jake. It seems like I'm still not able to go above 6 threads, but
that may be deep in my MKL install.

I've begun implementing a shared sparse matrix class in native Julia that
seems to scale better. The code is attached to this email.  How can I
overload A'*b for my SharedBilinearOperator class, so that I can call
At_mul_B on a stored copy of A', rather than computing A' each time I want
to multiply by it? ie can I write something like

('*)(L::SharedBilinearOperator,x) = At_mul_B(L.A,x)

?
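
For the archives, my understanding is that in this era A'*x lowers to
Ac_mul_B(A, x), so extending that generic (and At_mul_B) should do it. A
sketch only; the type and field names here are hypothetical stand-ins for
the attached code:

import Base.At_mul_B
import Base.Ac_mul_B

type SharedBilinearOp   # hypothetical stand-in for SharedBilinearOperator
    A                   # the shared sparse matrix
    AT                  # a copy of A', computed once at construction
end

At_mul_B(L::SharedBilinearOp, x) = L.AT * x   # A.'x via the stored transpose
Ac_mul_B(L::SharedBilinearOp, x) = L.AT * x   # A'x, the same thing for real data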


On Tue, Feb 11, 2014 at 8:40 AM, Jake Bolewski jakebolew...@gmail.comwrote:

 Hey Madeleine,

 First I would check that your global environment variables
 (http://software.intel.com/sites/products/documentation/hpc/mkl/mkl_userguide_lnx/GUID-0DE6A77B-00E0-4ED6-9CAE-52FCF49E5623.htm)
 are set up correctly.  If you want to set up the number of threads
 programmatically:

_   _ _(_)_ |  A fresh approach to technical computing
   (_) | (_) (_)|  Documentation: http://docs.julialang.org
_ _   _| |_  __ _   |  Type help() to list help topics
   | | | | | | |/ _` |  |
   | | |_| | | | (_| |  |  Version 0.3.0-prerelease+1470 (2014-02-08 16:23
 UTC)
  _/ |\__'_|_|_|\__'_|  |  Commit 596d5c4* (3 days old master)
 |__/   |  x86_64-unknown-linux-gnu

 julia> Base.blas_vendor()
 :mkl

 julia> Base.blas_set_num_threads(12)


 You can see the relevant code here:
 https://github.com/JuliaLang/julia/blob/8ac5a7f7fff1c54a768c7bc9ae85cf053d310f42/base/util.jl#L293
 .
 It's always worth a quick search through the code base to figure out what
 is going on behind the scenes.

 Hope that helps!
 Jake

 On Tuesday, February 11, 2014 12:34:56 AM UTC-5, Madeleine Udell wrote:

 Jake, thanks for the reference
 (http://software.intel.com/en-us/forums/topic/294954);
 I have 32 hyperthreaded cores, so there's definitely something else going
 on to limit me to 6 in addition to not exploiting hyperthreading.



 Perhaps I need to call something like omp_set_num_threads()? But there
 doesn't seem to be a function by this name in the libmkl_rt library.

 julia> ccall((:omp_set_num_threads,Base.libblas_name),Ptr{Void},(
 Uint8,),32)
 ERROR: ccall: could not find function omp_set_num_threads in library
 libmkl_rt
  in anonymous at no file


 On Mon, Feb 10, 2014 at 4:05 PM, Jake Bolewski jakebo...@gmail.comwrote:

 Are those hyper-threaded cores?  If so, check Intel MKL's
 documentation on hyper-threading.

 -Best
 Jake

 On Monday, February 10, 2014 6:38:50 PM UTC-5, Madeleine Udell wrote:

 It looks like only 6 threads are being used when I use mkl from julia.
 If I do blas_set_num_threads(n), then using top, I see julia is
 running at min(n,6)*100% cpu. Any idea why this would be, or how to
 get mkl to use more threads? I'm not sure if this is an issue in julia
 or with my installation of mkl.

 On Mon, Feb 10, 2014 at 2:09 PM, Andreas Noack Jensen
 andreasno...@gmail.com wrote:
  You are welcome. Good to hear that it worked.
 
 
  2014-02-10 22:35 GMT+01:00 Madeleine Udell madelei...@gmail.com:
 
  fantastic, thank you. I now see Base.libblas_name=libmkl_rt, and
 am able
  to run the following test successfully:
 
  transa = 'N'::Base.LinAlg.BlasChar # multiply by A, not A'
  matdescra = "GXXF" # G = general, X = ignored, F = one-based indexing
  m,n = 50,100
  A = sprand(m,n,.01)
  y = zeros(m)
  x = rand(n)
  alpha = 1.0
  beta = 1.0
 
  Base.LinAlg.SparseBLAS.cscmv!(transa,alpha,matdescra,A,x,beta,y)
  y_builtin = A*x
 
  julia> y==y_builtin
  true
 
 
  On Mon, Feb 10, 2014 at 12:08 PM, Andreas Noack Jensen
  andreasno...@gmail.com wrote:
 
  Hi Madeleine
 
  When compiling julia with MKL you'll have to do make cleanall as
 well as
  rebuild ARPACK and Suitesparse, i.e. make -C distclean-arpack and
 make -C
  distclean-suitesparse. It is also easier to create a Make.user
 where you set
  USE_MKL=1 and MKLROOT to the location of your MKL library files,
 e.g.
  /opt/intel/mkl. The arguments are explained here
 
 
  http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/GUID-C2EE93F0-B573-4538-A994-202CB3ADFFC2.htm#GUID-C2EE93F0-B573-4538-A994-202CB3ADFFC2
 
  but transa determines if the operation is Ax or A'x and matdescra
 has
  information about the structure of the matrix, e.g. if it is
 triangular.
  When you have succeeded in compiling julia with MKL, the libblas
 variable
  should just be Base.libblas_name.
 
 
  2014-02-10 20:37 GMT+01:00 Madeleine Udell madelei...@gmail.com:
 
  I'm having trouble using MKL in julia. I changed Make.inc so that
  USE_MKL = 1,
  but when I make and run julia, I find that Base.libblas_name =
  libopenblas. Is this expected? I would have thought it would be
 eg
  libmkl_core.
 
  Andreas, I found your wrappers for MKL in this PR. I've not used
 MKL
  before, so I don't understand the call signature of those
 functions in order
  to call MKL directly. Any chance you could explain what are
 transa::BlasChar
  and matdescra::ASCIIString in the following

Re: [julia-users] Re: Parallel sparse matrix vector multiplication

2014-02-10 Thread Madeleine Udell
I'm having trouble using MKL in julia. I changed Make.inc so that USE_MKL =
1,
but when I make and run julia, I find that Base.libblas_name =
libopenblas. Is this expected? I would have thought it would be eg
libmkl_core.

Andreas, I found your wrappers for MKL in this PR
(https://github.com/JuliaLang/julia/pull/4525/files).
I've not used MKL before, so I don't understand the call signature of those
functions in order to call MKL directly. Any chance you could explain what
are transa::BlasChar and matdescra::ASCIIString in the following function,
and which mkl library is expected in the libblas variable? I see many .so
files in the lib/intel64 directory of my mkl installation, and I'm not sure
which one to point julia to.

function cscmv!(transa::BlasChar, α::$T, matdescra::ASCIIString,
                A::SparseMatrixCSC{$T, BlasInt}, x::StridedVector{$T}, β::$T,
                y::StridedVector{$T})
    length(x) == A.n || throw(DimensionMismatch("Matrix with $(A.n) columns multiplied with vector of length $(length(x))"))
    length(y) == A.m || throw(DimensionMismatch("Vector of length $(A.m) added to vector of length $(length(y))"))
    ccall(($(string(mv)), libblas), Void,
          (Ptr{Uint8}, Ptr{BlasInt}, Ptr{BlasInt}, Ptr{$T},
           Ptr{Uint8}, Ptr{$T}, Ptr{BlasInt}, Ptr{BlasInt},
           Ptr{BlasInt}, Ptr{$T}, Ptr{$T}, Ptr{$T}),
          transa, A.m, A.n, α,
          matdescra, A.nzval, A.rowval, pointer(A.colptr, 1),
          pointer(A.colptr, 2), x, β, y)
    return y
end

Thanks for your help!
Madeleine


On Wed, Feb 5, 2014 at 1:49 PM, Andreas Noack Jensen 
andreasnoackjen...@gmail.com wrote:

 A*b will not call MKL when A is sparse. There has been some writing about
 making a MKL package that overwrites A_mul_B(Matrix,Vector) with the MKL
 versions and I actually wrote wrappers for the sparse MKL subroutines in
 the fall for the same reason.


 2014-02-05 Madeleine Udell madeleine.ud...@gmail.com:

 Miles, you're right that writing sparse matrix vector products in native
 Julia probably won't be the best idea given Julia's model of parallelism.
 That's why I'm interested in calling an outside library like PETSc.

 I see it's possible to link Julia with MKL. I haven't tried this yet, but
 if I do, will A*b (where A is sparse) call MKL to perform the matrix vector
 product?


 On Wed, Feb 5, 2014 at 11:43 AM, Miles Lubin miles.lu...@gmail.comwrote:

 Memory access is typically a significant bottleneck in sparse mat-vec,
 so unfortunately I'm skeptical that one could achieve good performance
 using Julia's current distributed memory approach on a multicore machine.
 This really calls for something like OpenMP.


 On Wednesday, February 5, 2014 11:42:00 AM UTC-5, Madeleine Udell wrote:

 I'm developing an iterative optimization algorithm in Julia along the
 lines of other contributions to the Iterative Solvers project
 (https://github.com/JuliaLang/IterativeSolvers.jl) or the Krylov Subspace
 module (https://github.com/JuliaLang/IterativeSolvers.jl/blob/master/src/krylov.jl),
 whose only computationally intensive step is computing A*b or A'*b. I would like
 to parallelize the method by using a parallel sparse matrix vector
 multiply. Is there a standard backend matrix-vector multiply that's
 recommended in Julia if I'm targeting a shared memory computer with a large
 number of processors? Similarly, is there a recommended backend for
 targeting a cluster? My matrices can easily reach 10 million rows by 1
 million columns, with sparsity anywhere from .01% to problems that are
 nearly diagonal.

 I've seen many posts (https://github.com/JuliaLang/julia/issues/2645)
 talking about integrating PETSc as a backend for this purpose, but it looks
 like the project (https://github.com/petsc/petsc/blob/master/bin/julia/PETSc.jl)
 has stalled - the last commits I see are a year ago. I'm also interested in
 other backends, eg Spark (http://spark.incubator.apache.org/), SciDB
 (http://scidb.org/), etc.

 I'm more interested in solving sparse problems, but as a side note, the
 built-in BLAS acceleration by changing the number of threads 
 `blas_set_num_threads`
 works ok for dense problems using a moderate number of processors. I wonder
 why the number of threads isn't set higher than one by default, for
 example, using as many as nprocs() cores?




 --
 Madeleine Udell
 PhD Candidate in Computational and Mathematical Engineering
 Stanford University
 www.stanford.edu/~udell




 --
 Med venlig hilsen

 Andreas Noack Jensen




-- 
Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell


Re: [julia-users] Re: Parallel sparse matrix vector multiplication

2014-02-10 Thread Madeleine Udell
Thanks, Andreas. I've tried to follow your directions, but am still getting
Base.libblas_name = libopenblas. Details follow:

I've created a file Make.mkl and replaced Make.inc with Make.mkl in the
Makefile. In Make.inc, I've set

USE_MKL=1

and I've set the MKLROOT for my install location,
export MKLROOT=/opt/intel/mkl

I run

make cleanall

with no problems, but when I try

make -C distclean-arpack
make -C distclean-suitesparse

I get the error `make: *** distclean-arpack: No such file or
directory.  Stop.` Instead, I've run

make -C deps distclean-arpack
make -C deps distclean-suitesparse

(I'm not sure if that's doing what was intended). Now I recompile julia
(very quickly!)

make -j 60

but still see

julia> Base.libblas_name
libopenblas

Any ideas?
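
For reference, my reading of the Make.user route Andreas describes is a
two-line file at the top of the julia source tree (the paths are
installation-specific):

USE_MKL = 1
MKLROOT = /opt/intel/mkl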

Thanks!
Madeleine


On Mon, Feb 10, 2014 at 12:08 PM, Andreas Noack Jensen 
andreasnoackjen...@gmail.com wrote:

 Hi Madeleine

 When compiling julia with MKL you'll have to do make cleanall as well as
 rebuild ARPACK and Suitesparse, i.e. make -C distclean-arpack and make -C
 distclean-suitesparse. It is also easier to create a Make.user where you
 set USE_MKL=1 and MKLROOT to the location of your MKL library files, e.g.
 /opt/intel/mkl. The arguments are explained here


 http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/GUID-C2EE93F0-B573-4538-A994-202CB3ADFFC2.htm#GUID-C2EE93F0-B573-4538-A994-202CB3ADFFC2

 but transa determines if the operation is Ax or A'x and matdescra has
 information about the structure of the matrix, e.g. if it is triangular.
 When you have succeeded in compiling julia with MKL, the libblas variable
 should just be Base.libblas_name.


 2014-02-10 20:37 GMT+01:00 Madeleine Udell madeleine.ud...@gmail.com:

 I'm having trouble using MKL in julia. I changed Make.inc so that USE_MKL
 = 1,
 but when I make and run julia, I find that Base.libblas_name =
 libopenblas. Is this expected? I would have thought it would be eg
 libmkl_core.

 Andreas, I found your wrappers for MKL in this PR
 (https://github.com/JuliaLang/julia/pull/4525/files).
 I've not used MKL before, so I don't understand the call signature of those
 functions in order to call MKL directly. Any chance you could explain what
 are transa::BlasChar and matdescra::ASCIIString in the following function,
 and which mkl library is expected in the libblas variable? I see many .so
 files in the lib/intel64 directory of my mkl installation, and I'm not sure
 which one to point julia to.

 function cscmv!(transa::BlasChar, α::$T, matdescra::ASCIIString,
                 A::SparseMatrixCSC{$T, BlasInt}, x::StridedVector{$T}, β::$T,
                 y::StridedVector{$T})
     length(x) == A.n || throw(DimensionMismatch("Matrix with $(A.n) columns multiplied with vector of length $(length(x))"))
     length(y) == A.m || throw(DimensionMismatch("Vector of length $(A.m) added to vector of length $(length(y))"))
     ccall(($(string(mv)), libblas), Void,
           (Ptr{Uint8}, Ptr{BlasInt}, Ptr{BlasInt}, Ptr{$T},
            Ptr{Uint8}, Ptr{$T}, Ptr{BlasInt}, Ptr{BlasInt},
            Ptr{BlasInt}, Ptr{$T}, Ptr{$T}, Ptr{$T}),
           transa, A.m, A.n, α,
           matdescra, A.nzval, A.rowval, pointer(A.colptr, 1),
           pointer(A.colptr, 2), x, β, y)
     return y
 end

 Thanks for your help!
 Madeleine


 On Wed, Feb 5, 2014 at 1:49 PM, Andreas Noack Jensen 
 andreasnoackjen...@gmail.com wrote:

 A*b will not call MKL when A is sparse. There has been some writing
 about making a MKL package that overwrites A_mul_B(Matrix,Vector) with the
 MKL versions and I actually wrote wrappers for the sparse MKL subroutines
 in the fall for the same reason.


 2014-02-05 Madeleine Udell madeleine.ud...@gmail.com:

 Miles, you're right that writing sparse matrix vector products in native
 Julia probably won't be the best idea given Julia's model of parallelism.
 That's why I'm interested in calling an outside library like PETSc.

 I see it's possible to link Julia with MKL. I haven't tried this yet,
 but if I do, will A*b (where A is sparse) call MKL to perform the matrix
 vector product?


 On Wed, Feb 5, 2014 at 11:43 AM, Miles Lubin miles.lu...@gmail.comwrote:

 Memory access is typically a significant bottleneck in sparse mat-vec,
 so unfortunately I'm skeptical that one could achieve good performance
 using Julia's current distributed memory approach on a multicore machine.
 This really calls for something like OpenMP.


 On Wednesday, February 5, 2014 11:42:00 AM UTC-5, Madeleine Udell
 wrote:

 I'm developing an iterative optimization algorithm in Julia along the
 lines of other contributions to the Iterative Solvers project
 (https://github.com/JuliaLang/IterativeSolvers.jl) or the Krylov Subspace
 module (https://github.com/JuliaLang/IterativeSolvers.jl/blob/master/src/krylov.jl),
 whose only computationally intensive step is computing A*b or A'*b. I would
 like
 to parallelize the method by using a parallel sparse matrix vector
 multiply. Is there a standard backend matrix-vector multiply that's
 recommended in Julia if I'm targeting a shared memory computer

Re: [julia-users] Re: Parallel sparse matrix vector multiplication

2014-02-10 Thread Madeleine Udell
fantastic, thank you. I now see Base.libblas_name=libmkl_rt, and am able
to run the following test successfully:

transa = 'N'::Base.LinAlg.BlasChar # multiply by A, not A'
matdescra = "GXXF" # G = general, X = ignored, F = one-based indexing
m,n = 50,100
A = sprand(m,n,.01)
y = zeros(m)
x = rand(n)
alpha = 1.0
beta = 1.0

Base.LinAlg.SparseBLAS.cscmv!(transa,alpha,matdescra,A,x,beta,y)
y_builtin = A*x

julia> y==y_builtin
true


On Mon, Feb 10, 2014 at 12:08 PM, Andreas Noack Jensen 
andreasnoackjen...@gmail.com wrote:

 Hi Madeleine

 When compiling julia with MKL you'll have to do make cleanall as well as
 rebuild ARPACK and Suitesparse, i.e. make -C distclean-arpack and make -C
 distclean-suitesparse. It is also easier to create a Make.user where you
 set USE_MKL=1 and MKLROOT to the location of your MKL library files, e.g.
 /opt/intel/mkl. The arguments are explained here


 http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/GUID-C2EE93F0-B573-4538-A994-202CB3ADFFC2.htm#GUID-C2EE93F0-B573-4538-A994-202CB3ADFFC2

 but transa determines if the operation is Ax or A'x and matdescra has
 information about the structure of the matrix, e.g. if it is triangular.
 When you have succeeded in compiling julia with MKL, the libblas variable
 should just be Base.libblas_name.


 2014-02-10 20:37 GMT+01:00 Madeleine Udell madeleine.ud...@gmail.com:

 I'm having trouble using MKL in julia. I changed Make.inc so that USE_MKL
 = 1,
 but when I make and run julia, I find that Base.libblas_name =
 libopenblas. Is this expected? I would have thought it would be eg
 libmkl_core.

 Andreas, I found your wrappers for MKL in this PR
 (https://github.com/JuliaLang/julia/pull/4525/files).
 I've not used MKL before, so I don't understand the call signature of those
 functions in order to call MKL directly. Any chance you could explain what
 are transa::BlasChar and matdescra::ASCIIString in the following function,
 and which mkl library is expected in the libblas variable? I see many .so
 files in the lib/intel64 directory of my mkl installation, and I'm not sure
 which one to point julia to.

 function cscmv!(transa::BlasChar, α::$T, matdescra::ASCIIString,
                 A::SparseMatrixCSC{$T, BlasInt}, x::StridedVector{$T}, β::$T,
                 y::StridedVector{$T})
     length(x) == A.n || throw(DimensionMismatch("Matrix with $(A.n) columns multiplied with vector of length $(length(x))"))
     length(y) == A.m || throw(DimensionMismatch("Vector of length $(A.m) added to vector of length $(length(y))"))
     ccall(($(string(mv)), libblas), Void,
           (Ptr{Uint8}, Ptr{BlasInt}, Ptr{BlasInt}, Ptr{$T},
            Ptr{Uint8}, Ptr{$T}, Ptr{BlasInt}, Ptr{BlasInt},
            Ptr{BlasInt}, Ptr{$T}, Ptr{$T}, Ptr{$T}),
           transa, A.m, A.n, α,
           matdescra, A.nzval, A.rowval, pointer(A.colptr, 1),
           pointer(A.colptr, 2), x, β, y)
     return y
 end

 Thanks for your help!
 Madeleine


 On Wed, Feb 5, 2014 at 1:49 PM, Andreas Noack Jensen 
 andreasnoackjen...@gmail.com wrote:

 A*b will not call MKL when A is sparse. There has been some writing
 about making a MKL package that overwrites A_mul_B(Matrix,Vector) with the
 MKL versions and I actually wrote wrappers for the sparse MKL subroutines
 in the fall for the same reason.


 2014-02-05 Madeleine Udell madeleine.ud...@gmail.com:

 Miles, you're right that writing sparse matrix vector products in native
 Julia probably won't be the best idea given Julia's model of parallelism.
 That's why I'm interested in calling an outside library like PETSc.

 I see it's possible to link Julia with MKL. I haven't tried this yet,
 but if I do, will A*b (where A is sparse) call MKL to perform the matrix
 vector product?


 On Wed, Feb 5, 2014 at 11:43 AM, Miles Lubin miles.lu...@gmail.comwrote:

 Memory access is typically a significant bottleneck in sparse mat-vec,
 so unfortunately I'm skeptical that one could achieve good performance
 using Julia's current distributed memory approach on a multicore machine.
 This really calls for something like OpenMP.


 On Wednesday, February 5, 2014 11:42:00 AM UTC-5, Madeleine Udell
 wrote:

 I'm developing an iterative optimization algorithm in Julia along the
 lines of other contributions to the Iterative Solvers project
 (https://github.com/JuliaLang/IterativeSolvers.jl) or the Krylov Subspace
 module (https://github.com/JuliaLang/IterativeSolvers.jl/blob/master/src/krylov.jl),
 whose only computationally intensive step is computing A*b or A'*b. I would
 like
 to parallelize the method by using a parallel sparse matrix vector
 multiply. Is there a standard backend matrix-vector multiply that's
 recommended in Julia if I'm targeting a shared memory computer with a 
 large
 number of processors? Similarly, is there a recommended backend for
 targeting a cluster? My matrices can easily reach 10 million rows by 1
 million columns, with sparsity anywhere from .01% to problems that are
 nearly diagonal.

 I've seen many posts (https://github.com/JuliaLang/julia/issues/2645)
 talking about integrating PETSc as a backend

Re: [julia-users] Re: Parallel sparse matrix vector multiplication

2014-02-10 Thread Madeleine Udell
It looks like only 6 threads are being used when I use mkl from julia.
If I do blas_set_num_threads(n), then using top, I see julia is
running at min(n,6)*100% cpu. Any idea why this would be, or how to
get mkl to use more threads? I'm not sure if this is an issue in julia
or with my installation of mkl.

On Mon, Feb 10, 2014 at 2:09 PM, Andreas Noack Jensen
andreasnoackjen...@gmail.com wrote:
 You are welcome. Good to hear that it worked.


 2014-02-10 22:35 GMT+01:00 Madeleine Udell madeleine.ud...@gmail.com:

 fantastic, thank you. I now see Base.libblas_name=libmkl_rt, and am able
 to run the following test successfully:

 transa = 'N'::Base.LinAlg.BlasChar # multiply by A, not A'
 matdescra = "GXXF" # G = general, X = ignored, F = one-based indexing
 m,n = 50,100
 A = sprand(m,n,.01)
 y = zeros(m)
 x = rand(n)
 alpha = 1.0
 beta = 1.0

 Base.LinAlg.SparseBLAS.cscmv!(transa,alpha,matdescra,A,x,beta,y)
 y_builtin = A*x

 julia> y==y_builtin
 true


 On Mon, Feb 10, 2014 at 12:08 PM, Andreas Noack Jensen
 andreasnoackjen...@gmail.com wrote:

 Hi Madeleine

 When compiling julia with MKL you'll have to do make cleanall as well as
 rebuild ARPACK and Suitesparse, i.e. make -C distclean-arpack and make -C
 distclean-suitesparse. It is also easier to create a Make.user where you set
 USE_MKL=1 and MKLROOT to the location of your MKL library files, e.g.
 /opt/intel/mkl. The arguments are explained here


 http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/GUID-C2EE93F0-B573-4538-A994-202CB3ADFFC2.htm#GUID-C2EE93F0-B573-4538-A994-202CB3ADFFC2

 but transa determines if the operation is Ax or A'x and matdescra has
 information about the structure of the matrix, e.g. if it is triangular.
 When you have succeeded in compiling julia with MKL, the libblas variable
 should just be Base.libblas_name.


 2014-02-10 20:37 GMT+01:00 Madeleine Udell madeleine.ud...@gmail.com:

 I'm having trouble using MKL in julia. I changed Make.inc so that
 USE_MKL = 1,
 but when I make and run julia, I find that Base.libblas_name =
 libopenblas. Is this expected? I would have thought it would be eg
 libmkl_core.

 Andreas, I found your wrappers for MKL in this PR. I've not used MKL
 before, so I don't understand the call signature of those functions in 
 order
 to call MKL directly. Any chance you could explain what are 
 transa::BlasChar
 and matdescra::ASCIIString in the following function, and which mkl library
 is expected in the libblas variable? I see many .so files in the 
 lib/intel64
 directory of my mkl installation, and I'm not sure which one to point julia
 to.

 function cscmv!(transa::BlasChar, α::$T, matdescra::ASCIIString,
                 A::SparseMatrixCSC{$T, BlasInt}, x::StridedVector{$T}, β::$T,
                 y::StridedVector{$T})
     length(x) == A.n || throw(DimensionMismatch("Matrix with $(A.n) columns multiplied with vector of length $(length(x))"))
     length(y) == A.m || throw(DimensionMismatch("Vector of length $(A.m) added to vector of length $(length(y))"))
     ccall(($(string(mv)), libblas), Void,
           (Ptr{Uint8}, Ptr{BlasInt}, Ptr{BlasInt}, Ptr{$T},
            Ptr{Uint8}, Ptr{$T}, Ptr{BlasInt}, Ptr{BlasInt},
            Ptr{BlasInt}, Ptr{$T}, Ptr{$T}, Ptr{$T}),
           transa, A.m, A.n, α,
           matdescra, A.nzval, A.rowval, pointer(A.colptr, 1),
           pointer(A.colptr, 2), x, β, y)
     return y
 end

 Thanks for your help!
 Madeleine


 On Wed, Feb 5, 2014 at 1:49 PM, Andreas Noack Jensen
 andreasnoackjen...@gmail.com wrote:

 A*b will not call MKL when A is sparse. There has been some writing
 about making a MKL package that overwrites A_mul_B(Matrix,Vector) with the
 MKL versions and I actually wrote wrappers for the sparse MKL subroutines 
 in
 the fall for the same reason.


 2014-02-05 Madeleine Udell madeleine.ud...@gmail.com:

 Miles, you're right that writing sparse matrix vector products in
 native Julia probably won't be the best idea given Julia's model of
 parallelism. That's why I'm interested in calling an outside library like
 PETSc.

 I see it's possible to link Julia with MKL. I haven't tried this yet,
 but if I do, will A*b (where A is sparse) call MKL to perform the matrix
 vector product?


 On Wed, Feb 5, 2014 at 11:43 AM, Miles Lubin miles.lu...@gmail.com
 wrote:

 Memory access is typically a significant bottleneck in sparse
 mat-vec, so unfortunately I'm skeptical that one could achieve good
 performance using Julia's current distributed memory approach on a 
 multicore
 machine. This really calls for something like OpenMP.


 On Wednesday, February 5, 2014 11:42:00 AM UTC-5, Madeleine Udell
 wrote:

 I'm developing an iterative optimization algorithm in Julia along
 the lines of other contributions to the Iterative Solvers project or 
 Krylov
 Subspace module whose only computationally intensive step is computing 
 A*b
 or A'*b. I would like to parallelize the method by using a parallel 
 sparse
 matrix vector multiply. Is there a standard backend matrix-vector 
 multiply
 that's recommended in Julia if I'm targeting a shared memory

[julia-users] Parallel sparse matrix vector multiplication

2014-02-05 Thread Madeleine Udell
I'm developing an iterative optimization algorithm in Julia along the lines 
of other contributions to the Iterative Solvers project 
(https://github.com/JuliaLang/IterativeSolvers.jl) or the Krylov Subspace 
module (https://github.com/JuliaLang/IterativeSolvers.jl/blob/master/src/krylov.jl), 
whose only computationally intensive step is computing A*b or A'*b. I would like 
to parallelize the method by using a parallel sparse matrix vector 
multiply. Is there a standard backend matrix-vector multiply that's 
recommended in Julia if I'm targeting a shared memory computer with a large 
number of processors? Similarly, is there a recommended backend for 
targeting a cluster? My matrices can easily reach 10 million rows by 1 
million columns, with sparsity anywhere from .01% to problems that are 
nearly diagonal.

I've seen many posts (https://github.com/JuliaLang/julia/issues/2645) talking 
about integrating PETSc as a backend for this purpose, but it looks like 
the project (https://github.com/petsc/petsc/blob/master/bin/julia/PETSc.jl) has 
stalled - the last commits I see are a year ago. I'm also interested in 
other backends, eg Spark (http://spark.incubator.apache.org/), SciDB 
(http://scidb.org/), etc. 

I'm more interested in solving sparse problems, but as a side note, the 
built-in BLAS acceleration (changing the number of threads with 
`blas_set_num_threads`) works ok for dense problems using a moderate number 
of processors. I wonder 
why the number of threads isn't set higher than one by default, for 
example, using as many as nprocs() cores?
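
For the dense case the effect is easy to see; a quick sketch (mat-vec is
memory-bound, so the speedup is usually well below linear):

A = randn(4000, 4000); x = randn(4000);
blas_set_num_threads(1); @time A*x;   # single-threaded baseline
blas_set_num_threads(8); @time A*x;   # typically faster, but not 8x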


Re: [julia-users] Re: Parallel sparse matrix vector multiplication

2014-02-05 Thread Madeleine Udell
Miles, you're right that writing sparse matrix vector products in native
Julia probably won't be the best idea given Julia's model of parallelism.
That's why I'm interested in calling an outside library like PETSc.

I see it's possible to link Julia with MKL. I haven't tried this yet, but
if I do, will A*b (where A is sparse) call MKL to perform the matrix vector
product?


On Wed, Feb 5, 2014 at 11:43 AM, Miles Lubin miles.lu...@gmail.com wrote:

 Memory access is typically a significant bottleneck in sparse mat-vec, so
 unfortunately I'm skeptical that one could achieve good performance using
 Julia's current distributed memory approach on a multicore machine. This
 really calls for something like OpenMP.


 On Wednesday, February 5, 2014 11:42:00 AM UTC-5, Madeleine Udell wrote:

 I'm developing an iterative optimization algorithm in Julia along the
 lines of other contributions to the Iterative Solvers project
 (https://github.com/JuliaLang/IterativeSolvers.jl) or the Krylov Subspace
 module (https://github.com/JuliaLang/IterativeSolvers.jl/blob/master/src/krylov.jl),
 whose only computationally intensive step is computing A*b or A'*b. I would like
 to parallelize the method by using a parallel sparse matrix vector
 multiply. Is there a standard backend matrix-vector multiply that's
 recommended in Julia if I'm targeting a shared memory computer with a large
 number of processors? Similarly, is there a recommended backend for
 targeting a cluster? My matrices can easily reach 10 million rows by 1
 million columns, with sparsity anywhere from .01% to problems that are
 nearly diagonal.

 I've seen many posts (https://github.com/JuliaLang/julia/issues/2645) talking
 about integrating PETSc as a backend for this purpose, but it looks like
 the project (https://github.com/petsc/petsc/blob/master/bin/julia/PETSc.jl)
 has stalled - the last commits I see are a year ago. I'm also interested in
 other backends, eg Spark (http://spark.incubator.apache.org/), SciDB
 (http://scidb.org/), etc.

 I'm more interested in solving sparse problems, but as a side note, the
 built-in BLAS acceleration by changing the number of threads 
 `blas_set_num_threads`
 works ok for dense problems using a moderate number of processors. I wonder
 why the number of threads isn't set higher than one by default, for
 example, using as many as nprocs() cores?




-- 
Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell


Re: [julia-users] Re: Parallel sparse matrix vector multiplication

2014-02-05 Thread Madeleine Udell
But speaking of writing parallel matrix vector products in native
Julia, this might be a great use case for shared arrays (although
right now I think only dense shared arrays exist). Amit, can you
comment on this?

On Wed, Feb 5, 2014 at 1:41 PM, Madeleine Udell
madeleine.ud...@gmail.com wrote:
 Miles, you're right that writing sparse matrix vector products in native
 Julia probably won't be the best idea given Julia's model of parallelism.
 That's why I'm interested in calling an outside library like PETSc.

 I see it's possible to link Julia with MKL. I haven't tried this yet, but if
 I do, will A*b (where A is sparse) call MKL to perform the matrix vector
 product?


 On Wed, Feb 5, 2014 at 11:43 AM, Miles Lubin miles.lu...@gmail.com wrote:

 Memory access is typically a significant bottleneck in sparse mat-vec, so
 unfortunately I'm skeptical that one could achieve good performance using
 Julia's current distributed memory approach on a multicore machine. This
 really calls for something like OpenMP.


 On Wednesday, February 5, 2014 11:42:00 AM UTC-5, Madeleine Udell wrote:

 I'm developing an iterative optimization algorithm in Julia along the
 lines of other contributions to the Iterative Solvers project or Krylov
 Subspace module whose only computationally intensive step is computing A*b
 or A'*b. I would like to parallelize the method by using a parallel sparse
 matrix vector multiply. Is there a standard backend matrix-vector multiply
 that's recommended in Julia if I'm targeting a shared memory computer with a
 large number of processors? Similarly, is there a recommended backend for
 targeting a cluster? My matrices can easily reach 10 million rows by 1
 million columns, with sparsity anywhere from .01% to problems that are
 nearly diagonal.

 I've seen many posts talking about integrating PETSc as a backend for
 this purpose, but it looks like the project has stalled - the last commits I
 see are a year ago. I'm also interested in other backends, eg Spark, SciDB,
 etc.

 I'm more interested in solving sparse problems, but as a side note, the
 built-in BLAS acceleration by changing the number of threads
 `blas_set_num_threads` works ok for dense problems using a moderate number
 of processors. I wonder why the number of threads isn't set higher than one
 by default, for example, using as many as nprocs() cores?




 --
 Madeleine Udell
 PhD Candidate in Computational and Mathematical Engineering
 Stanford University
 www.stanford.edu/~udell



-- 
Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell


Re: [julia-users] Using SharedArray with pmap

2014-01-27 Thread Madeleine Udell
Thanks, that makes sense. After decoupling the parameters in my code, I see 
that the increase in memory is proportional to the number of tasks, not the 
size of the shared memory block.

On Sunday, January 26, 2014 1:28:18 AM UTC-8, Amit Murthy wrote:

 @parallel is efficient at executing a large number of small computations, 
 while pmap is better for a small number of complex computations.

 What is happening in the mypmap case is that remotecall_fetch is being 
 called 1 times, i.e., 1 roundtrips to other processors. Not very 
 efficient.

 I would also point out that in your particular example of @parallel (+), 
 you will find that sum(a::AbstractArray) is much faster than @parallel.

 However, say you wanted to initialize the shared array in parallel, then 
 you would find that

 s=SharedArray(Int, 10^8)
 @parallel for i in 1:10^8
s[i] = rand(1:10)
 end

 quite efficiently use all workers in initializing the shared array.






 On Sun, Jan 26, 2014 at 12:55 PM, Madeleine Udell 
 madelei...@gmail.comjavascript:
  wrote:

 When using SharedArrays with pmap, I'm getting an increase in memory 
 usage and time proportional to the number of tasks. This doesn't happen 
 when using @parallel. What's the right way to pass shared arrays to workers 
 using functional syntax?

 (code for file q3.jl pasted below and also attached; the first timing 
 result refers to a @parallel implementation, the second to a pmap-style 
 implementation)

 ᐅ julia -p 10 q3.jl 100
 elapsed time: 1.14932906 seconds (12402424 bytes allocated)
 elapsed time: 0.097900614 seconds (2716048 bytes allocated)
 ᐅ julia -p 10 q3.jl 1000
 elapsed time: 1.140016584 seconds (12390724 bytes allocated)
 elapsed time: 0.302179888 seconds (21641260 bytes allocated)
 ᐅ julia -p 10 q3.jl 1
 elapsed time: 1.173121314 seconds (12402424 bytes allocated)
 elapsed time: 2.429918636 seconds (197840960 bytes allocated)

 n = int(ARGS[1])
 arr = randn(n)
 function make_shared(a::AbstractArray,pids=workers())
 sh = SharedArray(typeof(a[1]),size(a),pids=pids)
 sh[:] = a[:]
 return sh
 end
 arr = make_shared(arr)
 tasks = 1:n

 @time begin
 @parallel (+) for i in tasks
  arr[i]
 end
 end

 @everywhere function f(task,arr)
 arr[task]
 end
 function mypmap(f::Function, tasks, arr)
 # if this resends the shared data every time, it shouldn't)
 np = nprocs()  # determine the number of processes available
 n = length(tasks)
 results = 0
 i = 1
 # function to produce the next work item from the queue.
 # in this case it's just an index.
 nextidx() = (idx=i; i+=1; idx)
 @sync begin
 for p=1:np
 if p != myid() || np == 1
 @async begin
 while true
 idx = nextidx()
 if idx > n
 break
 end
 task = tasks[idx]
 results += remotecall_fetch(p, f, task, arr)
 end
 end
 end
 end
 end
 results
 end

 @time mypmap(f,tasks,arr)




Re: [julia-users] ijulia parallel

2014-01-25 Thread Madeleine Udell
I tried running Pkg.update() and restarting IJulia, with the same error. 
I'm using the Julia installed from github two days ago. Any other ideas 
what might be going on?

On Saturday, January 25, 2014 1:59:55 PM UTC-8, Jiahao Chen wrote:

  ERROR: StateError(Resource temporarily unavailable) 

 This error was recently fixed. Try running Pkg.update() and restarting 
 IJulia. 

  Additionally, I'd like to be able to use the ijulia analog of 
  
  julia --machinefile fn 
  
  
  Is there something I need to add to my ipython julia profile to make 
 this 
  possible? Or a command line argument I can pass? Or can I set up an 
 ipython 
  cluster for use with ijulia? 

 You can do this programmatically using a ClusterManager. 


 http://docs.julialang.org/en/latest/manual/parallel-computing/#clustermanagers
  

 Not sure about the IPython cluster option. 

 Thanks, 

 Jiahao Chen, PhD 
 Staff Research Scientist 
 MIT Computer Science and Artificial Intelligence Laboratory 



[julia-users] Using SharedArray with pmap

2014-01-25 Thread Madeleine Udell
When using SharedArrays with pmap, I'm getting an increase in memory usage 
and time proportional to the number of tasks. This doesn't happen when 
using @parallel. What's the right way to pass shared arrays to workers 
using functional syntax?

(code for file q3.jl pasted below and also attached; the first timing 
result refers to a @parallel implementation, the second to a pmap-style 
implementation)

ᐅ julia -p 10 q3.jl 100
elapsed time: 1.14932906 seconds (12402424 bytes allocated)
elapsed time: 0.097900614 seconds (2716048 bytes allocated)
ᐅ julia -p 10 q3.jl 1000
elapsed time: 1.140016584 seconds (12390724 bytes allocated)
elapsed time: 0.302179888 seconds (21641260 bytes allocated)
ᐅ julia -p 10 q3.jl 1
elapsed time: 1.173121314 seconds (12402424 bytes allocated)
elapsed time: 2.429918636 seconds (197840960 bytes allocated)

n = int(ARGS[1])
arr = randn(n)
function make_shared(a::AbstractArray,pids=workers())
sh = SharedArray(typeof(a[1]),size(a),pids=pids)
sh[:] = a[:]
return sh
end
arr = make_shared(arr)
tasks = 1:n

@time begin
@parallel (+) for i in tasks
arr[i]
end
end

@everywhere function f(task,arr)
arr[task]
end
function mypmap(f::Function, tasks, arr)
# if this resends the shared data every time, it shouldn't)
np = nprocs()  # determine the number of processes available
n = length(tasks)
results = 0
i = 1
# function to produce the next work item from the queue.
# in this case it's just an index.
nextidx() = (idx=i; i+=1; idx)
@sync begin
for p=1:np
if p != myid() || np == 1
@async begin
while true
idx = nextidx()
if idx > n
break
end
task = tasks[idx]
results += remotecall_fetch(p, f, task, arr)
end
end
end
end
end
results
end

@time mypmap(f,tasks,arr)
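
One way to cut the per-task roundtrips, sketched below: ship one strided
chunk of tasks per worker instead of one task per remotecall_fetch
(fchunk and mypmap_chunked are made-up names):

@everywhere function fchunk(tasks, arr)
    s = 0.0
    for t in tasks
        s += f(t, arr)   # f as defined above
    end
    s
end

function mypmap_chunked(tasks, arr)
    np = length(workers())
    results = 0.0
    @sync begin
        for (k, p) in enumerate(workers())
            chunk = tasks[k:np:end]   # strided split: one chunk per worker
            @async results += remotecall_fetch(p, fchunk, chunk, arr)
        end
    end
    results
end

@time mypmap_chunked(tasks, arr)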




Re: [julia-users] [Parallel] Using shared memory + parallel maps elegantly

2014-01-25 Thread Madeleine Udell
That works great. Thanks!


On Thu, Jan 23, 2014 at 8:39 PM, Amit Murthy amit.mur...@gmail.com wrote:

 The SharedArray object has a field loc_shmarr which represents the backing
 array. So S.loc_shmarr should work everywhere. But you are right, we need
 to ensure that the SharedArray can be used just as a regular array.


 On Fri, Jan 24, 2014 at 9:00 AM, Madeleine Udell 
 madeleine.ud...@gmail.com wrote:

 even more problematic: I can't multiply by my SharedArray:

 no method *(SharedArray{Float64,2}, Array{Float64,2})


 On Thursday, January 23, 2014 7:22:59 PM UTC-8, Madeleine Udell wrote:

 Thanks! I'm trying out a SharedArray solution now, but wondered if you
 can tell me if there's an easy way to reimplement many of the convenience
 wrappers on arrays for shared arrays. Eg I get the following errors:

  shared_array[1,:]
 no method getindex(SharedArray{Float64,2}, Float64, Range1{Int64})

  repmat(shared_array,2,1)
 no method similar(SharedArray{Float64,2}, Type{Float64}, (Int64,Int64))
  in repmat at abstractarray.jl:1043

 I'm surprised these aren't inherited properties from AbstractArray!

 On Wednesday, January 22, 2014 8:05:45 PM UTC-8, Amit Murthy wrote:

 1. The SharedArray object can be sent to any of the processes that
 mapped the shared memory segment during construction. The backing array is
 not copied.
 2. User defined composite types are fine as long as isbits(T) is true.



 On Thu, Jan 23, 2014 at 1:01 AM, Madeleine Udell 
 madelei...@gmail.comwrote:

 That's not a problem for me; all of my data is numeric. To summarize a
 long post, I'm interested in understanding

 1) good programming paradigms for using shared memory together with
 parallel maps. In particular, can a shared array and other nonshared data
 structure be combined into a single data structure and passed in a 
 remote
 call without unnecessarily copying the shared array? and
 2) possibilities for extending shared memory in julia to other data
 types, and even to user defined types.


 On Tuesday, January 21, 2014 11:17:10 PM UTC-8, Amit Murthy wrote:

 I have not gone through your post in detail, but would like to point
 out that SharedArray can only be used for bitstypes.


 On Wed, Jan 22, 2014 at 12:23 PM, Madeleine Udell 
 madelei...@gmail.com wrote:

 # Say I have a list of tasks, eg tasks i=1:n
 # For each task I want to call a function foo
 # that depends on that task and some fixed data
 # I have many types of fixed data: eg, arrays, dictionaries,
 integers, etc

 # Imagine the data comes from eg loading a file based on user input,
 # so we can't hard code the data into the function foo
 # although it's constant during program execution

 # If I were doing this in serial, I'd do the following

 type MyData
 myint
 mydict
 myarray
 end

 function foo(task,data::MyData)
 data.myint + data.myarray[data.mydict[task]]
 end

 n = 10
 const data = MyData(rand(),Dict(1:n,randperm(n)),randperm(n))

 results = zeros(n)
 for i = 1:n
 results[i] = foo(i,data)
 end

 # What's the right way to do this in parallel? Here are a number of
 ideas
 # To use @parallel or pmap, we have to first copy all the code and
 data everywhere
 # I'd like to avoid that, since the data is huge (10 - 100 GB)

 @everywhere begin
 type MyData
  myint
 mydict
 myarray
 end

 function foo(task,data::MyData)
 data.myint + data.myarray[data.mydict[task]]
 end

 n = 10
 const data = MyData(rand(),Dict(1:n,randperm(n)),randperm(n))
 end

  ## @parallel
 results = zeros(n)
 @parallel for i = 1:n
 results[i] = foo(i,data)
 end

 ## pmap
 @everywhere foo(task) = foo(task,data)
 results = pmap(foo,1:n)

 # To avoid copying data, I can make myarray a shared array
 # In that case, I don't want to use @everywhere to put data on each
 processor
 # since that would reinstantiate the shared array.
 # My current solution is to rewrite my data structure to *not*
 include myarray,
 # and pass the array to the function foo separately.
 # But the code gets much less pretty as I tear apart my data
 structure,
 # especially if I have a large number of shared arrays.
 # Is there a way for me to avoid this while using shared memory?
 # really, I'd like to be able to define my own shared memory data
 types...

 @everywhere begin
 type MySmallerData
 myint
 mydict
  end

 function foo(task,data::MySmallerData,myarray::SharedArray)
 data.myint + myarray[data.mydict[task]]
  end

 n = 10
 const data = MySmallerData(rand(),Dict(1:n,randperm(n)))
 end

 myarray = SharedArray(randperm(n))

 ## @parallel
 results = zeros(n)
 @parallel for i = 1:n
 results[i] = foo(i,data,myarray)
 end

 ## pmap
 @everywhere foo(task) = foo(task,data,myarray)
 results = pmap(foo,1:n)

 # Finally, what can I do to avoid copying mydict to each processor?
 # Is there a way to use shared memory for it?
 # Once again, I'd really like to be able to define my own shared
 memory data types...







-- 
Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University

Re: [julia-users] ijulia parallel

2014-01-25 Thread Madeleine Udell
great, thanks! that works.

On Saturday, January 25, 2014 9:25:28 PM UTC-8, Amit Murthy wrote:

 This was the fix in Julia (2 days ago) that corrected it - 
 https://github.com/JuliaLang/julia/commit/390f466d23798b47ba6be40ee5777e21642569bb
  

 Yo may want to pull the latest Julia and try again.


 On Sun, Jan 26, 2014 at 9:36 AM, Madeleine Udell 
 madelei...@gmail.comjavascript:
  wrote:

 I tried running Pkg.update() and restarting IJulia, with the same error. 
 I'm using the Julia installed from github two days ago. Any other ideas 
 what might be going on?


 On Saturday, January 25, 2014 1:59:55 PM UTC-8, Jiahao Chen wrote:

  ERROR: StateError(Resource temporarily unavailable) 

 This error was recently fixed. Try running Pkg.update() and restarting 
 IJulia. 

  Additionally, I'd like to be able to use the ijulia analog of 
  
  julia --machinefile fn 
  
  
  Is there something I need to add to my ipython julia profile to make 
 this 
  possible? Or a command line argument I can pass? Or can I set up an 
 ipython 
  cluster for use with ijulia? 

 You can do this programmatically using a ClusterManager. 

 http://docs.julialang.org/en/latest/manual/parallel-
 computing/#clustermanagers 

 Not sure about the IPython cluster option. 

 Thanks, 

 Jiahao Chen, PhD 
 Staff Research Scientist 
 MIT Computer Science and Artificial Intelligence Laboratory 




Re: [julia-users] [Parallel] Using shared memory + parallel maps elegantly

2014-01-23 Thread Madeleine Udell
Thanks! I'm trying out a SharedArray solution now, but wondered if you can 
tell me if there's an easy way to reimplement many of the convenience 
wrappers on arrays for shared arrays. Eg I get the following errors:

 shared_array[1,:]
no method getindex(SharedArray{Float64,2}, Float64, Range1{Int64})

 repmat(shared_array,2,1)
no method similar(SharedArray{Float64,2}, Type{Float64}, (Int64,Int64))
 in repmat at abstractarray.jl:1043

I'm surprised these aren't inherited properties from AbstractArray!

On Wednesday, January 22, 2014 8:05:45 PM UTC-8, Amit Murthy wrote:

 1. The SharedArray object can be sent to any of the processes that mapped 
 the shared memory segment during construction. The backing array is not 
 copied.
 2. User defined composite types are fine as long as isbits(T) is true.



 On Thu, Jan 23, 2014 at 1:01 AM, Madeleine Udell 
 madelei...@gmail.comjavascript:
  wrote:

 That's not a problem for me; all of my data is numeric. To summarize a 
 long post, I'm interested in understanding 

 1) good programming paradigms for using shared memory together with 
 parallel maps. In particular, can a shared array and other non-shared data 
 structures be combined into a single data structure and passed in a remote 
 call without unnecessarily copying the shared array? and 
 2) possibilities for extending shared memory in Julia to other data 
 types, and even to user-defined types.


 On Tuesday, January 21, 2014 11:17:10 PM UTC-8, Amit Murthy wrote:

 I have not gone through your post in detail, but would like to point out 
 that SharedArray can only be used for bitstypes.


 On Wed, Jan 22, 2014 at 12:23 PM, Madeleine Udell 
 madelei...@gmail.com wrote:

 # Say I have a list of tasks, eg tasks i=1:n
 # For each task I want to call a function foo
 # that depends on that task and some fixed data
 # I have many types of fixed data: eg, arrays, dictionaries, integers, etc

 # Imagine the data comes from eg loading a file based on user input,
 # so we can't hard code the data into the function foo 
 # although it's constant during program execution

 # If I were doing this in serial, I'd do the following

 type MyData
     myint
     mydict
     myarray
 end

 function foo(task, data::MyData)
     data.myint + data.myarray[data.mydict[task]]
 end

 n = 10
 const data = MyData(rand(),Dict(1:n,randperm(n)),randperm(n))

 results = zeros(n)
 for i = 1:n
     results[i] = foo(i, data)
 end

 # What's the right way to do this in parallel? Here are a number of ideas.
 # To use @parallel or pmap, we have to first copy all the code and data everywhere.
 # I'd like to avoid that, since the data is huge (10 - 100 GB).

 @everywhere begin
 type MyData
     myint
     mydict
     myarray
 end

 function foo(task, data::MyData)
     data.myint + data.myarray[data.mydict[task]]
 end

 n = 10
 const data = MyData(rand(), Dict(1:n, randperm(n)))
 end

 ## @parallel
 results = zeros(n)
 @parallel for i = 1:n
     results[i] = foo(i, data)
 end

 ## pmap
 @everywhere foo(task) = foo(task,data)
 results = pmap(foo,1:n)

 # To avoid copying data, I can make myarray a shared array.
 # In that case, I don't want to use @everywhere to put data on each processor,
 # since that would reinstantiate the shared array.
 # My current solution is to rewrite my data structure to *not* include myarray,
 # and pass the array to the function foo separately.
 # But the code gets much less pretty as I tear apart my data structure,
 # especially if I have a large number of shared arrays.
 # Is there a way for me to avoid this while using shared memory?
 # Really, I'd like to be able to define my own shared memory data types...

 @everywhere begin
 type MySmallerData
     myint
     mydict
 end

 function foo(task, data::MySmallerData, myarray::SharedArray)
     data.myint + myarray[data.mydict[task]]
 end

 n = 10
 const data = MySmallerData(rand(), Dict(1:n, randperm(n)))
 end

 myarray = SharedArray(Int, n)   # construct by element type and size, then fill
 myarray[:] = randperm(n)

 ## @parallel
 results = zeros(n)
 @parallel for i = 1:n
     results[i] = foo(i, data, myarray)
 end

 ## pmap
 @everywhere foo(task) = foo(task, data, myarray)
 results = pmap(foo, 1:n)

 # Finally, what can I do to avoid copying mydict to each processor?
 # Is there a way to use shared memory for it?
 # Once again, I'd really like to be able to define my own shared memory data types...





[julia-users] [Parallel] Chaining dependencies on modules

2014-01-21 Thread Madeleine Udell
I'm trying to understand the most Julian way to perform a particular 
parallel programming task. Suppose I need function foo from module.jl to be 
available everywhere. Let's call the following code map_foo.jl:

@everywhere include("module.jl")
@everywhere using MyModule
pmap(foo, 1:100)

That works fine, except when module.jl itself depends on other modules:

module MyModule

using DataStructures
export foo

function foo(i)
    return Queue(i)
end

end # module

In this case, it works to call 

julia map_foo.jl

but when I call

julia -p 2 map_foo.jl

I get the following error

Warning: requiring DataStructures did not define a corresponding module.
Warning: requiring DataStructures did not define a corresponding module.
exception on exception on 2: 3: ERROR: ERROR: Queue not definedQueue not 
defined
 in
 in foo at /Users/madeleineudell/Dropbox/pestilli_icme_life 
(1)/src/julia/questions/module.jl:7
 in anonymous at multi.jl:834
 in run_work_thunk at multi.jl:575
 in anonymous at task.jl:834
foo at /Users/madeleineudell/Dropbox/pestilli_icme_life 
(1)/src/julia/questions/module.jl:7
 in anonymous at multi.jl:834
 in run_work_thunk at multi.jl:575
 in anonymous at task.jl:834

Does anyone know how I can successfully chain dependencies like this when 
using parallelism? Calling @everywhere on the import call in module.jl also 
doesn't fix the problem, strangely enough.
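
One pattern that may be worth trying (a sketch, not a confirmed fix): require, 
unlike include, loads a source file on all processes, so letting it pull in 
module.jl and then bringing the module into scope everywhere avoids the 
per-worker include. This assumes module.jl sits at the same path on (or is 
visible to) every worker:

require("module.jl")        # unlike include, require loads on all processes
@everywhere using MyModule  # bind foo on each worker
pmap(foo, 1:100)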

Of course, if I could put all my code into shared memory, I'd be much 
happier. I just saw an update (https://github.com/JuliaLang/julia/pull/4939) 
adding shared memory arrays, but I don't know if there's a way to get shared 
memory code!



[julia-users] [Parallel] Using shared memory + parallel maps elegantly

2014-01-21 Thread Madeleine Udell
# Say I have a list of tasks, eg tasks i=1:n
# For each task I want to call a function foo
# that depends on that task and some fixed data
# I have many types of fixed data: eg, arrays, dictionaries, integers, etc

# Imagine the data comes from eg loading a file based on user input,
# so we can't hard code the data into the function foo 
# although it's constant during program execution

# If I were doing this in serial, I'd do the following

type MyData
    myint
    mydict
    myarray
end

function foo(task, data::MyData)
    data.myint + data.myarray[data.mydict[task]]
end

n = 10
const data = MyData(rand(),Dict(1:n,randperm(n)),randperm(n))

results = zeros(n)
for i = 1:n
    results[i] = foo(i, data)
end

# What's the right way to do this in parallel? Here are a number of ideas.
# To use @parallel or pmap, we have to first copy all the code and data everywhere.
# I'd like to avoid that, since the data is huge (10 - 100 GB).

@everywhere begin
type MyData
    myint
    mydict
    myarray
end

function foo(task, data::MyData)
    data.myint + data.myarray[data.mydict[task]]
end

n = 10
const data = MyData(rand(), Dict(1:n, randperm(n)))
end

## @parallel
results = zeros(n)
@parallel for i = 1:n
    results[i] = foo(i, data)
end

## pmap
@everywhere foo(task) = foo(task, data)
results = pmap(foo, 1:n)

# To avoid copying data, I can make myarray a shared array.
# In that case, I don't want to use @everywhere to put data on each processor,
# since that would reinstantiate the shared array.
# My current solution is to rewrite my data structure to *not* include myarray,
# and pass the array to the function foo separately.
# But the code gets much less pretty as I tear apart my data structure,
# especially if I have a large number of shared arrays.
# Is there a way for me to avoid this while using shared memory?
# Really, I'd like to be able to define my own shared memory data types...

@everywhere begin
type MySmallerData
    myint
    mydict
end

function foo(task, data::MySmallerData, myarray::SharedArray)
    data.myint + myarray[data.mydict[task]]
end

n = 10
const data = MySmallerData(rand(), Dict(1:n, randperm(n)))
end

myarray = SharedArray(Int, n)   # construct by element type and size, then fill
myarray[:] = randperm(n)

## @parallel
results = zeros(n)
@parallel for i = 1:n
    results[i] = foo(i, data, myarray)
end

## pmap
@everywhere foo(task) = foo(task, data, myarray)
results = pmap(foo, 1:n)

# Finally, what can I do to avoid copying mydict to each processor?
# Is there a way to use shared memory for it?
# Once again, I'd really like to be able to define my own shared memory data types...