I moved this discussion to https://discourse.julialang.org where it is 
easier to format the code chunks.

On Monday, October 31, 2016 at 2:04:02 PM UTC-5, Douglas Bates wrote:
>
> I am encountering an unexpected amount of storage allocation in the 
> cfactor!{A::HBlkDiag) method in the MixedModels package.  See
> https://github.com/dmbates/MixedModels.jl/blob/master/src/cfactor.jl  for 
> the code.
>
> An HBlkDiag matrix is a homogeneous block diagonal matrix where 
> "homogeneous" refers to the fact that all the diagonal blocks are square 
> and of the same size.  Because of this homogeneity the blocks can be stored 
> in an r by r by k array where r is the size of each of the square block and 
> k is the number of such blocks.
>
> On entry to this method the blocks are symmetric, positive semi-definite. 
>  I want to overwrite the upper triangle of each of these blocks with its 
> Cholesky factor.  I call LAPACK.potrf! directly because I don't want 
> cholfact! to throw a non-positive-definite error.  The strange thing to me 
> is that when I monitor the storage allocation, I get a huge amount of 
> storage being allocated in the line with that call.  This may be because 
> LAPACK.potrf! returns a tuple of the original matrix and an Int (the info 
> code) but I'm not sure.
>
> To see an example of this unusual amount of allocation try the following 
> code with julia --track-allocation=user
>
> using Feather, MixedModels
> cd(Pkg.dir("MixedModels", "test", "data"))
> sleepstudy = Feather.read("sleepstudy.feather", nullable=false)
> fm1 = fit!(lmm(Reaction ~ 1 + Days + (1 + Days | Subject), sleepstudy))
> Profile.clear_malloc_data()
> devs, vars, betas, thetas = bootstrap(10_000, fm1)
>
> I get
>
>         - function cfactor!{T}(A::HBlkDiag{T})
>         -     Aa = A.arr
>         0     r, s, t = size(Aa)
>         0     if r ≠ s
>         0         throw(ArgumentError("HBlkDiag matrix A must be square"))
>         -     end
>  94428000     scm = Array(T, (r, r))
>         0     for k in 1 : t  # FIXME: Lots of allocations in this loop
>         0         for j in 1 : r, i in 1 : j
>         0             scm[i, j] = Aa[i, j, k]
>         -         end
> 566568000         LAPACK.potrf!('U', scm)
>         0         for j in 1 : r, i in 1 : j
>         0             Aa[i, j, k] = scm[i, j]
>         -         end
>         -     end
>  10492000     UpperTriangular(A)
>         - end
>
> In this case the HBlkDiag matrix being decomposed has is 36 by 36 
> consisting of 18 2 by 2 diagonal blocks.  scm is a scratch 2 by 2 
> matrixthat is overwritten in sequence by the upper triangle of each of the 
> original 2 by 2 blocks and passed to LAPACK.potrf!
>

Reply via email to