> > Perhaps I will just call GEMV directly and pass it pointers to the correct > locations.
Yes that would solve the allocation problem. Some of the BLAS wrappers support pointers as input, but I think that is only the BLAS1 routines. Hence, you'll have to make your own ccall to BLAS. Maybe we should consider to support pointer version of the BLAS2 and 3 wrappers. 2014-12-07 13:05 GMT-05:00 <adok...@ucdavis.edu>: > Thank you! I had suspected it may have been related to sub. Perhaps I will > just call GEMV directly and pass it pointers to the correct locations. > > On Sunday, December 7, 2014 9:59:47 AM UTC-8, Andreas Noack wrote: >> >> The allocation is from gemv!. It is likely to be sub that allocates the >> memory. See >> >> >> julia> let >> A = randn(5,5); >> x = randn(5); >> b = Array(Float64, 5) >> @time for i = 1:10;BLAS.gemv!('N',1.0,A,x,0.0,b);end >> end >> elapsed time: 8.0933e-5 seconds (0 bytes allocated) >> >> >> versus >> >> julia> let >> x = randn(5); >> @time for i = 1:10;sub(x, 1:5);end >> end >> elapsed time: 1.258e-6 seconds (1440 bytes allocated) >> >> >> I've been told that it is because of the garbage collector and >> unfortunately I'm not sure if it is possible to change. It can in some >> cases be extensive and make it difficult to compete with Fortran >> implementations of linear algebra algorithms. >> >> 2014-12-07 12:41 GMT-05:00 <ado...@ucdavis.edu>: >> >>> I am writing a Julia function which implements some of the functionality >>> of the LAPACK function TPQRT, but my version stops before computing the >>> block reflectors. In my function, I make several calls to gemv! and ger! >>> from Base.LinAlg.BLAS. Each of these calls seems to generate extra overhead >>> of 440 bytes according to the output of --track-allocation, and since each >>> of those functions gets called several times within a function that also >>> gets called many times, the extra allocations blow up and I end up spending >>> a large chunk of time in garbage collection. Is it normal for each function >>> call to dynamically allocate so much space? Here is the function in >>> question: >>> >>> function tpqrf!( A::StridedMatrix{Float64}, B::StridedMatrix{Float64}, >>> tau::StridedVector{Float64}, work::Vector{Float64} ) >>> m,n = size(B) >>> for j = 1:n >>> # Construct Householder reflector >>> A[j,j], tau[j] = larfg!(A[j,j], sub(B, :, j), tau[j]) >>> # Apply reflector to remainder >>> for i = j+1:n >>> work[i] = A[j,i] >>> end >>> Base.LinAlg.BLAS.gemv!('T', 1.0, sub(B,:,j+1:n), sub(B,:,j), 1.0, >>> sub(work,j+1:n)) >>> for k = j+1:n >>> A[j,k] = A[j,k] - tau[j]*work[k] >>> end >>> Base.LinAlg.BLAS.ger!(-tau[j], sub(B,:,j), sub(work,j+1:n), >>> sub(B,:,j+1:n)) >>> end >>> end >>> >>> Thanks, >>> Aaron >>> >>> >>> >>> >>