Go here: https://github.com/JuliaLang/julia/issues/new; describe the issue (much as you did here) and submit. Thank you!
On Fri, Jul 22, 2016 at 8:58 PM, Michael Prange <pra...@alum.mit.edu> wrote: > I'm new to Julia and do not know how to file a performance issue, but I am > happy to do it. Can you point me to the right place? > > Sent from my phone > > On Jul 22, 2016, at 18:09, Stefan Karpinski <ste...@karpinski.org> wrote: > > Can you file a performance issue? The built-in circshift should not have > these performance issues. > > On Fri, Jul 22, 2016 at 4:18 PM, Michael Prange <pra...@alum.mit.edu> > wrote: > >> I just discovered that Julia already has a function for circularly >> shifting the data in an array: circshift(A, shifts). However, its >> performance is worst of all. Using this new method, >> >> function fill_W4!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, ishift:: >> Int) >> @assert(size(W,1) == length(w), "Dimension mismatch between W and w") >> W[:,icol] = circshift(w,[ishift,]) >> return >> end >> >> >> the resulting timings are given by (with new random numbers) >> >> fill_W!: 0.002918 seconds (4 allocations: 160 bytes) >> fill_W1!: 0.006440 seconds (10 allocations: 7.630 MB) >> fill_W2!: 0.009244 seconds (8 allocations: 7.630 MB, 21.61% gc time) >> fill_W3!: 0.002014 seconds (8 allocations: 352 bytes) >> fill_W4!: 0.049601 seconds (19 allocations: 30.518 MB, 3.63% gc time) >> >> >> I would have expected the built-in method circshift to achieve the best >> results, but it is worst in all categories: time, allocations and memory. >> >> Michael >> >> On Friday, July 22, 2016 at 2:23:16 PM UTC-4, Michael Prange wrote: >>> >>> Gunnar, >>> >>> Thank you for your explanation of the extra allocations and the tip >>> about sub. I implemented a version with sub as fill_W3!: >>> >>> function fill_W3!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, >>> ishift::Int) >>> @assert(size(W,1) == length(w), "Dimension mismatch between W and w" >>> ) >>> W[(ishift+1):end,icol] = sub(w, 1:(length(w)-ishift)) >>> W[1:ishift,icol] = sub(w, (length(w)-ishift+1):length(w)) >>> return >>> end >>> >>> Is this what you had in mind? I reran the tests above (with new random >>> numbers) and had the following results: >>> >>> fill_W!: 0.003234 seconds (4 allocations: 160 bytes) >>> fill_W1!: 0.005898 seconds (9 allocations: 7.630 MB) >>> fill_W2!: 0.005904 seconds (7 allocations: 7.630 MB) >>> fill_W3!: 0.002347 seconds (8 allocations: 352 bytes) >>> >>> Using sub consistently achieves better times that fill_W!, even through it >>> uses twice the number of allocations than fill_W!. This seems to be the way >>> to go. >>> >>> >>> Michael >>> >>> >>> On Thursday, July 21, 2016 at 5:35:47 PM UTC-4, Gunnar Farnebäck wrote: >>>> >>>> fill_W1! allocates memory because it makes copies when constructing the >>>> right hand sides. fill_W2 allocates memory in order to construct the >>>> comprehensions (that you then discard). In both cases memory allocation >>>> could plausibly be avoided by a sufficiently smart compiler, but until >>>> Julia becomes that smart, have a look at the sub function to provide views >>>> instead of copies for the right hand sides of fill_W1!. >>>> >>>> On Thursday, July 21, 2016 at 5:07:34 PM UTC+2, Michael Prange wrote: >>>>> >>>>> I'm a new user, so have mercy in your responses. >>>>> >>>>> I've written a method that takes a matrix and vector as input and then >>>>> fills in column icol of that matrix with the vector of given values that >>>>> have been shifted upward by ishift indices with periodic boundary >>>>> conditions. To make this clear, given the matrix >>>>> >>>>> W = [1 2 >>>>> 3 4 >>>>> 5 6] >>>>> >>>>> the vector w = [7 8 9], icol = 2 and ishift = 1, the new value of W >>>>> is given by >>>>> >>>>> W = [1 8 >>>>> 3 9 >>>>> 5 7] >>>>> >>>>> I need a fast way of doing this for large matrices. I wrote three >>>>> methods that should (In my naive mind) give the same performance results, >>>>> but @time reports otherwise. The method definitions and the performance >>>>> results are given below. Can someone teach me why the results are so >>>>> different? The method fill_W! is too wordy for my tastes, but the more >>>>> compact notation in fill_W1! and fill_W2! achieve poorer results. Any why >>>>> do these latter two methods allocate so much memory when the whole point >>>>> of >>>>> these methods is to use already-allocated memory. >>>>> >>>>> Michael >>>>> >>>>> ### Definitions >>>>> >>>>> >>>>> function fill_W1!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, >>>>> ishift::Int) >>>>> @assert(size(W,1) == length(w), "Dimension mismatch between W and >>>>> w") >>>>> W[1:(end-ishift),icol] = w[(ishift+1):end] >>>>> W[(end-(ishift-1)):end,icol] = w[1:ishift] >>>>> return >>>>> end >>>>> >>>>> >>>>> function fill_W2!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, >>>>> ishift::Int) >>>>> @assert(size(W,1) == length(w), "Dimension mismatch between W and >>>>> w") >>>>> [W[i,icol] = w[i+ishift] for i in 1:(length(w)-ishift)] >>>>> [W[end-ishift+i,icol] = w[i] for i in 1:ishift] >>>>> return >>>>> end >>>>> >>>>> >>>>> function fill_W!{TF}(W::Matrix{TF}, icol::Int, w::Vector{TF}, >>>>> ishift::Int) >>>>> @assert(size(W,1) == length(w), "Dimension mismatch between W and >>>>> w") >>>>> n = length(w) >>>>> for j in 1:(n-ishift) >>>>> W[j,icol] = w[j+ishift] >>>>> end >>>>> for j in (n-(ishift-1)):n >>>>> W[j,icol] = w[j-(n-ishift)] >>>>> end >>>>> end >>>>> >>>>> >>>>> # Performance Results >>>>> julia> >>>>> W = rand(1000000,2) >>>>> w = rand(1000000) >>>>> println("fill_W!:") >>>>> println(@time fill_W!(W, 2, w, 2)) >>>>> println("fill_W1!:") >>>>> println(@time fill_W1!(W, 2, w, 2)) >>>>> println("fill_W2!:") >>>>> println(@time fill_W2!(W, 2, w, 2)) >>>>> >>>>> >>>>> Out> >>>>> fill_W!: >>>>> 0.002801 seconds (4 allocations: 160 bytes) >>>>> nothing >>>>> fill_W1!: >>>>> 0.007427 seconds (9 allocations: 7.630 MB) >>>>> [0.152463397611579,0.6314166578356002] >>>>> fill_W2!: >>>>> 0.005587 seconds (7 allocations: 7.630 MB) >>>>> [0.152463397611579,0.6314166578356002] >>>>> >>>>> >>>>> >