Great article! I used some of your tips on a complicated inner loop just now and immediately got a 3.6x speedup.
On Monday, September 15, 2014 7:57:17 PM UTC-5, Arch Robison wrote:
>
> Thanks to all for taking time to report my errors. I missed @code_llvm's introduction into Julia. It let me shorten that section slightly. Jacob is right about the row/column index. I tend to think of the leftmost subscript as the "runs down the column" index. When I'm working with Fortran/Julia, I'm constantly thinking "runs down the column; runs down the column" to override my C++ habit. But I mistakenly phrased it as "column index". I changed the text to say "leftmost subscript". I think that I've fixed all the bugs reported up to the time this note is posted.
>
> On Mon, Sep 15, 2014 at 5:55 PM, Jacob Quinn <quinn....@gmail.com> wrote:
>
>> - Under the first bullet point of "What is Vectorization?", "Writing you code" should be "Writing your code"
>>
>> - In the "Speedup Surprise" section, the last sentence says "The compiler than...", should be "The compiler then...."
>>
>> - In the section "Inspecting Whether Code Vectorizes", you can actually use the pattern
>>
>>     @code_llvm axpy(1.414f0, x, y)
>>
>>   (using the original example/function call). This gives the LLVM output of the outermost function call and is quite handy vs. dealing with the tuple of argument types (as is noted in your article).
>>
>> - In the section "Trip Count must be Obvious", down to the bullet points, I think your second bullet point should be
>>
>>     - *first*(*r*) returns the first index value.
>>
>> - In the section "The Loop Body Should be Straight Line Code", there is an extra period (.) after *type-stable code*
>>
>> - In the section "Subscripts should be Unit Stride", it says
>>
>>     When working with nested loops on two-dimensional arrays, use *@simd* on the inner loop and make that loop index the column index of the arrays.
>>
>>   But then the example has `i` as the inner loop variable and the matrices are indexed like A[i, j], which seems like `j` is column-index. Am I misinterpreting this?
>>
>> Loved the article! For those of us unable to make it to JuliaCon, it's been awesome to be able to watch the videos and read up on articles like this. Thanks for all the work!
>>
>> -Jacob
>>
>> On Mon, Sep 15, 2014 at 6:26 PM, Patrick O'Leary <patrick...@gmail.com> wrote:
>>
>>> Under "Inspecting Whether Code Vectorizes"
>>>
>>>     code_llvm(axpy,(T1,T2,T2})
>>>
>>> The next-to-last character should be a paren.
>>>
>>> This is a very informative article; thanks for putting it and the feature together!