Great article!  I used some of your tips on a complicated inner loop just 
now and immediately got a 3.6x speedup.  

On Monday, September 15, 2014 7:57:17 PM UTC-5, Arch Robison wrote:
>
> Thanks to all for taking time to report my errors.  I missed @code_llvm's 
> introduction into Julia.  It let me shorten that section slightly.  Jacob 
> is right about the row/column index. I tend to think of the leftmost 
> subscript as the "runs down the column" index.  When I'm working with 
> Fortran/Julia, I'm constantly thinking "runs down the column; runs down the 
> column" to override my C++ habit.  But I mistakenly phrased it as 
> "column index".  I changed the text to say "leftmost subscript".  I think 
> that I've fixed all the bugs reported up to the time this note is posted.
>
> On Mon, Sep 15, 2014 at 5:55 PM, Jacob Quinn <quinn....@gmail.com> wrote:
>
>> - Under the first bullet point of "What is Vectorization?", "Writing you 
>> code" should be "Writing your code"
>>
>> - In the "Speedup Surprise" section, the last sentence says "The compiler 
>> than...", should be "The compiler then...."
>>
>> - In the section "Inspecting Whether Code Vectorizes", you can actually 
>> use the pattern
>>  
>>
>> @code_llvm axpy(1.414f0, x, y)
>>
>>
>> (using the original example/function call). This gives the LLVM output of 
>> the outermost function call and is quite handy vs. dealing with the tuple 
>> of argument types (as is noted in your article).
>>
>>
>> - In the section "Trip Count must be Obvious", down to the bullet points, 
>> I think your second bullet point should be 
>>
>>
>>
>>    - `first(r)` returns the first index value.
>>
>> - In the section "The Loop Body Should be Straight Line Code", there is 
>> an extra period (.) after *type-stable code*
>>
>> - In the section "Subscripts should be Unit Stride", it says
>>
>> When working with nested loops on two-dimensional arrays, use *@simd* on 
>> the inner loop and make that loop index the column index of the arrays.
>>
>> But then the example has `i` as the inner loop variable and the matrices 
>> are indexed like A[i, j], which seems like `j` is column-index. Am I 
>> misinterpreting this?
>>
>>
>> Loved the article! For those of us unable to make it to JuliaCon, it's 
>> been awesome to be able to watch the videos and read up on articles like 
>> this. Thanks for all the work!
>>
>> -Jacob
>>
>> On Mon, Sep 15, 2014 at 6:26 PM, Patrick O'Leary <patrick...@gmail.com> wrote:
>>
>>> Under "Inspecting Whether Code Vectorizes"
>>>
>>> code_llvm(axpy,(T1,T2,T2})
>>>
>>>
>>> The next-to-last character should be a paren. 
>>>
>>> This is a very informative article; thanks for putting it and the 
>>> feature together!
>>>
>>
>>
>
