npreg spends almost all its time in multithreaded BLAS doing QR 
factorizations, so there isn't much "tweaking" you can do to improve 
performance.
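
If you want to check this kind of thing yourself, here is a rough sketch. It assumes a current Julia (1.x) with the LinearAlgebra and Profile standard libraries; on the 0.4 series the function names differ. fit_qr and the matrix sizes are made up for illustration and are not the actual npreg code. The idea is to profile a QR-based least-squares solve, then time it under different BLAS thread counts:

```julia
using LinearAlgebra, Profile

# Hypothetical stand-in for the expensive step: a QR-based least-squares fit.
# The sizes are arbitrary; substitute something close to your real problem.
X = randn(2000, 200)
y = randn(2000)
fit_qr(X, y) = qr(X) \ y

fit_qr(X, y)                  # warm up so compilation is not profiled
Profile.clear()
@profile for _ in 1:20
    fit_qr(X, y)
end
# Flat listing sorted by sample count; the heaviest entries appear last.
Profile.print(format = :flat, sortedby = :count)

# Vary the BLAS thread count to see how much the timing depends on it.
for n in (1, 4)
    BLAS.set_num_threads(n)
    @time fit_qr(X, y)
end
```

If the heaviest entries in the profile are the factorization and the solve, and the timings shift with the thread count, the work is in the linear algebra library rather than in the Julia code wrapped around it.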

On Friday, December 18, 2015 at 9:11:07 AM UTC+1, michae...@gmail.com wrote:
>
> I did the profiling, and ProfileView.jl worked fine and is definitely 
> pretty slick. There were no surprises, though: the sections wrapped in 
> @time are the costly ones.
>
> On Thursday, December 17, 2015 at 6:27:11 PM UTC+1, michae...@gmail.com 
> wrote:
>>
>> I tried using the profiler with another problem a few months ago, and 
>> ProfileView was not working for me then. I will give it another try. 
>> However, the parts of the code that impact the timing are pretty narrowly 
>> identified already. I have read the performance guide pretty carefully, and 
>> I don't see how to improve the current code with its suggestions. I suspect 
>> that trying to avoid using large arrays, and doing more with loops, might 
>> help. That would be a change of strategy, though, rather than an 
>> optimization of the current approach.
>>
>> On Thursday, December 17, 2015 at 3:54:07 PM UTC+1, Kristoffer Carlsson 
>> wrote:
>>>
>>> Why haven't you tried to profile it? That is the first thing that 
>>> anyone trying to help you would do. Use 
>>> https://github.com/timholy/ProfileView.jl to see what is slow, and 
>>> check whether it is explained in the performance guide.
>>>
>>> Then you can ask a much better question, like "why is this statement 
>>> slow?", instead of posting a whole function and asking someone to 
>>> optimize the whole thing.
>>>
>>
