npreg spends almost all its time in multithreaded BLAS doing QR factorizations, so there isn't much "tweaking" you can do to improve performance.
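If you want to check this yourself, here is a rough sketch that times a QR factorization under different BLAS thread counts. The matrix `X` and its size are made up for illustration and are not taken from npreg, and I'm assuming a recent Julia where `BLAS.set_num_threads` lives in the `LinearAlgebra` standard library. If the timings track the thread count, the cost is in the library call rather than in your Julia code.

```julia
using LinearAlgebra

# Hypothetical stand-in for the data matrix; the size is arbitrary.
X = randn(2000, 200)

qr(X)  # warm-up call so compilation time is not included in the timings

timings = Float64[]
for n in (1, 2, 4)
    BLAS.set_num_threads(n)   # number of threads the BLAS library may use
    t = @elapsed qr(X)        # wall-clock seconds for one factorization
    push!(timings, t)
    println("BLAS threads = ", n, ", qr time = ", t, " s")
end
```

The numbers will depend on your machine and BLAS build, so treat them only as a rough indication of where the time goes.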

On Friday, December 18, 2015 at 9:11:07 AM UTC+1, michae...@gmail.com wrote:
>
> I did profiling, and ProfileView.jl worked fine, and is definitely pretty
> slick. There were no surprises, though, the sections with the @time-ings
> are the ones that are costly.
>
> On Thursday, December 17, 2015 at 6:27:11 PM UTC+1, michae...@gmail.com
> wrote:
>>
>> I tried using the profiler with another problem a few months ago, and
>> ProfileView was not working for me then. I will give it another try.
>> However, the parts of the code that impact the timing are pretty narrowly
>> identified already. I have read the performance guide pretty carefully, and
>> I don't see how to improve the current code with its suggestions. I suspect
>> that trying to avoid using large arrays, and doing more with loops, might
>> help. That would be a change of strategy, though, rather than an
>> optimization of the current approach.
>>
>> On Thursday, December 17, 2015 at 3:54:07 PM UTC+1, Kristoffer Carlsson
>> wrote:
>>>
>>> Why haven't you tried to profile it? That's is the first thing that
>>> anyone that would try to help you would do. Use
>>> https://github.com/timholy/ProfileView.jl see what is slow and see if
>>> it is explained in the performance guide.
>>>
>>> Then you can ask a much better question, like "why is this statement"
>>> slow instead of posting a whole function and ask someone to optimize the
>>> whole thing.
>>>
>>