Could you please explain why the iterator version is so much faster? Is it
simply from avoiding temporary array allocation?
Thanks,
--Peter
On Friday, August 22, 2014 7:53:59 AM UTC-7, Rafael Fourquet wrote:
>
>> We'd like to eventually be able to do stream fusion to make the vectorized
>> version as efficient as the manually fused version, but for now there's a
>> performance gap.
>>
>
> It is also not too difficult to implement a fused version via iterators,
> e.g.:
>
> immutable iabs{X}
>     x::X
> end
>
> Base.start(i::iabs) = start(i.x)
> Base.next(i::iabs, s) = ((v, s) = next(i.x, s); (abs(v), s))
> Base.done(i::iabs, s) = done(i.x, s)
>
> Then sum(iabs(A)) is way faster than sum(abs(A)) (but still slightly
> slower than sumabs(A)).
>
>
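
For anyone reading this thread on a current Julia (1.x), here is a rough sketch of the same iterator idea. It assumes the newer syntax, where `immutable` became `struct` and `start`/`next`/`done` were replaced by a single `iterate` method. The type name `IAbs` and the sample array are illustrative only and not from the thread. The vectorized form builds a temporary array of absolute values before summing, while the iterator applies `abs` element by element as `sum` consumes it.

```julia
# Hypothetical wrapper (name is illustrative): lazily applies abs to each
# element of the wrapped iterable, so no temporary array is built.
struct IAbs{X}
    x::X
end

# Forward iteration to the wrapped object and transform each value.
function Base.iterate(it::IAbs, state...)
    r = iterate(it.x, state...)
    r === nothing && return nothing   # wrapped iterator is exhausted
    v, s = r
    return (abs(v), s)
end

Base.length(it::IAbs) = length(it.x)
# abs can change the element type (e.g. complex to real), so do not promise one.
Base.IteratorEltype(::Type{<:IAbs}) = Base.EltypeUnknown()

A = [-1.5, 2.0, -3.25]

fused      = sum(IAbs(A))    # iterator version: no temporary array
vectorized = sum(abs.(A))    # allocates a temporary array, then sums it
builtin    = sum(abs, A)     # current spelling of what sumabs(A) did

println((fused, vectorized, builtin))
```

All three calls compute the same value; they differ only in whether an intermediate array is allocated and in how the reduction loop is implemented.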