Perhaps a simpler solution would be to just assert that `evaluate` always
returns a Float64? You may even be able to remove the isa branches in that
case, but I'm not sure how the performance will compare.
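
Something along these lines is what I have in mind. It is only a sketch: the name `asserted` is made up for illustration, and I have not benchmarked it against `fast` or `pretty_fast`. It is written in current Julia syntax (`abstract type` / `struct`); on 0.4 the declarations would be `abstract Feature` and `type A <: Feature end`, as in the original post.

```julia
abstract type Feature end

struct A <: Feature end
evaluate(f::A) = 1.0

struct B <: Feature end
evaluate(f::B) = 0.0

# Same loop as `slow`, but the result of each call is asserted to be a
# Float64. The call may still be dispatched at run time, but `retval`
# keeps a concrete type, so the accumulation itself is type-stable.
# `asserted` is a hypothetical name, not something from the thread.
function asserted(features::Vector{Feature})
    retval = 0.0
    for i in 1:length(features)
        retval += evaluate(features[i])::Float64
    end
    retval
end

# Mix of both subtypes so that the element type really is abstract.
features = Feature[]
for i in 1:10000
    push!(features, isodd(i) ? A() : B())
end

asserted(features)        # warm up (compile) before timing
@time asserted(features)
```

Note that the assertion is checked at run time: if someone later defines an `evaluate` method that returns something other than a Float64, the loop will throw a TypeError instead of silently becoming slow.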

On Saturday, April 2, 2016 at 4:34:09 PM UTC-4, Cedric St-Jean wrote:
>
> Thank you for the detailed explanation. I tried it out:
>
> function pretty_fast(features::Vector{Feature})
>     retval = 0.0
>     for i in 1 : length(features)
>         if isa(features[i], A)
>             x = evaluate(features[i]::A)
>         elseif isa(features[i], B)
>             x = evaluate(features[i]::B)
>         else
>             x = evaluate(features[i])
>         end
>         retval += x
>     end
>     retval
> end
>
> On my laptop, fast runs in 10 microseconds, pretty_fast in 30, and slow in 
> 210.
>
> On Saturday, April 2, 2016 at 12:24:18 PM UTC-4, Yichao Yu wrote:
>>
>> On Sat, Apr 2, 2016 at 12:16 PM, Tim Wheeler <timwheel...@gmail.com> 
>> wrote: 
>> > Thank you for the comments. In my original code it means the difference
>> > between a 30 min execution with memory allocation in the gigabytes and a
>> > few seconds of execution with only 800 bytes using the second version.
>> > I thought that under the hood Julia basically runs those if statements
>> > anyway for its dispatch, and I don't know why it needs to allocate any
>> > memory.
>> > Having the if-statement workaround will be fine though.
>>
>> Well, if you have a lot of these cheap functions being dynamically
>> dispatched, I think it is not a good way to use the type. Depending on
>> your problem, you may be better off using an enum/flags/dict to
>> represent the type/get the values.
>>
>> The reason for the allocation is that the return type is unknown. This
>> should be easy to see if you check your code with code_warntype.
>>
>> > 
>> > On Saturday, April 2, 2016 at 7:26:11 AM UTC-7, Cedric St-Jean wrote: 
>> >> 
>> >> 
>> >>> Therefore there's no way the compiler can rewrite the slow version to
>> >>> the fast version.
>> >> 
>> >> 
>> >> It knows that the element type is a Feature, so it could produce: 
>> >> 
>> >> if isa(features[i], A) 
>> >>     retval += evaluate(features[i]::A) 
>> >> elseif isa(features[i], B) 
>> >>     retval += evaluate(features[i]::B) 
>> >> else 
>> >>     retval += evaluate(features[i]) 
>> >> end 
>> >> 
>> >> and it would make sense for abstract types that have few subtypes. I
>> >> didn't realize that dispatch was an order of magnitude slower than
>> >> type checking. It's easy enough to write a macro generating this
>> >> expansion, too.
>> >> 
>> >> On Saturday, April 2, 2016 at 2:05:20 AM UTC-4, Yichao Yu wrote: 
>> >>> 
>> >>> On Fri, Apr 1, 2016 at 9:56 PM, Tim Wheeler <timwheel...@gmail.com> 
>> >>> wrote: 
>> >>> > Hello Julia Users. 
>> >>> > 
>> >>> > I ran into a weird slowdown issue and reproduced a minimal working 
>> >>> > example. 
>> >>> > Maybe someone can help shed some light. 
>> >>> > 
>> >>> > abstract Feature 
>> >>> > 
>> >>> > type A <: Feature end 
>> >>> > evaluate(f::A) = 1.0 
>> >>> > 
>> >>> > type B <: Feature end 
>> >>> > evaluate(f::B) = 0.0 
>> >>> > 
>> >>> > function slow(features::Vector{Feature}) 
>> >>> >     retval = 0.0 
>> >>> >     for i in 1 : length(features) 
>> >>> >         retval += evaluate(features[i]) 
>> >>> >     end 
>> >>> >     retval 
>> >>> > end 
>> >>> > 
>> >>> > function fast(features::Vector{Feature}) 
>> >>> >     retval = 0.0 
>> >>> >     for i in 1 : length(features) 
>> >>> >         if isa(features[i], A) 
>> >>> >             retval += evaluate(features[i]::A) 
>> >>> >         else 
>> >>> >             retval += evaluate(features[i]::B) 
>> >>> >         end 
>> >>> >     end 
>> >>> >     retval 
>> >>> > end 
>> >>> > 
>> >>> > using ProfileView 
>> >>> > 
>> >>> > features = Feature[] 
>> >>> > for i in 1 : 10000 
>> >>> >     push!(features, A()) 
>> >>> > end 
>> >>> > 
>> >>> > slow(features) 
>> >>> > @time slow(features) 
>> >>> > fast(features) 
>> >>> > @time fast(features) 
>> >>> > 
>> >>> > The output is: 
>> >>> > 
>> >>> > 0.000136 seconds (10.15 k allocations: 166.417 KB) 
>> >>> > 0.000012 seconds (5 allocations: 176 bytes) 
>> >>> > 
>> >>> > 
>> >>> > This is a HUGE difference! Am I missing something big? Is there a
>> >>> > good way to inspect code to figure out where I am going wrong?
>> >>> 
>> >>> This is because of type instability, as you will find in the
>> >>> performance tips.
>> >>> Note that slow and fast are not equivalent, since the fast version
>> >>> only accepts `A` or `B` but the slow version accepts any subtype of
>> >>> Feature that you may ever define. Therefore there's no way the compiler
>> >>> can rewrite the slow version to the fast version.
>> >>> There are optimizations that can be applied to bring down the gap, but
>> >>> there'll always be a large difference between the two.
>> >>> 
>> >>> > 
>> >>> > 
>> >>> > Thank you in advance for any guidance. 
>> >>> > 
>> >>> > 
>> >>> > -Tim 
>> >>> > 
>> >>> > 
>> >>> > 
>> >>> > 
>> >>> > 
>>
>
