Re: [julia-users] needless loss of performance: immutable conflates two concepts

Stefan Karpinski Tue, 05 Aug 2014 15:53:33 -0700

It may not be immediately clear why the immutable/mutable distinction is
better for writing generic code than the more traditional value/reference
distinction. This is because mutable and immutable types share the same
semantics while reference types and value types have very different
semantics. You cannot pass a reference type to code that expects a value
type and have it work – or vice versa. It is perfectly fine, on the other
hand, to pass a mutable type to code that expects an immutable one, and as
long as it doesn't actually do any mutation, it's fine to pass an immutable
type to code that was written with a mutable type in mind. Thus, you can
easily write generic code in Julia that is applicable to both mutable and
immutable types. In languages that have reference and value types, you
cannot easily write generic code that is applicable to both – usually the
compiler won't even let you try.


On Tuesday, August 5, 2014, Jameson Nash <vtjn...@gmail.com> wrote:

> > This conflation causes a performance hit.
>
> You claim this, but it is not supported by your data. In fact, we saw the
> opposite:
>
> > testimmut.type_upd_fast()
> 0.618769232 seconds
>
> > testimmut.immut_upd_slow()
> 0.351634871
>
> Why? Perhaps related is that neither 1 nor 3 in your list of assumptions
> is necessarily true.
>
> 1. While semantically this statement is correct, LLVM is extremely good at
> this particular optimization. In fact, while C/C++ does not impose this
> restriction, llvm does! So while the C language spec does not require this
> restriction, it ends up that llvm (via the clang compiler) can do better at
> optimizing C code after adding this restriction.
>
> 3. While ref-counting is a form of garbage collection, it is not the form
> used by Julia. Julia uses a simple mark-and-sweep GC, currently (there is a
> pull request targeted for early 0.4 that will change this to an incremental
> mark-and-sweep GC). This is generally unrelated, as it is possible to have
> a boxed immutable type handled by the GC, whereas the reverse has not yet
> been implemented.
>
> So, while your suggestion that `type` and `immutable` are not the only way
> categories for data access, they were chosen for Julia as being simpler
> mental models than the traditional C distinction of structs and pointers
> (as well as being more amenable to garbage collection and generic
> programming).
>
>
> On Tue, Aug 5, 2014 at 5:38 PM, <vava...@uwaterloo.ca
> <javascript:_e(%7B%7D,'cvml','vava...@uwaterloo.ca');>> wrote:
>
>> Dear Julia users,
>>
>> It seems to me that Julia's distinction between a 'type' and an
>> 'immutable' conflates two independent properties; the consequence of this
>> conflation is a needless loss of performance.  In more detail, the
>> differences between a 'type' struct and 'immutable' struct in Julia are:
>>
>> 1. Assignment of 'type' struct copies a pointer; assignment of an
>> 'immutable' struct copies the data.
>>
>> 2. An array of type structs is an array of pointers, while an array of
>> immutables is an array of data.
>>
>> 3. Type structs are refcounted, whereas immutables are not.  (This is not
>> documented; it is my conjecture.)
>>
>> 4. Fields in type structs can be modified, but fields in immutables
>> cannot.
>>
>> Clearly #1-#3 are related concepts.  As far as I can see, #4 is
>> completely independent from #1-#3, and there is no obvious reason why it is
>> forbidden to modify fields in immutables.  There is no analogous
>> restriction in C/C++.
>>
>> This conflation causes a performance hit.  Consider:
>>
>> type floatbool
>>   a::Float64
>>   b:Bool
>> end
>>
>> If t is of type Array{floatbool,1} and I want to update the flag b in
>> t[10] to 'true', I say 't[10].b=true' (call this 'fast'update).  But if
>> instead of 'type floatbool' I had said 'immutable floatbool', then to set
>> flag b in t[10] I need the more complex code t[10] =
>> floatbool(t[10].a,true) (call this 'slow' update).
>>
>> To document the performance hit, I wrote five functions below. The first
>> three use 'type' and either no update, fast update, or slow update; the
>> last two use 'immutable' and either no update or slow update.   You can see
>> a HUGE hit on performance between slow and fast update for `type'; for
>> immutable there would presumably also be a difference, although apparently
>> smaller. (Obviously, I can't test fast update for immutable; this is the
>> point of my message!)
>>
>> So why does Julia impose this apparently needless restriction on
>> immutable?
>>
>> -- Steve Vavasis
>>
>>
>> julia> @time testimmut.type_upd_none()
>> @time testimmut.type_upd_none()
>> elapsed time: 0.141462422 seconds (48445152 bytes allocated)
>>
>> julia> @time testimmut.type_upd_fast()
>> @time testimmut.type_upd_fast()
>> elapsed time: 0.618769232 seconds (48247072 bytes allocated)
>>
>> julia> @time testimmut.type_upd_slow()
>> @time testimmut.type_upd_slow()
>> elapsed time: 4.511306586 seconds (4048268640 bytes allocated)
>>
>> julia> @time testimmut.immut_upd_none()
>> @time testimmut.immut_upd_none()
>> elapsed time: 0.04480173 seconds (32229468 bytes allocated)
>>
>> julia> @time testimmut.immut_upd_slow()
>> @time testimmut.immut_upd_slow()
>> elapsed time: 0.351634871 seconds (32000096 bytes allocated)
>>
>> module testimmut
>>
>> type xytype
>>     x::Int
>>     y::Float64
>>     z::Float64
>>     summed::Bool
>> end
>>
>> immutable xyimmut
>>     x::Int
>>     y::Float64
>>     z::Float64
>>     summed::Bool
>> end
>>
>> myfun(x) = x * (x + 1) * (x + 2)
>>
>> function type_upd_none()
>>     n = 1000000
>>     a = Array(xytype, n)
>>     for i = 1 : n
>>         a[i] = xytype(div(i,2), 0.0, 0.0, false)
>>     end
>>     numtri = 100
>>     for tri = 1 : numtri
>>         sum = 0
>>         for i = 1 : n
>>             @inbounds x = a[i].x
>>             sum += myfun(x)
>>         end
>>     end
>> end
>>
>>
>> function type_upd_fast()
>>     n = 1000000
>>     a = Array(xytype, n)
>>     for i = 1 : n
>>         a[i] = xytype(div(i,2),  0.0, 0.0, false)
>>     end
>>     numtri = 100
>>     for tri = 1 : numtri
>>         sum = 0
>>         for i = 1 : n
>>             @inbounds x = a[i].x
>>             sum += myfun(x)
>>             @inbounds a[i].summed = true
>>         end
>>     end
>> end
>>
>> function type_upd_slow()
>>     n = 1000000
>>     a = Array(xytype, n)
>>     for i = 1 : n
>>         a[i] = xytype(div(i,2),  0.0, 0.0, false)
>>     end
>>     numtri = 100
>>     for tri = 1 : numtri
>>         sum = 0
>>         for i = 1 : n
>>             @inbounds x = a[i].x
>>             sum += myfun(x)
>>             @inbounds a[i] = xytype(a[i].x, a[i].y, a[i].z, true)
>>         end
>>     end
>> end
>>
>>
>> function immut_upd_none()
>>     n = 1000000
>>     a = Array(xyimmut, n)
>>     for i = 1 : n
>>         a[i] = xyimmut(div(i,2),  0.0, 0.0, false)
>>     end
>>     numtri = 100
>>     for tri = 1 : numtri
>>         sum = 0
>>         for i = 1 : n
>>             @inbounds x = a[i].x
>>             sum += myfun(x)
>>         end
>>     end
>> end
>>
>> function immut_upd_slow()
>>     n = 1000000
>>     a = Array(xyimmut, n)
>>     for i = 1 : n
>>         a[i] = xyimmut(div(i,2),  0.0, 0.0, false)
>>     end
>>     numtri = 100
>>     for tri = 1 : numtri
>>         sum = 0
>>         for i = 1 : n
>>             @inbounds x = a[i].x
>>             sum += myfun(x)
>>             @inbounds a[i] = xyimmut(a[i].x, a[i].y, a[i].z, true)
>>         end
>>     end
>> end
>>
>> end
>>
>>
>>
>>
>
>

Re: [julia-users] needless loss of performance: immutable conflates two concepts

Reply via email to