Re: [julia-users] Incrementing integer slow?
Holy smoky, this loop flies now! Thanks so much to everyone who contributed to help, awesome feedback. Following are a few comments from my further tests with enclosed my current fastest version: - In my script on the global scope, I can define a type, but I cannot defined it const. So: - GlobalIdType = Int works but - const GlobalIdType = Int errors out. - I can define a type const within a function definition, but it does not seem to have any performance benefit. - As Stefan explained, the performance hit with my initial integer increment i += 1 was not because it was not typed but because it was not considered a constant type by the compiler. I decoupled my array index types with their column id types, declaring that array index type inside my function definition, and those increments pretty much disappeared from the profile report. I did not have to declare explicitly those index types, the default Int would have worked just fine, I was just experimenting to make sure I did not misunderstand anything. - Then, again using Stefan's hint, the next bottleneck was the array accesses. I added the types of interest as parameters of my function (i.e., template function in the C++ world), annotated the array types accordingly, and here we go, another order of magnitude of performance gain! Using @code_typed shows all variables are typed, no Any shows up. - Now, the slowest part is no longer the loop being iterated over about 100,000 times in my test, but the initial array allocation of the results being returned by the function -- quite amazing the efficiency of the core iteration! This was a very educational exercise for me, this will help me write code appropriately typed and fast from the beginning. I have a *remaining question* on how to best pass those types in a function signature where my main data container is a DataFrame. Essentially, I functionally have: function foo(dt::DataFrame) a = dt[:col].data # An array as column of my DataFrame, data with no NA. const MyDataType = eltype(a) ... end As specified above, access is slow because the array a if of type Array{Any} at compile time, so not optimal by an order of magnitude. What I essentially did to type and hence speed up the array access was passing the array data type parameter as follows: function foo(dt::DataFrame, dummyParameter::MyDataType) a = dt[:col].data::Array{MyDataType} # An array as column of my DataFrame, data with no NA. ... end But it is not clean, the data type MyDataType is redundant as embedded in the data frame parameter dt. Is there a way to annotate directly the data frame parameter dt with its column types so I don't have to create a dummy function parameter for this purpose? If so, how? If my data frame has N columns with my data type of interest in its first column, I am looking for something conceptually like below (syntax below does not work): function foo(dt::*DataFrame{**MyDataType, ...}*) a = dt[:col].data::Array{MyDataType} # An array as column of my DataFrame, data with no NA. ... end Note that the following pattern does not solve the array type inference problem (i.e., it is slow), see Generic performance problems, #523 by John Myles White https://github.com/JuliaStats/DataFrames.jl/issues/523 (where I initially learned to go access the data array directly for optimal performance): function foo(dt::DataFrame) da = dt[:col] const MyDataType = eltype(a) a = dt[:col].data::Array{MyDataType} # An array as column of my DataFrame, data with no NA. ... end (Perhaps I should ask this question in a different post as it is beyond the scope of my initial increment performance question) Thanks much. On Thursday, September 18, 2014 4:25:25 PM UTC-4, Stefan Karpinski wrote: On Thu, Sep 18, 2014 at 2:33 PM, Ivar Nesje iva...@gmail.com javascript: wrote: Seems like Julia is not smart enough to guess that i::IdType will always ensure that i is a Int64 when IdType might change. Since IdType is not constant, it can change at any time – generating code on the premise that it cannot change would be incorrect. Instead, the code generated for this function needs to handle the possibility that IdType can change at any point, which basically makes all optimizations impossible. It's a bit low level (we should really expose this better), but you can use @code_typed to see what the inferred types of all the local variables are: julia (@code_typed manual_iter1(1,2))[1].args[2][2] 38-element Array{Any,1}: {:dt1,Int64,0} {:dt2,Int64,0} {:zeroIdType,Any,18} {:oneIdType,Any,18} {:zeroDType,Any,18} {symbol(#s5149),Any,18} {symbol(#s5150),Any,18} {:dt1id1,Any,18} {:dt1D,Any,18} {symbol(#s5151),Any,18} {symbol(#s5152),Any,18} {symbol(#s5153),Any,18} {:dt2id2,Any,18} {:dt2D1,Any,18} {:dt2D2,Any,18} {:MAX_INT,Any,18}
Re: [julia-users] Incrementing integer slow?
1 has type Int. If you add it to something with a different type, you might be causing type instability. What happens if you replace the literal 1 with one(T) for the type you're working with? -- John On Sep 18, 2014, at 9:56 AM, G. Patrick Mauroy gpmau...@gmail.com wrote: Profiling shows incrementing integers by 1 (i += 1) being the bottleneck. Within the same loop are other statements that do take much less time. In my performance optimizing zeal, I over typed the hell out of everything to attempt squeezing performance to the last once. Some of this zeal did help in other parts of the code, but now struggling making sense at spending most of the time incrementing by 1. I suspect the problem is over typing zeal because I seem to recall having a version not so strongly typed that ran consistently 2-3 times faster for default Int (but not for other Int types). It was late at night so I don't recall the details! I am pretty confident the increment variables are typed so there should not be any undue cast. Any idea? Here is how my code conceptually looks like: # Global static type declaration ahead seems to have helped (as opposed to deriving from eltype of underlying array at the beginning of function being profiled). IdType = Int # Int64 DType = Int function my_fct(dt1, dt2) # Convert is for sure unnecessary for default Int types but more rigorous and necessary in some parts of code when experimenting with other IdType DType types. const oneIdType = convert(IdType, 1) # Used to make sure I increment with a value of the proper type, again useless with IdType = Int. const zeroIdType = convert(IdType, 0) i::IdType = zeroIdType; i2Match::IdType = zeroIdType; i2Lower::IdType = zeroIdType; i2Upper::IdType = oneIdType; ... # Critical loop. i2Match = i2Lower while i2Match i2Upper @inbounds i2MatchD2 = dt2D2[i2Match] if i1D = i2MatchD2 i += oneIdType # SLOW! @inbounds i2MatchD1 = dt2D1[i2Match] @inbounds resid1[i] = i1id1 ... end i2Match += oneIdType # SLOW! end ... end The undeclared types are 1-dim arrays of the appropriate type -- basically all Int in this configuration. Enclosed is the full stand-alone code if anyone cares to try. On my machines, one function call is in the range of 0.05 to 0.1 sec, highly depending upon garbage collection, so profiling with 100 runs is done in about 10 sec. Thanks. Patrick crossJoinFilter.jl
Re: [julia-users] Incrementing integer slow?
No change. I over typed everything to avoid such type mismatches, particularly when experimenting with other integer types. So unless I missed something somewhere, it should not be the case. I suspect something like the compiler does not recognize the incrementing variables should be registries. Unless it is the inherent speed of incrementing, but I doubt it, I had some faster runs at some points... On Thursday, September 18, 2014 12:58:12 PM UTC-4, John Myles White wrote: 1 has type Int. If you add it to something with a different type, you might be causing type instability. What happens if you replace the literal 1 with one(T) for the type you're working with? -- John On Sep 18, 2014, at 9:56 AM, G. Patrick Mauroy gpma...@gmail.com javascript: wrote: Profiling shows incrementing integers by 1 (i += 1) being the bottleneck. Within the same loop are other statements that do take much less time. In my performance optimizing zeal, I over typed the hell out of everything to attempt squeezing performance to the last once. Some of this zeal did help in other parts of the code, but now struggling making sense at spending most of the time incrementing by 1. I suspect the problem is over typing zeal because I seem to recall having a version not so strongly typed that ran consistently 2-3 times faster for default Int (but not for other Int types). It was late at night so I don't recall the details! I am pretty confident the increment variables are typed so there should not be any undue cast. Any idea? Here is how my code conceptually looks like: # Global static type declaration ahead seems to have helped (as opposed to deriving from eltype of underlying array at the beginning of function being profiled). IdType = Int # Int64 DType = Int function my_fct(dt1, dt2) # Convert is for sure unnecessary for default Int types but more rigorous and necessary in some parts of code when experimenting with other IdType DType types. const oneIdType = convert(IdType, 1) # Used to make sure I increment with a value of the proper type, again useless with IdType = Int. const zeroIdType = convert(IdType, 0) i::IdType = zeroIdType; i2Match::IdType = zeroIdType; i2Lower::IdType = zeroIdType; i2Upper::IdType = oneIdType; ... # Critical loop. i2Match = i2Lower while i2Match i2Upper @inbounds i2MatchD2 = dt2D2[i2Match] if i1D = i2MatchD2 i += oneIdType # SLOW! @inbounds i2MatchD1 = dt2D1[i2Match] @inbounds resid1[i] = i1id1 ... end i2Match += oneIdType # SLOW! end ... end The undeclared types are 1-dim arrays of the appropriate type -- basically all Int in this configuration. Enclosed is the full stand-alone code if anyone cares to try. On my machines, one function call is in the range of 0.05 to 0.1 sec, highly depending upon garbage collection, so profiling with 100 runs is done in about 10 sec. Thanks. Patrick crossJoinFilter.jl
Re: [julia-users] Incrementing integer slow?
Try running with --track-allocation=user and see if it's allocating memory on that line. If so, you have a type problem. http://docs.julialang.org/en/latest/manual/performance-tips/ (2nd and 3rd sections) --Tim On Thursday, September 18, 2014 10:44:58 AM G. Patrick Mauroy wrote: No change. I over typed everything to avoid such type mismatches, particularly when experimenting with other integer types. So unless I missed something somewhere, it should not be the case. I suspect something like the compiler does not recognize the incrementing variables should be registries. Unless it is the inherent speed of incrementing, but I doubt it, I had some faster runs at some points... On Thursday, September 18, 2014 12:58:12 PM UTC-4, John Myles White wrote: 1 has type Int. If you add it to something with a different type, you might be causing type instability. What happens if you replace the literal 1 with one(T) for the type you're working with? -- John On Sep 18, 2014, at 9:56 AM, G. Patrick Mauroy gpma...@gmail.com javascript: wrote: Profiling shows incrementing integers by 1 (i += 1) being the bottleneck. Within the same loop are other statements that do take much less time. In my performance optimizing zeal, I over typed the hell out of everything to attempt squeezing performance to the last once. Some of this zeal did help in other parts of the code, but now struggling making sense at spending most of the time incrementing by 1. I suspect the problem is over typing zeal because I seem to recall having a version not so strongly typed that ran consistently 2-3 times faster for default Int (but not for other Int types). It was late at night so I don't recall the details! I am pretty confident the increment variables are typed so there should not be any undue cast. Any idea? Here is how my code conceptually looks like: # Global static type declaration ahead seems to have helped (as opposed to deriving from eltype of underlying array at the beginning of function being profiled). IdType = Int # Int64 DType = Int function my_fct(dt1, dt2) # Convert is for sure unnecessary for default Int types but more rigorous and necessary in some parts of code when experimenting with other IdType DType types. const oneIdType = convert(IdType, 1) # Used to make sure I increment with a value of the proper type, again useless with IdType = Int. const zeroIdType = convert(IdType, 0) i::IdType = zeroIdType; i2Match::IdType = zeroIdType; i2Lower::IdType = zeroIdType; i2Upper::IdType = oneIdType; ... # Critical loop. i2Match = i2Lower while i2Match i2Upper @inbounds i2MatchD2 = dt2D2[i2Match] if i1D = i2MatchD2 i += oneIdType # SLOW! @inbounds i2MatchD1 = dt2D1[i2Match] @inbounds resid1[i] = i1id1 ... end i2Match += oneIdType # SLOW! end ... end The undeclared types are 1-dim arrays of the appropriate type -- basically all Int in this configuration. Enclosed is the full stand-alone code if anyone cares to try. On my machines, one function call is in the range of 0.05 to 0.1 sec, highly depending upon garbage collection, so profiling with 100 runs is done in about 10 sec. Thanks. Patrick crossJoinFilter.jl
Re: [julia-users] Incrementing integer slow?
Ah ah, this is it, I found the culprit! Pre-declaring the type of my increment variables slowed down by a factor of at least 2 if not 3 -- I will profile tonight when I can be on Linux, I cannot now from Windows. i::IdType = zeroIdType # Slow increment in the loop i = zeroIdType # Much faster increment in the loop even through zeroIdType is of type IdType = Int = Int64. It smells more like explicitly typing the variables prevents the compiler from using it as a registry or something like that. I will let the experts explain what really happens. By the way, in my endeavors, I got the impression the loop construct for i in 1:n was slower than while i =n But I need to run further tests to determine whether it is indeed so or not. On Thursday, September 18, 2014 1:44:58 PM UTC-4, G. Patrick Mauroy wrote: No change. I over typed everything to avoid such type mismatches, particularly when experimenting with other integer types. So unless I missed something somewhere, it should not be the case. I suspect something like the compiler does not recognize the incrementing variables should be registries. Unless it is the inherent speed of incrementing, but I doubt it, I had some faster runs at some points...
Re: [julia-users] Incrementing integer slow?
Type instabilities is often solved with the `const` keyword on some global variable. I would try const IdType = Int Seems like Julia is not smart enough to guess that i::IdType will always ensure that i is a Int64 when IdType might change. Regards Ivar kl. 19:58:02 UTC+2 torsdag 18. september 2014 skrev G. Patrick Mauroy følgende: Ah ah, this is it, I found the culprit! Pre-declaring the type of my increment variables slowed down by a factor of at least 2 if not 3 -- I will profile tonight when I can be on Linux, I cannot now from Windows. i::IdType = zeroIdType # Slow increment in the loop i = zeroIdType # Much faster increment in the loop even through zeroIdType is of type IdType = Int = Int64. It smells more like explicitly typing the variables prevents the compiler from using it as a registry or something like that. I will let the experts explain what really happens. By the way, in my endeavors, I got the impression the loop construct for i in 1:n was slower than while i =n But I need to run further tests to determine whether it is indeed so or not. On Thursday, September 18, 2014 1:44:58 PM UTC-4, G. Patrick Mauroy wrote: No change. I over typed everything to avoid such type mismatches, particularly when experimenting with other integer types. So unless I missed something somewhere, it should not be the case. I suspect something like the compiler does not recognize the incrementing variables should be registries. Unless it is the inherent speed of incrementing, but I doubt it, I had some faster runs at some points...
Re: [julia-users] Incrementing integer slow?
On Thu, Sep 18, 2014 at 2:33 PM, Ivar Nesje iva...@gmail.com wrote: Seems like Julia is not smart enough to guess that i::IdType will always ensure that i is a Int64 when IdType might change. Since IdType is not constant, it can change at any time – generating code on the premise that it cannot change would be incorrect. Instead, the code generated for this function needs to handle the possibility that IdType can change at any point, which basically makes all optimizations impossible. It's a bit low level (we should really expose this better), but you can use @code_typed to see what the inferred types of all the local variables are: julia (@code_typed manual_iter1(1,2))[1].args[2][2] 38-element Array{Any,1}: {:dt1,Int64,0} {:dt2,Int64,0} {:zeroIdType,Any,18} {:oneIdType,Any,18} {:zeroDType,Any,18} {symbol(#s5149),Any,18} {symbol(#s5150),Any,18} {:dt1id1,Any,18} {:dt1D,Any,18} {symbol(#s5151),Any,18} {symbol(#s5152),Any,18} {symbol(#s5153),Any,18} {:dt2id2,Any,18} {:dt2D1,Any,18} {:dt2D2,Any,18} {:MAX_INT,Any,18} {:MAX_D,Any,18} {:nrow1,Any,18} {:nrow2,Any,18} {:i1,Any,2} {:i1id1,Any,2} {:i1D,Any,2} {:i2Lower,Any,2} {:i2LowerD2,Any,2} {:i2Upper,Any,2} {:i2UpperD1,Any,2} {:i2UpperD2,Any,2} {:i2Match,Any,2} {:i,Any,2} {:nrowMax,Any,18} {:resid1,Any,18} {:resid2,Any,18} {:resD1,Any,18} {:resD,Any,18} {:resD2,Any,18} {:i2MatchD2,Any,18} {:i2MatchD1,Any,18} {symbol(#s5154),Any,18} As you can see, it's not pretty: given Int arguments, literally only the types of the arguments themselves are more specific than Any. If you declare IdType and DType to be const, it gets better, but still not ideal: julia (@code_typed manual_iter1(1,2))[1].args[2][2] 38-element Array{Any,1}: {:dt1,Int64,0} {:dt2,Int64,0} {:zeroIdType,Int64,18} {:oneIdType,Int64,18} {:zeroDType,Int64,18} {symbol(#s122),Any,18} {symbol(#s121),Any,18} {:dt1id1,Any,18} {:dt1D,Any,18} {symbol(#s120),Any,18} {symbol(#s119),Any,18} {symbol(#s118),Any,18} {:dt2id2,Any,18} {:dt2D1,Any,18} {:dt2D2,Any,18} {:MAX_INT,Int64,18} {:MAX_D,Int64,18} {:nrow1,Int64,18} {:nrow2,Int64,18} {:i1,Int64,2} {:i1id1,Int64,2} {:i1D,Int64,2} {:i2Lower,Int64,2} {:i2LowerD2,Int64,2} {:i2Upper,Int64,2} {:i2UpperD1,Int64,2} {:i2UpperD2,Int64,2} {:i2Match,Int64,2} {:i,Int64,2} {:nrowMax,Int64,18} {:resid1,Any,18} {:resid2,Any,18} {:resD1,Any,18} {:resD,Any,18} {:resD2,Any,18} {:i2MatchD2,Any,18} {:i2MatchD1,Any,18} {symbol(#s117),Any,18} When you pull something out of an untyped structure like dt1id1, dt1D, etc. it is a good idea to provide some type annotations if you can. Most of the other type annotations inside the function body are unnecessary, however.