Yeah, that's true. I didn't read the IR carefully enough. Laszlo, are you on the latest Julia? I worry that it's hard to make comparisons if you're running an older version of Julia.
 -- John

On Mar 28, 2014, at 8:18 AM, Stefan Karpinski <ste...@karpinski.org> wrote:

> Perhaps I should have said "isomorphic" – the only differences there are
> names. It's more obvious that the native code is the same – only the source
> line annotations are different at all.
>
>
> On Fri, Mar 28, 2014 at 11:16 AM, John Myles White <johnmyleswh...@gmail.com>
> wrote:
> On my system, the two functions produce different LLVM IR:
>
> julia> code_llvm(f1, ())
>
> define i64 @julia_f115727() {
> top:
>   %0 = call i64 @julia_power_by_squaring1373(i64 10, i64 8), !dbg !726
>   %1 = icmp slt i64 %0, 1, !dbg !726
>   br i1 %1, label %L2, label %if, !dbg !726
>
> if:                                               ; preds = %top, %if
>   %j.04 = phi i64 [ %3, %if ], [ 1, %top ]
>   %k.03 = phi i64 [ %4, %if ], [ 1, %top ]
>   %2 = and i64 %k.03, 1, !dbg !727
>   %3 = add i64 %j.04, %2, !dbg !727
>   %4 = add i64 %k.03, 1, !dbg !728
>   %5 = call i64 @julia_power_by_squaring1373(i64 10, i64 8), !dbg !726
>   %6 = icmp sgt i64 %4, %5, !dbg !726
>   br i1 %6, label %L2, label %if, !dbg !726
>
> L2:                                               ; preds = %if, %top
>   %j.0.lcssa = phi i64 [ 1, %top ], [ %3, %if ]
>   ret i64 %j.0.lcssa, !dbg !729
> }
>
> julia> code_llvm(f2, ())
>
> define i64 @julia_f215728() {
> top:
>   %0 = call i64 @julia_power_by_squaring1373(i64 10, i64 8), !dbg !729
>   %1 = icmp slt i64 %0, 1, !dbg !729
>   br i1 %1, label %L6, label %L3, !dbg !729
>
> L3:                                               ; preds = %top, %L3
>   %j.08 = phi i64 [ %3, %L3 ], [ 1, %top ]
>   %k.07 = phi i64 [ %4, %L3 ], [ 1, %top ]
>   %2 = and i64 %k.07, 1, !dbg !730
>   %3 = add i64 %j.08, %2, !dbg !730
>   %4 = add i64 %k.07, 1, !dbg !731
>   %5 = call i64 @julia_power_by_squaring1373(i64 10, i64 8), !dbg !729
>   %6 = icmp slt i64 %5, %4, !dbg !729
>   br i1 %6, label %L6, label %L3, !dbg !729
>
> L6:                                               ; preds = %L3, %top
>   %j.0.lcssa = phi i64 [ 1, %top ], [ %3, %L3 ]
>   ret i64 %j.0.lcssa, !dbg !732
> }
>
> But the performance is identical or slightly in favor of f1.
>
>  -- John
>
> On Mar 28, 2014, at 8:02 AM, Stefan Karpinski <ste...@karpinski.org> wrote:
>
> > Both ways of writing a while loop should be the same. If you're seeing a
> > difference, something else is going on. I'm not able to reproduce this:
> >
> > function f1()
> >     j = k = 1
> >     while k <= 10^8
> >         j += k & 1
> >         k += 1
> >     end
> >     return j
> > end
> >
> > function f2()
> >     j = k = 1
> >     while true
> >         k <= 10^8 || break
> >         j += k & 1
> >         k += 1
> >     end
> >     return j
> > end
> >
> > function f3()
> >     j = k = 1
> >     while true
> >         k > 10^8 && break
> >         j += k & 1
> >         k += 1
> >     end
> >     return j
> > end
> >
> > julia> @time f1()
> > elapsed time: 0.644661304 seconds (64 bytes allocated)
> > 50000001
> >
> > julia> @time f2()
> > elapsed time: 0.640951585 seconds (64 bytes allocated)
> > 50000001
> >
> > julia> @time f3()
> > elapsed time: 0.639177183 seconds (64 bytes allocated)
> > 50000001
> >
> > All three functions produce identical native code. Can you send exactly
> > what your function definitions are, how you're timing them and perhaps the
> > output of code_native(f1,())?
> >
> >
> > On Fri, Mar 28, 2014 at 10:48 AM, Laszlo Hars <laszloh...@gmail.com> wrote:
> > Thanks, John, for your replies. In my system your code gives reliable
> > results, too, if we increase the loop limits to 10^9:
> >
> > julia> mean(t1s ./ t2s)
> > 11.924373323658703
> >
> > This 12% makes a significant difference in my function of nested loops
> > (could add up to a factor of 2 slowdown). So, the question remains:
> >
> > - what is the fastest coding of a while loop?
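
For anyone repeating the comparison discussed above, here is a minimal sketch of a timing harness. It is not the code anyone in the thread actually ran. The loop limit is passed in as a parameter `n` and the medians of ten runs are compared; both are assumptions of this sketch. It is written for a current Julia release, where `median` lives in the `Statistics` standard library.

```julia
using Statistics   # provides `median` on Julia 0.7 and later

versioninfo()      # record the Julia version next to any timings

# The two loop spellings from the thread. The limit `n` is a parameter here
# (an assumption of this sketch); the thread's versions hard-code 10^8.
function f1(n)
    j = k = 1
    while k <= n
        j += k & 1
        k += 1
    end
    return j
end

function f2(n)
    j = k = 1
    while true
        k <= n || break
        j += k & 1
        k += 1
    end
    return j
end

n = 10^8
@assert f1(n) == f2(n) == 50000001   # same result as reported in the thread

f1(n); f2(n)                          # warm up so compilation is not timed
t1s = [@elapsed f1(n) for _ in 1:10]
t2s = [@elapsed f2(n) for _ in 1:10]

# A ratio near 1 means the two spellings cost the same. A newer compiler may
# simplify the loop heavily, in which case both timings are tiny and the
# ratio is mostly noise.
println("median f1 / median f2 = ", median(t1s) / median(t2s))
```

Comparing the output of `code_native` for the two functions, as suggested in the thread, remains the more direct check. Timings only tell you whether a difference in the generated code matters in practice.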