Perhaps I should have said "isomorphic": the only differences there are the
names. With the native code it's even more obvious that the two are the same,
since only the source-line annotations differ.
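
A quick way to see that, as a rough sketch (it assumes a Julia where code_llvm
accepts an IO argument so sprint can capture the dump; the regex is just a
crude filter, not anything from Base):

# Capture each dump and drop the !dbg source-location metadata, so that only
# the renamed labels/values are left to eyeball.
strip_dbg(ir) = replace(ir, r", !dbg !\d+" => "")

ir1 = strip_dbg(sprint(code_llvm, f1, ()))
ir2 = strip_dbg(sprint(code_llvm, f2, ()))

# Line the two dumps up side by side; what remains should just be label/value
# renumbering (if/L2 vs L3/L6, %j.04 vs %j.08, ...) and the commuted form of
# the loop-exit comparison.
for (a, b) in zip(split(ir1, '\n'), split(ir2, '\n'))
    println(rpad(a, 60), b)
end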


On Fri, Mar 28, 2014 at 11:16 AM, John Myles White <johnmyleswh...@gmail.com
> wrote:

> On my system, the two functions produce different LLVM IR:
>
> julia> code_llvm(f1, ())
>
> define i64 @julia_f115727() {
> top:
>   %0 = call i64 @julia_power_by_squaring1373(i64 10, i64 8), !dbg !726
>   %1 = icmp slt i64 %0, 1, !dbg !726
>   br i1 %1, label %L2, label %if, !dbg !726
>
> if:                                               ; preds = %top, %if
>   %j.04 = phi i64 [ %3, %if ], [ 1, %top ]
>   %k.03 = phi i64 [ %4, %if ], [ 1, %top ]
>   %2 = and i64 %k.03, 1, !dbg !727
>   %3 = add i64 %j.04, %2, !dbg !727
>   %4 = add i64 %k.03, 1, !dbg !728
>   %5 = call i64 @julia_power_by_squaring1373(i64 10, i64 8), !dbg !726
>   %6 = icmp sgt i64 %4, %5, !dbg !726
>   br i1 %6, label %L2, label %if, !dbg !726
>
> L2:                                               ; preds = %if, %top
>   %j.0.lcssa = phi i64 [ 1, %top ], [ %3, %if ]
>   ret i64 %j.0.lcssa, !dbg !729
> }
>
> julia> code_llvm(f2, ())
>
> define i64 @julia_f215728() {
> top:
>   %0 = call i64 @julia_power_by_squaring1373(i64 10, i64 8), !dbg !729
>   %1 = icmp slt i64 %0, 1, !dbg !729
>   br i1 %1, label %L6, label %L3, !dbg !729
>
> L3:                                               ; preds = %top, %L3
>   %j.08 = phi i64 [ %3, %L3 ], [ 1, %top ]
>   %k.07 = phi i64 [ %4, %L3 ], [ 1, %top ]
>   %2 = and i64 %k.07, 1, !dbg !730
>   %3 = add i64 %j.08, %2, !dbg !730
>   %4 = add i64 %k.07, 1, !dbg !731
>   %5 = call i64 @julia_power_by_squaring1373(i64 10, i64 8), !dbg !729
>   %6 = icmp slt i64 %5, %4, !dbg !729
>   br i1 %6, label %L6, label %L3, !dbg !729
>
> L6:                                               ; preds = %L3, %top
>   %j.0.lcssa = phi i64 [ 1, %top ], [ %3, %L3 ]
>   ret i64 %j.0.lcssa, !dbg !732
> }
>
> But the performance is identical or slightly in favor of f1.
>
>  -- John
>
> On Mar 28, 2014, at 8:02 AM, Stefan Karpinski <ste...@karpinski.org>
> wrote:
>
> > Both ways of writing a while loop should be the same. If you're seeing a
> difference, something else is going on. I'm not able to reproduce this:
> >
> > function f1()
> >   j = k = 1
> >   while k <= 10^8
> >     j += k & 1
> >     k += 1
> >   end
> >   return j
> > end
> >
> > function f2()
> >   j = k = 1
> >   while true
> >     k <= 10^8 || break
> >     j += k & 1
> >     k += 1
> >   end
> >   return j
> > end
> >
> > function f3()
> >   j = k = 1
> >   while true
> >     k > 10^8 && break
> >     j += k & 1
> >     k += 1
> >   end
> >   return j
> > end
> >
> > julia> @time f1()
> > elapsed time: 0.644661304 seconds (64 bytes allocated)
> > 50000001
> >
> > julia> @time f2()
> > elapsed time: 0.640951585 seconds (64 bytes allocated)
> > 50000001
> >
> > julia> @time f3()
> > elapsed time: 0.639177183 seconds (64 bytes allocated)
> > 50000001
> >
> > All three functions produce identical native code. Can you send your exact
> > function definitions, how you're timing them, and perhaps the output of
> > code_native(f1, ())?
> >
> >
> > On Fri, Mar 28, 2014 at 10:48 AM, Laszlo Hars <laszloh...@gmail.com>
> wrote:
> > Thanks, John, for your replies. On my system your code gives reliable
> > results, too, if we increase the loop limit to 10^9:
> >
> > julia> mean(t1s ./ t2s)
> > 11.924373323658703
> >
> > This 12% makes a significant difference in my function with nested loops
> > (it could add up to a factor-of-2 slowdown). So the question remains:
> >
> > - what is the fastest coding of a while loop?
> >
>
>
