On Sunday, 13 August 2017 at 09:08:14 UTC, Petar Kirov
[ZombineDev] wrote:
There's one especially interesting result:
This instantiation:
sum_subranges(std.range.iota!(int, int).iota(int, int).Result,
uint)
of the following function:
auto sum_subranges(T)(T input, uint range)
{
    import std.range : chunks, ElementType, array;
    import std.algorithm : map, sum;
    return input.chunks(range).map!(sum);
}
gets optimized with LDC to:
push rax
test edi, edi
je .LBB2_2
mov edx, edi
mov rax, rsi
pop rcx
ret
.LBB2_2:
lea rsi, [rip + .L.str.3]
lea rcx, [rip + .L.str]
mov edi, 45
mov edx, 89
mov r8d, 6779
call _d_assert_msg@PLT
I.e. the compiler turned an O(n) algorithm into an O(1) one, which
is quite neat. I'm also quite surprised that even dmd appears to
have managed a similar optimization:
sum_subranges(std.range.iota!(int, int).iota(int, int).Result,
uint):
push rbp
mov rbp,rsp
sub rsp,0x30
mov DWORD PTR [rbp-0x8],edi
mov r9d,DWORD PTR [rbp-0x8]
test r9,r9
jne 41
mov r8d,0x1b67
mov ecx,0x0
mov eax,0x61
mov rdx,rax
mov QWORD PTR [rbp-0x28],rdx
mov edx,0x0
mov edi,0x2d
mov rsi,rdx
mov rdx,QWORD PTR [rbp-0x28]
call 41
41: mov QWORD PTR [rbp-0x20],rsi
mov QWORD PTR [rbp-0x18],r9
mov rdx,QWORD PTR [rbp-0x18]
mov rax,QWORD PTR [rbp-0x20]
mov rsp,rbp
pop rbp
ret
Moral of the story: templates + ranges are an awesome
combination.
If you change the array size parameter so that it is read from
stdin, I assume that these optimizations will go away.
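
To make that experiment concrete, here is a minimal sketch of what I have in mind. The input format (element count on the first line of stdin, chunk size on the second) and the names `n` and `chunkSize` are my own assumptions, not something from the benchmark. Note that `chunks` and `map` are lazy, so the sketch calls `.array` to force the sums to actually be computed; without that, the function only builds a range object. I have not checked what either compiler generates for this version.

```d
import std.algorithm : map, sum;
import std.array : array;
import std.conv : to;
import std.range : chunks, iota;
import std.stdio : readln, writeln;
import std.string : strip;

// Same function as in the quoted post, with `sum` imported at module level.
auto sum_subranges(T)(T input, uint range)
{
    return input.chunks(range).map!(sum);
}

void main()
{
    // Assumed input format: element count, then chunk size, one per line.
    int n = readln().strip.to!int;
    uint chunkSize = readln().strip.to!uint;

    // `.array` eagerly evaluates the lazy range, so the sums are really computed
    // from values the compiler cannot know at compile time.
    auto sums = sum_subranges(iota(0, n), chunkSize).array;
    writeln(sums);
}
```

Feeding it the two lines `10` and `3` should print `[3, 12, 21, 9]`. Compiling this with `ldc2 -O3 -output-s` or disassembling the dmd object file would then show whether the summation loop survives.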