On Dec 5, 2010, at 9:49 AM, Chris Lattner wrote:

>
> On Dec 5, 2010, at 3:19 AM, Richard Guenther wrote:
>
>>> $ clang t.cc -S -o - -O3 -mkernel -fomit-frame-pointer -mllvm -show-mc-encoding
>>>     .section    __TEXT,__text,regular,pure_instructions
>>>     .globl  __Z4testl
>>>     .align  4, 0x90
>>> __Z4testl:                      ## @_Z4testl
>>> ## BB#0:                        ## %entry
>>>     movl    $4, %ecx            ## encoding: [0xb9,0x04,0x00,0x00,0x00]
>>>     movq    %rdi, %rax          ## encoding: [0x48,0x89,0xf8]
>>>     mulq    %rcx                ## encoding: [0x48,0xf7,0xe1]
>>>     movq    $-1, %rdi           ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
>>>     cmovnoq %rax, %rdi          ## encoding: [0x48,0x0f,0x41,0xf8]
>>>     jmp     __Znam              ## TAILCALL
>>>                                 ## encoding: [0xeb,A]
>>>                                 ## fixup A - offset: 1, value: __Znam-1, kind: FK_PCRel_1
>>> .subsections_via_symbols
>>>
>>> This could be further improved by inverting the cmov condition to avoid the
>>> first movq, which we'll tackle as a general regalloc improvement.
>>
>> I'm curious as on how you represent the overflow checking in your highlevel
>> IL.
>
> The (optimized) generated IR is:
>
> $ clang t.cc -emit-llvm -S -o - -O3
> ...
> define noalias i8* @_Z4testl(i64 %count) ssp {
> entry:
>   %0 = tail call %0 @llvm.umul.with.overflow.i64(i64 %count, i64 4)
>   %1 = extractvalue %0 %0, 1
>   %2 = extractvalue %0 %0, 0
>   %3 = select i1 %1, i64 -1, i64 %2
>   %call = tail call noalias i8* @_Znam(i64 %3)
>   ret i8* %call
> }

Sorry, it's a little easier to read with expanded names and types:

define noalias i8* @_Z4testl(i64 %count) ssp {
entry:
  %A = tail call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %count, i64 4)
  %B = extractvalue { i64, i1 } %A, 1
  %C = extractvalue { i64, i1 } %A, 0
  %D = select i1 %B, i64 -1, i64 %C
  %call = tail call noalias i8* @_Znam(i64 %D)
  ret i8* %call
}

-Chris
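
For readers who would rather see the same logic in source form, here is a rough C++ sketch of what the IR above computes before it calls _Znam (operator new[]). It is an illustration only: the helper name checked_array_bytes is hypothetical, the contents of t.cc are not shown in this thread, and the element size of 4 is taken from the constant in the umul.with.overflow call. The sketch also assumes a 64-bit size_t, so that the i64 value -1 corresponds to the maximum size_t value.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper (not from the thread): computes the byte count that
// the IR passes to operator new[] for `count` elements of 4 bytes each.
// If the unsigned multiply would overflow, it yields the all-ones value
// instead, mirroring `select i1 %B, i64 -1, i64 %C` in the IR.
constexpr std::size_t checked_array_bytes(std::size_t count) {
    constexpr std::size_t elem_size = 4;  // the `i64 4` operand in the IR
    constexpr std::size_t all_ones = std::numeric_limits<std::size_t>::max();

    // count * elem_size overflows exactly when count > all_ones / elem_size.
    // This test plays the role of the i1 overflow flag (%B).
    if (count > all_ones / elem_size) {
        return all_ones;          // overflow: request an impossibly large size
    }
    return count * elem_size;     // no overflow: the plain product (%C)
}
```

Passing the all-ones size on overflow means the allocation request cannot be mistaken for a small wrapped-around value; how the allocator then reports the failure is outside what this sketch covers.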