This is the LL code generated using the -output-ll switch of ldc2 (a .ll file is the textual form of LLVM IR, a kind of nearly universal intermediate representation for LLVM):

@_D5test51ayG16i = global [16 x i32] [i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16]
@_D5test51byG16i = global [16 x i32] [i32 16, i32 15, i32 14, i32 13, i32 12, i32 11, i32 10, i32 9, i32 8, i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1]
@_D5test51cG16i = global [16 x i32] zeroinitializer


If I add an "align(16)" annotation to a, b, c, the align 16 annotation shows up in the LL code too:

@_D5test51ayG16i = global [16 x i32] [i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16], align 16
@_D5test51byG16i = global [16 x i32] [i32 16, i32 15, i32 14, i32 13, i32 12, i32 11, i32 10, i32 9, i32 8, i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1], align 16
@_D5test51cG16i = global [16 x i32] zeroinitializer, align 16

I hadn't thought to also take a look at the asm produced. Even if I don't add align(16) in the code, and even if the LLVM IR doesn't have an alignment annotation, the data in the asm is still emitted with 16-byte alignment:


    .data
    .globl  __D4test1ayG16i
    .align  16
__D4test1ayG16i:
    .long   1
    .long   2
    .long   3
    .long   4
    .long   5
    .long   6
    .long   7
    .long   8
    .long   9
    .long   10
    .long   11
    .long   12
    .long   13
    .long   14
    .long   15
    .long   16

    .globl  __D4test1byG16i
    .align  16
__D4test1byG16i:
    .long   16
    .long   15
    .long   14
    .long   13
    .long   12
    .long   11
    .long   10
    .long   9
    .long   8
    .long   7
    .long   6
    .long   5
    .long   4
    .long   3
    .long   2
    .long   1


Regarding adding alignment to the D type system, to ensure sliced data is aligned, a possible first step is to support:

void foo(align(16) double[] a1,
         align(16) double[] a2) {}


That annotation is tested dynamically at function entry. So it's similar to this:

void foo(double[] a1, double[] a2)
in {
    assert(isAligned(a1, 16) && isAligned(a2, 16));
} body {}


But inside the body of foo() the compiler is allowed to use aligned SIMD instructions (in theory this should be true for the version with asserts too).
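
The isAligned() used in the contract above is not an existing function, it's just a name I made up. A minimal sketch of it, assuming the alignment is a power of two and that only the address of the first element matters, could be something like this (new code uses "do" where older D used "body"):

```d
// Hypothetical helper, not part of Phobos or druntime.
// Returns true if the first element of the slice sits on an
// 'alignment'-byte boundary. 'alignment' is assumed to be a power of two.
bool isAligned(T)(const(T)[] arr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0,
           "alignment must be a power of two");
    // A pointer is aligned when its low bits are all zero.
    return (cast(size_t) arr.ptr & (alignment - 1)) == 0;
}

// The run-time checked version of foo() from above.
void foo(double[] a1, double[] a2)
in
{
    assert(isAligned(a1, 16) && isAligned(a2, 16));
}
do
{
    // Aligned SIMD loads/stores on a1 and a2 would be safe here.
}
```

Note that slicing can break the alignment: if a is a 16-aligned double[], then a[1 .. $] starts 8 bytes later and is no longer 16-aligned. This is why the check has to be done on each slice and not only on the original array.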

A more elegant (and maybe better) solution is to verify the alignment statically. This means foo() accepts only 16-aligned arrays. Array cast(), new array, dup, idup, and little else are statically known to produce/return aligned arrays. And for the other slices you need something like makeAligned!N(), which returns a slice whose alignment is statically known.
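
Without changes to the language, the static idea can be approximated in library code with a wrapper type. This is only a sketch under my own assumptions: AlignedSlice and this makeAligned are hypothetical names, the alignment is carried in the type as a template argument, and a misaligned input is copied into fresh GC memory (so the result doesn't alias the original slice in that case):

```d
// Hypothetical wrapper: a slice whose N-byte alignment is part of its type.
struct AlignedSlice(T, size_t N)
{
    T[] data;          // invariant: data.ptr is N-byte aligned
    alias data this;   // usable where a T[] is expected
}

// Hypothetical makeAligned!N: returns the slice itself if it is already
// N-aligned, otherwise an N-aligned copy of it.
AlignedSlice!(T, N) makeAligned(size_t N, T)(T[] s)
{
    static assert(N != 0 && (N & (N - 1)) == 0, "N must be a power of two");
    static assert(N % T.sizeof == 0, "N must be a multiple of T.sizeof");

    if ((cast(size_t) s.ptr & (N - 1)) == 0)
        return AlignedSlice!(T, N)(s);      // already aligned, no copy

    // Over-allocate by N bytes, then start at the first aligned element.
    enum pad = N / T.sizeof;
    auto buf = new T[s.length + pad];
    immutable mis = cast(size_t) buf.ptr & (N - 1);
    // Assumes the allocation is at least T.sizeof-aligned.
    assert(mis % T.sizeof == 0);
    immutable skip = mis == 0 ? 0 : (N - mis) / T.sizeof;
    auto dst = buf[skip .. skip + s.length];
    dst[] = s[];                            // copy the elements
    return AlignedSlice!(T, N)(dst);
}

// foo() now states its requirement in the signature, with no run-time test.
void foo(AlignedSlice!(double, 16) a1, AlignedSlice!(double, 16) a2)
{
}
```

With this a call like foo(makeAligned!16(a[1 .. $]), makeAligned!16(b)) compiles, while passing a plain double[] is a type error. The compiler itself still knows nothing about the alignment, so this gives only the checking half of the idea, not the code generation half.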

Bye,
bearophile
