I am working on a PPC architecture, and I have found a (very) big slowdown due to bad alignment of data by ocamlopt. I need my data aligned in memory according to its size: floats on 8 bytes, ints on 4 bytes, etc.
First, make sure that misalignment is really the source of your slowdown. The PowerPC processors I'm familiar with can access 4-aligned 8-byte floats with minimal overhead, while the penalty is much bigger for other misalignments. Indeed, the PowerPC calling conventions mandate that some 8-byte float arguments are passed on the stack at 4-aligned addresses, so that's a strong incentive for the hardware people to implement those accesses efficiently.
BUT, after verification, I find that ocamlopt doesn't align data the way I need. I tried to use ARCH_ALIGN_DOUBLE, but it doesn't seem to do what I thought, and it changes nothing for my needs. Is there ANY way to obtain what I need easily, or at least quickly?
Data allocated in the Caml heap is word-aligned, where a word is 4 bytes on a 32-bit platform and 8 bytes on a 64-bit platform. This is deeply ingrained in the Caml GC and allocator, so don't expect to change this easily. What you can do, however:

1- Use the 64-bit PowerPC port. Everything will be 8-aligned then.

2- Use a bigarray instead of a float array. Bigarray data is allocated outside the heap, at naturally-aligned addresses.

- Xavier Leroy
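
For option 2, here is a minimal sketch of what replacing a float array with a bigarray might look like. It assumes a recent OCaml where the Bigarray module is part of the standard library (older releases need the bigarray library linked explicitly). The names make_floats and sum are hypothetical helpers chosen for illustration, not anything from the original program.

```ocaml
(* Sketch only: a one-dimensional float64 bigarray used in place of a
   float array. make_floats and sum are illustrative names. *)
open Bigarray

(* Allocate n 64-bit floats outside the Caml heap and zero them.
   Array1.create does not initialize the contents, hence the fill. *)
let make_floats n : (float, float64_elt, c_layout) Array1.t =
  let a = Array1.create float64 c_layout n in
  Array1.fill a 0.0;
  a

(* Sum the elements. Bigarray elements are read with a.{i}
   and written with a.{i} <- v. *)
let sum (a : (float, float64_elt, c_layout) Array1.t) =
  let s = ref 0.0 in
  for i = 0 to Array1.dim a - 1 do
    s := !s +. a.{i}
  done;
  !s

let () =
  let a = make_floats 4 in
  for i = 0 to Array1.dim a - 1 do
    a.{i} <- float_of_int (i + 1)
  done;
  Printf.printf "sum = %g\n" (sum a)
```

The explicit type annotations are a design choice rather than a requirement: with the element kind and layout known statically, ocamlopt can usually compile a.{i} to a direct load instead of a generic call. Whether this removes the slowdown on a given PowerPC machine still has to be measured.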