On Tuesday 26 February 2008 17:27, Luiz Americo Pereira Camara wrote:
> Yury Sidorov wrote:
> > The patch removes packed record for some platforms.
> > IMO packed can be removed for all platforms. It will gain some
> > speed.
>
> I'd like to understand more this issue.
> Why are non packed records faster?
> The difference occurs at memory allocation or at memory access?

At memory access. On x86 processors it's usually only a speed penalty (or has anyone ever seen the AC flag turned on?); on other processors you may even have to work around exceptions (i.e. bus errors), because the processor simply refuses to read or write unaligned data. The only way to circumvent the processor's refusal is then to read/write the data byte by byte or to mask it out, which is slower than just reading or writing it.

Consider writing a 32-bit value that is only 16-bit aligned, so that it spans two 32-bit words, on a processor that can only access a single 32-bit value at an aligned address:

*_ _ _ _*_ _ _ _
|0|1|2|3|4|5|6|7|
    |_______|

Now the data you need spans bytes [2:5], but the processor can only read a full 32 bits either at position 0 (reading bytes [0:3]) or at position 4 (reading bytes [4:7]). You'd need to read both processor words, mask out the affected half of each, and write back both words with the new data patched in between them.

So no matter whether the processor handles it for you or the compiler inserts the necessary code to do it, even a simple increment is insanely expensive in terms of processor cycles.

Vinzent.
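
P.S. For illustration, the read-both-words, mask, write-back sequence described above might look roughly like the following C sketch. It is not taken from FPC or from any real code generator. It assumes a hypothetical memory model in which memory is an array of aligned 32-bit words and byte k lives in bits 8*(k mod 4) and up of word k/4 (little-endian numbering). The names store32_unaligned, load32_unaligned, increment32_unaligned and self_check are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed model: only whole, aligned 32-bit words can be read or written.
   Byte k of memory is bits 8*(k%4) .. 8*(k%4)+7 of word k/4. */

/* Read a 32-bit value at an arbitrary byte offset. */
static uint32_t load32_unaligned(const uint32_t *mem, size_t byte_off)
{
    size_t   idx   = byte_off / 4;                 /* lower aligned word */
    unsigned shift = (unsigned)(byte_off % 4) * 8; /* bit offset inside it */

    if (shift == 0)
        return mem[idx];                           /* aligned: one read */

    /* Two reads, then stitch the two halves together. */
    return (mem[idx] >> shift) | (mem[idx + 1] << (32 - shift));
}

/* Write a 32-bit value at an arbitrary byte offset. */
static void store32_unaligned(uint32_t *mem, size_t byte_off, uint32_t value)
{
    size_t   idx   = byte_off / 4;
    unsigned shift = (unsigned)(byte_off % 4) * 8;

    if (shift == 0) {
        mem[idx] = value;                          /* aligned: one write */
        return;
    }

    /* Masks selecting the neighbouring bytes that must be preserved. */
    uint32_t lo_keep = 0xFFFFFFFFu >> (32 - shift); /* bytes below the value */
    uint32_t hi_keep = 0xFFFFFFFFu << shift;        /* bytes above the value */

    uint32_t lo = mem[idx];                        /* read both words */
    uint32_t hi = mem[idx + 1];

    lo = (lo & lo_keep) | (value << shift);        /* patch the new data in */
    hi = (hi & hi_keep) | (value >> (32 - shift));

    mem[idx]     = lo;                             /* write both words back */
    mem[idx + 1] = hi;
}

/* A "simple increment": two reads, then two more reads and two writes. */
static void increment32_unaligned(uint32_t *mem, size_t byte_off)
{
    store32_unaligned(mem, byte_off, load32_unaligned(mem, byte_off) + 1u);
}

/* Small usage example for the layout in the diagram (value at bytes 2..5). */
static void self_check(void)
{
    uint32_t mem[2] = { 0x11111111u, 0x22222222u };

    store32_unaligned(mem, 2, 0xAABBCCDDu);
    assert(mem[0] == 0xCCDD1111u);   /* bytes 0..1 untouched */
    assert(mem[1] == 0x2222AABBu);   /* bytes 6..7 untouched */

    increment32_unaligned(mem, 2);
    assert(load32_unaligned(mem, 2) == 0xAABBCCDEu);
}
```

In the aligned case each access is a single word read or write. In the unaligned case the increment costs four word reads and two word writes plus the shifting and masking, which is the overhead described above, whether the hardware or compiler-generated code ends up doing it.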