Markus,

> Request 2 - add COPY16; *NOT* acked by me
>
> [PATCH 2/8] lib/lzo: clean-up by introducing COPY16
>
> is still not correct because of possible overlapping copies. I'll
> address this on the weekend.

Can you give a synopsis of why

    { COPY8(op, ip); COPY8(op+8,ip+8); ip+=16; op+=16; }

or

    { COPY8(op, ip); ip+=8; op+=8; COPY8(op, ip); ip+=8; op+=8; }

vs.

    #define COPY16(dst,src) COPY8(dst,src); COPY8(dst+8,src+8)

    { COPY16(op, ip); ip+=16; op+=16; }

.. causes "overlapping copies"?

COPY8 was only ever used in pairs as above, and the second method broke
compiler optimizers since it adds an artificial barrier between the two
groups. The only difference was that decompress and compress had the
pointer increments spread out. If we need to fix that then that's a good
reason, but your reasoning continues to elude me.

I can refactor the patch to align the second method with the first and
make compress and decompress get the same codegen, which is functionally
identical to the COPY16 patch, but in your opinion that would seem to be
the whole problem..

I'll see what you've got after the weekend ;D

Ta
Matt Sealey
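P.S. To make the comparison concrete, here is a minimal userspace sketch of the three forms, not the kernel code itself. It assumes COPY8 behaves as one unaligned 8-byte load followed by one 8-byte store (modelled here with memcpy through a temporary); copy8, method1..method3, variants_agree and BUF_LEN are hypothetical names used only for illustration, and COPY16 is wrapped in do/while purely for macro hygiene. Under that assumption the three forms should leave identical bytes behind, whether or not the source and destination ranges overlap.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_LEN 64

/* Assumed model of COPY8: one 8-byte load, then one 8-byte store. */
static void copy8(unsigned char *dst, const unsigned char *src)
{
	uint64_t v;

	memcpy(&v, src, sizeof(v));
	memcpy(dst, &v, sizeof(v));
}

#define COPY8(dst, src)  copy8((dst), (src))
/* Same expansion as in the mail: two back-to-back COPY8s. */
#define COPY16(dst, src) \
	do { COPY8((dst), (src)); COPY8((dst) + 8, (src) + 8); } while (0)

/* Fill the buffer with a known pattern: buf[i] = i. */
static void fill(unsigned char *buf)
{
	for (size_t i = 0; i < BUF_LEN; i++)
		buf[i] = (unsigned char)i;
}

/* First form: both copies, then both pointer increments. */
static void method1(unsigned char *buf, size_t dist)
{
	unsigned char *ip = buf, *op = buf + dist;

	COPY8(op, ip); COPY8(op + 8, ip + 8); ip += 16; op += 16;
}

/* Second form: pointer increments between the two copies. */
static void method2(unsigned char *buf, size_t dist)
{
	unsigned char *ip = buf, *op = buf + dist;

	COPY8(op, ip); ip += 8; op += 8; COPY8(op, ip); ip += 8; op += 8;
}

/* Third form: the COPY16 macro. */
static void method3(unsigned char *buf, size_t dist)
{
	unsigned char *ip = buf, *op = buf + dist;

	COPY16(op, ip); ip += 16; op += 16;
}

/* Returns 1 if all three forms produce the same buffer for this distance. */
static int variants_agree(size_t dist)
{
	unsigned char a[BUF_LEN], b[BUF_LEN], c[BUF_LEN];

	if (dist > BUF_LEN - 16)
		return 0;	/* destination would run past the buffer */

	fill(a); fill(b); fill(c);
	method1(a, dist);
	method2(b, dist);
	method3(c, dist);

	return memcmp(a, b, BUF_LEN) == 0 && memcmp(a, c, BUF_LEN) == 0;
}
```

Calling variants_agree() with a distance of 32 exercises the non-overlapping case, and a distance of 8 exercises the case where the second COPY8 reads bytes the first one just wrote. This only checks the model above; it says nothing about a COPY16 that is implemented as a single 16-byte access.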