* Michel Quercia:

> Like this?
>
> ----------------------------------------------------------------------------------
>         # loop body to unroll. code size = 24 bytes
>         # enter with eax = edx = first digit of a, CF = 0
> #undef BODY
> #define BODY(x,y,z) \
>            adcl   x(%ebx,%ecx,4), %eax; \
>         /* movl   y(%esi,%ecx,4), %edx */ .byte 0x8B, 0x54, 0x8E, y; \
>            movl   %eax, x(%edi,%ecx,4); \
>         /* adcl   y(%ebx,%ecx,4), %edx */ .byte 0x13, 0x54, 0x8B, y; \
>            movl   z(%esi,%ecx,4), %eax; \
>            movl   %edx, y(%edi,%ecx,4)
>
>         # addition loop unrolled for 16 digits
>         ALIGN(4)
> L(begin):
>         BODY(-4,0,4);   BODY(4,8,12);   BODY(12,16,20); BODY(20,24,28)
>         BODY(28,32,36); BODY(36,40,44); BODY(44,48,52); BODY(52,56,60)
> ----------------------------------------------------------------------------------
Exactly.

> Seen your patch. There remain other unrolled loops in the
> multiplication subroutines, affected by the same bug, which will
> explain additional segfaults. To discover which files are affected do a
> "grep -l BODY kernel/n/x86/*.S"

Yes, I know.  I discovered them after I decided that my approach was
the wrong one.

> Now there are two possibilities:
>
> 1. if you guys at Debian need an urgent fix then there is Numerix-0.21a
> available (with nops instead of hand-coded zero displacements). I can
> deliver a 0.21b with zero-displacements if necessary.
>
> 2. if you don't care about Numerix being buggy then let's live with 0.21
> until 0.22 is ready for release (probably next Spring). Note that the
> bug only affects x86 CPUs without SSE2 capability; this kind of
> processor belongs to the past now.

It's a release-critical bug which needs to be fixed.  I would like to
use the patch you described above, applied to 0.21.  Do you think this
is feasible?
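
For reference, the hand-coded bytes in the quoted macro can be checked with a small sketch like the one below. It is only an illustration, not part of Numerix: the helper name `encode_disp8` and the register tables are my own, and it assumes the standard 32-bit ModRM/SIB layout with the 8-bit displacement form forced even when the displacement is 0.

```python
# Sketch: encode "op disp8(base,index,scale), reg" for 32-bit x86,
# always using the disp8 form (mod = 01), even for a zero displacement.
# encode_disp8 is a hypothetical helper, not something from Numerix.

REG = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
       "esp": 4, "ebp": 5, "esi": 6, "edi": 7}
SCALE = {1: 0, 2: 1, 4: 2, 8: 3}

def encode_disp8(opcode, reg, base, index, scale, disp):
    """Return the 4 bytes: opcode, ModRM, SIB, disp8."""
    if not -128 <= disp <= 127:
        raise ValueError("displacement does not fit in 8 bits")
    if index == "esp":
        raise ValueError("esp cannot be used as an index register")
    # mod = 01 (disp8 follows), rm = 100 (SIB byte follows)
    modrm = (0b01 << 6) | (REG[reg] << 3) | 0b100
    sib = (SCALE[scale] << 6) | (REG[index] << 3) | REG[base]
    return bytes([opcode, modrm, sib, disp & 0xFF])

MOV_LOAD = 0x8B  # Intel: MOV r32, r/m32
ADC_LOAD = 0x13  # Intel: ADC r32, r/m32

# movl 0(%esi,%ecx,4), %edx  and  adcl 0(%ebx,%ecx,4), %edx
mov_bytes = encode_disp8(MOV_LOAD, "edx", "esi", "ecx", 4, 0)
adc_bytes = encode_disp8(ADC_LOAD, "edx", "ebx", "ecx", 4, 0)

print(mov_bytes.hex(" "))  # 8b 54 8e 00
print(adc_bytes.hex(" "))  # 13 54 8b 00
```

The first three bytes of each result match the `.byte` sequences in the macro, and each instruction comes out at 4 bytes, which is consistent with the 24 bytes stated for a six-instruction body.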