Re: fat_init violates host ABI on Win64
Ciao, Il Mer, 26 Aprile 2017 12:25 pm, Nicolas Hake ha scritto: > Torbjörn Granlund wrote: >> Where do we (unconditionally) rely on variable-length arrays? > > mpn/generic/sqrlo_basecase.c line 153, where SQRLO_BASECASE_ALLOC is a > non-constant expression by ultimately referencing CPUVEC_THRESHOLD if > fat.h is included. You are right, we should define a costant SQRLO_DC_THRESHOLD_LIMIT not only for the TUNE builds, but for FAT too... For sqrlo_basecase a patch like the following should work. diff -r 021277dcb21f gmp-impl.h --- a/gmp-impl.hTue Apr 18 23:47:55 2017 +0200 +++ b/gmp-impl.hWed Apr 26 18:27:38 2017 +0200 @@ -5018,7 +5018,6 @@ #undef MUL_TOOM33_THRESHOLD_LIMIT #undef MULLO_BASECASE_THRESHOLD_LIMIT #undef SQRLO_BASECASE_THRESHOLD_LIMIT -#undef SQRLO_DC_THRESHOLD_LIMIT #undef SQR_TOOM3_THRESHOLD_LIMIT #define SQR_TOOM2_MAX_GENERIC 200 #define MUL_TOOM22_THRESHOLD_LIMIT 700 @@ -5032,12 +5031,16 @@ #define SQR_TOOM8_THRESHOLD_LIMIT 1200 #define MULLO_BASECASE_THRESHOLD_LIMIT 200 #define SQRLO_BASECASE_THRESHOLD_LIMIT 200 -#define SQRLO_DC_THRESHOLD_LIMIT400 #define GET_STR_THRESHOLD_LIMIT 150 #define FAC_DSC_THRESHOLD_LIMIT2048 #endif /* TUNE_PROGRAM_BUILD */ +#if TUNE_PROGRAM_BUILD || WANT_FAT_BINARY +#undef SQRLO_DC_THRESHOLD_LIMIT +#define SQRLO_DC_THRESHOLD_LIMIT400 +#endif + #if defined (__cplusplus) } #endif Regards, m -- http://bodrato.it/papers/ ___ gmp-bugs mailing list gmp-bugs@gmplib.org https://gmplib.org/mailman/listinfo/gmp-bugs
Re: fat_init violates host ABI on Win64
Torbjörn Granlund wrote: Where do we (unconditionally) rely on variable-length arrays? mpn/generic/sqrlo_basecase.c line 153, where SQRLO_BASECASE_ALLOC is a non-constant expression by ultimately referencing CPUVEC_THRESHOLD if fat.h is included. mpn/generic/sqr_basecase.c line 170 has the same issue via SQR_TOOM2_THRESHOLD. Both macros are constant if fat.h is not included. Nicolas ___ gmp-bugs mailing list gmp-bugs@gmplib.org https://gmplib.org/mailman/listinfo/gmp-bugs
Re: fat_init violates host ABI on Win64
I wonder if one could use some m4 macrology to reduce clutter. Maybe one macro for the pattern IFSTD(` sub $8, %rsp') IFDOS(` sub $40, %rsp ') ASSERT(nz, `test $15, %rsp') CALL( mpn_add_n) IFSTD(` add $8, %rsp') IFDOS(` add $40, %rsp ') used when it's known that rsp = 8 (mod 16), and another one when it's known that rsp = 0 (mod 16). There seems to be a few uses of CALL that want to do this differently, though, e.g., gcd_1.asm, maybe it would also help to move (and rename) its conditional definition of STACK_ALLOC. /Niels -- Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677. Internet email is subject to wholesale government surveillance. ___ gmp-bugs mailing list gmp-bugs@gmplib.org https://gmplib.org/mailman/listinfo/gmp-bugs
Re: fat_init violates host ABI on Win64
Nicolas Hakewrites: the Windows x64 ABI requires callers to allocate a 32 byte "parameter area" before calling into a function, which the callee is allowed to use as it pleases[1]. fat_init does not do this before calling __gmpn_cpuvec_init, thus violating the ABI. Right; apparently we're aware of this as such allocations are performed in most cases in GMP. There seem to be 5 forgotten places, the one you found and 4 more. I'm not sure whether MSVC is a supported compiler (I assume it isn't, because --enable-fat requires support for variable-length arrays, which MSVC will not compile in C mode), but it's possible that other compilers may also use the parameter area as a scratch space, or even that gcc/clang start using it in future releases. We should follow the ABI, of course. Where do we (unconditionally) rely on variable-length arrays? While such arrays are in the C standard since 18 years, it is missing from C++. Attached patch solves this problem by just allocating 32 bytes of stack space before calling __gmpn_cpuvec_init, and discarding them afterwards. Thanks. Here is a slightly different patch: diff -Nrc2 gmp-main.253deadf9fc8/mpn/x86_64/divrem_2.asm gmp-main/mpn/x86_64/divrem_2.asm *** gmp-main.253deadf9fc8/mpn/x86_64/divrem_2.asm Tue Apr 25 20:52:18 2017 --- gmp-main/mpn/x86_64/divrem_2.asmTue Apr 25 20:52:18 2017 *** *** 101,106 --- 101,108 IFSTD(` mov %r11, %rdi ') IFDOS(` mov %r11, %rcx ') + IFDOS(` sub $32, %rsp ') ASSERT(nz, `test $15, %rsp') CALL( mpn_invert_limb) + IFDOS(` add $32, %rsp ') pop %r11 pop %r10 diff -Nrc2 gmp-main.253deadf9fc8/mpn/x86_64/fat/fat_entry.asm gmp-main/mpn/x86_64/fat/fat_entry.asm *** gmp-main.253deadf9fc8/mpn/x86_64/fat/fat_entry.asm Tue Apr 25 20:52:18 2017 --- gmp-main/mpn/x86_64/fat/fat_entry.asm Tue Apr 25 20:52:18 2017 *** *** 168,172 --- 168,174 push%r9 push%rax + IFDOS(` sub $32, %rsp ') CALL( __gmpn_cpuvec_init) + IFDOS(` add $32, %rsp ') pop %rax pop %r9 diff -Nrc2 gmp-main.253deadf9fc8/mpn/x86_64/mod_1_1.asm gmp-main/mpn/x86_64/mod_1_1.asm *** gmp-main.253deadf9fc8/mpn/x86_64/mod_1_1.asmTue Apr 25 20:52:18 2017 --- gmp-main/mpn/x86_64/mod_1_1.asm Tue Apr 25 20:52:18 2017 *** *** 199,204 --- 199,206 IFSTD(` mov %r12, %rdi ') C pass parameter IFDOS(` mov %r12, %rcx ') C pass parameter + IFDOS(` sub $32, %rsp ') ASSERT(nz, `test $15, %rsp') CALL( mpn_invert_limb) + IFDOS(` add $32, %rsp ') neg %r12 mov %r12, %r8 diff -Nrc2 gmp-main.253deadf9fc8/mpn/x86_64/mod_1_2.asm gmp-main/mpn/x86_64/mod_1_2.asm *** gmp-main.253deadf9fc8/mpn/x86_64/mod_1_2.asmTue Apr 25 20:52:18 2017 --- gmp-main/mpn/x86_64/mod_1_2.asm Tue Apr 25 20:52:18 2017 *** *** 184,189 --- 184,191 IFSTD(` mov %r12, %rdi ') C pass parameter IFDOS(` mov %r12, %rcx ') C pass parameter + IFDOS(` sub $32, %rsp ') ASSERT(nz, `test $15, %rsp') CALL( mpn_invert_limb) + IFDOS(` add $32, %rsp ') mov %r12, %r8 mov %rax, %r11 diff -Nrc2 gmp-main.253deadf9fc8/mpn/x86_64/mod_1_4.asm gmp-main/mpn/x86_64/mod_1_4.asm *** gmp-main.253deadf9fc8/mpn/x86_64/mod_1_4.asmTue Apr 25 20:52:18 2017 --- gmp-main/mpn/x86_64/mod_1_4.asm Tue Apr 25 20:52:18 2017 *** *** 191,196 --- 191,198 IFSTD(` mov %r12, %rdi ') C pass parameter IFDOS(` mov %r12, %rcx ') C pass parameter + IFDOS(` sub $32, %rsp ') ASSERT(nz, `test $15, %rsp') CALL( mpn_invert_limb) + IFDOS(` add $32, %rsp ') mov %r12, %r8 mov %rax, %r11 -- Torbjörn Please encrypt, key id 0xC8601622 ___ gmp-bugs mailing list gmp-bugs@gmplib.org https://gmplib.org/mailman/listinfo/gmp-bugs
fat_init violates host ABI on Win64
Dear GMP maintainers, the Windows x64 ABI requires callers to allocate a 32 byte "parameter area" before calling into a function, which the callee is allowed to use as it pleases[1]. fat_init does not do this before calling __gmpn_cpuvec_init, thus violating the ABI. This is not usually an issue, since neither gcc nor clang seem to use the space. MSVC, however, does, which results in rax/rcx/r8/r9 incorrectly being restored to clobbered values, and, in turn, random crashes as fat_init jumps into the real routines with incorrect arguments. I'm not sure whether MSVC is a supported compiler (I assume it isn't, because --enable-fat requires support for variable-length arrays, which MSVC will not compile in C mode), but it's possible that other compilers may also use the parameter area as a scratch space, or even that gcc/clang start using it in future releases. Attached patch solves this problem by just allocating 32 bytes of stack space before calling __gmpn_cpuvec_init, and discarding them afterwards. Thanks, Nicolas [1] "Even if the called function has fewer than 4 parameters, these 4 stack locations ... may be used by the called function for other purposes ..." https://msdn.microsoft.com/en-us/library/ew5tede7.aspx --- a/mpn/x86_64/fat/fat_entry.asm +++ b/mpn/x86_64/fat/fat_entry.asm @@ -167,7 +167,13 @@ push%r8 push%r9 push%rax +ifdef(`HOST_DOS64',`dnl + subq$32, %rsp +') CALL( __gmpn_cpuvec_init) +ifdef(`HOST_DOS64',`dnl + addq$32, %rsp +') pop %rax pop %r9 pop %r8 ___ gmp-bugs mailing list gmp-bugs@gmplib.org https://gmplib.org/mailman/listinfo/gmp-bugs