Torbjorn Granlund <t...@gmplib.org> writes:

> I am not too enthusiastic about struct return types for critical
> functions.  I expect this to be returned via a stack slot everywhere or
> almost everywhere.

As far as I understand, the most common ABIs for x86_64 and ARM (which
is pretty close to "almost everywhere"...) both return structs of this
form in two registers: %rax/%rdx and %r0/%r1, respectively.

Consider the test compilation unit

  typedef struct {
    unsigned long q; unsigned long r;
  } qr_t;

  qr_t
  divrem (unsigned long u, unsigned long d)
  {
    qr_t res;
    res.q = u/d;
    res.r = u - res.q*d;
    return res;
  }

On x86_64 (and gnu/linux), gcc -c -O compiles this to

   0:   48 89 f8                mov    %rdi,%rax
   3:   31 d2                   xor    %edx,%edx
   5:   48 f7 f6                div    %rsi
   8:   48 89 fa                mov    %rdi,%rdx
   b:   48 0f af f0             imul   %rax,%rsi
   f:   48 29 f2                sub    %rsi,%rdx
  12:   c3                      retq

Both inputs and outputs are passed in registers. The return address is
the only thing stored on the stack.

> I recall to have seen some code for that.  How fast does it run
> currently on the various CPUs?

Don't know yet.

> Code comment:
>
> I think we cannot afford to do a separate lshift of the dividend operand
> when the divisor is just a few limbs.  We need to do shifting
> on-the-fly, however irksome that might be.  An mpn_div_qr_1u_pi1 is
> called-for.

I think we'll definitely want mpn_div_qr_1u_pi1 for the most common
platforms. I was thinking that maybe we could let it be an optional
function, with no C implementation, and resort to a separate mpn_lshift
if the function is missing. But if needed, it's no big deal to extract a
C mpn_div_qr_1u_pi1 from divrem_1.c, with on-the-fly shifting.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
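
To illustrate what "on-the-fly shifting" means for an unnormalized
single-limb divisor, here is a rough sketch. It is not the code in
divrem_1.c and not a proposed mpn_div_qr_1u_pi1. The names limb_t,
LIMB_BITS and div_qr_1u_sketch are made up for the example, limbs are
assumed to be 32 bits so that plain uint64_t arithmetic works, and an
ordinary 64-by-32 division stands in for the precomputed-inverse (pi1)
step. The point is only that each shifted dividend limb is assembled from
two adjacent input limbs inside the loop, so no separate lshift pass over
the dividend is needed.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical limb type, 32 bits for illustration */
#define LIMB_BITS 32

/* Divide {up, n} by d, where d != 0 need not be normalized and n >= 1.
   Stores n quotient limbs at qp and returns the remainder.
   Illustration only: a real pi1 variant would replace the 64-by-32
   division below with a multiplication by a precomputed inverse. */
limb_t
div_qr_1u_sketch (limb_t *qp, const limb_t *up, size_t n, limb_t d)
{
  int cnt = 0;
  limb_t dn, r;
  size_t i;

  /* Count leading zeros of d; terminates since d != 0, so cnt <= 31. */
  while (((limb_t) (d << cnt) & 0x80000000u) == 0)
    cnt++;
  dn = (limb_t) (d << cnt);   /* normalized divisor, high bit set */

  /* Top limb of the shifted dividend: the bits shifted out of up[n-1].
     This is < 2^cnt <= dn, so the loop invariant r < dn holds. */
  r = cnt > 0 ? up[n - 1] >> (LIMB_BITS - cnt) : 0;

  for (i = n; i-- > 0;)
    {
      /* Build limb i of (U << cnt) from up[i] and up[i-1] on the fly. */
      limb_t lo = (cnt > 0 && i > 0) ? up[i - 1] >> (LIMB_BITS - cnt) : 0;
      limb_t ul = (limb_t) (up[i] << cnt) | lo;

      uint64_t num = ((uint64_t) r << LIMB_BITS) | ul;
      qp[i] = (limb_t) (num / dn);   /* fits in one limb because r < dn */
      r = (limb_t) (num % dn);
    }

  /* The remainder came out shifted by cnt; undo that. */
  return r >> cnt;
}
```

With u = {1, 1} (that is, 2^32 + 1) and d = 10, this should give the
quotient limbs {429496729, 0} and remainder 7. The fallback mentioned
above would instead shift the whole dividend into a temporary area first
and then run a loop that only handles normalized divisors.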