> I don't understand why you want to pass __m256 and 256-bit vector values
> to anonymous arguments in registers.  The only thing the vararg functions
> would do with it would be save it somewhere on the stack.
> Given the x86_64 ABI, you can't expect calling an implicitly
> prototyped or non-vararg prototyped function which is actually
> defined as vararg function (as %rax wouldn't be properly initialized),

Calls to unprototyped functions all get %rax set.  If the callee is variadic,
things still work.  Sure, for __m256 we can also declare prototypes mandatory
for variadic functions and simply pass things on the stack.

Honza

> which means you need a prototype for all vararg functions and
> at that point the caller can just do the job for the callee and push stuff
> on the stack.  Then vararg prologue doesn't need to save %ymm* registers
> at all and va_arg will handle __m256 just fine.
>
> 	Jakub
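
For illustration only (this is not code from the thread): a minimal C sketch of the prototyped case discussed above. The name sum_doubles is hypothetical, and plain double arguments stand in for __m256 so that the example builds without AVX. With the prototype in scope the caller knows the callee is variadic; on the x86-64 System V ABI it then sets %al to an upper bound on the number of vector registers used for the call, and the callee fetches the anonymous arguments with va_arg.

```c
#include <stdarg.h>

/* Prototype visible at every call site, so the caller knows the callee
   is variadic and follows the variadic calling convention. */
double sum_doubles(int count, ...);

/* Hypothetical helper: adds up `count` anonymous double arguments. */
double sum_doubles(int count, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, double);  /* read each anonymous argument as a double */
    va_end(ap);

    return total;
}
```

A call such as sum_doubles(3, 1.0, 2.0, 3.5) goes through the variadic convention because the prototype is visible.  The proposal in the thread would keep this model for __m256 but have the caller place such values on the stack, so the vararg prologue would not need to save %ymm* registers.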