Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-18 Thread H.J. Lu
Hi, Here is the AVX patch for x86-64 psABI proposed at gcc submmit 2008. H.J. --- On Sun, Jun 15, 2008 at 6:49 PM, Jan Hubicka [EMAIL PROTECTED] wrote: On Wed, Jun 11, 2008 at 07:49:12AM -0700, H.J. Lu wrote: I guess we all agree on passing variadic arguments on stack (that is only those

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-15 Thread Jan Hubicka
On Wed, Jun 11, 2008 at 07:49:12AM -0700, H.J. Lu wrote: I guess we all agree on passing variadic arguments on stack (that is only those belonging on ...) and rest in registers. It seems easiest in regard to future register set extensions too. Only negative thing is that calls to

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-11 Thread H.J. Lu
On Tue, Jun 10, 2008 at 05:48:57PM +0200, Jan Hubicka wrote: On Tue, Jun 10, 2008 at 8:11 AM, Jakub Jelinek [EMAIL PROTECTED] wrote: On Tue, Jun 10, 2008 at 04:50:14PM +0200, Jan Hubicka wrote: 1) make __m256 passed on stack on variadic functions and in registers otherwse. Then we

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-10 Thread Jakub Jelinek
On Mon, Jun 09, 2008 at 04:40:54PM +0200, Jan Hubicka wrote: Still it seems to me that we can use extend current eax convention. Currently the value must be in range 0...8 as it specify number of SSE registers. We can pack both numbers into it. This way we get unforutnately wild jump on case

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-10 Thread Jan Hubicka
I don't understand why you want to pass __m256 and 256-bit vector values to anonymous arguments in registers. The only thing the vararg functions would do with it would be save it somewhere on the stack. Given the x86_64 ABI, you can't expect calling an implicitly prototyped or non-vararg

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-10 Thread H.J. Lu
On Tue, Jun 10, 2008 at 4:32 AM, Jan Hubicka [EMAIL PROTECTED] wrote: I don't understand why you want to pass __m256 and 256-bit vector values to anonymous arguments in registers. The only thing the vararg functions would do with it would be save it somewhere on the stack. Given the x86_64

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-10 Thread Jakub Jelinek
On Tue, Jun 10, 2008 at 04:50:14PM +0200, Jan Hubicka wrote: 1) make __m256 passed on stack on variadic functions and in registers otherwse. Then we don't need to worry about varargs changes at all. This will break unprototyped calls. 2) extend rax to pass info about if __m256 registers

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-10 Thread H.J. Lu
On Tue, Jun 10, 2008 at 8:11 AM, Jakub Jelinek [EMAIL PROTECTED] wrote: On Tue, Jun 10, 2008 at 04:50:14PM +0200, Jan Hubicka wrote: 1) make __m256 passed on stack on variadic functions and in registers otherwse. Then we don't need to worry about varargs changes at all. This will break

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-10 Thread Jan Hubicka
On Tue, Jun 10, 2008 at 8:11 AM, Jakub Jelinek [EMAIL PROTECTED] wrote: On Tue, Jun 10, 2008 at 04:50:14PM +0200, Jan Hubicka wrote: 1) make __m256 passed on stack on variadic functions and in registers otherwse. Then we don't need to worry about varargs changes at all. This will break

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-10 Thread H.J. Lu
On Tue, Jun 10, 2008 at 8:48 AM, Jan Hubicka [EMAIL PROTECTED] wrote: On Tue, Jun 10, 2008 at 8:11 AM, Jakub Jelinek [EMAIL PROTECTED] wrote: On Tue, Jun 10, 2008 at 04:50:14PM +0200, Jan Hubicka wrote: 1) make __m256 passed on stack on variadic functions and in registers otherwse. Then

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-09 Thread Jan Hubicka
On Fri, Jun 06, 2008 at 06:50:26AM -0700, H.J. Lu wrote: On Fri, Jun 06, 2008 at 10:28:34AM +0200, Jan Hubicka wrote: ymm0 and xmm0 are the same register. xmm0 is the lower 128bit of xmm0. I am not sure if we need separate XMM registers from YMM registers. Yes, I

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-06 Thread Jan Hubicka
ymm0 and xmm0 are the same register. xmm0 is the lower 128bit of xmm0. I am not sure if we need separate XMM registers from YMM registers. Yes, I know that xmm0 is lower part of ymm0. I still think we ought to be able to support varargs that do save ymm0 registers only when ymm values are

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-06 Thread H.J. Lu
On Fri, Jun 06, 2008 at 10:28:34AM +0200, Jan Hubicka wrote: ymm0 and xmm0 are the same register. xmm0 is the lower 128bit of xmm0. I am not sure if we need separate XMM registers from YMM registers. Yes, I know that xmm0 is lower part of ymm0. I still think we ought to be able to

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-06 Thread Richard Guenther
On Fri, Jun 6, 2008 at 4:28 PM, H.J. Lu [EMAIL PROTECTED] wrote: On Fri, Jun 06, 2008 at 06:50:26AM -0700, H.J. Lu wrote: On Fri, Jun 06, 2008 at 10:28:34AM +0200, Jan Hubicka wrote: ymm0 and xmm0 are the same register. xmm0 is the lower 128bit of xmm0. I am not sure if we need separate

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-06 Thread H.J. Lu
On Fri, Jun 6, 2008 at 7:31 AM, Richard Guenther [EMAIL PROTECTED] wrote: On Fri, Jun 6, 2008 at 4:28 PM, H.J. Lu [EMAIL PROTECTED] wrote: On Fri, Jun 06, 2008 at 06:50:26AM -0700, H.J. Lu wrote: On Fri, Jun 06, 2008 at 10:28:34AM +0200, Jan Hubicka wrote: ymm0 and xmm0 are the same

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-06 Thread Jakub Jelinek
On Thu, Jun 05, 2008 at 07:31:12AM -0700, H.J. Lu wrote: 1. Extend the register save area to put upper 128bit at the end. Pros: Aligned access. Save stack space if 256bit registers are used. Cons Split access. Require more split access beyond 256bit. 2. Extend the register

RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-05 Thread H.J. Lu
Hi, x86-64 psABI defines typedef struct { unsigned int gp_offset; unsigned int fp_offset; void *overflow_arg_area; void *reg_save_area; } va_list[1]; for variable argument list. va_list is used to access variable argument list: void bar (const char *format, va_list ap) { if (va_arg

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-05 Thread Richard Guenther
On Thu, Jun 5, 2008 at 4:31 PM, H.J. Lu [EMAIL PROTECTED] wrote: Hi, x86-64 psABI defines typedef struct { unsigned int gp_offset; unsigned int fp_offset; void *overflow_arg_area; void *reg_save_area; } va_list[1]; for variable argument list. va_list is used to access variable

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-05 Thread Jan Hubicka
1. Extend the register save area to put upper 128bit at the end. Pros: Aligned access. Save stack space if 256bit registers are used. Cons Split access. Require more split access beyond 256bit. 2. Extend the register save area to put full 265bit YMMs at the end. The

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-05 Thread H.J. Lu
On Thu, Jun 5, 2008 at 7:49 AM, Richard Guenther [EMAIL PROTECTED] wrote: On Thu, Jun 5, 2008 at 4:31 PM, H.J. Lu [EMAIL PROTECTED] wrote: Hi, x86-64 psABI defines typedef struct { unsigned int gp_offset; unsigned int fp_offset; void *overflow_arg_area; void *reg_save_area; }

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-05 Thread H.J. Lu
On Thu, Jun 5, 2008 at 8:15 AM, Jan Hubicka [EMAIL PROTECTED] wrote: 1. Extend the register save area to put upper 128bit at the end. Pros: Aligned access. Save stack space if 256bit registers are used. Cons Split access. Require more split access beyond 256bit. 2. Extend