Hi,
We would like to update the x86-64 psABI so that 32-byte aggregates with
a single __m256 field are passed in AVX registers instead of in memory.
However, finding the proper wording seems tricky.
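To make the case concrete, here is a minimal sketch of the kind of type
involved. It is only an illustration: m256_like is a stand-in for __m256
built with the GCC/Clang vector extension (so it assumes a GCC-compatible
compiler), and wrapped_m256 and eightbytes() are hypothetical names that
do not appear in the psABI.

```c
#include <stddef.h>

/* Stand-in for __m256 (assumption: GCC/Clang vector extension is available).
   The real type comes from <immintrin.h>. */
typedef float m256_like __attribute__((vector_size(32)));

/* Hypothetical aggregate with a single 256-bit vector field. */
struct wrapped_m256 {
    m256_like v;
};

/* Number of eightbytes needed to hold an object of the given size. */
static size_t eightbytes(size_t size)
{
    return (size + 7) / 8;
}
```

Such a struct is 32 bytes, i.e. four eightbytes, so under the current
"larger than two eightbytes" rule it is classified as MEMORY.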
Here is what I got. Any comments?
Thanks.
--
H.J.
Index: low-level-sys-info.tex
===================================================================
--- low-level-sys-info.tex (revision 5099)
+++ low-level-sys-info.tex (working copy)
@@ -343,10 +343,12 @@ classes are corresponding to \xARCH regi
\begin{description}
\item[INTEGER] This class consists of integral types that fit into one of
the general purpose registers.
-\item[SSE] The class consists of types that fit into a SSE register.
-\item[SSEUP] The class consists of types that fit into a SSE register
+\item[SSE] The class consists of types that fit into an SSE register.
+\item[SSEUP] The class consists of types that fit into an SSE register
+ and can be passed and returned in the most significant half of it.
+\item[AVX] The class consists of types that fit into an AVX register.
+\item[AVXUP] The class consists of types that fit into an AVX register
and can be passed and returned in the most significant half of it.
-\item[AVX] The class consists of types that fit into a AVX register.
\item[X87, X87UP] These classes consists of types that will be returned via
the x87 FPU.
\item[COMPLEX\_X87] This class consists of types that will be returned
@@ -372,7 +374,9 @@ The basic types are assigned their natur
\item Arguments of types \code{__float128}, \code{_Decimal128}
and \code{__m128} are split into two halves. The least significant
ones belong to class SSE, the most significant one to class SSEUP.
-\item Arguments of type \code{__m256} are in class AVX.
+\item Arguments of type \code{__m256} are split into two halves.
+ The least significant ones belong to class AVX, the most significant
+ one to class AVXUP.
\item The 64-bit mantissa of arguments of type \code{long double}
belongs to class X87, the 16-bit exponent plus 6 bytes of padding
belongs to class X87UP.
@@ -407,11 +411,10 @@ The classification of aggregate (structu
types works as follows:
\begin{enumerate}
-\item If the size of an object is larger than two \eightbytes, or
- it contains unaligned fields, it has class MEMORY.
+\item If it contains unaligned fields, it has class MEMORY.
\item If a C++ object has either a non-trivial copy constructor
- or a non-trivial destructor
+ or a non-trivial destructor,
\footnote{A de/constructor is trivial if it is an implicitly-declared
default de/constructor and if:
\begin{itemize}
@@ -433,6 +436,15 @@ types works as follows:
because such objects must have well defined addresses. Similar
issues apply when returning an object from a function.}
+\item If the size of the aggregate is four \eightbytes, each pair of
+ consecutive \eightbytes is classified as an aggregate of two
+ \eightbytes. If the first of the two-\eightbyte aggregates has the
+ AVX class, it is broken into the SSE and SSEUP classes for the
+ purpose of class merging.
+
+\item If the size of an object is larger than two \eightbytes,
+ it has class MEMORY.
+
\item If the size of the aggregate exceeds a single \eightbyte, each is
classified separately. Each \eightbyte gets initialized to class NO_CLASS.
@@ -453,6 +465,8 @@ types works as follows:
\begin{enumerate}
\item If one of the classes is MEMORY, the whole argument is passed in
memory.
\item If SSEUP is not preceeded by SSE, it is converted to SSE.
+ \item If AVXUP is preceded by SSE, the SSE class is converted to AVX.
+ \item If AVXUP is not preceded by AVX, it is converted to AVX.
\end{enumerate}
\end{enumerate}
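
For discussion, the post-merger cleanup rules in the last hunk could be
sketched as below. This is a literal transcription applied in the order
listed, not an actual compiler implementation. The names arg_class and
post_merger_cleanup are hypothetical, and what one array element stands
for (the unit being classified) is an assumption, since the wording above
leaves that open.

```c
#include <assert.h>
#include <stddef.h>

/* Argument classes named in the patch (subset). */
enum arg_class { NO_CLASS, INTEGER, SSE, SSEUP, AVX, AVXUP, MEMORY };

/* Apply the cleanup rules, in the order listed, to n classes in place.
   Assumption: each element is one classification unit of the argument. */
static void post_merger_cleanup(enum arg_class *c, size_t n)
{
    size_t i;

    assert(c != NULL || n == 0);

    /* If one of the classes is MEMORY, the whole argument is MEMORY. */
    for (i = 0; i < n; i++) {
        if (c[i] == MEMORY) {
            size_t j;
            for (j = 0; j < n; j++)
                c[j] = MEMORY;
            return;
        }
    }

    /* SSEUP not preceded by SSE is converted to SSE. */
    for (i = 0; i < n; i++)
        if (c[i] == SSEUP && (i == 0 || c[i - 1] != SSE))
            c[i] = SSE;

    /* AVXUP preceded by SSE: the SSE class is converted to AVX. */
    for (i = 1; i < n; i++)
        if (c[i] == AVXUP && c[i - 1] == SSE)
            c[i - 1] = AVX;

    /* AVXUP not preceded by AVX is converted to AVX. */
    for (i = 0; i < n; i++)
        if (c[i] == AVXUP && (i == 0 || c[i - 1] != AVX))
            c[i] = AVX;
}
```

With this reading, a sequence {SSE, AVXUP} becomes {AVX, AVXUP}, while
{INTEGER, AVXUP} becomes {INTEGER, AVX}.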