http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56504



             Bug #: 56504

           Summary: -mveclibabi=... Support AMD's LibM 3.0 (successor of

                    ACML)

    Classification: Unclassified

           Product: gcc

           Version: 4.8.0

            Status: UNCONFIRMED

          Severity: enhancement

          Priority: P3

         Component: middle-end

        AssignedTo: unassig...@gcc.gnu.org

        ReportedBy: bur...@gcc.gnu.org





GCC currently supports:



       -mveclibabi=type

           Specifies the ABI type to use for vectorizing intrinsics

           [...] and acml for the AMD math core library. [...]



           [...]    and "__vrd2_sin",

           "__vrd2_cos", "__vrd2_exp", "__vrd2_log", "__vrd2_log2",

           "__vrd2_log10", "__vrs4_sinf", "__vrs4_cosf", "__vrs4_expf",

           "__vrs4_logf", "__vrs4_log2f", "__vrs4_log10f" and

           "__vrs4_powf" for the corresponding function type when

           -mveclibabi=acml is used.
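
To make the calling pattern concrete, here is a minimal hand-written sketch of a loop that hands two doubles at a time to a two-lane routine, which is the shape of call that the option makes GCC emit for functions such as "__vrd2_sin". Everything named below is hypothetical: demo_vrd2 is a stand-in kernel that merely squares its lanes (so no math library is needed), v2df is a GCC/Clang vector-extension type with the same layout as __m128d, and padding the odd trailing element is a simplification. A compiler would more likely fall back to the scalar function for the remainder.

```c
#include <assert.h>
#include <stddef.h>

/* Two doubles in one 16-byte vector (GCC/Clang vector extension);
   same size and lane count as __m128d. */
typedef double v2df __attribute__((vector_size(16)));

/* Hypothetical stand-in for a two-lane routine such as __vrd2_sin.
   It squares each lane so the example needs no math library. */
v2df demo_vrd2(v2df x)
{
    return x * x;
}

/* Apply a two-lane kernel to n doubles: full two-element chunks
   first, then one padded call for an odd trailing element. */
void map_vrd2(v2df (*kernel)(v2df), size_t n,
              const double *in, double *out)
{
    size_t i = 0;

    for (; i + 2 <= n; i += 2) {
        v2df x = { in[i], in[i + 1] };
        v2df y = kernel(x);
        out[i] = y[0];
        out[i + 1] = y[1];
    }
    if (i < n) {
        v2df x = { in[i], 0.0 };   /* second lane is padding */
        v2df y = kernel(x);
        out[i] = y[0];             /* padded lane is discarded */
    }
}
```

With the real library one would pass the library routine instead of demo_vrd2; the loop structure stays the same.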



The current AMD LibM version, however, supports much more:

http://developer.amd.com/tools/cpu-development/libm/





From the release notes:



Vector Functions 

----------------

         Exponential

         -----------

            * vrs4_expf, vrs4_exp2f, vrs4_exp10f, vrs4_expm1f

            * vrsa_expf, vrsa_exp2f, vrsa_exp10f, vrsa_expm1f

            * vrd2_exp, vrd2_exp2, vrd2_exp10, vrd2_expm1

            * vrda_exp, vrda_exp2, vrda_exp10, vrda_expm1



         Logarithmic

         -----------

            * vrs4_logf, vrs4_log2f, vrs4_log10f, vrs4_log1pf

            * vrsa_logf, vrsa_log2f, vrsa_log10f, vrsa_log1pf

            * vrd2_log, vrd2_log2, vrd2_log10, vrd2_log1p

            * vrda_log, vrda_log2, vrda_log10, vrda_log1p



         Trigonometric

         -------------

            * vrs4_cosf, vrs4_sinf

            * vrsa_cosf, vrsa_sinf

            * vrd2_cos, vrd2_sin

            * vrda_cos, vrda_sin

            * vrd2_sincos,vrda_sincos

            * vrs4_sincosf,vrsa_sincosf 

            * vrd2_tan, vrs4_tanf

            * vrd2_cosh





         Power

         -----

            * vrs4_cbrtf, vrd2_cbrt, vrs4_powf, vrs4_powxf

            * vrsa_cbrtf, vrda_cbrt, vrsa_powf, vrsa_powxf

            * vrd2_pow





The vector functions have the known signatures (cf. include/amdlibm.h):

    __m128d amd_vrd2_exp    (__m128d x);

    __m128  amd_vrs4_expf   (__m128  x);

    etc.



The array versions, by contrast, use:

    void amd_vrsa_expf      (int len, float  *src, float  *dst);

    void amd_vrda_exp2      (int len, double *src, double *dst);



    void amd_vrda_exp       (int len, double *src, double *dst);




Unfortunately, no further documentation is available that says, e.g., whether

src and dst may point to the same array.
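
Until the aliasing question is answered, a caller can simply avoid relying on it. The following is a minimal sketch of such a defensive wrapper, not anything taken from AMD LibM: array_fn mirrors the shape of the double-precision array prototypes above, while demo_array_kernel and apply_in_place are hypothetical names. The stand-in kernel only doubles its input so that the example runs without the library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same shape as the double-precision array prototypes quoted above. */
typedef void (*array_fn)(int len, double *src, double *dst);

/* Hypothetical stand-in kernel: doubles each element. */
void demo_array_kernel(int len, double *src, double *dst)
{
    for (int i = 0; i < len; i++)
        dst[i] = 2.0 * src[i];
}

/* Update buf "in place" without ever passing overlapping src and
   dst to fn: the input is copied to a temporary buffer first.
   Returns 0 on success, -1 on a negative length or failed allocation. */
int apply_in_place(array_fn fn, int len, double *buf)
{
    if (len < 0)
        return -1;
    if (len == 0)
        return 0;

    size_t bytes = (size_t)len * sizeof *buf;
    double *tmp = malloc(bytes);
    if (tmp == NULL)
        return -1;

    memcpy(tmp, buf, bytes);
    fn(len, tmp, buf);      /* src and dst are distinct buffers */
    free(tmp);
    return 0;
}
```

The extra copy costs memory bandwidth, so it only makes sense as long as in-place operation is not known to be safe.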







Note that AMD LibM now uses "amd_" as the prefix for the vector functions. It

still provides the old names as weak symbols, but only these:



0000000000000340 W __vrd2_cos

00000000000000e0 W __vrd2_exp

00000000000001a0 W __vrd2_log

00000000000001c0 W __vrd2_log10

00000000000001b0 W __vrd2_log2

0000000000000330 W __vrd2_sin

0000000000000390 W __vrs4_cosf

00000000000000a0 W __vrs4_expf

0000000000000200 W __vrs4_log10f

00000000000001f0 W __vrs4_log2f

00000000000001e0 W __vrs4_logf

00000000000003a0 W __vrs4_sinf
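
Judging from the prototypes in amdlibm.h, the new names appear to be the old ones with the leading "__" replaced by "amd_" (e.g. "__vrd2_exp" vs. "amd_vrd2_exp"). Assuming that rule holds for all functions, the renaming can be sketched as below; acml_to_amd_name is a hypothetical helper for illustration, not part of GCC or AMD LibM.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn an old ACML-style name such as
   "__vrd2_sin" into the "amd_"-prefixed form "amd_vrd2_sin".
   Returns 0 on success, -1 if the name lacks the "__" prefix or
   the result does not fit into out (including the terminator). */
int acml_to_amd_name(const char *old_name, char *out, size_t out_size)
{
    if (strncmp(old_name, "__", 2) != 0)
        return -1;

    int n = snprintf(out, out_size, "amd_%s", old_name + 2);
    if (n < 0 || (size_t)n >= out_size)
        return -1;          /* encoding error or truncated output */

    return 0;
}
```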
