Re: [PATCH] Add infrastructure to merge standard builtin enums with backend builtins

Michael Meissner Fri, 26 Aug 2011 07:20:14 -0700

On Fri, Aug 26, 2011 at 10:19:24AM +0200, Richard Guenther wrote:
> On Thu, Aug 25, 2011 at 10:35 PM, Michael Meissner
> <meiss...@linux.vnet.ibm.com> wrote:
> > On Wed, Aug 24, 2011 at 11:06:55AM +0200, Richard Guenther wrote:
> >> This basically would make DECL_BUILT_IN_CLASS no longer necessary
> >> if all targets were converted, right?  (We don't currently have any
> >> BUILT_IN_FRONTEND builtins).  That would sound appealing if this
> >> patch weren't a partial transition ;)
> >
> > Or we could reduce it to 1 bit if we aren't going to change all of the
> > backends.
> >
> >> Now for the possible downsides.  How can we reliably distinguish
> >> middle-end from target builtins for purpose of lazy initialization?
> >> Doesn't this complicate the idea of "pluggable" targets, thus
> >> something like a hybrid ppc / spu compiler?  In this light merging
> >> middle-end and target builtin enums and arrays sounds like a step
> >> backward.
> >
> > If we are willing to pay the storage costs, we could have 1 or 2 bytes for
> > builtin owner, and 2 bytes for builtin index, and then reserve 0 for standard
> > builtins and 1 for machine dependent builtins.  However, then you still have
> > the potential problem that sooner or later somebody else will omit the checks.
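
A rough sketch of the owner/index split described above, assuming a 2-byte
owner and a 2-byte index packed into one 32-bit code.  None of this is from the
patch: the names (builtin_owner, make_builtin_code, standard_builtin_name) and
the table of strings standing in for decls are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical owner tags; 0 and 1 follow the reservation suggested above.  */
enum builtin_owner { OWNER_STANDARD = 0, OWNER_MD = 1 };

/* Pack a 2-byte owner and a 2-byte index into one 32-bit code.  */
static inline uint32_t
make_builtin_code (unsigned owner, unsigned index)
{
  assert (owner <= 0xffff && index <= 0xffff);
  return ((uint32_t) owner << 16) | (uint32_t) index;
}

static inline unsigned
builtin_code_owner (uint32_t code)
{
  return code >> 16;
}

static inline unsigned
builtin_code_index (uint32_t code)
{
  return code & 0xffff;
}

/* Stand-in for the standard builtin table; the real one holds trees.  */
#define N_STANDARD 4
static const char *const standard_names[N_STANDARD]
  = { "clz", "ctz", "popcount", "parity" };

/* Return the standard builtin name, or a null pointer when the code belongs
   to another owner or is out of range, instead of indexing blindly.  */
static const char *
standard_builtin_name (uint32_t code)
{
  if (builtin_code_owner (code) != OWNER_STANDARD)
    return 0;
  if (builtin_code_index (code) >= N_STANDARD)
    return 0;
  return standard_names[builtin_code_index (code)];
}
```

With the check inside the accessor, a machine dependent code such as
make_builtin_code (OWNER_MD, 0) yields a null pointer rather than the entry for
"clz", so a caller cannot silently skip the owner test.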
> 
> I don't think the fact that you can only index BUILT_IN_NORMAL builtins
> in built_in_decls is an issue worth thinking about at all.  Those are simply
> bugs.


I've probably spent about 2-3 weeks total tracking down those bugs in the past,
because they are hard to pin down, but if we don't want to merge the two
numbers it isn't a deal breaker to me.  My thinking was that, while I'm playing
in the builtin space, I might as well fix the problem.

> > We could reserve a fixed range for plugin builtins if you think that is
> > desirable.
> 
> Oh, plugin builtins - I didn't even think about the possibility of having
> those ;)
> 
> In the end I think we should stick with BUILT_IN_CLASS and maybe
> add BUILT_IN_PLUGIN then ;)

I think if we do this, we should re-use the front end builtin class, and add
methods that let front ends add their builtins to the main list.  Otherwise we
need to grow the class by 1 bit.

> >> What I _do_ like is having common machinery for defining builtins.
> >> Though instead of continuing the .def file way with all the current
> >> warts of ways of adding attributes, etc. to builtins I would have
> >> preferred a genbuiltins.c program that can parse standard C
> >> declarations and generate whatever is necessary to setup the
> >> builtin decls.  Thus, instead of
> >>
> >> DEF_GCC_BUILTIN        (BUILT_IN_CLZ, "clz", BT_FN_INT_UINT,
> >> ATTR_CONST_NOTHROW_LEAF_LIST)
> >>
> >> have simply
> >>
> >> int __builtin_clz (unsigned int) __attribute__((const,nothrow,leaf));
> >>
> >> in a header file which genbuiltins.c would parse.  My first idea
> >> when discussing this was a -fgenbuiltins flag to the C frontend
> >> (because that already can do all the parsing ...), but Micha suggested
> >> a parser that can deal with the above is easy enough to "re-implement".
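
To give a feel for how small such a parser could be, here is a minimal sketch
that splits one declaration of the shape shown above into return type, name,
argument list and attribute list.  It is not genbuiltins.c (which does not
exist yet); struct builtin_proto and parse_builtin_proto are hypothetical names,
and the sketch assumes no parentheses inside the argument list.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

struct builtin_proto
{
  char ret[64];    /* return type, e.g. "int" */
  char name[64];   /* function name, e.g. "__builtin_clz" */
  char args[128];  /* raw argument list, e.g. "unsigned int" */
  char attrs[128]; /* raw attribute list, e.g. "const,nothrow,leaf" */
};

/* Copy [start, end) into DST with surrounding whitespace trimmed.
   Return 0 if the text does not fit in SIZE bytes.  */
static int
copy_trimmed (char *dst, size_t size, const char *start, const char *end)
{
  while (start < end && isspace ((unsigned char) *start))
    start++;
  while (end > start && isspace ((unsigned char) end[-1]))
    end--;
  if ((size_t) (end - start) >= size)
    return 0;
  memcpy (dst, start, (size_t) (end - start));
  dst[end - start] = '\0';
  return 1;
}

/* Parse "RET NAME (ARGS) __attribute__((ATTRS));".  Return 1 on success.  */
static int
parse_builtin_proto (const char *decl, struct builtin_proto *p)
{
  const char *lp = strchr (decl, '(');
  const char *rp = lp ? strchr (lp, ')') : 0;
  if (!rp)
    return 0;

  /* The name is the last identifier before the opening parenthesis.  */
  const char *name_end = lp;
  while (name_end > decl && isspace ((unsigned char) name_end[-1]))
    name_end--;
  const char *name_start = name_end;
  while (name_start > decl
         && (isalnum ((unsigned char) name_start[-1])
             || name_start[-1] == '_'))
    name_start--;
  if (name_start == name_end)
    return 0;

  if (!copy_trimmed (p->ret, sizeof p->ret, decl, name_start)
      || !copy_trimmed (p->name, sizeof p->name, name_start, name_end)
      || !copy_trimmed (p->args, sizeof p->args, lp + 1, rp))
    return 0;

  /* The attribute list is optional.  */
  p->attrs[0] = '\0';
  const char *attr = strstr (rp, "__attribute__");
  if (attr)
    {
      const char *a_open = strstr (attr, "((");
      const char *a_close = a_open ? strstr (a_open, "))") : 0;
      if (!a_close)
        return 0;
      return copy_trimmed (p->attrs, sizeof p->attrs, a_open + 2, a_close);
    }
  return 1;
}
```

Fed the __builtin_clz line above, this fills in "int", "__builtin_clz",
"unsigned int" and "const,nothrow,leaf"; mapping those strings to type nodes and
attribute lists is the part a real generator would still have to do.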
> >
> > Yes, that is certainly do-able.  My main intention is to see what kind of
> > infrastructure people wanted before changing all of the ppc builtins.
> 
> Sure.  I agree that all the duplicated code we have in backends for a
> way to create target builtins, defining enums (or not) for them and
> having a way to reference them for targetm.builtin_decl (or not) is bad.
> But unifying those, or providing common infrastructure for them should
> be orthogonal to the issue whether we want to "merge" the builtin
> classes or their storage in some way (I think we don't).  It would of
> course be nice if the infrastructure to create target builtins were
> generic enough to eventually handle builtin creation in the middle-end
> (and the frontends) as well.
> 
> >> Hm, I guess this pushes back a bit on your patch.  Sorry for that.
> >> If you're not excited to try the above idea, can you split out the
> >> pieces that do the .def file thing for rs6000, keeping the separation
> >> of md and middle-end builtin arrays and enums?
> >
> > I have several goals for the 4.7 time frame:
> >
> >  1) Make target attribute and pragma enable appropriate machine dependent
> >     builtins;
> 
> That's now something completely new ;)  Why do we need builtins for this?

I ran out of time when I added target pragma support in 4.6 to enable the
builtins for target functions.  We don't need new builtins, but the ppc backend
needs to enable the builtins that exist when the target is selected, which the
x86 already does.  In the end, I want to be able to do:

        void v4sf_add (float *, float *, float *, size_t)
                __attribute__ ((__ifunc__ ("resolve_v4sf_add")));

        static void v4sf_power7_add (float *, float *, float *, size_t)
                __attribute__ ((__target__ ("cpu=power7")));

        static void v4sf_altivec_add (float *, float *, float *, size_t)
                __attribute__ ((__target__ ("altivec")));

        static void v4sf_generic_add (float *, float *, float *, size_t);

        static void *resolve_v4sf_add (void);

        static void
        v4sf_power7_add (float *a, float *b, float *c, size_t n)
        {
          size_t n2 = n / 4;
          size_t n3 = n % 4;
          while (n2-- > 0)
            {
              vector float * v_a = (vector float *)a;
              vector float * v_b = (vector float *)b;
              vector float * v_c = (vector float *)c;

              *v_a = __builtin_vsx_xvaddsp (*v_b, *v_c);
              a += 4;
              b += 4;
              c += 4;
            }
          while (n3-- > 0)
            *a++ = *b++ + *c++;
        }

        static void
        v4sf_altivec_add (float *a, float *b, float *c, size_t n)
        {
          if (((((size_t)a) & 0xf) == 0)
              && ((((size_t)b) & 0xf) == 0)
              && ((((size_t)c) & 0xf) == 0))
            {
              size_t n2 = n / 4;
              n %= 4;
              while (n2-- > 0)
                {
                  vector float * v_a = (vector float *)a;
                  vector float * v_b = (vector float *)b;
                  vector float * v_c = (vector float *)c;

                  *v_a = __builtin_altivec_vaddfp (*v_b, *v_c);
                  a += 4;
                  b += 4;
                  c += 4;
                }
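            }

          /* Scalar loop for the leftover elements, or for all n elements
             when any pointer is not 16-byte aligned.  */
          while (n-- > 0)
            *a++ = *b++ + *c++;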
        }

        static void
        v4sf_generic_add (float *a, float *b, float *c, size_t n)
        {
          while (n-- > 0)
            *a++ = *b++ + *c++;
        }

        static void *
        resolve_v4sf_add (void)
        {
          if (running_on_power7 ())
            return (void *)v4sf_power7_add;
          else if (running_on_altivec ())
            return (void *)v4sf_altivec_add;
          else
            return (void *)v4sf_generic_add;
        }

Now, in this particular case, the compiler's auto vectorization might be used,
but it is meant to be an example.
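
For readers without ifunc support or a PowerPC at hand, the same selection can
be approximated in portable C with a function pointer resolved on first use.
This is only a sketch of the dispatch pattern, not what the ifunc attribute
expands to: add_by_four stands in for a vector variant, and have_wide_unit is a
hypothetical capability flag where a real resolver would query the CPU.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*add_fn) (float *, const float *, const float *, size_t);

/* Generic variant: one element per iteration.  */
static void
add_generic (float *a, const float *b, const float *c, size_t n)
{
  while (n-- > 0)
    *a++ = *b++ + *c++;
}

/* Stand-in for a vector variant: four elements per iteration, then a
   scalar loop for the n % 4 leftover elements.  */
static void
add_by_four (float *a, const float *b, const float *c, size_t n)
{
  size_t blocks = n / 4;
  size_t tail = n % 4;
  while (blocks-- > 0)
    {
      a[0] = b[0] + c[0];
      a[1] = b[1] + c[1];
      a[2] = b[2] + c[2];
      a[3] = b[3] + c[3];
      a += 4;
      b += 4;
      c += 4;
    }
  while (tail-- > 0)
    *a++ = *b++ + *c++;
}

/* Hypothetical capability flag; a real probe would query the CPU.  */
static int have_wide_unit = 1;

static add_fn
resolve_add (void)
{
  return have_wide_unit ? add_by_four : add_generic;
}

/* Resolve once, on first use, and cache the chosen variant.  */
static void
v4sf_add_dispatch (float *a, const float *b, const float *c, size_t n)
{
  static add_fn chosen;
  if (!chosen)
    chosen = resolve_add ();
  chosen (a, b, c, n);
}
```

The difference from ifunc is that here every call pays for the pointer test,
whereas with ifunc the dynamic linker runs the resolver once and binds the
symbol directly.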

> >  2) Make it less likely we will again be bitten by code that blindly
> >     references built_in_decl without checking if it is MD or standard;
> 
> I don't think this is important at all.  Proposed solution: transition
> builtin decl access to a functional interface:
> 
> tree built_in_decl (enum built_in_code)
> 
> which when building with C++ will get you warnings if indexed with
> a bogus enum type or an integer type.

Yep.

> >  3) Make at least the MD builtins created on demand.  It would be nice to do
> >     the standard builtins as well, but that may be somewhat more problematic.
> >     I do think all references to built_in_decl and implicit_built_in_decl
> >     should be moved to a macro wrapper.
> 
> To a (inline) function wrapper with the same name, indeed.
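
A toy model of what such an inline accessor might look like, combining the
range check from point 2 with the create-on-demand idea from point 3.  Everything
here is a stand-in: struct decl replaces tree, and the enum values, tables and
the decls_created counter are hypothetical, not GCC's.

```c
#include <assert.h>
#include <stddef.h>

enum built_in_code
{
  BUILT_IN_CLZ,
  BUILT_IN_CTZ,
  BUILT_IN_POPCOUNT,
  END_BUILTINS
};

/* Stand-in for a tree decl.  */
struct decl
{
  enum built_in_code code;
  const char *name;
};

static const char *const builtin_names[END_BUILTINS]
  = { "__builtin_clz", "__builtin_ctz", "__builtin_popcount" };

static struct decl decl_storage[END_BUILTINS];
static struct decl *decl_cache[END_BUILTINS];
static int decls_created;  /* counts how many decls were really built */

/* Functional interface: callers never touch the array directly, so the
   range check and the lazy creation live in exactly one place.  */
static inline struct decl *
built_in_decl (enum built_in_code code)
{
  assert ((unsigned) code < END_BUILTINS);
  if (!decl_cache[code])
    {
      decl_storage[code].code = code;
      decl_storage[code].name = builtin_names[code];
      decl_cache[code] = &decl_storage[code];
      decls_created++;
    }
  return decl_cache[code];
}
```

In C the enum parameter still accepts any integer silently; the compile-time
diagnosis mentioned above only comes from a C++ build, so the run-time assert is
what catches a bad code here.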
> 
> > If we restrict the types and attributes for a C like header file, it shouldn't
> > be that hard (famous last words).  I would think adding #ifdef also, so:
> >
> >        #ifdef __ALTIVEC__
> >        extern vector float __builtin_altivec_vaddfp (vector float, vector
> >                                float) __attribute__ ((...));
> >        #endif
> >
> > The backend would need to specify a list of valid #ifdef's and the mapping to
> > TARGET_<xxx>, and valid extra types with a mapping to the internal type node.
> 
> Yes.  For the middle-end/frontend stuff we also need a way to specify
> the difference between C89 and C99 builtins and GCC internal builtins.
> 
> Not sure if I'd use #ifdef like above or simply stick to .def file like
> attributions in comments like
> 
> /* TARGET(ALTIVEC) */
> extern vector float __builtin_altivec_vaddfp (vector float, vector
> float) __attribute__ ((...));
> 
> /* LANG(C99) */
> double cbrt (double) __attribute__ ((...));

This also works.
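
As a sketch of how the comment form could be consumed, the following reads a
TARGET(...) or LANG(...) annotation comment and looks a target name up in a
backend-supplied table.  The struct names, the table contents and the mask
values are made up for illustration; a real backend would map the names to its
own TARGET_<xxx> tests.

```c
#include <assert.h>
#include <string.h>

struct annotation
{
  char kind[16];  /* "TARGET" or "LANG" */
  char arg[32];   /* e.g. "ALTIVEC" or "C99" */
};

/* Parse a comment line holding KIND(ARG), as in the TARGET(ALTIVEC)
   example above.  Return 1 only for the two known kinds.  */
static int
parse_annotation (const char *line, struct annotation *out)
{
  const char *p = strstr (line, "/*");
  if (!p)
    return 0;
  p += 2;
  while (*p == ' ')
    p++;

  const char *lp = strchr (p, '(');
  const char *rp = lp ? strchr (lp, ')') : 0;
  if (!rp)
    return 0;

  size_t klen = (size_t) (lp - p);
  size_t alen = (size_t) (rp - lp - 1);
  if (klen == 0 || klen >= sizeof out->kind
      || alen == 0 || alen >= sizeof out->arg)
    return 0;

  memcpy (out->kind, p, klen);
  out->kind[klen] = '\0';
  memcpy (out->arg, lp + 1, alen);
  out->arg[alen] = '\0';

  return (strcmp (out->kind, "TARGET") == 0
          || strcmp (out->kind, "LANG") == 0);
}

/* Hypothetical backend table mapping annotation names to option masks.  */
struct target_map
{
  const char *name;
  unsigned mask;
};

static const struct target_map target_table[]
  = { { "ALTIVEC", 1u << 0 }, { "VSX", 1u << 1 } };

/* Return the mask for NAME, or 0 if the backend did not list it.  */
static unsigned
target_mask_for (const char *name)
{
  size_t i;
  for (i = 0; i < sizeof target_table / sizeof target_table[0]; i++)
    if (strcmp (target_table[i].name, name) == 0)
      return target_table[i].mask;
  return 0;
}
```

An unknown name returning 0 gives the generator a natural place to reject
annotations the backend did not declare as valid.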

> We also have fancy things like conditionally adding attributes, for example
> the ATTR_MATHFN_FPROUNDING stuff.  Using #ifdefs for this would
> work; I think there is a special macro defined for -frounding-math.
> 
> The idea certainly needs some thoughts, and the target builtins probably
> have less feature requirements than the middle-end builtins.def stuff.
> 
> The .h file could also serve as a container for builtin documentation.
> 
> > The alternative is something like what Kenney and Mike are doing in their
> > private port, where they have new syntax in the MD file for builtins.
> 
> But are those user-exposed builtins?  Certainly interesting to combine
> builtin definition and the instruction it expands to.

Yes, these are user exposed builtins.  Massive amounts of user exposed builtins
(Mike said he needs 13 bits for the builtin index).  I think it would be better
if Mike comments on this.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com     fax +1 (978) 399-6899
