On 2015-07-19 16:34:52 -0700, Peter Geoghegan wrote:
> +# PGAC_C_BUILTIN_PREFETCH
> +# -------------------------
> +# Check if the C compiler understands __builtin_prefetch(),
> +# and define HAVE__BUILTIN_PREFETCH if so.
> +AC_DEFUN([PGAC_C_BUILTIN_PREFETCH],
> +[AC_CACHE_CHECK(for __builtin_prefetch, pgac_cv__builtin_prefetch,
> +[AC_COMPILE_IFELSE([AC_LANG_PROGRAM([],
> +[int i = 0;__builtin_prefetch(&i, 0, 3);])],
> +[pgac_cv__builtin_prefetch=yes],
> +[pgac_cv__builtin_prefetch=no])])
> +if test x"$pgac_cv__builtin_prefetch" = xyes ; then
> +AC_DEFINE(HAVE__BUILTIN_PREFETCH, 1,
> +          [Define to 1 if your compiler understands __builtin_prefetch.])
> +fi])# PGAC_C_BUILTIN_PREFETCH

Hm. Is a compiler test actually test anything reliably here? Won't this
just throw a warning during compile time about an unknown function?

> +/*
> + * Prefetch support -- Support memory prefetching hints on some platforms.
> + *
> + * pg_rfetch() is specialized for the case where an array is accessed
> + * sequentially, and we can prefetch a pointer within the next element (or an
> + * even later element) in order to hide memory latency.  This case involves
> + * prefetching addresses with low temporal locality.  Note that it's rather
> + * difficult to get any kind of speedup using pg_rfetch();  any use of the
> + * intrinsic should be carefully tested.  Also note that it's okay to pass it
> + * an invalid or NULL address, although it's best avoided.
> + */
> +#if defined(USE_MEM_PREFETCH)
> +#define pg_rfetch(addr)              __builtin_prefetch((addr), 0, 0)
> +#endif

What is that name actually supposed to mean? 'rfetch' doesn't seem to be
particularly clear - or is it a meant to be a wordplay combined with the
p?

I don't think you should specify the read/write and locality parameters
if we don't hand-pick them - right now you're saying the memory will
only be read and that it has no temporal locality.

I think there should be a empty fallback definition even if the current
only caller ends up not needing it - not all callers will require it.

> +                                     /*
> +                                      * Perform memory prefetch of tuple 
> that's three places
> +                                      * ahead of current (which is returned 
> to caller).
> +                                      * Testing shows that this 
> significantly boosts the
> +                                      * performance for TSS_INMEM "forward" 
> callers by
> +                                      * hiding memory latency behind their 
> processing of
> +                                      * returned tuples.
> +                                      */
> +#ifdef USE_MEM_PREFETCH
> +                                     if (readptr->current + 3 < 
> state->memtupcount)
> +                                             
> pg_rfetch(state->memtuples[readptr->current + 3]);
> +#endif

Hm. Why do we need a branch here? The target of prefetches don't need to
be valid addresses and adding a bunch of arithmetic and a branch for the
corner case doesn't sound worthwhile to me.


What worries me about adding explicit prefetching is that it's *highly*
platform and even micro-architecture dependent. Why is looking three
elements ahead the right value?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to