On 2015-09-02 12:24:33 -0700, Peter Geoghegan wrote: > "Read fetch". One argument past to the intrinsic here specifies that > the variable will be read only. I did things this way because I > imagined that there would be very limited uses for the macro only. I > probably cover almost all interesting cases for explicit memory > prefetching already.
I'd be less brief in this case then, no need to be super short here. > > I don't think you should specify the read/write and locality parameters > > if we don't hand-pick them - right now you're saying the memory will > > only be read and that it has no temporal locality. > > > > I think there should be a empty fallback definition even if the current > > only caller ends up not needing it - not all callers will require it. > > It started out that way, but Tom felt that it was better to have a > USE_MEM_PREFETCH because of the branch below... That doesn't mean we shouldn't still provide an empty definition. > > Hm. Why do we need a branch here? The target of prefetches don't need to > > be valid addresses and adding a bunch of arithmetic and a branch for the > > corner case doesn't sound worthwhile to me. > > ...which is needed, because I'm still required to not dereference wild > pointers. > In other words, although pg_rfetch()/__builtin_prefetch() > does not require that a valid pointer be passed, it is not okay to > read past an array's bounds to read that pointer. The GCC docs are > clear on this -- "Data prefetch does not generate faults if addr is > invalid, but the address expression itself must be valid". That's just a question of how to formulate this though? pg_rfetch(((char *) state->memtuples ) + 3 * sizeof(SortTuple) + offsetof(SortTuple, tuple))? For something heavily platform dependent like this that seems ok. > Because that was the fastest value following testing on my laptop. You > are absolutely right to point out that this isn't a good reason to go > with the patch -- I share your concern. All I can say in defense of > that is that other major system software does the same, without any > regard for the underlying microarchitecture AFAICT. I know linux stripped out most prefetches at some point, even from x86 specific code, because it showed that they aged very badly. I.e. they removed a bunch of them and stuff got faster, whereas they were beneficial on earlier architectures. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers