Thanks for the replies and thoughts.

On 08/19/14 18:27, Heikki Linnakangas wrote:
> On 08/20/2014 12:17 AM, John Lumby wrote:
>> I am attaching a new version of the patch for consideration in the current commit fest.

> Thanks for working on this!

>> Relative to the version I submitted on 25 June in bay175-w412ff89303686022a9f16aa3...@phx.gbl, the method for handling aio completion using sigevent has been re-written to use
>> signals exclusively rather than a composite of signals and LWlocks,
>> and this has fixed the problem I mentioned before with the LWlock method.

> ISTM the patch is still allocating stuff in shared memory that really doesn't belong there. Namely, the BufferAiocb structs. Or at least parts of it; there's also a waiter queue there which probably needs to live in shared memory, but the rest of it does not.

Actually, the reason the BufferAiocb (the postgresql block corresponding to the aio's aiocb) must be located in shared memory is that, as you said, it acts as the anchor for the waiter list.
See further comment below.


> At least BufAWaitAioCompletion is still calling aio_error() on an AIO request that might've been initiated by another backend. That's not OK.

Yes, you are right, and I agree with this one.
I will add an aio_error_return_code field in the BufferAiocb,
and only the originator will set this from the real aiocb.


> Please write the patch without atomic CAS operation. Just use a spinlock.

Umm, this is a new criticism, I think. I use CAS for things other than locking, such as adding to and removing from a shared queue. I suppose a spinlock on the entire queue could be used equivalently, but with more code (extra confusion) and worse performance (coarser serialization). What is your objection to using gcc's atomic ops? Portability?


> There's a patch in the commitfest to add support for that,

Sorry, support for what? There are already spinlocks in postgresql;
do you mean some new kind? Please point me at it with a hackers msgid or something.

> but it's not committed yet, and all those USE_AIO_ATOMIC_BUILTIN_COMP_SWAP ifdefs make the patch more difficult to read. Same with all the other #ifdefs; please just pick a method that works.

Ok, yes, the ifdefs are unpleasant. I will do something about that.
Ideally they would be entirely confined to header files, with only macro functions in C files; maybe I can do that. And eventually, when the dust has settled,
eliminate obsolete ifdef blocks altogether.

> Also, please split prefetching of regular index scans into a separate patch. It's orthogonal to doing async I/O;

Actually, not completely orthogonal; see next comment.

> we could prefetch regular index scans with posix_fadvise too, and AIO should be useful for prefetching other stuff.

On 08/19/14 19:10, Claudio Freire wrote:
> On Tue, Aug 19, 2014 at 7:27 PM, Heikki Linnakangas
> <hlinnakan...@vmware.com> wrote:
>> Also, please split prefetching of regular index scans into a separate patch. ...
> That patch already happened on the list, and it wasn't a win in many
> cases. I'm not sure it should be proposed independently of this one.
> Maybe a separate patch, but it should be considered dependent on this.
>
> I don't have an archive link at hand atm, but I could produce one later.
Several people have asked to split this patch into several smaller ones, and I
have thought about it. It would introduce some awkward dependencies.
E.g. splitting the scanner code (index, relation-heap) into a separate patch
from the aio code would mean that the scanner patch becomes dependent
on the aio patch. They are not quite orthogonal.

The reason is that the scanners call a new function, DiscardBuffer(blockid),
when aio is in use. We could get around that by providing a stub for that function
in the scanner patch, but then there is some risk of someone getting the
wrong version of that function in their build. It just adds yet more complexity
and breakage opportunities.


> - Heikki

One further comment concerning these BufferAiocb and aiocb control blocks
being in shared memory:

I explained above that the BufferAiocb must be in shared memory for wait/post.
The aiocb does not necessarily have to be in shared memory,
but since there is a one-to-one relationship between BufferAiocb and aiocb,
it makes the code much simpler, and the operation much more efficient,
if the aiocb is embedded in the BufferAiocb as I have it now.
E.g. if the aiocb were in private-process memory, then an additional allocation
scheme would be needed (a fixed number? palloc()'ing extra ones as needed? ...),
which adds complexity, and probably some inefficiency, since a shared pool is usually more efficient (it allows a higher maximum per process, etc.). There would also have to be some pointer de-referencing from BufferAiocb to aiocb, adding some (small) overhead.

I understood your objection to the use of shared memory as being that you don't want a non-originator to access the originator's aiocb using aio_xxx calls. With the one change I've said I will make above (putting the aio_error return code in the BufferAiocb), I will have achieved that requirement. I am hoping this answers your objections
concerning shared memory.
