Tom Lane wrote:
The pending-fsync stuff in md.c is also expecting to be able to add
entries during a scan.

No, mdsync starts the scan from scratch after calling AbsorbFsyncRequests.

I don't think we can go in the direction of forbidding insertions during
a scan --- as the case at hand shows, it's just not always obvious that
that could happen, and finding/fixing such a problem is nigh impossible.
(We were darn fortunate to be able to reproduce this one.)  Plus we have
a couple of places where it's really necessary to be able to do it,
anyway.

The only answer I can see that seems reasonably robust is to change
dynahash.c so that it tracks whether any seq_search scans are open on a
hashtable, and doesn't carry out any splits while one is.  This wouldn't
cost anything noticeable in performance, assuming that not very many
splits are postponed.  The PITA aspect of it is that we'd need to add
bookkeeping mechanisms to ensure that the count of active scans gets
cleaned up on error exit.  It's not like we've not got lots of those,
though.

We could have two kinds of seq scans, with and without support for concurrent inserts. If you open a scan without that support, it acts just like today, and no extra bookkeeping or clean up by the caller is required. If you need concurrent inserts, we inhibit bucket splits, but it's up to the caller to explicitly close the scan, possibly with PG_TRY/CATCH. I'm not sure if that's simpler in the end, but we could get away without adding generic bookkeeping mechanism.

Possibly we could simplify matters a bit by not worrying about cleaning
up leaked counts at subtransaction abort, ie, the list of open scans
would only get forced to empty at top transaction end.  This carries a
slightly higher risk of meaningful performance degradation, but in
practice I doubt it's a big problem.  If we agreed that then we'd not
need ResourceOwner support --- it could be handled like LWLock counts.

Hmm. Unlike lwlocks, hash tables can live in different memory contexts, so we can't just have list of open scans similar to held_lwlocks array.

Do we need to support multiple simultaneous seq scans of a hash table? I suppose we do..

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

              http://archives.postgresql.org

Reply via email to