Tom Lane wrote:
> Also, we have a generic issue that making fresh entries in a hashtable
> might result in a concurrent hash_seq_search scan visiting existing
> entries more than once; that's definitely not something any of the
> existing callers are thinking about.

Ouch. Note that we can also miss some entries altogether, which is probably even worse.

In case someone is wondering how that can happen, here's an example. We're scanning a bucket that contains four entries, and the bucket is split after the scan has returned entry 1:

1 -> 2* -> 3 -> 4

* denotes the next entry the seq scan has stored.

If the split goes, for example, like this:

1 -> 3
2* -> 4

The seq scan will continue scanning from 2, then 4, and miss 3 altogether.
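
To make that concrete, here's a stand-alone simulation of exactly the split above (plain C, not dynahash code; the Entry type and the way the entries are redistributed are just for illustration):

#include <stdio.h>
#include <stdlib.h>

typedef struct Entry
{
    int         key;
    struct Entry *next;
} Entry;

static Entry *
make_entry(int key, Entry *next)
{
    Entry      *e = malloc(sizeof(Entry));

    e->key = key;
    e->next = next;
    return e;
}

int
main(void)
{
    /* bucket: 1 -> 2 -> 3 -> 4 */
    Entry      *e4 = make_entry(4, NULL);
    Entry      *e3 = make_entry(3, e4);
    Entry      *e2 = make_entry(2, e3);
    Entry      *e1 = make_entry(1, e2);
    Entry      *scan_next;
    Entry      *e;

    /* the scan returns 1 and stores 2 as the next entry to visit */
    scan_next = e2;
    printf("visit %d\n", e1->key);

    /* split the bucket: 1 -> 3 stays, 2 -> 4 moves to the new bucket */
    e1->next = e3;
    e3->next = NULL;
    e2->next = e4;
    e4->next = NULL;

    /* the scan resumes from its stored pointer: 2, then 4 */
    for (e = scan_next; e != NULL; e = e->next)
        printf("visit %d\n", e->key);

    /* entry 3 was never visited */
    return 0;
}

Running it prints "visit 1", "visit 2", "visit 4" -- entry 3 is skipped, just as in the diagram.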

I briefly went through all callers of hash_seq_init. The only place where we explicitly rely on being able to add entries to a hash table while scanning it is in tbm_lossify. There are more complex loops in portalmem.c and relcache.c, which I think are safe, but I would need to look closer. There's also the pg_prepared_statement set-returning function that keeps a scan open across calls, which seems error-prone.
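
For the archives, the problematic pattern boils down to something like this, in dynahash terms (a schematic sketch only; MyEntry, my_hash and companion_key are made-up names):

#include "postgres.h"
#include "utils/hsearch.h"

typedef struct MyEntry
{
    int         key;
    int         companion_key;
} MyEntry;

static void
unsafe_scan(HTAB *my_hash)
{
    HASH_SEQ_STATUS status;
    MyEntry    *entry;

    hash_seq_init(&status, my_hash);
    while ((entry = (MyEntry *) hash_seq_search(&status)) != NULL)
    {
        bool        found;

        /*
         * HASH_ENTER can enlarge the table and split a bucket,
         * invalidating the position that 'status' is holding: the
         * open scan may now skip entries or visit them twice.
         */
        (void) hash_search(my_hash, &entry->companion_key,
                           HASH_ENTER, &found);
    }
}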

Should we document the fact that it's not safe to insert new entries into a hash table while scanning it, and fix the few call sites that do, or does anyone see a better solution? One alternative would be to inhibit bucket splits while a scan is in progress, but then we'd need to take care to clean up after each scan, including scans that are aborted by an error. A sketch of what that interlock could look like is below.
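
None of these fields or functions exist in dynahash today; this is just the shape of the idea:

#include <stdbool.h>

typedef struct HashTable
{
    int         num_active_scans;   /* seq scans currently open */
    /* ... buckets, entry counts, etc. ... */
} HashTable;

static void
scan_begin(HashTable *ht)
{
    ht->num_active_scans++;
}

static void
scan_end(HashTable *ht)
{
    ht->num_active_scans--;
}

/* called from the insert path when the fill factor is exceeded */
static bool
split_allowed(HashTable *ht)
{
    /*
     * Defer the split while any scan is open.  The catch mentioned
     * above: a scan that is abandoned partway through must still call
     * scan_end(), or splits stay disabled forever, so every scan needs
     * guaranteed cleanup.
     */
    return ht->num_active_scans == 0;
}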

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

