Jeff Davis wrote:
On Wed, 2007-05-02 at 23:59 +0100, Heikki Linnakangas wrote:
Jeff Davis wrote:
On Wed, 2007-05-02 at 20:58 +0100, Heikki Linnakangas wrote:
Jeff Davis wrote:
What should be the maximum size of this hash table?
Good question. And also, how do you remove entries from it?
I guess the size should somehow be related to the number of backends. Each
backend will realistically be doing just one, or at most two, seq scans at a time.
It also depends on the number of large tables in the databases, but we
don't have that information easily available. How about using just
NBackends? That should be plenty, but wasting a few hundred bytes of
memory won't hurt anyone.
One entry per relation, not per backend, is my current design.
Umm, you naturally have just one entry per relation, but we were talking
about how many entries the table needs to hold. Your patch had a
hard-coded value of 1000, which is quite arbitrary.
I think I'm missing something from your statement "I guess the size
should somehow be related to number of backends." If there is only one
entry per relation, why do more backends require a larger hash table?
The hash table keeps track of ongoing seq scans. That's presumably
related to the number of backends; I can't imagine a plan where a backend is
executing more than one seq scan at a time, so that gives us an upper
bound of NBackends simultaneous seq scans in the system.
Sure, if you only have say 5 tables that are large enough that we try to
keep track of them, there can't be more than 5 entries in the table. But
we don't know how many such tables there are at startup time.
A hard coded constant in the range of 10 - 100 might be just fine in
practice.
If I just step back for a second to consider a simpler design:
We will most likely have no more than a few relations larger than the
minimum threshold for Sync Scanning in any database being scanned
concurrently.
Note that the list would be shared with all databases in the cluster.
Your point is still valid, though.
What if I just make an LRU that occupies a fixed size? Any reads or
updates can start at the MRU item and work backward, and any evictions
can happen at the LRU item.
Does a hash table really save us cycles if we keep the list small, say,
100 items? Looking at all 100 items would only be required perhaps on a
scan startup. I don't have a good sense of scale here, is following 50
pointers while holding a lock a significant source of contention?
Hmm. If you only need to scan through it when you start the scan, I
suppose it would be ok.
100 is of course arbitrary, and that could grow or shrink automatically
at runtime.
Except that it can't shrink, because we don't have any convenient moment
to remove an entry from the list, other than dropping the least recently
used entry out of the list when inserting a new one. And since it's in
shared mem, the allocated size is fixed at boot up, so we can't grow it
either.
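The fixed-size LRU list described above might be sketched as follows: a boot-time-sized array where index 0 is the MRU end, lookups walk from the MRU end, updates move an entry to the front, and inserting into a full list drops the entry at the LRU end. The names (ScanLocation, SS_NUM_ENTRIES, ss_report_location, ss_get_location) are illustrative assumptions, not code from the patch, and the real version would run under a lock in shared memory.

```c
#include <assert.h>
#include <string.h>

#define SS_NUM_ENTRIES 100      /* allocated at boot; cannot grow or shrink */

typedef unsigned int Oid;
typedef unsigned int BlockNumber;

typedef struct ScanLocation
{
    Oid         relid;          /* relation being scanned */
    BlockNumber location;       /* last block reported by a scanning backend */
} ScanLocation;

/* Index 0 is the MRU end; index scan_used - 1 is the LRU end. */
static ScanLocation scan_locations[SS_NUM_ENTRIES];
static int          scan_used = 0;

/*
 * Report the current position of a seq scan on 'relid'.  If the relation
 * is already tracked, update it and move it to the MRU end; otherwise
 * insert at the MRU end, dropping the LRU entry if the list is full.
 */
static void
ss_report_location(Oid relid, BlockNumber location)
{
    ScanLocation entry = {relid, location};
    int         i;

    for (i = 0; i < scan_used; i++)
    {
        if (scan_locations[i].relid == relid)
            break;
    }
    if (i == scan_used)
    {
        /* not found: take a fresh slot if there is room, else evict LRU */
        if (scan_used < SS_NUM_ENTRIES)
            scan_used++;
        i = scan_used - 1;
    }
    /* shift slots 0..i-1 down one, then install the entry at the MRU end */
    memmove(&scan_locations[1], &scan_locations[0], i * sizeof(ScanLocation));
    scan_locations[0] = entry;
}

/*
 * Look up where other scans on 'relid' are, starting from the MRU end.
 * Returns 1 and sets *location if found, 0 otherwise.  This full walk
 * only happens at scan startup, as discussed above.
 */
static int
ss_get_location(Oid relid, BlockNumber *location)
{
    for (int i = 0; i < scan_used; i++)
    {
        if (scan_locations[i].relid == relid)
        {
            *location = scan_locations[i].location;
            return 1;
        }
    }
    return 0;
}
```

The move-to-front on every update is what keeps hot relations near the MRU end, so the common lookup touches only a few slots even though the worst case walks all of them.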
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com