Re: [HACKERS] init_sequence spill to hash table

2013-11-15 Thread Andres Freund
On 2013-11-15 14:22:30 +1300, David Rowley wrote:
 On Fri, Nov 15, 2013 at 3:12 AM, Andres Freund and...@2ndquadrant.com wrote:
 
  Hi,
 
  On 2013-11-13 22:55:43 +1300, David Rowley wrote:
   Here 
   http://www.postgresql.org/message-id/24278.1352922...@sss.pgh.pa.us there
   was some talk about init_sequence being a bottleneck when many sequences
   are used in a single backend.
  
   The attached I think implements what was talked about in the above link
   which for me seems to double the speed of a currval() loop over 3
   sequences. It goes from about 7 seconds to 3.5 on my laptop.
 
  I think it'd be a better idea to integrate the sequence caching logic
  into the relcache. There's a comment about it:
   * (We can't
   * rely on the relcache, since it's only, well, a cache, and may decide to
   * discard entries.)
  but that's not really accurate anymore. We have the infrastructure for
  keeping values across resets and we don't discard entries.
 
 
 I just want to check this idea against an existing todo item to move
 sequences into a single table, as I think, by the sounds of it, this ties
 sequences even more closely to being relations.

 This had been on the back of my mind while implementing the hash table
 stuff for init_sequence and again when doing my benchmarks where I created
 3 sequences and went through the pain of having a path on my file
 system with 3 8k files.

Well. But in which real-world use cases is that actually the bottleneck?

 1. The search_path stuff makes this a bit more complex. It sounds like this
 would require some duplication of the search_path logic.

I'd assumed that if we were to do this, the sequences themselves would
still live in pg_class; just that instead of a relfilenode containing
their state, the state would be stored in an extra table.

 2. There is also the problem with tracking object dependency.
 
 Currently:
 create sequence t_a_seq;
 create table t (a int not null default nextval('t_a_seq'));
 alter sequence t_a_seq owned by t.a;
 drop table t;
 drop sequence t_a_seq; -- already deleted by drop table t
 ERROR:  sequence t_a_seq does not exist
 
 Moving sequences to a single table sounds like a special case for this
 logic.

There should already be code handling such dependencies.

4) Scalability problems: the one block each sequence uses can already be a
major contention issue when you have parallel inserts into the same
table, a workload which, unlike a couple thousand unrelated sequences, I
actually think is more realistic. So we'd need to force one sequence tuple
per block, which we currently cannot do.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] init_sequence spill to hash table

2013-11-15 Thread Andres Freund
On 2013-11-15 19:12:15 +1300, David Rowley wrote:
 On Fri, Nov 15, 2013 at 3:23 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 
  Andres Freund and...@2ndquadrant.com writes:
   I think it'd be a better idea to integrate the sequence caching logic
   into the relcache. There's a comment about it:
* (We can't
* rely on the relcache, since it's only, well, a cache, and may decide
  to
* discard entries.)
   but that's not really accurate anymore. We have the infrastructure for
   keeping values across resets and we don't discard entries.
 
  We most certainly *do* discard entries, if they're not open when a cache
  flush event comes along.
 
  I suppose it'd be possible to mark a relcache entry for a sequence
  as locked-in-core, but that doesn't attract me at all.  A relcache
  entry is a whole lot larger than the amount of state we really need
  to keep for a sequence.
 
  One idea is to have a hashtable for the sequence-specific data,
  but to add a link field to the relcache entry that points to the
  non-flushable sequence hashtable entry.  That would save the second
  hashtable lookup as long as the relcache entry hadn't been flushed
  since last use, while not requiring any violence to the lifespan
  semantics of relcache entries.  (Actually, if we did that, it might
  not even be worth converting the list to a hashtable?  Searches would
  become a lot less frequent.)
 
 
 Unless I've misunderstood something it looks like this would mean giving
 heapam.c and relcache.c knowledge of sequences.
 Currently relation_open is called from open_share_lock in sequence.c. The
 only way I can see to do this would be to add something like
 relation_open_sequence() in heapam.c which means we'd need to invent
 RelationIdGetSequenceRelation() and use that instead
 of RelationIdGetRelation() and somewhere along the line have it pass back
 the SeqTableData struct which would be tagged onto RelIdCacheEnt.

Looks like indexes already go through that path, and store data in the
relcache, without requiring heapam.c to know all that much.

 I think it can be done but I don't think it will look pretty.
 Perhaps if there was a more generic way... Would tagging some void
 *rd_extra onto the RelationData be a better way? And just have sequence.c
 make use of that for storing the SeqTableData.

I'd rather have it properly typed in ->rd_sequence, maybe in a union
against the index members.

 Also I'm wondering what we'd do with all these pointers when someone does
 DISCARD SEQUENCES; would we have to invalidate the relcache or would it
 just be matter of looping over it and freeing of the sequence data setting
 the pointers to NULL?

Good question. Have a 'discard_count' member and global variable? That
starts at zero and gets incremented every time somebody discards?
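A minimal self-contained sketch of that counter scheme (all names here, such as seq_discard_count, are hypothetical, not taken from any actual patch):

```c
#include <assert.h>
#include <stdint.h>

/* Global counter, bumped once per DISCARD SEQUENCES. */
static uint64_t seq_discard_count = 0;

typedef struct SeqEntry
{
    int64_t  last_value;
    uint64_t discard_stamp;    /* seq_discard_count as of last use */
} SeqEntry;

/* Instead of walking the cache to NULL out every pointer, DISCARD just
 * increments the global counter; entries carrying an older stamp are
 * treated as if they had been freed. */
static int
seq_entry_is_valid(const SeqEntry *entry)
{
    return entry->discard_stamp == seq_discard_count;
}

static void
seq_entry_touch(SeqEntry *entry, int64_t value)
{
    entry->last_value = value;
    entry->discard_stamp = seq_discard_count;
}

static void
discard_sequences(void)
{
    seq_discard_count++;
}
```

The point of the design is that DISCARD SEQUENCES becomes O(1) regardless of how many cached entries exist.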

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] init_sequence spill to hash table

2013-11-15 Thread Heikki Linnakangas

On 15.11.2013 07:47, David Rowley wrote:

On Fri, Nov 15, 2013 at 3:03 AM, Heikki Linnakangas hlinnakan...@vmware.com

wrote

I think that means that we should just completely replace the list with
the hash table. The difference with a small N is lost in noise, so there's
no point in keeping the list as a fast path for small N. That'll make the
patch somewhat simpler.


Attached is a much more simple patch which gets rid of the initial linked
list.


Thanks, committed with minor copy-editing. I dialed down the initial 
size of the hash table from 1000 to 16, that ought to be enough.


I ran a quick performance test of my own, based on the script you sent. 
I modified it a bit to eliminate the PL/pgSQL overhead, making it more 
heavily bottlenecked by the nextval/currval overhead. Results:


nextval, 1 seqs     36772   2426
currval, 1 seq      1176    1069
currval, 2 seqs     865     857
currval, 4 seqs     742     759
currval, 5 seqs     718     711
currval, 10 seqs    680     668
currval, 100 seqs   871     656
currval, 1000 seqs  3507    700
currval, 1 seqs     34742   1224

The performance when you touch only a few sequences is unchanged. When 
you touch a lot of them, you gain. Just as you would expect.


Attached is the test script I used. After running the test, I realized 
that there's a little flaw in the test methodology, but it doesn't 
invalidate the results. I used the same backend for all the test runs, 
so even when currval() is called repeatedly for a single sequence, the 
linked list (or hash table, with the patch) nevertheless contains 
entries for all 1 sequences. However, the sequences actually used by 
the test are always in the front of the list, because the nextval() 
calls were made in the same order. But with the unpatched version, if 
you called currval() on the last-initialized sequence repeatedly, 
instead of the first-initialized one, you would get horrible 
performance, even when you touch only a single sequence.


Regarding the more grandiose ideas of using the relcache or rewriting 
the way sequences are stored altogether: this patch might become 
obsolete if we do any of that stuff, but that's ok. The immediate 
performance problem has been solved now, but those other ideas might be 
worth pursuing for other reasons.


- Heikki
CREATE or replace FUNCTION create_seq(n integer) RETURNS void
LANGUAGE plpgsql
AS $$
BEGIN
drop table if exists testseqs;
create table testseqs(seqoid oid, n int4);
WHILE n > 0 LOOP
EXECUTE 'CREATE SEQUENCE testseq' || n;
	insert into testseqs select oid, n from pg_class where relname=('testseq' || n);
  n := n - 1;
END LOOP;
END
$$;

CREATE or replace FUNCTION drop_seq() RETURNS void language plpgsql as
$$
declare
   n int4;
begin
   for n in select testseqs.n from testseqs loop
 execute 'drop sequence testseq' || n;
   end loop;
end;
$$;


CREATE or replace FUNCTION nextval_seq(nseqs integer, niter integer) RETURNS bigint
LANGUAGE sql
AS $$
select count(*) from (
  select nextval(seqoid), n, g from testseqs, generate_series(1, niter) g where n <= nseqs
) foo;
$$;


CREATE or replace FUNCTION currval_seq(nseqs integer, niter integer) RETURNS bigint
LANGUAGE sql
AS $$
select count(*) from (
  select currval(seqoid), n, g from testseqs, generate_series(1, niter) g where n <= nseqs
) foo;
$$;


\timing on
--select create_seq(1);

\echo warmup
select nextval_seq(1, 100);

\echo calling nextval on 1 distinct sequences, 100 times
select nextval_seq(1, 100);

\echo calling currval on 1 sequence, 100 times
select currval_seq(1, 100);

\echo calling currval on 2 sequences, 50 times
select currval_seq(2, 50);

\echo calling currval on 4 sequences, 25 times
select currval_seq(4, 25);

\echo calling currval on 5 sequences, 20 times
select currval_seq(5, 20);

\echo calling currval on 10 sequences, 10 times
select currval_seq(10, 10);

\echo calling currval on 100 sequences, 1 times
select currval_seq(100, 1);

\echo calling currval on 1000 sequences, 1000 times
select currval_seq(1000, 1000);

\echo calling currval on 1 distinct sequences, 100 times
select currval_seq(1, 100);





Re: [HACKERS] init_sequence spill to hash table

2013-11-15 Thread Andres Freund
On 2013-11-15 12:31:54 +0200, Heikki Linnakangas wrote:
 On 15.11.2013 07:47, David Rowley wrote:
 On Fri, Nov 15, 2013 at 3:03 AM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote
 
 I think that means that we should just completely replace the list with
 the hash table. The difference with a small N is lost in noise, so there's
 no point in keeping the list as a fast path for small N. That'll make the
 patch somewhat simpler.
 
 Attached is a much more simple patch which gets rid of the initial linked
 list.
 
 Thanks, committed with minor copy-editing. I dialed down the initial size of
 the hash table from 1000 to 16, that ought to be enough.

I am slightly confused by the use of HASH_CONTEXT without setting
ctl->hcxt; that works, but seems more like an accident.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] init_sequence spill to hash table

2013-11-15 Thread David Rowley
On Fri, Nov 15, 2013 at 11:31 PM, Heikki Linnakangas 
hlinnakan...@vmware.com wrote:

 Thanks, committed with minor copy-editing. I dialed down the initial size
 of the hash table from 1000 to 16, that ought to be enough.


Great. Thanks for committing.

Regards

David Rowley


 - Heikki



Re: [HACKERS] init_sequence spill to hash table

2013-11-15 Thread Heikki Linnakangas

On 15.11.2013 12:44, Andres Freund wrote:

On 2013-11-15 12:31:54 +0200, Heikki Linnakangas wrote:

On 15.11.2013 07:47, David Rowley wrote:

On Fri, Nov 15, 2013 at 3:03 AM, Heikki Linnakangas hlinnakan...@vmware.com

wrote

I think that means that we should just completely replace the list with
the hash table. The difference with a small N is lost in noise, so there's
no point in keeping the list as a fast path for small N. That'll make the
patch somewhat simpler.


Attached is a much more simple patch which gets rid of the initial linked
list.


Thanks, committed with minor copy-editing. I dialed down the initial size of
the hash table from 1000 to 16, that ought to be enough.


I am slightly confused by the use of HASH_CONTEXT without setting
ctl->hcxt; that works, but seems more like an accident.


Ugh. Accident indeed. Thanks, fixed!

- Heikki




Re: [HACKERS] init_sequence spill to hash table

2013-11-14 Thread Heikki Linnakangas

On 14.11.2013 14:38, David Rowley wrote:

I've just completed some more benchmarking of this. I didn't try dropping
the threshold down to 2 or 0 but I did tests at the cut over point and
really don't see much difference in performance between the list at 32 and
the hashtable at 33 sequences. The hash table version excels in the 16000
sequence test in comparison to the unpatched version.

Times are in milliseconds of the time it took to call currval() 10
times for 1 sequence.
                 Patched    Unpatched   increased by
1 in cache       1856.452   1844.11     -1%
32 in cache      1841.84    1802.433    -2%
33 in cache      1861.558   not tested  N/A
16000 in cache   1963.711   10329.22    426%


If I understand those results correctly, the best case scenario with the 
current code takes about 1800 ms. There's practically no difference with 
N = 32, where N is the number of sequences touched. The hash table 
method also takes about 1800 ms when N=33. The performance of the hash 
table is O(1), so presumably we can extrapolate from that that it's the 
same for any N.


I think that means that we should just completely replace the list with 
the hash table. The difference with a small N is lost in noise, so 
there's no point in keeping the list as a fast path for small N. That'll 
make the patch somewhat simpler.

- Heikki




Re: [HACKERS] init_sequence spill to hash table

2013-11-14 Thread Andres Freund
Hi,

On 2013-11-13 22:55:43 +1300, David Rowley wrote:
 Here http://www.postgresql.org/message-id/24278.1352922...@sss.pgh.pa.us there
 was some talk about init_sequence being a bottleneck when many sequences
 are used in a single backend.
 
 The attached I think implements what was talked about in the above link
 which for me seems to double the speed of a currval() loop over 3
 sequences. It goes from about 7 seconds to 3.5 on my laptop.

I think it'd be a better idea to integrate the sequence caching logic
into the relcache. There's a comment about it:
 * (We can't
 * rely on the relcache, since it's only, well, a cache, and may decide to
 * discard entries.)
but that's not really accurate anymore. We have the infrastructure for
keeping values across resets and we don't discard entries.

Since we already do a relcache lookup for every sequence manipulation
(c.f. init_sequence()) relying on it won't increase, but rather decrease
the overhead.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] init_sequence spill to hash table

2013-11-14 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I think it'd be a better idea to integrate the sequence caching logic
 into the relcache. There's a comment about it:
  * (We can't
  * rely on the relcache, since it's only, well, a cache, and may decide to
  * discard entries.)
 but that's not really accurate anymore. We have the infrastructure for
 keeping values across resets and we don't discard entries.

We most certainly *do* discard entries, if they're not open when a cache
flush event comes along.

I suppose it'd be possible to mark a relcache entry for a sequence
as locked-in-core, but that doesn't attract me at all.  A relcache
entry is a whole lot larger than the amount of state we really need
to keep for a sequence.

One idea is to have a hashtable for the sequence-specific data,
but to add a link field to the relcache entry that points to the
non-flushable sequence hashtable entry.  That would save the second
hashtable lookup as long as the relcache entry hadn't been flushed
since last use, while not requiring any violence to the lifespan
semantics of relcache entries.  (Actually, if we did that, it might
not even be worth converting the list to a hashtable?  Searches would
become a lot less frequent.)
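Roughly, the linking scheme described here could look like the following (a self-contained toy illustration; the real relcache and dynahash APIs are more involved, and every name below is invented):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Non-flushable per-sequence state; lives for the backend's lifetime. */
typedef struct SeqState
{
    uint32_t relid;            /* key: the sequence's OID */
    int64_t  last_value;
    struct SeqState *next;     /* stand-in for a hash bucket chain */
} SeqState;

/* Flushable cache entry; holds only a shortcut pointer. */
typedef struct CacheEntry
{
    uint32_t  relid;
    SeqState *seqlink;         /* reset to NULL on cache flush */
} CacheEntry;

static SeqState *seq_states = NULL;   /* stand-in for the hashtable */

static SeqState *
seq_state_get(CacheEntry *cache)
{
    /* Fast path: the shortcut survives as long as the cache entry
     * hasn't been flushed since last use; no second lookup needed. */
    if (cache->seqlink != NULL)
        return cache->seqlink;

    /* Slow path: search the long-lived table (a hash lookup in the
     * real thing), creating the entry on a miss. */
    SeqState *s;
    for (s = seq_states; s != NULL; s = s->next)
        if (s->relid == cache->relid)
            break;
    if (s == NULL)
    {
        s = calloc(1, sizeof(SeqState));
        s->relid = cache->relid;
        s->next = seq_states;
        seq_states = s;
    }
    cache->seqlink = s;        /* re-establish the shortcut */
    return s;
}
```

The key property is that a cache flush only costs the shortcut, never the sequence state itself.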

regards, tom lane




Re: [HACKERS] init_sequence spill to hash table

2013-11-14 Thread Andres Freund
On 2013-11-14 09:23:20 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I think it'd be a better idea to integrate the sequence caching logic
  into the relcache. There's a comment about it:
   * (We can't
   * rely on the relcache, since it's only, well, a cache, and may decide to
   * discard entries.)
  but that's not really accurate anymore. We have the infrastructure for
  keeping values across resets and we don't discard entries.
 
 We most certainly *do* discard entries, if they're not open when a cache
 flush event comes along.

What I was aiming at is that we don't discard them because of a limited
cache size. I don't think it means much that we flush the entry when
it's changed but not referenced.

 I suppose it'd be possible to mark a relcache entry for a sequence
 as locked-in-core, but that doesn't attract me at all.  A relcache
 entry is a whole lot larger than the amount of state we really need
 to keep for a sequence.

But effectively that's what already happens unless either somebody else
does an ALTER SEQUENCE or a sinval overflow happens, right? So in
production workloads we already will keep both around since hopefully
neither altering a sequence nor sinval overflows should be a frequent
thing.

 One idea is to have a hashtable for the sequence-specific data,
 but to add a link field to the relcache entry that points to the
 non-flushable sequence hashtable entry.  That would save the second
 hashtable lookup as long as the relcache entry hadn't been flushed
 since last use, while not requiring any violence to the lifespan
 semantics of relcache entries.  (Actually, if we did that, it might
 not even be worth converting the list to a hashtable?  Searches would
 become a lot less frequent.)

That's not a bad idea if we decide not to integrate them. And I agree,
there's not much need to have a separate hashtable in that case.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] init_sequence spill to hash table

2013-11-14 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-11-14 09:23:20 -0500, Tom Lane wrote:
 We most certainly *do* discard entries, if they're not open when a cache
 flush event comes along.

 What I was aiming at is that we don't discard them because of a limited
 cache size. I don't think it means much that we flush the entry when
 it's changed but not referenced.

Well, I don't want non-user-significant events (such as an sinval queue
overrun) causing sequence state to get discarded.  We would get bug
reports about lost sequence values.

regards, tom lane




Re: [HACKERS] init_sequence spill to hash table

2013-11-14 Thread Andres Freund
On 2013-11-14 09:47:18 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-11-14 09:23:20 -0500, Tom Lane wrote:
  We most certainly *do* discard entries, if they're not open when a cache
  flush event comes along.
 
  What I was aiming at is that we don't discard them because of a limited
  cache size. I don't think it means much that we flush the entry when
  it's changed but not referenced.
 
 Well, I don't want non-user-significant events (such as an sinval queue
 overrun) causing sequence state to get discarded.  We would get bug
 reports about lost sequence values.

But we can easily do as you suggest and simply retain the entry in that
case. I am just not seeing the memory overhead argument as counting much
since we don't protect against it in normal operation.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] init_sequence spill to hash table

2013-11-14 Thread David Rowley
On Fri, Nov 15, 2013 at 3:03 AM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 14.11.2013 14:38, David Rowley wrote:

 I've just completed some more benchmarking of this. I didn't try dropping
 the threshold down to 2 or 0 but I did tests at the cut over point and
 really don't see much difference in performance between the list at 32 and
 the hashtable at 33 sequences. The hash table version excels in the 16000
 sequence test in comparison to the unpatched version.

 Times are in milliseconds of the time it took to call currval() 10
 times for 1 sequence.
                   Patched    Unpatched   increased by
  1 in cache       1856.452   1844.11     -1%
  32 in cache      1841.84    1802.433    -2%
  33 in cache      1861.558   not tested  N/A
  16000 in cache   1963.711   10329.22    426%


 If I understand those results correctly, the best case scenario with the
 current code takes about 1800 ms. There's practically no difference with N
 = 32, where N is the number of sequences touched. The hash table method
 also takes about 1800 ms when N=33. The performance of the hash table is
 O(1), so presumably we can extrapolate from that that it's the same for any
 N.

 I think that means that we should just completely replace the list with
 the hash table. The difference with a small N is lost in noise, so there's
 no point in keeping the list as a fast path for small N. That'll make the
 patch somewhat simpler.
 - Heikki


I had thought that maybe the most common type of workload might only touch 1
or 2 sequences, and that there would be some overhead in both cycles and
memory usage in creating a hash table for these backends that use sequences
lightly. It would certainly make the patch simpler to remove this, and it
would also mean that I could remove the sometimes unused ->next member from
the SeqTableData struct, which is just set to NULL when in hash table mode.
If you think it's the way to go then I can make the change, though maybe
I'll hold off on the refactor for now as it looks like other ideas have come
up around the relcache.

Regards

David Rowley


Re: [HACKERS] init_sequence spill to hash table

2013-11-14 Thread David Rowley
On Fri, Nov 15, 2013 at 3:12 AM, Andres Freund and...@2ndquadrant.com wrote:

 Hi,

 On 2013-11-13 22:55:43 +1300, David Rowley wrote:
  Here 
  http://www.postgresql.org/message-id/24278.1352922...@sss.pgh.pa.us there
  was some talk about init_sequence being a bottleneck when many sequences
  are used in a single backend.
 
  The attached I think implements what was talked about in the above link
  which for me seems to double the speed of a currval() loop over 3
  sequences. It goes from about 7 seconds to 3.5 on my laptop.

 I think it'd be a better idea to integrate the sequence caching logic
 into the relcache. There's a comment about it:
  * (We can't
  * rely on the relcache, since it's only, well, a cache, and may decide to
  * discard entries.)
 but that's not really accurate anymore. We have the infrastructure for
 keeping values across resets and we don't discard entries.


I just want to check this idea against an existing todo item to move
sequences into a single table, as I think, by the sounds of it, this ties
sequences even more closely to being relations.

The todo item reads:

Consider placing all sequences in a single table, or create a system view

This had been on the back of my mind while implementing the hash table
stuff for init_sequence and again when doing my benchmarks where I created
3 sequences and went through the pain of having a path on my file
system with 3 8k files.
It sounds like your idea overlaps with this todo a little, so maybe this is
a good time to decide which would be best, though the more I think about
it, the more I think that moving sequences into a single table is a no-go.

So for implementing moving sequences into a single system table:

1. The search_path stuff makes this a bit more complex. It sounds like this
would require some duplication of the search_path logic.

2. There is also the problem with tracking object dependency.

Currently:
create sequence t_a_seq;
create table t (a int not null default nextval('t_a_seq'));
alter sequence t_a_seq owned by t.a;
drop table t;
drop sequence t_a_seq; -- already deleted by drop table t
ERROR:  sequence t_a_seq does not exist

Moving sequences to a single table sounds like a special case for this
logic.


3. Would moving sequences to a table still have to check that no duplicate
object exists in pg_class?
Currently you can't have a sequence with the same name as a table:

create sequence a;
create table a (a int);
ERROR:  relation a already exists


It's not that I'm trying to shoot holes in moving sequences to a single
table; really I'd like to find a way to reduce the wastefulness of these
one-file-per-sequence files lying around my file system. But if changing
this is a no-go then it would be better to take it off the todo list, and
then we wouldn't have as many issues from pouring more concrete into the
sequences-being-relations mould.

Regards

David Rowley


Re: [HACKERS] init_sequence spill to hash table

2013-11-14 Thread David Rowley
On Fri, Nov 15, 2013 at 3:03 AM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote

 I think that means that we should just completely replace the list with
 the hash table. The difference with a small N is lost in noise, so there's
 no point in keeping the list as a fast path for small N. That'll make the
 patch somewhat simpler.
 - Heikki


Attached is a much more simple patch which gets rid of the initial linked
list.


hashtab_seq_v0.2.patch
Description: Binary data



Re: [HACKERS] init_sequence spill to hash table

2013-11-14 Thread David Rowley
On Fri, Nov 15, 2013 at 3:23 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Andres Freund and...@2ndquadrant.com writes:
  I think it'd be a better idea to integrate the sequence caching logic
  into the relcache. There's a comment about it:
   * (We can't
   * rely on the relcache, since it's only, well, a cache, and may decide
 to
   * discard entries.)
  but that's not really accurate anymore. We have the infrastructure for
  keeping values across resets and we don't discard entries.

 We most certainly *do* discard entries, if they're not open when a cache
 flush event comes along.

 I suppose it'd be possible to mark a relcache entry for a sequence
 as locked-in-core, but that doesn't attract me at all.  A relcache
 entry is a whole lot larger than the amount of state we really need
 to keep for a sequence.

 One idea is to have a hashtable for the sequence-specific data,
 but to add a link field to the relcache entry that points to the
 non-flushable sequence hashtable entry.  That would save the second
 hashtable lookup as long as the relcache entry hadn't been flushed
 since last use, while not requiring any violence to the lifespan
 semantics of relcache entries.  (Actually, if we did that, it might
 not even be worth converting the list to a hashtable?  Searches would
 become a lot less frequent.)


Unless I've misunderstood something it looks like this would mean giving
heapam.c and relcache.c knowledge of sequences.
Currently relation_open is called from open_share_lock in sequence.c. The
only way I can see to do this would be to add something like
relation_open_sequence() in heapam.c which means we'd need to invent
RelationIdGetSequenceRelation() and use that instead
of RelationIdGetRelation() and somewhere along the line have it pass back
the SeqTableData struct which would be tagged onto RelIdCacheEnt.

I think it can be done but I don't think it will look pretty.
Perhaps if there was a more generic way... Would tagging some void
*rd_extra onto the RelationData be a better way? And just have sequence.c
make use of that for storing the SeqTableData.

Also I'm wondering what we'd do with all these pointers when someone does
DISCARD SEQUENCES; would we have to invalidate the relcache or would it
just be matter of looping over it and freeing of the sequence data setting
the pointers to NULL?

Regards

David Rowley


[HACKERS] init_sequence spill to hash table

2013-11-13 Thread David Rowley
Here http://www.postgresql.org/message-id/24278.1352922...@sss.pgh.pa.us there
was some talk about init_sequence being a bottleneck when many sequences
are used in a single backend.

The attached I think implements what was talked about in the above link
which for me seems to double the speed of a currval() loop over 3
sequences. It goes from about 7 seconds to 3.5 on my laptop.

I thought I would post the patch early to see if this is actually wanted
before I do too much more work on it.

My implementation keeps using the linear list for sequences up to a
defined threshold (currently 32), then moves everything over to a
hashtable and frees the list.
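The list-then-spill scheme can be sketched in miniature like so (a self-contained toy, not the actual patch: the real code uses PostgreSQL's dynahash and SeqTableData, and every name below is made up):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SEQ_SPILL_THRESHOLD 32
#define SEQ_HASH_BUCKETS 256

typedef struct Seq
{
    uint32_t relid;            /* key: the sequence's OID */
    int64_t  last_value;
    struct Seq *next;
} Seq;

static Seq  *seq_list = NULL;  /* linear list, used below the threshold */
static Seq **seq_hash = NULL;  /* allocated once we spill */
static int   seq_count = 0;

static unsigned
seq_bucket(uint32_t relid)
{
    return (relid * 2654435761u) % SEQ_HASH_BUCKETS;
}

/* Move every list entry into the newly allocated hash table. */
static void
seq_spill_to_hash(void)
{
    seq_hash = calloc(SEQ_HASH_BUCKETS, sizeof(Seq *));
    while (seq_list != NULL)
    {
        Seq     *s = seq_list;
        unsigned b = seq_bucket(s->relid);

        seq_list = s->next;
        s->next = seq_hash[b];
        seq_hash[b] = s;
    }
}

static Seq *
seq_find_or_create(uint32_t relid)
{
    Seq **head = seq_hash ? &seq_hash[seq_bucket(relid)] : &seq_list;

    for (Seq *s = *head; s != NULL; s = s->next)
        if (s->relid == relid)
            return s;

    Seq *s = calloc(1, sizeof(Seq));
    s->relid = relid;
    s->next = *head;
    *head = s;
    if (++seq_count > SEQ_SPILL_THRESHOLD && seq_hash == NULL)
        seq_spill_to_hash();
    return s;
}
```

(As the rest of the thread shows, the list was eventually dropped entirely in favour of always using the hash table.)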

A more complete solution would contain regression tests to exercise the
hash table code.

I know there is a desire to move sequences over to a single table still,
but I see this as a separate patch and storing current values in a hash
table for each backend should still be a win even if/when the single table
stuff gets implemented.

Regards

David Rowley
CREATE FUNCTION create_seq(n integer) RETURNS integer
LANGUAGE plpgsql
AS $$
BEGIN
WHILE n > 0 LOOP
EXECUTE 'CREATE SEQUENCE test' || CAST(n AS TEXT);
  n := n - 1;
END LOOP;
RETURN 0;
END
$$;

CREATE FUNCTION currval_seq(n integer) RETURNS integer
LANGUAGE plpgsql
AS $_$
BEGIN
WHILE n > 0 LOOP
EXECUTE $$SELECT currval('test$$ || CAST(n AS TEXT) || $$')$$;
  n := n - 1;
END LOOP;
RETURN 0;
END
$_$;

CREATE FUNCTION drop_seq(n integer) RETURNS integer
LANGUAGE plpgsql
AS $$
BEGIN
WHILE n > 0 LOOP
EXECUTE 'DROP SEQUENCE test' || CAST(n AS TEXT);
  n := n - 1;
END LOOP;
RETURN 0;
END
$$;

CREATE OR REPLACE FUNCTION nextval_seq(n integer) RETURNS integer
LANGUAGE plpgsql
AS $_$
BEGIN
WHILE n > 0 LOOP
EXECUTE $$SELECT nextval('test$$ || CAST(n AS TEXT) || $$')$$;
n := n - 1;
END LOOP;
RETURN 0;
END
$_$;


SELECT create_seq(1);
SELECT nextval_seq(1);
SELECT currval_seq(1);



hashtab_seq_v0.1.patch
Description: Binary data



Re: [HACKERS] init_sequence spill to hash table

2013-11-13 Thread Heikki Linnakangas

On 13.11.2013 11:55, David Rowley wrote:

I thought I would post the patch early to see if this is actually wanted
before I do too much more work on it.


Seems reasonable.


My implementation keeps using the linear list for sequences up to a
defined threshold (currently 32), then moves everything over to a
hashtable and frees the list.


Did you check how it would perform if you just always used the hash 
table? Or if you just had a single entry before you move to the hash table, 
i.e. set the threshold to 2? That would be slightly simpler.


- Heikki

