Re: [SQL] Foreign Key inter databases

2007-08-03 Thread Dani Castaños

Josh Tolley escribió:

On 8/2/07, Dani Castaños <[EMAIL PROTECTED]> wrote:
  

Hi all!

Is it possible to have a foreign key where referenced table is in
another database?

Thank you in advance

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly




Unless you use things like dblink to make two separate databases think
they're one database, you really can't do anything where one database
depends on another. Your best bet, if you can have both data sets on
the same machine, is to stick them in two separate schemas, and then
you can have foreign keys on a table in one schema referencing a table
in another schema. That's (in part) what schemas are for in the first
place -- to separate logically distinct data sets while allowing
accesses between the two when necessary.

If for whatever reason having all the data on one machine isn't
possible, you can try using dbi-link or dblink to create links between
the two databases and do foreign keys that way, but I've never used
either, so that might not work/make sense/be possible, etc.

-Josh
  


Thanks Josh!
I use dblinks for queries, but I'm pretty sure you can't use it in 
constraints definitions.
My situation is that I have one main database and many customer's 
databases. In main database there is a languages table, and customers 
databases must use the language_id. Then, I don't want to create 
languages table in each customer db, that's the reason why i was trying 
to do a foreign key from customer db to main db.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [SQL] Count of rows

2007-08-03 Thread Christian Kindler

do this via execute in a stored procedure - something like this (written on the 
flow - untested!)

returns setof text

declare
 my_record record;
 my counter as bigint;

begin

for my_record in 
   select tablename from pg_tables where
   schemaname = 'public' 
loop
   execute into counter
   'select count(*) from ' || my_record.tablename ;
   return next my_record.tablename || ': ' || counter::text;

end loop;
   return null;

end;


On Fri, August 3, 2007 7:35 am, Paul Lambert wrote:
> What's the best way to count how many rows are in each table via SQL? Or
> is it even possible?
> 
> I'm trying something like:
> 
> select tablename, count(*) from (select tablename from pg_tables where
> schemaname = 'public') as test group by tablename;
> 
> But obviously this just gives a count of 1 for each table in the public
> schema.
> 
> Can it be done or would I have to write a function?
> 
> --
> Paul Lambert
> Database Administrator
> AutoLedgers
> 
> 
> ---(end of broadcast)---
> TIP 1: if posting/reading through Usenet, please send an appropriate
>subscribe-nomail command to [EMAIL PROTECTED] so that your
>message can get through to the mailing list cleanly
> 


-- 
Pt! Schon vom neuen GMX MultiMessenger gehört?
Der kanns mit allen: http://www.gmx.net/de/go/multimessenger

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[SQL] Increment a sequence by more than one

2007-08-03 Thread Steve Midgley

Hi,

I'm writing an import app in a third party language. It's going to use 
"copy to" to move data from STDIN to a postgres (8.2) table. There are 
some complexities though: it's going to copy the records to a 
"versioned" table first, and then at a later time the records will be 
copied by a different procedure to the "live" table. The live table and 
versioned table are identical in terms of their field definitions. But 
there is no sequence associated with the versioned table (whose primary 
key is "id" plus "import_group_id", whereas the live table's pk is just 
"id"). So all versioned table entries must already "know" what their id 
would be in the live table. (This makes sense for other business 
process we have, but it's a bit of a problem in this instance).


My problem: I'd like to be able to grab a block of id's from the live 
table's pk sequence. So let's say my importer has 5,000 new rows to 
import and the current max pk in the live table is 540,203. I'd like to 
be able to increment the primary key sequence in such a way that I get 
a block of ids all to myself and the sequence is reset to 545,203 with 
a guarantee that all the id's between 540203 and 545203 are unused.


I'm guessing this can be done with a stored procedure, but if possible 
I'd like to be able to run this command from my third party app without 
calling a stored procedure (we try not to use stored procedures here 
b/c we code entirely in this third party language - if we had to, it's 
possible we could install a stored procedure though).


But since I've seen so much magic on display from people on this list, 
I'm going to ask if it's possible to do this solely from PG SQL sent 
from a third party language? The main tricky bit seems to be ensuring 
that everything is locked so two such increment calls at the same time 
don't yield overlapping blocks of ids. Is there a way to "lock" the 
sequence generator for the duration of a "nextval" and "setval" call? 
Since pk sequence functions like nextval cannot be rolled back, I'm 
guessing that "begin/end" won't do the job?


I.e:

-- need "magic lock" statement on pk sequence here
nextval
-- returns 540203
setval(545203)
-- now sequence is set to where I want it and I "own" 5000 id's
-- release magic lock here

My fallback is to just have a routine that calls "nextval" a bunch of 
times and stores all the id's it gets - they may or may not be 
sequential but they'll be unique. This is going to be a really slow way 
to get a large number of id's of course and just seems plain 
wrongheaded in many ways.


Any insights? All help is appreciated and input on a better way to 
solve the problem completely is of course welcome as well.


Sincerely,

Steve


Re: [SQL] Increment a sequence by more than one

2007-08-03 Thread Michael Glaesemann


On Aug 3, 2007, at 11:50 , Steve Midgley wrote:

My problem: I'd like to be able to grab a block of id's from the  
live table's pk sequence. So let's say my importer has 5,000 new  
rows to import and the current max pk in the live table is 540,203.  
I'd like to be able to increment the primary key sequence in such a  
way that I get a block of ids all to myself and the sequence is  
reset to 545,203 with a guarantee that all the id's between 540203  
and 545203 are unused.


Setting the next number that will be taken is generated is  
straightforward.

ALTER SEQUENCE foo_seq RESTART WITH 545203;

Perhaps doing something like (untested):

ALTER SEQUENCE foo_seq RESTART WITH (SELECT last_value + CASE WHEN  
is_called THEN 1 ELSE 0 END);


You'll need to manage the skipped values yourself though, of course.  
Perhaps set up a table to hold the current number used in the skipped  
range. Basically this would be the same approach as that used by  
people who need to guarantee gapless sequences: you can check the  
archives for details, but basically you need to make sure the table  
is properly locked when you're planning to use a new number.


But since I've seen so much magic on display from people on this  
list, I'm going to ask if it's possible to do this solely from PG  
SQL sent from a third party language? The main tricky bit seems to  
be ensuring that everything is locked so two such increment calls  
at the same time don't yield overlapping blocks of ids. Is there a  
way to "lock" the sequence generator for the duration of a  
"nextval" and "setval" call? Since pk sequence functions like  
nextval cannot be rolled back, I'm guessing that "begin/end" won't  
do the job?


I'm not sure, but perhaps calling SELECT FOR UPDATE on the sequence  
itself (i.e., not using nextval/setval) would give the appropriate  
lock, though I wouldn't be surprised if it isn't possible. As you've  
noted, sequences are designed for performance to ignore transactions,  
so this may not be possible (though perhaps that's just values  
returned via the nextval function).


Hope this gives you some additional ideas on how to handle this.

Michael Glaesemann
grzm seespotcode net



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [SQL] Increment a sequence by more than one

2007-08-03 Thread Scott Marlowe
On 8/3/07, Steve Midgley <[EMAIL PROTECTED]> wrote:
>
>  Hi,
>
>  I'm writing an import app in a third party language. It's going to use
> "copy to" to move data from STDIN to a postgres (8.2) table. There are some
> complexities though: it's going to copy the records to a "versioned" table
> first, and then at a later time the records will be copied by a different
> procedure to the "live" table. The live table and versioned table are
> identical in terms of their field definitions. But there is no sequence
> associated with the versioned table (whose primary key is "id" plus
> "import_group_id", whereas the live table's pk is just "id"). So all
> versioned table entries must already "know" what their id would be in the
> live table. (This makes sense for other business process we have, but it's a
> bit of a problem in this instance).
>
>  My problem: I'd like to be able to grab a block of id's from the live
> table's pk sequence. So let's say my importer has 5,000 new rows to import
> and the current max pk in the live table is 540,203. I'd like to be able to
> increment the primary key sequence in such a way that I get a block of ids
> all to myself and the sequence is reset to 545,203 with a guarantee that all
> the id's between 540203 and 545203 are unused.

The real danger in doing this is race conditions.  Most anything you
do involves a possible race condition.  As long as the race condition
doesn't result in an id getting used twice, you're safe.

So:

test=# create sequence m;
CREATE SEQUENCE
test=# select nextval('m');
 nextval
-
   1
(1 row)

test=# alter sequence m increment by 5000;
ALTER SEQUENCE
test=# select nextval('m');
 nextval
-
5001
(1 row)

test=# alter sequence m increment by 1;
ALTER SEQUENCE
test=# select nextval('m');
 nextval
-
5002
(1 row)

In this method, the only possible race condition is that someone might
run a nextval('m') between the time you set the increment to 5000 and
1 again.  If that happens, you'd have 5,000 missing ids, but since
sequences are designed to prevent dupes, not holes, that's ok.

>  But since I've seen so much magic on display from people on this list, I'm
> going to ask if it's possible to do this solely from PG SQL sent from a
> third party language? The main tricky bit seems to be ensuring that
> everything is locked so two such increment calls at the same time don't
> yield overlapping blocks of ids. Is there a way to "lock" the sequence
> generator for the duration of a "nextval" and "setval" call?

Avoiding the setval is the real key.  It doesn't scale.  Missing 5,000
ids is no big deal.  repeating them IS a big deal.  Not using setval
is the best answer.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [SQL] Increment a sequence by more than one

2007-08-03 Thread Steve Midgley

Hi Scott,

Thanks for this info (and Michael too!).

Let me see if I understand your suggestion. I would run these three 
commands in sequence:


# select nextval('[my_seq_name]');
returns => 52 [I believe that the sequence is at 52]
# alter sequence [my_seq_name] increment by 5000;
# select nextval('[my_seq_name]');
returns => 5052

If the third command doesn't return "5052" - i.e. it returns 5053, then 
I know that somewhere in this sequence another process grabbed an id 
out from under me. It doesn't matter where, but I would know that my 
5000 id's are not unique and should be discarded? If the third command 
DOES return 5052, then I know that those 5000 id's are "locked" for my 
use and no other application could have grabbed one of them?


Can anyone see a flaw in that? It looks right to me..

Scott - it also seems to me that I need not waste all those id's if 
another application does grab one during my statement: If I detect a 
failure, I could just reset the pk sequence back to the max id of the 
underlying table before trying again. I think this code would do it 
(stolen from Ruby's postgres adaptor):


SELECT setval('[seq_name]', (SELECT COALESCE(MAX([pk_of_seq]), (SELECT 
min_value FROM [seq_name])) FROM [table_of_pk]), false)


So for table "property" with pk of "id" and sequence name 
"property_id_seq":


SELECT setval('property_id_seq', (SELECT COALESCE(MAX(id), (SELECT 
min_value FROM property_id_seq)) FROM property), false)


Then I could just retry the first process - though on a table that is 
very busy, I might have retry for a while..


Regarding Michael's suggestion - I tried messing around with LOCK and 
similar commands but they're only allowed to run against TABLES not 
SEQUENCES - too bad - that would have been perfect.


I'm now starting to think that there's no way to solve this problem in 
an "elegant manner" even in a stored procedure? Your method seems to be 
as good as it's going to get? (Not that I'm complaining!)


Thanks again - any thoughts are welcome,

Steve

At 08:01 PM 8/3/2007, Scott Marlowe wrote:

On 8/3/07, Steve Midgley <[EMAIL PROTECTED]> wrote:
>
>  Hi,
>
>  I'm writing an import app in a third party language. It's going to 
use
> "copy to" to move data from STDIN to a postgres (8.2) table. There 
are some
> complexities though: it's going to copy the records to a 
"versioned" table
> first, and then at a later time the records will be copied by a 
different
> procedure to the "live" table. The live table and versioned table 
are
> identical in terms of their field definitions. But there is no 
sequence

> associated with the versioned table (whose primary key is "id" plus
> "import_group_id", whereas the live table's pk is just "id"). So 
all
> versioned table entries must already "know" what their id would be 
in the
> live table. (This makes sense for other business process we have, 
but it's a

> bit of a problem in this instance).
>
>  My problem: I'd like to be able to grab a block of id's from the 
live
> table's pk sequence. So let's say my importer has 5,000 new rows to 
import
> and the current max pk in the live table is 540,203. I'd like to be 
able to
> increment the primary key sequence in such a way that I get a block 
of ids
> all to myself and the sequence is reset to 545,203 with a guarantee 
that all

> the id's between 540203 and 545203 are unused.

The real danger in doing this is race conditions.  Most anything you
do involves a possible race condition.  As long as the race condition
doesn't result in an id getting used twice, you're safe.

So:

test=# create sequence m;
CREATE SEQUENCE
test=# select nextval('m');
 nextval
-
   1
(1 row)

test=# alter sequence m increment by 5000;
ALTER SEQUENCE
test=# select nextval('m');
 nextval
-
5001
(1 row)

test=# alter sequence m increment by 1;
ALTER SEQUENCE
test=# select nextval('m');
 nextval
-
5002
(1 row)

In this method, the only possible race condition is that someone might
run a nextval('m') between the time you set the increment to 5000 and
1 again.  If that happens, you'd have 5,000 missing ids, but since
sequences are designed to prevent dupes, not holes, that's ok.

>  But since I've seen so much magic on display from people on this 
list, I'm
> going to ask if it's possible to do this solely from PG SQL sent 
from a

> third party language? The main tricky bit seems to be ensuring that
> everything is locked so two such increment calls at the same time 
don't
> yield overlapping blocks of ids. Is there a way to "lock" the 
sequence

> generator for the duration of a "nextval" and "setval" call?

Avoiding the setval is the real key.  It doesn't scale.  Missing 5,000
ids is no big deal.  repeating them IS a big deal.  Not using setval
is the best answer.



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [SQL] Increment a sequence by more than one

2007-08-03 Thread Scott Marlowe
On 8/3/07, Steve Midgley <[EMAIL PROTECTED]> wrote:
> Hi Scott,
>
> Thanks for this info (and Michael too!).
>
> Let me see if I understand your suggestion. I would run these three
> commands in sequence:
>
> # select nextval('[my_seq_name]');
> returns => 52 [I believe that the sequence is at 52]
> # alter sequence [my_seq_name] increment by 5000;
> # select nextval('[my_seq_name]');
> returns => 5052
>
> If the third command doesn't return "5052" - i.e. it returns 5053, then
> I know that somewhere in this sequence another process grabbed an id
> out from under me. It doesn't matter where, but I would know that my
> 5000 id's are not unique and should be discarded? If the third command
> DOES return 5052, then I know that those 5000 id's are "locked" for my
> use and no other application could have grabbed one of them?

No, that's not what would happen.  If someone grabbed an id after the
increment value was changed, then you'd get 10052, cause they would
increment the sequence by 5,000.since you're not using setval, and
you're keeping the increment positive, there's no danger of collision,
only of over-incrementing and leaving a giant hole in your sequence.
which is ok.

> Can anyone see a flaw in that? It looks right to me..
>
> Scott - it also seems to me that I need not waste all those id's if
> another application does grab one during my statement: If I detect a
> failure, I could just reset the pk sequence back to the max id of the
> underlying table before trying again. I think this code would do it
> (stolen from Ruby's postgres adaptor):

That is open to a race condition.  The bad kind.

> SELECT setval('[seq_name]', (SELECT COALESCE(MAX([pk_of_seq]), (SELECT
> min_value FROM [seq_name])) FROM [table_of_pk]), false)

As long as you're using setval, you have a race condition.  Please
avoid it.  Unless you can guarantee that no one else is using the
database at the same time (during a maintenance window etc...)

> So for table "property" with pk of "id" and sequence name
> "property_id_seq":
>
> SELECT setval('property_id_seq', (SELECT COALESCE(MAX(id), (SELECT
> min_value FROM property_id_seq)) FROM property), false)

You'd think that the select coalesce and the outer select setval would
not have a race condition, but they still do.  Just a much smaller
one.

> I'm now starting to think that there's no way to solve this problem in
> an "elegant manner" even in a stored procedure? Your method seems to be
> as good as it's going to get? (Not that I'm complaining!)

Yep.  Safe is better than pretty or elegant. :)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [SQL] Increment a sequence by more than one

2007-08-03 Thread Scott Marlowe
Oh, another point.  You should run the

alter sequence m increment 5000;
select nextval('m');
alter sequence m increment 1;

one right after the other to reduce the number of 5000 wide holes in
your sequence.

Or, given the size of bigint, you could just set the increment to 5000
and leave it there, and then any insert could grab nextval('m') and
insert up to 5000 more ids with monotonically increasing ids safely.

Note you wouldn't use defaults or nextvals for the rest, you'd have to
calculate them in your application.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [SQL] Foreign Key inter databases

2007-08-03 Thread Erik Jones


On Aug 3, 2007, at 2:24 AM, Dani Castaños wrote:


Josh Tolley escribió:

On 8/2/07, Dani Castaños <[EMAIL PROTECTED]> wrote:


Hi all!

Is it possible to have a foreign key where referenced table is in
another database?

Thank you in advance

---(end of  
broadcast)---

TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so  
that your

   message can get through to the mailing list cleanly




Unless you use things like dblink to make two separate databases  
think

they're one database, you really can't do anything where one database
depends on another. Your best bet, if you can have both data sets on
the same machine, is to stick them in two separate schemas, and then
you can have foreign keys on a table in one schema referencing a  
table

in another schema. That's (in part) what schemas are for in the first
place -- to separate logically distinct data sets while allowing
accesses between the two when necessary.

If for whatever reason having all the data on one machine isn't
possible, you can try using dbi-link or dblink to create links  
between

the two databases and do foreign keys that way, but I've never used
either, so that might not work/make sense/be possible, etc.

-Josh



Thanks Josh!
I use dblinks for queries, but I'm pretty sure you can't use it in  
constraints definitions.
My situation is that I have one main database and many customer's  
databases. In main database there is a languages table, and  
customers databases must use the language_id. Then, I don't want to  
create languages table in each customer db, that's the reason why i  
was trying to do a foreign key from customer db to main db.


Not directly as pg constraints, no.  But, what you can do is create a  
trigger that simulates the same effect.


Erik Jones

Software Developer | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [SQL] Increment a sequence by more than one

2007-08-03 Thread Erik Jones

On Aug 3, 2007, at 11:50 AM, Steve Midgley wrote:


Hi,

I'm writing an import app in a third party language. It's going to  
use "copy to" to move data from STDIN to a postgres (8.2) table.  
There are some complexities though: it's going to copy the records  
to a "versioned" table first, and then at a later time the records  
will be copied by a different procedure to the "live" table. The  
live table and versioned table are identical in terms of their  
field definitions. But there is no sequence associated with the  
versioned table (whose primary key is "id" plus "import_group_id",  
whereas the live table's pk is just "id"). So all versioned table  
entries must already "know" what their id would be in the live  
table. (This makes sense for other business process we have, but  
it's a bit of a problem in this instance).


My problem: I'd like to be able to grab a block of id's from the  
live table's pk sequence. So let's say my importer has 5,000 new  
rows to import and the current max pk in the live table is 540,203.  
I'd like to be able to increment the primary key sequence in such a  
way that I get a block of ids all to myself and the sequence is  
reset to 545,203 with a guarantee that all the id's between 540203  
and 545203 are unused.


I'm guessing this can be done with a stored procedure, but if  
possible I'd like to be able to run this command from my third  
party app without calling a stored procedure (we try not to use  
stored procedures here b/c we code entirely in this third party  
language - if we had to, it's possible we could install a stored  
procedure though).


But since I've seen so much magic on display from people on this  
list, I'm going to ask if it's possible to do this solely from PG  
SQL sent from a third party language? The main tricky bit seems to  
be ensuring that everything is locked so two such increment calls  
at the same time don't yield overlapping blocks of ids. Is there a  
way to "lock" the sequence generator for the duration of a  
"nextval" and "setval" call? Since pk sequence functions like  
nextval cannot be rolled back, I'm guessing that "begin/end" won't  
do the job?


I.e:

-- need "magic lock" statement on pk sequence here
nextval
-- returns 540203
setval(545203)
-- now sequence is set to where I want it and I "own" 5000 id's
-- release magic lock here

My fallback is to just have a routine that calls "nextval" a bunch  
of times and stores all the id's it gets - they may or may not be  
sequential but they'll be unique. This is going to be a really slow  
way to get a large number of id's of course and just seems plain  
wrongheaded in many ways.


Any insights? All help is appreciated and input on a better way to  
solve the problem completely is of course welcome as well.


Is there actually a requirement that the block of 5000 values not  
have gaps?  If not, why not make the versioned table's id column  
default to nextval from the same sequence?  Then when the data is  
copied over to the live table, as long as you supply the the id it  
won't generate a new id and you'll maintain your row-row  
relationships.  If you do require that the block not have gaps, check  
out the article on how to do this here: http://www.varlena.com/ 
varlena/GeneralBits/130.php


Erik Jones

Software Developer | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [SQL] Increment a sequence by more than one

2007-08-03 Thread Michael Glaesemann


On Aug 3, 2007, at 14:28 , Steve Midgley wrote:

AIUI, one difference between the solutions Scott and I proposed is  
that while INCREMENT is set at 5000, each time nextval is called the  
sequence is incremented by 5000. For example:


test=# select nextval('foos_foo_id_seq');
nextval
-
   1
(1 row)

test=# select nextval('foos_foo_id_seq');
nextval
-
   2
(1 row)

test=# alter sequence foos_foo_id_seq increment 5000;
ALTER SEQUENCE
test=# select nextval('foos_foo_id_seq');
nextval
-
5002
(1 row)

test=# select nextval('foos_foo_id_seq');
nextval
-
   10002
(1 row)


The only issue with this is that it burns through sequence values  
faster. That may not be a concern, of course.


I wonder if there isn't a way to use ALTER SEQUENCE ... CACHE to  
handle this:


-- Alice's session
test=# select nextval('foos_foo_id_seq');
nextval
-
   15002
(1 row)

-- Bob's session
test=# select nextval('foos_foo_id_seq');
nextval
-
   15003
(1 row)

test=# select nextval('foos_foo_id_seq');
nextval
-
   15004
(1 row)

-- Alice's session
test=# alter sequence foos_foo_id_seq cache 5000;
ALTER SEQUENCE
test=# select nextval('foos_foo_id_seq'); nextval
-
   20003
(1 row)

test=# select nextval('foos_foo_id_seq');
nextval
-
   20004
(1 row)

Now you should be able to safely use the values from nextval to  
nextval + cache without worrying that the values in that range are  
going to be used by another backend. It looks like you could even do:




-- Alice's session
test=# select nextval('foos_foo_id_seq');
nextval
-
   30096
(1 row)

test=# alter sequence foos_foo_id_seq cache 10;  -- set cache to  
preallocate

ALTER SEQUENCE
test=# select nextval('foos_foo_id_seq'); -- get nextval and  
preallocate the next 10

nextval
-
   30097
(1 row)

-- Bob's session
test=# select nextval('foos_foo_id_seq');
nextval
-
   30107
(1 row)

-- Alice's session
test=# alter sequence foos_foo_id_seq cache 1; -- reset cache so  
other backends aren't burning sequence values unnecessarily

ALTER SEQUENCE
test=# select nextval('foos_foo_id_seq'); -- note that the previously  
cached values (for both Alice and Bob's session) are discarded, and  
available for manual entry

nextval
-
   30117
(1 row)


Again, you can set up another table to keep track of the values that  
are going to be used manually, perhaps something like:


CREATE TABLE preallocated_foo_ids
(
current_value INTEGER NOT NULL
, maximum_value INTEGER NOT NULL
, check (current_value <= maximum_value)
);

Then, to use:

test=# ALTER SEQUENCE foos_foo_id_seq CACHE 10;
ALTER SEQUENCE
test=# DELETE FROM preallocated_foo_ids; -- clear old ones
DELETE 1
test=# INSERT INTO preallocated_foo_ids (current_value, maximum_value)
test-# SELECT val, val + 10
test-# FROM (SELECT nextval('foos_foo_id_seq')) AS seq(val);
INSERT 0 1
test=# ALTER SEQUENCE foos_foo_id_seq CACHE 1;
ALTER SEQUENCE
test=# SELECT * FROM preallocated_foo_ids;
current_value | maximum_value
---+---
 30142 | 30152
(1 row)

You've now got a preallocated range in preallocated_ids. Just use  
standard table locking on preallocated_foo_ids: there shouldn't be  
heavy contention on this table during your load, so I don't think  
performance should suffer too badly.


test=# BEGIN;
BEGIN
test=# SELECT current_value
test-# FROM preallocated_foo_ids
test-# FOR UPDATE;
current_value
---
 30142
(1 row)

test=# INSERT INTO foos (foo_id) VALUES (30142);
INSERT 0 1
test=# UPDATE preallocated_foo_ids
test-# SET current_value = current_value + 1;
UPDATE 1
test=# COMMIT;
COMMIT
test=# SELECT * FROM preallocated_foo_ids;
current_value | maximum_value
---+---
 30143 | 30152
(1 row)

When you run into an error because of the CHECK constraint, you know  
you've hit the end of your range (if you haven't been checking  
otherwise).


Regarding Michael's suggestion - I tried messing around with LOCK  
and similar commands but they're only allowed to run against TABLES  
not SEQUENCES - too bad - that would have been perfect.


Yeah, I thought that might be the case.

Michael Glaesemann
grzm seespotcode net



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [SQL] Increment a sequence by more than one

2007-08-03 Thread Michael Glaesemann


On Aug 3, 2007, at 15:27 , Erik Jones wrote:

Is there actually a requirement that the block of 5000 values not  
have gaps?


Good point.

If not, why not make the versioned table's id column default to  
nextval from the same sequence?


Of course, the ids of the two tables could be interleaved in this  
case. This might not be an issue, of course.



Michael Glaesemann
grzm seespotcode net



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[SQL] Race condition in resetting a sequence

2007-08-03 Thread Steve Midgley

Hi Scott,

You've moved into more general territory, so I'm starting a new thread. 
The code I provided to reset a primary key sequence is actually part of 
Ruby on Rails core library - actually they use something very similar 
to what I originally sent:


SELECT setval('#{sequence}', (SELECT COALESCE(MAX(#{pk})+(SELECT 
increment_by FROM #{sequence}), (SELECT min_value FROM #{sequence})) 
FROM #{table}), false)


Where:
#{sequence} = sequence name
#{pk} = primary key of table under sequence
#{table} = table under sequence

Their code is a little different from what I provided before b/c it 
increments by one (times the increment #) above the max(pk). But 
essentially it's the same. (I think their method leaves small gaps in 
the sequence every time it runs). Also I think they're method is likely 
to be a little slower (one extra select statement) and therefore 
(perhaps) more vulnerable to a race?


You mentioned something more general though: "As long as you're using 
setval you have a race condition"? However the postgres manual states:


The sequence functions, listed in 
Table 
9-34, provide simple, multiuser-safe methods for obtaining successive 
sequence values from sequence objects.


(http://www.postgresql.org/docs/8.2/interactive/functions-sequence.html)

Included in Table 9-34 is "setval" - so I'm not clear how it can have a 
race condition all by itself? Or maybe it only has a race condition 
when used in ways similar to how Ruby/Rails is using it? (i.e. with a 
compound select/coalesce statement as one of its parameters?) Would 
this command have a race condition:


select setval('my_pk_seq', 500)

This issue is reasonably important since Ruby on Rails is fairly widely 
used. As you say, the race window would be pretty small on a compound 
select -- and the Ruby function doesn't actually get called very often, 
but if you wouldn't mind explaining how the race condition would 
manifest, I'll post a warning on the RoR bug tracking site so that 
people can at least understand that there's a potential bug here..


Thanks again,

Steve

At 08:42 PM 8/3/2007, Scott Marlowe wrote:

On 8/3/07, Steve Midgley <[EMAIL PROTECTED]> wrote:
> Hi Scott,
>
> Thanks for this info (and Michael too!).
>
> Let me see if I understand your suggestion. I would run these three
> commands in sequence:
>
> # select nextval('[my_seq_name]');
> returns => 52 [I believe that the sequence is at 52]
> # alter sequence [my_seq_name] increment by 5000;
> # select nextval('[my_seq_name]');
> returns => 5052
>
> If the third command doesn't return "5052" - i.e. it returns 5053, 
then
> I know that somewhere in this sequence another process grabbed an 
id
> out from under me. It doesn't matter where, but I would know that 
my
> 5000 id's are not unique and should be discarded? If the third 
command
> DOES return 5052, then I know that those 5000 id's are "locked" for 
my

> use and no other application could have grabbed one of them?

No, that's not what would happen.  If someone grabbed an id after the
increment value was changed, then you'd get 10052, cause they would
increment the sequence by 5,000.since you're not using setval, and
you're keeping the increment positive, there's no danger of collision,
only of over-incrementing and leaving a giant hole in your sequence.
which is ok.

> Can anyone see a flaw in that? It looks right to me..
>
> Scott - it also seems to me that I need not waste all those id's if
> another application does grab one during my statement: If I detect 
a
> failure, I could just reset the pk sequence back to the max id of 
the

> underlying table before trying again. I think this code would do it
> (stolen from Ruby's postgres adaptor):

That is open to a race condition.  The bad kind.

> SELECT setval('[seq_name]', (SELECT COALESCE(MAX([pk_of_seq]), 
(SELECT

> min_value FROM [seq_name])) FROM [table_of_pk]), false)

As long as you're using setval, you have a race condition.  Please
avoid it.  Unless you can guarantee that no one else is using the
database at the same time (during a maintenance window etc...)

> So for table "property" with pk of "id" and sequence name
> "property_id_seq":
>
> SELECT setval('property_id_seq', (SELECT COALESCE(MAX(id), (SELECT
> min_value FROM property_id_seq)) FROM property), false)

You'd think that the select coalesce and the outer select setval would
not have a race condition, but they still do.  Just a much smaller
one.

> I'm now starting to think that there's no way to solve this problem 
in
> an "elegant manner" even in a stored procedure? Your method seems 
to be

> as good as it's going to get? (Not that I'm complaining!)

Yep.  Safe is better than pretty or elegant. :)


Re: [SQL] Race condition in resetting a sequence

2007-08-03 Thread Tom Lane
Steve Midgley <[EMAIL PROTECTED]> writes:
> The code I provided to reset a primary key sequence is actually part of 
> Ruby on Rails core library - actually they use something very similar 
> to what I originally sent:

> SELECT setval('#{sequence}', (SELECT COALESCE(MAX(#{pk})+(SELECT 
> increment_by FROM #{sequence}), (SELECT min_value FROM #{sequence})) 
> FROM #{table}), false)

Ugh.  That's completely unsafe/broken, unless they also use locking that
you didn't show.

> You mentioned something more general though: "As long as you're using 
> setval you have a race condition"? However the postgres manual states:

>> The sequence functions, listed in 
>> Table
>>  
>> 9-34, provide simple, multiuser-safe methods for obtaining successive 
>> sequence values from sequence objects.

> (http://www.postgresql.org/docs/8.2/interactive/functions-sequence.html)

> Included in Table 9-34 is "setval" - so I'm not clear how it can have a 
> race condition all by itself?

It doesn't have a race condition "all by itself": it will do what it's
told.  The problem with commands such as the above is that there's a
time window between calculating the max() and executing the setval(),
and that window is more than large enough to allow someone else to
insert a row that invalidates your max() computation.  (Because of MVCC
snapshotting, the risk window is in fact as long as the entire
calculation of the max --- it's not just a few instructions as some
might naively think.)

Now it is possible to make this brute-force approach safe: you can lock
the table against all other modifications until you've applied your own
changes.  But you pay a high price in loss of concurrency if you do
that.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster