Re: [GENERAL] Help with pre-loaded arbitrary key sequences
On Thu, January 17, 2008 10:15, Scott Marlowe wrote: If race conditions are a possible issue, you use a sequence and increment that until you get a number that isn't used. That way two clients connecting at the same time can get different, available numbers. That is close to the idea that I originally had. I was simply wondering if the built-in sequencer could handle this case or whether I need to roll my own. -- *** E-Mail is NOT a SECURE channel *** James B. Byrnemailto:[EMAIL PROTECTED] Harte Lyne Limited http://www.harte-lyne.ca 9 Brockley Drive vox: +1 905 561 1241 Hamilton, Ontario fax: +1 905 561 0757 Canada L8E 3C3 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Help with pre-loaded arbitrary key sequences
On Jan 17, 2008 9:05 AM, James B. Byrne [EMAIL PROTECTED] wrote: If the entries involved numbered in the millions then Scott's approach has considerable merit. In my case, as the rate of additions is very low and the size of the existing blocks is in the hundreds rather than hundreds of thousands then I believe that I will simply write my own iterator and do a repetitive select when on the incrementally proposed values until an opening is found then insert the new entry and update the iterator next value accordingly. If race conditions are a possible issue, you use a sequence and increment that until you get a number that isn't used. That way two clients connecting at the same time can get different, available numbers. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Help with pre-loaded arbitrary key sequences
On Wed, January 16, 2008 18:40, Scott Marlowe wrote: You're essentially wanting to fill in the blanks here. If you need good performance, then what you'll need to do is to preallocate all the numbers that haven't been assigned somewhere. So, we make a table something like: create table locatorcodes (i int, count_id serial); Then we insert an id into that table for everyone that's missing from the main table: insert into locatorcodes (i) select b.i from ( select * from generate_series(1,100)as i ) as b left join main_table a on (b.i=a.i) where a.i is null; Or something like that. Now, we've got a table with all the unused ids, and a serial count assigned to them. Create another sequence: create checkout_sequence; and use that to check out numbers from locatorcodes: select i from locatorcodes where count_id=nextval('checkout_sequence'); And since the sequence will just count up, there's little or no problems with performance. There's lots of ways of handling this. That's just one of the ones that doesn't slow your database down a lot. If you need to, you can shuffle the numbers going into the locatorcodes table with an order by random() when you create it. Martin and Scott, Thank you both for your responses. If the entries involved numbered in the millions then Scott's approach has considerable merit. In my case, as the rate of additions is very low and the size of the existing blocks is in the hundreds rather than hundreds of thousands then I believe that I will simply write my own iterator and do a repetitive select when on the incrementally proposed values until an opening is found then insert the new entry and update the iterator next value accordingly. Regards, -- *** E-Mail is NOT a SECURE channel *** James B. Byrnemailto:[EMAIL PROTECTED] Harte Lyne Limited http://www.harte-lyne.ca 9 Brockley Drive vox: +1 905 561 1241 Hamilton, Ontario fax: +1 905 561 0757 Canada L8E 3C3 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Help with pre-loaded arbitrary key sequences
On Jan 17, 2008 9:19 AM, James B. Byrne [EMAIL PROTECTED] wrote: On Thu, January 17, 2008 10:15, Scott Marlowe wrote: If race conditions are a possible issue, you use a sequence and increment that until you get a number that isn't used. That way two clients connecting at the same time can get different, available numbers. That is close to the idea that I originally had. I was simply wondering if the built-in sequencer could handle this case or whether I need to roll my own. Yeah, the built in sequencer just increments by one, nothing else. But it should be pretty easy to write a pl/pgsql function that grabs the next value and loop until it finds one that's available. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] Help with pre-loaded arbitrary key sequences
On Jan 17, 2008 9:19 AM, James B. Byrne [EMAIL PROTECTED] wrote: On Thu, January 17, 2008 10:15, Scott Marlowe wrote: If race conditions are a possible issue, you use a sequence and increment that until you get a number that isn't used. That way two clients connecting at the same time can get different, available numbers. That is close to the idea that I originally had. I was simply wondering if the built-in sequencer could handle this case or whether I need to roll my own. Got bored, hacked this aggregious pl/pgsql routine up. It looks horrible, but I wanted it to be able to use indexes. Seems to work. Test has ~750k rows and returns in it and returns a new id in 1ms on my little server. File attached. drop table a; drop sequence aseq; create table a (i integer primary key); create sequence aseq; insert into a(i) select * from generate_series(1,100) where random() 0.25; create or replace function getnext() returns int as $$ DECLARE niq int; tf bool; BEGIN loop select nextval('aseq') into niq; select case when (select true from (select niq as i) as x join a on (a.i=x.i)) then TRUE else FALSE end into tf; exit when not tf ; end loop; return niq; END; $$ language plpgsql; ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Help with pre-loaded arbitrary key sequences
On Thu, January 17, 2008 11:48, Scott Marlowe wrote: Got bored, hacked this aggregious pl/pgsql routine up. It looks horrible, but I wanted it to be able to use indexes. Seems to work. Test has ~750k rows and returns in it and returns a new id in 1ms on my little server. File attached. Many thanks. -- *** E-Mail is NOT a SECURE channel *** James B. Byrnemailto:[EMAIL PROTECTED] Harte Lyne Limited http://www.harte-lyne.ca 9 Brockley Drive vox: +1 905 561 1241 Hamilton, Ontario fax: +1 905 561 0757 Canada L8E 3C3 ---(end of broadcast)--- TIP 6: explain analyze is your friend
[GENERAL] Help with pre-loaded arbitrary key sequences
I am prototyping a system migration that is to employ Ruby, Rails and PostgreSQL. Rails has the convention that the primary key of a row is an arbitrary integer value assigned by the database manager through a sequence. As it turns out, the legacy application employs essentially the same convention in most instances. My question is this: Can one assign an id number to a sequenced key column on create and override the sequencer? If one does this then can and, if so, how does the sequencer in Postgresql handle the eventuality of running into a block of keys holding previously assigned numbers? For example. The existing client master dataset employs an eight digit account number as primary key. The values in use tend to cluster in groups at each thousand increment, thus 1..375, 1001..1288, 2001..2225, 3001..3312, ..., 2001001..2001476, ..., etc. Assuming that these existing entries were all loaded into the new table with the values given as their primary keys and given that one could not simply start the sequencer at a value above the highest existing value: If one was to add a record and auto-generate a sequence number then can the sequencer handle looping from 1 to 375 and returning 376 and then continue until it reaches 1001, when it needs be loop again until 1288 and then return 1289 and so forth? During the load of the initial table data it would probably be necessary to disable the sequencer for this column. Is this in fact the case? If so, how is this done and how is the sequencer restored after the initial migration of data is complete? I presume that I can write my own sequencer function to accomplish this in any case but I wish to know if the existing method handles this case. Another solution is to simply decouple the existing key value from the new and treat the exiting client number as a piece of data (with or without an index), but that seems redundant since the new arbitrary key value might just as well be the existing arbitrary key value. Dispensing with the existing arbitrary number is another option but, the fact is that present business practice is for employees to refer to their clients and vendors by account number. The existing computer system is 25 years old but employs account numbers that predate automation. The firm is well past the century mark and some of these numbers have been in use with a few clients from the end of 1800's. So, while not strictly a business case, both the firm and some of its clients have a strong, if irrational, attachment to preserving the existing scheme. I regret if these questions appear naive but I am struggling with a lot of new information on a number of fronts and as usual wish to get quick answers to questions that may be far more involved than I realize. Sincerely, -- *** E-Mail is NOT a SECURE channel *** James B. Byrnemailto:[EMAIL PROTECTED] Harte Lyne Limited http://www.harte-lyne.ca 9 Brockley Drive vox: +1 905 561 1241 Hamilton, Ontario fax: +1 905 561 0757 Canada L8E 3C3 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Help with pre-loaded arbitrary key sequences
On Fri, Jan 11, 2008 at 11:43:54AM -0500, James B. Byrne wrote: My question is this: Can one assign an id number to a sequenced key column on create and override the sequencer? If one does this then can and, if so, how does the sequencer in Postgresql handle the eventuality of running into a block of keys holding previously assigned numbers? You can set the counter during create, or at any time later. However, the counter is not defined by the column as such and will happily return numbers already in the table if you screw it up. The usual process is to insert normally when loading the data and then do a setval() on the sequence to past the values already stored. Have a nice day, -- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/ Those who make peaceful revolution impossible will make violent revolution inevitable. -- John F Kennedy signature.asc Description: Digital signature
Re: [GENERAL] Help with pre-loaded arbitrary key sequences
On Jan 11, 2008 10:43 AM, James B. Byrne [EMAIL PROTECTED] wrote: I am prototyping a system migration that is to employ Ruby, Rails and PostgreSQL. Rails has the convention that the primary key of a row is an arbitrary integer value assigned by the database manager through a sequence. As it turns out, the legacy application employs essentially the same convention in most instances. My question is this: Can one assign an id number to a sequenced key column on create and override the sequencer? If one does this then can and, if so, how does the sequencer in Postgresql handle the eventuality of running into a block of keys holding previously assigned numbers? For example. The existing client master dataset employs an eight digit account number as primary key. The values in use tend to cluster in groups at each thousand increment, thus 1..375, 1001..1288, 2001..2225, 3001..3312, ..., 2001001..2001476, ..., etc. Assuming that these existing entries were all loaded into the new table with the values given as their primary keys and given that one could not simply start the sequencer at a value above the highest existing value: If one was to add a record and auto-generate a sequence number then can the sequencer handle looping from 1 to 375 and returning 376 and then continue until it reaches 1001, when it needs be loop again until 1288 and then return 1289 and so forth? You're essentially wanting to fill in the blanks here. If you need good performance, then what you'll need to do is to preallocate all the numbers that haven't been assigned somewhere. So, we make a table something like: create table locatorcodes (i int, count_id serial); Then we insert an id into that table for everyone that's missing from the main table: insert into locatorcodes (i) select b.i from ( select * from generate_series(1,100)as i ) as b left join main_table a on (b.i=a.i) where a.i is null; Or something like that. Now, we've got a table with all the unused ids, and a serial count assigned to them. Create another sequence: create checkout_sequence; and use that to check out numbers from locatorcodes: select i from locatorcodes where count_id=nextval('checkout_sequence'); And since the sequence will just count up, there's little or no problems with performance. There's lots of ways of handling this. That's just one of the ones that doesn't slow your database down a lot. If you need to, you can shuffle the numbers going into the locatorcodes table with an order by random() when you create it. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly