Re: [SQL] changing multiple pk's in one update
On 2009-04-13, Stuart McGraw wrote:
> Jasen Betts wrote:
>> I see no reason to keep the index (and its associated UNIQUE
>> constraint) during the update, AFAICT all it does is slow the process
>> down.
>
> Thanks for the suggestion.
>
> Unfortunately I am doing this key renumbering in an interactive gui
> app and as there are several million rows involved, rebuilding indexes
> takes too long.

The update takes a long time too if it's updating all the rows, and updating the index piecewise at the same time.

With the index extant I get from 20 seconds (if the start and end ranges don't overlap) to 28 seconds (with maximum overlap) for a table with 100 (short) rows. It takes 18 seconds if I first drop the index, then update, then restore the index. So, about the same amount of time, or slightly faster, by dropping the index. If I wrap them in a transaction it takes 30 seconds each way.

> I have continued searching for other ways to do this but most of my
> google results are floating in a sea of "update if insert fails, like
> mysql" results so I still have a little hope it is possible.
>
> I thought I remember seeing, a year or two ago, an update statement
> with an ordered subquery that avoided duplicate key errors but I am
> probably misremembering.

Maybe you can do it using a cursor? I've not looked at them yet.
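For concreteness, this is roughly what the drop-then-rebuild variant being timed above looks like; the table, column, and constraint names here are illustrative, not taken from the thread:

BEGIN;
-- drop the unique index / PK so the bulk update doesn't have to maintain it row by row
ALTER TABLE entries DROP CONSTRAINT entries_pkey;
-- the renumbering itself
UPDATE entries SET seq = seq + 1 WHERE entry_id = 42;
-- rebuild the constraint; this is where the single bulk index build happens
ALTER TABLE entries ADD CONSTRAINT entries_pkey PRIMARY KEY (entry_id, seq);
COMMIT;

Wrapping all three steps in one transaction (as in the "30 seconds each way" figure) means other sessions never see the table without its key, at the cost of holding the lock for the whole run.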
Re: [SQL] ENUM vs DOMAIN vs FKyed loookup table
On 2009-04-12, Dirk Jagdmann wrote:
>> When you need to choose between enum types, domain types or lookup
>> tables with foreign keys, what do you usually choose?
>
> When I have a column with valid values that I know when writing my
> tables and that will *never* change I use an enum. For example a human
> gender type (and remember that there are 4 values for human sex if you
> want to model it completely).

"lambda moo" has even more genders than that.
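For readers weighing the options, a minimal sketch of the enum approach Dirk describes; the type name and labels are illustrative, not taken from his message:

CREATE TYPE human_gender AS ENUM ('female', 'male', 'intersex', 'unknown');

CREATE TABLE person (
    id   serial PRIMARY KEY,
    name text NOT NULL,
    sex  human_gender   -- only the four labels declared above are accepted
);

The trade-off is the one Dirk points at: adding a label later means altering or recreating the type, whereas a lookup table with a foreign key can grow with a plain INSERT.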
Re: [SQL] Postgres process resident size does not drop after killing statement
Bryce Nesbitt writes:
> Every so often our production Postgres 8.3 system will get a statement
> that runs for a few hours, or a few days, or more, and needs to be
> killed dead. We kill it with pg_cancel_backend(), and cpu usage of the
> process immediately drops, and the process starts serving other
> statements. But the curious thing is the resident size does not drop.
> Is this normal?

Resident size as displayed by "top" is pretty misleading on many systems. Typically, it counts whatever private memory the process has, plus as many pages of the shared memory area as the process has actually touched in its lifespan. So the latter portion ramps up from nil to approach the size of the shared memory block, just as a byproduct of normal operations.

You did not say what your shared_buffers setting is, but if it's a GB or two then I don't find anything surprising in the numbers you show.

regards, tom lane
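For reference, the setting Tom asks about, and the cancel call from Bryce's report, can be checked from psql; the pid below is a placeholder:

SHOW shared_buffers;

-- the same setting with its unit, from the catalog view:
SELECT name, setting, unit FROM pg_settings WHERE name = 'shared_buffers';

-- cancelling a long-running statement by backend pid, as in the report
-- (look the pid up in pg_stat_activity; the column is "procpid" on 8.3):
SELECT pg_cancel_backend(12345);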
Re: [SQL] ENUM vs DOMAIN vs FKyed loookup table
ja...@xnet.co.nz (Jasen Betts) writes:
> On 2009-04-12, Dirk Jagdmann wrote:
>>> When you need to choose between enum types, domain types or lookup
>>> tables with foreign keys, what do you usually choose?
>>
>> When I have a column with valid values that I know when writing my
>> tables and that will *never* change I use an enum. For example a
>> human gender type (and remember that there are 4 values for human
>> sex if you want to model it completely).
>
> "lambda moo" has even more genders than that.

I'm not so sure about that... I suspect what you're thinking of are "Spivak pronouns," which aren't an indication of gender... Instead, they represent a way of declining to indicate gender.

It appears that what LambdaMOO has is not a larger number of genders, but rather a large number of ways to *indicate* gender, some of which (e.g., Spivak pronouns) actually decline to do so. This is more or less equivalent to setting TZ/PGTZ or LC_LANGUAGE to get the system to output values in some format that you prefer.

It may be that LambdaMOO has some ways to indicate sexual preferences, but sexual preferences are not the same thing as gender.

-- 
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/linuxxian.html
"The only thing better than TV with the sound off is Radio with the sound off." -- Dave Moon
[SQL] How to count from a second table in an aggregate query?
Hi,

I'm trying to figure out how to do something which I'd guess is easy for a SQL whiz but has me stumped. I would greatly appreciate any help on this - it's a form of SQL query that I've never figured out, but have wanted to use many times over the years.

I want to generate an analysis report that counts the values in two separate tables. I've been able to accomplish what I want with two separate queries that I then merge together in Excel. Essentially what I need is a "horizontal UNION" statement (or something like that). I've included some DDL and sample SQL queries that explain what I want better than I can in English, but the general idea is:

  get a FK id and count of a certain column in one table, based on some criteria
  -> for each FK id, get the count of a different column in a different table
  Display the counts from both queries side-by-side along with the FK id's in a single result set

Thanks for any assistance on this!

Steve

/*SQL STARTS*/
drop table if exists contact_log;
drop table if exists contact_property;

create table contact_log(
    id serial NOT null,
    src_contact_id integer,
    log_type character varying(63),
    CONSTRAINT contact_log_pkey PRIMARY KEY (id));

create table contact_property(
    id serial NOT null,
    contact_id integer,
    property_id integer,
    CONSTRAINT contact_property_pkey PRIMARY KEY (id),
    CONSTRAINT contact_property_cid_pid UNIQUE (contact_id, property_id));

insert into contact_log (src_contact_id, log_type) values(1, 'web');
insert into contact_log (src_contact_id, log_type) values(1, 'web');
insert into contact_log (src_contact_id, log_type) values(1, 'web');
insert into contact_log (src_contact_id, log_type) values(1, 'detail');
insert into contact_log (src_contact_id, log_type) values(1, 'detail');
insert into contact_log (src_contact_id, log_type) values(2, 'detail');
insert into contact_log (src_contact_id, log_type) values(2, 'detail');
insert into contact_log (src_contact_id, log_type) values(2, 'web');
insert into contact_log (src_contact_id, log_type) values(2, 'foobar');
insert into contact_log (src_contact_id, log_type) values(3, 'foobar');
insert into contact_log (src_contact_id, log_type) values(4, 'web');

insert into contact_property (contact_id, property_id) values(1, 20);
insert into contact_property (contact_id, property_id) values(1, 21);
insert into contact_property (contact_id, property_id) values(1, 22);
insert into contact_property (contact_id, property_id) values(2, 23);
insert into contact_property (contact_id, property_id) values(2, 24);
insert into contact_property (contact_id, property_id) values(1, 50);
insert into contact_property (contact_id, property_id) values(3, 51);
insert into contact_property (contact_id, property_id) values(5, 52);

-- This gets what I want from contact_log
select src_contact_id, count(log_type)
from contact_log
where contact_log.src_contact_id in (select contact_id from contact_property)
  and log_type in ('web', 'detail')
  and src_contact_id in (select contact_id from contact_property)
group by src_contact_id
order by src_contact_id;
-- correct output is : 1|5, 2|3

-- This gets what I want from contact_property
select contact_id, count(property_id)
from contact_property
where contact_id in (select src_contact_id from contact_log
                     where log_type in ('web', 'detail'))
group by contact_id
order by contact_id;
-- correct output is: 1|4, 2|2

-- THIS DOESN'T WORK (of course - but what would?)
select src_contact_id, count(log_type), count(property_id)
from contact_log
join contact_property cp on cp.contact_id = contact_log.src_contact_id
where contact_log.src_contact_id in (select contact_id from contact_property)
  and log_type in ('web', 'detail')
group by src_contact_id
order by src_contact_id;
-- correct output *should be* : 1|5|4, 2|3|2
/*SQL ENDS*/
Re: [SQL] changing multiple pk's in one update
On Wed, Apr 15, 2009 at 8:43 AM, Jasen Betts wrote:
> the update takes a long time too if it's updating all the rows,
> and updating the index piecewise at the same time.
> with the index extant I get from 20 seconds (if the start and end
> ranges don't overlap) to 28 seconds (with maximum overlap) for a table
> with 100 (short) rows

He's using a compound key as his PK, e.g. (entry, order), where there are only a few order values for each entry, so I think the real case is only updating order for a specific set of entries.

>> I have continued searching for other ways to do this but most of my
>> google results are floating in a sea of "update if insert fails, like
>> mysql" results so I still have a little hope it is possible.
>> I thought I remember seeing, a year or two ago, an update statement
>> with an ordered subquery that avoided duplicate key errors but I am
>> probably misremembering.

Bear in mind that the update is failing based on the order the data is in the table, not the PK order.

create table test (id integer primary key);
insert into test (id) values (2), (1), (3);

update test set id=id+1;
ERROR:  duplicate key value violates unique constraint "test_pkey"
update test set id=id-1;
ERROR:  duplicate key value violates unique constraint "test_pkey"

Both fail, because it tries to update 2 first.

I suppose in a real pinch, you could renumber in two steps. For example, if you wanted to delete id 6 and move everything else down:

insert into test (id) values (2), (7), (3), (1), (4), (5), (6), (8), (9);

begin;
set transaction isolation level serializable;
delete from test where id=6;
update test set id = id+100 where id >= 6;
update test set id = id-101 where id >= 6;
commit;

Not very nice, but if "id" is really a sequence number starting at 1 in your case and not an always-increasing generated regular serial (so there's no chance of it actually reaching the arbitrarily large number 100), it should work. (It'd probably be workable for real serials, too, with a much larger offset.)

If someone else creates a new sense for that entry after the first update, it'll sit on the order number you were about to use and the operation will fail. Serialize so nobody else will insert until you're done.

-- 
Glenn Maynard
Re: [SQL] How to count from a second table in an aggregate query?
On Apr 15, 2009, at 1:15 PM, Steve Midgley wrote:

> [... original question, DDL, and sample data snipped; see Steve's message above ...]
>
> -- THIS DOESN'T WORK (of course - but what would?)
> select src_contact_id, count(log_type), count(property_id)
> from contact_log
> join contact_property cp on cp.contact_id = contact_log.src_contact_id
> where contact_log.src_contact_id in (select contact_id from contact_property)
>   and log_type in ('web', 'detail')
> group by src_contact_id
> order by src_contact_id;
> -- correct output *should be* : 1|5|4, 2|3|2
> /*SQL ENDS*/

First, in that last query, working or not, you don't need the "contact_log.src_contact_id in (select contact_id from contact_property)" clause, as you've already covered that with the join condition "cp.contact_id = contact_log.src_contact_id".

Anyways, on to your actual question: you can't do that in one level from what I can see, as the query first does the join and then executes the aggregates on the results of the join. Let's check out the results of that join without the aggregates (I'm ignoring the id values here since they don't come into play and it will help demonstrate what's happening later):

select cl.src_contact_id, cl.log_type, cp.contact_id, cp.property_id
from contact_log cl, contact_property cp
where cl.src_contact_id = cp.contact_id
  and cl.log_type in ('web', 'detail');

 src_contact_id | log_type | contact_id | property_id
----------------+----------+------------+-------------
              1 | detail   |          1 |          20
              1 | detail   |          1 |          20
              1 | web      |          1 |          20
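(Not part of Erik's reply, but as an illustration of the row inflation he describes: each of contact 1's five qualifying log rows is repeated once per matching property row, so a plain count() over the join counts the multiplied rows. One way around that, keeping the same join, is to count distinct values instead; this is only a sketch, not the approach recommended later in the thread.)

select cl.src_contact_id,
       count(distinct cl.id)          as log_count,       -- 5 for contact 1, 3 for contact 2
       count(distinct cp.property_id) as property_count   -- 4 for contact 1, 2 for contact 2
from contact_log cl
join contact_property cp on cp.contact_id = cl.src_contact_id
where cl.log_type in ('web', 'detail')
group by cl.src_contact_id
order by cl.src_contact_id;
-- gives 1|5|4, 2|3|2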
Re: [SQL] How to count from a second table in an aggregate query?
Erik Jones wrote:
> On Apr 15, 2009, at 1:15 PM, Steve Midgley wrote:
>> I want to generate an analysis report that counts the values in two
>> separate tables. I've been able to accomplish what I want with two
>> separate queries that I then merge together in Excel. Essentially what
>> I need is a "horizontal UNION" statement (or something like that).
>>
>> get a FK id and count of a certain column in one table, based on some criteria
>> -> for each FK id, get the count of a different column in a different table
>> Display the counts from both queries side-by-side along with the FK id's
>> in a single result set
>
> Joining against a subquery for the second count does the trick:
>
> select src_contact_id, count(log_type), cp.count
> from contact_log,
>      (select contact_id, count(property_id)
>       from contact_property
>       group by contact_id) as cp
> where src_contact_id = cp.contact_id
>   and log_type in ('web', 'detail')
> group by src_contact_id, cp.count
> order by src_contact_id;
>
>  src_contact_id | count | count
> ----------------+-------+-------
>               1 |     5 |     4
>               2 |     3 |     2

Thanks Erik! This is perfect. Oliveiros showed me another neat solution a while back that involved a select statement in the from clause, but I kind of filed that solution mentally as a cool parlor trick. Now I see that I'm going to have to learn and study this form of SQL more closely, as it's extremely flexible and powerful.

Thanks for the very complete and patiently instructive response - it makes perfect sense. I'll work to share this along as I go.

Steve