Re: [SQL] changing multiple pk's in one update
On 2009-04-13, Stuart McGraw wrote:
> Jasen Betts wrote:
>> I see no reason to keep the index (and its associated UNIQUE
>> constraint) during the update, AFAICT all it does is slow the process
>> down.
>
> Thanks for the suggestion.
>
> Unfortunately I am doing this key renumbering in an interactive gui
> app and as there are several million rows involved, rebuilding indexes
> takes too long.

The update takes a long time too if it's updating all the rows, and updating the index piecewise at the same time.

With the index extant I get from 20 seconds (if the start and end ranges don't overlap) to 28 seconds (with maximum overlap) for a table with 100 (short) rows. It takes 18 seconds if I first drop the index, then update, then restore the index. So, about the same amount of time, or slightly faster, by dropping the index. If I wrap them in a transaction it takes 30 seconds each way.

> I have continued searching for other ways to do this but most of my
> google results are floating in a sea of "update if insert fails, like
> mysql" results so I still have a little hope it is possible.
>
> I thought I remember seeing, a year or two ago, an update statement
> with an ordered subquery that avoided duplicate key errors but I am
> probably misremembering.

Maybe you can do it using a cursor? I've not looked at them yet.
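For concreteness, this is roughly what the drop-then-rebuild variant being timed above looks like; the table, column, and constraint names here are illustrative, not taken from the thread:

BEGIN;
-- drop the unique index / PK so the bulk update doesn't have to maintain it row by row
ALTER TABLE entries DROP CONSTRAINT entries_pkey;
-- the renumbering itself
UPDATE entries SET seq = seq + 1 WHERE entry_id = 42;
-- rebuild the constraint; this is where the single bulk index build happens
ALTER TABLE entries ADD CONSTRAINT entries_pkey PRIMARY KEY (entry_id, seq);
COMMIT;

Wrapping all three steps in one transaction (as in the "30 seconds each way" figure) means other sessions never see the table without its key, at the cost of holding the lock for the whole run.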
Re: [SQL] ENUM vs DOMAIN vs FKyed loookup table
On 2009-04-12, Dirk Jagdmann wrote:
>> When you need to choose between enum types, domain types or lookup
>> tables with foreign keys, what do you usually choose?
>
> When I have a column with valid values that I know when writing my
> tables and that will *never* change I use an enum. For example a human
> gender type (and remember that there are 4 values for human sex if you
> want to model it completely).

"lambda moo" has even more genders than that.
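For readers weighing the options, a minimal sketch of the enum approach Dirk describes; the type name and labels are illustrative, not taken from his message:

CREATE TYPE human_gender AS ENUM ('female', 'male', 'intersex', 'unknown');

CREATE TABLE person (
    id   serial PRIMARY KEY,
    name text NOT NULL,
    sex  human_gender   -- only the four labels declared above are accepted
);

The trade-off is the one Dirk points at: adding a label later means altering or recreating the type, whereas a lookup table with a foreign key can grow with a plain INSERT.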
Re: [SQL] Postgres process resident size does not drop after killing statement
Bryce Nesbitt writes:
> Every so often our production Postgres 8.3 system will get a statement
> that runs for a few hours, or a few days, or more, and needs to be
> killed dead. We kill it with pg_cancel_backend(), and cpu usage of the
> process immediately drops, and the process starts serving other
> statements. But the curious thing is the resident size does not drop.
> Is this normal?

Resident size as displayed by "top" is pretty misleading on many systems. Typically, it counts whatever private memory the process has, plus as many pages of the shared memory area as the process has actually touched in its lifespan. So the latter portion ramps up from nil to approach the size of the shared memory block, just as a byproduct of normal operations.

You did not say what your shared_buffers setting is, but if it's a GB or two then I don't find anything surprising in the numbers you show.

regards, tom lane
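For reference, the setting Tom asks about, and the cancel call from Bryce's report, can be checked from psql; the pid below is a placeholder:

SHOW shared_buffers;

-- the same setting with its unit, from the catalog view:
SELECT name, setting, unit FROM pg_settings WHERE name = 'shared_buffers';

-- cancelling a long-running statement by backend pid, as in the report
-- (look the pid up in pg_stat_activity; the column is "procpid" on 8.3):
SELECT pg_cancel_backend(12345);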
Re: [SQL] ENUM vs DOMAIN vs FKyed loookup table
ja...@xnet.co.nz (Jasen Betts) writes:
> On 2009-04-12, Dirk Jagdmann wrote:
>>> When you need to choose between enum types, domain types or lookup
>>> tables with foreign keys, what do you usually choose?
>>
>> When I have a column with valid values that I know when writing my
>> tables and that will *never* change I use an enum. For example a
>> human gender type (and remember that there are 4 values for human
>> sex if you want to model it completely).
>
> "lambda moo" has even more genders than that.

I'm not so sure about that... I suspect what you're thinking of are "Spivak pronouns," which aren't an indication of gender... Instead, they represent a way of declining to indicate gender.

It appears that what LambdaMOO has is not a larger number of genders, but rather a large number of ways to *indicate* gender, some of which (e.g., Spivak pronouns) actually decline to do so. This is more or less equivalent to setting TZ/PGTZ or LC_LANGUAGE to get the system to output values in some format that you prefer.

It may be that LambdaMOO has some ways to indicate sexual preferences, but sexual preferences are not the same thing as gender.

-- 
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/linuxxian.html
"The only thing better than TV with the sound off is Radio with the sound off." -- Dave Moon
[SQL] How to count from a second table in an aggregate query?
Hi,

I'm trying to figure out how to do something which I'd guess is easy for a SQL whiz but has me stumped. I would greatly appreciate any help on this - it's a form of SQL query that I've never figured out, but have wanted to use many times over the years.

I want to generate an analysis report that counts the values in two separate tables. I've been able to accomplish what I want with two separate queries that I then merge together in Excel. Essentially what I need is a "horizontal UNION" statement (or something like that). I've included some DDL and sample SQL queries that explain what I want better than I can in English, but the general idea is:

  get a FK id and count of a certain column in one table, based on some criteria
  -> for each FK id, get the count of a different column in a different table
  Display the counts from both queries side-by-side along with the FK id's in a single result set

Thanks for any assistance on this!

Steve

/*SQL STARTS*/
drop table if exists contact_log;
drop table if exists contact_property;

create table contact_log(
    id serial NOT null,
    src_contact_id integer,
    log_type character varying(63),
    CONSTRAINT contact_log_pkey PRIMARY KEY (id));

create table contact_property(
    id serial NOT null,
    contact_id integer,
    property_id integer,
    CONSTRAINT contact_property_pkey PRIMARY KEY (id),
    CONSTRAINT contact_property_cid_pid UNIQUE (contact_id, property_id));

insert into contact_log (src_contact_id, log_type) values(1, 'web');
insert into contact_log (src_contact_id, log_type) values(1, 'web');
insert into contact_log (src_contact_id, log_type) values(1, 'web');
insert into contact_log (src_contact_id, log_type) values(1, 'detail');
insert into contact_log (src_contact_id, log_type) values(1, 'detail');
insert into contact_log (src_contact_id, log_type) values(2, 'detail');
insert into contact_log (src_contact_id, log_type) values(2, 'detail');
insert into contact_log (src_contact_id, log_type) values(2, 'web');
insert into contact_log (src_contact_id, log_type) values(2, 'foobar');
insert into contact_log (src_contact_id, log_type) values(3, 'foobar');
insert into contact_log (src_contact_id, log_type) values(4, 'web');

insert into contact_property (contact_id, property_id) values(1, 20);
insert into contact_property (contact_id, property_id) values(1, 21);
insert into contact_property (contact_id, property_id) values(1, 22);
insert into contact_property (contact_id, property_id) values(2, 23);
insert into contact_property (contact_id, property_id) values(2, 24);
insert into contact_property (contact_id, property_id) values(1, 50);
insert into contact_property (contact_id, property_id) values(3, 51);
insert into contact_property (contact_id, property_id) values(5, 52);

-- This gets what I want from contact_log
select src_contact_id, count(log_type)
from contact_log
where contact_log.src_contact_id in (select contact_id from contact_property)
  and log_type in ('web', 'detail')
  and src_contact_id in (select contact_id from contact_property)
group by src_contact_id
order by src_contact_id;
-- correct output is : 1|5, 2|3

-- This gets what I want from contact_property
select contact_id, count(property_id)
from contact_property
where contact_id in (select src_contact_id from contact_log
                     where log_type in ('web', 'detail'))
group by contact_id
order by contact_id;
-- correct output is: 1|4, 2|2

-- THIS DOESN'T WORK (of course - but what would?)
select src_contact_id, count(log_type), count(property_id)
from contact_log
join contact_property cp on cp.contact_id = contact_log.src_contact_id
where contact_log.src_contact_id in (select contact_id from contact_property)
  and log_type in ('web', 'detail')
group by src_contact_id
order by src_contact_id;
-- correct output *should be* : 1|5|4, 2|3|2
/*SQL ENDS*/
Re: [SQL] changing multiple pk's in one update
On Wed, Apr 15, 2009 at 8:43 AM, Jasen Betts wrote:
> the update takes a long time too if it's updating all the rows,
> and updating the index piecewise at the same time.
> with the index extant I get from 20 seconds (if the start and end
> ranges don't overlap) to 28 seconds (with maximum overlap) for a table
> with 100 (short) rows

He's using a compound key as his PK, e.g. (entry, order), where there are only a few order values for each entry, so I think the real case is only updating order for a specific set of entries.

>> I have continued searching for other ways to do this but most of my
>> google results are floating in a sea of "update if insert fails, like
>> mysql" results so I still have a little hope it is possible.
>> I thought I remember seeing, a year or two ago, an update statement
>> with an ordered subquery that avoided duplicate key errors but I am
>> probably misremembering.

Bear in mind that the update is failing based on the order the data is in the table, not the PK order.

create table test (id integer primary key);
insert into test (id) values (2), (1), (3);

update test set id=id+1;
ERROR:  duplicate key value violates unique constraint "test_pkey"
update test set id=id-1;
ERROR:  duplicate key value violates unique constraint "test_pkey"

Both fail, because it tries to update 2 first.

I suppose in a real pinch, you could renumber in two steps. For example, if you wanted to delete id 6 and move everything else down:

insert into test (id) values (2), (7), (3), (1), (4), (5), (6), (8), (9);

begin;
set transaction isolation level serializable;
delete from test where id=6;
update test set id = id+100 where id >= 6;
update test set id = id-101 where id >= 6;
commit;

Not very nice, but if "id" is really a sequence number starting at 1 in your case and not an always-increasing generated regular serial (so there's no chance of it actually reaching the arbitrarily large number 100), it should work. (It'd probably be workable for real serials, too, with a much larger offset.)

If someone else creates a new sense for that entry after the first update, it'll sit on the order number you were about to use and the operation will fail. Serialize so nobody else will insert until you're done.

-- 
Glenn Maynard
Re: [SQL] How to count from a second table in an aggregate query?
On Apr 15, 2009, at 1:15 PM, Steve Midgley wrote:

> [... original question, DDL, and sample data snipped; see Steve's message above ...]
>
> -- THIS DOESN'T WORK (of course - but what would?)
> select src_contact_id, count(log_type), count(property_id)
> from contact_log
> join contact_property cp on cp.contact_id = contact_log.src_contact_id
> where contact_log.src_contact_id in (select contact_id from contact_property)
>   and log_type in ('web', 'detail')
> group by src_contact_id
> order by src_contact_id;
> -- correct output *should be* : 1|5|4, 2|3|2
> /*SQL ENDS*/

First, in that last query, working or not, you don't need the "contact_log.src_contact_id in (select contact_id from contact_property)" clause, as you've already covered that with the join condition "cp.contact_id = contact_log.src_contact_id".

Anyways, on to your actual question: you can't do that in one level from what I can see, as the query first does the join and then executes the aggregates on the results of the join. Let's check out the results of that join without the aggregates (I'm ignoring the id values here since they don't come into play and it will help demonstrate what's happening later):

select cl.src_contact_id, cl.log_type, cp.contact_id, cp.property_id
from contact_log cl, contact_property cp
where cl.src_contact_id = cp.contact_id
  and cl.log_type in ('web', 'detail');

 src_contact_id | log_type | contact_id | property_id
----------------+----------+------------+-------------
              1 | detail   |          1 |          20
              1 | detail   |          1 |          20
              1 | web      |          1 |          20
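(Not part of Erik's reply, but as an illustration of the row inflation he describes: each of contact 1's five qualifying log rows is repeated once per matching property row, so a plain count() over the join counts the multiplied rows. One way around that, keeping the same join, is to count distinct values instead; this is only a sketch, not the approach recommended later in the thread.)

select cl.src_contact_id,
       count(distinct cl.id)          as log_count,       -- 5 for contact 1, 3 for contact 2
       count(distinct cp.property_id) as property_count   -- 4 for contact 1, 2 for contact 2
from contact_log cl
join contact_property cp on cp.contact_id = cl.src_contact_id
where cl.log_type in ('web', 'detail')
group by cl.src_contact_id
order by cl.src_contact_id;
-- gives 1|5|4, 2|3|2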
Re: [SQL] How to count from a second table in an aggregate query?
Erik Jones wrote:
> On Apr 15, 2009, at 1:15 PM, Steve Midgley wrote:
>> I want to generate an analysis report that counts the values in two
>> separate tables. I've been able to accomplish what I want with two
>> separate queries that I then merge together in Excel. Essentially what
>> I need is a "horizontal UNION" statement (or something like that).
>>
>> get a FK id and count of a certain column in one table, based on some criteria
>> -> for each FK id, get the count of a different column in a different table
>> Display the counts from both queries side-by-side along with the FK id's
>> in a single result set
>
> Joining against a subquery for the second count does the trick:
>
> select src_contact_id, count(log_type), cp.count
> from contact_log,
>      (select contact_id, count(property_id)
>       from contact_property
>       group by contact_id) as cp
> where src_contact_id = cp.contact_id
>   and log_type in ('web', 'detail')
> group by src_contact_id, cp.count
> order by src_contact_id;
>
>  src_contact_id | count | count
> ----------------+-------+-------
>               1 |     5 |     4
>               2 |     3 |     2

Thanks Erik! This is perfect. Oliveiros showed me another neat solution a while back that involved a select statement in the from clause, but I kind of filed that solution mentally as a cool parlor trick. Now I see that I'm going to have to learn and study this form of SQL more closely, as it's extremely flexible and powerful.

Thanks for the very complete and patiently instructive response - it makes perfect sense. I'll work to share this along as I go.

Steve