Re: [GENERAL] Adding ON UPDATE CASCADE to an existing foreign key
Stephan Szabo wrote:
> On Thu, 4 May 2006, Rich Doughty wrote:
>> I have a foreign key constraint that I'd like to alter. I'd rather not drop and re-create it due to the size of the table involved. All I need to do is add an ON UPDATE CASCADE. Is it ok to set confupdtype to 'c' in pg_constraint (and will this be all that's needed) or is it safer to drop and recreate the constraint?
>
> I don't think that's going to work, you'd probably need to change the function associated with the trigger involved too. It's probably safer to do the drop and create.

ok, thanks.

--
- Rich Doughty

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
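For the archives, a sketch of the drop-and-recreate route (the table, column and constraint names here are made up — find the real constraint name with \d on the child table). Doing both steps in one transaction keeps the table constrained throughout; the ADD step re-validates every existing row, which is the slow part on a big table:

```sql
BEGIN;
-- hypothetical names; substitute the real constraint, table and columns
ALTER TABLE child_table DROP CONSTRAINT child_table_parent_id_fkey;
ALTER TABLE child_table
    ADD CONSTRAINT child_table_parent_id_fkey
    FOREIGN KEY (parent_id) REFERENCES parent_table (parent_id)
    ON UPDATE CASCADE;
COMMIT;
```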
[GENERAL] Adding ON UPDATE CASCADE to an existing foreign key constraint
I have a foreign key constraint that I'd like to alter. I'd rather not drop and re-create it due to the size of the table involved. All I need to do is add an ON UPDATE CASCADE. Is it ok to set confupdtype to 'c' in pg_constraint (and will this be all that's needed) or is it safer to drop and recreate the constraint?

PG Version 8.0.3

Thanks a lot

- Rich Doughty
Re: [GENERAL] How do I prevent binding to TCP/IP port outside of
Karl Wright wrote:
> I have a situation where I need postgres to LISTEN and allow BINDs to its TCP/IP port (5432) only to connections that originate from localhost. I need it to not accept *socket* connections if requests come in from off-box. If I try to set up pg_hba.conf such that it rejects off-box requests, it seems to do this after it permits the socket connection, and that won't do for our security geeks here.

try

    listen_addresses = 'localhost'

in your postgresql.conf

> For example, here's the difference:
>
>     [EMAIL PROTECTED]:~$ curl http://duck37:5432
>     curl: (52) Empty reply from server
>     [EMAIL PROTECTED]:~$ curl http://duck37:5433
>     curl: (7) couldn't connect to host
>     [EMAIL PROTECTED]:~$
>
> Note that the outside world seems to be able to connect to 5432 just fine, although any *database* connections get (properly) rejected. I cannot turn off TCP/IP entirely because I have a Java application that uses JDBC. Can somebody tell me whether this is an innate capability of postgres, or whether I will need to modify the base code (and if so, WHERE I would modify it?)
>
> Thanks, Karl Wright

--
- Rich Doughty
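A sketch of the relevant postgresql.conf fragment (8.0-era syntax). With this, the postmaster binds the TCP port only on the loopback interface, so off-box clients get "connection refused" before pg_hba.conf is ever consulted, while a local JDBC client can still reach localhost:5432:

```ini
# postgresql.conf (sketch)
listen_addresses = 'localhost'   # bind TCP only on loopback
port = 5432
```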
[GENERAL] verifying database integrity - fsck for pg?
We are currently migrating a cluster between hosts. I'd like to verify that the new database has been transferred reliably and that the datafiles are intact. What's the recommended way to do this? We're using pg_start/stop_backup so an md5 check is out of the question.

pg version 8.0

Thanks,

- Rich Doughty
Re: [GENERAL] verifying database integrity - fsck for pg?
Tom Lane wrote:
> Rich Doughty [EMAIL PROTECTED] writes:
>> We are currently migrating a cluster between hosts. I'd like to verify that the new database has been transferred reliably and that the datafiles are intact.
>
> pg_dump both databases and diff the results, perhaps?

i had considered pg_dump. i was hoping there was a utility similar to fsck that could check for corruption. i'd like to verify now that the data is ok, rather than coming across errors in 6 months' time.

i'm going to do a vacuum full, and a pg_dump. at least that should mean all the data is accessible.

cheers

--
- Rich Doughty
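For the archives, a sketch of Tom's dump-and-diff suggestion (host names, user and database are placeholders; both dumps need to be taken while the databases are quiescent, otherwise the outputs won't be comparable):

```shell
# sketch only -- placeholder connection details
pg_dump -h old-host -U postgres mydb > old.sql
pg_dump -h new-host -U postgres mydb > new.sql

# identical dumps are strong evidence that every row on the new host
# is readable and equal to the original
diff -q old.sql new.sql && echo "dumps match"
```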
Re: [GENERAL] verifying database integrity - fsck for pg?
Guido Neitzer wrote:
> Perhaps you have to do some tricks to tell the script which tables or columns should be equal and which are allowed to differ, but as far as I can see, it shouldn't be that hard. In all, it took me about a day to verify the db contents.

I'm not too fussed about a row-by-row comparison between the source and the copy. It's rather a case of a tool to check the datafiles' integrity (such as fsck, myisamchk, svnadmin verify etc). If pg_dumpall returns successfully, then i would hope that all the data is present and correct.

- Rich Doughty
Re: [GENERAL] View with an outer join - is there any way to optimise
Tom Lane wrote:
> Rich Doughty [EMAIL PROTECTED] writes:
>> I have a view vw_tokens defined as ... I cannot however perform a meaningful join against this view. ... PG forms the full output of the view.
>
> You seem to be wishing that PG would push the INNER JOIN down inside the nested LEFT JOINs. In general, rearranging inner and outer joins like that can change the results. There are limited cases where it can be done without breaking the query semantics, but the planner doesn't currently have any logic to analyze whether it's safe or not, so it just doesn't try. Improving this situation is (or ought to be) on the TODO list, but I dunno when it will happen.

ok, thanks. as i suspected, i don't think i'm going to be able to use views for this. when the query is ultimately returning only 100 or so rows, i cannot afford a full 4 million row table scan to form the full view when a nested loop might make more sense (anything makes more sense than the full view :-)

i have a workaround (of sorts). instead of

    WHERE token_id IN (SELECT token_id FROM ta_tokenhist WHERE sarreport_id = 9)

if i perform the subquery manually, then create a second query of the form

    WHERE token_id IN (?,?,?,?,?)

i get decent results. it's pretty ugly but it works. i doubt that it will scale up to 500 or more results (if that), but thankfully in general, neither do the query results.

cheers anyway

- Rich Doughty
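Spelled out, the two-step workaround looks something like this (the literal ids in step 2 are hypothetical — in practice the application builds the IN list from step 1's results):

```sql
-- Step 1: run the subquery on its own (fast: ta_tokenhist is indexed)
SELECT token_id FROM tokens.ta_tokenhist WHERE sarreport_id = 9;

-- Step 2: feed the resulting ids back as literals, which lets the planner
-- drive the view's joins by token_id instead of materialising the full view
SELECT * FROM tokens.vw_tokens WHERE token_id IN (101, 205, 317);
```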
[GENERAL] View with an outer join - is there any way to optimise this?
… Any advice is greatly appreciated. i'm starting to wonder if using a view in this instance is futile.

Many thanks

--
- Rich Doughty
Re: [GENERAL] View with an outer join - is there any way to optimise
John McCawley wrote:
> First of all, check out this thread: http://archives.postgresql.org/pgsql-general/2005-11/msg00734.php
> I had a similar performance issue with a view. Look at my solution and it might help you out.

i'm not sure that'll help in my case as ta_tokens has a 1-to-many relationship with ta_tokenhist. there are various indexed tokenhist columns that i want to query on (in addition to salesorder_id). none of them will return more than 100 or so rows. it'd probably be easier to abandon the view altogether (which isn't something i'd really like to do)

> Second, you might want to change your token history status from a string to an integer that references a status table. If your view is causing a sequential scan, you're going to end up with bazillions of string comparisons. I don't know if Postgres has some form of black magic string comparison optimization, but I generally avoid string comparisons when I am dealing with a few known values, as would be the case in a status table.

interesting thought. of course, i'd rather postgres didn't do a full sequential scan ;-)

Rich Doughty wrote:
> I have a view vw_tokens defined as
>
>     CREATE VIEW tokens.vw_tokens AS
>     SELECT -- too many columns to mention
>     FROM tokens.ta_tokens t
>     LEFT JOIN tokens.ta_tokenhist i ON t.token_id = i.token_id AND i.status = 'issued'
>     LEFT JOIN tokens.ta_tokenhist s ON t.token_id = s.token_id AND s.status = 'sold'
>     LEFT JOIN tokens.ta_tokenhist r ON t.token_id = r.token_id AND r.status = 'redeemed';
>
> the ta_tokens table contains approx 4 million records, and ta_tokenhist approx 10 million. queries against the view itself on the primary key execute with no issues at all. I cannot however perform a meaningful join against this view. when i execute
>
>     SELECT *
>     FROM tokens.ta_tokenhist h
>     INNER JOIN tokens.vw_tokens t ON h.token_id = t.token_id
>     WHERE h.sarreport_id = 9;
>
> PG forms the full output of the view.
the query plan is

    Hash Join  (cost=1638048.47..3032073.73 rows=1 width=702)
      Hash Cond: ((outer.token_id)::integer = (inner.token_id)::integer)
      ->  Hash Left Join  (cost=1638042.45..3011803.15 rows=4052907 width=470)
            Hash Cond: ((outer.token_id)::integer = (inner.token_id)::integer)
            ->  Hash Left Join  (cost=1114741.93..2011923.86 rows=4052907 width=322)
                  Hash Cond: ((outer.token_id)::integer = (inner.token_id)::integer)
                  ->  Hash Left Join  (cost=559931.55..1093783.71 rows=4052907 width=174)
                        Hash Cond: ((outer.token_id)::integer = (inner.token_id)::integer)
                        ->  Seq Scan on ta_tokens t  (cost=0.00..73250.07 rows=4052907 width=26)
                        ->  Hash  (cost=459239.41..459239.41 rows=4114456 width=152)
                              ->  Seq Scan on ta_tokenhist i  (cost=0.00..459239.41 rows=4114456 width=152)
                                    Filter: ((status)::text = 'issued'::text)
                  ->  Hash  (cost=459239.41..459239.41 rows=3905186 width=152)
                        ->  Seq Scan on ta_tokenhist s  (cost=0.00..459239.41 rows=3905186 width=152)
                              Filter: ((status)::text = 'sold'::text)
            ->  Hash  (cost=459239.41..459239.41 rows=2617645 width=152)
                  ->  Seq Scan on ta_tokenhist r  (cost=0.00..459239.41 rows=2617645 width=152)
                        Filter: ((status)::text = 'redeemed'::text)
      ->  Hash  (cost=6.01..6.01 rows=1 width=236)
            ->  Index Scan using fkx_tokenhist__sarreports on ta_tokenhist h  (cost=0.00..6.01 rows=1 width=236)
                  Index Cond: ((sarreport_id)::integer = 9)

I have also tried explicitly querying token_id in the view, hoping to force a nested loop:

    EXPLAIN SELECT *
    FROM tokens.vw_tokens__user
    WHERE token_id IN (SELECT token_id FROM tokens.ta_tokenhist WHERE sarreport_id = 9);

                                    QUERY PLAN
    --------------------------------------------------------------------------
    Hash IN Join  (cost=1638048.47..3032073.73 rows=1 width=470)
      Hash Cond: ((outer.token_id)::integer = (inner.token_id)::integer)
      ->  Hash Left Join  (cost=1638042.45..3011803.15 rows=4052907 width=470)
            Hash Cond: ((outer.token_id)::integer = (inner.token_id)::integer)
            ->  Hash Left Join  (cost=1114741.93..2011923.86 rows=4052907 width=322)
                  Hash Cond: ((outer.token_id)::integer = (inner.token_id)::integer)
                  ->  Hash Left Join  (cost=559931.55..1093783.71 rows=4052907 width=174)
                        Hash Cond: ((outer.token_id)::integer = (inner.token_id)::integer)
                        ->  Seq Scan on ta_tokens t  (cost=0.00..73250.07 rows=4052907 width=26)
                        ->  Hash  (cost=459239.41
Re: [GENERAL] View with an outer join - is there any way to optimise
John McCawley wrote:
> You should be able to use my trick...the join that is giving you the problem is:
>
>     SELECT *
>     FROM tokens.ta_tokenhist h
>     INNER JOIN tokens.vw_tokens t ON h.token_id = t.token_id
>     WHERE h.sarreport_id = 9;
>
> ta_tokenhist is already part of your view, right? So you should be able to include the sarreport_id as part of your view, and then restructure your query as:
>
>     SELECT *
>     FROM tokens.ta_tokenhist
>     INNER JOIN tokens.vw_tokens ON tokens.ta_tokenhist.token_id = tokens.vw_tokens.token_id
>     WHERE tokens.vw_tokens.sarreport_id = 9;
>
> I removed the aliases because they confuse me ;)

i don't think i can do that. basically i want to run a variety of queries on the vw_tokens view. for example, joins i hope to do may include:

    tokens.ta_tokenhist h INNER JOIN tokens.vw_tokens WHERE h.customer_id = ?
    tokens.ta_tokenhist h INNER JOIN tokens.vw_tokens WHERE h.histdate between 'then' and 'now'
    tokens.vw_tokens WHERE number = ?

i just want vw_tokens to give me a constant resultset. i have a feeling though that views aren't going to be able to give me what i need. i suppose i could go for a set returning function, or just write the queries manually.

--
- Rich Doughty
Re: [GENERAL] massive performance hit when using Limit 1
Rich Doughty wrote:
> can anyone explain the reason for the difference in the following 2 query plans, or offer any advice? the two queries are identical apart from the limit clause.
[snip]

fwiw, join order makes no difference here either. i get a slightly different plan, but with LIMIT 1 postgres makes a really strange planner choice. as before, above LIMIT 1 the choice is logical and performance fine.

- Rich Doughty
Re: [GENERAL] massive performance hit when using Limit 1
Richard Huxton wrote:
> Rich Doughty wrote:
>> This one goes nuts and doesn't return. is there any way i can force a query plan similar to the one above?
>>
>>     EXPLAIN SELECT _t.*
>>     FROM tokens.ta_tokens _t
>>     INNER JOIN tokens.ta_tokens_stock _s ON _t.token_id = _s.token_id
>>     WHERE _s.retailer_id = '96599'
>>     AND _t.value = '10'
>>     ORDER BY _t.number ASC
>>     LIMIT '1';
>>
>>                                     QUERY PLAN
>>     ---------------------------------------------------------------------------
>>     Limit  (cost=0.00..14967.39 rows=1 width=27)
>>       ->  Nested Loop  (cost=0.00..22316378.56 rows=1491 width=27)
>>             ->  Index Scan using ta_tokens_number_key on ta_tokens _t  (cost=0.00..15519868.33 rows=1488768 width=27)
>>                   Filter: ((value)::numeric = 10::numeric)
>>             ->  Index Scan using ta_tokens_stock_pkey on ta_tokens_stock _s  (cost=0.00..4.55 rows=1 width=4)
>>                   Index Cond: ((outer.token_id)::integer = (_s.token_id)::integer)
>>                   Filter: ((retailer_id)::integer = 96599)
>
> I *think* what's happening here is that PG thinks it will use the index on _t.number (since you are going to sort by that anyway) and pretty soon find a row that will:
>   1. have value=10
>   2. join to a row in _s with the right retailer_id
> It turns out that isn't the case, and so the query takes forever. Without knowing what value and number mean it's difficult to be sure, but I'd guess it's the token_id join part that's the problem, since at a guess a high-numbered retailer will have tokens with high-numbered retailer_id. If you'd posted EXPLAIN ANALYSE then we'd be able to see what actually did happen.

no chance. it takes far too long to return (days...).

> Try the same query but with a low retailer_id (100 or something) and see if it goes a lot quicker. If that is what the problem is, try changing the ORDER BY to something like _s.retailer_id, _t.value, _t.number and see if that gives the planner a nudge in the right direction.

the retailer_id would make no difference as there are only 4000-ish rows in ta_tokens_stock and they all (for now) have the same retailer_id.

> Failing that, a change to your indexes will almost certainly help.

i'm not sure that's the case.
the exact same query, but limited to 2 rows, is fine. I found this in the 8.0.4 relnotes. i reckon it's a good guess that that's the problem:

    * Fix mis-planning of queries with small LIMIT values due to poorly thought out fuzzy cost comparison

--
- Rich Doughty
Re: [GENERAL] massive performance hit when using Limit 1
Rich Doughty wrote:
> Richard Huxton wrote:
>> Rich Doughty wrote:
[snip]
>> Try the same query but with a low retailer_id (100 or something) and see if it goes a lot quicker. If that is what the problem is, try changing the ORDER BY to something like _s.retailer_id, _t.value, _t.number and see if that gives the planner a nudge in the right direction.
>
> the retailer_id would make no difference as there are only 4000-ish rows in ta_tokens_stock and they all (for now) have the same retailer_id.

ooops. i (sort of) spoke too soon. i didn't read the second half of the comment properly. changing the ORDER BY clause does force a more sensible query plan. many thanks. so that's one way to give the planner hints...

>> Failing that, a change to your indexes will almost certainly help.
>
> i'm not sure that's the case. the exact same query, but limited to 2 rows, is fine. I found this in the 8.0.4 relnotes. i reckon it's a good guess that that's the problem:
>     * Fix mis-planning of queries with small LIMIT values due to poorly thought out fuzzy cost comparison

--
- Rich Doughty
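For the archives, the rewritten query would look something like this (a sketch: the result order is unchanged, since retailer_id and value are constant across the matched rows, but the extra leading sort keys stop the planner from walking the _t.number index):

```sql
SELECT _t.*
FROM tokens.ta_tokens _t
INNER JOIN tokens.ta_tokens_stock _s ON _t.token_id = _s.token_id
WHERE _s.retailer_id = '96599'
AND _t.value = '10'
ORDER BY _s.retailer_id, _t.value, _t.number ASC
LIMIT 1;
```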
[GENERAL] massive performance hit when using Limit 1
can anyone explain the reason for the difference in the following 2 query plans, or offer any advice? the two queries are identical apart from the limit clause.

the performance here is fine and is the same for LIMIT = 2:

    EXPLAIN SELECT _t.*
    FROM tokens.ta_tokens _t
    INNER JOIN tokens.ta_tokens_stock _s ON _t.token_id = _s.token_id
    WHERE _s.retailer_id = '96599'
    AND _t.value = '10'
    ORDER BY _t.number ASC
    LIMIT '2';

                                    QUERY PLAN
    --------------------------------------------------------------------------
    Limit  (cost=22757.15..22757.15 rows=2 width=27)
      ->  Sort  (cost=22757.15..22760.88 rows=1491 width=27)
            Sort Key: _t.number
            ->  Nested Loop  (cost=0.00..22678.56 rows=1491 width=27)
                  ->  Seq Scan on ta_tokens_stock _s  (cost=0.00..75.72 rows=4058 width=4)
                        Filter: ((retailer_id)::integer = 96599)
                  ->  Index Scan using ta_tokens_pkey on ta_tokens _t  (cost=0.00..5.56 rows=1 width=27)
                        Index Cond: ((_t.token_id)::integer = (outer.token_id)::integer)
                        Filter: ((value)::numeric = 10::numeric)
    (9 rows)

This one goes nuts and doesn't return. is there any way i can force a query plan similar to the one above?

    EXPLAIN SELECT _t.*
    FROM tokens.ta_tokens _t
    INNER JOIN tokens.ta_tokens_stock _s ON _t.token_id = _s.token_id
    WHERE _s.retailer_id = '96599'
    AND _t.value = '10'
    ORDER BY _t.number ASC
    LIMIT '1';

                                    QUERY PLAN
    --------------------------------------------------------------------------
    Limit  (cost=0.00..14967.39 rows=1 width=27)
      ->  Nested Loop  (cost=0.00..22316378.56 rows=1491 width=27)
            ->  Index Scan using ta_tokens_number_key on ta_tokens _t  (cost=0.00..15519868.33 rows=1488768 width=27)
                  Filter: ((value)::numeric = 10::numeric)
            ->  Index Scan using ta_tokens_stock_pkey on ta_tokens_stock _s  (cost=0.00..4.55 rows=1 width=4)
                  Index Cond: ((outer.token_id)::integer = (_s.token_id)::integer)
                  Filter: ((retailer_id)::integer = 96599)
    (7 rows)

All tables are vacuumed and analysed. the row estimates in the plans are accurate.
    select version();
                                                     version
    -----------------------------------------------------------------------------------------------------------
     PostgreSQL 8.0.3 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)

Thanks a lot,

- Rich Doughty
[GENERAL] Outer join query plans and performance
I'm having some significant performance problems with left join. Can anyone give me any pointers as to why the following 2 query plans are so different?

    EXPLAIN SELECT *
    FROM tokens.ta_tokens t
    LEFT JOIN tokens.ta_tokenhist h1 ON t.token_id = h1.token_id
    LEFT JOIN tokens.ta_tokenhist h2 ON t.token_id = h2.token_id
    WHERE h1.histdate = 'now';

    Nested Loop Left Join  (cost=0.00..68778.43 rows=2215 width=1402)
      ->  Nested Loop  (cost=0.00..55505.62 rows=2215 width=714)
            ->  Index Scan using idx_tokenhist__histdate on ta_tokenhist h1  (cost=0.00..22970.70 rows=5752 width=688)
                  Index Cond: (histdate = '2005-10-24 13:28:38.411844'::timestamp without time zone)
            ->  Index Scan using ta_tokens_pkey on ta_tokens t  (cost=0.00..5.64 rows=1 width=26)
                  Index Cond: ((t.token_id)::integer = (outer.token_id)::integer)
      ->  Index Scan using fkx_tokenhist__tokens on ta_tokenhist h2  (cost=0.00..5.98 rows=1 width=688)
            Index Cond: ((outer.token_id)::integer = (h2.token_id)::integer)

Performance is fine for this one and the plan is pretty much as i'd expect. This is where i hit a problem.
    EXPLAIN SELECT *
    FROM tokens.ta_tokens t
    LEFT JOIN tokens.ta_tokenhist h1 ON t.token_id = h1.token_id
    LEFT JOIN tokens.ta_tokenhist h2 ON t.token_id = h2.token_id
    WHERE h2.histdate = 'now';

    Hash Join  (cost=1249148.59..9000709.22 rows=2215 width=1402)
      Hash Cond: ((outer.token_id)::integer = (inner.token_id)::integer)
      ->  Hash Left Join  (cost=1225660.51..8181263.40 rows=4045106 width=714)
            Hash Cond: ((outer.token_id)::integer = (inner.token_id)::integer)
            ->  Seq Scan on ta_tokens t  (cost=0.00..71828.06 rows=4045106 width=26)
            ->  Hash  (cost=281243.21..281243.21 rows=10504921 width=688)
                  ->  Seq Scan on ta_tokenhist h1  (cost=0.00..281243.21 rows=10504921 width=688)
      ->  Hash  (cost=22970.70..22970.70 rows=5752 width=688)
            ->  Index Scan using idx_tokenhist__histdate on ta_tokenhist h2  (cost=0.00..22970.70 rows=5752 width=688)
                  Index Cond: (histdate = '2005-10-24 13:34:51.371905'::timestamp without time zone)

I would understand if h2 was joined on h1, but it isn't. It only joins on t. can anyone give any tips on improving the performance of the second query (aside from changing the join order manually)?

Thanks

--
- Rich Doughty
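One observation for the archives: the WHERE clause on h2.histdate already discards any NULL-extended rows, so the second LEFT JOIN behaves as an INNER JOIN. The manual rewrite (admittedly the kind of reordering the post hoped to avoid) would let the planner start from the small histdate index scan; a sketch:

```sql
-- Same rows as the original query, since WHERE h2.histdate = 'now'
-- already eliminates rows where h2 is all-NULL.
SELECT *
FROM tokens.ta_tokenhist h2
INNER JOIN tokens.ta_tokens t ON t.token_id = h2.token_id
LEFT JOIN tokens.ta_tokenhist h1 ON t.token_id = h1.token_id
WHERE h2.histdate = 'now';
```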
Re: [GENERAL] Outer join query plans and performance
Rich Doughty wrote:
> I'm having some significant performance problems with left join. Can anyone give me any pointers as to why the following 2 query plans are so different?
[snip]

knew i'd forgotten something...

    select version();
                                                     version
    -----------------------------------------------------------------------------------------------------------
     PostgreSQL 8.0.3 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)

--
- Rich Doughty
Re: [GENERAL] perl and insert
On 17 May 2005, Hrishikesh Deshmukh wrote:
> Hi All,
> Anybody knows how to use perl dbi to read a file line by line and insert into db! The books which i have tell you exclusively on running queries.

it depends on what you need to achieve, but a good place to start would be something like:

    while (my $line = <FILE>) {
        $dbh->do('INSERT INTO table (line) VALUES (?)', undef, $line);
    }

where FILE is your open filehandle, $dbh is your DBI connection, and you've modified the SQL as necessary. If performance is an issue, you may want to try this (although the performance gains depend on the database you're using):

    my $st = $dbh->prepare('INSERT INTO table (line) VALUES (?)');
    while (my $line = <FILE>) {
        $st->execute($line);
    }

- Rich Doughty
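If the file is large, a further refinement is to batch the prepared inserts into a single transaction, which avoids paying a commit per row under AutoCommit. A self-contained sketch (the DSN, file name and table name are placeholders):

```perl
use strict;
use warnings;
use DBI;

# placeholder DSN and credentials; AutoCommit off so all rows share one commit
my $dbh = DBI->connect('dbi:Pg:dbname=mydb', '', '',
                       { RaiseError => 1, AutoCommit => 0 });
my $st = $dbh->prepare('INSERT INTO table_name (line) VALUES (?)');

open my $fh, '<', 'input.txt' or die "open: $!";
while (my $line = <$fh>) {
    chomp $line;
    $st->execute($line);
}
close $fh;

$dbh->commit;      # one commit for the whole file
$dbh->disconnect;
```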
[GENERAL] Increasing the length of a varchar domain
I need to increase the length of one of my domains from 16 to 20 characters. How wise/unwise would it be for me to issue an

    UPDATE pg_type SET typtypmod = 'newlength + 4' WHERE typname = 'domain name';

command to achieve this. I'd prefer not to dump/reload the database. I remember seeing a thread on this a few months back, but can't seem to find it anymore.

PG version 7.4.2 (I know, needs updating...)

Any advice?

- Rich Doughty
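For reference, the catalog tweak the post describes would look like the sketch below (the domain name is hypothetical; this is untested advice, so take a backup and try it on a copy first). Note the direction matters: widening from 16 to 20 means every existing value already fits the new limit.

```sql
-- Sketch of the post's proposed catalog hack: for a varchar-based domain,
-- typtypmod stores the declared length plus 4, so 20 characters => 24.
BEGIN;
UPDATE pg_type
SET    typtypmod = 20 + 4
WHERE  typname = 'my_domain';   -- hypothetical domain name
COMMIT;
```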