[GENERAL] oracle listener intercept
Hi, I'm looking at replacing an Oracle database that has only a few tables, but that is accessed by many programs in the company via Oracle JDBC drivers, Oracle ODBC drivers, and the Python database interface. Is there a way to intercept the calls to the Oracle listener and redirect them to Postgres without changing the calling applications? Thanks, David
Re: [GENERAL] ident authentication with named localhost
Tom Lane wrote:
> David Link <[EMAIL PROTECTED]> writes:
>> Does anyone know how I can test ident?
>
> I'd try sniffing the IP traffic to and from it with a packet sniffer,
> and/or tracing the daemon's system calls with strace. Manually invoking
> the daemon isn't going to prove a lot; you want to watch its reaction
> to Postgres.

Thanks for your suggestion. I'm new to the concept of packet sniffing and tracing. Can you suggest where I should go or what I should read to better understand this?

> I believe some flavors of identd have debug tracing options, too ...
> check the man page ...

Too bad no one else has reported this and already found an answer. Maybe I should move to md5 authentication; however, I wanted to avoid having to type passwords. Thanks,
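For anyone following along, a minimal sketch of the two techniques Tom names, assuming a typical Linux box where identd runs as a standalone daemon (the interface name and the daemon name are assumptions; adjust for your system):

    # Watch the ident exchange itself (identd listens on TCP port 113);
    # use -i lo instead if the connection loops back over localhost:
    tcpdump -i eth0 -nn -A port 113

    # Or trace the daemon's network-related system calls while a
    # psql -h connection comes in from another terminal:
    strace -f -e trace=network -p $(pidof identd)

Start one of these, then trigger the failing psql -h connection and watch what identd actually receives and answers.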
[GENERAL] ident authentication with named localhost
Hi, I am having trouble with ident authentication. Everything is working fine except when specifying a host for connections on the local machine.

pg_hba.conf:

local  all  all                ident wp
host   all  all  10.97.8.0/24  ident wp

pg_ident.conf:

wp  dlink     dlink
wp  dlink     firstalert
wp  dlink     postgres
wp  dlink     video
wp  postgres  postgres
wp  wwwrun    firstalert
wp  wwwrun    video

If the db is on mach1 and the Unix user is dlink, the following work:

[EMAIL PROTECTED] psql -d mydb -U postgres
[EMAIL PROTECTED] psql -d mydb -U postgres -h mach1   # from a remote machine

While the following do not (nor with Perl DBI):

[EMAIL PROTECTED] psql -d mydb -U postgres -h mach1
[EMAIL PROTECTED] psql -d mydb -U postgres -h localhost
[EMAIL PROTECTED] psql -d mydb -U postgres -h 10.97.8.244
[EMAIL PROTECTED] psql -d mydb -U postgres -h 127.0.0.1

If I add the following to pg_hba.conf it works, of course:

host  all  all  10.97.8.244/32  trust

But this does not:

host  all  all  10.97.8.244/32  ident wp

If I try as the postgres Unix user then it works:

[EMAIL PROTECTED] psql -d mydb -U postgres -h mach1

We are using SUSE 9 / Linux 2.6.5-7, PostgreSQL 8.1, and LDAP. The problem might be due to how identd works on localhost with LDAP: the postgres user is found in /etc/passwd, while the dlink user is not. Incidentally, get this: on a second machine (with the same software), what's described here as not working works intermittently. Now it worked; now it didn't. For the dlink user. Weird.

Does anyone know how I can test ident? I can telnet 10.97.8.244 113. The server port I know is 5432, but what's the client port to give? Any and all help greatly appreciated. Thanks. David Link
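On the "what's the client port" question: the ident request format (RFC 1413) is the two ports of the connection as seen from the machine running identd, i.e. "<port on the identd machine>, <port on the remote machine>". A sketch of the manual test; the ephemeral port 33211 here is made up, read the real one out of netstat:

    # While a psql connection to 10.97.8.244:5432 is open, find its
    # ephemeral client-side port:
    netstat -tn | grep 5432
    # e.g.  tcp  0  0 10.97.8.244:33211  10.97.8.244:5432  ESTABLISHED

    # Then ask identd the same question the Postgres server would:
    telnet 10.97.8.244 113
    33211, 5432

A working identd answers with something like "33211, 5432 : USERID : UNIX : dlink"; an error response here would point at the identd/LDAP side rather than at Postgres.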
Re: [GENERAL] EXPLAIN SELECT .. does not return
Tom Lane wrote:
> David Link <[EMAIL PROTECTED]> writes:
>> The following large EXPLAIN SELECT statement fails to return, but
>> continues to take up processing time until it is killed.
>> [ 52-way join... ]
>
> Am I right in guessing that all the sales_xxx tables are the same size
> and have similar statistics? I think the problem is that the planner is
> faced with a large set of essentially equivalent plans and isn't pruning
> the list aggressively enough. That's something we fixed in 8.0.

Correct.

>> Postgresql 7.4.8
>
> You really oughta try something newer. On my machine, 7.4.12 plans a
> 52-way join in about a minute, and 8.0 and 8.1 in under a second.

We just completed our upgrade to 8.1.3, and we are happy campers! Our EXPLAIN plan problem has gone away and everything runs faster. I especially notice improved caching of repeated queries. Hats off to you postgres folks. Thank you very much. Postgres rocks!

> I wonder also if there's not a better way to design the query... maybe a
> UNION ALL would work better than nested joins.

We need the info in separate columns. I don't think we can do it with UNION. That's why the many joins. I understand though that with the new tablespace and inheritance features in 8.0/8.1 I could put all those sales tables back into one table and keep the data in separate files.

Thanks again for all your help.
David Link, Nielsen Entertainment, White Plains, NY
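For the record, a UNION ALL can still produce separate columns if it is combined with conditional aggregation. A minimal sketch, not from the thread; it assumes each weekly table has tid, units, and channel columns as in the posted query:

    SELECT t.tid, t.title,
           SUM(CASE WHEN s.week = 200401 THEN s.units ELSE 0 END) AS s200401u,
           SUM(CASE WHEN s.week = 200402 THEN s.units ELSE 0 END) AS s200402u
           -- ... one SUM(CASE ...) per week ...
    FROM title t
    LEFT OUTER JOIN (
        SELECT tid, 200401 AS week, units FROM sale_200401 WHERE channel = 100
        UNION ALL
        SELECT tid, 200402 AS week, units FROM sale_200402 WHERE channel = 100
        -- ... one arm per weekly table ...
    ) s ON s.tid = t.tid
    GROUP BY t.tid, t.title;

The planner then sees a single two-way join instead of a 52-way one, which sidesteps the join-order search entirely.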
[GENERAL] EXPLAIN SELECT .. does not return
Hi, the following large EXPLAIN SELECT statement fails to return, but continues to take up processing time until it is killed. I've straightened up postgresql.conf per Tom Lane's suggestions from the last time I wrote about this. See: http://archives.postgresql.org/pgsql-general/2005-12/msg00259.php

Any and all help greatly appreciated. Thank you. David Link, White Plains, NY

Environment:
Linux 2.6.5-7.191 SMP
Postgresql 7.4.8
Database size: 110 Gb on disk (vacuum analyze done)

Processes:
postgres 15687     1  0 16:12 pts/1 00:00:00 /usr/bin/postmaster -D /db/pgsql
postgres 15693 15687  0 16:12 pts/1 00:00:00 postgres: stats buffer process
postgres 15695 15693  0 16:12 pts/1 00:00:00 postgres: stats collector process
postgres 17485 15687 99 16:17 pts/1 00:18:17 postgres: dlink usbkup [local] EXPLAIN

Configuration:
tcpip_socket = true
max_connections = 200
shared_buffers = 8000
sort_mem = 262144
vacuum_mem = 65536
max_fsm_pages = 20
fsync = false
wal_sync_method = fsync
checkpoint_segments = 30
effective_cache_size = 131072
random_page_cost = 4
geqo = true
geqo_threshold = 13
default_statistics_target = 100
from_collapse_limit = 10
join_collapse_limit = 10
log_timestamp = true
stats_start_collector = true
stats_command_string = true
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'

The SQL:

explain select t.tid, t.title, to_char (t.retail_reldate, 'mm-dd-yy') as retail_reldate,
coalesce (s200401.units, 0) as s200401u, coalesce (s200402.units, 0) as s200402u,
coalesce (s200403.units, 0) as s200403u, coalesce (s200404.units, 0) as s200404u,
coalesce (s200405.units, 0) as s200405u, coalesce (s200406.units, 0) as s200406u,
coalesce (s200407.units, 0) as s200407u, coalesce (s200408.units, 0) as s200408u,
coalesce (s200409.units, 0) as s200409u, coalesce (s200410.units, 0) as s200410u,
coalesce (s200411.units, 0) as s200411u, coalesce (s200412.units, 0) as s200412u,
coalesce (s200413.units, 0) as s200413u, coalesce (s200414.units, 0) as s200414u,
coalesce (s200415.units, 0) as s200415u, coalesce (s200416.units, 0) as s200416u,
coalesce (s200417.units, 0) as s200417u, coalesce (s200418.units, 0) as s200418u,
coalesce (s200419.units, 0) as s200419u, coalesce (s200420.units, 0) as s200420u,
coalesce (s200421.units, 0) as s200421u, coalesce (s200422.units, 0) as s200422u,
coalesce (s200423.units, 0) as s200423u, coalesce (s200424.units, 0) as s200424u,
coalesce (s200425.units, 0) as s200425u, coalesce (s200426.units, 0) as s200426u,
coalesce (s200427.units, 0) as s200427u, coalesce (s200428.units, 0) as s200428u,
coalesce (s200429.units, 0) as s200429u, coalesce (s200430.units, 0) as s200430u,
coalesce (s200431.units, 0) as s200431u, coalesce (s200432.units, 0) as s200432u,
coalesce (s200433.units, 0) as s200433u, coalesce (s200434.units, 0) as s200434u,
coalesce (s200435.units, 0) as s200435u, coalesce (s200436.units, 0) as s200436u,
coalesce (s200437.units, 0) as s200437u, coalesce (s200438.units, 0) as s200438u,
coalesce (s200439.units, 0) as s200439u, coalesce (s200440.units, 0) as s200440u,
coalesce (s200441.units, 0) as s200441u, coalesce (s200442.units, 0) as s200442u,
coalesce (s200443.units, 0) as s200443u, coalesce (s200444.units, 0) as s200444u,
coalesce (s200445.units, 0) as s200445u, coalesce (s200446.units, 0) as s200446u,
coalesce (s200447.units, 0) as s200447u, coalesce (s200448.units, 0) as s200448u,
coalesce (s200449.units, 0) as s200449u, coalesce (s200450.units, 0) as s200450u,
coalesce (s200451.units, 0) as s200451u, coalesce (s200452.units, 0) as s200452u
from title t
left outer join sale_200401 s200401 on t.tid=s200401.tid and s200401.channel=100
left outer join sale_200402 s200402 on t.tid=s200402.tid and s200402.channel=100
left outer join sale_200403 s200403 on t.tid=s200403.tid and s200403.channel=100
left outer join sale_200404 s200404 on t.tid=s200404.tid and s200404.channel=100
left outer join sale_200405 s200405 on t.tid=s200405.tid and s200405.channel=100
left outer join sale_200406 s200406 on t.tid=s200406.tid and s200406.channel=100
left outer join sale_200407 s200407 on t.tid=s200407.tid and s200407.channel=100
left outer join sale_200408 s200408 on t.tid=s200408.tid and s200408.channel=100
left outer join sale_200409 s200409 on t.tid=s200409.tid and s200409.channel=100
left outer join sale_200410 s200410 on t.tid=s200410.tid and s200410.channel=100
left outer join sale_200411 s200411 on t.tid=s200411.tid and s200411.channel=100
left oute
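(An aside for anyone stuck on 7.4 with a join this wide: the planner can be told not to search join orders at all, which sidesteps the planning explosion at the cost of plan quality. A sketch, not from the thread:)

    -- With explicit JOIN syntax, join_collapse_limit = 1 makes the planner
    -- take the join order exactly as written, so planning time stays
    -- roughly linear in the number of joins:
    SET join_collapse_limit = 1;
    SET from_collapse_limit = 1;
    EXPLAIN SELECT ... ;   -- the 52-way join above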
Re: [GENERAL] EXPLAIN SELECT .. does not return
Tom Lane wrote:
> David Link <[EMAIL PROTECTED]> writes:
>> Certain SQL Queries, I believe those with many table joins, when run as
>> EXPLAIN plans, never return.
>
> I'd guess that one or all of these settings are excessive:
>   geqo_threshold = 14
>   from_collapse_limit = 13
>   join_collapse_limit = 13
> Keep in mind that the planning cost is exponential in these limits; e.g.,
> geqo_threshold = 14 probably allows planning times about 14 times greater
> than geqo_threshold = 13. While I'm looking:
>   shared_buffers = 2000
> That seems extremely low for modern machines.
>   sort_mem = 1048576
> That, on the other hand, is almost certainly way too high for a
> system-wide setting. You're promising you have 1Gb available for *each*
> sort.
>   max_fsm_pages = 10
> And this is way too low for a 100Gb database, unless most of the tables
> never see any UPDATEs or DELETEs.
>   wal_buffers = 800
> Seems a bit high, especially considering you have fsync disabled and thus
> there is no benefit whatever to buffering WAL.
>   commit_delay = 100
>   commit_siblings = 50
> Have you measured any benefit to having this turned on? All in all it
> looks like your configuration settings were chosen by throwing darts :-(
> regards, tom lane

Thanks for your reply, Tom. Different folks have made different suggestions. Can you suggest more reasonable values for these? But more importantly, do you think the problem I am having is due to these configuration shortcomings? Thanks much. David
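To make the critique concrete, a rough sketch of 7.4-era settings pointed in the direction Tom describes. These values are illustrative assumptions for a dedicated box with a few GB of RAM, not recommendations from the thread:

    shared_buffers = 8000      # ~64MB at 8K pages; 2000 is very small
    sort_mem = 65536           # 64MB per sort (unit is KB), not 1GB each
    max_fsm_pages = 200000     # track free space across a ~100Gb database
    wal_buffers = 8            # default; buffering WAL buys nothing with fsync off
    geqo_threshold = 12        # hand large joins to GEQO before exhaustive
                               # search explodes
    # commit_delay / commit_siblings: leave at defaults unless benchmarks
    # show a measurable win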
Re: [GENERAL] EXPLAIN SELECT .. does not return
Jaime Casanova wrote:
> When you have those cases, you can take a look in pg_stat_activity to
> find the offending query... or simply log all queries.

Thanks for the advice. I also turned on stats_command_string.
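(With stats_command_string enabled, the running statement shows up like this; a usage sketch using the column names of the 7.4/8.1 era:)

    SELECT procpid, usename, current_query
    FROM pg_stat_activity
    WHERE current_query NOT LIKE '<IDLE>%';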
[GENERAL] EXPLAIN SELECT .. does not return
Hi, this has become a major problem for us. Thank you in advance for your help.

OS: SUSE Linux 2.6.5-7.191-bigsmp
PostgreSQL: 7.4.8
Application: mod_perl web application using DBI.pm
Database size: 100 Gb, 1025 tables

Problem: EXPLAIN SELECT ... does not return.

Description: The application uses an EXPLAIN cost to determine whether a client's dynamic request for data is too demanding for the server, so it can gracefully deny them (currently, anything over cost=0.00..50.00). The system gets about 3000 page requests a day. Certain SQL queries, I believe those with many table joins, never return when run as EXPLAIN plans. As a result they seem to stay churning in the system. Once that happens, other queries build up and the performance of the whole database server grinds to a halt. Postgresql never dies, but eventually user requests start timing out. This happens on average two or three times a week. I kill an offending process and all's well again. I have not been able to identify with certainty an offending SQL statement.

Config params that have changed from default:

tcpip_socket = true
max_connections = 200
shared_buffers = 2000
sort_mem = 1048576
vacuum_mem = 65536
max_fsm_pages = 10
max_fsm_relations = 1000
max_files_per_process = 1000
fsync = false
wal_sync_method = fsync
wal_buffers = 800
checkpoint_segments = 30
commit_delay = 100
commit_siblings = 50
effective_cache_size = 1000
random_page_cost = 4
geqo = true
geqo_threshold = 14
default_statistics_target = 100
from_collapse_limit = 13
join_collapse_limit = 13

Note: we load lumps of data each week; then it is primarily a read-only database.
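Since the gatekeeper EXPLAIN is itself what hangs, one defensive option (my suggestion, not from the thread) is to bound it with statement_timeout, which 7.4 already supports; in principle this also cancels a statement still stuck in the planner, assuming the planner reaches an interrupt check:

    SET statement_timeout = 5000;  -- give the gatekeeper EXPLAIN 5 seconds
    EXPLAIN SELECT ... ;           -- the candidate query
    RESET statement_timeout;       -- don't leak the limit into the real query

A timed-out EXPLAIN can then be treated the same as an over-cost one: deny the request gracefully instead of letting the backend churn.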
[GENERAL] Question about backing up partial Database
Hi, can pg_dump be used to back up a selected number of tables (rather than all, or one)? We have a case where we need to back up tables of the form sale_2001xx, or sale_2002xx, etc. Our current solution is the following:

pg_dump -d dbname -t sale_200101 >  sales_2001_dump
pg_dump -d dbname -t sale_200102 >> sales_2001_dump
pg_dump -d dbname -t sale_200103 >> sales_2001_dump
..
pg_dump -d dbname -t sale_200152 >> sales_2001_dump
gzip sales_2001_dump

However this requires more disk space and more db connection overhead than:

pg_dump -d dbname | gzip > sales_2001_dump.gz

Currently all the tables are in the public namespace. We've had trouble in the past getting our Perl program to work with multiple schema names. Thank you for your help and any suggestions. Sincerely, David Link, White Plains, NY
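A sketch of one way to at least drop the intermediate uncompressed file (my sketch, assuming the week numbering runs 01..52 and GNU seq is available):

    # Stream all 52 per-table dumps straight into one gzip:
    (for wk in $(seq -w 1 52); do
        pg_dump -d dbname -t sale_2001$wk
    done) | gzip > sales_2001_dump.gz

This still opens one connection per table. pg_dump in 8.2 and later accepts multiple -t switches, and wildcard patterns such as -t 'sale_2001*', in a single run, which removes the per-table connection overhead entirely.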
Re: [GENERAL] diskspace usage recovered on db rebuild
Thank you Scott and Lonni for your replies ...

On Fri, 2005-04-01 at 11:21, David Link wrote:
> I have a question regarding filesystem disk space usage. We have a
> production database containing 5 years of sales data. Linux 2.6.5;
> Postgresql 7.4.7. We VACUUM ANALYZE the entire database every night
> (about 40 min). Its size, SUM(pg_class.relpages) * 8192 bytes, is about
> 66 gigabytes on disk. When I rebuild the database (using pg_dump and
> psql), the new resultant database is about 48 gigabytes on disk. A 27%
> space savings. Can someone tell me why that is?

Scott Marlowe wrote:
> There's a certain amount of left-over space that's OK. PostgreSQL, under
> normal operation with regular, non-full vacuums, grows until it reaches
> a "steady state" size where some percentage of each file is freed tuple
> space that can be reused by later inserts / updates. This is a good
> thing as long as it doesn't go too far. Since inserts and updates can be
> placed in already-allocated space, they can be added faster than if each
> one had to allocate more space in a file and then tack the tuple on the
> end each time. However, if there's too much free space in the table
> files, the database becomes less efficient, because each sequential read
> of the tables has to read a lot of "blank" space. It's all about
> balance. You might want to look at running the pg_autovacuum daemon and
> letting it decide when a vacuum is necessary, or scheduling regular
> vacuums to run more often than every night. You might also want to look
> at adding a vacuum, or possibly a vacuum full, after updating large
> tables, to clean up afterwards. Next time, try a vacuum full first to
> see how much space it can reclaim. And lastly, use vacuum verbose to get
> an idea whether your fsm settings are high enough.

Lonni J Friedman wrote:
> Are you doing a vacuum full each night? What are the specific command(s)
> that you are using for vacuum, pg_dump and the import?

The answer to Lonni's question:

$ vacuumdb --analyze $database 2>&1 | tee -a $log
$ time (pg_dump $database | gzip > $backup_dir/$dump_file) 2>&1 | tee -a $log
$ gunzip -c $dumpfile | psql -h $host -d $database >/dev/null
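(Two checks along the lines Scott suggests; a sketch, not from the thread:)

    -- Where the extra pages live; relpages is refreshed by VACUUM/ANALYZE:
    SELECT relname, relpages, reltuples
    FROM pg_class
    WHERE relkind = 'r'
    ORDER BY relpages DESC
    LIMIT 10;

    -- The last few lines of this report how much of the free space map is
    -- in use, i.e. whether max_fsm_pages is large enough:
    VACUUM VERBOSE;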
[GENERAL] diskspace usage recovered on db rebuild
Greetings worthymen. I have a question regarding filesystem disk space usage. We have a production database containing 5 years of sales data. Linux 2.6.5; Postgresql 7.4.7. We VACUUM ANALYZE the entire database every night (about 40 min). Its size, SUM(pg_class.relpages) * 8192 bytes, is about 66 gigabytes on disk. When I rebuild the database (using pg_dump and psql), the new resultant database is about 48 gigabytes on disk. A 27% space savings. Can someone tell me why that is? Thank you, David Link, White Plains, NY. Long live Postgres.
Re: [GENERAL] A conditional DROP TABLE function
Hi Tino,

--- Tino Wildenhain <[EMAIL PROTECTED]> wrote:
> David Link wrote:
>> Hi All,
>> Here's a conditional drop_table func for those interested. There was
>> a thread on this a long time back.
>> We do this all the time:
>>   DELETE TABLE sales;
>>   CREATE TABLE sales (...);
>
> Hm. "all the time" enables all the warning lights -
> what are you doing to have to delete and create
> the tables every time?

By 'all the time' I mean once a week. (Incidentally, it is DROP and not DELETE table, of course.) We are working with weekly loads of data. Because of the volumes I'm using a separate sales table per week (i.e., sale_200301, sale_200302, etc.), because when it was in one big happy table (i.e., sale) it was slower -- especially recreating indexes. Now you can see: to make the weekly data load process rerunnable (and we do rerun it often) I drop and [re]create this week's collection of sales-related tables. The logs contain statistics, warnings and errors, which we share with the non-geeks who use the data. Also, standard procedure (I believe) for maintaining a data model is creating and using database creation scripts (with DROP and CREATE) -- so every time you set up a brand-new system, you can get those (I believe) unnecessary messages: ERROR: table does not exist. Thanks.

>> But nobody likes
>>   ERROR: table "sales" does not exist
>> which we see all the time in the logs. I want to show the logs to
>> non-db folk -- so we can't have those error messages in it.
>
> grep -v "ERROR" should do it.

Yes, but then you've got to wrap things in a log filter to generate reports, rather than just using the processing log as it comes out.

>> (There must be some explanation why postgresql (and Oracle as well) do
>> not have CREATE OR REPLACE TABLE as it does for VIEWs and FUNCTIONs.
>> Anybody know?)
>
> Nobody needs this?
> There is:
> 1) delete from table;
> 2) truncate table;
> to remove all the data
> 3) alter table ...
> to change the table's layout.

These do not create the table should it not yet exist, and I needed that -- so I wrote the simple stored procedure to make meself happy. I noticed others have asked for a solution to the problem as well. (Are you being a stickler?) Also, in the name of consistency, CREATE OR REPLACE exists for stored procedures (and views?).

PS: I LOVE Postgres. It has made my life (as an Oracle DBA) charming rather than hell! Thanks.

> HTH
> Tino Wildenhain
[GENERAL] A conditional DROP TABLE function
Hi All, here's a conditional drop_table func for those interested. There was a thread on this a long time back. We do this all the time:

DELETE TABLE sales;
CREATE TABLE sales (...);

But nobody likes

ERROR: table "sales" does not exist

which we see all the time in the logs. I want to show the logs to non-db folk -- so we can't have those error messages in it. (There must be some explanation why postgresql (and Oracle as well) do not have CREATE OR REPLACE TABLE as it does for VIEWs and FUNCTIONs. Anybody know?) Anyway, here's drop_table():

CREATE or REPLACE function drop_table (varchar) returns varchar as '
DECLARE
    tablename alias for $1;
    cnt int4;
BEGIN
    SELECT into cnt count(*) from pg_class where relname = tablename::name;
    if cnt > 0 then
        execute \'DROP TABLE \' || tablename;
        return tablename || \' DROPPED\';
    end if;
    return tablename || \' does not exist\';
END;' language 'plpgsql';

And here's its usage in an SQL script:

\set QUIET
\pset format unaligned
\pset tuples_only
\unset QUIET
select drop_table('sale');
CREATE TABLE sale ( ... );

Regards, David
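(A historical footnote, not part of the original post: PostgreSQL 8.2 later added exactly this, so on modern releases the function can be replaced by one statement:)

    DROP TABLE IF EXISTS sales;  -- raises a NOTICE, not an ERROR, when absent
    CREATE TABLE sales (...);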
Re: [GENERAL] Why does adding SUM and GROUP BY destroy performance?
Thanks Ron, thanks Christopher, for your excellent feedback. I guess it's back to the drawing board. This is a very late-hour business requirement change, and we need quick real-time results. Two things are being considered:

1. Loading the non-aggregate query entirely into memory (using Perl CGI currently, but looking at the possibility of doing this in the database with either PL/Perl or PL/pgSQL, though I don't know what would be gained by doing it that way) and handling the summing and then the sort ourselves in the program; -- or --

2. Preprocessing a lot of the data at pre-determined times, essentially doubling the size of our database.

I'd be open to any other suggestions. Thanks again, very much. -David
[GENERAL] Why does adding SUM and GROUP BY destroy performance?
Hi, why does adding SUM and GROUP BY destroy performance? Details follow. Thanks, David Link

s1.sql:

SELECT t.tid, t.title,
       COALESCE(s0c100r100.units, 0) as w0c100r100units,
       (COALESCE(r1c2r100.units, 0) + COALESCE(y0c2r100.units, 0)) as r0c2r100units
FROM title t
JOIN upc u1 ON t.tid = u1.tid
LEFT OUTER JOIN sale_200331 s0c100r100
       ON u1.upc = s0c100r100.upc AND s0c100r100.week = 200331
      AND s0c100r100.channel = 100 AND s0c100r100.region = 100
LEFT OUTER JOIN rtd r1c2r100
       ON u1.upc = r1c2r100.upc AND r1c2r100.year = 2002
      AND r1c2r100.channel = 2 AND r1c2r100.region = 100
LEFT OUTER JOIN ytd_200331 y0c2r100
       ON u1.upc = y0c2r100.upc AND y0c2r100.week = 200331
      AND y0c2r100.channel = 2 AND y0c2r100.region = 100
LEFT OUTER JOIN media m ON t.media = m.key
LEFT OUTER JOIN screen_format sf ON t.screen_format = sf.key
WHERE t.distributor != 'CONTROL LABEL'
ORDER BY t.title ASC
LIMIT 50;

s2.sql (the same query, summed and grouped):

SELECT t.tid, t.title,
       SUM(COALESCE(s0c100r100.units, 0)) as w0c100r100units,
       SUM((COALESCE(r1c2r100.units, 0) + COALESCE(y0c2r100.units, 0))) as r0c2r100units
FROM title t
JOIN upc u1 ON t.tid = u1.tid
LEFT OUTER JOIN sale_200331 s0c100r100
       ON u1.upc = s0c100r100.upc AND s0c100r100.week = 200331
      AND s0c100r100.channel = 100 AND s0c100r100.region = 100
LEFT OUTER JOIN rtd r1c2r100
       ON u1.upc = r1c2r100.upc AND r1c2r100.year = 2002
      AND r1c2r100.channel = 2 AND r1c2r100.region = 100
LEFT OUTER JOIN ytd_200331 y0c2r100
       ON u1.upc = y0c2r100.upc AND y0c2r100.week = 200331
      AND y0c2r100.channel = 2 AND y0c2r100.region = 100
LEFT OUTER JOIN media m ON t.media = m.key
LEFT OUTER JOIN screen_format sf ON t.screen_format = sf.key
WHERE t.distributor != 'CONTROL LABEL'
GROUP BY t.tid, t.title
ORDER BY t.title ASC
LIMIT 50;

Times:
s1.sql takes 0m0.124s
s2.sql takes 1m1.450s

Stats:
title table:       68,000 rows
sale_200331 table: 150,000 rows
ytd_200331 table:  0 rows
rtd table:         650,000 rows

Indexes are in place.

s1 explain plan:

QUERY PLAN
Limit (cost=0.00..65105.51 rows=50 width=132)
  -> Nested Loop (cost=0.00..91726868.54 rows=70445 width=132)
       Join Filter: ("outer".screen_format = "inner"."key")
       -> Nested Loop (cost=0.00..91651668.74 rows=70445 width=127)
            Join Filter: ("outer".media = "inner"."key")
            -> Nested Loop (cost=0.00..91578053.95 rows=70445 width=122)
                 -> Nested Loop (cost=0.00..91236359.89 rows=70445 width=98)
                      -> Nested Loop (cost=0.00..90894665.82 rows=70445 width=74)
                           -> Nested Loop (cost=0.00..90539626.76 rows=70445 width=50)
                                -> Index Scan using title_title_ind on title t (cost=0.00..193051.67 rows=68775 width=38)
                                     Filter: (distributor <> 'CONTROL LABEL'::character varying)
                                -> Index Scan using davids_tid_index on upc u1 (cost=0.00..1309.24 rows=353 width=12)
                                     Index Cond: ("outer".tid = u1.tid)
                           -> Index Scan using sale_200331_upc_wk_chl_reg_ind on sale_200331 s0c100r100 (cost=0.00..5.02 rows=1 width=24)
                                Index Cond: (("outer".upc = s0c100r100.upc) AND (s0c100r100.week = 200331) AND (s0c100r100.channel = 100) AND (s0c100r100.region = 100))
                      -> Index Scan using rtd_upc_year_chl_reg_ind on rtd r1c2r100 (cost=0.00..4.83 rows=1 width=24)
                           Index Cond: (("outer".upc = r1c2r100.upc) AND (r1c2r100."year" = 2002) AND (r1c2r100.channel = 2) AND (r1c2r100.region = 100))
                 -> Index Scan using ytd_200331_upc_wkchlreg_ind on ytd_200331 y0c2r100 (cost=0.00..4.83 rows=1 width=24)
                      Index Cond: (("outer".upc = y0c2r100.upc) AND (y0c2r100.week = 200331) AND (y0c2r100.channel = 2) AND (y0c2r100.region = 100
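The asymmetry is the LIMIT: in s1 the planner can walk title_title_ind in title order and stop after 50 rows, but in s2 every qualifying row must be joined and summed before the first group can be emitted, so the LIMIT saves nothing. One rewrite that restores the early cutoff is to pick the 50 titles first and aggregate only those. A sketch, not from the thread; it assumes every title of interest has at least one upc row, drops the unused media/screen_format joins, and shows only the first summed column (the rtd and ytd_200331 columns follow the same pattern):

    SELECT t.tid, t.title,
           SUM(COALESCE(s.units, 0)) AS w0c100r100units
    FROM (SELECT tid, title
          FROM title
          WHERE distributor != 'CONTROL LABEL'
          ORDER BY title ASC
          LIMIT 50) t
    JOIN upc u1 ON t.tid = u1.tid
    LEFT OUTER JOIN sale_200331 s
           ON u1.upc = s.upc AND s.week = 200331
          AND s.channel = 100 AND s.region = 100
    GROUP BY t.tid, t.title
    ORDER BY t.title ASC;

The aggregation then touches only the 50 titles' rows instead of all 68,000.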
[GENERAL] Understanding explain costs
Hi, trying to understand the planner's estimated costs ... one index scan seems to be much more expensive than another. Here are the facts:

tiger=# \d bk_inv
          Table "bk_inv"
 Attribute |     Type      | Modifier
-----------+---------------+----------
 store     | varchar(5)    |
 isbn      | varchar(10)   | not null
 qty       | numeric(5,0)  |
 week      | numeric(6,0)  |
Indices: bk_inv_isbn_idx, bk_inv_store_idx

tiger=# \d bk_inv_isbn_idx
Index "bk_inv_isbn_idx"
 Attribute |    Type
-----------+-------------
 isbn      | varchar(10)
btree

tiger=# \d bk_inv_store_idx
Index "bk_inv_store_idx"
 Attribute |    Type
-----------+------------
 store     | varchar(5)
btree

There are about 50,000,000 rows, and you can see a full scan on the table is expensive:

tiger=# explain select * from bk_inv;
NOTICE: QUERY PLAN:
Seq Scan on bk_inv (cost=0.00..999623.77 rows=46790877 width=48)

However, using the isbn index costs much less:

tiger=# explain select * from bk_inv where isbn = 'foo';
NOTICE: QUERY PLAN:
Index Scan using bk_inv_isbn_idx on bk_inv (cost=0.00..53.13 rows=13 width=48)

However, however, using the store index costs are still rather high -- why is that!!??

tiger=# explain select * from bk_inv where store = 'foo';
NOTICE: QUERY PLAN:
Index Scan using bk_inv_store_idx on bk_inv (cost=0.00..53456.09 rows=13488 width=48)

Incidentally, the store index is slightly smaller than the isbn index ...

[postgres@roma tiger]$ ls -l bk_inv*
-rw-------  1 postgres postgres 1073741824 Oct 10 14:28 bk_inv
-rw-------  1 postgres postgres 1073741824 Oct 10 10:15 bk_inv.1
-rw-------  1 postgres postgres 1073741824 Oct 10 10:17 bk_inv.2
-rw-------  1 postgres postgres 1073741824 Oct 10 10:19 bk_inv.3
-rw-------  1 postgres postgres   60841984 Oct 11 15:51 bk_inv.4
-rw-------  1 postgres postgres 1073741824 Oct 10 13:37 bk_inv_isbn_idx
-rw-------  1 postgres postgres  566288384 Oct 10 14:31 bk_inv_isbn_idx.1
-rw-------  1 postgres postgres 1073741824 Oct 11 13:13 bk_inv_store_idx
-rw-------  1 postgres postgres   65921024 Oct 11 13:13 bk_inv_store_idx.1

Am I missing certain fundamentals about the planner/executor? Thanks, David Link
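The clue is in the row estimates rather than the index sizes: the planner expects isbn = 'foo' to match ~13 rows but store = 'foo' to match ~13,488, and an index scan's cost grows with the number of rows (and heap pages) it expects to visit. The per-column statistics behind those estimates can be inspected directly; a check along these lines (my sketch, assuming a release new enough to have the pg_stats view):

    SELECT attname, n_distinct, correlation
    FROM pg_stats
    WHERE tablename = 'bk_inv';

    -- A store column with few distinct values means many rows per store,
    -- hence the expensive-looking (but honest) estimate for the store index.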
[GENERAL] Performance problem with 50,000,000 rows
I'm new to PG, but this just seems wrong. Can someone take a look:

.----------.       .----------.
| bk_inv   |       | bk_title |
|----------|       |----------|
| isbn     |<----->| isbn     |
| store    |       | vendor   |
| qty      |       |          |
| week     |       `----------'
`----------'       2,000,000 recs
50,000,000 recs

Actual record numbers:
bk_inv:   46,790,877
bk_title:  2,311,710

VENDOR REPORT: a list of inventory items for any one given vendor (e.g. 01672708):

select i.isbn, t.vendor, i.store, i.qty
from bk_inv i, bk_title t
where i.isbn = t.isbn
and t.vendor = '01672708';

This query should be instantaneous. Granted, that's 50 million rows, but I have created an index on the isbn column for both tables. After about 25 minutes (on a 4-processor Dell 6300 with 1GB memory) it spits out:

ERROR: Write to hashjoin temp file failed

tiger=# explain select * from bk_inv i, bk_title t where i.isbn = t.isbn and t.vendor = '5029';
NOTICE: QUERY PLAN:
Merge Join (cost=0.00..11229637.06 rows=2172466 width=72)
  -> Index Scan using bk_title_isbn_idx on bk_title t (cost=0.00..390788.08 rows=107331 width=24)
  -> Index Scan using bk_inv_isbn_idx on bk_inv i (cost=0.00..10252621.38 rows=46790877 width=48)

BIG COST! These explain queries show the existence of the indexes and give small costs:

tiger=# explain select * from bk_title where isbn = '5029';
NOTICE: QUERY PLAN:
Index Scan using bk_title_isbn_idx on bk_title (cost=0.00..4.90 rows=1 width=24)

tiger=# explain select * from bk_inv where isbn = '0897474228';
NOTICE: QUERY PLAN:
Index Scan using bk_inv_isbn_idx on bk_inv (cost=0.00..225.53 rows=55 width=48)

Note: the same tables and the same query return instantaneously with Oracle 8.1. What I am hoping to show is that Postgres can do our job too. Any help on this much obliged. (Yes, I ran vacuum analyze.) David Link, White Plains, NY
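One thing that stands out (my observation, not from the thread): the plan walks the entire 46M-row bk_inv_isbn_idx because nothing gives the planner a selective path to the one vendor's titles. A sketch of the experiment to try, under the assumption that bk_title has no index on vendor:

    CREATE INDEX bk_title_vendor_idx ON bk_title (vendor);
    VACUUM ANALYZE bk_title;

    EXPLAIN
    select i.isbn, t.vendor, i.store, i.qty
    from bk_inv i, bk_title t
    where i.isbn = t.isbn
    and t.vendor = '01672708';

    -- With a selective vendor index, the planner can fetch that vendor's
    -- titles first and probe bk_inv_isbn_idx once per title, instead of
    -- merge-joining both indexes end to end.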