[PERFORM] on disk and in memory
Hello,

If a table takes 100 MB on disk, approximately how much space will it take in RAM/the database buffer?

Context - we are designing a database that will hold a couple of GB of data. We want to figure out how much shared_buffers we should provide to ensure that, most of the time, all the data will be in memory. This is mostly master data (transactions will go to Cassandra), and it will be read from, rarely written to. We do need data integrity, transaction management, failover etc. - hence PostgreSQL.

Regards,
Jayadevan

DISCLAIMER: "The information in this e-mail and any attachment is intended only for the person to whom it is addressed and may contain confidential and/or privileged material. If you have received this e-mail in error, kindly contact the sender and destroy all copies of the original communication. IBS makes no warranty, express or implied, nor guarantees the accuracy, adequacy or completeness of the information contained in this email or any attachment and is not liable for any errors, defects, omissions, viruses or for resultant loss or damage, if any, direct or indirect."
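[Editorial note: one way to check the on-disk vs in-cache footprint empirically is the contrib module pg_buffercache; a minimal sketch follows, in which the table name is hypothetical and the default 8 kB block size is assumed.]

```sql
-- on-disk size of the table plus its indexes and TOAST data
SELECT pg_size_pretty(pg_total_relation_size('master_data'));

-- how much of the table currently sits in shared_buffers
-- (requires the pg_buffercache contrib module to be installed)
SELECT pg_size_pretty(count(*) * 8192) AS cached
  FROM pg_buffercache b
  JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
 WHERE c.relname = 'master_data';
```

Heap pages in shared_buffers use the same 8 kB page format as on disk, so a 100 MB table occupies roughly 100 MB of buffer cache when fully resident, plus whatever the OS page cache holds.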
Re: [PERFORM] Execution from java - slow
Hi,

> Yeah. well, hm. Is the function returning a whole bunch of data?
> Also, try confirming the slow runtime from the server's point of view;
> log_min_duration_statement is a good setting for that.

I did try those options. In the end, removing an ORDER BY (it was not necessary) from the SELECT solved the problem. But why the behaviour differed between psql and Java is still a mystery. Thanks a lot for the suggestions.

Regards,
Jayadevan
[PERFORM] Execution from java - slow
Hello all,

I have a PL/pgSQL function that takes a few seconds (less than 5) when executed from psql. The same function, when invoked from Java via a prepared statement, takes a few minutes. There are a few queries in the function. The first of these takes input parameters for filtering the data, and it is this query which takes a long time when the procedure is invoked from Java. To ensure that the query uses actual values (and not bind variables) for optimization, we used

execute 'select x.col_type_desc,x.acc_id,acc_svr from (...' using d_from_date, d_to_date

It did not help. Any suggestions? It is from_date and to_date on which the data gets filtered, and we are using the same values for filtering when we execute it from Java and from psql.

Regards,
Jayadevan
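[Editorial note: to see where the time goes inside the function from the server side, the contrib module auto_explain can log plans for statements nested inside PL/pgSQL. A sketch follows; the threshold and the function call are assumptions.]

```sql
-- load into the current session (can also be preloaded via postgresql.conf)
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 1000;     -- log anything over 1 s
SET auto_explain.log_nested_statements = on;  -- include queries run inside functions
SET auto_explain.log_analyze = on;            -- actual row counts, not just estimates

SELECT my_slow_function(d_from_date, d_to_date);  -- hypothetical call
-- the plan of the slow inner query now appears in the server log
```

Comparing the plan logged from a Java session with the one from psql shows whether the prepared-statement path produced a different (generic) plan.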
Re: [PERFORM] Query performance issue
Hello,

> If that does not help, you'll have to change the query probably. The
> problem is the explain analyze you've provided
> (http://explain.depesz.com/s/MY1) does not match the query from your
> yesterday's post so we can't really help with it.

Thanks for the pointers. I think I posted the same plan; maybe the variable values changed. Anyway, I changed the query and now it comes back in 2 seconds. Here is the plan: http://explain.depesz.com/s/n9S

An interesting observation - PostgreSQL takes from 2 seconds to 20 minutes to fetch the same data set of 2212 records, with slightly modified queries. Oracle is consistent (taking under 1 minute in both cases), though not consistently faster. The modified query is

SELECT PFLMST.MEMSHPNUM, PFLMST.MEMSHPTYP, ACCMST.PRGCOD, CNTINF.EMLADR,
       CNTINF.CELISDCOD, CNTINF.CELARACOD, CNTINF.CELNUM, CNTINF.ADRLINONE,
       CNTINF.ZIPCOD, CNTINF.ADRTYP,
       (SELECT ONE.FLDDES FROM COMONETIM ONE
         WHERE ONE.CMPCOD = ACCMST.CMPCOD
           AND ONE.FLDCOD = 'program.member.accountStatus'
           AND ONE.FLDVAL = ACCMST.ACCSTA) ACCSTA,
       (SELECT ONE1.FLDDES FROM COMONETIM ONE1
         WHERE ONE1.CMPCOD = ACCMST.CMPCOD
           AND ONE1.FLDCOD = 'common.member.membershipStatus'
           AND ONE1.FLDVAL = PFLMST.MEMSHPSTA) MEMSHPSTA,
       INDINF.CMPNAM EMPNAM, INDINF.PRFADR, INDINF.GVNNAM GVNNAM,
       INDINF.FAMNAM FAMNAM, INDINF.MEMDOB MEMDOB
  FROM PRGMEMACCMST ACCMST
  JOIN EAIMEMPFLMST PFLMST ON ACCMST.CMPCOD = PFLMST.CMPCOD
                          AND ACCMST.MEMSHPNUM = PFLMST.MEMSHPNUM
  JOIN EAICUSPFLCNTINF CNTINF ON CNTINF.CMPCOD = PFLMST.CMPCOD
                             AND CNTINF.CUSNUM = PFLMST.CUSNUM
  LEFT JOIN EAICUSPFLINDINF INDINF ON INDINF.CMPCOD = PFLMST.CMPCOD
                                  AND INDINF.CUSNUM = PFLMST.CUSNUM
 WHERE ACCMST.CMPCOD = 'SA'
   AND UPPER(INDINF.FAMNAM) LIKE 'PRICE' || '%'
 ORDER BY UPPER(INDINF.GVNNAM), UPPER(INDINF.FAMNAM), UPPER(INDINF.CMPNAM)

Regards,
Jayadevan
Re: [PERFORM] Query performance issue
> A really interesting part is the sort near the bottom -
>
> ->  Sort  (cost=1895.95..1896.49 rows=215 width=61) (actual time=25.926..711784.723 rows=2673340321 loops=1)
>       Sort Key: memmst.memshpsta
>       Sort Method: quicksort  Memory: 206kB
>       ->  Nested Loop  (cost=0.01..1887.62 rows=215 width=61) (actual time=0.088..23.445 rows=1121 loops=1)
>
> How can a sort get 1121 rows at the input and return 2673340321 rows at the
> output? Not sure where this comes from.
>
> BTW what PostgreSQL version is this?

PostgreSQL 9.0.4 on x86_64-pc-solaris2.10

Regards,
Jayadevan
Re: [PERFORM] Query performance issue
Hello,

> Please run EXPLAIN ANALYZE on the query and post that, it's hard to say
> what's wrong from just the query plan, without knowing where the time is
> actually spent.

Here is the explain analyze: http://explain.depesz.com/s/MY1

Going through the URL tells me that the statistics may be off. I will try analyzing the tables. Should that help?

Regards,
Jayadevan
Re: [PERFORM] Query performance issue
Hello,

> Please run EXPLAIN ANALYZE on the query and post that, it's hard to say
> what's wrong from just the query plan, without knowing where the time is
> actually spent.

Here is the explain analyze: http://explain.depesz.com/s/MY1

Regards,
Jayadevan
Re: [PERFORM] Query performance issue
Hello,

> Please run EXPLAIN ANALYZE on the query and post that, it's hard to say
> what's wrong from just the query plan, without knowing where the time is
> actually spent. And the schema of the tables involved, and any indexes
> on them. (see also http://wiki.postgresql.org/wiki/SlowQueryQuestions)

The details of the tables and indexes may take a bit of effort to explain; I will do that. I remembered that a similar query took about 90 seconds to run a few days ago. Now that one is also taking a few minutes. In between, we made some changes to a few tables (the tables are about 9-10 GB each). This was to fix an issue in conversion from CHARACTER VARYING to BOOLEAN on PostgreSQL (some columns in Oracle were of type VARCHAR, used to store BOOLEAN values; we changed them to BOOLEAN in PostgreSQL to resolve some issues at the JDBC level). The alters were all of this type:

ALTER TABLE cusdynatr ALTER tstflg TYPE boolean
USING CASE WHEN tstflg = '1' THEN true WHEN tstflg = '0' THEN false END;

Do such alters result in fragmentation at the storage level?

Regards,
Jayadevan
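[Editorial note: a column type change with a USING clause forces a full rewrite of the table, which generally compacts rather than fragments it, but the rewritten column loses its planner statistics until the table is re-analyzed. A hedged sketch of the follow-up, using the table name from the post:]

```sql
-- after a rewrite such as
--   ALTER TABLE cusdynatr ALTER tstflg TYPE boolean USING ...;
-- refresh the statistics the planner lost for the altered column
ANALYZE cusdynatr;

-- and compare the on-disk footprint before/after to rule out bloat
SELECT pg_size_pretty(pg_total_relation_size('cusdynatr'));
```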
[PERFORM] Query performance issue
Hello all,

I have a query which takes about 20 minutes to execute and retrieves 2000-odd records. The explain for the query is pasted here: http://explain.depesz.com/s/52f

The same query, with similar data structures/indexes and data, comes back in 50 seconds in Oracle. We just ported the product to PostgreSQL and are testing it. Any input on what to look for?

Possibly relevant parameters:

shared_buffers = 4GB
temp_buffers = 8MB
work_mem = 96MB
maintenance_work_mem = 1GB
effective_cache_size = 8GB
default_statistics_target = 50

It is a machine with 16 GB RAM.

Regards,
Jayadevan
Re: [PERFORM] Parameters for PostgreSQL
Hello,

> The most important spec has been omitted. What's the storage subsystem?

We have storage on SAN, RAID 5.

> > We are suing weblogic.
> >        ^
> Best. Typo. Ever.
>
> I hear most people who use it want to, you're just brave enough to do it :-P

I wish I could make a few millions that way. Thank you for all the replies. The first step is, of course, to migrate the data. I am working with ora2pg for that. I assume creating files in 'COPY' format to use as input for PostgreSQL is the right approach? We don't have many stored procedures or packages, so that part should be OK.
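[Editorial note: ora2pg can emit data either as INSERT statements or in COPY format; COPY is much faster to load. A sketch follows; the table name and file path are assumptions.]

```sql
-- server-side load (requires superuser and a path readable by the server)
COPY pflmst FROM '/backup/export/pflmst.copy';
```

From psql, the client-side variant `\copy pflmst from 'pflmst.copy'` reads the file on the client instead, and ora2pg's own output files can simply be replayed with `psql -f`.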
[PERFORM] Parameters for PostgreSQL
Hello all,

We are planning to test one of our products, which works with Oracle, on PostgreSQL. The database size is about 100 GB. It is a product with a not-so-high load (about 10 tps - mostly read). My doubts are about the PostgreSQL settings. For Oracle, we give about 4 GB SGA (shared buffer) and 1.5 GB PGA (sum of session-specific memory). The machine configuration is

Opteron 2 CPU * 4 cores @ 2.3GHz
16GB RAM
OS Solaris 10 x64

So far I have changed the following settings in postgresql.conf:

shared_buffers = 2GB
temp_buffers = 8MB
work_mem = 16MB
maintenance_work_mem = 32MB
wal_level = archive
checkpoint_segments = 10
checkpoint_completion_target = 0.7
archive_mode = on
effective_cache_size = 6GB
log_destination = 'csvlog'
logging_collector = on
log_directory = '/backup/datapump/pgdata/log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_rotation_age = 1d
client_min_messages = notice
log_min_messages = warning
log_min_duration_statement = 3000

Could you please let me know the parameters I should pay attention to? Do the settings mentioned above look OK? We are suing weblogic. Should we let weblogic manage the connection pool or try something else?

Regards,
Jayadevan
Re: [PERFORM] Possible to improve query plan?
Hello,

> The distribution of the data is that all but 120,000 rows have null
> values in the _revision_expired column.

A shot in the dark - will a partial index on the above column help?
http://www.postgresql.org/docs/current/interactive/indexes-partial.html
http://en.wikipedia.org/wiki/Partial_index

One link with a discussion about it:
http://www.devheads.net/database/postgresql/general/when-can-postgresql-use-partial-not-null-index-seems-depend-size-clause-even-enable-seqscan.htm

Regards,
Jayadevan
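[Editorial note: a sketch of the partial-index idea; the column name follows the thread, while the table and index names are hypothetical. Only the ~120,000 non-null rows are indexed, which keeps the index small.]

```sql
CREATE INDEX idx_revision_expired_notnull
    ON versioned_table (_revision_expired)
 WHERE _revision_expired IS NOT NULL;
```

Queries must repeat a matching predicate (e.g. `WHERE _revision_expired IS NOT NULL AND ...`) for the planner to consider the partial index.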
Re: [PERFORM] How to get FK to use new index without restarting the database
Hello,

> Is there a way to force the db to re-evaluate its execution plan for a FK
> without bouncing the DB?
>
> PostgreSQL 8.1.17
>
> In our latest release our developers have implemented some new foreign
> keys but forgot to create indexes on these keys.
>
> The problem surfaced at one of our client installs where a maintenance
> DELETE query was running for over 24 hrs. We have since then identified
> the missing indexes and have sent the client a script to create them,
> but in our testing we have not been able to get postgres to use the new
> index for the FK cascade delete without bouncing the database.

Did you try ANALYZE? Maybe it will help.
http://www.postgresql.org/docs/9.0/static/sql-analyze.html

Regards,
Jayadevan

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
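[Editorial note: a sketch of the fix being discussed; all names are hypothetical. On old releases the referential-integrity triggers cache their plans per backend, so sessions that ran the cascaded delete before the index existed may need to reconnect before the new index is picked up.]

```sql
-- index the referencing column so the cascaded DELETE on the parent
-- does not sequentially scan the child table for each parent row
CREATE INDEX idx_child_parent_id ON child_table (parent_id);
ANALYZE child_table;
```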
Re: [PERFORM] best db schema for time series data?
Hi,

> > If you mostly need the last few prices, I'd definitely go with the
> > first approach, it's much cleaner. Besides, you can store a date/time
> > per price, so you know when it changed.

We too were using such an approach for 'soft deletes'. Soon we realized that using a one-char VALID flag to mark the latest records was better: it was easier to filter on, and an index on the modified-date column was not being used consistently, for some reason or the other. The VALID records form a small portion of the big table, and an index on the flag column helps fetch the data pretty fast. Of course, you could also partition on the flag (we did not have to). A slight processing overhead of updating the VALID flag column is the penalty. This was an Oracle database.

Regards,
Jayadevan
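[Editorial note: in PostgreSQL terms the valid-flag approach pairs naturally with a partial index; a sketch with hypothetical names and values follows.]

```sql
-- the small set of current rows gets its own cheap index
CREATE INDEX idx_prices_current ON prices (item_id) WHERE valid_flag = 'Y';

-- superseding a price touches only the flag of the old row
UPDATE prices SET valid_flag = 'N' WHERE item_id = 42 AND valid_flag = 'Y';
INSERT INTO prices (item_id, price, modified_at, valid_flag)
VALUES (42, 19.99, now(), 'Y');
```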
Re: [PERFORM]
Hi,

> SQL Server 2008 R2 Query
> select t1.id from table1 t1 inner join table2 t2 on t1.id = t2.id
> and t1.question = t2.question and isnull(t1.response,'ISNULL') <>
> isnull(t2.response,'ISNULL')
>
> Postgres 9.1 Query
> select t1.id from table1 t1 inner join table2 t2 on t1.id = t2.id
> and t1.question = t2.question and coalesce(t1.response,'ISNULL') <>
> coalesce(t2.response,'ISNULL')
>
> What gives?

Do they have the same indexes/PK etc.?

Regards,
Jayadevan
Re: [PERFORM] Questions on query planner, join types, and work_mem
Hello,

> the join column, lsid, is a poor choice for a join column as it is a
> long varchar value (avg length 101 characters) that only gets
> unique way out on the right hand side.

Would a join on a substring of the 'way out on the right hand side' part (did you mean 'the rightmost characters', or 'only when we take almost all of the 101 characters'?), together with a function-based index, help?

Regards,
Jayadevan
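[Editorial note: a sketch of the expression-index idea being suggested; the table names and the offset 60 are assumptions, not from the thread.]

```sql
-- index only the selective tail of lsid on both sides of the join
CREATE INDEX idx_a_lsid_tail ON table_a (substr(lsid, 60));
CREATE INDEX idx_b_lsid_tail ON table_b (substr(lsid, 60));

-- join on the indexed expression, keeping the full comparison for correctness
SELECT a.lsid
  FROM table_a a
  JOIN table_b b ON substr(a.lsid, 60) = substr(b.lsid, 60)
                AND a.lsid = b.lsid;
```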
Re: [PERFORM] PostgreSQL PITR - more doubts
> Yes, it starts out from "where it needs to". Assuming you
> did a pg_start_backup() before you did your base backup?

Thanks. I did. Does it use files like 000B00D9.0020.backup to get the necessary information?

Regards,
Jayadevan
[PERFORM] PostgreSQL PITR - more doubts
Hello all,

One doubt about how PostgreSQL PITR works. Let us say I have all the archived WALs for the past week, with

archive_command = 'cp -i %p /home/postgres/archive/%f'
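[Editorial note: recovery from such an archive is driven by the mirror-image setting in recovery.conf; a minimal sketch, assuming the archive path from the post and a made-up recovery target:]

```
restore_command = 'cp /home/postgres/archive/%f %p'
recovery_target_time = '2011-08-01 12:00:00'
```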
Re: [PERFORM] Query about index usage
Thank you for the detailed explanation.

Regards,
Jayadevan
Re: [PERFORM] Query about index usage
Hello,

> PostgreSQL can't currently avoid reading the table, because that's
> where the tuple visibility information is stored. We've been making
> progress toward having some way to avoid reading the table for all
> except very recently written tuples, but we're not there yet (in any
> production version or in the 9.0 version to be released this
> summer).

More doubts on how indexes are used by PostgreSQL. It is mentioned that table data blocks hold the tuple visibility information, and hence table scans are always necessary. So how does PostgreSQL reduce the number of blocks to be read by using indexes? Does this mean that indexes hold references to all the 'possible' blocks which may contain the data one is searching for, and PostgreSQL then scans those blocks and eliminates records which should not be 'visible' to the query being executed? Also, do index entries get updated only when data is committed and made 'visible', or are they updated as soon as data is changed, before commit, with a rollback of the transaction rolling back the index changes too?

Regards,
Jayadevan
Re: [PERFORM] Query about index usage
> PostgreSQL can't currently avoid reading the table, because that's
> where the tuple visibility information is stored. We've been making
> progress toward having some way to avoid reading the table for all
> except very recently written tuples, but we're not there yet (in any
> production version or in the 9.0 version to be released this
> summer).

Thank you for all the replies. I am learning PostgreSQL and figuring out which of the standard techniques for tuning queries in Oracle work in PostgreSQL as well.

Regards,
Jayadevan
[PERFORM] Query about index usage
Hello all,

One query about PostgreSQL's index usage. If I select just one column on which there is an index (or select only columns on which there is an index), and the index is used by PostgreSQL, does PostgreSQL avoid table access if possible? I am trying to understand the differences between Oracle's data access patterns and PostgreSQL's. Here is how it works in Oracle.

Case 1 - SELECT a column which is not in the index:

SQL> select name from myt where id = 13890;

NAME
----

Execution Plan
----------------------------------------------------------
Plan hash value: 2609414407

--------------------------------------------------------------------------------------
| Id | Operation                    | Name  | Rows | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT             |       |    1 |    65 |     2   (0)| 00:00:01 |
|  1 |  TABLE ACCESS BY INDEX ROWID | MYT   |    1 |    65 |     2   (0)| 00:00:01 |
|* 2 |   INDEX RANGE SCAN           | MYIDX |    1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
   2 - access("ID"=13890)

Note
   - dynamic sampling used for this statement

Statistics
   0 recursive calls
   0 db block gets
   4 consistent gets
   0 physical reads
   0 redo size
   409 bytes sent via SQL*Net to client
   384 bytes received via SQL*Net from client
   2 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
   1 rows processed

Case 2 - SELECT a column which is in the index:

SQL> select id from myt where id = 13890;

ID
----------
13890

Execution Plan
----------------------------------------------------------
Plan hash value: 2555454399

------------------------------------------------------------------------------
| Id | Operation         | Name  | Rows | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |       |    1 |    13 |     1   (0)| 00:00:01 |
|* 1 |  INDEX RANGE SCAN | MYIDX |    1 |    13 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
   1 - access("ID"=13890)

Note
   - dynamic sampling used for this statement

Statistics
   0 recursive calls
   0 db block gets
   3 consistent gets
   0 physical reads
   0 redo size
   407 bytes sent via SQL*Net to client
   384 bytes received via SQL*Net from client
   2 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
   1 rows processed

In the second query, where id was selected, the table was not accessed at all.
In PostgreSQL, EXPLAIN gives me similar output in both cases. Table structure:

postgres=# \d myt
         Table "public.myt"
 Column |         Type          | Modifiers
--------+-----------------------+-----------
 id     | integer               |
 name   | character varying(20) |
Indexes:
    "myidx" btree (id)

Regards,
Jayadevan
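[Editorial note: from PostgreSQL 9.2 - later than the versions discussed in this thread - the visibility map enables index-only scans, giving a counterpart to the Oracle behaviour shown above; a hedged sketch:]

```sql
-- keeping the visibility map current lets the planner skip the heap
VACUUM myt;

EXPLAIN SELECT id FROM myt WHERE id = 13890;
-- on 9.2+ this may show:  Index Only Scan using myidx on myt
```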
Re: [PERFORM] pg_dump and pg_restore
I increased shared_buffers and maintenance_work_mem to 128MB and 64MB respectively, and the restore was over in about 20 minutes. Anyway, I am learning about PostgreSQL and it is not a critical situation. Thanks for all the replies.
Regards,
Jayadevan

From: Robert Haas
To: Jayadevan M
Cc: pgsql-performance@postgresql.org
Date: 22/05/2010 16:59
Subject: Re: [PERFORM] pg_dump and pg_restore

On Mon, May 17, 2010 at 1:04 AM, Jayadevan M wrote:
> Hello all,
> I was testing how much time a pg_dump backup would take to get restored.
> Initially, I tried it with psql (on a backup taken with pg_dumpall). It took
> me about one hour. I felt that I should target a recovery time of 15
> minutes to half an hour. So I went through the blogs/documentation etc. and
> switched to pg_dump and pg_restore. I tested only the database with the
> maximum volume of data (about 1.5 GB). With
> pg_restore -U postgres -v -d PROFICIENT --clean -Fc proficient.dmp
> it took about 45 minutes. I tried it with
> pg_restore -U postgres -j8 -v -d PROFICIENT --clean -Fc proficient.dmp
> Not much improvement there either. Have I missed something, or will 1.5 GB
> of data on a machine with the following configuration take about 45 minutes?
> There is nothing else running on the machine consuming memory or CPU. Out of
> 300 odd tables, about 10 have millions of records; the rest have a few
> thousand records at most.
>
> Here are the specs (a PC-class machine) -
>
> PostgreSQL 8.4.3 on i686-pc-linux-gnu
> CentOS release 5.2
> Intel(R) Pentium(R) D CPU 2.80GHz
> 2 GB RAM
> Storage is local disk.
>
> PostgreSQL parameters (what I felt are relevant) -
> max_connections = 100
> shared_buffers = 64MB
> work_mem = 16MB
> maintenance_work_mem = 16MB
> synchronous_commit = on

I would suggest raising shared_buffers to perhaps 512MB and cranking up checkpoint_segments to 10 or more. Also, your email doesn't give much information about how many CPUs you have and what kind of disk subsystem you are using (RAID? How many disks?), so it's hard to say whether -j8 is reasonable. That might be too high.

Another thing I would recommend is that during the restore you use tools like top and iostat to monitor the system. You'll want to check things like whether all the CPUs are in use, and how the disk activity compares to the maximum you can generate using some other method (perhaps dd).

One thing I've noticed (to my chagrin) is that if pg_restore is given a set of options that are incompatible with parallel restore, it just does a single-threaded restore. The options you've specified look right to me, but, again, examining exactly what is going on during the restore should tell you if there's a problem in this area.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
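The monitoring advice above can be sketched as a short session. The scratch path and sizes are arbitrary, and conv=fdatasync assumes GNU dd (as on the CentOS box described in the thread); it forces the data to disk so the reported rate reflects the physical disk rather than the OS cache.

```shell
# Rough disk baseline before a restore: write ~80 MB with dd and
# force it to disk. /tmp/ddtest is an arbitrary scratch path; put it
# on the same disk as the PostgreSQL data directory.
dd if=/dev/zero of=/tmp/ddtest bs=8192 count=10000 conv=fdatasync

# During the actual restore, watch the system from other terminals:
#   top          - are all CPUs busy, or is the restore single-threaded?
#   iostat -x 5  - is disk utilization anywhere near the dd figure?

rm -f /tmp/ddtest
```

If iostat shows the disk saturated at roughly the dd figure, the restore is I/O-bound and more -j workers will not help; if the CPUs and disk are both mostly idle, the restore may have silently fallen back to single-threaded mode as described above.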
[PERFORM] pg_dump and pg_restore
Hello all,
I was testing how much time a pg_dump backup would take to get restored. Initially, I tried it with psql (on a backup taken with pg_dumpall). It took me about one hour. I felt that I should target a recovery time of 15 minutes to half an hour. So I went through the blogs/documentation etc. and switched to pg_dump and pg_restore. I tested only the database with the maximum volume of data (about 1.5 GB). With
pg_restore -U postgres -v -d PROFICIENT --clean -Fc proficient.dmp
it took about 45 minutes. I tried it with
pg_restore -U postgres -j8 -v -d PROFICIENT --clean -Fc proficient.dmp
Not much improvement there either. Have I missed something, or will 1.5 GB of data on a machine with the following configuration take about 45 minutes? There is nothing else running on the machine consuming memory or CPU. Out of 300 odd tables, about 10 have millions of records; the rest have a few thousand records at most.

Here are the specs (a PC-class machine) -

PostgreSQL 8.4.3 on i686-pc-linux-gnu
CentOS release 5.2
Intel(R) Pentium(R) D CPU 2.80GHz
2 GB RAM
Storage is local disk.

PostgreSQL parameters (what I felt are relevant) -
max_connections = 100
shared_buffers = 64MB
work_mem = 16MB
maintenance_work_mem = 16MB
synchronous_commit = on

Thank you for any suggestions.
Jayadevan
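For reference, the tuning suggested in the reply above could be applied by appending to postgresql.conf along these lines. This is only a sketch: the conf path is a placeholder, and checkpoint_segments is an 8.4-era parameter (later PostgreSQL releases removed it in favor of max_wal_size).

```shell
# Hypothetical restore-time tuning, per the advice in this thread.
# Adjust the path to your actual postgresql.conf, then restart the server.
PGCONF=/path/to/postgresql.conf

cat >> "$PGCONF" <<'EOF'
shared_buffers = 512MB          # up from 64MB, per the suggestion above
checkpoint_segments = 10        # fewer forced checkpoints during the bulk load
maintenance_work_mem = 64MB     # speeds up index builds during restore
EOF
```

Settings appended later in the file override earlier ones, so this works without editing the existing lines; remember to revert restore-only values afterwards.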
[PERFORM] PostgreSQL - case studies
Hello all,
Apologies for the long mail. I work for a company that provides solutions mostly on a Java/Oracle platform. Recently we moved one of our products to PostgreSQL. The main reason was PostgreSQL's GIS capabilities and the inability of government departments (especially road/traffic) to spend a lot of money on such projects. This product is used to record details about accidents and related analysis (type of road, when/why etc.) with maps. Fortunately, even in India, an accident reporting application does not have to handle many tps :). So I can't say PostgreSQL's performance was really tested in this case. Later, I load-tested one screen of one of our products with JMeter. We tried it with Oracle, DB2, PostgreSQL and Ingres, and PostgreSQL easily out-performed the rest. We tried a transaction mix with 20+ SELECTs, updates, deletes and a few inserts. After a really good experience with the database, I subscribed to all the PostgreSQL groups (my previous experience is all-Oracle), and reading these mails, I realized that many organizations are using plain, 'not customized' PostgreSQL for databases that handle critical applications. Since there is no company trying to 'sell' PostgreSQL, many of us are not aware of such cases. Could some of you please share some info on such scenarios - where you are supporting/designing/developing databases that run into at least a few hundred GBs of data (I know, that is small by today's standards)? I went through http://www.postgresql.org/about/casestudies/ and felt those are a bit old. I am sure PostgreSQL has matured a lot more since the days when those case studies were posted. I went through the case studies at EnterpriseDB and similar vendors too, but those are customized PostgreSQL servers. I am looking more for 'first-hand' feedback. Any feedback - a few sentences with the db size, tps, the h/w necessary to support that, acceptable down-time, type of application etc. - will be greatly appreciated.
Our products are not of the blog/social-networking type, but more of the on-line reservation type, where half an hour of down-time can lead to significant revenue losses for customers.
Thank you,
Jayadevan