[PERFORM] Coraid/AoE device experience?
Lately, I've been reading a lot about these new Coraid AoE RAID devices (http://www.coraid.com). They tout them as fast and cheap, and better than iSCSI due to the lack of TCP/IP overhead on the wire. Is it likely that a 15-drive RAID 10 Linux software RAID array would outperform a 4-drive 10k SCSI RAID 0+1 array for a heavily loaded database? If not software RAID, how about their dedicated RAID controller blade? I'm definitely I/O bound right now and starving for spindles. Does this make sense, or is it too good to be true?

Thanks,
-Dan
[PERFORM] Mirroring PostgreSQL database
Hi, I have one customer who is using PostgreSQL 7.4.8 on Linux and is having problems with database mirroring. The details follow. The customer is running PostgreSQL 7.4.8 along with JBoss 3.2.3 on Linux. He has two servers: one acting as the live (primary) server and another acting as the fail-over (secondary) server. The secondary server is placed in a remote location. These servers act as an attendance server for daily activities; nearly 50,000 employees depend on the live server. The customer is using the DBmirror tool to mirror database records from the primary to the secondary, and is complaining that there is a one-day (24-hour) delay between primary and secondary for database synchronization. They have a dedicated line and bandwidth, but the problem still exists. I just want to know: for immediate data mirroring, what is the best way for PostgreSQL? PostgreSQL offers many mirroring tools, but which one is the best? Is there any other way to accomplish the task? Thank you. Waiting for your reply.

Thanks & Regards,
Shashi Kanth
Consultant - Linux
RHCE, LPIC-2
Onward Novell - Bangalore
9886455567
Re: [PERFORM] Mirroring PostgreSQL database
Shashi Kanth Boddula wrote:
> The customer is using the DBmirror tool to mirror database records from the primary to the secondary, and is complaining that there is a one-day (24-hour) delay between primary and secondary for database synchronization. They have a dedicated line and bandwidth, but the problem still exists.

You don't say what the nature of the problem with DBmirror is. Are they saturating their bandwidth? Are one or both servers unable to keep pace with the updates?

-- Richard Huxton, Archonet Ltd
Re: [PERFORM] Mirroring PostgreSQL database
Try Slony: www.slony.info

Shashi Kanth Boddula wrote:
> [original message snipped]
Re: [PERFORM] Mirroring PostgreSQL database
> I just want to know: for immediate data mirroring, what is the best way for PostgreSQL? PostgreSQL offers many mirroring tools, but which one is the best? Is there any other way to accomplish the task?

You want to take a look at Slony-I or Mammoth Replicator.

http://www.slony.info/
http://www.commandprompt.com/

--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
[PERFORM] Vacuum Full Analyze taking so long
I have an 8.0.2 PostgreSQL database about 180 GB in size, running on a 2.6 RedHat kernel with 32 GB of RAM and 2 CPUs. I'm running the vacuum full analyze command, and it has been running for at least two consecutive days with no other processes running (it's an offline loading server). I tweaked maintenance_work_mem to its max (2 GB) with a work_mem of 8 MB. I have no issues with my checkpoints. I can still see I/O activity against the physical files of the property table and its two indexes (primary key and r index). The property files are about 128 GB and the indexes are about 15 GB. I have run the same maintenance job on a different box (staging) with an identical hardware config (except with 64 GB instead of 32), and it took less than 12 hours. Any clue or tip is really appreciated. I also read a comment by Tom Lane that terminating the process should be crash-safe if I have to.

Thanks,
-- Husam
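For reference, the job boils down to roughly the following sketch (the property table name is from the message above; the session-level SET commands are an assumption about how the memory limits were raised, and 8.0 takes these values in KB):

    -- Raise per-session memory for the maintenance run (values in KB:
    -- ~2 GB for maintenance_work_mem, 8 MB for work_mem).
    SET maintenance_work_mem = 2097152;
    SET work_mem = 8192;

    -- VERBOSE reports per-table and per-index progress as the run goes,
    -- which helps tell a slow vacuum from a stuck one.
    VACUUM FULL VERBOSE ANALYZE property;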
[PERFORM] COPY insert performance
I need to COPY via libpqxx to insert millions of rows into two tables. One table has roughly half as many rows and requires half the storage. In production, the largest table will grow by ~30M rows/day. To test the COPY performance I split my transactions into 10,000 rows each. I insert roughly 5,000 rows into table A for every 10,000 rows into table B.

Table A has one unique index:
"order_main_pk" UNIQUE, btree (cl_ord_id)

Table B has 1 unique index and 2 non-unique indexes:
"order_transition_pk" UNIQUE, btree (collating_seq)
"order_transition_ak2" btree (orig_cl_ord_id)
"order_transition_ak3" btree (exec_id)

My testing environment is as follows:
- PostgreSQL 8.0.1
- libpqxx 2.5.0
- Linux 2.6.11.4-21.7-smp x86_64
- Dual Opteron 246
- System disk (postgres data resides on this SCSI disk) - Seagate (ST373453LC) - 15K, 73 GB (http://www.seagate.com/cda/products/discsales/marketing/detail/0,1081,549,00.html)
- 2nd logical disk - 10K, 36 GB IBM SCSI (IC35L036UCDY10-0) - WAL resides on this disk
- NO RAID

PostgreSQL
Here are the results of copying in 10M rows as fast as possible (10K/transaction):
Total Time: 1129.556 s
Rows/sec: 9899.922
Transactions > 1.2 s: 225
Transactions > 1.5 s: 77
Transactions > 2.0 s: 4
Max Transaction: 2.325 s

MySQL
I ran a similar test with MySQL 4.1.10a (InnoDB) which produced these results (I used MySQL's INSERT INTO x VALUES (1,2,3),(4,5,6),(...,...,...) syntax), at 10K/transaction:
Total Time: 860.000 s
Rows/sec: 11627.91
Transactions > 1.2 s: 0
Transactions > 1.5 s: 0
Transactions > 2.0 s: 0
Max Transaction: 1.175 s

Considering the configurations shown below, can anyone offer advice to close the 15% gap and fix the much worse variability I'm experiencing? Thanks.

My postgresql.conf has the following non-default values:

listen_addresses = '*'          # what IP interface(s) to listen on
max_connections = 100

#--- RESOURCE USAGE (except WAL) ---
shared_buffers = 65536          # min 16, at least max_connections*2, 8KB each
work_mem = 2048                 # min 64, size in KB
maintenance_work_mem = 204800   # min 1024, size in KB
max_fsm_pages = 225             # min max_fsm_relations*16, 6 bytes each
bgwriter_delay = 200            # 10-10000 milliseconds between rounds
bgwriter_percent = 10           # 0-100% of dirty buffers in each round
bgwriter_maxpages = 1000        # 0-1000 buffers max per round

#--- WRITE AHEAD LOG ---
fsync = false                   # turns forced synchronization on or off
wal_buffers = 64                # min 4, 8KB each
checkpoint_segments = 40        # in logfile segments, min 1, 16MB each
checkpoint_timeout = 600        # range 30-3600, in seconds

#--- QUERY TUNING ---
effective_cache_size = 65536    # typically 8KB each
random_page_cost = 2            # units are one sequential page fetch cost

#--- ERROR REPORTING AND LOGGING ---
log_min_duration_statement = 250  # -1 is disabled, in milliseconds
log_connections = true
log_disconnections = true
log_duration = true
log_line_prefix = '%r%u%p%t%d%%'  # e.g. '%u%%%d '
                                  # %u=user name %d=database name
                                  # %r=remote host and port
                                  # %p=PID %t=timestamp %i=command tag
                                  # %c=session id %l=session line number
                                  # %s=session start timestamp %x=transaction id
                                  # %q=stop here in non-session processes
                                  # %%='%'
log_statement = 'none'            # none, mod, ddl, all

#--- RUNTIME STATISTICS ---
# - Query/Index Statistics Collector -
stats_start_collector = true
stats_command_string = true
stats_block_level = true
stats_row_level = true
stats_reset_on_server_start = true

My MySQL my.ini has the following non-default values:

innodb_data_home_dir = /var/lib/mysql/
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /var/lib/mysql/
innodb_log_arch_dir = /var/lib/mysql/
# You can set .._buffer_pool_size up to 50 - 80 %
# of RAM but beware of setting memory usage too high
innodb_buffer_pool_size = 512M
innodb_additional_mem_pool_size = 64M
# Set .._log_file_size to 25 % of buffer pool size
innodb_log_file_size = 128M
innodb_log_buffer_size = 64M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50
innodb_flush_method = O_DSYNC

max_allowed_packet = 16M
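For reference, each batch is equivalent to something like the following psql sketch (only the indexed columns are real; any other columns are elided):

    -- One batch: ~5,000 rows into table A and 10,000 into table B,
    -- all inside a single transaction, as described above.
    BEGIN;

    -- 5,000 tab-separated data rows follow, terminated by \.
    COPY order_main (cl_ord_id /* plus the other columns */) FROM STDIN;
    ...
    \.

    -- 10,000 tab-separated data rows follow, terminated by \.
    COPY order_transition (collating_seq, orig_cl_ord_id, exec_id /* ... */) FROM STDIN;
    ...
    \.

    COMMIT;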
Re: [PERFORM] COPY insert performance
Chris Isaacson wrote:
> I need to COPY via libpqxx to insert millions of rows into two tables. [...] Table A has one unique index: order_main_pk UNIQUE, btree (cl_ord_id). Table B has 1 unique index and 2 non-unique indexes: order_transition_pk UNIQUE, btree (collating_seq); order_transition_ak2 btree (orig_cl_ord_id); order_transition_ak3 btree (exec_id)

Do you have any foreign key references? If you are creating a table for the first time (or loading a large fraction of the data), it is common to drop the indexes and foreign keys first, then insert/copy, and then recreate them.

Is InnoDB the MySQL backend with referential integrity and true transaction support? I believe the default backend supports neither (so it is cheating to give you speed, which may be just fine for your needs, especially since you are willing to run fsync=false).

I think moving pg_xlog to a dedicated drive (or set of drives) could help your performance, as would increasing checkpoint_segments.

I don't know if you gain much by changing the bgwriter settings; if you are streaming everything in at once, you probably want it written out right away. My understanding is that the bgwriter settings are for the case where you have mixed reads and writes going on at the same time, and you want to make sure the reads have time to execute (i.e. the writes are not saturating your I/O).

Also, is any of this tested under load, with a separate process issuing queries while you are loading data? Traditionally MySQL is faster with a single process inserting/querying, but once you have multiple processes hitting it at the same time, its performance degrades much faster than Postgres's.

You also seem to be giving MySQL 512M of RAM to work with, while only giving 2M/200M to Postgres. (Re)creating indexes uses maintenance_work_mem, but updating indexes could easily use work_mem. You may be RAM starved.

John =:-

> [remainder of original message and postgresql.conf snipped]
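Something along these lines, using the index definitions from the original message (an untested sketch; whether the unique index can also be dropped depends on how uniqueness is enforced):

    -- Drop table B's non-unique secondary indexes before the bulk load.
    DROP INDEX order_transition_ak2;
    DROP INDEX order_transition_ak3;

    -- ...run the COPY batches here...

    -- Recreate them afterwards; this bulk build is what uses
    -- maintenance_work_mem rather than work_mem.
    CREATE INDEX order_transition_ak2 ON order_transition (orig_cl_ord_id);
    CREATE INDEX order_transition_ak3 ON order_transition (exec_id);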
Re: [PERFORM] Vacuum Full Analyze taking so long
I'd say, don't do that. Unless you've deleted a lot of stuff and are expecting the DB to shrink, a full vacuum shouldn't really be needed. On a DB that big, a full vacuum is just going to take a long time. If you really are shrinking, consider structuring things so you can just drop a table instead of vacuuming it (the drop is fairly instantaneous). If you can't do that, consider dropping the indices, vacuuming, and recreating the indices.

Mike Stone
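For example, if the data can be laid out per period (table name below is a hypothetical placeholder):

    -- Dropping a whole table is near-instant and leaves nothing
    -- behind for VACUUM to reclaim, unlike DELETE plus VACUUM FULL.
    DROP TABLE property_2004;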
Re: [PERFORM] Vacuum Full Analyze taking so long
Vacuum full takes an exclusive lock on the tables it runs against, so if you have anything else reading the table while you are trying to run it, the vacuum full will wait, possibly forever, until it can get the lock.

What does the system load look like while you are running this? What does vmstat 1 show you? Is there load on the system other than the database? Do you really need to run vacuum full instead of vacuum?

- Luke

On 7/25/05 2:30 PM, Tomeh, Husam [EMAIL PROTECTED] wrote:
> [original message snipped]
Re: [PERFORM] Vacuum Full Analyze taking so long
Husam,

On 7/25/05 4:31 PM, John A Meinel [EMAIL PROTECTED] wrote:
>> Tomeh, Husam wrote:
>> Nothing was running except the job. The server did not look stressed out looking at top and vmstat. We have seen slower query performance when performing load tests, so I ran a re-index on all application indexes and then issued a full vacuum. I ran the same thing on a staging server and it took less than 12 hours. Is there a possibility the DB pages are corrupted? Is there a command to verify that? (In Oracle, there's a dbverify command that checks for corruption at the data-file level.) The other question I have: what would be the proper approach to rebuilding indexes? I re-index and then run vacuum/analyze. Should I not use the re-index approach and instead drop the indexes, vacuum the tables, create the indexes, and then run analyze on the tables and indexes?
>
> I *think* if you are planning on dropping the indexes anyway, just drop them, VACUUM ANALYZE, and then recreate them. I don't think you have to re-analyze after you have recreated them.

I agree - and don't run VACUUM FULL; it is quite different from VACUUM. Also, you should only need to vacuum if you've deleted a lot of data: its job is to reclaim space lost to rows marked deleted. So, in fact, you may not even need to run VACUUM.

VACUUM FULL is like a disk defragmentation operation within the DBMS, and is only necessary if there has been a slowdown in performance from lots and lots of deletes and/or updates, such that new data isn't finding sequential pages for storage - which is rare. Given the need for locking, it's generally better to dump and restore in that case, but again it's a very rare occasion.

I don't know of a command to check for page corruption, but I would think that if you can run VACUUM (not full) you should be OK.

- Luke
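Concretely, the lighter-weight routine would look something like this (the property table name is from earlier in the thread):

    -- Reclaim dead-row space and refresh planner statistics without
    -- VACUUM FULL's exclusive lock and data rewriting.
    VACUUM ANALYZE property;

    -- If index bloat is the suspected problem, rebuild the indexes in
    -- place (note that REINDEX also takes a lock on the table).
    REINDEX TABLE property;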
Re: [PERFORM] COPY insert performance
Chris,

You can try the Bizgres distribution of Postgres (based on version 8.0.3); its COPY support is 30% faster, as reported by OSDL (without indexes). The gain comes from very slow parsing within the COPY command, which is sped up using micro-optimized parsing logic. There is a patch pending for the development version of Postgres which implements the same code, but you can use Bizgres and get it now instead of waiting for Postgres 8.1 to come out. Also, Bizgres is QA-tested with the enhanced features.

Bizgres is a free / open source distribution of Postgres for Business Intelligence / Data Warehousing. Bizgres currently features Postgres 8.0.3 plus these patches:
- Bypass WAL when performing CREATE TABLE AS SELECT
- COPY is between 30% and 90% faster on machines with fast I/O
- Enhanced support for data partitioning with partition-elimination optimization
- Bitmap Scan support for multiple-index use in queries and better low-cardinality column performance
- Improved optimization of queries with LIMIT

See http://www.bizgres.org for more.

- Luke

On 7/25/05 3:32 PM, Chris Isaacson [EMAIL PROTECTED] wrote:
> [original message and postgresql.conf snipped]
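As an aside, the WAL-bypass patch in that list targets bulk-build statements of this shape (the summary table and query here are hypothetical placeholders):

    -- With the Bizgres patch, a CREATE TABLE AS SELECT like this can
    -- skip WAL logging entirely, since the table is built from scratch
    -- and can simply be rebuilt after a crash.
    CREATE TABLE order_summary AS
    SELECT exec_id, count(*) AS executions
    FROM order_transition
    GROUP BY exec_id;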
Re: [PERFORM] Vacuum Full Analyze taking so long
Tomeh, Husam wrote:
> Nothing was running except the job. The server did not look stressed out looking at top and vmstat. We have seen slower query performance when performing load tests, so I ran a re-index on all application indexes and then issued a full vacuum. I ran the same thing on a staging server and it took less than 12 hours. Is there a possibility the DB pages are corrupted? Is there a command to verify that? (In Oracle, there's a dbverify command that checks for corruption at the data-file level.) The other question I have: what would be the proper approach to rebuilding indexes? I re-index and then run vacuum/analyze. Should I not use the re-index approach and instead drop the indexes, vacuum the tables, create the indexes, and then run analyze on the tables and indexes?

I *think* if you are planning on dropping the indexes anyway, just drop them, VACUUM ANALYZE, and then recreate them. I don't think you have to re-analyze after you have recreated them.

John =:-