Re: [PERFORM] DBT-5 Postgres 9.0.3
On Wed, Aug 17, 2011 at 8:29 AM, bobbyw <bob...@sendprobe.com> wrote:
> Hi, I know this is an old thread, but I wanted to chime in since I am
> having problems with this as well. I too am trying to run dbt5 against
> Postgres, specifically against Postgres 9.1beta3. After jumping through
> many hoops I was ultimately able to build dbt5 on my Debian environment,
> but when I attempt to run the benchmark with:
>
>   dbt5-run-workload -a pgsql -c 5000 -t 5000 -d 60 -u 1 \
>       -i ~/dbt5-0.1.0/egen -f 500 -w 300 -n dbt5 -p 5432 -o /tmp/results
>
> it runs to completion, but all of the dbt5 log files contain errors like:
>
>   terminate called after throwing an instance of 'pqxx::broken_connection'
>     what():  could not connect to server: No such file or directory
>       Is the server running locally and accepting
>       connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
>
> I'm led to believe that this is an error I would receive if the Postgres
> db were not running, but it is. In fact, dbt5-run-workload starts the
> database automatically. I have also confirmed it is running by manually
> connecting while the benchmark is in progress (after it has already
> started the database and logged the above error). Any thoughts on why I
> might be getting this error?

Hi there,

Sorry I didn't catch this sooner. Can you try using the code from the git
repository? I removed libpqxx and just used libpq a while ago to hopefully
simplify the kit:

git://osdldbt.git.sourceforge.net/gitroot/osdldbt/dbt5

Regards,
Mark

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
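The error above is libpq failing to find the Unix-domain socket at the path it expects. A minimal sketch of a check (this helper is hypothetical, not part of dbt5; the default directory and port match the error message in this thread):

```shell
# Hypothetical helper: report whether the Unix-domain socket that the
# dbt5 clients would use is actually present at the expected path.
check_pg_socket() {
    dir=${1:-/var/run/postgresql}
    port=${2:-5432}
    sock="$dir/.s.PGSQL.$port"
    if [ -S "$sock" ]; then
        echo "socket present: $sock"
    else
        echo "socket missing: $sock"
    fi
}
```

If the server was built to put its socket elsewhere (e.g. /tmp), pointing the clients at TCP instead (e.g. `psql -h localhost -p 5432`) is the usual way to confirm whether only the socket path is at fault.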
Re: [PERFORM] using dbt2 postgresql 8.4 - rampup time issue
On Mon, Jul 5, 2010 at 10:24 AM, MUHAMMAD ASIF <anaeem...@hotmail.com> wrote:
>> A clarification of terms may help to start. The "terminals per
>> warehouse" in the scripts correlates to the number of terminals
>> emulated. An emulated terminal is tied to a warehouse's district. In
>> other words, the number of terminals translates to the number of
>> districts in a warehouse across the entire database. To increase the
>> terminals per warehouse implies you have scaled the database
>> differently, which I'm assuming is not the case here.
>
> "Scale the database" ... can you please elaborate? To increase terminals
> per warehouse I added only one option (i.e. -t for dbt2-run-workload) to
> a normal dbt2 test, i.e.:
>
>   ./dbt2-pgsql-create-db
>   ./dbt2-pgsql-build-db -d $DBDATA -g -r -w $WAREHOUSES
>   ./dbt2-run-workload -a pgsql -c $DB_CONNECTIONS -d $REGRESS_DURATION_SEC \
>       -w $WAREHOUSES -o $OUTPUT_DIR -t $TERMINAL_PER_WAREHOUSE
>   ./dbt2-pgsql-stop-db
>
> Is this change enough, or am I missing something?

This isn't a trivial question, even though at face value I do understand
that you want to see what the performance of Postgres is on 64-bit Linux.
This kit is complex enough that the answer is "it depends". If you want to
increase the workload following specification guidelines, then I think you
need to understand the specification referenced above better. To best use
this kit does involve a fair amount of understanding of the TPC-C
specification.

If you just want to increase the load on the database system, there are
several ways to do it. You can use the '-n' flag for dbt2-run-workload so
that all database transactions are run immediately after each other. If
you build the database to a larger scale factor (using TPC terminology) by
increasing the warehouses, then the scripts will appropriately scale the
workload. Tweaking the -t flag would be a more advanced method that
requires a better understanding of the specification.
Perhaps some more familiarity with the TPC-C specification would help
here: http://www.tpc.org/tpcc/spec/tpcc_current.pdf

Clause 4.1 discusses the scaling rules for sizing the database.
Unfortunately that clause may not directly clarify things for you. The
other thing to understand is that the dbt2 scripts allow you to break the
specification guidelines in some ways, and not in others. I don't know
how to better explain it. The database was built one way, and you told
the scripts to run the programs in a way that asked for data that doesn't
exist.

> 1. Settings:
>      DATABASE CONNECTIONS: 50
>      TERMINALS PER WAREHOUSE: 10
>      SCALE FACTOR (WAREHOUSES): 200
>      DURATION OF TEST (in sec): 7200
>    Result:
>                              Response Time (s)
>     Transaction      %    Average : 90th %      Total  Rollbacks      %
>     ------------  -----  ------------------  -------  ---------  -----
>     Delivery       3.96    0.285 :  0.023      26883          0   0.00
>     New Order     45.26    0.360 :  0.010     307335       3082   1.01
>     Order Status   3.98    0.238 :  0.003      27059          0   0.00
>     Payment       42.82    0.233 :  0.003     290802          0   0.00
>     Stock Level    3.97    0.245 :  0.002      26970          0   0.00
>
>     2508.36 new-order transactions per minute (NOTPM)
>     120.1 minute duration
>     0 total unknown errors
>     2000 second(s) ramping up
>
> 2. Settings:
>      DATABASE CONNECTIONS: 50
>      TERMINALS PER WAREHOUSE: 40
>      SCALE FACTOR (WAREHOUSES): 200
>      DURATION OF TEST (in sec): 7200
>    Result:
>                              Response Time (s)
>     Transaction      %    Average : 90th %      Total  Rollbacks      %
>     ------------  -----  ------------------  -------  ---------  -----
>     Delivery       3.95    8.123 :  4.605      43672          0   0.00
>     New Order     45.19   12.205 :  2.563     499356       4933   1.00
>     Order Status   4.00    7.385 :  3.314      44175          0   0.00
>     Payment       42.89    7.221 :  1.920     473912          0   0.00
>     Stock Level    3.97    7.093 :  1.887      43868          0   0.00
>
>     7009.40 new-order transactions per minute (NOTPM)
>     69.8 minute duration
>     0 total unknown errors
>     8016 second(s) ramping up
>
>     8016 (actual rampup time) + ( 69.8 * 60 ) = 12204
>     5010 (estimated rampup time) + 7200 (estimated steady state time) = 12210

I can see where you're pulling numbers from, but I'm having trouble
understanding what correlation you are trying to make.

> 3.
> Settings:
>   DATABASE CONNECTIONS: 50
>   TERMINALS PER WAREHOUSE: 40
>   SCALE FACTOR (WAREHOUSES): 200
>   DURATION OF TEST (in sec): 7200
> Result:
>                          Response Time (s)
>  Transaction
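The correlation the poster appears to be drawing from run 2 can be checked with a little arithmetic (a sketch only; the figures are taken verbatim from the logs quoted above):

```shell
# Run 2: actual rampup plus measured steady-state duration should come
# out close to the script's estimated rampup plus the configured
# test duration.
actual_rampup=8016        # seconds, "8016 second(s) ramping up"
measured_minutes=69.8     # "69.8 minute duration"
estimated_rampup=5010     # seconds, the script's do_sleep value
configured_duration=7200  # seconds, "-d 7200"

lhs=$(awk -v r="$actual_rampup" -v m="$measured_minutes" \
    'BEGIN { printf "%d", r + m * 60 }')
rhs=$((estimated_rampup + configured_duration))
echo "$lhs vs $rhs"   # prints "12204 vs 12210"
```

In other words, the run's total wall-clock time matches the script's plan; what moved is the split between rampup and steady state.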
Re: [PERFORM] using dbt2 postgresql 8.4 - rampup time issue
On Fri, Jul 2, 2010 at 7:38 AM, MUHAMMAD ASIF <anaeem...@hotmail.com> wrote:
> Hi,
>
> We are using dbt2 to check performance of PostgreSQL 8.4 on a 64-bit
> Linux machine. When we increase TERMINALS PER WAREHOUSE the TPM value
> increases rapidly, but rampup time increases too; dbt2's estimated
> rampup time calculation does not work properly, and that's why it runs
> the test for the wrong duration, i.e.

A clarification of terms may help to start. The "terminals per warehouse"
in the scripts correlates to the number of terminals emulated. An
emulated terminal is tied to a warehouse's district. In other words, the
number of terminals translates to the number of districts in a warehouse
across the entire database. To increase the terminals per warehouse
implies you have scaled the database differently, which I'm assuming is
not the case here.

> 1. Settings:
>      DATABASE CONNECTIONS: 50
>      TERMINALS PER WAREHOUSE: 10
>      SCALE FACTOR (WAREHOUSES): 200
>      DURATION OF TEST (in sec): 7200
>    Result:
>                              Response Time (s)
>     Transaction      %    Average : 90th %      Total  Rollbacks      %
>     ------------  -----  ------------------  -------  ---------  -----
>     Delivery       3.96    0.285 :  0.023      26883          0   0.00
>     New Order     45.26    0.360 :  0.010     307335       3082   1.01
>     Order Status   3.98    0.238 :  0.003      27059          0   0.00
>     Payment       42.82    0.233 :  0.003     290802          0   0.00
>     Stock Level    3.97    0.245 :  0.002      26970          0   0.00
>
>     2508.36 new-order transactions per minute (NOTPM)
>     120.1 minute duration
>     0 total unknown errors
>     2000 second(s) ramping up
>
> 2. Settings:
>      DATABASE CONNECTIONS: 50
>      TERMINALS PER WAREHOUSE: 40
>      SCALE FACTOR (WAREHOUSES): 200
>      DURATION OF TEST (in sec): 7200
>    Result:
>                              Response Time (s)
>     Transaction      %    Average : 90th %      Total  Rollbacks      %
>     ------------  -----  ------------------  -------  ---------  -----
>     Delivery       3.95    8.123 :  4.605      43672          0   0.00
>     New Order     45.19   12.205 :  2.563     499356       4933   1.00
>     Order Status   4.00    7.385 :  3.314      44175          0   0.00
>     Payment       42.89    7.221 :  1.920     473912          0   0.00
>     Stock Level    3.97    7.093 :  1.887      43868          0   0.00
>
>     7009.40 new-order transactions per minute (NOTPM)
>     69.8 minute duration
>     0 total unknown errors
>     8016 second(s) ramping up
>
> 3.
> Settings:
>   DATABASE CONNECTIONS: 50
>   TERMINALS PER WAREHOUSE: 40
>   SCALE FACTOR (WAREHOUSES): 200
>   DURATION OF TEST (in sec): 7200
> Result:
>                            Response Time (s)
>   Transaction      %    Average : 90th %      Total  Rollbacks      %
>   ------------  -----  ------------------  -------  ---------  -----
>   Delivery       3.98    9.095 : 16.103      15234          0   0.00
>   New Order     45.33    7.896 : 14.794     173539       1661   0.97
>   Order Status   3.96    8.165 : 13.989      15156          0   0.00
>   Payment       42.76    7.295 : 12.470     163726          0   0.00
>   Stock Level    3.97    7.198 : 12.520      15198          0   0.00
>
>   10432.09 new-order transactions per minute (NOTPM)
>   16.3 minute duration
>   0 total unknown errors
>   11227 second(s) ramping up
>
> These results show that the dbt2 test actually did not run for 2 hours;
> the duration starts varying with the increase of the TERMINALS PER
> WAREHOUSE value, i.e. 1st run (120.1 minute duration), 2nd run (69.8
> minute duration) and 3rd run (16.3 minute duration).

The ramp-up times are actually as expected (explained below). What you
are witnessing is more likely that the driver is crashing because the
values are out of range from the scale of the database. You have
effectively told the driver that there are more than 10 districts per
warehouse, and have likely not built the database that way. I'm actually
surprised the driver ramped up completely.

> To fix and sync with the rampup time, I have made a minor change in the
> dbt2-run-workload script, i.e.:
>
>   --- dbt2-run-workload   2010-07-02 08:18:06.0 -0400
>   +++ dbt2-run-workload   2010-07-02 08:20:11.0 -0400
>   @@ -625,7 +625,11 @@
>    done
>
>    echo -n "estimated rampup time: "
>   -do_sleep $SLEEP_RAMPUP
>   +#do_sleep $SLEEP_RAMPUP
>   +while ! grep START ${DRIVER_OUTPUT_DIR}/*/mix.log ; do
>   +    sleep 1
>   +done
>   +date
>    echo "estimated rampup time has elapsed"
>
>    # Clear the readprofile data after the driver ramps up.
>
> What is rampup time? And what do you think about the
[PERFORM] hp hpsa vs cciss driver
Hi all,

Are there any HP Smart Array disk controller users running Linux who have
experimented with the new SCSI-based hpsa driver versus the block-based
cciss driver? I have a P800 controller that I'll try out soon. (I hope.)

Regards,
Mark
Re: [PERFORM] Dbt2 with postgres issues on CentOS-5.3
2010/4/20 MUHAMMAD ASIF <anaeem...@hotmail.com>:
> Hi,
>
> I am using dbt2 on Linux 64 (CentOS release 5.3 (Final)). I have
> compiled the latest postgresql-8.4.3 code on the machine and run dbt2
> against it. I am a little confused about the results. I ran dbt2 with
> the following configuration, i.e.
>
>   DBT2 Options:
>     WAREHOUSES=75
>     DB_CONNECTIONS=20
>     REGRESS_DURATION=1 #HOURS
>     REGRESS_DURATION_SEC=$((60*60*$REGRESS_DURATION))
>
>   DBT2 Command:
>     ./dbt2-pgsql-create-db
>     ./dbt2-pgsql-build-db -d $DBDATA -g -r -w $WAREHOUSES
>     ./dbt2-run-workload -a pgsql -c $DB_CONNECTIONS \
>         -d $REGRESS_DURATION_SEC -w $WAREHOUSES -o $OUTPUT_DIR
>     ./dbt2-pgsql-stop-db
>
> I am not able to understand the sar-related graphs. The iostat, mpstat
> and vmstat results are similar, but the sar results are strange. I
> tried to explore the dbt2 source code to find out how the graphs are
> drawn and why the sar results differ. DBT2.pm:189 reads sar.out, parses
> it, and assumes 1 minute of elapsed time between each record, i.e.

That is certainly a weakness in the logic of the perl modules in plotting
the charts accurately. I wouldn't be surprised if the other stat tools
suffer the same problem.

Regards,
Mark
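Rather than assuming 60 seconds between sar records, the interval could be derived from the HH:MM:SS timestamps in sar.out. A hedged sketch (the timestamp format is an assumption; sar output varies by locale and AM/PM settings):

```shell
# sar_interval HH:MM:SS HH:MM:SS -> elapsed seconds between two records
sar_interval() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, ":"); split(b, y, ":")
        print (y[1] * 3600 + y[2] * 60 + y[3]) - (x[1] * 3600 + x[2] * 60 + x[3])
    }'
}
```

For example, `sar_interval 10:00:00 10:01:30` yields 90, so a chart plotted from those two records should advance 90 seconds, not 60.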
Re: [PERFORM] Linux I/O tuning: CFQ vs. deadline
On Mon, Feb 8, 2010 at 9:49 AM, Josh Berkus <j...@agliodbs.com> wrote:
>> That's basically what I've been trying to make clear all along: people
>> should keep an open mind, watch what happens, and not make any
>> assumptions. There's no clear-cut preference for one scheduler or the
>> other in all situations. I've seen CFQ do much better; you and Albe
>> report situations where the opposite is true. I was just happy to see
>> another report of someone running into the same sort of issue I've
>> been seeing, because I didn't have very much data to offer about why
>> the standard advice of "always use deadline for a database app" might
>> not apply to everyone.
>
> Damn, you would have to make things complicated, eh?
>
> FWIW, back when deadline was first introduced Mark Wong did some tests
> and found deadline to be the fastest of the 4 on DBT2 ... but only by
> about 5%. If the read vs. checkpoint analysis is correct, what was
> happening is the penalty for checkpoints on deadline was almost wiping
> out the advantage for reads, but not quite. Those tests were also done
> on attached storage.
>
> So, what this suggests is:
>
>   reads: deadline > CFQ
>   writes: CFQ > deadline
>   attached storage: deadline > CFQ
>
> Man, we'd need a lot of testing to settle this. I guess that's why
> Linux gives us the choice of 4 ...

I wonder what the impact is from the underlying RAID configuration. Those
DBT2 tests were also LVM striped volumes on top of single RAID0 LUNs (no
jbod option).

Regards,
Mark
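For anyone reproducing these comparisons, the elevator in use shows up bracketed in sysfs. A small sketch (the /sys path layout is the conventional one; verify on your kernel, and the device name is illustrative):

```shell
# Extract the bracketed (active) scheduler from a line such as
# "noop anticipatory deadline [cfq]".
active_scheduler() {
    echo "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Typical usage (device name illustrative):
#   active_scheduler "$(cat /sys/block/sda/queue/scheduler)"
#   echo deadline > /sys/block/sda/queue/scheduler   # switch at runtime
```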
Re: [PERFORM] Using IOZone to simulate DB access patterns
On Sat, Apr 11, 2009 at 11:44 AM, Mark Wong <mark...@gmail.com> wrote:
> On Fri, Apr 10, 2009 at 11:01 AM, Greg Smith <gsm...@gregsmith.com> wrote:
>> On Fri, 10 Apr 2009, Scott Carey wrote:
>>> FIO with profiles such as the below samples are easy to set up
>>
>> There are some more sample FIO profiles with results from various
>> filesystems at
>> http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide
>
> There are a couple of potential flaws I'm trying to characterize this
> weekend. I'm having second thoughts about how I did the sequential read
> and write profiles. Using multiple processes doesn't let it really do
> sequential i/o. I've done one comparison so far resulting in about 50%
> more throughput using just one process to do sequential writes. I just
> want to make sure there shouldn't be any concern about being processor
> bound on one core.
>
> The other flaw is having a minimum run time. The max of 1 hour seems to
> be good for establishing steady system utilization, but letting some
> tests finish in less than 15 minutes doesn't provide good data. "Good"
> meaning looking at the time series of data and feeling confident it's a
> reliable result. I think I'm describing that correctly...

FYI, I've updated the wiki with the parameters I'm running with now. I
haven't updated the results yet though.

Regards,
Mark
Re: [PERFORM] Using IOZone to simulate DB access patterns
On Sat, Apr 11, 2009 at 7:00 PM, Scott Carey <sc...@richrelevance.com> wrote:
> On 4/11/09 11:44 AM, Mark Wong <mark...@gmail.com> wrote:
>> On Fri, Apr 10, 2009 at 11:01 AM, Greg Smith <gsm...@gregsmith.com> wrote:
>>> On Fri, 10 Apr 2009, Scott Carey wrote:
>>>> FIO with profiles such as the below samples are easy to set up
>>>
>>> There are some more sample FIO profiles with results from various
>>> filesystems at
>>> http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide
>>
>> There are a couple of potential flaws I'm trying to characterize this
>> weekend. I'm having second thoughts about how I did the sequential
>> read and write profiles. Using multiple processes doesn't let it
>> really do sequential i/o. I've done one comparison so far resulting in
>> about 50% more throughput using just one process to do sequential
>> writes. I just want to make sure there shouldn't be any concern about
>> being processor bound on one core.
>
> FWIW, my raid array will do 1200MB/sec, and no tool I've used can
> saturate it without at least two processes. 'dd' and fio can get close
> (1050MB/sec) if the block size is >= ~32k and <= 64k. With a
> Postgres-sized 8k block, 'dd' can't top 900MB/sec or so. FIO can
> saturate it only with two+ readers.
>
> I optimized my configuration for 4 concurrent sequential readers with 4
> concurrent random readers, and this helped the overall real-world
> performance a lot. I would argue that on any system with concurrent
> queries, concurrency of all types is important to measure. Postgres
> isn't going to hold up one sequential scan to wait for another.
> Postgres on a 3.16GHz CPU is CPU bound on a sequential scan at between
> 250MB/sec and 800MB/sec on the type of tables/queries I have.
>
> Concurrent sequential performance was affected by:
>   * xfs -- the gain over ext3 was large
>   * readahead tuning -- about 2MB per spindle was optimal (20MB for me,
>     sw raid 0 on 2x[10 drive hw raid 10])
>   * deadline scheduler (big difference with concurrent sequential +
>     random mixed)
>
> One reason your tests write so much faster than they read was the Linux
> readahead value not being tuned, as you later observed. This helps ext3
> a lot, and xfs enough so that fio single threaded was faster than 'dd'
> to the raw device.
>
>> The other flaw is having a minimum run time. The max of 1 hour seems
>> to be good for establishing steady system utilization, but letting
>> some tests finish in less than 15 minutes doesn't provide good data.
>> "Good" meaning looking at the time series of data and feeling
>> confident it's a reliable result. I think I'm describing that
>> correctly...
>
> It really depends on the specific test though. You can usually get
> random iops numbers that are realistic in a fairly short time, and
> 1-minute-long tests for me vary by about 3% (which can be +-35MB/sec in
> my case).
>
> I ran my tests on a partition that was only 20% the size of the whole
> volume, and at the front of it. Sequential transfer varies by a factor
> of 2 across a SATA disk from start to end, so if you want to compare
> file systems fairly on sequential transfer rate you have to limit the
> partition to an area with relatively constant STR, or else one file
> system might win just because it placed your file earlier on the drive.

That's probably what is going on with the 1-disk test:

http://207.173.203.223/~markwkm/community10/fio/linux-2.6.28-gentoo/1-disk-raid0/ext2/seq-read/io-charts/iostat-rMB.s.png

versus the 4-disk test:

http://207.173.203.223/~markwkm/community10/fio/linux-2.6.28-gentoo/4-disk-raid0/ext2/seq-read/io-charts/iostat-rMB.s.png

These are the throughput numbers, but the iops are in the same directory.

Regards,
Mark
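Scott's "~2MB of readahead per spindle" rule of thumb can be turned into a concrete setting. A sketch assuming `blockdev --setra` counts 512-byte sectors (the device name is illustrative):

```shell
# Convert MB of readahead into 512-byte sectors for blockdev --setra.
spindles=10        # effective data spindles (e.g. one side of a RAID10)
mb_per_spindle=2   # Scott's rule of thumb from this thread
sectors=$(( spindles * mb_per_spindle * 1024 * 1024 / 512 ))
echo "$sectors"    # 40960 sectors = 20MB

# Apply it (illustrative device; run as root):
#   blockdev --setra "$sectors" /dev/md0
```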
Re: [PERFORM] Using IOZone to simulate DB access patterns
On Fri, Apr 10, 2009 at 11:01 AM, Greg Smith <gsm...@gregsmith.com> wrote:
> On Fri, 10 Apr 2009, Scott Carey wrote:
>> FIO with profiles such as the below samples are easy to set up
>
> There are some more sample FIO profiles with results from various
> filesystems at
> http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide

There are a couple of potential flaws I'm trying to characterize this
weekend. I'm having second thoughts about how I did the sequential read
and write profiles. Using multiple processes doesn't let it really do
sequential i/o. I've done one comparison so far resulting in about 50%
more throughput using just one process to do sequential writes. I just
want to make sure there shouldn't be any concern about being processor
bound on one core.

The other flaw is having a minimum run time. The max of 1 hour seems to
be good for establishing steady system utilization, but letting some
tests finish in less than 15 minutes doesn't provide good data. "Good"
meaning looking at the time series of data and feeling confident it's a
reliable result. I think I'm describing that correctly...

Regards,
Mark
[PERFORM] linux deadline i/o elevator tuning
Hi all,

Has anyone experimented with the Linux deadline parameters and have some
experiences to share?

Regards,
Mark
Re: [PERFORM] linux deadline i/o elevator tuning
On Thu, Apr 9, 2009 at 7:00 AM, Mark Wong <mark...@gmail.com> wrote:
> Hi all,
>
> Has anyone experimented with the Linux deadline parameters and have
> some experiences to share?

Hi all,

Thanks for all the responses, but I didn't mean selecting deadline so
much as its parameters, such as:

  antic_expire
  read_batch_expire
  read_expire
  write_batch_expire
  write_expire

Regards,
Mark
Re: [PERFORM] linux deadline i/o elevator tuning
On Thu, Apr 9, 2009 at 7:53 AM, Mark Wong <mark...@gmail.com> wrote:
> On Thu, Apr 9, 2009 at 7:00 AM, Mark Wong <mark...@gmail.com> wrote:
>> Hi all,
>>
>> Has anyone experimented with the Linux deadline parameters and have
>> some experiences to share?
>
> Hi all,
>
> Thanks for all the responses, but I didn't mean selecting deadline so
> much as its parameters, such as:
>
>   antic_expire
>   read_batch_expire
>   read_expire
>   write_batch_expire
>   write_expire

And I dumped the parameters for the anticipatory scheduler. :p Here are
the deadline parameters:

  fifo_batch
  front_merges
  read_expire
  write_expire
  writes_starved

Regards,
Mark
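A hedged sketch for dumping those tunables, assuming the usual sysfs layout under /sys/block/DEV/queue/iosched/ (parameter names and availability vary by kernel version and active scheduler):

```shell
# Print the deadline scheduler's tunables for a block device.
show_deadline_params() {
    dev=${1:-sda}
    for p in fifo_batch front_merges read_expire write_expire writes_starved; do
        f="/sys/block/$dev/queue/iosched/$p"
        if [ -r "$f" ]; then
            printf '%s = %s\n' "$p" "$(cat "$f")"
        else
            printf '%s = (not readable on this system)\n' "$p"
        fi
    done
}
```

Writing a new value is the mirror image, e.g. `echo 250 > /sys/block/sda/queue/iosched/read_expire` as root (the value is illustrative).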
Re: [PERFORM] DBT Presentation Location?
On Mar 9, 2009, at 7:28 AM, Lee Hughes wrote:
> Hi- where can I find the location of the DBT presentation in Portland
> next week?

It'll be at Portland State University at 7pm Thursday, March 12, in the
Fourth Avenue Building (FAB) room 86-01, at 1900 SW 4th Ave. It's in G-10
on the map: http://www.pdx.edu/map.html

See you soon.

Regards,
Mark
Re: [PERFORM] dbt-2 tuning results with postgresql-8.3.5
On Thu, Jan 22, 2009 at 7:44 PM, Greg Smith <gsm...@gregsmith.com> wrote:
> The next fine-tuning bit I'd normally apply in this situation is to see
> if increasing checkpoint_completion_target from the default (0.5) to
> 0.9 does anything to flatten out that response time graph. I've seen a
> modest increase in wal_buffers (from the default to, say, 1MB) help
> smooth out the rough spots too.

Hi all,

After yet another delay, I have .6 to .9 (I forgot .5 :( ):

http://pugs.postgresql.org/node/526

I don't think the effects of the checkpoint_completion_target are
significant, and I sort of feel it's because the entire database is on a
single device. I've started doing some runs with the database log on a
separate device, so I'll be trying some of these parameters again.

Regards,
Mark
Re: [PERFORM] Getting error while running DBT2 test for PostgreSQL
On Wed, Feb 4, 2009 at 2:42 AM, Rohan Pethkar <rohanpeth...@gmail.com> wrote:
> Hi All,
>
> I am trying to conduct a DBT2 test for PostgreSQL. I am getting the
> following errors when I run the client and driver separately (instead
> of running the dbt2-run-workload script). I am also attaching the exact
> error log file for the same.
>
>   tid:1073875280 /home/rohan/NEW_DBT2/Installer/dbt2/src/driver.c:496
>   connect_to_client() failed, thread exiting... Wed Feb 4 20:00:52 2009
>   tid:1074010448 /home/rohan/NEW_DBT2/Installer/dbt2/src/driver.c:496
>   connect_to_client() failed, thread exiting...
>   ...
>
> Can someone please provide inputs on this? If I need to make specific
> changes, then please let me know. Any help on this will be appreciated.
> Thanks in advance for your help.

Hi Rohan,

As I mentioned on the osdldbt list, for questions like this it would be
more appropriate to cc osdldbt-gene...@lists.sourceforge.net instead of
the -performance list.

It's not clear why the driver can't connect to the client. Can you
provide all of the log files somewhere?

Regards,
Mark
Re: [PERFORM] Can't locate Test/Parser/Dbt2.pm in DBT2 tests
On Fri, Feb 6, 2009 at 3:46 AM, Richard Huxton <d...@archonet.com> wrote:
> Rohan Pethkar wrote:
>> Hi All,
>>
>> I am conducting DBT2 tests on PostgreSQL. After completing the test,
>> while analyzing and creating the results I am getting the following
>> error:
>>
>>   ./dbt2-run-workload: line 514:   731 Terminated   dbt2-client
>>       ${CLIENT_COMMAND_ARGS} -p ${PORT} -o ${CDIR}
>>       > ${CLIENT_OUTPUT_DIR}/`hostname`/client-${SEG}.out 2>&1
>>   waiting for server to shut down.... done
>>   server stopped
>>   Can't locate Test/Parser/Dbt2.pm in @INC (@INC contains:
>>   /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi
>>   /usr/lib64/perl5/site_perl/5.8.7/x86_64-linux-thread-multi
>>   /usr/lib64/perl5/site_perl/5.8.6/x86_64-linux-thread-multi
>>   /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi
>>   /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl/5.8.7
>>   /usr/lib/perl5/site_perl/5.8.6 /usr/lib/perl5/site_perl/5.8.5
>>   /usr/lib/perl5/site_perl
>>   /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi
>>   /usr/lib64/perl5/vendor_perl/5.8.7/x86_64-linux-thread-multi
>>   /usr/lib64/perl5/vendor_perl/5.8.6/x86_64-linux-thread-multi
>>   /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi
>>   /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl/5.8.7
>>   /usr/lib/perl5/vendor_perl/5.8.6 /usr/lib/perl5/vendor_perl/5.8.5
>>   /usr/lib/perl5/vendor_perl
>>   /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi
>>   /usr/lib/perl5/5.8.8 .) at
>>   /home/rohan/NEW_DBT2/Installer/DBT2_SETUP/bin/dbt2-post-process line 13.
>
> Well, if Test::Parser::Dbt2 isn't somewhere in that list of
> directories, you'll need to tell perl where to look. Simplest is
> probably just to:
>
>   export PERL5LIB=/path/to/extra/libs
>
> before running your tests.
>
>>   Can't exec "gnuplot": No such file or directory at
>>   /home/rohan/NEW_DBT2/Installer/DBT2_SETUP/bin/dbt2-pgsql-analyze-stats
>>   line 113.
>
> It also looks like you're missing gnuplot for your charts.
>
>> I am not sure why it doesn't find Test/Parser/Dbt2.pm even though I
>> have installed DBT2 completely. Did I miss any steps? Do I need to
>> install some extra packages? If so, then please let me know.
>
> You can always "perldoc perlrun" for more info (google it if you don't
> have docs installed locally).

Hi Rohan,

In addition to what Richard said, I'm guessing you don't have those perl
modules installed. The main README file lists the perl modules required
for the post-processing of the data collected, which is what failed here:

    The data analysis scripts require two additional Perl packages to be
    installed, which are not checked by configure.  They are
    Statistics::Descriptive and Test::Parser.  To generate HTML reports,
    Test::Reporter is required.

Sorry, it's still a work in progress, as slow as that may be.

Regards,
Mark
Re: [PERFORM] dbt-2 tuning results with postgresql-8.3.5
On Thu, Jan 22, 2009 at 10:10 PM, Mark Wong <mark...@gmail.com> wrote:
> On Thu, Jan 22, 2009 at 7:44 PM, Greg Smith <gsm...@gregsmith.com> wrote:
>> On Thu, 22 Jan 2009, Mark Wong wrote:
>>> I'm also capturing the PostgreSQL parameters as suggested so we can
>>> see what's set in the config file, default, command line etc. It's
>>> the "Settings" link in the "System Summary" section on the report
>>> web page.
>>
>> Those look good, much easier to pick out the stuff that's been
>> customized. I note that the "Linux Settings" link seems to be broken
>> though.
>
> Oh fudge, I think I see where my scripts are broken. We're running
> with a different Linux kernel now than before, so I don't want to grab
> the parameters yet. I'll switch to the previous kernel to get the
> parameters after the current testing is done, and fix the scripts in
> the meantime.

Sorry for the continuing delays. I have to make more time to spend on
this part. One of the problems is that my scripts are listing the OS of
the main driver system, as opposed to the db system. Meanwhile, I've
attached the sysctl output from the kernel running on the database
system, 2.6.27-gentoo-r2.

Regards,
Mark

[Attachment: sysctl-2.6.27-gentoo-r2]
Re: [PERFORM] dbt-2 tuning results with postgresql-8.3.5
On Mon, Dec 22, 2008 at 12:59 AM, Greg Smith <gsm...@gregsmith.com> wrote:
> On Sat, 20 Dec 2008, Mark Wong wrote:
>> Here are links to how the throughput changes when increasing
>> shared_buffers: http://pugs.postgresql.org/node/505
>> My first glance tells me that the system performance is quite erratic
>> when increasing the shared_buffers.
>
> If you smooth that curve out a bit, you have to throw out the 22528MB
> figure as meaningless--particularly since it's way too close to the
> cliff where performance dives hard. The sweet spot looks to me like
> 11264MB to 17408MB. I'd say 14336MB is the best-performing setting
> that's in the middle of a stable area.
>
>> And another series of tests to show how throughput changes when
>> checkpoint_segments are increased:
>> http://pugs.postgresql.org/node/503
>> I'm also not sure what to gather from increasing the
>> checkpoint_segments.
>
> What was shared_buffers set to here? Those two settings are not
> completely independent; for example, at a tiny buffer size it's not as
> obvious there's a win in spreading the checkpoints out more. It's
> actually a 3-D graph, with shared_buffers and checkpoint_segments as
> two axes and the throughput as the Z value.
>
> Since that's quite time consuming to map out in its entirety, the way
> I'd suggest navigating the territory more efficiently is to ignore the
> defaults altogether. Start with a configuration that someone familiar
> with tuning the database would pick for this hardware: 8192MB for
> shared_buffers and 100 checkpoint segments would be a reasonable base
> point. Run the same tests you did here, but with the value you're not
> changing set to those much larger values rather than the database
> defaults, and then I think you'd end up with something more
> interesting.
>
> Also, I think the checkpoint_segments values above 500 are a bit much,
> given what level of recovery time would come with a crash at that
> setting. Smaller steps from a smaller range would be better there I
> think.

Sorry for the long delay. I have a trio of results (that I actually ran
about four weeks ago) setting the shared_buffers to 7680MB (I don't
remember why it wasn't set to 8192MB :( ) and checkpoint_segments to 100:

http://pugs.postgresql.org/node/517

I'm also capturing the PostgreSQL parameters as suggested so we can see
what's set in the config file, default, command line etc. It's the
"Settings" link in the "System Summary" section on the report web page.

So about a 7% change for this particular workload:

http://pugs.postgresql.org/node/502

We're re-running some filesystem tests for an upcoming conference, so
we'll get back to it shortly...

Regards,
Mark
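Pulling the settings discussed in this thread together, a sketch of the relevant postgresql.conf lines (8.3-era parameter names; the values are the ones tried in the thread, not general recommendations):

```
shared_buffers = 8192MB             # Greg's suggested base point (7680MB was what actually ran)
checkpoint_segments = 100           # 8.3-era setting for spreading checkpoint I/O
checkpoint_completion_target = 0.9  # tried across the 0.5-0.9 range above
wal_buffers = 1MB                   # modest increase from the default
```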
Re: [PERFORM] dbt-2 tuning results with postgresql-8.3.5
On Tue, Jan 13, 2009 at 7:40 AM, Kevin Grittner
<kevin.gritt...@wicourts.gov> wrote:
> Mark Wong <mark...@gmail.com> wrote:
>> It appears to peak around 220 database connections:
>> http://pugs.postgresql.org/node/514
>
> Interesting. What did you use for connection pooling?

It's a fairly dumb but custom built C program for the test kit:

http://git.postgresql.org/?p=~markwkm/dbt2.git;a=summary

I think the bulk of the logic is in src/client.c, src/db_threadpool.c,
and src/transaction_queue.c.

> My tests have never stayed that flat as the connections in use
> climbed. I'm curious why we're seeing such different results.

I'm sure the difference in workloads makes a difference. Like you implied
earlier, I think we have to figure out what works best for our own
workloads.

Regards,
Mark
Re: [PERFORM] dbt-2 tuning results with postgresql-8.3.5
On Mon, Dec 22, 2008 at 7:27 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Mark Wong mark...@gmail.com wrote: The DL380 G5 is an 8 core Xeon E5405 with 32GB of memory. The MSA70 is a 25-disk 15,000 RPM SAS array, currently configured as a 25-disk RAID-0 array. number of connections (250): Moving forward, what other parameters (or combinations of) do people feel would be valuable to illustrate with this workload? To configure PostgreSQL for OLTP on that hardware, I would strongly recommend the use of a connection pool which queues requests above some limit on concurrent queries. My guess is that you'll see best results with a limit somewhere around 40, based on my tests indicating that performance drops off above (cpucount * 2) + spindlecount. It appears to peak around 220 database connections: http://pugs.postgresql.org/node/514 Of course the system still isn't really tuned all that much... I wouldn't be surprised if the workload peaked at a different number of connections as it is tuned more. I wouldn't consider tests of the other parameters as being very useful before tuning this. This is more or less equivalent to the engines configuration in Sybase, for example. Regards, Mark
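Kevin's rule of thumb above can be sketched as a quick calculation; this is a hypothetical sketch plugging in the DL380 G5's core count and the MSA70's spindle count from this thread:

```shell
# Starting point for a connection-pool limit, per Kevin's observation that
# performance drops off above (cpucount * 2) + spindlecount.
CPUS=8        # 8-core Xeon E5405
SPINDLES=25   # 25-disk MSA70 array
echo $(( CPUS * 2 + SPINDLES ))   # 41, in the neighborhood of the suggested ~40
```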
Re: [PERFORM] dbt-2 tuning results with postgresql-8.3.5
On Sun, Dec 21, 2008 at 10:56 PM, Gregory Stark st...@enterprisedb.com wrote: Mark Wong mark...@gmail.com writes: On Dec 20, 2008, at 5:33 PM, Gregory Stark wrote: Mark Wong mark...@gmail.com writes: To recap, dbt2 is a fair-use derivative of the TPC-C benchmark. We are using a 1000 warehouse database, which amounts to about 100GB of raw text data. Really? Do you get conforming results with 1,000 warehouses? What's the 95th percentile response time? No, the results are not conforming. You and others have pointed that out already. The 95th percentile response times are calculated on each page of the previous links. Where exactly? Maybe I'm blind but I don't see them. Here's an example: http://207.173.203.223/~markwkm/community6/dbt2/baseline.1000.1/report/ The links on the blog entries should be pointing to their respective reports. I spot checked a few and it seems I got some right. I probably didn't make it clear you needed to click on the results to see the reports. I find your questions a little odd for the input I'm asking for. Are you under the impression we are trying to publish benchmarking results? Perhaps this is a simple misunderstanding? Hm, perhaps. The conventional way to run TPC-C is to run it with larger and larger scale factors until you find out the largest scale factor you can get a conformant result at. In other words the scale factor is an output, not an input variable. You're using TPC-C just as an example workload and looking to see how to maximize the TPM for a given scale factor. I guess there's nothing wrong with that as long as everyone realizes it's not a TPC-C benchmark. Perhaps, but we're not trying to run a TPC-C benchmark. We're trying to illustrate how performance changes with an understood OLTP workload. The purpose is to show how the system behaves more so than what the maximum transactions are. We try to advertise the kit and the work as being for self-learning; we never try to pass dbt-2 off as a benchmarking kit. 
Except that if the 95th percentile response times are well above a second I have to wonder whether the situation reflects an actual production OLTP system well. It implies there are so many concurrent sessions that any given query is being context switched out for seconds at a time. I have to imagine that a real production system would consider the system overloaded as soon as queries start taking significantly longer than they take on an unloaded system. People monitor the service wait times and queue depths for i/o systems closely and having several seconds of wait time is a highly abnormal situation. We attempt to illustrate the response times on the reports. For example, there is a histogram (drawn as a scatter plot) illustrating the number of transactions vs. the response time for each transaction. This is for the New Order transaction: http://207.173.203.223/~markwkm/community6/dbt2/baseline.1000.1/report/dist_n.png We also plot the response time for a transaction vs the elapsed time (also as a scatter plot). Again, this is for the New Order transaction: http://207.173.203.223/~markwkm/community6/dbt2/baseline.1000.1/report/rt_n.png I'm not sure how bad that is for the benchmarks. The only effect that comes to mind is that it might exaggerate the effects of some i/o intensive operations that under normal conditions might not cause any noticeable impact like wal log file switches or even checkpoints. I'm not sure I'm following. Is this something that can be shown by any stats collection or profiling? This vaguely reminds me of the significant spikes in system time (and dips everywhere else) when the operating system is fsyncing during a checkpoint that we've always observed when running this in the past. 
If you have a good i/o controller it might confuse your results a bit when you're comparing random and sequential i/o because the controller might be able to sort requests by physical position better than in a typical oltp environment where the wait queues are too short to effectively do that. Thanks for the input. Regards, Mark
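For reference, a 95th percentile like the ones on the report pages can be pulled from a list of per-transaction response times with standard tools. This is a hypothetical one-liner, not what the dbt-2 report scripts actually do:

```shell
# 95th percentile of 20 sample response times; the values 1..20 stand in
# for per-transaction times, one per line, e.g. extracted from a mix log.
seq 1 20 | sort -n | awk '{ a[NR] = $1 } END { print a[int(NR * 0.95)] }'   # prints 19
```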
Re: [PERFORM] dbt-2 tuning results with postgresql-8.3.5
On Mon, Dec 22, 2008 at 12:59 AM, Greg Smith gsm...@gregsmith.com wrote: On Sat, 20 Dec 2008, Mark Wong wrote: Here are links to how the throughput changes when increasing shared_buffers: http://pugs.postgresql.org/node/505 My first glance tells me that the system performance is quite erratic when increasing the shared_buffers. If you smooth that curve out a bit, you have to throw out the 22528MB figure as meaningless--particularly since it's way too close to the cliff where performance dives hard. The sweet spot looks to me like 11264MB to 17408MB. I'd say 14336MB is the best performing setting that's in the middle of a stable area. And another series of tests to show how throughput changes when checkpoint_segments are increased: http://pugs.postgresql.org/node/503 I'm also not sure what to gather from increasing the checkpoint_segments. What was shared_buffers set to here? Those two settings are not completely independent, for example at a tiny buffer size it's not as obvious there's a win in spreading the checkpoints out more. It's actually a 3-D graph, with shared_buffers and checkpoint_segments as two axes and the throughput as the Z value. The shared_buffers are the default, 24MB. The database parameters are saved, probably unclearly, here's an example link: http://207.173.203.223/~markwkm/community6/dbt2/baseline.1000.1/db/param.out Since that's quite time consuming to map out in its entirety, the way I'd suggest navigating the territory more efficiently is to ignore the defaults altogether. Start with a configuration that someone familiar with tuning the database would pick for this hardware: 8192MB for shared_buffers and 100 checkpoint segments would be a reasonable base point. Run the same tests you did here, but with the value you're not changing set to those much larger values rather than the database defaults, and then I think you'd end up with something more interesting. 
Also, I think the checkpoint_segments values 500 are a bit much, given what level of recovery time would come with a crash at that setting. Smaller steps from a smaller range would be better there I think. I should probably run your pgtune script, huh? Regards, Mark
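Greg's suggested base point, plus the connection setting used throughout these runs, would look like this as a postgresql.conf fragment. This is a sketch for the 8.3-era server under test; note that checkpoint_segments was replaced by max_wal_size in later PostgreSQL releases:

```ini
shared_buffers = 8192MB       # ~25% of the DL380 G5's 32GB of memory
checkpoint_segments = 100     # spread checkpoints out; trades off crash-recovery time
max_connections = 250         # matches the runs reported above
```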
Re: [PERFORM] dbt-2 tuning results with postgresql-8.3.5
On Mon, Dec 22, 2008 at 2:56 AM, Gregory Stark st...@enterprisedb.com wrote: Mark Wong mark...@gmail.com writes: Thanks for the input. In a more constructive vein: 1) autovacuum doesn't seem to be properly tracked. It looks like you're just tracking the autovacuum process and not the actual vacuum subprocesses which it spawns. Hrm, tracking just the launcher process certainly doesn't help. Are the spawned processes short-lived? I take a snapshot of /proc/pid/io data every 60 seconds. The only thing I see named autovacuum is the launcher process. Or perhaps I can't read? Here is the raw data of the /proc/pid/io captures: http://207.173.203.223/~markwkm/community6/dbt2/baseline.1000.1/db/iopp.out 2) The response time graphs would be more informative if you excluded the ramp-up portion of the test. As it is there are big spikes at the low end but it's not clear whether they're really part of the curve or due to ramp-up. This is especially visible in the stock-level graph where it throws off the whole y scale. Ok, we'll take note and see what we can do. Regards, Mark
Re: [PERFORM] dbt-2 tuning results with postgresql-8.3.5
On Mon, Dec 22, 2008 at 7:27 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Mark Wong mark...@gmail.com wrote: The DL380 G5 is an 8 core Xeon E5405 with 32GB of memory. The MSA70 is a 25-disk 15,000 RPM SAS array, currently configured as a 25-disk RAID-0 array. number of connections (250): Moving forward, what other parameters (or combinations of) do people feel would be valuable to illustrate with this workload? To configure PostgreSQL for OLTP on that hardware, I would strongly recommend the use of a connection pool which queues requests above some limit on concurrent queries. My guess is that you'll see best results with a limit somewhere around 40, based on my tests indicating that performance drops off above (cpucount * 2) + spindlecount. Yeah, we are using a homegrown connection concentrator as part of the test kit, but it's not very intelligent. I wouldn't consider tests of the other parameters as being very useful before tuning this. This is more or less equivalent to the engines configuration in Sybase, for example. Right, I have the database configured for 250 connections but I'm using 200 of them. I'm pretty sure for this scale factor 200 is more than enough. Nevertheless I should go through the exercise. Regards, Mark
[PERFORM] dbt-2 tuning results with postgresql-8.3.5
Hi all, So after a long hiatus after running this OLTP workload at the OSDL, many of you know the community has had some equipment donated by HP: a DL380 G5 and an MSA70 disk array. We are currently using the hardware to do some tuning exercises to show the effects of various GUC parameters. I wanted to share what I've started with for input for what is realistic to tune an OLTP database on a single large LUN. The initial goal is to show how much can (or can't) be tuned on an OLTP type workload with just database and kernel parameters before physically partitioning the database. I hope this is actually a useful exercise (it has certainly helped get the kit updated a little bit.) To recap, dbt2 is a fair-use derivative of the TPC-C benchmark. We are using a 1000 warehouse database, which amounts to about 100GB of raw text data. The DL380 G5 is an 8 core Xeon E5405 with 32GB of memory. The MSA70 is a 25-disk 15,000 RPM SAS array, currently configured as a 25-disk RAID-0 array. More specific hardware details can be found here: http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide#Hardware_Details So the first task is to show the confidence of the results; here is a link to a few repeated runs using all default GUC values except the number of connections (250): http://pugs.postgresql.org/node/502 Here are links to how the throughput changes when increasing shared_buffers: http://pugs.postgresql.org/node/505 And another series of tests to show how throughput changes when checkpoint_segments are increased: http://pugs.postgresql.org/node/503 The links go to a graphical summary and raw data. Note that the maximum theoretical throughput at this scale factor is approximately 12000 notpm. My first glance tells me that the system performance is quite erratic when increasing the shared_buffers. I'm also not sure what to gather from increasing the checkpoint_segments. 
Is it simply that the more checkpoint segments you have, the more time the database spends fsyncing when at a checkpoint? Moving forward, what other parameters (or combinations of) do people feel would be valuable to illustrate with this workload? Regards, Mark
[PERFORM] Effects of setting linux block device readahead size
Hi all, I've started to display the effects of changing the Linux block device readahead buffer on sequential read performance, using fio. There is a lot of raw data buried in the page, but this is what I've distilled thus far. Please have a look and let me know what you think: http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide#Readahead_Buffer_Size Regards, Mark
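The readahead buffer being varied in those fio runs is the per-block-device setting managed by blockdev. A small sketch follows; the device name is an assumption, and the command is skipped if the device is absent:

```shell
DEV=/dev/sda                       # assumed device name; adjust for your array
if [ -b "$DEV" ]; then
  blockdev --getra "$DEV"          # current readahead, in 512-byte sectors
  # blockdev --setra 4096 "$DEV"   # setting it requires root
fi
echo $(( 4096 * 512 ))             # a 4096-sector readahead is 2097152 bytes (2MB)
```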
Re: [PERFORM] Effects of setting linux block device readahead size
On Wed, Sep 10, 2008 at 9:26 AM, Scott Carey [EMAIL PROTECTED] wrote: How does that readahead tunable affect random reads or mixed random / sequential situations? In many databases, the worst case scenarios aren't when you have a bunch of concurrent sequential scans but when there is enough random read/write concurrently to slow the whole thing down to a crawl. How does the file system behave under this sort of concurrency? I would be very interested in a mixed fio profile with a background writer doing moderate, paced random and sequential writes combined with concurrent sequential reads and random reads. The data for the other fio profiles we've been using are on the wiki, if your eyes can take the strain. We are working on presenting the data in a more easily digestible manner. I don't think we'll add any more fio profiles in the interest of moving on to doing some sizing exercises with the dbt2 oltp workload. We're just going to wrap up a couple more scenarios first and get through a couple of conference presentations. The two conferences in particular are the Linux Plumbers Conference, and the PostgreSQL Conference: West 08, which are both in Portland, Oregon. Regards, Mark
Re: [PERFORM] Software vs. Hardware RAID Data
On Tue, Aug 19, 2008 at 10:49 PM, [EMAIL PROTECTED] wrote: On Tue, 19 Aug 2008, Mark Wong wrote: Hi all, We started an attempt to slice the data we've been collecting in another way, to show the results of software vs. hardware RAID: http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide#Hardware_vs._Software_Raid The angle we're trying to show here is the processor utilization and i/o throughput for a given file system and raid configuration. I wasn't sure about the best way to present it, so this is how it looks so far. Click on the results for a chart of the aggregate processor utilization for the test. Comments, suggestions, criticisms, et al. welcome. it's really good to show cpu utilization as well as throughput, but how about showing the cpu utilization as %cpu per MB/s (possibly with a flag to indicate any entries that look like they may have hit cpu limits) Ok, we'll add that and see how it looks. why did you use 4M stripe size on the software raid? especially on raid 5 this seems like a lot of data to have to touch when making an update. I'm sort of taking a shotgun approach, but ultimately we hope to show whether there is significant impact of the stripe width relative to the database blocksize. Regards, Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Software vs. Hardware RAID Data
On Wed, Aug 20, 2008 at 12:53 AM, Tommy Gildseth [EMAIL PROTECTED] wrote: Mark Wong wrote: Hi all, We started an attempt to slice the data we've been collecting in another way, to show the results of software vs. hardware RAID: http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide#Hardware_vs._Software_Raid Comments, suggestions, criticisms, et al. welcome. The link to the graph for Two Disk Software RAID-0 (64KB stripe) points to the wrong graph, hraid vs sraid. Thanks, I think I have it right this time. Regards, Mark
[PERFORM] Software vs. Hardware RAID Data
Hi all, We started an attempt to slice the data we've been collecting in another way, to show the results of software vs. hardware RAID: http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide#Hardware_vs._Software_Raid The angle we're trying to show here is the processor utilization and i/o throughput for a given file system and raid configuration. I wasn't sure about the best way to present it, so this is how it looks so far. Click on the results for a chart of the aggregate processor utilization for the test. Comments, suggestions, criticisms, et al. welcome. Regards, Mark
Re: [PERFORM] file system and raid performance
On Fri, Aug 15, 2008 at 12:22 PM, Bruce Momjian [EMAIL PROTECTED] wrote: Mark Wong wrote: On Mon, Aug 4, 2008 at 10:04 PM, [EMAIL PROTECTED] wrote: On Mon, 4 Aug 2008, Mark Wong wrote: Hi all, We've thrown together some results from simple i/o tests on Linux comparing various file systems, hardware and software raid with a little bit of volume management: http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide Mark, very useful analysis. I am curious why you didn't test 'data=writeback' on ext3; 'data=writeback' is the recommended mount method for that file system, though I see that is not mentioned in our official documentation. I have one set of results with ext3 data=writeback and it appears that some of the write tests have less throughput than data=ordered. For anyone who wants to look at the results details: http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide it's under the Aggregate Bandwidth (MB/s) - RAID 5 (256KB stripe) - No partition table table. Regards, Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] file system and raid performance
On Fri, Aug 15, 2008 at 12:22 PM, Bruce Momjian [EMAIL PROTECTED] wrote: Mark Wong wrote: On Mon, Aug 4, 2008 at 10:04 PM, [EMAIL PROTECTED] wrote: On Mon, 4 Aug 2008, Mark Wong wrote: Hi all, We've thrown together some results from simple i/o tests on Linux comparing various file systems, hardware and software raid with a little bit of volume management: http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide Mark, very useful analysis. I am curious why you didn't test 'data=writeback' on ext3; 'data=writeback' is the recommended mount method for that file system, though I see that is not mentioned in our official documentation. I think the short answer is that I neglected to. :) I didn't realize 'data=writeback' is the recommended journal mode. We'll get a result or two and see how it looks. Mark
Re: [PERFORM] file system and raid performance
On Thu, Aug 7, 2008 at 1:24 PM, Gregory S. Youngblood [EMAIL PROTECTED] wrote: -Original Message- From: Mark Wong [mailto:[EMAIL PROTECTED] Sent: Thursday, August 07, 2008 12:37 PM To: Mario Weilguni Cc: Mark Kirkwood; [EMAIL PROTECTED]; [EMAIL PROTECTED]; pgsql- [EMAIL PROTECTED]; Gabrielle Roth Subject: Re: [PERFORM] file system and raid performance I have heard of one or two situations where the combination of the disk controller caused bizarre behaviors with different journaling file systems. They seem so few and far between though. I personally wasn't looking forward to chasing Linux file system problems, but I can set up an account and remote management access if anyone else would like to volunteer. [Greg says] Tempting... if no one else takes you up on it by then, I might have some time in a week or two to experiment and test a couple of things. Ok, let me know and I'll set you up with access. Regards, Mark
Re: [PERFORM] file system and raid performance
On Thu, Aug 7, 2008 at 3:08 PM, Mark Mielke [EMAIL PROTECTED] wrote: Andrej Ricnik-Bay wrote: 2008/8/8 Scott Marlowe [EMAIL PROTECTED]: noatime turns off the atime write behaviour. Or did you already know that and I missed some weird post where noatime somehow managed to slow down performance? Scott, I'm quite aware of what noatime does ... you didn't miss a post, but if you look at Mark's graphs on http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide they pretty much all indicate that (unless I completely misinterpret the meaning and purpose of the labels), independent of the file-system, using noatime slows read/writes down (on average) That doesn't make sense - if noatime slows things down, then the analysis is probably wrong. Now, modern Linux distributions default to relatime - which will only update access time if the access time is currently less than the update time or something like this. The effect is that modern Linux distributions do not benefit from noatime as much as they have in the past. In this case, noatime vs default would probably be measuring % noise. Anyone know what to look for in kernel profiles? There is readprofile (profile.text) and oprofile (oprofile.kernel and oprofile.user) data available. Just click on the results number, then the raw data link for a directory listing of files. For example, here is one of the links: http://osdldbt.sourceforge.net/dl380/3disk/sraid5/ext3-journal/seq-read/fio/profiling/oprofile.kernel Regards, Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Filesystem benchmarking for pg 8.3.3 server
On Fri, Aug 8, 2008 at 8:08 AM, Henrik [EMAIL PROTECTED] wrote: But random writes should be faster on a RAID10 as it doesn't need to calculate parity. That is why people suggest RAID 10 for databases, correct? I can understand that RAID5 can be faster with sequential writes. There is some data here that does not support that RAID5 can be faster than RAID10 for sequential writes: http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide Regards, Mark
Re: [PERFORM] file system and raid performance
On Thu, Aug 7, 2008 at 3:08 PM, Mark Mielke [EMAIL PROTECTED] wrote: Andrej Ricnik-Bay wrote: 2008/8/8 Scott Marlowe [EMAIL PROTECTED]: noatime turns off the atime write behaviour. Or did you already know that and I missed some weird post where noatime somehow managed to slow down performance? Scott, I'm quite aware of what noatime does ... you didn't miss a post, but if you look at Mark's graphs on http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide they pretty much all indicate that (unless I completely misinterpret the meaning and purpose of the labels), independent of the file-system, using noatime slows read/writes down (on average) That doesn't make sense - if noatime slows things down, then the analysis is probably wrong. Now, modern Linux distributions default to relatime - which will only update access time if the access time is currently less than the update time or something like this. The effect is that modern Linux distributions do not benefit from noatime as much as they have in the past. In this case, noatime vs default would probably be measuring % noise. Interesting, now how would we see if it is defaulting to relatime? Regards, Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] file system and raid performance
On Thu, Aug 7, 2008 at 3:08 PM, Mark Mielke [EMAIL PROTECTED] wrote: Andrej Ricnik-Bay wrote: 2008/8/8 Scott Marlowe [EMAIL PROTECTED]: noatime turns off the atime write behaviour. Or did you already know that and I missed some weird post where noatime somehow managed to slow down performance? Scott, I'm quite aware of what noatime does ... you didn't miss a post, but if you look at Mark's graphs on http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide they pretty much all indicate that (unless I completely misinterpret the meaning and purpose of the labels), independent of the file-system, using noatime slows read/writes down (on average) That doesn't make sense - if noatime slows things down, then the analysis is probably wrong. Now, modern Linux distributions default to relatime - which will only update access time if the access time is currently less than the update time or something like this. The effect is that modern Linux distributions do not benefit from noatime as much as they have in the past. In this case, noatime vs default would probably be measuring % noise. It appears that the default mount option on this system is atime. Not specifying any options, relatime or noatime, results in neither being shown in /proc/mounts. I'm assuming if the default behavior was to use relatime that it would be shown in /proc/mounts. Regards, Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
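The options field in /proc/mounts is indeed where relatime or noatime would appear if they were in effect. Here's a small sketch run against a sample entry (the device path is hypothetical; on a real system, read /proc/mounts directly, e.g. with awk '$2 == "/"' /proc/mounts):

```shell
# Sample /proc/mounts line; field 4 holds the mount options.
line='/dev/cciss/c0d0p1 / ext3 rw,data=ordered 0 0'
opts=$(echo "$line" | awk '{ print $4 }')
case "$opts" in
  *noatime*)  echo noatime ;;
  *relatime*) echo relatime ;;
  *)          echo "atime (default)" ;;   # neither flag listed, as observed above
esac
```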
Re: [PERFORM] file system and raid performance
On Thu, Aug 7, 2008 at 3:21 AM, Mario Weilguni [EMAIL PROTECTED] wrote: Mark Kirkwood schrieb: Mark Kirkwood wrote: You are right, it does (I may be recalling performance from my other machine that has a 3Ware card - this was a couple of years ago...) Anyway, I'm thinking for the Hardware raid tests they may need to be specified. FWIW - of course this is somewhat academic given that the single disk xfs test failed! I'm puzzled - having a Gentoo system of similar configuration (2.6.25-gentoo-r6) and running the fio tests a little modified for my config (2 cpu PIII 2G RAM with 4x ATA disks RAID0 and all xfs filesystems - I changed sizes of files to 4G and no. processes to 4) all tests that failed on Mark's HP work on my Supermicro P2TDER + Promise TX4000. In fact the performance is pretty reasonable on the old girl as well (seq read is 142Mb/s and the random read/write is 12.7/12.0 Mb/s). I certainly would like to see some more info on why the xfs tests were failing - as on most systems I've encountered xfs is a great performer. regards Mark I can second this, we use XFS on nearly all our database servers, and never encountered the problems mentioned. I have heard of one or two situations where the combination of the disk controller caused bizarre behaviors with different journaling file systems. They seem so few and far between though. I personally wasn't looking forward to chasing Linux file system problems, but I can set up an account and remote management access if anyone else would like to volunteer. Regards, Mark
Re: [PERFORM] file system and raid performance
On Mon, Aug 4, 2008 at 10:04 PM, [EMAIL PROTECTED] wrote: On Mon, 4 Aug 2008, Mark Wong wrote: Hi all, We've thrown together some results from simple i/o tests on Linux comparing various file systems, hardware and software raid with a little bit of volume management: http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide What I'd like to ask of the folks on the list is how relevant is this information in helping make decisions such as What file system should I use? What performance can I expect from this RAID configuration? I know these kind of tests won't help answer questions like Which file system is most reliable? but we would like to be as helpful as we can. Any suggestions/comments/criticisms for what would be more relevant or interesting also appreciated. We've started with Linux but we'd also like to hit some other OS's. I'm assuming FreeBSD would be the other popular choice for the DL-380 that we're using. I hope this is helpful. it's definitely timely for me (we were having a spirited 'discussion' on this topic at work today ;-) what happened with XFS? Not exactly sure, I didn't attempt to debug much. I only looked into it enough to see that the fio processes were waiting for something. In one case I left the test go for 24 hours to see if it would stop. Note that I specified to fio not to run longer than an hour. you show it as not completing half the tests in the single-disk table and it's completely missing from the other ones. what OS/kernel were you running? This is a Gentoo system, running the 2.6.25-gentoo-r6 kernel. if it was linux, which software raid did you try (md or dm) did you use lvm or raw partitions? We tried mdraid, not device-mapper. So far we have only used raw partitions (whole devices without partitions.) Regards, Mark
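The one-hour cap mentioned above corresponds to fio's runtime setting. A hypothetical job-file fragment (not the exact profile used in these tests) would look like:

```ini
; time-bounded sequential read job
[seq-read]
rw=read
bs=8k
size=4g
runtime=3600    ; stop after an hour even if the file hasn't been fully read
time_based
```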
[PERFORM] file system and raid performance
Hi all, We've thrown together some results from simple i/o tests on Linux comparing various file systems, hardware and software raid with a little bit of volume management: http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide What I'd like to ask of the folks on the list is how relevant is this information in helping make decisions such as "What file system should I use?" and "What performance can I expect from this RAID configuration?" I know these kinds of tests won't help answer questions like "Which file system is most reliable?" but we would like to be as helpful as we can. Any suggestions/comments/criticisms for what would be more relevant or interesting are also appreciated. We've started with Linux but we'd also like to hit some other OS's. I'm assuming FreeBSD would be the other popular choice for the DL-380 that we're using. I hope this is helpful. Regards, Mark
Re: [PERFORM] A guide/tutorial to performance monitoring and tuning
On Mon, Jul 21, 2008 at 10:24 PM, Greg Smith [EMAIL PROTECTED] wrote: On Mon, 21 Jul 2008, Francisco Reyes wrote: On 2:59 pm 06/29/08 Greg Smith [EMAIL PROTECTED] wrote: Right now I'm working with a few other people to put together a more straightforward single intro guide that should address some of the vagueness you point out here, Was that ever completed? Not done yet; we're planning to have a first rev done in another couple of weeks. The work in progress is at http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server and I'm due to work out another set of improvements to that this week during OSCON. I'd also like to point out we're putting together some data revolving around software raid, hardware raid, volume management, and filesystem performance on a system donated by HP here: http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide Note that it's also a living guide and we haven't started covering some of the things I just mentioned. Regards, Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Benchmark Data requested
On Mon, 4 Feb 2008 15:09:58 -0500 (EST) Greg Smith [EMAIL PROTECTED] wrote: On Mon, 4 Feb 2008, Simon Riggs wrote: Would anybody like to repeat these tests with the latest production versions of these databases (i.e. with PGSQL 8.3)? Do you have any suggestions on how people should run TPC-H? It looked like a bit of work to sort through how to even start this exercise. If you mean you want to get your hands on a kit, the one that Jenny and I put together is here: http://sourceforge.net/project/showfiles.php?group_id=52479&package_id=71458 I hear it still works. :) Regards, Mark ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Benchmark Data requested
On Mon, 04 Feb 2008 17:33:34 -0500 Jignesh K. Shah [EMAIL PROTECTED] wrote: Hi Simon, I have some insight into TPC-H on how it works. First of all I think it is a violation of TPC rules to publish numbers without auditing them first. So even if I do the test to show the better performance of PostgreSQL 8.3, I cannot post it here or any public forum without first going through the process (even though it is a partial benchmark, as they are just doing the equivalent of the PowerRun of TPC-H). Maybe the PostgreSQL PR team should email [EMAIL PROTECTED] about them and see what they have to say about that comparison.

I think I am qualified enough to say it is not a violation of TPC fair-use policy if we scope the data as a measure of how PostgreSQL has changed from 8.1 to 8.3 and refrain from comparing these results to what any other database is doing. The point is to measure PostgreSQL's progress, not market it, correct? Regards, Mark ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] dbt2 NOTPM numbers
On 6/11/07, Markus Schiltknecht [EMAIL PROTECTED] wrote: Heikki Linnakangas wrote: Markus Schiltknecht wrote: For dbt2, I've used 500 warehouses and 90 concurrent connections, default values for everything else. 500? That's just too much for the hardware. Start from say 70 warehouses and up it from there 10 at a time until you hit the wall. I'm using 30 connections with ~100 warehouses on somewhat similar hardware. Aha! That's why... I've seen the '500' in some dbt2 samples and thought it'd be a good default value. But it makes sense that the benchmark doesn't automatically 'scale down'... Stupid me. Thanks again! Hoping for larger NOTPMs. Yeah, I ran with 500+ warehouses, but I had 6 14-disk arrays of 15K RPM scsi drives and 6 dual-channel controllers... :) Regards, Mark ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
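Heikki's suggestion above — start around 70 warehouses and raise the count by 10 until throughput stops climbing — is easy to script. This sketch just prints the dbt2-run-workload invocations rather than running them; the connection count and duration are placeholder values, not figures from the thread.

```shell
# Print a ramp of dbt2-run-workload commands, stepping the warehouse
# count (-w) from 70 to 150 in increments of 10. Run each, then compare
# the NOTPM figures to find where the hardware hits the wall.
for w in $(seq 70 10 150); do
    echo "dbt2-run-workload -a pgsql -c 30 -d 1800 -w $w -o results-${w}w"
done
```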
Re: [PERFORM] dbt2 NOTPM numbers
On 6/13/07, Markus Schiltknecht [EMAIL PROTECTED] wrote: Hi, Mark Wong wrote: Yeah, I ran with 500+ warehouses, but I had 6 14-disk arrays of 15K RPM scsi drives and 6 dual-channel controllers... :) Lucky you! In the mean time, I've figured out that the box in question peaked at about 1450 NOTPMs with 120 warehouses with RAID 1+0. I'll try to compare again to RAID 6. Is there any place where such results are collected? Unfortunately not anymore. There was when I was working at OSDL... but I've been told that the lab has mostly been disassembled now, so the data are lost. Regards, Mark ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] dbt2 NOTPM numbers
On 6/4/07, Markus Schiltknecht [EMAIL PROTECTED] wrote: Thanks, that's exactly the one simple and very raw comparison value I've been looking for. (Since most of the results pages of (former?) OSDL are down). Yeah, those results pages are gone for good. :( Regards, Mark ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] [PATCHES] COPY FROM performance improvements
On Fri, 22 Jul 2005 12:28:43 -0700 Luke Lonergan [EMAIL PROTECTED] wrote: Joshua, On 7/22/05 10:11 AM, Joshua D. Drake [EMAIL PROTECTED] wrote: The database server is a PE (Power Edge) 6600 Database Server IO: [EMAIL PROTECTED] root]# /sbin/hdparm -tT /dev/sda /dev/sda: Timing buffer-cache reads: 1888 MB in 2.00 seconds = 944.00 MB/sec Timing buffered disk reads: 32 MB in 3.06 seconds = 10.46 MB/sec Second Database Server IO: [EMAIL PROTECTED] root]# /sbin/hdparm -tT /dev/sda /dev/sda: Timing buffer-cache reads: 1816 MB in 2.00 seconds = 908.00 MB/sec Timing buffered disk reads: 26 MB in 3.11 seconds = 8.36 MB/sec [EMAIL PROTECTED] root]# Can you post the time dd if=/dev/zero of=bigfile bs=8k count=500000 results? Also do the reverse (read the file) with time dd if=bigfile of=/dev/null bs=8k. I think you are observing what we've known for a while: hardware RAID is horribly slow. We've not found a hardware RAID adapter of this class yet that shows reasonable read or write performance. The Adaptec 2400R or the LSI or others have terrible internal I/O compared to raw SCSI with software RAID, and even the CPU usage is higher on these cards while doing slower I/O than Linux SW RAID. Notably - we've found that the 3Ware RAID controller does a better job than the low end SCSI RAID at HW RAID support, and also exports JBOD at high speeds. If you export JBOD on the low end SCSI RAID adapters, the performance is also very poor, though generally faster than using HW RAID.

Are there any recommendations for Qlogic controllers on Linux, SCSI or fibre channel? I might be able to get my hands on some. I have pci-x slots for AMD, Itanium, or POWER5 if the architecture makes a difference. Mark ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
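The test Luke asks for, as a runnable sketch: time a sequential write with dd, then read the file back. The count here is shrunk to 1000 blocks (8 MB) so it finishes quickly; for a meaningful result, size the file to roughly 2x RAM so the OS page cache can't hide the disk.

```shell
# Sequential write test (scaled down for illustration -- raise count so
# the file is about 2x RAM for a real measurement).
time dd if=/dev/zero of=bigfile bs=8k count=1000
# Read it back to measure sequential read throughput.
time dd if=bigfile of=/dev/null bs=8k
```

Throughput is the byte count divided by the "real" time reported; on a cached-size file like this one the read figure mostly measures memory, which is exactly why the 2x RAM sizing matters.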
Re: [PERFORM] [PATCHES] COPY FROM performance improvements
On a single spindle:

$ time dd if=/dev/zero of=bigfile bs=8k count=200
200+0 records in
200+0 records out
real 2m8.569s
user 0m0.725s
sys 0m19.633s

None of my drives are partitioned big enough for me to create 2x RAM sized files on a single disk. I have 16GB RAM and only 36GB drives. But here are some numbers for my 12-disk lvm2 striped volume.

$ time dd if=/dev/zero of=bigfile3 bs=8k count=400
400+0 records in
400+0 records out
real 1m17.059s
user 0m1.479s
sys 0m41.293s

Mark

On Thu, 21 Jul 2005 16:14:47 -0700 Luke Lonergan [EMAIL PROTECTED] wrote: Cool! At what rate does your disk setup write sequential data, e.g.: time dd if=/dev/zero of=bigfile bs=8k count=500000 (sized for 2x RAM on a system with 2GB) BTW - the Compaq smartarray controllers are pretty broken on Linux from a performance standpoint in our experience. We've had disastrously bad results from the SmartArray 5i and 6 controllers on kernels from 2.4 - 2.6.10, on the order of 20MB/s. For comparison, the results on our dual opteron with a single LSI SCSI controller with software RAID0 on a 2.6.10 kernel:

[EMAIL PROTECTED] dbfast]$ time dd if=/dev/zero of=bigfile bs=8k count=500000
500000+0 records in
500000+0 records out
real 0m24.702s
user 0m0.077s
sys 0m8.794s

Which calculates out to about 161MB/s.

- Luke

On 7/21/05 2:55 PM, Mark Wong [EMAIL PROTECTED] wrote: I just ran through a few tests with the v14 patch against 100GB of data from dbt3 and found a 30% improvement; 3.6 hours vs 5.3 hours. Just to give a few details, I only loaded data and started a COPY in parallel for each of the data files: http://www.testing.osdl.org/projects/dbt3testing/results/fast_copy/ Here's a visual of my disk layout, for those familiar with the database schema: http://www.testing.osdl.org/projects/dbt3testing/results/fast_copy/layout-dev4-010-dbt3.html I have 6 arrays of fourteen 15k rpm drives in a split-bus configuration attached to a 4-way itanium2 via 6 compaq smartarray pci-x controllers. Let me know if you have any questions.
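As a sanity check on Luke's "about 161MB/s": taking the dd count as 500000 blocks of 8 KiB (the 2x-RAM-on-a-2GB-box sizing he describes) over the 24.702 seconds reported, the arithmetic lands in the same ballpark — about 158 MiB/s binary, or about 166 MB/s decimal.

```shell
# Back-of-envelope throughput from Luke's numbers:
# 500000 blocks * 8192 bytes, divided by 24.702 seconds.
awk 'BEGIN { bytes = 500000 * 8192; printf "%.1f MiB/s\n", bytes / 1048576 / 24.702 }'
```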
Mark ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] [HACKERS] PLM pulling from CVS nightly for testing in STP
I have dbt-2 tests automatically running against each pull from CVS and have started to automatically compile results here: http://developer.osdl.org/markw/postgrescvs/ I did start with a bit of a minimalistic approach, so I'm open for any comments, feedback, etc. Mark ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[PERFORM] PLM pulling from CVS nightly for testing in STP
Hi all, Just wanted everyone to know that we're pulling CVS HEAD nightly so it can be tested in STP now. Let me know if you have any questions. Tests are not automatically run yet, but I hope to remedy that shortly. For those not familiar with STP and PLM, here are a couple of links: STP http://www.osdl.org/stp/ PLM http://www.osdl.org/plm-cgi/plm Mark ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] PLM pulling from CVS nightly for testing in STP
On Wed, Apr 13, 2005 at 11:35:36AM -0700, Josh Berkus wrote: Mark, Just wanted everyone to know that we're pulling CVS HEAD nightly so it can be tested in STP now. Let me know if you have any questions. Way cool. How do I find the PLM number? How are you naming these? The naming convention I'm using is postgresql-YYYYMMDD, for example postgresql-20050413 for the anonymous cvs export from today (April 13). I have a cronjob that'll do the export at 1AM PST8PDT. The search page for the PLM numbers is here: https://www.osdl.org/plm-cgi/plm?module=search or you can use the stpbot on linuxnet.mit.edu#osdl. Mark ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
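The postgresql-YYYYMMDD naming convention Mark describes, generated the way a nightly cron job might do it (the variable name is just for illustration):

```shell
# Build the nightly export name from today's date, e.g. postgresql-20050413.
name="postgresql-$(date +%Y%m%d)"
echo "$name"
```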
Re: [PERFORM] ext3 journalling type
I have some data here, no detailed analyses though: http://www.osdl.org/projects/dbt2dev/results/fs/ Mark

On Mon, Nov 08, 2004 at 01:26:09PM +0100, Dawid Kuroczko wrote: The ext3fs allows selecting the type of journalling to be used with the filesystem. Journalling pretty much mirrors the work of WAL logging by PostgreSQL... I wonder which type of journalling is best for PgSQL in terms of performance. Choices include:

journal - All data is committed into the journal prior to being written into the main file system.

ordered - This is the default mode. All data is forced directly out to the main file system prior to its metadata being committed to the journal.

writeback - Data ordering is not preserved; data may be written into the main file system after its metadata has been committed to the journal. This is rumoured to be the highest-throughput option. It guarantees internal file system integrity, however it can allow old data to appear in files after a crash and journal recovery.

Am I right to assume that writeback is both fastest and at the same time as safe to use as ordered? Maybe any of you did some benchmarks? Regards, Dawid ---(end of broadcast)--- TIP 8: explain analyze is your friend
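For anyone wanting to benchmark the writeback mode Dawid asks about, a hypothetical mount sketch (the device and mount point are placeholders, not from the thread); ordered is the default and needs no option:

```shell
# Remount an existing ext3 PostgreSQL data partition with data=writeback.
mount -o remount,data=writeback /dev/sdb1 /var/lib/pgsql

# Or persistently, via an /etc/fstab line:
#   /dev/sdb1  /var/lib/pgsql  ext3  defaults,data=writeback  0 2
```

Note that data=journal cannot be switched by remount on older kernels; it has to be set at initial mount time.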
[PERFORM] ia64 results with dbt2 and 8.0beta4
Hi everyone, Some more data I've collected, trying to best tune dbt-2 with 8.0beta4. Was hoping for some suggestions, explanations for what I'm seeing, etc.

A review of the hardware I've got:
4 x 1.5Ghz Itanium 2
16GB memory
84 15K RPM disks (6 controllers, 12 channels)

Physical database table layout (using LVM2 for tables using more than 1 disk):
- warehouse 2 disks
- district 2 disks
- order_line 2 disks
- customer 4 disks
- stock 12 disks
- log 12 disks
- orders 2 disks
- new_order 2 disks
- history 1 disk
- item 1 disk
- index1 1 disk
- index2 1 disk

All these tests are using a 500 warehouse database.

Test 1: http://www.osdl.org/projects/dbt2dev/results/dev4-010/188/
Metric: 3316
DB parameter changes from default:
bgwriter_percent | 10
checkpoint_timeout | 300
checkpoint_segments | 800
checkpoint_timeout | 1800
default_statistics_target | 1000
max_connections | 140
stats_block_level | on
stats_command_string | on
stats_row_level | on
wal_buffers | 128
wal_sync_method | fsync
work_mem | 2048

Test 2: http://www.osdl.org/projects/dbt2dev/results/dev4-010/189/
Metric: 3261 (-1.7% decrease from Test 1)
DB parameter changes from Test 1:
shared_buffers | 60000
Noted changes: The block reads for the customer table decrease significantly according to the database.

Test 3: http://www.osdl.org/projects/dbt2dev/results/dev4-010/190/
Metric: 3261 (0% change from Test 2)
DB parameter changes from Test 2:
effective_cache_size | 22
Noted changes: No apparent changes according to the charts.

Test 4: http://www.osdl.org/projects/dbt2dev/results/dev4-010/191/
Metric: 3323 (1.9% increase from Test 3)
DB parameter changes from Test 3:
checkpoint_segments | 1024
effective_cache_size | 1000
Noted changes: The increased checkpoint_segments smoothed out the throughput and other i/o related stats.

Test 5: http://www.osdl.org/projects/dbt2dev/results/dev4-010/192/
Metric: 3149 (-5% decrease from Test 4)
DB parameter changes from Test 4:
shared_buffers | 80000
Noted changes: The graphs are starting to jump around a bit.
I figure 80,000 shared_buffers is too much.

Test 6: http://www.osdl.org/projects/dbt2dev/results/dev4-010/193/
Metric: 3277 (4% increase from Test 5)
DB parameter changes from Test 5:
random_page_cost | 2
shared_buffers | 60000
Noted changes: Returning shared_buffers to the value that gave the smoother performance of Test 4 seems to have been partly offset by also decreasing random_page_cost to 2. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
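For reference, the Test 4 settings (the best metric in the series) read as a postgresql.conf fragment. This is a reconstruction, not a verbatim config: the report's shared_buffers figure is assumed to be 60000 based on the "80,000 shared_buffers" remark about Test 5, and effective_cache_size is omitted because its printed value appears truncated.

```ini
# Reconstructed sketch of the Test 4 configuration (8.0beta4).
bgwriter_percent = 10
checkpoint_segments = 1024
checkpoint_timeout = 1800
default_statistics_target = 1000
max_connections = 140
shared_buffers = 60000          # assumption; printed as "6" in the report
stats_block_level = on
stats_command_string = on
stats_row_level = on
wal_buffers = 128
wal_sync_method = fsync
work_mem = 2048
```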
Re: [PERFORM] different io elevators in linux
On Mon, Oct 25, 2004 at 10:09:17AM -0700, Josh Berkus wrote: Bjorn, I haven't read much FAQs but has anyone done some benchmarks with different io schedulers in linux with postgresql? According to OSDL, using the deadline scheduler sometimes results in a roughly 5% boost to performance, and sometimes none, depending on the application. We use it for all testing, though, just in case. --Josh

Yes, we found that with an OLTP type workload, the 'as' (anticipatory) scheduler performs about 5% worse than the deadline scheduler, whereas in a DSS type workload there really isn't much difference. The former does a mix of reading and writing, while the latter is mostly reading. Mark ---(end of broadcast)--- TIP 8: explain analyze is your friend
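On 2.6 kernels the elevator can be checked and switched per-device at runtime (as root); on kernels without the sysfs knob, the elevator= boot parameter selects it globally. The device name below is a placeholder.

```shell
# Show the available schedulers; the active one appears in [brackets].
cat /sys/block/sda/queue/scheduler

# Switch the device to the deadline scheduler (root required).
echo deadline > /sys/block/sda/queue/scheduler

# Older kernels: add "elevator=deadline" to the kernel boot line instead.
```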
Re: [PERFORM] futex results with dbt-3
On Sun, Oct 17, 2004 at 09:39:33AM +0200, Manfred Spraul wrote: Neil wrote: [...] In any case, the futex patch uses the Linux 2.6 futex API to implement PostgreSQL spinlocks. Has anyone tried to replace the whole lwlock implementation with pthread_rwlock? At least for Linux with recent glibcs, pthread_rwlock is implemented with futexes, i.e. we would get fast lock handling without OS-specific hacks. Perhaps other OSes contain user-space pthread locks, too. Attached is an old patch. I tested it on a uniprocessor system a year ago and it didn't provide much difference, but perhaps the scalability is better. You'll have to add -lpthread to the library list for linking.

I've heard that simply linking to the pthreads libraries, regardless of whether you're using them or not, creates a significant overhead. Has anyone tried it for kicks? Mark ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] mmap (was First set of OSDL Shared Mem scalability results, some wierdness ...
On Fri, Oct 15, 2004 at 09:22:03PM -0400, Bruce Momjian wrote: Tom Lane wrote: Mark Wong [EMAIL PROTECTED] writes: I know where the do_sigaction is coming from in this particular case. Manfred Spraul tracked it to a pair of pgsignal calls in libpq. Commenting those two calls out virtually eliminates do_sigaction from the kernel profile for this workload. Hmm, I suppose those are the ones associated with suppressing SIGPIPE during send(). It looks to me like those should go away in 8.0 if you have compiled with ENABLE_THREAD_SAFETY ... exactly how is PG being built in the current round of tests? Yes, those calls are gone in 8.0 with --enable-thread-safety and were added specifically because of Manfred's reports.

Ok, I had the build commands changed for installing PostgreSQL in STP. The do_sigaction call isn't at the top of the profile anymore; here's a reference for those who are interested. It should have the same test parameters as the one Tom referenced a little earlier: http://khack.osdl.org/stp/298230/ Mark ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] mmap (was First set of OSDL Shared Mem scalability results, some wierdness ...
On Fri, Oct 15, 2004 at 01:09:01PM -0700, Sean Chittenden wrote: [snip] This ultimately depends on two things: how much time is spent copying buffers around in kernel memory, and how much advantage can be gained by freeing up the memory used by the backends to store the backend-local copies of the disk pages they use (and thus making that memory available to the kernel to use for additional disk buffering). Someone on IRC pointed me to some OSDL benchmarks, which broke down where time is being spent. Want to know what the most expensive part of PostgreSQL is? *drum roll* http://khack.osdl.org/stp/297960/profile/DBT_2_Profile-tick.sort

3967393 total 1.7735
2331284 default_idle 36426.3125
825716 do_sigaction 1290.1813
133126 __copy_from_user_ll 1040.0469
97780 __copy_to_user_ll 763.9062
43135 finish_task_switch 269.5938
30973 do_anonymous_page 62.4456
24175 scsi_request_fn 22.2197
23355 __do_softirq 121.6406
17039 __wake_up 133.1172
16527 __make_request 10.8730
9823 try_to_wake_up 13.6431
9525 generic_unplug_device 66.1458
8799 find_get_page 78.5625
7878 scsi_end_request 30.7734

Copying data to/from userspace and signal handling! Let's hear it for the need for mmap(2)!!! *crowd goes wild* [snip]

I know where the do_sigaction is coming from in this particular case. Manfred Spraul tracked it to a pair of pgsignal calls in libpq. Commenting those two calls out virtually eliminates do_sigaction from the kernel profile for this workload. I've lost track of the discussion over the past year, but I heard a rumor that it was finally addressed to some degree. I did understand it touched on a lot of other things, but can anyone summarize where that discussion has gone? Mark ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [Testperf-general] Re: [PERFORM] First set of OSDL Shared Memscalability results, some wierdness ...
On Fri, Oct 15, 2004 at 05:27:29PM -0400, Tom Lane wrote: Josh Berkus [EMAIL PROTECTED] writes: I suspect the reason recalc_sigpending_tsk is so high is that the original coding of PG_TRY involved saving and restoring the signal mask, which led to a whole lot of sigsetmask-type kernel calls. Is this test with beta3, or something older? Beta3, *without* Gavin or Neil's Futex patch. Hmm, in that case the cost deserves some further investigation. Can we find out just what that routine does and where it's being called from? There's a call-graph feature with oprofile as of version 0.8 with the opstack tool, but I'm having a terrible time figuring out why the output isn't doing the graphing part. Otherwise, I'd have that available already... Mark ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] mmap (was First set of OSDL Shared Mem scalability results, some wierdness ...
On Fri, Oct 15, 2004 at 05:37:50PM -0400, Tom Lane wrote: Mark Wong [EMAIL PROTECTED] writes: I know where the do_sigaction is coming from in this particular case. Manfred Spraul tracked it to a pair of pgsignal calls in libpq. Commenting those two calls out virtually eliminates do_sigaction from the kernel profile for this workload. Hmm, I suppose those are the ones associated with suppressing SIGPIPE during send(). It looks to me like those should go away in 8.0 if you have compiled with ENABLE_THREAD_SAFETY ... exactly how is PG being built in the current round of tests? Ah, yes. Ok. It's not being configured with any options. That'll be easy to remedy though. I'll get that change made and we can try again. Mark ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
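The remedy Mark mentions amounts to rebuilding 8.0 with the thread-safety flag, so libpq stops toggling the SIGPIPE handler around send():

```shell
# Rebuild PostgreSQL 8.0 with ENABLE_THREAD_SAFETY compiled in.
./configure --enable-thread-safety
make && make install
```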
Re: [Testperf-general] Re: [PERFORM] First set of OSDL Shared Memscalability results, some wierdness ...
On Fri, Oct 15, 2004 at 05:44:34PM -0400, Tom Lane wrote: Mark Wong [EMAIL PROTECTED] writes: On Fri, Oct 15, 2004 at 05:27:29PM -0400, Tom Lane wrote: Hmm, in that case the cost deserves some further investigation. Can we find out just what that routine does and where it's being called from? There's a call-graph feature with oprofile as of version 0.8 with the opstack tool, but I'm having a terrible time figuring out why the output isn't doing the graphing part. Otherwise, I'd have that available already... I was wondering if this might be associated with do_sigaction. do_sigaction is only 0.23 percent of the runtime according to the oprofile results: http://khack.osdl.org/stp/298124/oprofile/DBT_2_Profile-all.oprofile.txt but the profile results for the same run: http://khack.osdl.org/stp/298124/profile/DBT_2_Profile-tick.sort show do_sigaction very high and recalc_sigpending_tsk nowhere at all. Something funny there.

I have always attributed those kinds of differences to how readprofile and oprofile collect their data. Granted, I don't exactly understand it. Is anyone familiar with the differences between the two? Mark ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] O_DIRECT setting
On Thu, Sep 30, 2004 at 07:02:32PM +1200, Guy Thornley wrote: Sorry about the belated reply, it's been busy around here. Incidentally, postgres heap files suffer really, really bad fragmentation, which affects sequential scan operations (VACUUM, ANALYZE, REINDEX ...) quite drastically. We have in-house patches that somewhat alleviate this, but they are not release quality. Has anybody else suffered this? Any chance I could give those patches a try? I'm interested in seeing how they may affect our DBT-3 workload, which executes DSS-type queries. Like I said, the patches are not release quality... if you run them on a metadata journalling filesystem, without an 'ordered write' mode, it's possible to end up with corrupt heaps after a crash because of garbage data in the extended files. If/when we move to postgres 8 I'll try to ensure the patches get re-done with releasable quality. Guy Thornley

That's ok, we like to help test and prove things; we don't need patches to be release quality. Mark ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] O_DIRECT setting
On Thu, Sep 23, 2004 at 10:57:41AM -0400, Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: TODO has: * Consider use of open/fcntl(O_DIRECT) to minimize OS caching Should the item be removed? I think it's fine ;-) ... it says consider it, not do it. The point is that we could do with more research in this area, even if O_DIRECT per se is not useful. Maybe you could generalize the entry to investigate ways of fine-tuning OS caching behavior. regards, tom lane I talked to Jan a little about this during OSCon since Linux filesystems (ext2, ext3, etc) let you use O_DIRECT. He felt the only place where PostgreSQL may benefit from this now, without managing its own buffer first, would be with the log writer. I'm probably going to get this wrong, but he thought it would be interesting to try an experiment by taking X number of pages to be flushed, sort them (by age? where they go on disk?) and write them out. He thought this would be a relatively easy thing to try, a day or two of work. We'd really love to experiment with it. Mark ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
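Independent of the log-writer experiment Jan suggests, newer GNU dd gives a quick feel for O_DIRECT behavior without touching PostgreSQL at all. Note that direct I/O can fail outright on filesystems that don't support it (tmpfs, for example) and requires block-aligned transfer sizes.

```shell
# Compare buffered vs O_DIRECT sequential writes (GNU coreutils dd).
dd if=/dev/zero of=testfile bs=8k count=10000
dd if=/dev/zero of=testfile bs=8k count=10000 oflag=direct

# And reads, bypassing the page cache on the second run:
dd if=testfile of=/dev/null bs=8k
dd if=testfile of=/dev/null bs=8k iflag=direct
```

Buffered writes typically look much faster because they complete into the page cache; the direct numbers are closer to what the device actually sustains per request.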
Re: [PERFORM] O_DIRECT setting
On Mon, Sep 20, 2004 at 07:57:34PM +1200, Guy Thornley wrote: [snip] Incidentally, postgres heap files suffer really, really bad fragmentation, which affects sequential scan operations (VACUUM, ANALYZE, REINDEX ...) quite drastically. We have in-house patches that somewhat alleviate this, but they are not release quality. Has anybody else suffered this? Any chance I could give those patches a try? I'm interested in seeing how they may affect our DBT-3 workload, which executes DSS-type queries. Thanks, Mark ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] fsync vs open_sync
On Sun, Sep 05, 2004 at 12:16:42AM -0500, Steve Bergman wrote: On Sat, 2004-09-04 at 23:47 -0400, Christopher Browne wrote: The world rejoiced as [EMAIL PROTECTED] (Merlin Moncure) wrote: Ok, you were right. I made some tests and NTFS is just not very good in the general case. I've seen some benchmarks for Reiser4 that are just amazing. Reiser4 has been sounding real interesting. Are these independent benchmarks, or the benchmarketing at namesys.com? Note that the APPEND, MODIFY, and OVERWRITE phases have been turned off on the mongo tests and the other tests have been set to a lexical (non default for mongo) mode. I've done some mongo benchmarking myself and reiser4 loses to ext3 (data=ordered) in the excluded tests. APPEND phase performance is absolutely *horrible*. So they just turned off the phases in which reiser4 lost and published the remaining results as proof that reiser4 is the fastest filesystem. See: http://marc.theaimsgroup.com/?l=reiserfs&m=109363302000856 -Steve Bergman

Reiser4 also isn't optimized for lots of fsyncs (unless that's been done recently); I believe they mention fsync performance in their release notes. I've seen this dramatically hurt performance with our OLTP workload. -- Mark Wong - - [EMAIL PROTECTED] Open Source Development Lab Inc - A non-profit corporation 12725 SW Millikan Way - Suite 400 - Beaverton, OR 97005 (503) 626-2455 x 32 (office) (503) 626-2436 (fax) http://developer.osdl.org/markw/ ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] analyzing postgresql performance for dbt-2
On Tue, Oct 21, 2003 at 08:35:56PM -0400, Bruce Momjian wrote: [EMAIL PROTECTED] wrote: I'm running our DBT-2 workload against PostgreSQL 7.3.4 and I'm having some trouble figuring out what I should be looking for when I'm trying to tune the database. I have results for a decent baseline, but when I try to increase the load on the database, the performance drops. Nothing in the graphs (in the links listed later) sticks out to me so I'm wondering if there are other database statistics I should try to collect. Any suggestions would be great and let me know if I can answer any other questions. Here are a pair of results where I just raise the load on the database, where increasing the load increases the area of the database touched in addition to increasing the transaction rate. The overall metric increases somewhat, but the response time for most of the interactions also increases significantly: http://developer.osdl.org/markw/dbt2-pgsql/158/ [baseline] - load of 100 warehouses - metric 1249.65 http://developer.osdl.org/markw/dbt2-pgsql/149/ - load of 140 warehouses - metric 1323.90 I looked at these charts and they looked normal to me. It looked like the load increased until your computer was saturated. Is there something I am missing?

I've run some i/o tests, so I'm pretty sure I haven't saturated that. And it looks like I have almost 10% more processor time left. I do agree that it appears something might be saturated, I just don't know where to look... Thanks, Mark ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org