Re: [PERFORM] big database performance
On Thu, Jan 10, 2008 at 12:08:39PM +0100, Stephane Bailliez wrote:
> Jared Mauch wrote:
>> I do large databases in Pg, like 300GB/day of new data.
>
> That's impressive. Would it be possible to have details on your hardware,
> schema and configuration and type of usage?
>
> I'm sure there's something to learn in there for a lot of people (or at
> least for me)

http://archives.postgresql.org/pgsql-performance/2007-12/msg00372.php
http://archives.postgresql.org/pgsql-performance/2006-05/msg00444.php

The hardware specs are kinda boring since it's not I/O bound, so you could get the same disk performance out of some EIDE 7200 rpm disks (which I have done for testing). The current setup is a 4x Opteron 8218 (dual core) w/ 16G RAM. I have roughly 12TB usable disk space on the system connected via some SATA <-> FC thing our systems folks got us.

The problem I have is that linear CPU speed isn't enough, and there would be challenges splitting the workload across multiple CPUs. All my major reporting is done via pg_dump, and I'm pondering what would happen if I just removed Pg from the equation for the major reporting tasks entirely. I may see much better performance without the database [in my way]. I've not done that, as some types of data access would need to be significantly redone and I don't want to spend the time on that...

- Jared

--
Jared Mauch | pgp key available via finger from [EMAIL PROTECTED]
clue++; | http://puck.nether.net/~jared/ My statements are only mine.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] big database performance
On Thu, Jan 10, 2008 at 10:57:46AM +0200, Adrian Moisey wrote:
> What sort of information do you need from me?

Ratio of read vs write operations (select vs insert/copy), average number of indices per table, average table size (ANALYZE VERBOSE if you want to get into more detail). What is the process doing (e.g. in top, is it just on the CPU or is it blocking for I/O?). I/O information, from iostat -d. (You may need to build an iostat binary for Linux; the source is out there, I can give you a pointer if you need it.)

>> Is your problem with performance database reads? writes? (insert/copy?)
>> How many indices do you have?
>
> I think the problem is related to load. Everything is slow because there
> are way too many connections. So everything is making everything else
> slow. Not much detail, is it?
>
> We have 345 indices on the db.

If the tables are heavily indexed this could easily slow down insert performance. Taking a large dataset and adding a second index, postgres doesn't use threads to create the two indices on different cpus/cores in parallel. This could represent some of your performance difference. If you're doing a lot of write operations and fewer reads, perhaps the cost of an index isn't worth it in the CPU time spent creating it vs the amount of time for a seq scan.

- Jared
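A rough sketch of running the checks Jared lists. The guards and flags below are my own additions for illustration (top is assumed to come from procps, iostat from the sysstat package); the thread itself just names the tools.

```shell
# Sketch of the suggested checks; guards and package names are assumptions,
# not from the thread.

# CPU vs I/O: in top's batch output, a backend in state 'D' is blocked on disk.
if command -v top >/dev/null 2>&1; then
    cpu_report=$(top -b -n 1 | head -15)
else
    cpu_report="top not available"
fi
echo "$cpu_report"

# Per-device throughput; the thread uses plain 'iostat -d' (sysstat package).
if command -v iostat >/dev/null 2>&1; then
    io_report=$(iostat -d 1 2)
else
    io_report="iostat not available (install sysstat)"
fi
echo "$io_report"
```

On a box that is CPU-bound you'd expect the backend pinned in state 'R' with low device utilization; heavy 'D' states plus saturated devices in the iostat output point at I/O instead.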
Re: [PERFORM] big database performance
Jared Mauch wrote:
> I do large databases in Pg, like 300GB/day of new data.

That's impressive. Would it be possible to have details on your hardware, schema and configuration and type of usage?

I'm sure there's something to learn in there for a lot of people (or at least for me).

Cheers,

--
stephane
Re: [PERFORM] big database performance
Hi

>> I do large databases in Pg, like 300GB/day of new data.
>
> Need a lot more data on what you're having issues with.

That is big! What sort of information do you need from me?

> Is your problem with performance database reads? writes? (insert/copy?)
> How many indices do you have?

I think the problem is related to load. Everything is slow because there are way too many connections. So everything is making everything else slow. Not much detail, is it?

We have 345 indices on the db.

--
Adrian Moisey
System Administrator | CareerJunction | Your Future Starts Here.
Web: www.careerjunction.co.za | Email: [EMAIL PROTECTED]
Phone: +27 21 686 6820 | Mobile: +27 82 858 7830 | Fax: +27 21 686 6842
Re: [PERFORM] big database performance
Hi

>> Also, we're running the db on ext3 with noatime. Should I look at
>> changing or getting rid of journaling?
>
> No (unless you like really long fsck times). data=writeback is safe
> with PostgreSQL, though.

I tested that on a dev box, and I didn't notice a difference when using pgbench.

--
Adrian Moisey, System Administrator, CareerJunction
Re: [PERFORM] big database performance
On Wed, 2008-01-09 at 10:18 +0200, Adrian Moisey wrote:
> We recently converted to postgres (from mssql) and we're having
> performance issues.

I think you need to say more about what the performance issues actually are, otherwise everybody will just speculate you to death.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
Re: [PERFORM] big database performance
On Wed, 9 Jan 2008, Guillaume Smet wrote:
> On Jan 9, 2008 9:27 AM, Joshua D. Drake <[EMAIL PROTECTED]> wrote:
>> wal_sync_method = open_sync
>
> Do you recommend it in every situation or just because data are on a SAN?
> Do you have any numbers/real cases explaining this choice.

Sync writes are faster on Linux in every case I've ever tried, compared to the default config that does a write followed by a sync. With regular discs they're just a little faster. On some SAN configurations they're enormously faster, because SANs are often optimized to handle synchronous writes far more efficiently than write/sync ones. This is mainly because Oracle does its writes that way, so if you want good Oracle performance you have to handle sync writes well.

I have something on this topic I keep meaning to publish, but I got spooked about the potential to have silent problems or crashes when using open_sync due to a Linux kernel issue reported here:
http://archives.postgresql.org/pgsql-hackers/2007-10/msg01310.php

Certainly with that report floating out there I'd only recommend open_sync to people who are putting plenty of time into testing that their database is robust under load with that configuration before deploying it; I sure wouldn't just make that change on a production system just to see if it's faster.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [PERFORM] big database performance
On Wed, Jan 09, 2008 at 12:27:33AM -0800, Joshua D. Drake wrote:
> Adrian Moisey wrote:
>> Hi
>>
>> We recently converted to postgres (from mssql) and we're having
>> performance issues. Not all the issues are related to postgres, but we're
>> trying to sort everything out.

Hi,

I do large databases in Pg, like 300GB/day of new data. Need a lot more data on what you're having issues with.

Is your problem with performance database reads? writes? (insert/copy?) How many indices do you have?

- Jared
Re: [PERFORM] big database performance
Hi Joshua,

On Jan 9, 2008 9:27 AM, Joshua D. Drake <[EMAIL PROTECTED]> wrote:
> wal_sync_method = open_sync

Do you recommend it in every situation, or just because the data are on a SAN? Do you have any numbers/real cases explaining this choice?

Thanks.

--
Guillaume
Re: [PERFORM] big database performance
On Wed, 9 Jan 2008 08:16:48 -0800, Alan Hodgson <[EMAIL PROTECTED]> wrote:
> On Wednesday 09 January 2008, Adrian Moisey <[EMAIL PROTECTED]> wrote:
>>
>> Also, we're running the db on ext3 with noatime. Should I look at
>> changing or getting rid of journaling?
>
> No (unless you like really long fsck times). data=writeback is safe
> with PostgreSQL, though.

Except :)... for pg_xlog. If you have pg_xlog on a different partition, feel free to run ext2 for it.

Joshua D. Drake

--
The PostgreSQL Company: Since 1997, http://www.commandprompt.com/
Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD'
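The usual way to put pg_xlog on its own partition is to stop the server, move the directory onto the new filesystem, and leave a symlink behind in the data directory. The sketch below demonstrates only the relocation pattern against a throwaway scratch directory; all paths and the WAL segment name are placeholders, not from the thread. On a live system you would use your real data directory and the ext2 mount point, with the server stopped.

```shell
# Demonstrate the pg_xlog relocation pattern against throwaway directories.
# On a real system: stop postgres first, then use your real $PGDATA and the
# ext2 mount point instead of these placeholders.
scratch=$(mktemp -d)
mkdir -p "$scratch/pgdata/pg_xlog" "$scratch/ext2_mount"
touch "$scratch/pgdata/pg_xlog/000000010000000000000001"   # stand-in WAL segment

# Move the WAL directory onto the other filesystem, symlink it back:
mv "$scratch/pgdata/pg_xlog" "$scratch/ext2_mount/pg_xlog"
ln -s "$scratch/ext2_mount/pg_xlog" "$scratch/pgdata/pg_xlog"

# The data directory still sees pg_xlog at its old path:
ls "$scratch/pgdata/pg_xlog/"
# (remove the scratch directory with: rm -rf "$scratch")
```

The server follows the symlink transparently, so WAL writes land on the ext2 partition while everything else stays on the journaled filesystem.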
Re: [PERFORM] big database performance
On Wednesday 09 January 2008, Adrian Moisey <[EMAIL PROTECTED]> wrote:
>
> Also, we're running the db on ext3 with noatime. Should I look at
> changing or getting rid of journaling?

No (unless you like really long fsck times). data=writeback is safe with PostgreSQL, though.
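For reference, the mount options under discussion would look roughly like this in /etc/fstab. The device and mount point are placeholders, not from the thread, and note that on some setups data=writeback also has to be enabled on the filesystem itself (tune2fs) or, for a root filesystem, via boot-time rootflags.

```
# /etc/fstab sketch (device and mount point are placeholders):
# dedicated data partition for PostgreSQL, metadata-only journaling
/dev/sdb1  /var/lib/postgresql  ext3  noatime,data=writeback  0  2
```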
Re: [PERFORM] big database performance
Hi

>> We recently converted to postgres (from mssql) and we're having
>> performance issues. Not all the issues are related to postgres, but we're
>> trying to sort everything out. The server is running ubuntu Gutsy with the
>> database stored on an IBM SAN. It is a Dell box with dual quad core
>> 2.00GHz Xeons and 8GB RAM. The database is about 71GB in size.
>>
>> I've looked at the postgres config files and we've tweaked as much as our
>> knowledge allows. Can someone shed some light on the settings I should use?
>
> Umpf that isn't quite enough info :) but assuming you are running 8.2.x:

Sorry :/ Yes, we are running 8.2.x

> Start with 1GB shared_buffers (you may be able to go higher), 4MB work_mem,
> wal_sync_method = open_sync, checkpoint_segments = 30,
> default_statistics_target = 150, effective_cache_size = 6GB.

Our shared_buffers is 1GB. work_mem is 32MB. I changed wal_sync_method to open_sync (which helped a ton!)

Can someone please explain effective_cache_size? What cache does it want to know about? Linux cache?

Also, we're running the db on ext3 with noatime. Should I look at changing or getting rid of journaling?

> 64bit Gutsy?

Yes

> How is the SAN connected?

fibre

> What does mpstat 5 (3 iterations) say? Even better what does sar -A say
> over a 24 hour period?

I'll get these for you

--
Adrian Moisey, System Administrator, CareerJunction
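To answer the effective_cache_size question in passing: it is only a planner hint, not an allocation. It should reflect roughly how much memory is available for caching data, i.e. the OS page cache plus shared_buffers. A crude way to eyeball that on Linux; the heuristic, the fallback, and the 1GB figure (taken from the shared_buffers mentioned above) are my own illustration, not advice from the thread:

```shell
# Rough effective_cache_size estimate: OS page cache + shared_buffers.
# Heuristic for illustration only, not a recommendation from the thread.
shared_buffers_mb=1024   # this box runs with 1GB shared_buffers

if [ -r /proc/meminfo ]; then
    cached_mb=$(awk '/^Cached:/ {print int($2 / 1024)}' /proc/meminfo)
fi
[ -n "$cached_mb" ] || cached_mb=0   # fallback so the sketch still runs

est_mb=$((cached_mb + shared_buffers_mb))
echo "effective_cache_size guess: ${est_mb}MB"
```

A bigger value makes the planner more willing to choose index scans, since it assumes more of the data is already in memory; it never causes PostgreSQL to use more RAM.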
Re: [PERFORM] big database performance
Hi,

What segment size do you use for the SAN partition? This could also be a bottleneck for db servers.

Frank
Re: [PERFORM] big database performance
Adrian Moisey wrote:
> Hi
>
> We recently converted to postgres (from mssql) and we're having
> performance issues. Not all the issues are related to postgres, but we're
> trying to sort everything out. The server is running ubuntu Gutsy with the
> database stored on an IBM SAN. It is a Dell box with dual quad core
> 2.00GHz Xeons and 8GB RAM. The database is about 71GB in size.
>
> I've looked at the postgres config files and we've tweaked as much as our
> knowledge allows. Can someone shed some light on the settings I should use?

Umpf that isn't quite enough info :) but assuming you are running 8.2.x:

Start with 1GB shared_buffers (you may be able to go higher), 4MB work_mem, wal_sync_method = open_sync, checkpoint_segments = 30, default_statistics_target = 150, effective_cache_size = 6GB. Restart, VACUUM ANALYZE VERBOSE, post back last 4 lines of output.

Other good items to know: 64bit Gutsy? How is the SAN connected? What does mpstat 5 (3 iterations) say? Even better, what does sar -A say over a 24 hour period?

Sincerely,

Joshua D. Drake
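Collected as a postgresql.conf fragment, that starting point for an 8GB box looks like the following (8.2-era parameter names; the values are the suggestions from this message, a starting point to measure from rather than universal tuning advice, and the comments are mine):

```
# postgresql.conf starting point for an 8GB RAM box, PostgreSQL 8.2.x
shared_buffers = 1GB                # may be able to go higher
work_mem = 4MB                      # per sort/hash operation, per backend
wal_sync_method = open_sync         # test carefully before production use
checkpoint_segments = 30
default_statistics_target = 150
effective_cache_size = 6GB          # planner hint: OS cache + shared_buffers
```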
[PERFORM] big database performance
Hi

We recently converted to postgres (from mssql) and we're having performance issues. Not all the issues are related to postgres, but we're trying to sort everything out.

The server is running Ubuntu Gutsy with the database stored on an IBM SAN. It is a Dell box with dual quad core 2.00GHz Xeons and 8GB RAM. The database is about 71GB in size.

I've looked at the postgres config files and we've tweaked as much as our knowledge allows. Can someone shed some light on the settings I should use?

Thanks in advance

--
Adrian Moisey, System Administrator, CareerJunction