Re: [rt-users] Migrating from Postgres to MySQL
On Tue, Jul 28, 2009 at 3:56 PM, Kenneth Marshall k...@rice.edu wrote: Kage, The main advantage is gained by avoiding I/O through the virtual disk. The layout of the virtual disk tends to turn most I/O into random I/O, even I/O that starts as sequential. The factor of 10 performance difference between random/sequential I/O causes the majority of the performance problem. I have not had personal experience with using an NFS mount point to run a database so I cannot really comment on that. Good luck with your evaluation. You're trading head-seeking latencies for network latencies, and those are almost certainly higher. Hosting your database server binaries and such forth in NFS is possible, though again, not optimal both from a performance and risk standpoint (NFS server drops, your DB binaries vanish, your DB server drops even though the machine hosting it was fine). I think hosting databases in NFS can cause serious problems - I seem to remember older versions of mysql wouldn't support that. I don't know if newer ones do...but I do know in the *very large* IT environment I worked in, all database servers hosted the DBs on their local disks or in filesystems hosted on disks (SANS?) attached via fibre-channel. Could solid-state drives side-step the random-access issue with virtualization, or at least make it suck less? Based on how many people I know who have said Wow, my SSD died. I thought those were supposed to be more reliable? ... I wouldn't bet my service uptime on it. ;) -Rob ___ http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users Community help: http://wiki.bestpractical.com Commercial support: sa...@bestpractical.com Discover RT's hidden secrets with RT Essentials from O'Reilly Media. Buy a copy at http://rtbook.bestpractical.com
Re: [rt-users] Migrating from Postgres to MySQL
On Jul 29, 2009, at 7:10 AM, Robert Nesius wrote: On Tue, Jul 28, 2009 at 3:56 PM, Kenneth Marshall k...@rice.edu wrote: Kage, The main advantage is gained by avoiding I/O through the virtual disk. The layout of the virtual disk tends to turn most I/O into random I/O, even I/O that starts as sequential. The factor of 10 performance difference between random/sequential I/O causes the majority of the performance problem. I have not had personal experience with using an NFS mount point to run a database so I cannot really comment on that. Good luck with your evaluation. You're trading head-seeking latencies for network latencies, No. If this were a standard host environment, that would be true. But in a virtual environment, there is the overhead of the disk create/ maintenance/update processes of the virtualization engine which multiply the overhead of the disk. Just run a disk benchmarking utility inside a VE running under any platform that uses disk images (vmware, xen, parallels, etc...) and then run those same tests on the host node. The difference in performance is often an order of magnitude slower for the virtual disks. Contrast that with NFS performance, which has a small fixed overhead imposed by the network (even smaller if you use jumbo frames). If you were using a platform with a robust NFS implementation (Solaris, FreeBSD), I'd put money on the database performing better on NFS than inside most virtual machines. If you're using NFS with Linux, you will certainly have performance issues that you won't be able to get past. If the virtualization environment provides raw disk access to the VE, my bet is off. Examples of virtualization platforms that [can] do this are FreeBSD jails and Linux OpenVZ. On several occasions, I have built VEs for MySQL and mounted a dedicated partition in the VE. Assuming you've given adequate resources to the DB VE, that works as well as a dedicated machine. When I arrived at my current position, the SA team had put the databases into the VEs that needed them, along with the apps that accessed them. Despite having 6 servers to spread the load across, they had recurring database performance issues (a few times a week), particularly with RT. I resolved all the DB issues by building a dedicated machine with 4 disks (two battery backed RAID-1 mirrors) all the databases to it. The databases have dedicated spindles as does the OS logging. Despite the resistance to the all our DB eggs in one basket approach, the wisdom of that choice is now plainly evident. All the performance problems went away and haven't returned. and those are almost certainly higher. Hosting your database server binaries and such forth in NFS is possible, though again, not optimal both from a performance and risk standpoint (NFS server drops, your DB binaries vanish, your DB server drops even though the machine hosting it was fine). That's not how NFS works. If the NFS server vanishes, the NFS client hangs and waits for it to return. That is a design feature of NFS. The consistency of the databases is entirely dependency on the disk subsystem of the file server. I think hosting databases in NFS can cause serious problems - I seem to remember older versions of mysql wouldn't support that. I don't know if newer ones do...but I do know in the very large IT environment I worked in, all database servers hosted the DBs on their local disks or in filesystems hosted on disks (SANS?) attached via fibre-channel. I would never host a database server on anything but RAID protected disks with block level access (ie, local disks, iSCSI, etc). Database engines have been explicitly designed and optimized for this type of disk backend. That is starting to change, as a few new DB engines that are designed for network storage (like SimpleDB). But none I know of are production-ready. Could solid-state drives side-step the random-access issue with virtualization, or at least make it suck less? Haven't tried it yet, but my guess is no. However, I have put databases on SSD disks with excellent results. Based on how many people I know who have said Wow, my SSD died. I thought those were supposed to be more reliable? ... I wouldn't bet my service uptime on it. ;) There's this thing called RAID, that protects against disk failures It works quite well with SSD disks and delivers performance numbers for a couple thousand bucks that would otherwise take a $150,000+ SAN. Matt ___ http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users Community help: http://wiki.bestpractical.com Commercial support: sa...@bestpractical.com Discover RT's hidden secrets with RT Essentials from O'Reilly Media. Buy a copy at http://rtbook.bestpractical.com
Re: [rt-users] Migrating from Postgres to MySQL
Hi Matt, Raid is not the end-all be-all for disk safety, especially when you step into terabyte class computing, sorry I am taking this a bit off topic. While RAID has it's bonuses, there are drawbacks as well, take your standard RAID 5 setup, 4 Disks, 3 active, 1 Hot Spare. Now lets say that Disk number 2 decided it was going to release it's smoke to the world (never a good thing), now your array is still alive and it is starting to rebuild onto disk 4 to make up for the death of disk 2. During the rebuild process Disk 1 comes across a bad sector, poof, your data is gone. Just a word of warning, don't put all your data safety eggs into the RAID basket. Otherwise I agree that running via NFS from a virtualized server would probably have perfromance gains over running in the virtual invironment. Thanks, Bill On Wed, Jul 29, 2009 at 10:26 AM, Matt Simerson m...@corp.spry.com wrote: On Jul 29, 2009, at 7:10 AM, Robert Nesius wrote: On Tue, Jul 28, 2009 at 3:56 PM, Kenneth Marshall k...@rice.edu wrote: Kage, The main advantage is gained by avoiding I/O through the virtual disk. The layout of the virtual disk tends to turn most I/O into random I/O, even I/O that starts as sequential. The factor of 10 performance difference between random/sequential I/O causes the majority of the performance problem. I have not had personal experience with using an NFS mount point to run a database so I cannot really comment on that. Good luck with your evaluation. You're trading head-seeking latencies for network latencies, No. If this were a standard host environment, that would be true. But in a virtual environment, there is the overhead of the disk create/maintenance/update processes of the virtualization engine which multiply the overhead of the disk. Just run a disk benchmarking utility inside a VE running under any platform that uses disk images (vmware, xen, parallels, etc...) and then run those same tests on the host node. The difference in performance is often an order of magnitude slower for the virtual disks. Contrast that with NFS performance, which has a small fixed overhead imposed by the network (even smaller if you use jumbo frames). If you were using a platform with a robust NFS implementation (Solaris, FreeBSD), I'd put money on the database performing better on NFS than inside most virtual machines. If you're using NFS with Linux, you will certainly have performance issues that you won't be able to get past. If the virtualization environment provides raw disk access to the VE, my bet is off. Examples of virtualization platforms that [can] do this are FreeBSD jails and Linux OpenVZ. On several occasions, I have built VEs for MySQL and mounted a dedicated partition in the VE. Assuming you've given adequate resources to the DB VE, that works as well as a dedicated machine. When I arrived at my current position, the SA team had put the databases into the VEs that needed them, along with the apps that accessed them. Despite having 6 servers to spread the load across, they had recurring database performance issues (a few times a week), particularly with RT. I resolved all the DB issues by building a dedicated machine with 4 disks (two battery backed RAID-1 mirrors) all the databases to it. The databases have dedicated spindles as does the OS logging. Despite the resistance to the all our DB eggs in one basket approach, the wisdom of that choice is now plainly evident. All the performance problems went away and haven't returned. and those are almost certainly higher. Hosting your database server binaries and such forth in NFS is possible, though again, not optimal both from a performance and risk standpoint (NFS server drops, your DB binaries vanish, your DB server drops even though the machine hosting it was fine). That's not how NFS works. If the NFS server vanishes, the NFS client hangs and waits for it to return. That is a design feature of NFS. The consistency of the databases is entirely dependency on the disk subsystem of the file server. I think hosting databases in NFS can cause serious problems - I seem to remember older versions of mysql wouldn't support that. I don't know if newer ones do...but I do know in the *very large* IT environment I worked in, all database servers hosted the DBs on their local disks or in filesystems hosted on disks (SANS?) attached via fibre-channel. I would never host a database server on anything but RAID protected disks with block level access (ie, local disks, iSCSI, etc). Database engines have been explicitly designed and optimized for this type of disk backend. That is starting to change, as a few new DB engines that are designed for network storage (like SimpleDB). But none I know of are production-ready. Could solid-state drives side-step the random-access issue with virtualization, or at least make it suck less? Haven't tried it yet, but my guess is no. However, I have put
Re: [rt-users] Migrating from Postgres to MySQL
On Jul 29, 2009, at 10:44 AM, William Graboyes wrote: Hi Matt, Raid is not the end-all be-all for disk safety, especially when you step into terabyte class computing, sorry I am taking this a bit off topic. While RAID has it's bonuses, there are drawbacks as well, take your standard RAID 5 setup, 4 Disks, 3 active, 1 Hot Spare. Now lets say that Disk number 2 decided it was going to release it's smoke to the world (never a good thing), now your array is still alive and it is starting to rebuild onto disk 4 to make up for the death of disk 2. During the rebuild process Disk 1 comes across a bad sector, poof, your data is gone. Just a word of warning, don't put all your data safety eggs into the RAID basket. RAID poorly implemented is a placebo, and is often less reliable than a single disk. RAID properly implemented IS the end-all be-all of disk safety. Your chosen argument is only valid against RAID level 5, which would be a very poor choice for a database application. RAID-5 is a poor choice in any environment where the data set is volatile and valuable. Especially when you consider the number of hours (or days) it takes to rebuild a hot spare into the RAID-5 set. (HINT: test that before deployment!) During that rebuild window, your system performance is heavily degraded and extremely vulnerable. If the performance of your RAID system is halved, is that sufficient for your application, or is your system effectively down during the rebuild period? (HINT: test before deployment!). But a RAID-1 or RAID 10 can be extremely robust, remaining online and performing optimally during multiple catastrophic disk failures. You can mirror the data to as many spindles as you need to insure data integrity. When disks fail, rebuilding a mirror disk usually takes less than an hour. The systems engineer has to choose between data integrity, performance, and storage efficiency. One of my systems is about 2/3 full with 24.64 TB of data. Does that count as terabyte class computing? :) I'm using ZFS to manage it. :) :) Matt ___ http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users Community help: http://wiki.bestpractical.com Commercial support: sa...@bestpractical.com Discover RT's hidden secrets with RT Essentials from O'Reilly Media. Buy a copy at http://rtbook.bestpractical.com
[rt-users] Migrating from Postgres to MySQL
Well, basically, Postgres is seemingly a bucket of fail specifically for the RT system I maintain. We run RT/Postgres on a Ubuntu Hardy LTS virtual machine running 2 virtual CPUs and the KVM maximum of 2GB RAM, but Postgres ends up using so much in resources that it actually kills the entire KVM system on the virtual host. I know tuning of Postgres can fix this, but we've experienced issues with Postgres on many other virtual machines, leading us to believe that it may simply be an issue with Postgres not getting along well with KVM (which is a somewhat known issue). Ideally, I want to migrate my RT database from Postgres to MySQL, and preserve ALL data (tickets, attachments, links, everything). Is there an guaranteed to work method to do this that anyone knows of? Thanks! Specs: Postgres: 8.3 RT: 3.6.5 MySQL candidate on Hardy: 5.0.51a-3ubuntu5.4 Any help would be greatly appreciated. Thanks! -- ~ Kage http://vitund.com http://hackthissite.org ___ http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users Community help: http://wiki.bestpractical.com Commercial support: sa...@bestpractical.com Discover RT's hidden secrets with RT Essentials from O'Reilly Media. Buy a copy at http://rtbook.bestpractical.com
Re: [rt-users] Migrating from Postgres to MySQL
On Tue, Jul 28, 2009 at 03:35:12PM -0400, Kage wrote: Well, basically, Postgres is seemingly a bucket of fail specifically for the RT system I maintain. We run RT/Postgres on a Ubuntu Hardy LTS virtual machine running 2 virtual CPUs and the KVM maximum of 2GB RAM, but Postgres ends up using so much in resources that it actually kills the entire KVM system on the virtual host. I know tuning of Postgres can fix this, but we've experienced issues with Postgres on many other virtual machines, leading us to believe that it may simply be an issue with Postgres not getting along well with KVM (which is a somewhat known issue). Ideally, I want to migrate my RT database from Postgres to MySQL, and preserve ALL data (tickets, attachments, links, everything). Is there an guaranteed to work method to do this that anyone knows of? Thanks! Specs: Postgres: 8.3 RT: 3.6.5 MySQL candidate on Hardy: 5.0.51a-3ubuntu5.4 Any help would be greatly appreciated. Thanks! -- ~ Kage http://vitund.com http://hackthissite.org Hi Kage, I have had trouble with VMs with any I/O bound system. Have you tested your system with MySQL? You may just be swapping one I/O problem for another. That being said, for anything but the most trivial of DBs, tuning your backend DB be it PostgreSQL, MySQL, Oracle,... is critical for getting good performance. Another data point, I have had a simple repeated file read of a DNS zone file for an XFER tank a VM. VMs work well for CPU intensive tasks with small amount of I/O. For any larger amount of I/O, using real storage is a must. Good luck with whichever backend you choose. A big plus for both the Oracle and PostgreSQL backend is the availability of full-text indexing which allows content and attachment searches to use an index and not take your DB with a full table scan. Regards, Ken ___ http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users Community help: http://wiki.bestpractical.com Commercial support: sa...@bestpractical.com Discover RT's hidden secrets with RT Essentials from O'Reilly Media. Buy a copy at http://rtbook.bestpractical.com
Re: [rt-users] Migrating from Postgres to MySQL
(didn't do a reply-all, sorry if you get this twice, Ken) What, if any, performance gains exist by using an NFS mountpoint for the Postgres database within a virtual machine? In this case, I/O is not directly placed on the virtual disk (unless the actual act of I/O itself, regardless of what is being acted upon (virtual disk or not), is also an issue). Could this potentially yield any performance increase? Also, no, I have not tried MySQL yet with our RT setup. I'd like to move our database to MySQL to do some testing with that and gather realistic metrics of here's our data in Postgres vs. MySQL. On Tue, Jul 28, 2009 at 3:51 PM, Kenneth Marshallk...@rice.edu wrote: On Tue, Jul 28, 2009 at 03:35:12PM -0400, Kage wrote: Well, basically, Postgres is seemingly a bucket of fail specifically for the RT system I maintain. We run RT/Postgres on a Ubuntu Hardy LTS virtual machine running 2 virtual CPUs and the KVM maximum of 2GB RAM, but Postgres ends up using so much in resources that it actually kills the entire KVM system on the virtual host. I know tuning of Postgres can fix this, but we've experienced issues with Postgres on many other virtual machines, leading us to believe that it may simply be an issue with Postgres not getting along well with KVM (which is a somewhat known issue). Ideally, I want to migrate my RT database from Postgres to MySQL, and preserve ALL data (tickets, attachments, links, everything). Is there an guaranteed to work method to do this that anyone knows of? Thanks! Specs: Postgres: 8.3 RT: 3.6.5 MySQL candidate on Hardy: 5.0.51a-3ubuntu5.4 Any help would be greatly appreciated. Thanks! -- ~ Kage http://vitund.com http://hackthissite.org Hi Kage, I have had trouble with VMs with any I/O bound system. Have you tested your system with MySQL? You may just be swapping one I/O problem for another. That being said, for anything but the most trivial of DBs, tuning your backend DB be it PostgreSQL, MySQL, Oracle,... is critical for getting good performance. Another data point, I have had a simple repeated file read of a DNS zone file for an XFER tank a VM. VMs work well for CPU intensive tasks with small amount of I/O. For any larger amount of I/O, using real storage is a must. Good luck with whichever backend you choose. A big plus for both the Oracle and PostgreSQL backend is the availability of full-text indexing which allows content and attachment searches to use an index and not take your DB with a full table scan. Regards, Ken -- ~ Kage http://vitund.com http://hackthissite.org ___ http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users Community help: http://wiki.bestpractical.com Commercial support: sa...@bestpractical.com Discover RT's hidden secrets with RT Essentials from O'Reilly Media. Buy a copy at http://rtbook.bestpractical.com
Re: [rt-users] Migrating from Postgres to MySQL
Kage, The main advantage is gained by avoiding I/O through the virtual disk. The layout of the virtual disk tends to turn most I/O into random I/O, even I/O that starts as sequential. The factor of 10 performance difference between random/sequential I/O causes the majority of the performance problem. I have not had personal experience with using an NFS mount point to run a database so I cannot really comment on that. Good luck with your evaluation. Cheers, Ken On Tue, Jul 28, 2009 at 04:12:55PM -0400, Kage wrote: (didn't do a reply-all, sorry if you get this twice, Ken) What, if any, performance gains exist by using an NFS mountpoint for the Postgres database within a virtual machine? ?In this case, I/O is not directly placed on the virtual disk (unless the actual act of I/O itself, regardless of what is being acted upon (virtual disk or not), is also an issue). ?Could this potentially yield any performance increase? Also, no, I have not tried MySQL yet with our RT setup. ?I'd like to move our database to MySQL to do some testing with that and gather realistic metrics of here's our data in Postgres vs. MySQL. On Tue, Jul 28, 2009 at 3:51 PM, Kenneth Marshallk...@rice.edu wrote: On Tue, Jul 28, 2009 at 03:35:12PM -0400, Kage wrote: Well, basically, Postgres is seemingly a bucket of fail specifically for the RT system I maintain. ?We run RT/Postgres on a Ubuntu Hardy LTS virtual machine running 2 virtual CPUs and the KVM maximum of 2GB RAM, but Postgres ends up using so much in resources that it actually kills the entire KVM system on the virtual host. ?I know tuning of Postgres can fix this, but we've experienced issues with Postgres on many other virtual machines, leading us to believe that it may simply be an issue with Postgres not getting along well with KVM (which is a somewhat known issue). ?Ideally, I want to migrate my RT database from Postgres to MySQL, and preserve ALL data (tickets, attachments, links, everything). ?Is there an guaranteed to work method to do this that anyone knows of? ?Thanks! Specs: Postgres: 8.3 RT: 3.6.5 MySQL candidate on Hardy: 5.0.51a-3ubuntu5.4 Any help would be greatly appreciated. ?Thanks! -- ~ Kage http://vitund.com http://hackthissite.org Hi Kage, I have had trouble with VMs with any I/O bound system. Have you tested your system with MySQL? You may just be swapping one I/O problem for another. That being said, for anything but the most trivial of DBs, tuning your backend DB be it PostgreSQL, MySQL, Oracle,... is critical for getting good performance. Another data point, I have had a simple repeated file read of a DNS zone file for an XFER tank a VM. VMs work well for CPU intensive tasks with small amount of I/O. For any larger amount of I/O, using real storage is a must. Good luck with whichever backend you choose. A big plus for both the Oracle and PostgreSQL backend is the availability of full-text indexing which allows content and attachment searches to use an index and not take your DB with a full table scan. Regards, Ken -- ~ Kage http://vitund.com http://hackthissite.org ___ http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users Community help: http://wiki.bestpractical.com Commercial support: sa...@bestpractical.com Discover RT's hidden secrets with RT Essentials from O'Reilly Media. Buy a copy at http://rtbook.bestpractical.com