Re: [rt-users] Migrating from Postgres to MySQL

2009-07-29 Thread Robert Nesius
On Tue, Jul 28, 2009 at 3:56 PM, Kenneth Marshall k...@rice.edu wrote:

 Kage,

 The main advantage is gained by avoiding I/O through the virtual
 disk. The layout of the virtual disk tends to turn most I/O into
 random I/O, even I/O that starts as sequential. The factor of
 10 performance difference between random/sequential I/O causes
 the majority of the performance problem. I have not had personal
 experience with using an NFS mount point to run a database so I
 cannot really comment on that. Good luck with your evaluation.


You're trading head-seeking latencies for network latencies, and those are
almost certainly higher.  Hosting your database server binaries and such
forth in NFS is possible, though again, not optimal both from a performance
and risk standpoint (NFS server drops, your DB binaries vanish, your DB
server drops even though the machine hosting it was fine).

I think hosting databases in NFS can cause serious problems - I seem to
remember older versions of mysql wouldn't support that.  I don't know if
newer ones do...but I do know in the *very large* IT environment I worked
in, all database servers hosted the DBs on their local disks or in
filesystems hosted on disks (SANS?) attached via fibre-channel.

Could solid-state drives side-step the random-access issue with
virtualization, or at least make it suck less? Based on how many people I
know who have said Wow, my SSD died.  I thought those were supposed to be
more reliable? ... I wouldn't bet my service uptime on it. ;)

-Rob
___
http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users

Community help: http://wiki.bestpractical.com
Commercial support: sa...@bestpractical.com


Discover RT's hidden secrets with RT Essentials from O'Reilly Media. 
Buy a copy at http://rtbook.bestpractical.com

Re: [rt-users] Migrating from Postgres to MySQL

2009-07-29 Thread Matt Simerson


On Jul 29, 2009, at 7:10 AM, Robert Nesius wrote:

On Tue, Jul 28, 2009 at 3:56 PM, Kenneth Marshall k...@rice.edu  
wrote:

Kage,

The main advantage is gained by avoiding I/O through the virtual
disk. The layout of the virtual disk tends to turn most I/O into
random I/O, even I/O that starts as sequential. The factor of
10 performance difference between random/sequential I/O causes
the majority of the performance problem. I have not had personal
experience with using an NFS mount point to run a database so I
cannot really comment on that. Good luck with your evaluation.

You're trading head-seeking latencies for network latencies,


No.

If this were a standard host environment, that would be true. But in a  
virtual environment, there is the overhead of the disk create/ 
maintenance/update processes of the virtualization engine which  
multiply the overhead of the disk. Just run a disk benchmarking  
utility inside a VE running under any platform that uses disk images  
(vmware, xen, parallels, etc...) and then run those same tests on the  
host node. The difference in performance is often an order of  
magnitude slower for the virtual disks.


Contrast that with NFS performance, which has a small fixed overhead  
imposed by the network (even smaller if you use jumbo frames). If you  
were using a platform with a robust NFS implementation (Solaris,  
FreeBSD), I'd put money on the database performing better on NFS than  
inside most virtual machines. If you're using NFS with Linux, you will  
certainly have performance issues that you won't be able to get past.


If the virtualization environment provides raw disk access to the VE,  
my bet is off. Examples of virtualization platforms that [can] do this  
are FreeBSD jails and Linux OpenVZ. On several occasions, I have built  
VEs for MySQL and mounted a dedicated partition in the VE. Assuming  
you've given adequate resources to the DB VE, that works as well as a  
dedicated machine.


When I arrived at my current position, the SA team had put the  
databases into the VEs that needed them, along with the apps that  
accessed them. Despite having 6 servers to spread the load across,  
they had recurring database performance issues (a few times a week),  
particularly with RT. I resolved all the DB issues by building a  
dedicated machine with 4 disks (two battery backed RAID-1 mirrors) all  
the databases to it. The databases have dedicated spindles as does the  
OS  logging. Despite the resistance to the all our DB eggs in one  
basket approach, the wisdom of that choice is now plainly evident.  
All the performance problems went away and haven't returned.


and those are almost certainly higher.  Hosting your database server  
binaries and such forth in NFS is possible, though again, not  
optimal both from a performance and risk standpoint (NFS server  
drops, your DB binaries vanish, your DB server drops even though the  
machine hosting it was fine).


That's not how NFS works. If the NFS server vanishes, the NFS client  
hangs and waits for it to return. That is a design feature of NFS. The  
consistency of the databases is entirely dependency on the disk  
subsystem of the file server.


I think hosting databases in NFS can cause serious problems - I seem  
to remember older versions of mysql wouldn't support that.  I don't  
know if newer ones do...but I do know in the very large IT  
environment I worked in, all database servers hosted the DBs on  
their local disks or in filesystems hosted on disks (SANS?) attached  
via fibre-channel.


I would never host a database server on anything but RAID protected  
disks with block level access (ie, local disks, iSCSI, etc). Database  
engines have been explicitly designed and optimized for this type of  
disk backend.  That is starting to change, as a few new DB engines  
that are designed for network storage (like SimpleDB). But none I know  
of are production-ready.


Could solid-state drives side-step the random-access issue with  
virtualization, or at least make it suck less?


Haven't tried it yet, but my guess is no.  However, I have put  
databases on SSD disks with excellent results.


Based on how many people I know who have said Wow, my SSD died.  I  
thought those were supposed to be more reliable? ... I wouldn't bet  
my service uptime on it. ;)


There's this thing called RAID, that protects against disk  
failures It works quite well with SSD disks and delivers  
performance numbers for a couple thousand bucks that would otherwise  
take a $150,000+ SAN.


Matt

___
http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users

Community help: http://wiki.bestpractical.com
Commercial support: sa...@bestpractical.com


Discover RT's hidden secrets with RT Essentials from O'Reilly Media. 
Buy a copy at http://rtbook.bestpractical.com

Re: [rt-users] Migrating from Postgres to MySQL

2009-07-29 Thread William Graboyes
Hi Matt,

Raid is not the end-all be-all for disk safety, especially when you step
into terabyte class computing, sorry I am taking this a bit off topic. While
RAID has it's bonuses, there are drawbacks as well, take your standard RAID
5 setup, 4 Disks, 3 active, 1 Hot Spare.  Now lets say that Disk number 2
decided it was going to release it's smoke to the world (never a good
thing), now your array is still alive and it is starting to rebuild onto
disk 4 to make up for the death of disk 2.  During the rebuild process Disk
1 comes across a bad sector, poof, your data is gone.  Just a word of
warning, don't put all your data safety eggs into the RAID basket.

Otherwise I agree that running via NFS from a virtualized server would
probably have perfromance gains over running in the virtual invironment.

Thanks,
Bill

On Wed, Jul 29, 2009 at 10:26 AM, Matt Simerson m...@corp.spry.com wrote:


 On Jul 29, 2009, at 7:10 AM, Robert Nesius wrote:

 On Tue, Jul 28, 2009 at 3:56 PM, Kenneth Marshall k...@rice.edu wrote:

 Kage,

 The main advantage is gained by avoiding I/O through the virtual
 disk. The layout of the virtual disk tends to turn most I/O into
 random I/O, even I/O that starts as sequential. The factor of
 10 performance difference between random/sequential I/O causes
 the majority of the performance problem. I have not had personal
 experience with using an NFS mount point to run a database so I
 cannot really comment on that. Good luck with your evaluation.


 You're trading head-seeking latencies for network latencies,


 No.

 If this were a standard host environment, that would be true. But in a
 virtual environment, there is the overhead of the disk
 create/maintenance/update processes of the virtualization engine which
 multiply the overhead of the disk. Just run a disk benchmarking utility
 inside a VE running under any platform that uses disk images (vmware, xen,
 parallels, etc...) and then run those same tests on the host node. The
 difference in performance is often an order of magnitude slower for the
 virtual disks.

 Contrast that with NFS performance, which has a small fixed overhead
 imposed by the network (even smaller if you use jumbo frames). If you were
 using a platform with a robust NFS implementation (Solaris, FreeBSD), I'd
 put money on the database performing better on NFS than inside most virtual
 machines. If you're using NFS with Linux, you will certainly have
 performance issues that you won't be able to get past.

 If the virtualization environment provides raw disk access to the VE, my
 bet is off. Examples of virtualization platforms that [can] do this are
 FreeBSD jails and Linux OpenVZ. On several occasions, I have built VEs for
 MySQL and mounted a dedicated partition in the VE. Assuming you've given
 adequate resources to the DB VE, that works as well as a dedicated machine.

 When I arrived at my current position, the SA team had put the databases
 into the VEs that needed them, along with the apps that accessed them.
 Despite having 6 servers to spread the load across, they had recurring
 database performance issues (a few times a week), particularly with RT. I
 resolved all the DB issues by building a dedicated machine with 4 disks (two
 battery backed RAID-1 mirrors) all the databases to it. The databases have
 dedicated spindles as does the OS  logging. Despite the resistance to the
 all our DB eggs in one basket approach, the wisdom of that choice is now
 plainly evident. All the performance problems went away and haven't
 returned.

 and those are almost certainly higher.  Hosting your database server
 binaries and such forth in NFS is possible, though again, not optimal both
 from a performance and risk standpoint (NFS server drops, your DB binaries
 vanish, your DB server drops even though the machine hosting it was fine).


 That's not how NFS works. If the NFS server vanishes, the NFS client hangs
 and waits for it to return. That is a design feature of NFS. The consistency
 of the databases is entirely dependency on the disk subsystem of the file
 server.

 I think hosting databases in NFS can cause serious problems - I seem to
 remember older versions of mysql wouldn't support that.  I don't know if
 newer ones do...but I do know in the *very large* IT environment I worked
 in, all database servers hosted the DBs on their local disks or in
 filesystems hosted on disks (SANS?) attached via fibre-channel.


 I would never host a database server on anything but RAID protected disks
 with block level access (ie, local disks, iSCSI, etc). Database engines have
 been explicitly designed and optimized for this type of disk backend.  That
 is starting to change, as a few new DB engines that are designed for network
 storage (like SimpleDB). But none I know of are production-ready.

 Could solid-state drives side-step the random-access issue with
 virtualization, or at least make it suck less?


 Haven't tried it yet, but my guess is no.  However, I have put 

Re: [rt-users] Migrating from Postgres to MySQL

2009-07-29 Thread Matt Simerson

On Jul 29, 2009, at 10:44 AM, William Graboyes wrote:

 Hi Matt,

 Raid is not the end-all be-all for disk safety, especially when you  
 step into terabyte class computing, sorry I am taking this a bit off  
 topic. While RAID has it's bonuses, there are drawbacks as well,  
 take your standard RAID 5 setup, 4 Disks, 3 active, 1 Hot Spare.   
 Now lets say that Disk number 2 decided it was going to release it's  
 smoke to the world (never a good thing), now your array is still  
 alive and it is starting to rebuild onto disk 4 to make up for the  
 death of disk 2.  During the rebuild process Disk 1 comes across a  
 bad sector, poof, your data is gone.  Just a word of warning, don't  
 put all your data safety eggs into the RAID basket.

RAID poorly implemented is a placebo, and is often less reliable than  
a single disk. RAID properly implemented IS the end-all be-all of disk  
safety.

Your chosen argument is only valid against RAID level 5, which would  
be a very poor choice for a database application. RAID-5 is a poor  
choice in any environment where the data set is volatile and valuable.  
Especially when you consider the number of hours (or days) it takes to  
rebuild a hot spare into the RAID-5 set. (HINT: test that before  
deployment!)  During that rebuild window, your system performance is  
heavily degraded and extremely vulnerable. If the performance of your  
RAID system is halved, is that sufficient for your application, or is  
your system effectively down during the rebuild period? (HINT: test  
before deployment!).

But a RAID-1 or RAID 10 can be extremely robust, remaining online and  
performing optimally during multiple catastrophic disk failures. You  
can mirror the data to as many spindles as you need to insure data  
integrity. When disks fail, rebuilding a mirror disk usually takes  
less than an hour.

The systems engineer has to choose between data integrity,  
performance, and storage efficiency.

One of my systems is about 2/3 full with 24.64 TB of data. Does that  
count as terabyte class computing?  :)  I'm using ZFS to manage  
it. :)  :)

Matt
___
http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users

Community help: http://wiki.bestpractical.com
Commercial support: sa...@bestpractical.com


Discover RT's hidden secrets with RT Essentials from O'Reilly Media. 
Buy a copy at http://rtbook.bestpractical.com


[rt-users] Migrating from Postgres to MySQL

2009-07-28 Thread Kage
Well, basically, Postgres is seemingly a bucket of fail specifically
for the RT system I maintain.  We run RT/Postgres on a Ubuntu Hardy
LTS virtual machine running 2 virtual CPUs and the KVM maximum of 2GB
RAM, but Postgres ends up using so much in resources that it actually
kills the entire KVM system on the virtual host.  I know tuning of
Postgres can fix this, but we've experienced issues with Postgres on
many other virtual machines, leading us to believe that it may simply
be an issue with Postgres not getting along well with KVM (which is a
somewhat known issue).  Ideally, I want to migrate my RT database from
Postgres to MySQL, and preserve ALL data (tickets, attachments, links,
everything).  Is there an guaranteed to work method to do this that
anyone knows of?  Thanks!

Specs:
Postgres: 8.3
RT: 3.6.5
MySQL candidate on Hardy: 5.0.51a-3ubuntu5.4

Any help would be greatly appreciated.  Thanks!

-- 
~ Kage
http://vitund.com
http://hackthissite.org
___
http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users

Community help: http://wiki.bestpractical.com
Commercial support: sa...@bestpractical.com


Discover RT's hidden secrets with RT Essentials from O'Reilly Media. 
Buy a copy at http://rtbook.bestpractical.com


Re: [rt-users] Migrating from Postgres to MySQL

2009-07-28 Thread Kenneth Marshall
On Tue, Jul 28, 2009 at 03:35:12PM -0400, Kage wrote:
 Well, basically, Postgres is seemingly a bucket of fail specifically
 for the RT system I maintain.  We run RT/Postgres on a Ubuntu Hardy
 LTS virtual machine running 2 virtual CPUs and the KVM maximum of 2GB
 RAM, but Postgres ends up using so much in resources that it actually
 kills the entire KVM system on the virtual host.  I know tuning of
 Postgres can fix this, but we've experienced issues with Postgres on
 many other virtual machines, leading us to believe that it may simply
 be an issue with Postgres not getting along well with KVM (which is a
 somewhat known issue).  Ideally, I want to migrate my RT database from
 Postgres to MySQL, and preserve ALL data (tickets, attachments, links,
 everything).  Is there an guaranteed to work method to do this that
 anyone knows of?  Thanks!
 
 Specs:
 Postgres: 8.3
 RT: 3.6.5
 MySQL candidate on Hardy: 5.0.51a-3ubuntu5.4
 
 Any help would be greatly appreciated.  Thanks!
 
 -- 
 ~ Kage
 http://vitund.com
 http://hackthissite.org

Hi Kage,

I have had trouble with VMs with any I/O bound system. Have you
tested your system with MySQL? You may just be swapping one I/O
problem for another. That being said, for anything but the most
trivial of DBs, tuning your backend DB be it PostgreSQL, MySQL,
Oracle,... is critical for getting good performance.

Another data point, I have had a simple repeated file read of a
DNS zone file for an XFER tank a VM. VMs work well for CPU intensive
tasks with small amount of I/O. For any larger amount of I/O, using
real storage is a must. Good luck with whichever backend you choose.
A big plus for both the Oracle and PostgreSQL backend is the
availability of full-text indexing which allows content and attachment
searches to use an index and not take your DB with a full table scan.

Regards,
Ken
___
http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users

Community help: http://wiki.bestpractical.com
Commercial support: sa...@bestpractical.com


Discover RT's hidden secrets with RT Essentials from O'Reilly Media. 
Buy a copy at http://rtbook.bestpractical.com


Re: [rt-users] Migrating from Postgres to MySQL

2009-07-28 Thread Kage
(didn't do a reply-all, sorry if you get this twice, Ken)

What, if any, performance gains exist by using an NFS mountpoint for
the Postgres database within a virtual machine?  In this case, I/O is
not directly placed on the virtual disk (unless the actual act of
I/O itself, regardless of what is being acted upon (virtual disk or
not), is also an issue).  Could this potentially yield any performance
increase?

Also, no, I have not tried MySQL yet with our RT setup.  I'd like to
move our database to MySQL to do some testing with that and gather
realistic metrics of here's our data in Postgres vs. MySQL.

On Tue, Jul 28, 2009 at 3:51 PM, Kenneth Marshallk...@rice.edu wrote:
 On Tue, Jul 28, 2009 at 03:35:12PM -0400, Kage wrote:
 Well, basically, Postgres is seemingly a bucket of fail specifically
 for the RT system I maintain.  We run RT/Postgres on a Ubuntu Hardy
 LTS virtual machine running 2 virtual CPUs and the KVM maximum of 2GB
 RAM, but Postgres ends up using so much in resources that it actually
 kills the entire KVM system on the virtual host.  I know tuning of
 Postgres can fix this, but we've experienced issues with Postgres on
 many other virtual machines, leading us to believe that it may simply
 be an issue with Postgres not getting along well with KVM (which is a
 somewhat known issue).  Ideally, I want to migrate my RT database from
 Postgres to MySQL, and preserve ALL data (tickets, attachments, links,
 everything).  Is there an guaranteed to work method to do this that
 anyone knows of?  Thanks!

 Specs:
 Postgres: 8.3
 RT: 3.6.5
 MySQL candidate on Hardy: 5.0.51a-3ubuntu5.4

 Any help would be greatly appreciated.  Thanks!

 --
 ~ Kage
 http://vitund.com
 http://hackthissite.org

 Hi Kage,

 I have had trouble with VMs with any I/O bound system. Have you
 tested your system with MySQL? You may just be swapping one I/O
 problem for another. That being said, for anything but the most
 trivial of DBs, tuning your backend DB be it PostgreSQL, MySQL,
 Oracle,... is critical for getting good performance.

 Another data point, I have had a simple repeated file read of a
 DNS zone file for an XFER tank a VM. VMs work well for CPU intensive
 tasks with small amount of I/O. For any larger amount of I/O, using
 real storage is a must. Good luck with whichever backend you choose.
 A big plus for both the Oracle and PostgreSQL backend is the
 availability of full-text indexing which allows content and attachment
 searches to use an index and not take your DB with a full table scan.

 Regards,
 Ken

-- 
~ Kage
http://vitund.com
http://hackthissite.org
___
http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users

Community help: http://wiki.bestpractical.com
Commercial support: sa...@bestpractical.com


Discover RT's hidden secrets with RT Essentials from O'Reilly Media. 
Buy a copy at http://rtbook.bestpractical.com


Re: [rt-users] Migrating from Postgres to MySQL

2009-07-28 Thread Kenneth Marshall
Kage,

The main advantage is gained by avoiding I/O through the virtual
disk. The layout of the virtual disk tends to turn most I/O into
random I/O, even I/O that starts as sequential. The factor of
10 performance difference between random/sequential I/O causes
the majority of the performance problem. I have not had personal
experience with using an NFS mount point to run a database so I
cannot really comment on that. Good luck with your evaluation.

Cheers,
Ken

On Tue, Jul 28, 2009 at 04:12:55PM -0400, Kage wrote:
 (didn't do a reply-all, sorry if you get this twice, Ken)
 
 What, if any, performance gains exist by using an NFS mountpoint for
 the Postgres database within a virtual machine? ?In this case, I/O is
 not directly placed on the virtual disk (unless the actual act of
 I/O itself, regardless of what is being acted upon (virtual disk or
 not), is also an issue). ?Could this potentially yield any performance
 increase?
 
 Also, no, I have not tried MySQL yet with our RT setup. ?I'd like to
 move our database to MySQL to do some testing with that and gather
 realistic metrics of here's our data in Postgres vs. MySQL.
 
 On Tue, Jul 28, 2009 at 3:51 PM, Kenneth Marshallk...@rice.edu wrote:
  On Tue, Jul 28, 2009 at 03:35:12PM -0400, Kage wrote:
  Well, basically, Postgres is seemingly a bucket of fail specifically
  for the RT system I maintain. ?We run RT/Postgres on a Ubuntu Hardy
  LTS virtual machine running 2 virtual CPUs and the KVM maximum of 2GB
  RAM, but Postgres ends up using so much in resources that it actually
  kills the entire KVM system on the virtual host. ?I know tuning of
  Postgres can fix this, but we've experienced issues with Postgres on
  many other virtual machines, leading us to believe that it may simply
  be an issue with Postgres not getting along well with KVM (which is a
  somewhat known issue). ?Ideally, I want to migrate my RT database from
  Postgres to MySQL, and preserve ALL data (tickets, attachments, links,
  everything). ?Is there an guaranteed to work method to do this that
  anyone knows of? ?Thanks!
 
  Specs:
  Postgres: 8.3
  RT: 3.6.5
  MySQL candidate on Hardy: 5.0.51a-3ubuntu5.4
 
  Any help would be greatly appreciated. ?Thanks!
 
  --
  ~ Kage
  http://vitund.com
  http://hackthissite.org
 
  Hi Kage,
 
  I have had trouble with VMs with any I/O bound system. Have you
  tested your system with MySQL? You may just be swapping one I/O
  problem for another. That being said, for anything but the most
  trivial of DBs, tuning your backend DB be it PostgreSQL, MySQL,
  Oracle,... is critical for getting good performance.
 
  Another data point, I have had a simple repeated file read of a
  DNS zone file for an XFER tank a VM. VMs work well for CPU intensive
  tasks with small amount of I/O. For any larger amount of I/O, using
  real storage is a must. Good luck with whichever backend you choose.
  A big plus for both the Oracle and PostgreSQL backend is the
  availability of full-text indexing which allows content and attachment
  searches to use an index and not take your DB with a full table scan.
 
  Regards,
  Ken
 
 -- 
 ~ Kage
 http://vitund.com
 http://hackthissite.org
 
___
http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users

Community help: http://wiki.bestpractical.com
Commercial support: sa...@bestpractical.com


Discover RT's hidden secrets with RT Essentials from O'Reilly Media. 
Buy a copy at http://rtbook.bestpractical.com