Guys,

Thanks for the tips and info and links.

As far as sharding goes, I have about 1000 shards (identical schemas)
on each machine.  I have 4 physical machines to start.  Eventually
when i get more machines,  I will spread the shards (and load) to
include the newer machines.

I've read some opinions of the innodb_file_per_table option and it
seems most mysql experts tend to not recommend it for reasons that
Ronald mentioned and others.  So I probably will stick with the
default single global tablespace.

Some more questions,

1. In my MySQL setup, /db is my data directory with all of the shards
in it, including the global tablespace for innodb.  If I shutdown
mysql, cp the entire directory to a backup medium, will that work as a
raw backup?  Will I be able to recover from these files later if I
need to?

2. Ronald, you mentioned that mysqldump is only good for smaller dbs,
what tools do you recommend for dbs (having only INNODB tables)?

Thanks again!
Konstantin




On Sat, Feb 20, 2010 at 10:20 PM, Ronald Bradford
<[email protected]> wrote:
> Konstantin,
>
> Let me try to address some of your points, and also correct some of the
> inaccurate information in the response.
>
> mysqldump is a single stream export of data. With a mysql backup the most
> critical action is producing a "consistent" snapshot, that enables you to
> ensure a possible recovery.  You can use --lock-tables to get a consistent
> version for MyISAM, this is a blocking statement, for ALL InnoDB tables, you
> can use --single-transaction which is non-blocking, however depending on
> your DB size, buffer pool size and log file size you may experience
> problems.  mk-parallel-dump does not provide the "consistency" on a running
> DB.  Such consistency can only occur on a DB when no client access is
> available for the entire backup.  mysqldump is also really only practical
> for small databases 5-10GB.
>
> innodb_file_per_table is not a good idea for default, there are plenty of
> arguments for not using it when you have a large number of tables because of
> the increased data disk syncing.  innodb_file_per_table does have a number
> of benefits, some more significant then others,  You talk about 1000 shards,
> what is this? Is this schema's in your instance, or 1000 servers?
>
> There are several approaches to backups, they depend on your MySQL topology,
> your storage engines, your required uptime, you application design for read
> and read/write access etc.
>
> DRBD is not a backup solution, is a synchronous disk replication approach to
> ensuring a consistent and high availability for fail-over. While in theory
> you can break the DRBD replication to perform a consistent disk backup,
> that's a bad idea because you just broke your failover strategy.
>
> Disaster is enviable. It is critical you have a backup and recovery plan,
> you actually test it, you time it and you document it. Many organizations
> fail to do this, here is a checklist you need to ensure you follow --
> http://ronaldbradford.com/blog/checked-your-mysql-recovery-process-recently-2010-02-15/
>
> Regards
>
> Ronald
>
>
>
> 01110010 01101111 01101110 01100001 01101100 01100100 00100000 01100010
> 01110010 01100001 01100100 01100110 01101111 01110010 01100100
> Ronald Bradford
> MySQL Expert specializing in Performance Tuning, Scalability and High
> Availability
> MySQL Community Member of the Year 2009
> Co author of Expert PHP and MySQL. http://expertphpandmysql.com
>
> Web Site: http://ronaldbradford.com
> Linked In: http://www.linkedin.com/in/ronaldbradford
> Twitter:     @RonaldBradford, @MySQLExpert
>
>
>
>
> On Sat, Feb 20, 2010 at 8:36 PM, Rob Marscher <[email protected]>
> wrote:
>>
>> I think innodb_file_per_table is a good idea. I haven't seen any arguments
>> for not using it. I don't have experience with the parallel dump solution.
>> I've used replication to keep a live backup. I don't exactly remember off
>> the top of my head but there's an option for mysqldump to not lock innodb
>> tables during the dump. I think there's some caveats with that method but
>> they were acceptable for my situation. I think details are in the mysqldump
>> online documentation.
>> I believe there's a technique with DRBD too for quick backups. I seem to
>> remember some mysql experts talking about stopping the server using a super
>> quick method to copy the mysql data dir.
>> The most important thing is to make sure you test recovering your system
>> from a backup. There are some horror stories of people not realizing there
>> was a problem with their backups until they actually had a crash and needed
>> to recover from it.
>>
>> On Feb 20, 2010, at 7:42 PM, Konstantin K <[email protected]> wrote:
>>
>> Hi guys,
>>
>> I've been reading Maatkit documentation, High Performance MySQL 2nd
>> Edition, and googling to try to learn what my options are in terms of
>> backing up and recovery of MySQL.
>>
>> For our database layer, we will be launching with several physical
>> machines, each of which will have MySQL 5.1, RAID-10 (hardware), NO
>> LVM, and will have about 1000 database shards, with all tables using
>> the INNODB engine.
>>
>> In the documentation for mk-parallel-dump (http://www.maatkit.org/doc/
>> mk-parallel-dump.html), it says:
>>  "...mk-parallel-dump is not a backup program! It is only designed for
>> fast data exports, for purposes such as quickly loading data into test
>> systems. Do not use mk-parallel-dump for backups..."
>>
>> But in the book, it seems to point that mk-parallel-dump can be used
>> as a backup solution.
>>
>> So my questions are:
>> 1. Can mk-parallel-dump be used to backup the innodb databases?
>> 2. Is it recommended to use innodb_file_per_table option?
>> 3. What do you recommend for specific documentation, methodologies,
>> techniques for raw and logical backups?
>>
>> Thanks,
>> Konstantin
>>
>> _______________________________________________
>> New York PHP Community MySQL SIG
>> http://lists.nyphp.org/mailman/listinfo/mysql
>>
>> NYPHPCon 2006 Presentations Online
>> http://www.nyphpcon.com
>>
>> Show Your Participation in New York PHP
>> http://www.nyphp.org/show_participation.php
>>
>> _______________________________________________
>> New York PHP Community MySQL SIG
>> http://lists.nyphp.org/mailman/listinfo/mysql
>>
>> NYPHPCon 2006 Presentations Online
>> http://www.nyphpcon.com
>>
>> Show Your Participation in New York PHP
>> http://www.nyphp.org/show_participation.php
>
>
> _______________________________________________
> New York PHP Community MySQL SIG
> http://lists.nyphp.org/mailman/listinfo/mysql
>
> NYPHPCon 2006 Presentations Online
> http://www.nyphpcon.com
>
> Show Your Participation in New York PHP
> http://www.nyphp.org/show_participation.php
>
_______________________________________________
New York PHP Community MySQL SIG
http://lists.nyphp.org/mailman/listinfo/mysql

NYPHPCon 2006 Presentations Online
http://www.nyphpcon.com

Show Your Participation in New York PHP
http://www.nyphp.org/show_participation.php

Reply via email to