On Thu, 5 Jun 2014 09:50:53 Igor M wrote:
> But data to this big tables is only appended, it's never deleted. So
> no rewrites should be happening.

When you write to the big tables the indexes will be rewritten. Indexes can be in the same file as the table data or in separate files, depending on which database you use. In the former case you get fragmented table files, and in the latter case 70G of data will have index files that are large enough to get fragmented.

Also, when multiple files in a filesystem are being written at the same time (e.g. multiple tables appended to in each transaction) you will get some fragmentation. Add COW and that makes a lot of fragmentation.

Finally, append is done at the file level while COW rewrites at the block level. If your database rounds up the allocated space to some power of 2 larger than 4K then things will be fine for a filesystem like Ext3, where file offsets correspond to fixed locations on disk. But with BTRFS that preallocated space filled with zeros will be rewritten to a different part of the disk when the allocated space is used. If you use a database that doesn't preallocate space then COW will be invoked whenever an append starts at a file offset that isn't a multiple of 4K (or I think 16K for a BTRFS filesystem created with a recent mkfs.btrfs), because appending data within a partly filled block means rewriting that block.

I believe that COW is desirable for a database. I don't believe that a lack of integrity at the filesystem level will help integrity at the database level.

If the working set of your database can fit in RAM then you can rely on cache to ensure that little data is read during operation. For example, one of my database servers has been running for 330 days and the /mysql filesystem has writes outnumbering reads by a factor of 3:1. When most IO is for writes, fragmentation of data is less of an issue, although in this case the server is running Ext3 so it wouldn't get the COW fragmentation issues.
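
To make the point about appends and block boundaries concrete, here is a rough sketch of the arithmetic. It assumes a 4K block size, and the function names are made up for illustration; they don't come from BTRFS or any database. It only models which blocks an append touches, not what the filesystem actually does on disk.

```python
BLOCK_SIZE = 4096  # assumed block size; adjust if your filesystem differs


def append_rewrites_existing_block(file_size, block_size=BLOCK_SIZE):
    """True if an append at file_size starts inside a partly filled block.

    On a COW filesystem that partly filled block has to be written
    again somewhere else, even though the file is only being appended to.
    """
    return file_size % block_size != 0


def blocks_touched(file_size, append_len, block_size=BLOCK_SIZE):
    """Return (first, last) block indexes written by appending append_len bytes."""
    if file_size < 0 or append_len <= 0:
        raise ValueError("file_size must be >= 0 and append_len must be > 0")
    first = file_size // block_size
    last = (file_size + append_len - 1) // block_size
    return first, last


if __name__ == "__main__":
    # File ends exactly on a block boundary: the append only fills new blocks.
    print(append_rewrites_existing_block(8192))   # False
    # File ends 8 bytes into block 2: appending rewrites block 2.
    print(append_rewrites_existing_block(8200))   # True
    print(blocks_touched(8200, 100))              # (2, 2)
    # A 200 byte append at offset 4000 spans the end of block 0 and block 1.
    print(blocks_touched(4000, 200))              # (0, 1)
```

A database that appends small records will hit the "True" case on almost every write, which is why append-only tables can still fragment under COW.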

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/