Large filesystems with lots of block groups can suffer long stalls during commit while we create and send down all of the block group caches. The more block groups dirtied in a transaction, the longer these stalls can be. Some workloads average 10 seconds per commit, with peaks much higher.
The first problem is that we write and wait for each block group cache individually, so we aren't keeping the disk pipeline full. This patch set uses the io_ctl struct to start cache IO for every block group, and then waits on all of it in bulk.
The second problem is that we only allow cache writeout during the final stage of commit, while new modifications are blocked. This patch set adds some locking so that cache writeout can happen very early in the commit, and any block groups that are redirtied afterwards will be sent down again during the final stage.
With both changes together, average commit stalls are under a second and our overall performance is much smoother.
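To illustrate the two ideas, here is a minimal userspace sketch. It is a toy model, not the kernel code: struct toy_group, stall_write_and_wait(), stall_bulk_wait() and write_dirty() are all hypothetical names, the latencies are made up, and the bulk case optimistically assumes the device can service every started write concurrently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a block group; not the kernel structure. */
struct toy_group {
	unsigned int io_ms; /* assumed latency of one cache write */
	bool dirty;         /* cache must be written before commit ends */
};

/* Write and wait for one group at a time: the latencies add up. */
unsigned int stall_write_and_wait(const struct toy_group *g, size_t n)
{
	unsigned int total = 0;

	assert(g != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (g[i].dirty)
			total += g[i].io_ms;
	return total;
}

/*
 * Start every write, then wait in bulk. Under the (optimistic) assumption
 * that all writes proceed concurrently, the wait is bounded by the
 * slowest single write rather than by the sum.
 */
unsigned int stall_bulk_wait(const struct toy_group *g, size_t n)
{
	unsigned int longest = 0;

	assert(g != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (g[i].dirty && g[i].io_ms > longest)
			longest = g[i].io_ms;
	return longest;
}

/*
 * One writeout pass: write every dirty group and mark it clean.
 * Returns how many groups were written. Running this early in the
 * commit leaves only redirtied groups for the final, blocking stage.
 */
size_t write_dirty(struct toy_group *g, size_t n)
{
	size_t written = 0;

	assert(g != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (g[i].dirty) {
			g[i].dirty = false;
			written++;
		}
	}
	return written;
}
```

With four dirty groups taking 30, 10, 20 and 40 ms, the write-and-wait model stalls for 100 ms while the bulk model stalls for 40 ms. If write_dirty() runs early and only the 10 ms group is redirtied before the final stage, the blocking part of the commit has a single 10 ms write left to wait for.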