> Thanks for taking the time to write this up and follow through on the
> thread. It's always interesting to hear about situations where btrfs
> doesn't work well.
>
> There are three basic problems with the database workloads on btrfs.
> First is that we have higher latencies on writes because we are feeding
> everything through helper threads for crcs. Usually the extra latencies
> don't show up because we have enough work in the pipeline to keep the
> drive busy.
>
> I don't believe the UEK kernels have the recent changes to do some of
> the crc work inline (without handing off) for smaller synchronous IOs.
>
> Second, on O_SYNC writes btrfs will write both the file metadata and
> data into a special tree so we can be crash safe. For big files this
> tends to spend a lot of time looking for the extents in the file that
> have changed.
>
> Josef fixed that up and it is queued for the next merge window.
>
> The third problem is that lots of random writes tend to make lots of
> metadata. If this doesn't fit in ram, we can end up doing many reads
> that slow things down. We're working on this now as well, but recent
> kernels change how we cache things and should improve the results.
I feel I should update my previous thread about performance issues with btrfs in light of recent findings.

We have discovered that, in all likelihood, what we experienced and described was not a problem with btrfs per se, but the result of a more general issue that btrfs was just really good at exposing (perhaps because it uses threads more aggressively than zfs?).

Various benchmarks in Java (thread-pool setup/shutdown) and C (pthreads creation and joining) have shown that our Xeon/E5-2620 server, running the latest Oracle Unbreakable Linux, is very slow at creating new threads (benchmarks available upon request).

Java threading benchmark on Xeon/E5-2620 @ 2.0GHz:

  Oracle Unbreakable Linux: 1m49s realtime, 3m17s sys-time
  Ubuntu:                   5s realtime, 3.9s sys-time

We are not sure how to continue investigating why the Oracle Linux kernel performs so poorly (scheduler, kernel config, etc.?), but it seems pretty obvious that this issue should be raised with Oracle rather than with the btrfs developers, though we'll probably look into using another OS entirely.

As such, apologies for the noise; btrfs was not to blame! If you do have a suspicion or insight on the matter (perhaps you work for Oracle, or know UEK?), we'd of course love a follow-up off-list.

Kind regards,
Casper