2011/12/1 Christian Brunner <c...@muc.de>:
> 2011/12/1 Alexandre Oliva <ol...@lsd.ic.unicamp.br>:
>> On Nov 29, 2011, Christian Brunner <c...@muc.de> wrote:
>>
>>> When I'm doing heavy reading in our ceph cluster, the load and wait-io
>>> on the patched servers are higher than on the unpatched ones.
>>
>> That's unexpected.

In the meantime I know that it's not related to the reads.

>> I suppose I could wave my hands while explaining that you're getting
>> higher data throughput, so it's natural that it would take up more
>> resources, but that explanation doesn't satisfy me. I suppose
>> allocation might have got slightly more CPU intensive in some cases, as
>> we now use bitmaps where before we'd only use the cheaper-to-allocate
>> extents. But that's unsatisfying as well.
>
> I must admit that I do not completely understand the difference
> between bitmaps and extents.
>
> From what I see on my servers, I can tell that the degradation over
> time is gone. (Rebooting the servers every day is no longer needed,
> which is a real plus.) But performance compared to a freshly booted,
> unpatched server is much worse with my ceph workload.
>
> I wonder if it would make sense to initialize the list field only
> when the cluster setup fails? This would avoid the fallback to
> unclustered allocation and would give us the cheaper-to-allocate
> extents.
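Since I wrote the above I've tried to piece together the difference
from fs/btrfs/free-space-cache.c. Below is a toy userspace sketch of my
current (quite possibly wrong) understanding; the names are made up and
only the idea is borrowed. A cache entry either tracks one contiguous
free run (an extent entry) or carries a bitmap with one bit per block,
and allocating from a bitmap means scanning for a run of free bits:

/*
 * Toy userspace model of the two entry types in the btrfs free-space
 * cache. This is only an illustration of my understanding, NOT the
 * kernel code; the real names live in fs/btrfs/free-space-cache.c.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define UNIT          4096UL              /* one bit covers one 4k block */
#define BITS_PER_MAP  (4096 * 8)          /* one page worth of bits */
#define BITS_PER_LONG (8 * sizeof(unsigned long))

struct toy_entry {
	uint64_t offset;        /* disk offset the entry starts at */
	uint64_t bytes;         /* extent entry: length of the contiguous run */
	unsigned long *bitmap;  /* bitmap entry: one set bit per free UNIT;
				 * NULL means "this is an extent entry" */
};

/*
 * Allocating from an extent entry is cheap: the space is contiguous by
 * definition, so we just carve the allocation off the front.
 */
static uint64_t alloc_from_extent(struct toy_entry *e, uint64_t want)
{
	uint64_t start = e->offset;

	e->offset += want;
	e->bytes -= want;
	return start;
}

/*
 * Allocating from a bitmap entry means scanning for a long enough run
 * of free bits: more CPU per allocation, and the runs it finds tend to
 * be short, i.e. small on-disk writes.
 */
static int alloc_from_bitmap(struct toy_entry *e, uint64_t want_bits,
			     uint64_t *start)
{
	uint64_t i, run = 0;

	for (i = 0; i < BITS_PER_MAP; i++) {
		int bit_free = (e->bitmap[i / BITS_PER_LONG] >>
				(i % BITS_PER_LONG)) & 1;

		run = bit_free ? run + 1 : 0;
		if (run == want_bits) {
			*start = e->offset + (i + 1 - run) * UNIT;
			return 0;
		}
	}
	return -1;	/* no run long enough */
}

int main(void)
{
	unsigned long map[BITS_PER_MAP / BITS_PER_LONG];
	struct toy_entry extent = { .offset = 1 << 20, .bytes = 1 << 20 };
	struct toy_entry bitmap = { .offset = 1 << 24, .bitmap = map };
	uint64_t start;

	memset(map, 0, sizeof(map));
	map[1] = 0xffUL;	/* one small free run: 8 blocks */

	printf("extent alloc at %llu\n",
	       (unsigned long long)alloc_from_extent(&extent, 64 * 1024));
	if (!alloc_from_bitmap(&bitmap, 8, &start))
		printf("bitmap alloc at %llu\n", (unsigned long long)start);
	return 0;
}

If that picture is roughly right, it would at least be consistent with
bitmaps costing more CPU and handing out smaller allocations.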
I've now tried various combinations of your patches, and I can really
nail it down to this one line: with this patch applied I get much
higher write-io values than without it. Some of the other patches help
to reduce the effect, but it's still significant.

iostat on an unpatched node is giving me:

Device:  rrqm/s  wrqm/s    r/s    w/s   rsec/s  wsec/s avgrq-sz avgqu-sz  await svctm %util
sda      105.90    0.37  15.42  14.48  2657.33  560.13   107.61     1.89  62.75  6.26 18.71

while on a node with this patch it's:

Device:  rrqm/s  wrqm/s    r/s    w/s   rsec/s  wsec/s avgrq-sz avgqu-sz  await svctm %util
sda      128.20    0.97  11.10  57.15  3376.80  552.80    57.58    20.58 296.33  4.16 28.36

Also interesting is the fact that the average request size on the
patched node is much smaller. Josef was telling me that this could be
related to the number of bitmaps we write out, but I have no idea how
to trace this.

I would be very happy if someone could give me a hint on what to do
next, as this is one of the last remaining issues with our ceph
cluster.

Thanks,
Christian
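P.S.: To put numbers on the "much smaller requests": iostat reports
sizes in 512-byte sectors, so dividing wsec/s by w/s gives the average
write size. Unpatched that is 560.13 / 14.48 = ~38.7 sectors (~19 KB)
per write; patched it is 552.80 / 57.15 = ~9.7 sectors (~4.8 KB). The
written volume is almost unchanged, but it goes out in roughly four
times as many, four times smaller requests, which would fit Josef's
bitmap theory.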