On Mon, Sep 24, 2012 at 12:19:20PM -0600, Arne Jansen wrote:
> On 09/24/12 20:11, Josef Bacik wrote:
> > The reason we offload csumming is because it is CPU intensive, except it is
> > not on modern intel CPUs.  So check to see if we support hardware crc32c,
> > and if we do just do the csumming in our current threads context.  Otherwise
> > we can farm it off.  Thanks,
> >
> > Signed-off-by: Josef Bacik <jba...@fusionio.com>
> > ---
> >  fs/btrfs/disk-io.c |   17 +++++++++++++++++
> >  1 files changed, 17 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > index dcaf556..830b9af 100644
> > --- a/fs/btrfs/disk-io.c
> > +++ b/fs/btrfs/disk-io.c
> > @@ -31,6 +31,7 @@
> >  #include <linux/migrate.h>
> >  #include <linux/ratelimit.h>
> >  #include <asm/unaligned.h>
> > +#include <asm/cpufeature.h>
> >  #include "compat.h"
> >  #include "ctree.h"
> >  #include "disk-io.h"
> > @@ -880,6 +881,22 @@ static int btree_submit_bio_hook(struct inode *inode, int rw, struct bio *bio,
> >  }
> >
> >  /*
> > + * Pretty sure I'm going to hell for this.  If our CPU can do crc32cs in
> > + * the hardware then there is no reason to do the csum stuff
> > + * asynchronously, it will be faster to do it inline, so test to see if
> > + * our CPU can do hardware crc32c and if it can just do the csum in our
> > + * threads context.
> > + */
> > +#ifdef CONFIG_X86
> > +	if (cpu_has_xmm4_2) {
> > +		printk(KERN_ERR "doing it the fast way\n");
>
> You'll probably go to hell for the printk...
;)

Testing with dd on my recent Intel box, I can hardware crc32c at
1.3GB/s.  Anything beyond that and you really want more CPUs jumping
into the mix.

I wanted to use this test for data crcs too, but I suppose the helpers
only really hurt for the synchronous IO.

-chris
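
For anyone who wants to poke at the idea outside the kernel, here is a
minimal userspace sketch of the same decision: check for SSE4.2 at
runtime and use the hardware crc32c instruction if it is there, otherwise
fall back to a plain software loop.  This is not the btrfs code.  The
function names (crc32c_sw, crc32c_hw, cpu_has_hw_crc32c, crc32c) are made
up for illustration, and it assumes GCC or Clang for
__builtin_cpu_supports() and the target attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <nmmintrin.h>	/* _mm_crc32_u8 */
#define HAVE_X86 1
#endif

/* Portable bitwise CRC32C (Castagnoli, reflected polynomial 0x82F63B78). */
static uint32_t crc32c_sw(uint32_t crc, const void *buf, size_t len)
{
	const uint8_t *p = buf;

	assert(p != NULL || len == 0);
	crc = ~crc;
	while (len--) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

#ifdef HAVE_X86
/* Same checksum using the SSE4.2 crc32 instruction, one byte at a time. */
__attribute__((target("sse4.2")))
static uint32_t crc32c_hw(uint32_t crc, const void *buf, size_t len)
{
	const uint8_t *p = buf;

	crc = ~crc;
	while (len--)
		crc = _mm_crc32_u8(crc, *p++);
	return ~crc;
}
#endif

/* Userspace stand-in for the cpu_has_xmm4_2 test in the patch. */
static int cpu_has_hw_crc32c(void)
{
#ifdef HAVE_X86
	return __builtin_cpu_supports("sse4.2") != 0;
#else
	return 0;
#endif
}

/* Use the hardware path when available, else the software loop. */
static uint32_t crc32c(uint32_t crc, const void *buf, size_t len)
{
#ifdef HAVE_X86
	if (cpu_has_hw_crc32c())
		return crc32c_hw(crc, buf, len);
#endif
	return crc32c_sw(crc, buf, len);
}
```

A caller would test cpu_has_hw_crc32c() once and, if it returns nonzero,
checksum in its own context instead of handing the buffer to a helper
thread.  The byte-at-a-time hardware loop is only there to keep the
sketch short; a real implementation would feed the instruction wider
words.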