Hi folks

We've fielded a number of mirroring questions offline, and have watched and participated in discussions here. I thought it was important to make sure some of the answers are recorded and searchable on the list.

One major question that kept arising was as follows:

q: If I have a large image file (say a VM vmdk/other format) on a mirrored volume, will one small change of a few bytes result in a resync of the entire file?

a:  No.

To test this, we created a 20GB file on a mirror volume.

root@metal:/local2/home/landman# ls -alF /mirror1gfs/big.file
-rw-r--r-- 1 root root 21474836490 2011-05-02 12:44 /mirror1gfs/big.file

Then, using the following quick and dirty Perl (app.pl), we appended about 10-20 bytes to the file.

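#!/usr/bin/env perl
# Append a short marker line ("end <pid>") to the file named on the command line.
use strict;
use warnings;

my $file = shift or die "usage: $0 file\n";
open(my $fh, ">>", $file) or die "cannot open $file: $!";
print $fh "end $$\n";
close($fh) or die "close failed: $!";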


root@metal:/local2/home/landman# ./app.pl /mirror1gfs/big.file

Then I had to write a quick and dirty tail replacement (tail.pl), as I discovered that tail doesn't seek (yeah, it started reading every 'line' of that file).

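#!/usr/bin/env perl
# Print the last 200 bytes of a file by seeking from the end instead of reading it all.
use strict;
use warnings;

my $file = shift or die "usage: $0 file\n";
my $buf;

open(my $fh, "<", $file) or die "cannot open $file: $!";
seek($fh, -200, 2) or die "seek failed: $!";   # whence 2 = SEEK_END
read($fh, $buf, 200);
printf "buffer: '%s'\n", $buf;
close($fh);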


root@metal:/local2/home/landman# ./tail.pl /mirror1gfs/big.file
buffer: 'end 19362'

While running app.pl, I did not see any massive resyncs. I had dstat running in another window to watch for them.

You might say that this is irrelevant, as we only appended, and appends could be special-cased.

So I wrote a random updater (randupd.pl) that writes at random spots throughout the large file (sort of like updates to a VM vmdk and similar files).


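#!/usr/bin/env perl
# Overwrite 13 bytes at a random offset within an existing file.
use strict;
use warnings;

my $file = shift or die "usage: $0 file\n";
my @stat = stat($file) or die "cannot stat $file: $!";
my $loc  = int(rand($stat[7]));          # $stat[7] is the file size in bytes

# "+<" opens read/write without truncating. Append modes (">>", "+>>") would
# send every write to the end of the file regardless of the seek.
open(my $fh, "+<", $file) or die "cannot open $file: $!";
seek($fh, $loc, 0) or die "seek failed: $!";   # whence 0 = SEEK_SET
print $fh "I was here!!!";
printf "loc: %i\n", $loc;
close($fh) or die "close failed: $!";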

root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 17598205436
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 16468787891
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 9271612568
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 1356667302
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 12365324308
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 15654714313
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 10127739152
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 10259920623

And again, no massive resyncs.
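
One way to double-check that a random in-place write landed where it was reported is to read the bytes back at that offset. The following is only a minimal sketch: the script name checkupd.pl and the sub bytes_at are hypothetical names, not part of the original test, and it assumes the marker string is the "I was here!!!" written by the updater.

```perl
#!/usr/bin/env perl
# checkupd.pl (hypothetical name): read back bytes at a given offset and
# compare them with the marker the updater is expected to have written.
use strict;
use warnings;

# Return the $len bytes found at byte offset $loc in $file.
sub bytes_at {
    my ($file, $loc, $len) = @_;
    open(my $fh, "<", $file) or die "cannot open $file: $!";
    binmode($fh);
    seek($fh, $loc, 0) or die "seek failed: $!";    # whence 0 = SEEK_SET
    my $buf = '';
    read($fh, $buf, $len);
    close($fh);
    return $buf;
}

# Run only when a file and an offset are given on the command line.
if (@ARGV == 2) {
    my ($file, $loc) = @ARGV;
    my $marker = "I was here!!!";    # assumed to match the updater's string
    my $found  = bytes_at($file, $loc, length $marker);
    printf "offset %s: %s\n", $loc,
        ($found eq $marker ? "marker found" : "marker NOT found");
}
```

It would be invoked with the file and one of the reported offsets, for example ./checkupd.pl /mirror1gfs/big.file 17598205436. Like tail.pl, it seeks straight to the offset, so it does not read the whole 20GB file.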

So I think it's fairly safe to say that the concern over massive resyncs for small updates is not something we see in the field.

Regards,

Joe

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: land...@scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615