On 02/28/2014 01:09 PM, Zhang Huan wrote:
On Fri, Feb 28, 2014 at 12:07 PM, Ravishankar N
<ravishan...@redhat.com> wrote:
On 02/28/2014 07:28 AM, Zhang Huan wrote:
Hello Ravi,
Thanks for your reply.
Sorry that I had a typo in my mail. It should be "underlying
corruption" instead of "underlying correction".
I guess the logic of eliminating zero-byte files from all
innocent nodes is there to prevent underlying corruption from
propagating to other bricks. To ask it another way: if the underlying
brick finds that some file is corrupted, is there anything it can do to
tell glusterfs to fix it?
Hi Zhang,
If all nodes are innocent (from AFR's point of view), then AFR
cannot use the changelog attributes to determine which is the source.
In this case, the safest bet is to mark all zero byte files as
sink, so that we don't end up healing in the wrong direction.
Like I said earlier, AFR can only use the changelog attributes
(xattrs) to determine the sources/sinks. It cannot detect
underlying on-disk file system corruption outside the scope of
the xattrs.
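
To make the above concrete, here is a rough Python sketch, not the actual AFR code. It assumes the usual layout of a trusted.afr.<volname>-client-<N> value (12 bytes, three big-endian 32-bit pending counters in the order data, metadata, entry). The names decode_changelog and pick_sources are made up for illustration, and the selection logic is a deliberately simplified model of the rule described above: a brick accused by another brick is a sink, and if nobody accuses anybody (all innocent), zero-byte files are treated as sinks.

```python
import struct


def decode_changelog(value: bytes) -> dict:
    """Decode one 12-byte changelog xattr value into its pending counters.

    Assumed layout: three big-endian unsigned 32-bit integers,
    in the order data, metadata, entry.
    """
    if len(value) != 12:
        raise ValueError("expected a 12-byte changelog value")
    data, metadata, entry = struct.unpack(">III", value)
    return {"data": data, "metadata": metadata, "entry": entry}


def pick_sources(pending, sizes):
    """Simplified, hypothetical model of data self-heal source selection.

    pending[i][j] is the data-pending count that brick i holds against
    brick j; sizes[i] is the file size on brick i.
    Returns the list of brick indexes that may act as heal sources.
    """
    n = len(sizes)
    accused = {
        j
        for i in range(n)
        for j in range(n)
        if i != j and pending[i][j] > 0
    }
    if accused:
        # The xattrs blame someone: every brick that is not blamed is a source.
        # An empty result means every brick is blamed (split-brain).
        return [b for b in range(n) if b not in accused]
    # All bricks are innocent: the xattrs give no direction, so treat
    # zero-byte files as sinks to avoid healing in the wrong direction.
    nonzero = [b for b in range(n) if sizes[b] > 0]
    return nonzero if nonzero else list(range(n))
```

For example, pick_sources([[0, 0], [0, 0]], [0, 4096]) yields [1]: with clean xattrs on both bricks, the zero-byte copy on brick 0 is not used as a source.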
If you are sure that a particular brick is the right source
despite the xattrs saying otherwise, you can manually change the
attributes of the file on all bricks so that AFR now sees that
brick as the source and heals in the expected direction.
-Ravi
Hello Ravi,
IMO, changing the attributes might be dangerous, since it introduces
concurrent access alongside glusterfs. I am not sure whether glusterfs
already provides some mechanism for this.
You are right Zhang. My assumption was that the file wouldn't be
modified from the mount point while you are modifying the xattrs at the
bricks.
My suggestion is to eliminate the zero-byte file from the heal sources even
if it is marked as a source. If the underlying filesystem finds some
corruption (by a scrubbing daemon after checking data checksums), it
could truncate the file to 0 and let glusterfs do the healing job.
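
A minimal sketch of what such a scrubbing step might look like, purely as an illustration of the idea. The function name scrub_file, the use of SHA-256, and the way the expected checksum is supplied are all assumptions; no existing scrubber or glusterfs interface is implied, and it ignores the concurrent-access concern raised earlier in this thread.

```python
import hashlib
import os


def scrub_file(path: str, expected_sha256: str) -> bool:
    """Hypothetical scrub step: truncate the file to 0 on checksum mismatch.

    Returns True if the file was found corrupted and truncated,
    False if the checksum matched and the file was left alone.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large files are not loaded into memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    if digest.hexdigest() == expected_sha256:
        return False
    # Mismatch: leave a zero-byte file behind as the "please heal me" signal.
    os.truncate(path, 0)
    return True
```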
If there is underlying FS corruption and we need to make gluster aware
of it, then something like bit rot detection would be the way to go. You
can find more information about some work in progress on the gluster
website and in the mailing list archives:
http://www.gluster.org/community/documentation/index.php/Arch/BitRot_Detection
http://lists.nongnu.org/archive/html/gluster-devel/2014-01/msg00209.html
https://lists.gnu.org/archive/html/gluster-devel/2014-01/msg00006.html
-Ravi
Here are several cases I have in mind.
1. If the corrupted file is marked as the only source, then there is
no correct replica in the filesystem (actually all are fools), so
picking any one as the heal source is OK;
2. If the corrupted file is one of the potential sources, eliminating
it should keep healing in the right direction without further
corrupting other correct replicas.
3. If the corrupted file is not marked as a source, some other replica
will be chosen as the source and this file will be overwritten with
correct data.
4. If no replica is marked as clean by the attributes, it is quite
unlikely that this file is chosen as the source, as its size is 0. Even
if it is chosen as the source, there is no further corruption of file
content after the heal.
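
Cases 1 to 3 can be sketched as a small filter applied after the sources have been chosen from the attributes. This is only an illustration of the proposal, not existing glusterfs code; the name filter_zero_byte_sources is hypothetical, and the fallback for case 1 (keep the original candidates when nothing else remains) is one possible reading of "pick any one". Case 4 is not modelled here.

```python
def filter_zero_byte_sources(sources, sizes):
    """Drop zero-byte replicas from the heal source candidates.

    sources: brick indexes already marked as sources by the attributes.
    sizes:   file size on each brick, indexed by brick number.
    """
    nonzero = [b for b in sources if sizes[b] > 0]
    # Case 1: the only marked source is zero-byte, so no candidate is
    # left; fall back to the original candidates.
    # Case 2: at least one non-empty source remains and is used.
    # Case 3: the zero-byte file was never a source; nothing changes.
    return nonzero if nonzero else list(sources)
```

For example, with sources [0, 1] and sizes [0, 4096, 4096], only brick 1 remains as a source, so the truncated copy on brick 0 gets overwritten instead of propagating.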
Zhang Huan