Re: [Gluster-users] data corruption - any update?

2017-10-03 Thread Krutika Dhananjay
On Wed, Oct 4, 2017 at 10:51 AM, Nithya Balachandran 
wrote:

>
>
> On 3 October 2017 at 13:27, Gandalf Corvotempesta <
> gandalf.corvotempe...@gmail.com> wrote:
>
>> Any update on the multiple data corruption bugs with sharding
>> enabled?
>>
>> Is 3.12.1 ready to be used in production?
>>
>
> Most issues have been fixed but there appears to be one more race for
> which the patch is being worked on.
>
> @Krutika, is that correct?
>
>
>
That is my understanding too, yes, in light of the discussion that happened
at https://bugzilla.redhat.com/show_bug.cgi?id=1465123

-Krutika


> Thanks,
> Nithya
>
>
>
>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] data corruption - any update?

2017-10-03 Thread Nithya Balachandran
On 3 October 2017 at 13:27, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> Any update on the multiple data corruption bugs with sharding
> enabled?
>
> Is 3.12.1 ready to be used in production?
>

Most issues have been fixed but there appears to be one more race for which
the patch is being worked on.

@Krutika, is that correct?


Thanks,
Nithya





[Gluster-users] data corruption - any update?

2017-10-03 Thread Gandalf Corvotempesta
Any update on the multiple data corruption bugs with sharding
enabled?

Is 3.12.1 ready to be used in production?


Re: [Gluster-users] how to verify bitrot signed file manually?

2017-10-03 Thread Amudhan P
My volume is a distributed-disperse volume with 8+2 erasure coding (EC).
file-1 and file-2 are different files on the same brick. I can read both
from the mount point without any issue because EC reconstructs the data
from the remaining fragments on the other nodes.
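
Kotresh's point quoted below, that I/O on a file marked bad should fail with EIO, can be probed from a client by reading the file through the mount point. The following is only a minimal sketch: the function name `read_fails_with_eio` is hypothetical, and it tests nothing beyond whether a full sequential read raises EIO.

```python
import errno


def read_fails_with_eio(path: str, chunk_size: int = 1 << 20) -> bool:
    """Read the whole file; return True only if the read fails with EIO.

    Hypothetical helper for illustration; not part of gluster.
    """
    try:
        with open(path, "rb") as handle:
            # Read to end of file so every block is actually fetched.
            while handle.read(chunk_size):
                pass
    except OSError as error:
        if error.errno == errno.EIO:
            return True
        raise  # any other failure is not what this probe looks for
    return False
```

On a disperse volume a `False` result through the mount point is consistent with what is described above, since the read can be served from the other fragments.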

My question: the sha256 value of file-1 matches its bitrot signature, yet
the scrubber daemon still marks it as bad. Why is that?
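
The manual comparison described in this thread (sha256sum of the file on the brick versus the trusted.bit-rot.signature xattr) can be scripted roughly as follows. This is a sketch under one assumption drawn from the getfattr output quoted below: the SHA-256 digest is taken to be the trailing 64 hex characters of the signature value, with whatever precedes it treated as a header. The function names are hypothetical.

```python
import hashlib

SHA256_HEX_LEN = 64  # a SHA-256 digest is 32 bytes, i.e. 64 hex characters


def digest_from_signature(xattr_hex: str) -> str:
    """Return the trailing 64 hex characters of a signature xattr value.

    Assumption: the digest sits at the end of the value, as the
    getfattr output in this thread suggests.
    """
    value = xattr_hex.strip().lower()
    if value.startswith("0x"):
        value = value[2:]
    if len(value) < SHA256_HEX_LEN:
        raise ValueError("signature value is shorter than a SHA-256 digest")
    return value[-SHA256_HEX_LEN:]


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large brick files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def signature_matches(path: str, xattr_hex: str) -> bool:
    """Compare the file's SHA-256 with the digest taken from the xattr."""
    return sha256_of_file(path) == digest_from_signature(xattr_hex)
```

The hex string would come from running getfattr with `-e hex` against the file on the brick, as in the output quoted below. A match here only reproduces the manual check; it does not explain why the scrubber set trusted.bit-rot.bad-file.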



On Fri, Sep 29, 2017 at 12:52 PM, Kotresh Hiremath Ravishankar <
khire...@redhat.com> wrote:

> Hi Amudhan,
>
> Sorry for the late response; I was busy with other things. You are right,
> bitrot uses sha256 for the checksum.
> If file-1 and file-2 are marked bad, I/O on them should fail with EIO.
> If that is not happening, we need to look further into it. But what are
> the contents of file-1 and file-2 on the replica bricks? Do they match?
>
> Thanks and Regards,
> Kotresh HR
>
> On Mon, Sep 25, 2017 at 4:19 PM, Amudhan P  wrote:
>
>> resending mail.
>>
>>
>> On Fri, Sep 22, 2017 at 5:30 PM, Amudhan P  wrote:
>>
>>> OK, from the bitrot code I figured out that gluster uses the sha256
>>> hashing algorithm.
>>>
>>>
>>> Now to the problem: during a scrub run in my cluster, some of my files
>>> were marked as bad on a few sets of nodes.
>>> I wanted to confirm that the files are bad, so I used the Linux
>>> "sha256sum" tool to compute the file hashes manually.
>>>
>>> Here is the result.
>>>
>>> file-1 and file-2 are marked as bad by the scrubber; file-3 is healthy.
>>>
>>> The sha256 of file-1 matches its bitrot signature, but it has still been
>>> marked as bad.
>>>
>>> The sha256 of file-2 does not match its bitrot signature; it could be a
>>> victim of bitrot or a bit flip. The file is still readable without any
>>> issue, and no errors were found on the drive.
>>>
>>> The sha256 of file-3 matches its bitrot signature, and the file is healthy.
>>>
>>>
>>> file-1 output from
>>>
>>> "sha256sum" = "71eada9352b1352aaef0f806d3d561768ce2df905ded1668f665e06eca2d0bd4"
>>>
>>>
>>> "getfattr -m. -e hex -d "
>>> # file: file-1
>>> trusted.bit-rot.bad-file=0x3100
>>> trusted.bit-rot.signature=0x01020071eada9352b1352aaef0f806d3d561768ce2df905ded1668f665e06eca2d0bd4
>>> trusted.bit-rot.version=0x020058e4f3b40006793d
>>> trusted.ec.config=0x080a02000200
>>> trusted.ec.dirty=0x
>>> trusted.ec.size=0x000718996701
>>> trusted.ec.version=0x00038c4c00038c4d
>>> trusted.gfid=0xf078a24134fe4f9bb953eca8c28dea9a
>>>
>>> output scrub log:
>>> [2017-09-02 13:02:20.311160] A [MSGID: 118023]
>>> [bit-rot-scrub.c:244:bitd_compare_ckum] 0-qubevaultdr-bit-rot-0:
>>> CORRUPTION DETECTED: Object /file-1 {Brick: /media/disk16/brick16 | GFID:
>>> f078a241-34fe-4f9b-b953-eca8c28dea9a}
>>> [2017-09-02 13:02:20.311579] A [MSGID: 118024]
>>> [bit-rot-scrub.c:264:bitd_compare_ckum] 0-qubevaultdr-bit-rot-0:
>>> Marking /file-1 [GFID: f078a241-34fe-4f9b-b953-eca8c28dea9a | Brick:
>>> /media/disk16/brick16] as corrupted..
>>>
>>> file-2 output from
>>>
>>> "sha256sum" = "c41ef9c81faed4f3e6010ea67984c3cfefd842f98ee342939151f9250972dcda"
>>>
>>>
>>> "getfattr -m. -e hex -d "
>>> # file: file-2
>>> trusted.bit-rot.bad-file=0x3100
>>> trusted.bit-rot.signature=0x0102009162cb17d4f0bee676fcb7830c5286d05b8e8940d14f3d117cb90b7b1defc129
>>> trusted.bit-rot.version=0x020058e4f3b400019bb2
>>> trusted.ec.config=0x080a02000200
>>> trusted.ec.dirty=0x
>>> trusted.ec.size=0x403433f6
>>> trusted.ec.version=0x201a201b
>>> trusted.gfid=0xa50012b0a632477c99232313928d239a
>>>
>>> output scrub log:
>>> [2017-09-02 05:18:14.003156] A [MSGID: 118023]
>>> [bit-rot-scrub.c:244:bitd_compare_ckum] 0-qubevaultdr-bit-rot-0:
>>> CORRUPTION DETECTED: Object /file-2 {Brick: /media/disk13/brick13 | GFID:
>>> a50012b0-a632-477c-9923-2313928d239a}
>>> [2017-09-02 05:18:14.006629] A [MSGID: 118024]
>>> [bit-rot-scrub.c:264:bitd_compare_ckum] 0-qubevaultdr-bit-rot-0:
>>> Marking /file-2 [GFID: a50012b0-a632-477c-9923-2313928d239a | Brick:
>>> /media/disk13/brick13] as corrupted..
>>>
>>>
>>> file-3 output from
>>>
>>> "sha256sum" = "a590735b3c8936cc7ca9835128a19c38a3f79c8fd53fddc031a9349b7e273f27"
>>>
>>>
>>> "getfattr -m. -e hex -d "
>>> # file: file-3
>>> trusted.bit-rot.signature=0x010200a590735b3c8936cc7ca9835128a19c38a3f79c8fd53fddc031a9349b7e273f27
>>> trusted.bit-rot.version=0x020058e4f3b400019bb2
>>> trusted.ec.config=0x080a02000200
>>> trusted.ec.dirty=0x
>>> trusted.ec.size=0x3530fc96
>>> trusted.ec.version=0x1a981a99
>>> trusted.gfid=0x10d8920e42cd42cf9448b8bf3941c192
>>>
>>>
>>>
>>> Most of the bad files are on the set of new nodes, and the data was
>>> uploaded using gluster 3.10.1. There are no drive issues or error
>>> messages of any kind in the logs.
>>>
>>> What could have gone wrong?
>>>
>>> regards
>>> Amudhan
>>>
>>> On Thu, Sep 21, 2017 at 1:23 PM, Amudhan P  wrote:
>>>
 Hi,

 I hav