Re: [Gluster-users] Split brain?

2016-02-08 Thread Pranith Kumar Karampuri



On 02/08/2016 09:05 PM, Rodolfo Gonzalez wrote:
Hello, I think that I have a split-brain issue in a replicated-striped 
gluster cluster with 4 bricks: brick1+brick2 - brick3+brick4


On both brick3 and brick4 I'm getting these kinds of messages:

[2016-02-08 15:01:49.720343] E 
[afr-self-heal-entry.c:246:afr_selfheal_detect_gfid_and_type_mismatch] 
0-data-replicate-1: Gfid mismatch detected for 
<572624f9-6752-44c4-b403-d1775dc7ea0d/Report.csv>, 
05dabf44-56aa-471f-a898-02ce225d31b0 on data-client-3 and 
1cf52b86-6c69-4483-8f17-686f50b3f316 on data-client-2. Skipping 
conservative merge on the file.


gluster volume status shows all 4 bricks online and no errors. gluster 
peer status shows all peers connected. Bricks 1 and 2 show no 
errors. Gluster version is 3.6.8 on all 4 servers.


How can this be solved? Any help is appreciated :)


Could you give "gluster volume heal <volname> info" output?

Pranith
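In the meantime, the two GFIDs from the log line can be compared directly on the bricks. A dry-run sketch that only composes (and prints, rather than executes) the commands; the brick paths and the file's relative path below are placeholders, substitute your own:

```shell
# Compose, but do not execute, the commands that would read the trusted.gfid
# xattr of the mismatching file on each brick copy. Brick paths and the
# file's relative path are placeholders for illustration.
file_relpath="path/to/Report.csv"
cmds=()
for brick in /export/brick3 /export/brick4; do
  cmds+=("getfattr -n trusted.gfid -e hex $brick/$file_relpath")
done
printf '%s\n' "${cmds[@]}"
```

If the hex GFIDs printed on the two bricks differ, that matches the gfid-mismatch message in the log.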


Thank you.


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users



Re: [Gluster-users] GlusterFS behaviour on stat syscall with relatime activated

2016-02-08 Thread Pranith Kumar Karampuri



On 02/08/2016 09:57 PM, Simon Turcotte-Langevin wrote:


Good day to you Pranith,

Once again, thank you for your time. Our use case does include a lot 
of small files, and read performance must not be impacted by a 
RELATIME-based solution. Even though this option could fix the 
RELATIME behavior on GlusterFS, the performance impact could be 
too great for us. Therefore, we will test the solution, but we will 
also consider alternative ways to detect usage of the files we serve.



hi Simon,
Yeah, it is a trade-off :-/. What kind of workload do you 
guys use? It would be nice to know what you guys (Ubisoft) do in detail; 
even a blog or something where you explain it in detail would 
be fine. I remember you guys from the gluster 3.4 days (I vaguely 
remember debugging root:root directory permission issues). What version 
are you using nowadays? How has the experience been? What are your 
pain points with gluster? This feedback helps in improving gluster.


Thanks
Pranith


Simon

*From:*Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
*Sent:* February 8, 2016 00:26
*To:* Simon Turcotte-Langevin; gluster-users@gluster.org

*Cc:* UPS_Development 
*Subject:* Re: [Gluster-users] GlusterFS behaviour on stat syscall 
with relatime activated


On 02/06/2016 12:19 AM, Simon Turcotte-Langevin wrote:

Good day to you Pranith,

Thank you for your answer, it was exactly this. However, we still
have an issue with RELATIME on GlusterFS.

Stat'ing the file no longer modifies atime with quick-read
disabled; however, cat-ing the file does not replicate the atime.

This is because of the open-behind feature. Disable open-behind with 
"gluster volume set <volname> open-behind off". I believe you will then 
see the atime behavior you want. This will reduce the performance of 
small-file reads (< 64KB): instead of one lookup over the network, it 
will now do lookup + open (sent to both replica bricks, which updates 
atime) + read (from only one of the bricks). Let me know if you want 
any more information.
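A way to verify this is to compare atime on both replica bricks after a read. A dry-run sketch that only prints the commands; the volume name, node names, mount point and brick path are assumptions for illustration:

```shell
# Dry-run sketch: print the commands for disabling open-behind and then
# checking whether a read propagates atime to both replica bricks.
# Volume name, node names, mount point and brick path are placeholders.
vol="data"
cmds=("gluster volume set $vol open-behind off"
      "cat /mnt/gv0/file1 > /dev/null"             # lookup + open + read
      "ssh node2 stat -c %X /export/brick/file1"   # atime on replica copy 1
      "ssh node3 stat -c %X /export/brick/file1")  # atime on replica copy 2
printf '%s\n' "${cmds[@]}"
```

With open-behind off, the two stat values should match after the cat, since the open is sent to both replicas.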


Pranith


If I touch manually the file, the atime (or utimes) is replicated
correctly.

So to sum it up:

- [node1] touch -a file1
  - Access time is right on [node1], [node2] and [node3]
- [node1] cat file1
  - Access time is right on [node1]
  - Access time is wrong on [node2] and [node3]

Would you have any idea what is going on behind the curtain, and
if there is any way to fix that behavior?

Thank you,

Simon

*From:*Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
*Sent:* February 5, 2016 00:55
*To:* Simon Turcotte-Langevin; gluster-users@gluster.org
*Cc:* UPS_Development

*Subject:* Re: [Gluster-users] GlusterFS behaviour on stat syscall
with relatime activated

On 02/03/2016 10:12 PM, Simon Turcotte-Langevin wrote:

Hi, we have multiple clusters of GlusterFS which are mostly
alike. The typical setup is as such:

- Cluster of 3 nodes
- Replication factor of 3
- Each node has 1 brick, mounted on XFS with RELATIME and NODIRATIME
- Each node has 8 disks in hardware RAID 0

The main problem we are facing is that merely observing the
access time of a file on the volume will update the access time.

The steps to reproduce the problem are:

- Create a file (echo 'some data' > /mnt/gv0/file)
- Touch its mtime and atime to some past date (touch -d 19700101 /mnt/gv0/file)
- Touch its mtime to the current timestamp (touch -m /mnt/gv0/file)
- Stat the file until atime is updated (stat /mnt/gv0/file)
  - Sometimes it's instant; sometimes the above command needs to be
    executed a couple of times
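The four steps above can be run as a small script. A runnable sketch against a plain local file (GNU stat/touch assumed); on a real setup, point MOUNT at the GlusterFS FUSE mount instead:

```shell
# Runnable sketch of the four reproduction steps against a plain local
# file. On a real setup, set MOUNT to the GlusterFS FUSE mount point.
MOUNT=${MOUNT:-$(mktemp -d)}
echo 'some data' > "$MOUNT/file"             # 1. create the file
touch -d 19700101 "$MOUNT/file"              # 2. atime and mtime to the past
touch -m "$MOUNT/file"                       # 3. mtime back to now
stat -c 'atime=%X mtime=%Y' "$MOUNT/file"    # 4. observe whether atime moves
```

On a local filesystem, stat alone does not open the file, so atime stays in the past; the report above is that on GlusterFS the stat eventually bumps it.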

atime changes on open call.

The quick-read xlator opens the file and reads the content on 'lookup',
which gets triggered by stat. It does that to serve reads from
memory and reduce the number of network round trips for small files.
Could you disable that xlator and try the experiment? On my
machine the atime didn't change after I disabled that feature using:

"gluster volume set <volname> quick-read off"

Pranith


On the IRC channel, I spoke to a developer (nickname ndevos)
who said that it might be a getxattr() syscall that could be
called when stat() is called on a replicated volume.

Anybody can reproduce this issue? Is it a bug, or is it
working as intended? Is there any workaround?

Thank you,

Simon









[Gluster-users] geo-replication 3.7.6 fails after initial hybrid crawl...

2016-02-08 Thread Dietmar Putz

Hi all,

once again I need some help to get our geo-replication running again...
Master and slave are 6-node distributed-replicated volumes running 
Ubuntu 14.04 and glusterfs 3.7.6 from the Ubuntu PPA.
The master volume already contains about 45 TByte of data; the slave 
volume was created from scratch before geo-replication was set up and 
started.
Both clusters have existed since gluster 3.3 and were updated step by step 
to 3.4, 3.5, 3.6 and 3.7. Since the update to 3.5, the geo-replication has 
not been running anymore.


The geo-replication started with 3 active and 3 passive connections in 
hybrid crawl mode and transferred about 99% of the data to the 
slave volume.
Afterwards the geo-replication on the first master pair (the active and 
the corresponding passive node) became faulty; about two hours later, 
the second master pair followed.
The last active master node remained in hybrid crawl for a further 36 hours 
and was still transferring data to the slave until it failed too.


Currently I can sometimes see an active master in history crawl for a 
very short moment before the status is faulty again.
While I'm writing this mail I notice some failures reported for 
gluster-ger-ber-07; this was the last active master node...


does anybody have an idea what to do next...?

any help is welcome...
best regards
dietmar



[ 15:47:45 ] - root@gluster-ger-ber-07 ~/tmp/geo-rep-376 $ gluster volume geo-replication ger-ber-01 gluster-wien-02::wien-01 status detail


(all rows: MASTER VOL = ger-ber-01, MASTER BRICK = /gluster-export, SLAVE USER = root, SLAVE = gluster-wien-02::wien-01; all CHECKPOINT columns = N/A)

MASTER NODE           SLAVE NODE            STATUS    CRAWL STATUS    LAST_SYNCED    ENTRY    DATA    META    FAILURES
gluster-ger-ber-07    N/A                   Faulty    N/A             N/A            N/A      N/A     N/A     N/A
gluster-ger-ber-10    N/A                   Faulty    N/A             N/A            N/A      N/A     N/A     N/A
gluster-ger-ber-12    N/A                   Faulty    N/A             N/A            N/A      N/A     N/A     N/A
gluster-ger-ber-09    N/A                   Faulty    N/A             N/A            N/A      N/A     N/A     N/A
gluster-ger-ber-11    N/A                   Faulty    N/A             N/A            N/A      N/A     N/A     N/A
gluster-ger-ber-08    gluster-wien-04-int   Active    History Crawl   N/A            0        0       0       0





[ 15:47:14 ] - root@gluster-ger-ber-07 ~/tmp/geo-rep-376 $ gluster volume geo-replication ger-ber-01 gluster-wien-02::wien-01 status detail


(all rows: MASTER VOL = ger-ber-01, MASTER BRICK = /gluster-export, SLAVE USER = root, SLAVE = gluster-wien-02::wien-01; all CHECKPOINT columns = N/A)

MASTER NODE           SLAVE NODE            STATUS    CRAWL STATUS    LAST_SYNCED            ENTRY    DATA    META    FAILURES
gluster-ger-ber-07    gluster-wien-07-int   Active    History Crawl   2016-02-07 22:12:51    0        0       0       2601
gluster-ger-ber-12    N/A                   Faulty    N/A             N/A                    N/A      N/A     N/A     N/A
gluster-ger-ber-11    N/A                   Faulty    N/A             N/A                    N/A      N/A     N/A     N/A
gluster-ger-ber-10    N/A                   Faulty    N/A             N/A                    N/A      N/A     N/A     N/A
gluster-ger-ber-09    N/A                   Faulty    N/A             N/A                    N/A      N/A     N/A     N/A
gluster-ge
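A first diagnostic pass for a faulty session usually means re-checking status and reading the gsyncd logs on the faulty master nodes. A dry-run sketch that only prints the commands; the session names are taken from the status output above, but the log-directory layout is an assumption, so check your own installation:

```shell
# Dry-run sketch: print a first round of diagnostic commands for the
# faulty geo-replication session. Session names come from the status
# output; the log path is an assumption for illustration.
master="ger-ber-01"; slave="gluster-wien-02::wien-01"
cmds=("gluster volume geo-replication $master $slave status detail"
      "tail -n 100 /var/log/glusterfs/geo-replication/${master}/*.log")
printf '%s\n' "${cmds[@]}"
```

The gsyncd log on the node that flips Active -> Faulty normally contains the traceback explaining why the worker died.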

[Gluster-users] Split brain?

2016-02-08 Thread Rodolfo Gonzalez
Hello, I think that I have a split-brain issue in a replicated-striped
gluster cluster with 4 bricks: brick1+brick2 - brick3+brick4

On both brick3 and brick4 I'm getting these kinds of messages:

[2016-02-08 15:01:49.720343] E
[afr-self-heal-entry.c:246:afr_selfheal_detect_gfid_and_type_mismatch]
0-data-replicate-1: Gfid mismatch detected for
<572624f9-6752-44c4-b403-d1775dc7ea0d/Report.csv>,
05dabf44-56aa-471f-a898-02ce225d31b0 on data-client-3 and
1cf52b86-6c69-4483-8f17-686f50b3f316 on data-client-2. Skipping
conservative merge on the file.

gluster volume status shows all 4 bricks online and no errors. gluster peer
status shows all peers connected. Bricks 1 and 2 show no errors. Gluster
version is 3.6.8 on all 4 servers.

How can this be solved? Any help is appreciated :)

Thank you.

Re: [Gluster-users] Fail of one brick lead to crash VMs

2016-02-08 Thread FNU Raghavendra Manjunath
+ Pranith

In the meantime, can you please provide the logs of all the gluster server
machines and the client machines?

Logs can be found in /var/log/glusterfs directory.

Regards,
Raghavendra

On Mon, Feb 8, 2016 at 9:20 AM, Dominique Roux 
wrote:

> Hi guys,
>
> I faced a problem a week ago.
> In our environment we have three servers in a quorum. The gluster volume
> is spread over two bricks and is of type replicate.
>
> To simulate the failure of one brick, we isolated one of the two
> bricks with iptables, so that communication with the other two peers
> was no longer possible.
> After that, VMs (OpenNebula) which had I/O during this time crashed.
> We stopped glusterfsd hard (kill -9) and restarted it, which made
> things work again (we also had to restart the failed VMs). But
> I think this shouldn't happen, since quorum was still intact (2 of 3
> hosts were still up and connected).
>
> Here some infos of our system:
> OS: CentOS Linux release 7.1.1503
> Glusterfs version: glusterfs 3.7.3
>
> gluster volume info:
>
> Volume Name: cluster1
> Type: Replicate
> Volume ID:
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: srv01:/home/gluster
> Brick2: srv02:/home/gluster
> Options Reconfigured:
> cluster.self-heal-daemon: enable
> cluster.server-quorum-type: server
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.stat-prefetch: on
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> server.allow-insecure: on
> nfs.disable: 1
>
> Hope you can help us.
>
> Thanks a lot.
>
> Best regards
> Dominique

[Gluster-users] Fail of one brick lead to crash VMs

2016-02-08 Thread Dominique Roux
Hi guys,

I faced a problem a week ago.
In our environment we have three servers in a quorum. The gluster volume
is spread over two bricks and is of type replicate.

To simulate the failure of one brick, we isolated one of the two
bricks with iptables, so that communication with the other two peers
was no longer possible.
After that, VMs (OpenNebula) which had I/O during this time crashed.
We stopped glusterfsd hard (kill -9) and restarted it, which made
things work again (we also had to restart the failed VMs). But
I think this shouldn't happen, since quorum was still intact (2 of 3 hosts
were still up and connected).

Here some infos of our system:
OS: CentOS Linux release 7.1.1503
Glusterfs version: glusterfs 3.7.3

gluster volume info:

Volume Name: cluster1
Type: Replicate
Volume ID:
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: srv01:/home/gluster
Brick2: srv02:/home/gluster
Options Reconfigured:
cluster.self-heal-daemon: enable
cluster.server-quorum-type: server
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: on
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
server.allow-insecure: on
nfs.disable: 1
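For reference, these are the quorum-related options that decide whether bricks keep serving writes when a peer drops out. A dry-run sketch that only prints the inspection commands (the volume name is taken from the output above; note that "gluster volume get" may not exist on very old releases):

```shell
# Dry-run sketch: print the commands to inspect the quorum-related
# options for this volume. Volume name taken from the 'volume info'
# output; availability of 'gluster volume get' depends on the version.
vol="cluster1"
cmds=("gluster volume get $vol cluster.server-quorum-type"
      "gluster volume get $vol cluster.server-quorum-ratio"
      "gluster volume get $vol cluster.quorum-type")
printf '%s\n' "${cmds[@]}"
```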

Hope you can help us.

Thanks a lot.

Best regards
Dominique


Re: [Gluster-users] Different file in two bricks, no split-brain detected

2016-02-08 Thread Krutika Dhananjay
Hi, 

Could you disable quick-read, read-ahead and io-cache, run your test again, 
and share the results: 
# gluster volume set <volname> performance.quick-read off 
# gluster volume set <volname> performance.read-ahead off 
# gluster volume set <volname> performance.io-cache off 
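The three settings plus a re-test can be sketched as a dry run that only prints the commands to run; the volume name, node names and paths below are placeholders, not the reporter's actual setup:

```shell
# Dry-run sketch: print the three 'volume set' commands plus a re-test
# comparing md5sums on the mount and on each brick copy. Volume name,
# node names and file paths are placeholders for illustration.
vol="myvol"
cmds=()
for opt in performance.quick-read performance.read-ahead performance.io-cache; do
  cmds+=("gluster volume set $vol $opt off")
done
cmds+=("md5sum /mnt/$vol/bigfile"
       "ssh node1 md5sum /export/brick/bigfile"
       "ssh node2 md5sum /export/brick/bigfile")
printf '%s\n' "${cmds[@]}"
```

If the mismatch disappears with the caching xlators off, that points at a stale-cache problem rather than on-disk corruption.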

-Krutika 
- Original Message -

> From: "Klearchos Chaloulos (Nokia - GR/Athens)"
> 
> To: "EXT Krutika Dhananjay" 
> Cc: gluster-users@gluster.org
> Sent: Friday, February 5, 2016 8:30:57 PM
> Subject: RE: [Gluster-users] Different file in two bricks, no split-brain
> detected

> Hello,

> I managed to get logs from two occurrences, please see anonymized logs
> attached.

> Occurrence 1: The copied file in both bricks had the correct checksum, but the
> client saw an erroneous checksum.

> Occurrence 2: The copied file in one brick had the correct checksum, and in the
> second brick it had an erroneous checksum.

> For details check the notes.txt file in the tarballs.

> Do you have any idea what could be causing this behavior?

> Best regards,

> Klearchos

> From: Chaloulos, Klearchos (Nokia - GR/Athens)
> Sent: Monday, February 01, 2016 10:39 AM
> To: 'EXT Krutika Dhananjay' 
> Cc: gluster-users@gluster.org
> Subject: RE: [Gluster-users] Different file in two bricks, no split-brain
> detected

> Hello,

> Sorry for not replying, but lately the issue cannot be reproduced. If we have
> any new occurrences I’ll collect the logs and send them here.

> Klearchos

> From: EXT Krutika Dhananjay [ mailto:kdhan...@redhat.com ]
> Sent: Wednesday, January 27, 2016 7:12 AM
> To: Chaloulos, Klearchos (Nokia - GR/Athens) <klearchos.chalou...@nokia.com>
> Cc: gluster-users@gluster.org
> Subject: Re: [Gluster-users] Different file in two bricks, no split-brain
> detected

> Hi,

> Could you share the following pieces of information:

> 1) output of `gluster volume info `

> 2) the client/mount logs

> 3) glustershd logs

> -Krutika

> > From: "Klearchos Chaloulos (Nokia - GR/Athens)" <klearchos.chalou...@nokia.com>
> 
> > To: gluster-users@gluster.org
> 
> > Sent: Tuesday, January 26, 2016 9:57:38 PM
> 
> > Subject: [Gluster-users] Different file in two bricks, no split-brain
> > detected
> 

> > Description of problem:
> 

> > My setup has 5 gluster volumes, and each of them has 2 bricks as backend.
> 

> > When I copy a large file (100MB) in a gluster volume, 9/10 times it works
> > OK.
> > But about 1 in 10 times the resulting md5 is wrong. After checking I found
> > that the file in one brick has the correct md5sum, while the file in the
> > other brick has a wrong md5sum. The size of the two files is the same.
> 

> > By running "cmp -l  "
> 

> > I found that the difference was in 49 bytes. So the files in the two bricks
> > had the same size, but 49 bytes were different. Interestingly enough, I saw
> > the same number of 49 differing bytes at every check that I made.
> 

> > Do you know what might cause this behavior, has anyone seen something like
> > this before? Is this a bug in glusterfs?
> 

> > Version-Release number of selected component (if applicable):
> 

> > glusterfs 3.7.5 built on Nov 19 2015 16:29:59
> 

> > Repository revision: git://git.gluster.com/glusterfs.git
> 

> > Copyright (c) 2006-2011 Gluster Inc. < http://www.gluster.com >
> 

> > GlusterFS comes with ABSOLUTELY NO WARRANTY.
> 

> > You may redistribute copies of GlusterFS under the terms of the GNU General
> > Public License.
> 

> > How reproducible:
> 

> > Not easy to reproduce, about 1 in 10 times in some environments, not
> > reproducible at all in other environments.
> 

> > Steps to Reproduce:
> 

> > 1. scp <100MB file> 
> 

> > Expected results:
> 
> > 1. md5sum of the destination should be the same as the source.
> 
> > 2. If the checksum of the file is different between the two bricks, the
> > command "gluster volume heal <volname> info split-brain" should report
> > that the two bricks are in split-brain.
> 
> > Actual results:
> 
> > 1. 1 in 10 times the destination file has an incorrect checksum. The size
> > is the same, but 49 bytes are altered.
> 
> > 2. "gluster volume heal <volname> info split-brain" does not report that
> > the bricks are in split-brain, even though the checksum of the file differs
> > between the two bricks. The size of the file is the same in both bricks,
> > but 49 bytes are altered.
> 

> > Additional info:
> 
