[Gluster-users] Release 3.12.8: Scheduled for the 12th of April

2018-04-10 Thread Jiffin Tony Thottan

Hi,

It's time to prepare the 3.12.8 release, which falls on the 10th of
each month, and hence would be 12-04-2018 this time around.

This mail is to call out the following,

1) Are there any pending *blocker* bugs that need to be tracked for
3.12.8? If so, mark them against the provided tracker [1] as blockers
for the release, or at the very least post them as a response to this
mail.

2) Pending reviews in the 3.12 dashboard will be part of the release,
*iff* they pass regressions and have the review votes, so use the
dashboard [2] to check on the status of your patches to 3.12 and get
these going

3) I have checked what went into 3.10 since the last 3.12 release and
whether those fixes are already included in the 3.12 branch. The status
on this is *green*, as all fixes ported to 3.10 have also been ported
to 3.12.
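
For anyone who wants to repeat this check locally, one rough way to do it
(a sketch, assuming the usual release-3.10 and release-3.12 branch names)
is to list commits on one release branch that have no equivalent patch on
the other:

  git fetch origin
  # '+' marks commits on release-3.10 with no equivalent on release-3.12
  git cherry -v origin/release-3.12 origin/release-3.10 | grep '^+'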

@Mlind

IMO https://review.gluster.org/19659 looks like a minor feature to me. Can
you please provide a justification for why it needs to be included in the
3.12 stable release?


And please rebase the change as well

@Raghavendra

The smoke test failed for https://review.gluster.org/#/c/19818/. Can you
please check it?


Thanks,
Jiffin

[1] Release bug tracker:
https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.12.8

[2] 3.12 review dashboard:
https://review.gluster.org/#/projects/glusterfs,dashboards/dashboard:3-12-dashboard 


Re: [Gluster-users] volume start: gv01: failed: Quorum not met. Volume operation not allowed.

2018-04-10 Thread TomK

On 4/9/2018 2:45 AM, Alex K wrote:
Hey Alex,

With two nodes, the setup works, but both sides go down when one node is
missing. Still, I set the two params below to none and that solved my issue:


cluster.quorum-type: none
cluster.server-quorum-type: none
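
For anyone following along, setting them boils down to something like this
(a sketch, assuming the volume name gv01 from the output further down):

  gluster volume set gv01 cluster.quorum-type none
  gluster volume set gv01 cluster.server-quorum-type none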

Thank you for that.

Cheers,
Tom


Hi,

You need at least 3 nodes to have quorum enabled. In a 2-node setup you
need to disable quorum so that you can still use the volume when one of
the nodes goes down.
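
If a full third data node is not practical, a replica 3 arbiter 1 volume is
a common middle ground, since the third brick only stores metadata. A purely
illustrative sketch for a new volume (the host nfs03 and the brick paths are
made up here):

  gluster volume create myvol replica 3 arbiter 1 \
      nfs01:/bricks/0/myvol nfs02:/bricks/0/myvol nfs03:/bricks/0/myvol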


On Mon, Apr 9, 2018, 09:02 TomK wrote:


Hey All,

In a two-node glusterfs setup, with one node down, I can't use the second
node to mount the volume.  I understand this is expected behaviour?
Is there any way to allow the secondary node to function and then replicate
what changed to the first (primary) when it's back online?  Or should I just
go for a third node to allow for this?

Also, how safe is it to set the following to none?

cluster.quorum-type: auto
cluster.server-quorum-type: server


[root@nfs01 /]# gluster volume start gv01
volume start: gv01: failed: Quorum not met. Volume operation not
allowed.
[root@nfs01 /]#


[root@nfs01 /]# gluster volume status
Status of volume: gv01
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick nfs01:/bricks/0/gv01                  N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       25561

Task Status of Volume gv01
------------------------------------------------------------------------------
There are no active volume tasks

[root@nfs01 /]#


[root@nfs01 /]# gluster volume info

Volume Name: gv01
Type: Replicate
Volume ID: e5ccc75e-5192-45ac-b410-a34ebd777666
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: nfs01:/bricks/0/gv01
Brick2: nfs02:/bricks/0/gv01
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
nfs.trusted-sync: on
performance.cache-size: 1GB
performance.io-thread-count: 16
performance.write-behind-window-size: 8MB
performance.readdir-ahead: on
client.event-threads: 8
server.event-threads: 8
cluster.quorum-type: auto
cluster.server-quorum-type: server
[root@nfs01 /]#




==> n.log <==
[2018-04-09 05:08:13.704156] I [MSGID: 100030] [glusterfsd.c:2556:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version
3.13.2 (args: /usr/sbin/glusterfs --process-name fuse
--volfile-server=nfs01 --volfile-id=/gv01 /n)
[2018-04-09 05:08:13.711255] W [MSGID: 101002]
[options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is
deprecated, preferred is 'transport.address-family', continuing with
correction
[2018-04-09 05:08:13.728297] W [socket.c:3216:socket_connect]
0-glusterfs: Error disabling sockopt IPV6_V6ONLY: "Protocol not
available"
[2018-04-09 05:08:13.729025] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 1
[2018-04-09 05:08:13.737757] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 2
[2018-04-09 05:08:13.738114] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 3
[2018-04-09 05:08:13.738203] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 4
[2018-04-09 05:08:13.738324] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 5
[2018-04-09 05:08:13.738330] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 6
[2018-04-09 05:08:13.738655] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 7
[2018-04-09 05:08:13.738742] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 8
[2018-04-09 05:08:13.739460] W [MSGID: 101174]
[graph.c:363:_log_if_unknown_option] 0-gv01-readdir-ahead: option
'parallel-readdir' is not recognized
[2018-04-09 05:08:13.739787] I [MSGID: 114020] [client.c:2360:notify]
0-gv01-client-0: parent translators are ready, attempting connect on
transport
[2018-04-09 05:08:13.747040] W [socket.c:3216:socket_connect]
0-gv01-client-0: Error disabling sockopt IPV6_V6ONLY: "Protocol not
available"
[2018-04-09 05:08:13.747372] I [MSGID: 114020] [client.c:2360:notify]
0-gv01-client-1: parent translators are ready, attempting connect on
transport
[2018-04-09 05:08:13.747883] E [MSGID: 

[Gluster-users] Reminder: Community Meeting on April 11, 15:00 UTC

2018-04-10 Thread Amye Scavarda
See you on IRC in #gluster-meeting!
Our next Gluster Community Meeting is on April 11 at 15:00 UTC.
Meeting agenda is https://bit.ly/gluster-community-meetings

Thanks!
-- amye
-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead

Re: [Gluster-users] Gluster cluster on two networks

2018-04-10 Thread Marcus Pedersén
Yes,
In first server (urd-gds-001):
gluster peer probe urd-gds-000
gluster peer probe urd-gds-002
gluster peer probe urd-gds-003
gluster peer probe urd-gds-004

gluster pool list (from urd-gds-001):
UUID                                    Hostname        State
bdbe4622-25f9-4ef1-aad1-639ca52fc7e0    urd-gds-002     Connected
2a48a3b9-efa0-4fb7-837f-c800f04bf99f    urd-gds-003     Connected
ad893466-ad09-47f4-8bb4-4cea84085e5b    urd-gds-004     Connected
bfe05382-7e22-4b93-8816-b239b733b610    urd-gds-000     Connected
912bebfd-1a7f-44dc-b0b7-f001a20d58cd    localhost       Connected

Client mount command (same on both sides):
mount -t glusterfs urd-gds-001:/urd-gds-volume /mnt
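
One thing that may be worth checking (just a sketch) is which addresses those
peer names resolve to on each network, e.g. getent hosts on both a server and
a client, plus gluster peer status on a server:

  getent hosts urd-gds-000 urd-gds-001 urd-gds-002 urd-gds-003 urd-gds-004
  gluster peer status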

Regards
Marcus

On Tue, Apr 10, 2018 at 06:24:05PM +0530, Milind Changire wrote:
> Marcus,
> Can you share server-side  gluster peer probe and client-side mount
> command-lines.
> 
> 
> 
> On Tue, Apr 10, 2018 at 12:36 AM, Marcus Pedersén 
> wrote:
> 
> > Hi all!
> >
> > I have setup a replicated/distributed gluster cluster 2 x (2 + 1).
> >
> > Centos 7 and gluster version 3.12.6 on server.
> >
> > All machines have two network interfaces and connected to two different
> > networks,
> >
> > 10.10.0.0/16 (with hostnames in /etc/hosts, gluster version 3.12.6)
> >
> > 192.168.67.0/24 (with ldap, gluster version 3.13.1)
> >
> > Gluster cluster was created on the 10.10.0.0/16 net, gluster peer
> > probe ...and so on.
> >
> > All nodes are available on both networks and have the same names on both
> > networks.
> >
> >
> > Now to my problem, the gluster cluster is mounted on multiple clients on
> > the 192.168.67.0/24 net
> >
> > and a process was running on one of the clients, reading and writing to
> > files.
> >
> > At the same time I mounted the cluster on a client on the 10.10.0.0/16
> > net and started to create
> >
> > and edit files on the cluster. Around the same time the process on the
> > 192-net stopped without any
> >
> > specific errors. Started other processes on the 192-net and continued to
> > make changes on the 10-net
> >
> > and got the same behavior with stopping processes on the 192-net.
> >
> >
> > Is there any known problems with this type of setup?
> >
> > How do I proceed to figure out a solution as I need access from both
> > networks?
> >
> >
> > Following error shows a couple of times on server (systemd -> glusterd):
> >
> > [2018-04-09 11:46:46.254071] C [mem-pool.c:613:mem_pools_init_early]
> > 0-mem-pool: incorrect order of mem-pool initialization (init_done=3)
> >
> >
> > Client logs:
> >
> > Client on 192-net:
> >
> > [2018-04-09 11:35:31.402979] I [MSGID: 114046] 
> > [client-handshake.c:1231:client_setvolume_cbk]
> > 5-urd-gds-volume-client-1: Connected to urd-gds-volume-client-1, attached
> > to remote volume '/urd-gds/gluster'.
> > [2018-04-09 11:35:31.403019] I [MSGID: 114047] 
> > [client-handshake.c:1242:client_setvolume_cbk]
> > 5-urd-gds-volume-client-1: Server and Client lk-version numbers are not
> > same, reopening the fds
> > [2018-04-09 11:35:31.403051] I [MSGID: 114046] 
> > [client-handshake.c:1231:client_setvolume_cbk]
> > 5-urd-gds-volume-snapd-client: Connected to urd-gds-volume-snapd-client,
> > attached to remote volume 'snapd-urd-gds-volume'.
> > [2018-04-09 11:35:31.403091] I [MSGID: 114047] 
> > [client-handshake.c:1242:client_setvolume_cbk]
> > 5-urd-gds-volume-snapd-client: Server and Client lk-version numbers are not
> > same, reopening the fds
> > [2018-04-09 11:35:31.403271] I [MSGID: 114035] 
> > [client-handshake.c:202:client_set_lk_version_cbk]
> > 5-urd-gds-volume-client-3: Server lk version = 1
> > [2018-04-09 11:35:31.403325] I [MSGID: 114035] 
> > [client-handshake.c:202:client_set_lk_version_cbk]
> > 5-urd-gds-volume-client-4: Server lk version = 1
> > [2018-04-09 11:35:31.403349] I [MSGID: 114035] 
> > [client-handshake.c:202:client_set_lk_version_cbk]
> > 5-urd-gds-volume-client-0: Server lk version = 1
> > [2018-04-09 11:35:31.403367] I [MSGID: 114035] 
> > [client-handshake.c:202:client_set_lk_version_cbk]
> > 5-urd-gds-volume-client-2: Server lk version = 1
> > [2018-04-09 11:35:31.403616] I [MSGID: 114035] 
> > [client-handshake.c:202:client_set_lk_version_cbk]
> > 5-urd-gds-volume-client-1: Server lk version = 1
> > [2018-04-09 11:35:31.403751] I [MSGID: 114057] [client-handshake.c:1484:
> > select_server_supported_programs] 5-urd-gds-volume-client-5: Using
> > Program GlusterFS 3.3, Num (1298437), Version (330)
> > [2018-04-09 11:35:31.404174] I [MSGID: 114035] 
> > [client-handshake.c:202:client_set_lk_version_cbk]
> > 5-urd-gds-volume-snapd-client: Server lk version = 1
> > [2018-04-09 11:35:31.405030] I [MSGID: 114046] 
> > [client-handshake.c:1231:client_setvolume_cbk]
> > 5-urd-gds-volume-client-5: Connected to urd-gds-volume-client-5, attached
> > to remote volume '/urd-gds/gluster2'.
> > [2018-04-09 11:35:31.405069] I [MSGID: 114047] 
> > [client-handshake.c:1242:client_setvolume_cbk]
> > 5-urd-gds-volume-client-5: 

Re: [Gluster-users] performance.cache-size for high-RAM clients/servers, other tweaks for performance, and improvements to Gluster docs

2018-04-10 Thread Artem Russakovskii
Hi Vlad,

I actually saw that post already and even asked a question 4 days ago (
https://serverfault.com/questions/517775/glusterfs-direct-i-o-mode#comment1172497_540917).
The accepted answer also seems to go against your suggestion to enable
direct-io-mode as it says it should be disabled for better performance when
used just for file accesses.
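
For anyone wanting to compare the two modes, a sketch (volume name and mount
points below are placeholders) would be to mount the same volume twice and
test against each mount:

  mount -t glusterfs -o direct-io-mode=enable  localhost:/myvol /mnt/dio-on
  mount -t glusterfs -o direct-io-mode=disable localhost:/myvol /mnt/dio-off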

It'd be great if someone from the Gluster team chimed in about this thread.


Sincerely,
Artem

--
Founder, Android Police, APK Mirror, Illogical Robot LLC
beerpla.net | +ArtemRussakovskii | @ArtemR


On Tue, Apr 10, 2018 at 7:01 AM, Vlad Kopylov  wrote:

> Wish I knew or was able to get detailed description of those options
> myself.
> here is direct-io-mode  https://serverfault.com/
> questions/517775/glusterfs-direct-i-o-mode
> Same as you I ran tests on a large volume of files, finding that main
> delays are in attribute calls, ending up with those mount options to add
> performance.
> I discovered those options through basically googling this user list with
> people sharing their tests.
> Not sure I would share your optimism, and rather then going up I
> downgraded to 3.12 and have no dir view issue now. Though I had to recreate
> the cluster and had to re-add bricks with existing data.
>
> On Tue, Apr 10, 2018 at 1:47 AM, Artem Russakovskii 
> wrote:
>
>> Hi Vlad,
>>
>> I'm using only localhost: mounts.
>>
>> Can you please explain what effect each option has on performance issues
>> shown in my posts? "negative-timeout=10,attribute
>> -timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5"
>> From what I remember, direct-io-mode=enable didn't make a difference in my
>> tests, but I suppose I can try again. The explanations about direct-io-mode
>> are quite confusing on the web in various guides, saying enabling it could
>> make performance worse in some situations and better in others due to OS
>> file cache.
>>
>> There are also these gluster volume settings, adding to the confusion:
>> Option: performance.strict-o-direct
>> Default Value: off
>> Description: This option when set to off, ignores the O_DIRECT flag.
>>
>> Option: performance.nfs.strict-o-direct
>> Default Value: off
>> Description: This option when set to off, ignores the O_DIRECT flag.
>>
>> Re: 4.0. I moved to 4.0 after finding out that it fixes the disappearing
>> dirs bug related to cluster.readdir-optimize if you remember (
>> http://lists.gluster.org/pipermail/gluster-users/2018-April/033830.html).
>> I was already on 3.13 by then, and 4.0 resolved the issue. It's been stable
>> for me so far, thankfully.
>>
>>
>> Sincerely,
>> Artem
>>
>> --
>> Founder, Android Police , APK Mirror
>> , Illogical Robot LLC
>> beerpla.net | +ArtemRussakovskii
>>  | @ArtemR
>> 
>>
>> On Mon, Apr 9, 2018 at 10:38 PM, Vlad Kopylov  wrote:
>>
>>> you definitely need mount options to /etc/fstab
>>> use ones from here http://lists.gluster.org/piper
>>> mail/gluster-users/2018-April/033811.html
>>>
>>> I went on with using local mounts to achieve performance as well
>>>
>>> Also, 3.12 or 3.10 branches would be preferable for production
>>>
>>> On Fri, Apr 6, 2018 at 4:12 AM, Artem Russakovskii 
>>> wrote:
>>>
 Hi again,

 I'd like to expand on the performance issues and plead for help. Here's
 one case which shows these odd hiccups: https://i.imgur.com/CXBPjTK.gifv.

 In this GIF where I switch back and forth between copy operations on 2
 servers, I'm copying a 10GB dir full of .apk and image files.

 On server "hive" I'm copying straight from the main disk to an attached
 volume block (xfs). As you can see, the transfers are relatively speedy and
 don't hiccup.
 On server "citadel" I'm copying the same set of data to a 4-replicate
 gluster which uses block storage as a brick. As you can see, performance is
 much worse, and there are frequent pauses for many seconds where nothing
 seems to be happening - just freezes.

 All 4 servers have the same specs, and all of them have performance
 issues with gluster and no such issues when raw xfs block storage is used.

 hive has long finished copying the data, while citadel is barely
 chugging along and is expected to take probably half an hour to an hour. I
 have over 1TB of data to migrate, at which point if we went live, I'm not
 even sure gluster would be able to keep up instead of bringing the machines
 and services down.



 Here's the cluster config, though it didn't seem to make any difference
 performance-wise before I applied the customizations vs after.

 Volume Name: 

Re: [Gluster-users] performance.cache-size for high-RAM clients/servers, other tweaks for performance, and improvements to Gluster docs

2018-04-10 Thread Vlad Kopylov
Wish I knew or was able to get detailed description of those options myself.
here is direct-io-mode
https://serverfault.com/questions/517775/glusterfs-direct-i-o-mode
Same as you, I ran tests on a large volume of files and found that the main
delays are in attribute calls, which is how I ended up with those mount
options to improve performance.
I discovered those options through basically googling this user list with
people sharing their tests.
Not sure I would share your optimism; rather than going up, I downgraded
to 3.12 and have no dir view issue now, though I had to recreate the
cluster and re-add bricks with existing data.
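
For reference, a sketch of what such an /etc/fstab entry could look like
(server, volume name and mount point are placeholders; the options are the
ones mentioned above):

  server1:/myvol  /mnt/myvol  glusterfs  defaults,_netdev,negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5  0 0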

On Tue, Apr 10, 2018 at 1:47 AM, Artem Russakovskii 
wrote:

> Hi Vlad,
>
> I'm using only localhost: mounts.
>
> Can you please explain what effect each option has on performance issues
> shown in my posts? "negative-timeout=10,attribute
> -timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5" From
> what I remember, direct-io-mode=enable didn't make a difference in my
> tests, but I suppose I can try again. The explanations about direct-io-mode
> are quite confusing on the web in various guides, saying enabling it could
> make performance worse in some situations and better in others due to OS
> file cache.
>
> There are also these gluster volume settings, adding to the confusion:
> Option: performance.strict-o-direct
> Default Value: off
> Description: This option when set to off, ignores the O_DIRECT flag.
>
> Option: performance.nfs.strict-o-direct
> Default Value: off
> Description: This option when set to off, ignores the O_DIRECT flag.
>
> Re: 4.0. I moved to 4.0 after finding out that it fixes the disappearing
> dirs bug related to cluster.readdir-optimize if you remember (
> http://lists.gluster.org/pipermail/gluster-users/2018-April/033830.html).
> I was already on 3.13 by then, and 4.0 resolved the issue. It's been stable
> for me so far, thankfully.
>
>
> Sincerely,
> Artem
>
> --
> Founder, Android Police , APK Mirror
> , Illogical Robot LLC
> beerpla.net | +ArtemRussakovskii
>  | @ArtemR
> 
>
> On Mon, Apr 9, 2018 at 10:38 PM, Vlad Kopylov  wrote:
>
>> you definitely need mount options to /etc/fstab
>> use ones from here http://lists.gluster.org/piper
>> mail/gluster-users/2018-April/033811.html
>>
>> I went on with using local mounts to achieve performance as well
>>
>> Also, 3.12 or 3.10 branches would be preferable for production
>>
>> On Fri, Apr 6, 2018 at 4:12 AM, Artem Russakovskii 
>> wrote:
>>
>>> Hi again,
>>>
>>> I'd like to expand on the performance issues and plead for help. Here's
>>> one case which shows these odd hiccups: https://i.imgur.com/CXBPjTK.gifv
>>> .
>>>
>>> In this GIF where I switch back and forth between copy operations on 2
>>> servers, I'm copying a 10GB dir full of .apk and image files.
>>>
>>> On server "hive" I'm copying straight from the main disk to an attached
>>> volume block (xfs). As you can see, the transfers are relatively speedy and
>>> don't hiccup.
>>> On server "citadel" I'm copying the same set of data to a 4-replicate
>>> gluster which uses block storage as a brick. As you can see, performance is
>>> much worse, and there are frequent pauses for many seconds where nothing
>>> seems to be happening - just freezes.
>>>
>>> All 4 servers have the same specs, and all of them have performance
>>> issues with gluster and no such issues when raw xfs block storage is used.
>>>
>>> hive has long finished copying the data, while citadel is barely
>>> chugging along and is expected to take probably half an hour to an hour. I
>>> have over 1TB of data to migrate, at which point if we went live, I'm not
>>> even sure gluster would be able to keep up instead of bringing the machines
>>> and services down.
>>>
>>>
>>>
>>> Here's the cluster config, though it didn't seem to make any difference
>>> performance-wise before I applied the customizations vs after.
>>>
>>> Volume Name: apkmirror_data1
>>> Type: Replicate
>>> Volume ID: 11ecee7e-d4f8-497a-9994-ceb144d6841e
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x 4 = 4
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: nexus2:/mnt/nexus2_block1/apkmirror_data1
>>> Brick2: forge:/mnt/forge_block1/apkmirror_data1
>>> Brick3: hive:/mnt/hive_block1/apkmirror_data1
>>> Brick4: citadel:/mnt/citadel_block1/apkmirror_data1
>>> Options Reconfigured:
>>> cluster.quorum-count: 1
>>> cluster.quorum-type: fixed
>>> network.ping-timeout: 5
>>> network.remote-dio: enable
>>> performance.rda-cache-limit: 256MB
>>> performance.readdir-ahead: on
>>> performance.parallel-readdir: on
>>> network.inode-lru-limit: 50
>>> performance.md-cache-timeout: 600
>>> performance.cache-invalidation: on
>>> performance.stat-prefetch: on
>>> features.cache-invalidation-timeout: 600
>>> 

Re: [Gluster-users] Gluster cluster on two networks

2018-04-10 Thread Milind Changire
Marcus,
Can you share the server-side gluster peer probe and client-side mount
command lines?



On Tue, Apr 10, 2018 at 12:36 AM, Marcus Pedersén 
wrote:

> Hi all!
>
> I have setup a replicated/distributed gluster cluster 2 x (2 + 1).
>
> Centos 7 and gluster version 3.12.6 on server.
>
> All machines have two network interfaces and connected to two different
> networks,
>
> 10.10.0.0/16 (with hostnames in /etc/hosts, gluster version 3.12.6)
>
> 192.168.67.0/24 (with ldap, gluster version 3.13.1)
>
> Gluster cluster was created on the 10.10.0.0/16 net, gluster peer
> probe ...and so on.
>
> All nodes are available on both networks and have the same names on both
> networks.
>
>
> Now to my problem, the gluster cluster is mounted on multiple clients on
> the 192.168.67.0/24 net
>
> and a process was running on one of the clients, reading and writing to
> files.
>
> At the same time I mounted the cluster on a client on the 10.10.0.0/16
> net and started to create
>
> and edit files on the cluster. Around the same time the process on the
> 192-net stopped without any
>
> specific errors. Started other processes on the 192-net and continued to
> make changes on the 10-net
>
> and got the same behavior with stopping processes on the 192-net.
>
>
> Is there any known problems with this type of setup?
>
> How do I proceed to figure out a solution as I need access from both
> networks?
>
>
> Following error shows a couple of times on server (systemd -> glusterd):
>
> [2018-04-09 11:46:46.254071] C [mem-pool.c:613:mem_pools_init_early]
> 0-mem-pool: incorrect order of mem-pool initialization (init_done=3)
>
>
> Client logs:
>
> Client on 192-net:
>
> [2018-04-09 11:35:31.402979] I [MSGID: 114046] 
> [client-handshake.c:1231:client_setvolume_cbk]
> 5-urd-gds-volume-client-1: Connected to urd-gds-volume-client-1, attached
> to remote volume '/urd-gds/gluster'.
> [2018-04-09 11:35:31.403019] I [MSGID: 114047] 
> [client-handshake.c:1242:client_setvolume_cbk]
> 5-urd-gds-volume-client-1: Server and Client lk-version numbers are not
> same, reopening the fds
> [2018-04-09 11:35:31.403051] I [MSGID: 114046] 
> [client-handshake.c:1231:client_setvolume_cbk]
> 5-urd-gds-volume-snapd-client: Connected to urd-gds-volume-snapd-client,
> attached to remote volume 'snapd-urd-gds-volume'.
> [2018-04-09 11:35:31.403091] I [MSGID: 114047] 
> [client-handshake.c:1242:client_setvolume_cbk]
> 5-urd-gds-volume-snapd-client: Server and Client lk-version numbers are not
> same, reopening the fds
> [2018-04-09 11:35:31.403271] I [MSGID: 114035] 
> [client-handshake.c:202:client_set_lk_version_cbk]
> 5-urd-gds-volume-client-3: Server lk version = 1
> [2018-04-09 11:35:31.403325] I [MSGID: 114035] 
> [client-handshake.c:202:client_set_lk_version_cbk]
> 5-urd-gds-volume-client-4: Server lk version = 1
> [2018-04-09 11:35:31.403349] I [MSGID: 114035] 
> [client-handshake.c:202:client_set_lk_version_cbk]
> 5-urd-gds-volume-client-0: Server lk version = 1
> [2018-04-09 11:35:31.403367] I [MSGID: 114035] 
> [client-handshake.c:202:client_set_lk_version_cbk]
> 5-urd-gds-volume-client-2: Server lk version = 1
> [2018-04-09 11:35:31.403616] I [MSGID: 114035] 
> [client-handshake.c:202:client_set_lk_version_cbk]
> 5-urd-gds-volume-client-1: Server lk version = 1
> [2018-04-09 11:35:31.403751] I [MSGID: 114057] [client-handshake.c:1484:
> select_server_supported_programs] 5-urd-gds-volume-client-5: Using
> Program GlusterFS 3.3, Num (1298437), Version (330)
> [2018-04-09 11:35:31.404174] I [MSGID: 114035] 
> [client-handshake.c:202:client_set_lk_version_cbk]
> 5-urd-gds-volume-snapd-client: Server lk version = 1
> [2018-04-09 11:35:31.405030] I [MSGID: 114046] 
> [client-handshake.c:1231:client_setvolume_cbk]
> 5-urd-gds-volume-client-5: Connected to urd-gds-volume-client-5, attached
> to remote volume '/urd-gds/gluster2'.
> [2018-04-09 11:35:31.405069] I [MSGID: 114047] 
> [client-handshake.c:1242:client_setvolume_cbk]
> 5-urd-gds-volume-client-5: Server and Client lk-version numbers are not
> same, reopening the fds
> [2018-04-09 11:35:31.405585] I [MSGID: 114035] 
> [client-handshake.c:202:client_set_lk_version_cbk]
> 5-urd-gds-volume-client-5: Server lk version = 1
> [2018-04-09 11:42:29.622006] I [fuse-bridge.c:4835:fuse_graph_sync]
> 0-fuse: switched to graph 5
> [2018-04-09 11:42:29.627533] I [MSGID: 109005] 
> [dht-selfheal.c:2458:dht_selfheal_directory]
> 5-urd-gds-volume-dht: Directory selfheal failed: Unable to form layout for
> directory /
> [2018-04-09 11:42:29.627935] I [MSGID: 114021] [client.c:2369:notify]
> 2-urd-gds-volume-client-0: current graph is no longer active, destroying
> rpc_client
> [2018-04-09 11:42:29.628013] I [MSGID: 114021] [client.c:2369:notify]
> 2-urd-gds-volume-client-1: current graph is no longer active, destroying
> rpc_client
> [2018-04-09 11:42:29.628047] I [MSGID: 114021] [client.c:2369:notify]
> 2-urd-gds-volume-client-2: current graph is no longer active, destroying
> rpc_client

[Gluster-users] Gluster cluster on two networks

2018-04-10 Thread Marcus Pedersén
Hi all!

I have setup a replicated/distributed gluster cluster 2 x (2 + 1).
Centos 7 and gluster version 3.12.6 on server.
All machines have two network interfaces and connected to two different 
networks,
10.10.0.0/16 (with hostnames in /etc/hosts, gluster version 3.12.6)
192.168.67.0/24 (with ldap, gluster version 3.13.1)

Gluster cluster was created on the 10.10.0.0/16 net, gluster peer probe ...and 
so on.
All nodes are available on both networks and have the same names on both 
networks.

Now to my problem, the gluster cluster is mounted on multiple clients on the 
192.168.67.0/24 net
and a process was running on one of the clients, reading and writing to files.
At the same time I mounted the cluster on a client on the 10.10.0.0/16 net and
started to create and edit files on the cluster. Around the same time, the
process on the 192-net stopped without any specific errors. I started other
processes on the 192-net and continued to make changes on the 10-net, and got
the same behavior: the processes on the 192-net stopped.


Is there any known problems with this type of setup?
How do I proceed to figure out a solution as I need access from both networks?

Following error shows a couple of times on server (systemd -> glusterd):
[2018-04-09 11:46:46.254071] C [mem-pool.c:613:mem_pools_init_early] 
0-mem-pool: incorrect order of mem-pool initialization (init_done=3)


Client logs:
Client on 192-net: 

[2018-04-09 11:35:31.402979] I [MSGID: 114046] 
[client-handshake.c:1231:client_setvolume_cbk] 5-urd-gds-volume-client-1: 
Connected to urd-gds-volume-client-1, attached to remote volume 
'/urd-gds/gluster'.
[2018-04-09 11:35:31.403019] I [MSGID: 114047] 
[client-handshake.c:1242:client_setvolume_cbk] 5-urd-gds-volume-client-1: 
Server and Client lk-version numbers are not same, reopening the fds
[2018-04-09 11:35:31.403051] I [MSGID: 114046] 
[client-handshake.c:1231:client_setvolume_cbk] 5-urd-gds-volume-snapd-client: 
Connected to urd-gds-volume-snapd-client, attached to remote volume 
'snapd-urd-gds-volume'.
[2018-04-09 11:35:31.403091] I [MSGID: 114047] 
[client-handshake.c:1242:client_setvolume_cbk] 5-urd-gds-volume-snapd-client: 
Server and Client lk-version numbers are not same, reopening the fds
[2018-04-09 11:35:31.403271] I [MSGID: 114035] 
[client-handshake.c:202:client_set_lk_version_cbk] 5-urd-gds-volume-client-3: 
Server lk version = 1
[2018-04-09 11:35:31.403325] I [MSGID: 114035] 
[client-handshake.c:202:client_set_lk_version_cbk] 5-urd-gds-volume-client-4: 
Server lk version = 1
[2018-04-09 11:35:31.403349] I [MSGID: 114035] 
[client-handshake.c:202:client_set_lk_version_cbk] 5-urd-gds-volume-client-0: 
Server lk version = 1
[2018-04-09 11:35:31.403367] I [MSGID: 114035] 
[client-handshake.c:202:client_set_lk_version_cbk] 5-urd-gds-volume-client-2: 
Server lk version = 1
[2018-04-09 11:35:31.403616] I [MSGID: 114035] 
[client-handshake.c:202:client_set_lk_version_cbk] 5-urd-gds-volume-client-1: 
Server lk version = 1
[2018-04-09 11:35:31.403751] I [MSGID: 114057] 
[client-handshake.c:1484:select_server_supported_programs] 
5-urd-gds-volume-client-5: Using Program GlusterFS 3.3, Num (1298437), Version 
(330)
[2018-04-09 11:35:31.404174] I [MSGID: 114035] 
[client-handshake.c:202:client_set_lk_version_cbk] 
5-urd-gds-volume-snapd-client: Server lk version = 1
[2018-04-09 11:35:31.405030] I [MSGID: 114046] 
[client-handshake.c:1231:client_setvolume_cbk] 5-urd-gds-volume-client-5: 
Connected to urd-gds-volume-client-5, attached to remote volume 
'/urd-gds/gluster2'.
[2018-04-09 11:35:31.405069] I [MSGID: 114047] 
[client-handshake.c:1242:client_setvolume_cbk] 5-urd-gds-volume-client-5: 
Server and Client lk-version numbers are not same, reopening the fds
[2018-04-09 11:35:31.405585] I [MSGID: 114035] 
[client-handshake.c:202:client_set_lk_version_cbk] 5-urd-gds-volume-client-5: 
Server lk version = 1
[2018-04-09 11:42:29.622006] I [fuse-bridge.c:4835:fuse_graph_sync] 0-fuse: 
switched to graph 5
[2018-04-09 11:42:29.627533] I [MSGID: 109005] 
[dht-selfheal.c:2458:dht_selfheal_directory] 5-urd-gds-volume-dht: Directory 
selfheal failed: Unable to form layout for directory /
[2018-04-09 11:42:29.627935] I [MSGID: 114021] [client.c:2369:notify] 
2-urd-gds-volume-client-0: current graph is no longer active, destroying 
rpc_client
[2018-04-09 11:42:29.628013] I [MSGID: 114021] [client.c:2369:notify] 
2-urd-gds-volume-client-1: current graph is no longer active, destroying 
rpc_client
[2018-04-09 11:42:29.628047] I [MSGID: 114021] [client.c:2369:notify] 
2-urd-gds-volume-client-2: current graph is no longer active, destroying 
rpc_client
[2018-04-09 11:42:29.628069] I [MSGID: 114018] 
[client.c:2285:client_rpc_notify] 2-urd-gds-volume-client-0: disconnected from 
urd-gds-volume-client-0. Client process will keep trying to connect to glusterd 
until brick's port is available
[2018-04-09 11:42:29.628077] I [MSGID: 114021] [client.c:2369:notify] 
2-urd-gds-volume-client-3: current graph is no longer