Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-07 Thread Pavel Szalbot
Seems to be so, but if we look back at the described setup and procedure -
what is the reason for IOPS to stop/fail? Rebooting a node is much like
updating gluster, replacing cabling etc. IMO this should not always end up
with the arbiter blaming the other node, and even though I did not
investigate this issue deeply, I do not believe the blame is the reason for
IOPS to drop.

On Sep 7, 2017 21:25, "Alastair Neil"  wrote:

> True but to work your way into that problem with replica 3 is a lot harder
> to achieve than with just replica 2 + arbiter.
>
> On 7 September 2017 at 14:06, Pavel Szalbot 
> wrote:
>
>> Hi Neil, docs mention two live nodes of replica 3 blaming each other and
>> refusing to do IO.
>>
>> https://gluster.readthedocs.io/en/latest/Administrator%20Gui
>> de/Split%20brain%20and%20ways%20to%20deal%20with%20it/#1-replica-3-volume
>>
>>
>>
>> On Sep 7, 2017 17:52, "Alastair Neil"  wrote:
>>
>>> *shrug* I don't use arbiter for vm work loads just straight replica 3.
>>> There are some gotchas with using an arbiter for VM workloads.  If
>>> quorum-type is auto and a brick that is not the arbiter drop out then if
>>> the up brick is dirty as far as the arbiter is concerned i.e. the only good
>>> copy is on the down brick you will get ENOTCONN and your VMs will halt on
>>> IO.
>>>
>>> On 6 September 2017 at 16:06,  wrote:
>>>
 Mh, I never had to do that and I never had that problem. Is that an
 arbiter specific thing ? With replica 3 it just works.

 On Wed, Sep 06, 2017 at 03:59:14PM -0400, Alastair Neil wrote:
 > you need to set
 >
 > cluster.server-quorum-ratio 51%
 >
 > On 6 September 2017 at 10:12, Pavel Szalbot 
 wrote:
 >
 > > Hi all,
 > >
 > > I have promised to do some testing and I finally find some time and
 > > infrastructure.
 > >
 > > So I have 3 servers with Gluster 3.10.5 on CentOS 7. I created
 > > replicated volume with arbiter (2+1) and VM on KVM (via Openstack)
 > > with disk accessible through gfapi. Volume group is set to virt
 > > (gluster volume set gv_openstack_1 virt). VM runs current (all
 > > packages updated) Ubuntu Xenial.
 > >
 > > I set up following fio job:
 > >
 > > [job1]
 > > ioengine=libaio
 > > size=1g
 > > loops=16
 > > bs=512k
 > > direct=1
 > > filename=/tmp/fio.data2
 > >
 > > When I run fio fio.job and reboot one of the data nodes, IO
 statistics
 > > reported by fio drop to 0KB/0KB and 0 IOPS. After a while, root
 > > filesystem gets remounted as read-only.
 > >
 > > If you care about infrastructure, setup details etc., do not
 hesitate to
 > > ask.
 > >
 > > Gluster info on volume:
 > >
 > > Volume Name: gv_openstack_1
 > > Type: Replicate
 > > Volume ID: 2425ae63-3765-4b5e-915b-e132e0d3fff1
 > > Status: Started
 > > Snapshot Count: 0
 > > Number of Bricks: 1 x (2 + 1) = 3
 > > Transport-type: tcp
 > > Bricks:
 > > Brick1: gfs-2.san:/export/gfs/gv_1
 > > Brick2: gfs-3.san:/export/gfs/gv_1
 > > Brick3: docker3.san:/export/gfs/gv_1 (arbiter)
 > > Options Reconfigured:
 > > nfs.disable: on
 > > transport.address-family: inet
 > > performance.quick-read: off
 > > performance.read-ahead: off
 > > performance.io-cache: off
 > > performance.stat-prefetch: off
 > > performance.low-prio-threads: 32
 > > network.remote-dio: enable
 > > cluster.eager-lock: enable
 > > cluster.quorum-type: auto
 > > cluster.server-quorum-type: server
 > > cluster.data-self-heal-algorithm: full
 > > cluster.locking-scheme: granular
 > > cluster.shd-max-threads: 8
 > > cluster.shd-wait-qlength: 1
 > > features.shard: on
 > > user.cifs: off
 > >
 > > Partial KVM XML dump:
 > >
 > > 
 > >   
 > >   >>> > > name='gv_openstack_1/volume-77ebfd13-6a92-4f38-b036-e9e55d752e1e'>
 > > 
 > >   
 > >   
 > >   
 > >   77ebfd13-6a92-4f38-b036-e9e55d752e1e
 > >   
 > >   >>> > > function='0x0'/>
 > > 
 > >
 > > Networking is LACP on data nodes, stack of Juniper EX4550's (10Gbps
 > > SFP+), separate VLAN for Gluster traffic, SSD only on Gluster all
 > > nodes (including arbiter).
 > >
 > > I would really love to know what am I doing wrong, because this is
 my
 > > experience with Gluster for a long time and a reason I would not
 > > recommend it as VM storage backend in production environment where
 you
 > > cannot start/stop VMs on your own (e.g. providing private clouds for
 > > customers).
 > > -ps
 > >
 > >
 > > On Sun, Sep 3, 2017 at 10:21 PM, Gionatan Danti 
 > > wrote:
 > > 

[Gluster-users] Gluster Monthly Newsletter, August 2017

2017-09-07 Thread Amye Scavarda
Gluster Monthly Newsletter, August 2017

-- 

Great news! We’ve released 3.12!

https://www.gluster.org/3-12/announcing-glusterfs-release-3-12-0-long-term-maintenance/


This is a major release that will be Long Term Stable.

Notable feature highlights are:


   - Ability to mount sub-directories using the Gluster native protocol (FUSE)
   - Brick multiplexing enhancements that help scale to larger brick counts
     per node
   - Enhancements to the gluster get-state CLI, enabling a better understanding
     of the various bricks' and nodes' participation/roles in the cluster
   - Ability to resolve GFID split-brain using the existing CLI (see the sketch
     below)
   - Easier GFID-to-real-path mapping, enabling diagnostics and correction for
     reported GFID issues (healing, among other uses where the GFID is the only
     available source for identifying a file)
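
For context, a rough sketch of the existing heal CLI that this resolution
builds on (volume, brick, and file names below are placeholders):

# list entries currently in split-brain on a volume
gluster volume heal myvol info split-brain

# resolve an entry by policy, or by naming the brick that holds the good copy
gluster volume heal myvol split-brain latest-mtime /path/in/volume/file.img
gluster volume heal myvol split-brain source-brick server1:/bricks/b1 /path/in/volume/file.img

Per the release note above, with 3.12 the same CLI can also be used to resolve
GFID split-brain entries.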


Gluster Summit 2017 Schedule is out!

https://www.gluster.org/events/summit2017 has the details. We’re excited to
see everyone there!

If you’re applying for travel, our funding deadline is September 8th at
midnight Pacific.

Want to see our new digs? https://www.gluster.org has a new look and feel.

Suggestions? Bugs? We’re keeping our Glusterweb repo on Github to help
manage this. File an issue if you have something you think we should take a
look at.

We’ve also improved our documentation with a new site address and new
search: http://docs.gluster.org/

Welcome new maintainers!

https://github.com/gluster/glusterfs/blob/master/MAINTAINERS now reflects
our modified maintainer structure, and the maintainers meeting happens
every other week at the same time as the community meeting.

http://lists.gluster.org/pipermail/gluster-devel/2017-July/053362.html has
more details.

Our weekly community meeting has changed: we'll be meeting every other week
instead of weekly, moving the time to 15:00 UTC, and our agenda is at:
https://bit.ly/gluster-community-meetings

Noteworthy threads:

[Gluster-users] [Update] GD2 - what's been happening -
http://lists.gluster.org/pipermail/gluster-users/2017-August/031964.html

[Gluster-users] Gluster 4.0: Update -

http://lists.gluster.org/pipermail/gluster-users/2017-August/032164.html

[Gluster-users] docs.gluster.org

http://lists.gluster.org/pipermail/gluster-users/2017-August/032239.html

[Gluster-users] Gluster documentation search -

http://lists.gluster.org/pipermail/gluster-users/2017-August/032210.html

[Gluster-devel] [Need Feedback] Monitoring

http://lists.gluster.org/pipermail/gluster-devel/2017-August/053426.html

[Gluster-devel] Glusterd2 - Some anticipated changes to glusterfs source

http://lists.gluster.org/pipermail/gluster-devel/2017-August/053433.html

[Gluster-devel] How commonly applications make use of fadvise?

http://lists.gluster.org/pipermail/gluster-devel/2017-August/053457.html

[Gluster-devel] Proposed Protocol changes for 4.0: Need feedback.

http://lists.gluster.org/pipermail/gluster-devel/2017-August/053462.html

[Gluster-devel] Static analysis job for Gluster

http://lists.gluster.org/pipermail/gluster-devel/2017-August/053470.html

[Gluster-devel] Migration from gerrit bugzilla hook to a jenkins job

http://lists.gluster.org/pipermail/gluster-devel/2017-August/053474.html

[Gluster-devel] On knowledge transfer of some of the components

http://lists.gluster.org/pipermail/gluster-devel/2017-August/053502.html

[Gluster-devel] Introducing Fstat API

http://lists.gluster.org/pipermail/gluster-devel/2017-August/053518.html


Gluster Top 5 Contributors in the last 30 days:

Kaleb Keithley, Jose A. Rivera, Sachidananda URS, Michael Scherer, Michael
Adam

Top Contributing Companies:  Red Hat,  Gluster, Inc.,  Facebook

Upcoming CFPs:

DevConf - CfP opens September 15th, 2017!

FOSDEM

https://archive.fosdem.org/2017/news/2016-07-20-call-for-participation/  -
October 31


-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Firewalls and ports and protocols

2017-09-07 Thread Leonid Isaev
On Thu, Sep 07, 2017 at 10:49:21AM +, max.degr...@kpn.com wrote:
> Reading the documentation there is conflicting information:
> 
> According to https://wiki.centos.org/HowTos/GlusterFSonCentOS we only need
> TCP ports open between two GlusterFS servers:
> Ports TCP:24007-24008 are required for communication between GlusterFS nodes 
> and each brick requires another TCP port starting at 24009.
> 
> According to 
> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Setting%20Up%20Clients/
>  we also need to open UDP:
> Ensure that TCP and UDP ports 24007 and 24008 are open on all Gluster 
> servers. Apart from these ports, you need to open one port for each brick 
> starting from port 49152 (instead of 24009 onwards as with previous 
> releases). The brick ports assignment scheme is now compliant with IANA 
> guidelines. For example: if you have five bricks, you need to have ports 
> 49152 to 49156 open.
> This part of the page is actually in the "Setting up Clients" section but it 
> clearly mentions server.
> 
> To add some more confusion, there is an example using iptables:
> `$ sudo iptables -A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp 
> --dport 24007:24008 -j ACCEPT `
> `$ sudo iptables -A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp 
> --dport 49152:49156 -j ACCEPT`
> This conflicts with the directions to open UDP as well, since it only opens TCP.
> 
> 
> So basically I have two questions:
> What protocols/ports are needed for two GlusterFS servers to work together?
> What protocols/ports are needed for a GlusterFS client (using only the native
> client) to work with a GlusterFS server?
> 
> PS: All our machines are running Centos 7.1.

Gluster 3.9.x+ requires ports 24007/tcp and 49152+/tcp (one per brick). This is
for bare Gluster, without NFS or Samba, so clients mount the volumes via FUSE.

Regarding the conflicting info in the wikis... how about simply trying it
yourself and seeing what configuration works?
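
To make that concrete, a minimal sketch for CentOS 7 with firewalld, assuming
at most nine bricks per node (adjust the second port range to your brick
count; recent gluster server packages may also ship a predefined 'glusterfs'
firewalld service you could add instead):

# glusterd management traffic
firewall-cmd --permanent --add-port=24007-24008/tcp
# one port per brick, allocated from 49152 upwards
firewall-cmd --permanent --add-port=49152-49160/tcp
firewall-cmd --reload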

Cheers,
-- 
Leonid Isaev
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Redis db permission issue while running GitLab in Kubernetes with Gluster

2017-09-07 Thread John Strunk
I don't think this is a gluster problem...

Each container is going to have its own notion of user ids, hence the
mystery uid 1000 in the redis container. I suspect that if you exec into
the gitlab container, it may be the one running as 1000 (guessing based on
the file names). If you want to share volumes across containers, you're
going to have to do something explicitly to make sure each of them (with
their own uid/gid) can read/write the volume, for example by sharing the
same gid across all containers.

I'm going to suggest not sharing the same volume across all 3 containers
unless they need shared access to the data.
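
As an illustration only, a minimal sketch of what that could look like in the
redis deployment's pod spec (the fsGroup value, image tag and 'redis' subPath
are arbitrary examples, not taken from your manifests):

spec:
  securityContext:
    # gid added as a supplemental group to every container in the pod and,
    # for supported volume types, applied to the mounted volume
    fsGroup: 2000
  containers:
  - name: redis
    image: redis:3.2
    volumeMounts:
    - name: gluster-vol1
      mountPath: /var/lib/redis
      # give each application its own directory on the shared PVC
      subPath: redis
  volumes:
  - name: gluster-vol1
    persistentVolumeClaim:
      claimName: gluster-dyn-pvc

The subPath part is only a middle ground if you do keep a single PVC: each
container then writes under its own subdirectory instead of the volume root;
separate PVCs per application, as suggested above, isolate them further.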

-John


On Thu, Sep 7, 2017 at 12:13 PM, Gaurav Chhabra 
wrote:

> Hello,
>
>
> I am trying to setup GitLab, Redis and PostgreSQL containers in Kubernetes
> using Gluster for persistence. GlusterFS nodes are setup on machines
> (CentOS) external to Kubernetes cluster (running on RancherOS host). Issue
> is that when GitLab tries starting up, the login page doesn't load. It's a
> fresh setup and not something that stopped working now.
>
> root@gitlab-2797053212-ph4j8:/var/log/gitlab/gitlab# tail -50 sidekiq.log
> ...
> 2017-09-07T11:53:03.099Z 547 TID-1fdf1k ERROR: Error fetching job: ERR Error 
> running script (call to f_7b91ed9f4cba40689cea7172d1fd3e08b2efd8c9): 
> @user_script:7: @user_script: 7: -MISCONF Redis is configured to save RDB 
> snapshots, but is currently not able to persist on disk. Commands that may 
> modify the data set are disabled. Please check Redis logs for details about 
> the error.
> 2017-09-07T11:53:03.100Z 547 TID-1fdf1k ERROR: 
> /home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/redis-3.3.3/lib/redis/client.rb:121:in
>  `call'
> 2017-09-07T11:53:03.100Z 547 TID-1fdf1k ERROR: 
> /home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/peek-redis-1.2.0/lib/peek/views/redis.rb:9:in
>  `call'
> 2017-09-07T11:53:03.100Z 547 TID-1fdf1k ERROR: 
> /home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/redis-3.3.3/lib/redis.rb:2399:in
>  `block in _eval'
> 2017-09-07T11:53:03.100Z 547 TID-1fdf1k ERROR: 
> /home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/redis-3.3.3/lib/redis.rb:58:in 
> `block in synchronize'
> 2017-09-07T11:53:03.100Z 547 TID-1fdf1k ERROR: 
> /usr/lib/ruby/2.3.0/monitor.rb:214:in `mon_synchronize'
> 2017-09-07T11:53:03.100Z 547 TID-1fdf1k ERROR: 
> /home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/redis-3.3.3/lib/redis.rb:58:in 
> `synchronize'
> ...
>
> So i checked for Redis container logs.
>
> [root@node-a ~]# docker logs -f 67d44f585705
> ...
> ...
> [1] 07 Sep 14:43:48.140 # Background saving error
> [1] 07 Sep 14:43:54.048 * 1 changes in 900 seconds. Saving...
> [1] 07 Sep 14:43:54.048 * Background saving started by pid 2437
> [2437] 07 Sep 14:43:54.053 # Failed opening .rdb for saving: Permission denied
> ...
>
> Checked online for this issue and then noticed the following permissions
> and owner details *inside* the Redis pod:
>
> [root@node-a ~]# docker exec -it 67d44f585705 bash
> groups: cannot find name for group ID 2000
> root@redis-2138096053-0mlx4:/# ls -ld /var/lib/redis/
> drwxr-sr-x 12 1000 1000 8192 Sep  7 11:51 /var/lib/redis/
> root@redis-2138096053-0mlx4:/#
> root@redis-2138096053-0mlx4:/# ls -l /var/lib/redis/
> total 22
> drwxr-sr-x  2  1000  1000     6 Sep  6 10:37 backups
> drwxr-sr-x  2  1000  1000     6 Sep  6 10:37 builds
> drwxr-sr-x  2 redis redis     6 Sep  6 10:14 data
> -rw-r--r--  1 redis redis 13050 Sep  7 11:51 dump.rdb
> -rwxr-xr-x  1 redis redis    21 Sep  5 11:00 index.html
> drwxrws---  2  1000  1000     6 Sep  6 10:37 repositories
> drwxr-sr-x  5  1000  1000    55 Sep  6 10:37 shared
> drwxr-sr-x  2 root  root   8192 Sep  6 10:37 ssh
> drwxr-sr-x  3 redis redis    70 Sep  7 10:20 tmp
> drwx--S---  2  1000  1000     6 Sep  6 10:37 uploads
> root@redis-2138096053-0mlx4:/#
> root@redis-2138096053-0mlx4:/# grep 1000 /etc/passwd
> root@redis-2138096053-0mlx4:/#
>
> Ran following and all looked fine.
>
> root@redis-2138096053-0mlx4:/# chown redis:redis -R /var/lib/redis/
>
> However, when i deleted and ran the GitLab deployment YAML again, the
> permissions inside the Redis container *again* got skewed up. I am not
> sure whether Gluster is messing up with the Redis file/folders permissions
> but i can't think of any other reason
> ​except for mount​.
>
> One thing i would like to highlight is that all the three containers are
> using the *same* PVC
>
> - name: gluster-vol1
>   persistentVolumeClaim:
> claimName: gluster-dyn-pvc
>
> Above is common for all three. What differs is shown below:
>
> a) postgresql-deployment.yaml
>
> volumeMounts:
> - name: gluster-vol1
>   mountPath: /var/lib/postgresql
>
> b) redisio-deployment.yaml
>
> volumeMounts:
> - name: gluster-vol1
>   mountPath: /var/lib/redis
>
> c) gitlab-deployment.yaml
>
> volumeMounts:
> - name: gluster-vol1
>   mountPath: /home/git/data
>
> Any suggestion? Also, I guess this is not
> the right way to use the same PVC/Storage Class for all three containers

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-07 Thread Alastair Neil
True, but it is a lot harder to work your way into that problem with replica 3
than with just replica 2 + arbiter.

On 7 September 2017 at 14:06, Pavel Szalbot  wrote:

> Hi Neil, docs mention two live nodes of replica 3 blaming each other and
> refusing to do IO.
>
> https://gluster.readthedocs.io/en/latest/Administrator%
> 20Guide/Split%20brain%20and%20ways%20to%20deal%20with%
> 20it/#1-replica-3-volume
>
>
>
> On Sep 7, 2017 17:52, "Alastair Neil"  wrote:
>
>> *shrug* I don't use arbiter for vm work loads just straight replica 3.
>> There are some gotchas with using an arbiter for VM workloads.  If
>> quorum-type is auto and a brick that is not the arbiter drop out then if
>> the up brick is dirty as far as the arbiter is concerned i.e. the only good
>> copy is on the down brick you will get ENOTCONN and your VMs will halt on
>> IO.
>>
>> On 6 September 2017 at 16:06,  wrote:
>>
>>> Mh, I never had to do that and I never had that problem. Is that an
>>> arbiter specific thing ? With replica 3 it just works.
>>>
>>> On Wed, Sep 06, 2017 at 03:59:14PM -0400, Alastair Neil wrote:
>>> > you need to set
>>> >
>>> > cluster.server-quorum-ratio 51%
>>> >
>>> > On 6 September 2017 at 10:12, Pavel Szalbot 
>>> wrote:
>>> >
>>> > > Hi all,
>>> > >
>>> > > I have promised to do some testing and I finally find some time and
>>> > > infrastructure.
>>> > >
>>> > > So I have 3 servers with Gluster 3.10.5 on CentOS 7. I created
>>> > > replicated volume with arbiter (2+1) and VM on KVM (via Openstack)
>>> > > with disk accessible through gfapi. Volume group is set to virt
>>> > > (gluster volume set gv_openstack_1 virt). VM runs current (all
>>> > > packages updated) Ubuntu Xenial.
>>> > >
>>> > > I set up following fio job:
>>> > >
>>> > > [job1]
>>> > > ioengine=libaio
>>> > > size=1g
>>> > > loops=16
>>> > > bs=512k
>>> > > direct=1
>>> > > filename=/tmp/fio.data2
>>> > >
>>> > > When I run fio fio.job and reboot one of the data nodes, IO
>>> statistics
>>> > > reported by fio drop to 0KB/0KB and 0 IOPS. After a while, root
>>> > > filesystem gets remounted as read-only.
>>> > >
>>> > > If you care about infrastructure, setup details etc., do not
>>> hesitate to
>>> > > ask.
>>> > >
>>> > > Gluster info on volume:
>>> > >
>>> > > Volume Name: gv_openstack_1
>>> > > Type: Replicate
>>> > > Volume ID: 2425ae63-3765-4b5e-915b-e132e0d3fff1
>>> > > Status: Started
>>> > > Snapshot Count: 0
>>> > > Number of Bricks: 1 x (2 + 1) = 3
>>> > > Transport-type: tcp
>>> > > Bricks:
>>> > > Brick1: gfs-2.san:/export/gfs/gv_1
>>> > > Brick2: gfs-3.san:/export/gfs/gv_1
>>> > > Brick3: docker3.san:/export/gfs/gv_1 (arbiter)
>>> > > Options Reconfigured:
>>> > > nfs.disable: on
>>> > > transport.address-family: inet
>>> > > performance.quick-read: off
>>> > > performance.read-ahead: off
>>> > > performance.io-cache: off
>>> > > performance.stat-prefetch: off
>>> > > performance.low-prio-threads: 32
>>> > > network.remote-dio: enable
>>> > > cluster.eager-lock: enable
>>> > > cluster.quorum-type: auto
>>> > > cluster.server-quorum-type: server
>>> > > cluster.data-self-heal-algorithm: full
>>> > > cluster.locking-scheme: granular
>>> > > cluster.shd-max-threads: 8
>>> > > cluster.shd-wait-qlength: 1
>>> > > features.shard: on
>>> > > user.cifs: off
>>> > >
>>> > > Partial KVM XML dump:
>>> > >
>>> > > 
>>> > >   
>>> > >   >> > > name='gv_openstack_1/volume-77ebfd13-6a92-4f38-b036-e9e55d752e1e'>
>>> > > 
>>> > >   
>>> > >   
>>> > >   
>>> > >   77ebfd13-6a92-4f38-b036-e9e55d752e1e
>>> > >   
>>> > >   >> > > function='0x0'/>
>>> > > 
>>> > >
>>> > > Networking is LACP on data nodes, stack of Juniper EX4550's (10Gbps
>>> > > SFP+), separate VLAN for Gluster traffic, SSD only on Gluster all
>>> > > nodes (including arbiter).
>>> > >
>>> > > I would really love to know what am I doing wrong, because this is my
>>> > > experience with Gluster for a long time and a reason I would not
>>> > > recommend it as VM storage backend in production environment where
>>> you
>>> > > cannot start/stop VMs on your own (e.g. providing private clouds for
>>> > > customers).
>>> > > -ps
>>> > >
>>> > >
>>> > > On Sun, Sep 3, 2017 at 10:21 PM, Gionatan Danti 
>>> > > wrote:
>>> > > > Il 30-08-2017 17:07 Ivan Rossi ha scritto:
>>> > > >>
>>> > > >> There has ben a bug associated to sharding that led to VM
>>> corruption
>>> > > >> that has been around for a long time (difficult to reproduce I
>>> > > >> understood). I have not seen reports on that for some time after
>>> the
>>> > > >> last fix, so hopefully now VM hosting is stable.
>>> > > >
>>> > > >
>>> > > > Mmmm... this is precisely the kind of bug that scares me... data
>>> > > corruption
>>> > > > :|
>>> > > > Any more information on what causes it and how to resolve? Even if
>>> in
>>> > > newer
>>> > > > 

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-07 Thread Pavel Szalbot
Hi Neil, docs mention two live nodes of replica 3 blaming each other and
refusing to do IO.

https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/#1-replica-3-volume
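
For what it's worth, the blame/heal state can be inspected per volume with
something like this (volume name taken from the test setup quoted below):

gluster volume heal gv_openstack_1 info
gluster volume heal gv_openstack_1 info split-brain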



On Sep 7, 2017 17:52, "Alastair Neil"  wrote:

> *shrug* I don't use arbiter for vm work loads just straight replica 3.
> There are some gotchas with using an arbiter for VM workloads.  If
> quorum-type is auto and a brick that is not the arbiter drop out then if
> the up brick is dirty as far as the arbiter is concerned i.e. the only good
> copy is on the down brick you will get ENOTCONN and your VMs will halt on
> IO.
>
> On 6 September 2017 at 16:06,  wrote:
>
>> Mh, I never had to do that and I never had that problem. Is that an
>> arbiter specific thing ? With replica 3 it just works.
>>
>> On Wed, Sep 06, 2017 at 03:59:14PM -0400, Alastair Neil wrote:
>> > you need to set
>> >
>> > cluster.server-quorum-ratio 51%
>> >
>> > On 6 September 2017 at 10:12, Pavel Szalbot 
>> wrote:
>> >
>> > > Hi all,
>> > >
>> > > I have promised to do some testing and I finally find some time and
>> > > infrastructure.
>> > >
>> > > So I have 3 servers with Gluster 3.10.5 on CentOS 7. I created
>> > > replicated volume with arbiter (2+1) and VM on KVM (via Openstack)
>> > > with disk accessible through gfapi. Volume group is set to virt
>> > > (gluster volume set gv_openstack_1 virt). VM runs current (all
>> > > packages updated) Ubuntu Xenial.
>> > >
>> > > I set up following fio job:
>> > >
>> > > [job1]
>> > > ioengine=libaio
>> > > size=1g
>> > > loops=16
>> > > bs=512k
>> > > direct=1
>> > > filename=/tmp/fio.data2
>> > >
>> > > When I run fio fio.job and reboot one of the data nodes, IO statistics
>> > > reported by fio drop to 0KB/0KB and 0 IOPS. After a while, root
>> > > filesystem gets remounted as read-only.
>> > >
>> > > If you care about infrastructure, setup details etc., do not hesitate
>> to
>> > > ask.
>> > >
>> > > Gluster info on volume:
>> > >
>> > > Volume Name: gv_openstack_1
>> > > Type: Replicate
>> > > Volume ID: 2425ae63-3765-4b5e-915b-e132e0d3fff1
>> > > Status: Started
>> > > Snapshot Count: 0
>> > > Number of Bricks: 1 x (2 + 1) = 3
>> > > Transport-type: tcp
>> > > Bricks:
>> > > Brick1: gfs-2.san:/export/gfs/gv_1
>> > > Brick2: gfs-3.san:/export/gfs/gv_1
>> > > Brick3: docker3.san:/export/gfs/gv_1 (arbiter)
>> > > Options Reconfigured:
>> > > nfs.disable: on
>> > > transport.address-family: inet
>> > > performance.quick-read: off
>> > > performance.read-ahead: off
>> > > performance.io-cache: off
>> > > performance.stat-prefetch: off
>> > > performance.low-prio-threads: 32
>> > > network.remote-dio: enable
>> > > cluster.eager-lock: enable
>> > > cluster.quorum-type: auto
>> > > cluster.server-quorum-type: server
>> > > cluster.data-self-heal-algorithm: full
>> > > cluster.locking-scheme: granular
>> > > cluster.shd-max-threads: 8
>> > > cluster.shd-wait-qlength: 1
>> > > features.shard: on
>> > > user.cifs: off
>> > >
>> > > Partial KVM XML dump:
>> > >
>> > > 
>> > >   
>> > >   > > > name='gv_openstack_1/volume-77ebfd13-6a92-4f38-b036-e9e55d752e1e'>
>> > > 
>> > >   
>> > >   
>> > >   
>> > >   77ebfd13-6a92-4f38-b036-e9e55d752e1e
>> > >   
>> > >   > > > function='0x0'/>
>> > > 
>> > >
>> > > Networking is LACP on data nodes, stack of Juniper EX4550's (10Gbps
>> > > SFP+), separate VLAN for Gluster traffic, SSD only on Gluster all
>> > > nodes (including arbiter).
>> > >
>> > > I would really love to know what am I doing wrong, because this is my
>> > > experience with Gluster for a long time and a reason I would not
>> > > recommend it as VM storage backend in production environment where you
>> > > cannot start/stop VMs on your own (e.g. providing private clouds for
>> > > customers).
>> > > -ps
>> > >
>> > >
>> > > On Sun, Sep 3, 2017 at 10:21 PM, Gionatan Danti 
>> > > wrote:
>> > > > Il 30-08-2017 17:07 Ivan Rossi ha scritto:
>> > > >>
>> > > >> There has ben a bug associated to sharding that led to VM
>> corruption
>> > > >> that has been around for a long time (difficult to reproduce I
>> > > >> understood). I have not seen reports on that for some time after
>> the
>> > > >> last fix, so hopefully now VM hosting is stable.
>> > > >
>> > > >
>> > > > Mmmm... this is precisely the kind of bug that scares me... data
>> > > corruption
>> > > > :|
>> > > > Any more information on what causes it and how to resolve? Even if
>> in
>> > > newer
>> > > > Gluster releases it is a solved bug, knowledge on how to treat it
>> would
>> > > be
>> > > > valuable.
>> > > >
>> > > >
>> > > > Thanks.
>> > > >
>> > > > --
>> > > > Danti Gionatan
>> > > > Supporto Tecnico
>> > > > Assyoma S.r.l. - www.assyoma.it
>> > > > email: g.da...@assyoma.it - i...@assyoma.it
>> > > > GPG public key ID: FF5F32A8
>> > > > 

[Gluster-users] Redis db permission issue while running GitLab in Kubernetes with Gluster

2017-09-07 Thread Gaurav Chhabra
Hello,


I am trying to set up GitLab, Redis and PostgreSQL containers in Kubernetes
using Gluster for persistence. The GlusterFS nodes are set up on machines
(CentOS) external to the Kubernetes cluster (running on a RancherOS host). The
issue is that when GitLab tries to start up, the login page doesn't load. It's
a fresh setup, not something that used to work and then stopped.

root@gitlab-2797053212-ph4j8:/var/log/gitlab/gitlab# tail -50 sidekiq.log
...
2017-09-07T11:53:03.099Z 547 TID-1fdf1k ERROR: Error fetching job: ERR
Error running script (call to
f_7b91ed9f4cba40689cea7172d1fd3e08b2efd8c9): @user_script:7:
@user_script: 7: -MISCONF Redis is configured to save RDB snapshots,
but is currently not able to persist on disk. Commands that may modify
the data set are disabled. Please check Redis logs for details about
the error.
2017-09-07T11:53:03.100Z 547 TID-1fdf1k ERROR:
/home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/redis-3.3.3/lib/redis/client.rb:121:in
`call'
2017-09-07T11:53:03.100Z 547 TID-1fdf1k ERROR:
/home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/peek-redis-1.2.0/lib/peek/views/redis.rb:9:in
`call'
2017-09-07T11:53:03.100Z 547 TID-1fdf1k ERROR:
/home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/redis-3.3.3/lib/redis.rb:2399:in
`block in _eval'
2017-09-07T11:53:03.100Z 547 TID-1fdf1k ERROR:
/home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/redis-3.3.3/lib/redis.rb:58:in
`block in synchronize'
2017-09-07T11:53:03.100Z 547 TID-1fdf1k ERROR:
/usr/lib/ruby/2.3.0/monitor.rb:214:in `mon_synchronize'
2017-09-07T11:53:03.100Z 547 TID-1fdf1k ERROR:
/home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/redis-3.3.3/lib/redis.rb:58:in
`synchronize'
...

So I checked the Redis container logs.

[root@node-a ~]# docker logs -f 67d44f585705
...
...
[1] 07 Sep 14:43:48.140 # Background saving error
[1] 07 Sep 14:43:54.048 * 1 changes in 900 seconds. Saving...
[1] 07 Sep 14:43:54.048 * Background saving started by pid 2437
[2437] 07 Sep 14:43:54.053 # Failed opening .rdb for saving: Permission denied
...

I checked online for this issue and then noticed the following permissions
and owner details *inside* the Redis pod:

[root@node-a ~]# docker exec -it 67d44f585705 bash
groups: cannot find name for group ID 2000
root@redis-2138096053-0mlx4:/# ls -ld /var/lib/redis/
drwxr-sr-x 12 1000 1000 8192 Sep  7 11:51 /var/lib/redis/
root@redis-2138096053-0mlx4:/#
root@redis-2138096053-0mlx4:/# ls -l /var/lib/redis/
total 22
drwxr-sr-x  2  1000  1000     6 Sep  6 10:37 backups
drwxr-sr-x  2  1000  1000     6 Sep  6 10:37 builds
drwxr-sr-x  2 redis redis     6 Sep  6 10:14 data
-rw-r--r--  1 redis redis 13050 Sep  7 11:51 dump.rdb
-rwxr-xr-x  1 redis redis    21 Sep  5 11:00 index.html
drwxrws---  2  1000  1000     6 Sep  6 10:37 repositories
drwxr-sr-x  5  1000  1000    55 Sep  6 10:37 shared
drwxr-sr-x  2 root  root   8192 Sep  6 10:37 ssh
drwxr-sr-x  3 redis redis    70 Sep  7 10:20 tmp
drwx--S---  2  1000  1000     6 Sep  6 10:37 uploads
root@redis-2138096053-0mlx4:/#
root@redis-2138096053-0mlx4:/# grep 1000 /etc/passwd
root@redis-2138096053-0mlx4:/#

I ran the following and everything looked fine.

root@redis-2138096053-0mlx4:/# chown redis:redis -R /var/lib/redis/

However, when I deleted and re-applied the GitLab deployment YAML, the
permissions inside the Redis container *again* got messed up. I am not sure
whether Gluster is messing with the Redis file/folder permissions, but I can't
think of any other reason except for the mount.

One thing I would like to highlight is that all three containers are
using the *same* PVC:

- name: gluster-vol1
  persistentVolumeClaim:
claimName: gluster-dyn-pvc

Above is common for all three. What differs is shown below:

a) postgresql-deployment.yaml

volumeMounts:
- name: gluster-vol1
  mountPath: /var/lib/postgresql

b) redisio-deployment.yaml

volumeMounts:
- name: gluster-vol1
  mountPath: /var/lib/redis

c) gitlab-deployment.yaml

volumeMounts:
- name: gluster-vol1
  mountPath: /home/git/data

Any suggestions? Also, I guess this is not the right way to use the same
PVC/Storage Class for all three containers, because I just noticed that all
the contents reside in the same directory on the Gluster nodes.

I know there are many things involved besides Gluster, so this may not be
_the_ right forum, but my gut feeling is that Gluster might be the reason for
the permission issue.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-07 Thread Alastair Neil
*shrug* I don't use an arbiter for VM workloads, just straight replica 3.
There are some gotchas with using an arbiter for VM workloads. If
quorum-type is auto and a brick that is not the arbiter drops out, and the
remaining up brick is dirty as far as the arbiter is concerned (i.e. the only
good copy is on the down brick), you will get ENOTCONN and your VMs will halt
on IO.
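
For anyone who wants to check or change this on their own volumes, a rough
sketch of the relevant commands (volume name taken from the thread below; the
values shown are the ones under discussion, not a recommendation):

# inspect the current quorum settings on a volume
gluster volume get gv_openstack_1 cluster.quorum-type
gluster volume get gv_openstack_1 cluster.server-quorum-type

# the server-quorum ratio is a cluster-wide option
gluster volume set all cluster.server-quorum-ratio 51%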

On 6 September 2017 at 16:06,  wrote:

> Mh, I never had to do that and I never had that problem. Is that an
> arbiter specific thing ? With replica 3 it just works.
>
> On Wed, Sep 06, 2017 at 03:59:14PM -0400, Alastair Neil wrote:
> > you need to set
> >
> > cluster.server-quorum-ratio 51%
> >
> > On 6 September 2017 at 10:12, Pavel Szalbot 
> wrote:
> >
> > > Hi all,
> > >
> > > I have promised to do some testing and I finally find some time and
> > > infrastructure.
> > >
> > > So I have 3 servers with Gluster 3.10.5 on CentOS 7. I created
> > > replicated volume with arbiter (2+1) and VM on KVM (via Openstack)
> > > with disk accessible through gfapi. Volume group is set to virt
> > > (gluster volume set gv_openstack_1 virt). VM runs current (all
> > > packages updated) Ubuntu Xenial.
> > >
> > > I set up following fio job:
> > >
> > > [job1]
> > > ioengine=libaio
> > > size=1g
> > > loops=16
> > > bs=512k
> > > direct=1
> > > filename=/tmp/fio.data2
> > >
> > > When I run fio fio.job and reboot one of the data nodes, IO statistics
> > > reported by fio drop to 0KB/0KB and 0 IOPS. After a while, root
> > > filesystem gets remounted as read-only.
> > >
> > > If you care about infrastructure, setup details etc., do not hesitate
> to
> > > ask.
> > >
> > > Gluster info on volume:
> > >
> > > Volume Name: gv_openstack_1
> > > Type: Replicate
> > > Volume ID: 2425ae63-3765-4b5e-915b-e132e0d3fff1
> > > Status: Started
> > > Snapshot Count: 0
> > > Number of Bricks: 1 x (2 + 1) = 3
> > > Transport-type: tcp
> > > Bricks:
> > > Brick1: gfs-2.san:/export/gfs/gv_1
> > > Brick2: gfs-3.san:/export/gfs/gv_1
> > > Brick3: docker3.san:/export/gfs/gv_1 (arbiter)
> > > Options Reconfigured:
> > > nfs.disable: on
> > > transport.address-family: inet
> > > performance.quick-read: off
> > > performance.read-ahead: off
> > > performance.io-cache: off
> > > performance.stat-prefetch: off
> > > performance.low-prio-threads: 32
> > > network.remote-dio: enable
> > > cluster.eager-lock: enable
> > > cluster.quorum-type: auto
> > > cluster.server-quorum-type: server
> > > cluster.data-self-heal-algorithm: full
> > > cluster.locking-scheme: granular
> > > cluster.shd-max-threads: 8
> > > cluster.shd-wait-qlength: 1
> > > features.shard: on
> > > user.cifs: off
> > >
> > > Partial KVM XML dump:
> > >
> > > 
> > >   
> > >> > name='gv_openstack_1/volume-77ebfd13-6a92-4f38-b036-e9e55d752e1e'>
> > > 
> > >   
> > >   
> > >   
> > >   77ebfd13-6a92-4f38-b036-e9e55d752e1e
> > >   
> > >> > function='0x0'/>
> > > 
> > >
> > > Networking is LACP on data nodes, stack of Juniper EX4550's (10Gbps
> > > SFP+), separate VLAN for Gluster traffic, SSD only on Gluster all
> > > nodes (including arbiter).
> > >
> > > I would really love to know what am I doing wrong, because this is my
> > > experience with Gluster for a long time and a reason I would not
> > > recommend it as VM storage backend in production environment where you
> > > cannot start/stop VMs on your own (e.g. providing private clouds for
> > > customers).
> > > -ps
> > >
> > >
> > > On Sun, Sep 3, 2017 at 10:21 PM, Gionatan Danti 
> > > wrote:
> > > > Il 30-08-2017 17:07 Ivan Rossi ha scritto:
> > > >>
> > > >> There has ben a bug associated to sharding that led to VM corruption
> > > >> that has been around for a long time (difficult to reproduce I
> > > >> understood). I have not seen reports on that for some time after the
> > > >> last fix, so hopefully now VM hosting is stable.
> > > >
> > > >
> > > > Mmmm... this is precisely the kind of bug that scares me... data
> > > corruption
> > > > :|
> > > > Any more information on what causes it and how to resolve? Even if in
> > > newer
> > > > Gluster releases it is a solved bug, knowledge on how to treat it
> would
> > > be
> > > > valuable.
> > > >
> > > >
> > > > Thanks.
> > > >
> > > > --
> > > > Danti Gionatan
> > > > Supporto Tecnico
> > > > Assyoma S.r.l. - www.assyoma.it
> > > > email: g.da...@assyoma.it - i...@assyoma.it
> > > > GPG public key ID: FF5F32A8
> > > > ___
> > > > Gluster-users mailing list
> > > > Gluster-users@gluster.org
> > > > http://lists.gluster.org/mailman/listinfo/gluster-users
> > > ___
> > > Gluster-users mailing list
> > > Gluster-users@gluster.org
> > > http://lists.gluster.org/mailman/listinfo/gluster-users
> > >
>
> > ___
> > Gluster-users mailing list
> > 

[Gluster-users] Can I use 3.7.11 server with 3.10.5 client?

2017-09-07 Thread Serkan Çoban
Hi,

Is it safe to use a 3.10.5 client with a 3.7.11 server for a read-only data
move operation?
The client will have the 3.10.5 glusterfs-client packages. It will mount one
volume from the 3.7.11 cluster and one from the 3.10.5 cluster. I will read
from 3.7.11 and write to 3.10.5.
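
For reference, the intended setup is roughly the following (hostnames and
mount points are placeholders):

# old 3.7.11 cluster, mounted read-only as the source
mount -t glusterfs -o ro old-server:/oldvol /mnt/src
# new 3.10.5 cluster as the destination
mount -t glusterfs new-server:/newvol /mnt/dst
# then copy, e.g.
rsync -a /mnt/src/ /mnt/dst/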
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Firewalls and ports and protocols

2017-09-07 Thread max.degraaf
Reading the documentation there is conflicting information:

According to https://wiki.centos.org/HowTos/GlusterFSonCentOS we only need
TCP ports open between two GlusterFS servers:
Ports TCP:24007-24008 are required for communication between GlusterFS nodes 
and each brick requires another TCP port starting at 24009.

According to 
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Setting%20Up%20Clients/
 we also need to open UDP:
Ensure that TCP and UDP ports 24007 and 24008 are open on all Gluster servers. 
Apart from these ports, you need to open one port for each brick starting from 
port 49152 (instead of 24009 onwards as with previous releases). The brick 
ports assignment scheme is now compliant with IANA guidelines. For example: if 
you have five bricks, you need to have ports 49152 to 49156 open.
This part of the page is actually in the "Setting up Clients" section but it 
clearly mentions server.

To add some more confusion, there is an example using iptables:
`$ sudo iptables -A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp 
--dport 24007:24008 -j ACCEPT `
`$ sudo iptables -A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp 
--dport 49152:49156 -j ACCEPT`
This conflicts with the directions to open UDP as well, since it only opens TCP.


So basically I have two questions:
What protocols/ports are needed for two GlusterFS servers to work together?
What protocols/ports are needed for a GlusterFS client (using only the native
client) to work with a GlusterFS server?

PS: All our machines are running Centos 7.1.

Thanks,

Max

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] investigate & troubleshoot speed bottleneck(s) - how?

2017-09-07 Thread lejeczek

hi guys/gals

I realize that this question must have been asked before; I
googled and found some posts on the web on how to
tweak/tune Gluster, however...


What I hope is that some experts and/or developers could write a
bit more, maybe compose a doc on how to investigate and
troubleshoot Gluster's speed/performance bottlenecks.


Why do I think such a thorough guide would be important? Well,
I guess many of us wonder when we look at how a "raw" filesystem
performs versus GlusterFS - how come!?
I know such a comparison is an oversimplification, maybe even
unfair, but when I see such a gigantic performance difference I
think - I hope - it must be possible, with some detective work,
to unravel the bottlenecks that hamper GlusterFS so badly, almost
to the point where one wonders... what is the point.


Today I did such an oversimplified test: I used dbench on raw XFS
on an LVM raid0 of four SSD PVs (no hardware RAID).


$ dbench -t 60 10
...
8 of 10 processes prepared for launch   0 sec
10 of 10 processes prepared for launch   0 sec
releasing clients
  10     21573  1339.94 MB/sec  warmup   1 sec  latency 35.500 ms
  10     50505  1448.58 MB/sec  warmup   2 sec  latency 10.027 ms
  10     78424  1467.54 MB/sec  warmup   3 sec  latency  8.810 ms
  10    105338  1462.94 MB/sec  warmup   4 sec  latency 19.670 ms
  10    134820  1488.04 MB/sec  warmup   5 sec  latency 11.237 ms
  10    164380  1505.12 MB/sec  warmup   6 sec  latency  4.007 ms

...
Throughput 1662.91 MB/sec  10 clients  10 procs  max_latency=38.879 ms


The cluster hosts 9 volumes, each volume on three peers in replica
mode. Most of the time volume utilization is really low; the data is
regular office work, single files read randomly.
Peers are connected via a network switch stack; each peer connects to
the stack via a two-port LACP 1GbE link with jumbo MTU.


So I run the same dbench on one of the vols:
...
8 of 10 processes prepared for launch   0 sec
10 of 10 processes prepared for launch   0 sec
releasing clients
  10    98    45.41 MB/sec  warmup   1 sec  latency 113.146 ms
  10   212    41.52 MB/sec  warmup   2 sec  latency  93.800 ms
  10   343    41.23 MB/sec  warmup   3 sec  latency  53.545 ms
  10   468    41.06 MB/sec  warmup   4 sec  latency  54.450 ms
  10   612    41.89 MB/sec  warmup   5 sec  latency 152.659 ms
  10   866    35.99 MB/sec  warmup   6 sec  latency  31.377 ms
  10  1074    32.74 MB/sec  warmup   7 sec  latency  39.923 ms
  10  1307    29.77 MB/sec  warmup   8 sec  latency  42.388 ms

...
Throughput 15.3757 MB/sec  10 clients  10 procs  max_latency=54.371 ms


So yes.. gee.. how can I make my gluster-cluster faster???
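
As a starting point, the built-in profiler should at least show which file
operations dominate; a rough sketch (volume name is a placeholder):

gluster volume profile myvol start
# ... run the dbench workload above ...
gluster volume profile myvol info
gluster volume profile myvol stop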

many thanks, L.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] peer rejected but connected

2017-09-07 Thread Gaurav Yadav
Thank you for the acknowledgement.

On Mon, Sep 4, 2017 at 6:39 PM, lejeczek  wrote:

> yes, I see things got lost in transit, I said before:
>
> I did from first time and now not rejected.
> now I'm restarting fourth(newly added) peer's glusterd
> and.. it seems to work. <- HERE!  (even though
>
> and then I asked:
>
> Is there anything I should double check & make sure all
> is 100% fine before I use that newly added peer for
> bricks?
>
> below is my full message. Basically, new peers do not get rejected any
> more.
>
>
> On 04/09/17 13:56, Gaurav Yadav wrote:
>
>>
>> Executing "gluster volume set all cluster.op-version "on all
>> the existing nodes will solve this problem.
>>
>> If issue still persists please  provide me following logs
>> (working-cluster + newly added peer)
>> 1. glusterd.info  file from /var/lib/glusterd from
>> all nodes
>> 2. glusterd.logs from all nodes
>> 3. info file from all the nodes.
>> 4. cmd-history from all the nodes.
>>
>> Thanks
>> Gaurav
>>
>> On Mon, Sep 4, 2017 at 2:09 PM, lejeczek  pelj...@yahoo.co.uk>> wrote:
>>
>> I do not see, did you write anything?
>>
>> On 03/09/17 11:54, Gaurav Yadav wrote:
>>
>>
>>
>> On Fri, Sep 1, 2017 at 9:02 PM, lejeczek
>> 
>> > >> wrote:
>>
>> you missed my reply before?
>> here:
>>
>> now, a "weir" thing
>>
>> I did that, still fourth peer rejected, still
>> fourth
>> probe would fail to restart(all after upping
>> to 31004)
>> I redone, wiped and re-probed from a different
>> peer
>> than I did from first time and now not rejected.
>> now I'm restarting fourth(newly added) peer's
>> glusterd
>> and.. it seems to work.(even though
>> tier-enabled=0 is
>> still there, now on all four peers, was not
>> there on
>> three before working peers)
>>
>> Is there anything I should double check & make
>> sure all
>> is 100% fine before I use that newly added
>> peer for
>> bricks?
>>
>>   For this only I need logs to see what has
>> gone wrong.
>>
>>
>> Please provide me following things
>> (working-cluster + newly added peer)
>> 1. glusterd.info 
>>   file
>> from /var/lib/glusterd from all nodes
>> 2. glusterd.logs from all nodes
>> 3. info file from all the nodes.
>> 4. cmd-history from all the nodes.
>>
>>
>> On 01/09/17 11:11, Gaurav Yadav wrote:
>>
>> I replicate the problem locally and with
>> the steps
>> I suggested you, it worked for me...
>>
>> Please provide me following things
>> (working-cluster + newly added peer)
>> 1. glusterd.info 
>> 
>>  file from
>> /var/lib/glusterd
>> from all nodes
>> 2. glusterd.logs from all nodes
>> 3. info file from all the nodes.
>> 4. cmd-history from all the nodes.
>>
>>
>> On Fri, Sep 1, 2017 at 3:39 PM, lejeczek
>> > 
>> > >
>> > 
>> > >
>> Like I said, I upgraded from 3.8 to 3.10 a
>> while ago,
>> at the moment 3.10.5, only now with
>> 3.10.5 I
>> tried to
>> add a peer.
>>
>> On 01/09/17 10:51, Gaurav Yadav wrote:
>>
>> What is gluster --version on all
>> these nodes?
>>
>> On Fri, Sep 1, 2017 at 3:18 PM,
>> lejeczek