Re: [Gluster-users] Quorum in distributed-replicate volume

2018-02-26 Thread Karthik Subrahmanya
On Mon, Feb 26, 2018 at 6:14 PM, Dave Sherohman  wrote:

> On Mon, Feb 26, 2018 at 05:45:27PM +0530, Karthik Subrahmanya wrote:
> > > "In a replica 2 volume... If we set the client-quorum option to
> > > auto, then the first brick must always be up, irrespective of the
> > > status of the second brick. If only the second brick is up, the
> > > subvolume becomes read-only."
> > >
> > By default client-quorum is "none" in replica 2 volume.
>
> I'm not sure where I saw the directions saying to set it, but I do have
> "cluster.quorum-type: auto" in my volume configuration.  (And I think
> that's client quorum, but feel free to correct me if I've misunderstood
> the docs.)
>
If it is "auto" then I think it is reconfigured. In replica 2 it will be
"none".

>
> > It applies to all replica 2 volumes, whether they have just 2 bricks or more.
> > Total brick count in the volume doesn't matter for the quorum, what matters
> > is the number of bricks which are up in the particular replica subvol.
>
> Thanks for confirming that.
>
> > If I understood your configuration correctly it should look something
> like
> > this:
> > (Please correct me if I am wrong)
> > replica-1:  bricks 1 & 2
> > replica-2: bricks 3 & 4
> > replica-3: bricks 5 & 6
>
> Yes, that's correct.
>
> > Since quorum is per replica, if it is set to auto then it needs the first
> > brick of the particular replica subvol to be up to perform the fop.
> >
> > In replica 2 volumes you can end up in split-brains.
>
> How would that happen if bricks which are not in (cluster-wide) quorum
> refuse to accept writes?  I'm not seeing the reason for using individual
> subvolume quorums instead of full-volume quorum.
>
Split brains happen within the replica pair.
I will try to explain how you can end up in split-brain even with cluster-wide
quorum:
Let's say you have a 6-brick (replica 2) volume and you always have at least a
quorum number of bricks up & running.
Bricks 1 & 2 are part of replica subvol-1
Bricks 3 & 4 are part of replica subvol-2
Bricks 5 & 6 are part of replica subvol-3

- Brick 1 goes down and a write comes on a file which is part of that
replica subvol-1
- Quorum is met since 5 out of 6 bricks are running
- Brick 2 says brick 1 is bad
- Brick 2 goes down and brick 1 comes up. Heal did not happen
- Write comes on the same file, quorum is met, and now brick 1 says brick 2
is bad
- When both the bricks 1 & 2 are up, both of them blame the other brick -
*split-brain*
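
If you want to check whether any files have already ended up in that state,
something along these lines should show them ("<volname>" is a placeholder):

  # list files currently in split-brain, per brick
  gluster volume heal <volname> info split-brain

  # broader view of all pending heals
  gluster volume heal <volname> info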

>
> > It would be great if you can consider configuring an arbiter or
> > replica 3 volume.
>
> I can.  My bricks are 2x850G and 4x11T, so I can repurpose the small
> bricks as arbiters with minimal effect on capacity.  What would be the
> sequence of commands needed to:
>
> 1) Move all data off of bricks 1 & 2
> 2) Remove that replica from the cluster
> 3) Re-add those two bricks as arbiters
>
> (And did I miss any additional steps?)
>
> Unfortunately, I've been running a few months already with the current
> configuration and there are several virtual machines running off the
> existing volume, so I'll need to reconfigure it online if possible.
>
Without knowing the volume configuration it is difficult to suggest the
configuration change, and since it is a live system you may end up with data
unavailability or data loss.
Can you give the output of "gluster volume info <volname>"
and tell us which brick is of what size?
Note: The arbiter bricks do not need to be large, since they hold only file
names and metadata.
[1] gives information about how you can provision the arbiter brick.

[1]
http://docs.gluster.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/#arbiter-bricks-sizing
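
Just to give a rough idea of the shape such a conversion usually takes (purely
illustrative, with hypothetical host and brick names; please do not run
anything on the live volume until we have looked at the volume info):

  # drain and remove the first replica pair
  gluster volume remove-brick <volname> host1:/bricks/b1 host2:/bricks/b2 start
  gluster volume remove-brick <volname> host1:/bricks/b1 host2:/bricks/b2 status
  gluster volume remove-brick <volname> host1:/bricks/b1 host2:/bricks/b2 commit

  # add one arbiter brick per remaining replica pair, converting the volume
  # to replica 3 arbiter 1
  gluster volume add-brick <volname> replica 3 arbiter 1 \
      host1:/bricks/arb1 host2:/bricks/arb2

  # let self-heal populate the arbiter bricks and watch the progress
  gluster volume heal <volname> info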

Regards,
Karthik

>
> --
> Dave Sherohman
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] NFS Ganesha HA w/ GlusterFS

2018-02-26 Thread TomK

On 2/26/2018 7:14 AM, Kaleb S. KEITHLEY wrote:
Hey,

Yep. A blog is where I was writing it up to begin with.

Anyway, got a lot of demand for it over the last day, so here it is:

http://microdevsys.com/wp/glusterfs-configuration-and-setup-w-nfs-ganesha-for-an-ha-nfs-cluster/

Skip to the SUMMARY and TESTING sections so you can just copy and paste 
the configs to get things moving very quickly.  The detailed section is 
my running log of all the troubleshooting and failed attempts.


FreeIPA is being used as the DNS / Kerberos backend that these NFS
servers will be configured against.  Not yet done with this piece, so I'm not
including that mailing list here yet.


Feel free to point anything out as I would like to keep it accurate.

The post includes all work needed for firewalld and selinux on CentOS 7 
without turning off either service.


Again, thanks for all the help here.  Couldn't get this working without
all the work you guys do!


Cheers,
Tom



On 02/25/2018 08:29 PM, TomK wrote:

Hey Guy's,

A success story instead of a question.

With your help, managed to get the HA component working with HAPROXY and
keepalived to build a fairly resilient NFS v4 VM cluster.  ( Used
Gluster, NFS Ganesha v2.60, HAPROXY, keepalived w/ selinux enabled )

If someone needs or it could help your work, please PM me for the
written up post or I could just post here if the lists allow it.



Hi,

I strongly encourage you to write a blog post about it. And if not that
at least write about it and post it to the list(s).

I'm not sure why your post to nfs-ganesha-support was blocked. Maybe
it's waiting for some moderator attention. (But I don't believe that
list is moderated.)

Thanks for sharing.




--
Cheers,
Tom K.
-

Living on earth is expensive, but it includes a free trip around the sun.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Problems with write-behind with large files on Gluster 3.8.4

2018-02-26 Thread Raghavendra Gowdappa
+csaba

On Tue, Feb 27, 2018 at 2:49 AM, Jim Prewett  wrote:

>
> Hello,
>
> I'm having problems when write-behind is enabled on Gluster 3.8.4.
>
> I have 2 Gluster servers each with a single brick that is mirrored between
> them.  The code causing these issues reads two data files each approx. 128G
> in size.  It opens a third file, mmap()'s that file, and subsequently reads
> and writes to it.  The third file, on successful runs (without write-behind
> enabled) is ultimately approx. 224G in size.
>

What exactly is the problem you are facing with write-behind enabled? Is it
that the file size is smaller?


> The servers have the IP addresses 172.17.2.254 and 172.17.2.255 and the
> client has the IP address 172.17.1.61.  These are all IP over InfiniBand.
>
> I'm attaching logfiles for the brick and for the volume from each of the
> servers and for the client.  I'm also attaching the output of "gluster
> volume info" and "gluster volume get  all".
>
> I have only noticed problems with write-behind being enabled with this one
> particular workload.  When I ran it under strace, I see it seeking all over
> the place and reading and writing little bits of data to/from the third
> file.
>

What is the pattern you see when write-behind is disabled? Can you attach
strace of the application for both scenarios - write-behind enabled and
disabled? Can you also explain the workload and its data access pattern?


> For now, I'm leaving write-behind disabled.  What are the performance
> implications of this for jobs that don't have this strange access pattern?
>

Disabling write-behind can bring down performance for sequential workloads.
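
For completeness, write-behind is toggled per volume, along these lines
("<volname>" is a placeholder):

  # disable write-behind on the volume
  gluster volume set <volname> performance.write-behind off

  # re-enable it, or reset it back to the default, once the issue is understood
  gluster volume set <volname> performance.write-behind on
  gluster volume reset <volname> performance.write-behind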


> My co-worker who usually maintains the Gluster filesystems here is busy
> having a baby right now and I've gotten it while he's out, so I'm /really/
> new to Gluster and am not confident that anything is correct in my
> configuration (nor do I have a specific reason to doubt its correctness! :)
>
> I have checked the InfiniBand fabric for errors and do not see any beyond
> the normal PortXmitWait counter.  There is no firewall on any of these
> machines.  Their system clocks seem to all be synchronized.
>
> Is there anything additional I can provide to help diagnose this problem?
>
> Thanks for any help you can provide! :)
>
> Jim
>
> James E. Prewett   j...@prewett.org   downl...@hpc.unm.edu
> Systems Team Leader   LoGS: http://www.hpc.unm.edu/~download/LoGS/
> Designated Security Officer OpenPGP key: pub 1024D/31816D93
> HPC Systems Engineer III   UNM HPC  505.277.8210
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] new Gluster cluster: 3.10 vs 3.12

2018-02-26 Thread Vlad Kopylov
Thanks!

On Mon, Feb 26, 2018 at 4:26 PM, Ingard Mevåg  wrote:
> After discussing with Xavi in #gluster-dev we found out that we could
> eliminate the slow lstats by disabling disperse.eager-lock.
> There is an open issue here :
> https://bugzilla.redhat.com/show_bug.cgi?id=1546732
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] new Gluster cluster: 3.10 vs 3.12

2018-02-26 Thread Anatoliy Dmytriyev
Hi Vlad and Ingard,

Thanks a lot for the replies.

Regards,
Anatoliy



> On 26 Feb 2018, at 22:26, Ingard Mevåg  wrote:
> 
> After discussing with Xavi in #gluster-dev we found out that we could 
> eliminate the slow lstats by disabling disperse.eager-lock.
> There is an open issue here : 
> https://bugzilla.redhat.com/show_bug.cgi?id=1546732 
> 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] new Gluster cluster: 3.10 vs 3.12

2018-02-26 Thread Ingard Mevåg
After discussing with Xavi in #gluster-dev we found out that we could
eliminate the slow lstats by disabling disperse.eager-lock.
There is an open issue here :
https://bugzilla.redhat.com/show_bug.cgi?id=1546732
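
In case it helps anyone landing on this thread later, the workaround amounts
to something like this ("<volname>" is a placeholder):

  # disable eager-lock on the disperse volume to work around the slow lstat
  gluster volume set <volname> disperse.eager-lock off

  # check the value currently in effect
  gluster volume get <volname> disperse.eager-lock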
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] NFS Ganesha HA w/ GlusterFS

2018-02-26 Thread WK

+1

I also would like to see those instruction.

I've been interested in NFS-Ganesha with Gluster, but there weren't any
obvious references that were up to date.


(Let alone the introduction of StorHaug.)


-wk



On 2/25/2018 10:28 PM, Serkan Çoban wrote:

I would like to see the steps for reference, can you provide a link or
just post them on mail list?

On Mon, Feb 26, 2018 at 4:29 AM, TomK  wrote:

Hey Guy's,

A success story instead of a question.

With your help, managed to get the HA component working with HAPROXY and
keepalived to build a fairly resilient NFS v4 VM cluster.  ( Used Gluster,
NFS Ganesha v2.60, HAPROXY, keepalived w/ selinux enabled )

If someone needs or it could help your work, please PM me for the written up
post or I could just post here if the lists allow it.

Cheers,
Tom



On 2/19/2018 12:25 PM, TomK wrote:

On 2/19/2018 12:09 PM, Kaleb S. KEITHLEY wrote:
Sounds good and no problem at all.  Will look out for this update in the
future.  In the meantime, there are a few things I'll try, including your
suggestion.

Was looking for a sense of direction with the projects and now you've
given that.  Ty.  Appreciated!

Cheers,
Tom



On 02/19/2018 11:37 AM, TomK wrote:

On 2/19/2018 10:55 AM, Kaleb S. KEITHLEY wrote:
Yep, I noticed a couple of pages including this for 'storhaug
configuration' off google.  Adding 'mailing list' to the search didn't
help a lot:

https://sourceforge.net/p/nfs-ganesha/mailman/message/35929089/

https://www.spinics.net/lists/gluster-users/msg33018.html

Hence the ask here.  storhaug doesn't feel like it's moving forward with any
sort of updates right now.

Any plans to move back to the previous NFS Ganesha HA model with
upcoming GlusterFS versions as a result?


No.

(re)writing or finishing storhaug has been on my plate ever since the
guy who was supposed to do it didn't.

I have lots of other stuff to do too. All I can say is it'll get done
when it gets done.



In the meantime I'll look to cobble up the GlusterFS 3.10 packages and
try with those per your suggestion.

What's your thoughts on using HAPROXY / keepalived w/ NFS Ganesha and
GlusterFS?  Anyone tried this sort of combination?  I want to avoid the
situation where I have to remount clients as a result of a node failing.
   In other words, avoid this situation:

[root@yes01 ~]# cd /n
-bash: cd: /n: Stale file handle
[root@yes01 ~]#

Cheers,
Tom


On 02/19/2018 10:24 AM, TomK wrote:

On 2/19/2018 2:39 AM, TomK wrote:
+ gluster users as well.  Just read another post on the mailing lists
about a similar ask from Nov which didn't really have a clear answer.


That's funny because I've answered questions like this several times.

Gluster+Ganesha+Pacemaker-based HA is available up to GlusterFS 3.10.x.

If you need HA, that is one "out of the box" option.

There's support for using CTDB in Samba for Ganesha HA, and people have
used it successfully with Gluster+Ganesha.


Perhaps there's a way to get NFSv4 work with GlusterFS without NFS
Ganesha then?


Not that I'm aware of.


Cheers,
Tom


Hey All,

I've setup GlusterFS on two virtuals and enabled NFS Ganesha on each
node.  ATM the configs are identical between the two NFS Ganesha
hosts. (Probably shouldn't be but I'm just testing things out.)

I need HA capability and notice these instructions here:


http://aravindavkgluster.readthedocs.io/en/latest/Administrator%20Guide/Configuring%20HA%20NFS%20Server/



However I don't have package glusterfs-ganesha available on this
CentOS Linux release 7.4.1708 (Core) and the maintainers of CentOS 7
haven't uploaded some of the 2.5.x packages yet so I can't use that
version.

glusterfs-api-3.12.6-1.el7.x86_64
glusterfs-libs-3.12.6-1.el7.x86_64
glusterfs-3.12.6-1.el7.x86_64
glusterfs-fuse-3.12.6-1.el7.x86_64
glusterfs-server-3.12.6-1.el7.x86_64
python2-glusterfs-api-1.1-1.el7.noarch
glusterfs-client-xlators-3.12.6-1.el7.x86_64
glusterfs-cli-3.12.6-1.el7.x86_64

nfs-ganesha-xfs-2.3.2-1.el7.x86_64
nfs-ganesha-vfs-2.3.2-1.el7.x86_64
nfs-ganesha-2.3.2-1.el7.x86_64
nfs-ganesha-gluster-2.3.2-1.el7.x86_64

The only high availability packages are the following but they don't
come with any instructions that I can find:

storhaug.noarch : High-Availability Add-on for NFS-Ganesha and Samba
storhaug-nfs.noarch : storhaug NFS-Ganesha module

Given that I'm missing that one package above, will configuring using
ganesha-ha.conf still work?  Or should I be looking at another option
altogether?

Appreciate any help.  Ty!









--
Cheers,
Tom K.
-

Living on earth is expensive, but it includes a free trip around the sun.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users



[Gluster-users] Release 4.0: RC1 tagged

2018-02-26 Thread Shyam Ranganathan
Hi,

RC1 is tagged in the code, and the request to package it is on its way.

We should have packages as early as today, and we request the community to
test them and send us some feedback.

We have about 3-4 days (till Thursday) for any pending fixes and the
final release to happen, so shout out in case you face any blockers.

The RC1 packages should land here:
https://download.gluster.org/pub/gluster/glusterfs/qa-releases/4.0rc1/
and like so for CentOS,
CentOS7:
  # yum install
http://cbs.centos.org/kojifiles/work/tasks/1548/311548/centos-release-gluster40-0.9-1.el7.centos.x86_64.rpm
  # yum install glusterfs-server

Thanks,
Gluster community
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster performance / Dell Idrac enterprise conflict

2018-02-26 Thread Ryan Wilkinson
Here is info about the RAID controllers.  They don't seem to be the culprit.

Slow host:
  Name: PERC H710 Mini (Embedded)
  Firmware Version: 21.3.4-0001
  Cache Memory Size: 512 MB

Fast host:
  Name: PERC H310 Mini (Embedded)
  Firmware Version: 20.12.1-0002
  Cache Memory Size: 0 MB

Slow host:
  Name: PERC H310 Mini (Embedded)
  Firmware Version: 20.13.1-0002
  Cache Memory Size: 0 MB

Slow host:
  Name: PERC H310 Mini (Embedded)
  Firmware Version: 20.13.3-0001
  Cache Memory Size: 0 MB

Slow host:
  Name: PERC H710 Mini (Embedded)
  Firmware Version: 21.3.5-0002
  Cache Memory Size: 512 MB

Fast host:
  Name: PERC H730
  Cache Memory Size: 1 GB

On Mon, Feb 26, 2018 at 9:42 AM, Alvin Starr  wrote:

> I would be really supprised if the problem was related to Idrac.
>
> The Idrac processor is a stand alone cpu with its own nic and runs
> independent of the main CPU.
>
> That being said it does have visibility into the whole system.
>
> try using dmidecode to compare the systems and take a close look at the
> raid controllers and what size and form of cache they have.
>
> On 02/26/2018 11:34 AM, Ryan Wilkinson wrote:
>
> I've tested about 12 different Dell servers.  Ony a couple of them have
> Idrac express and all the others have Idrac Enterprise.  All the boxes with
> Enterprise perform poorly and the couple that have express perform well.  I
> use the disks in raid mode on all of them.  I've tried a few non-Dell boxes
> and they all perform well even though some of them are very old.  I've also
> tried disabling Idrac, the Idrac nic, virtual storage for Idrac with no
> sucess..
>
> On Mon, Feb 26, 2018 at 9:28 AM, Serkan Çoban 
> wrote:
>
>> I don't think it is related with iDRAC itself but some configuration
>> is wrong or there is some hw error.
>> Did you check battery of raid controller? Do you use disks in jbod
>> mode or raid mode?
>>
>> On Mon, Feb 26, 2018 at 6:12 PM, Ryan Wilkinson 
>> wrote:
>> > Thanks for the suggestion.  I tried both of these with no difference in
>> > performance.I have tried several other Dell hosts with Idrac Enterprise
>> and
>> > getting the same results.  I also tried a new Dell T130 with Idrac
>> express
>> > and was getting over 700 MB/s.  Any other users had this issues with
>> Idrac
>> > Enterprise??
>> >
>> >
>> > On Thu, Feb 22, 2018 at 12:16 AM, Serkan Çoban 
>> > wrote:
>> >>
>> >> "Did you check the BIOS/Power settings? They should be set for high
>> >> performance.
>> >> Also you can try to boot "intel_idle.max_cstate=0" kernel command line
>> >> option to be sure CPUs not entering power saving states.
>> >>
>> >> On Thu, Feb 22, 2018 at 9:59 AM, Ryan Wilkinson 
>> >> wrote:
>> >> >
>> >> >
>> >> > I have a 3 host gluster replicated cluster that is providing storage
>> for
>> >> > our
>> >> > RHEV environment.  We've been having issues with inconsistent
>> >> > performance
>> >> > from the VMs depending on which Hypervisor they are running on.  I've
>> >> > confirmed throughput to be ~9Gb/s to each of the storage hosts from
>> the
>> >> > hypervisors.  I'm getting ~300MB/s disk read spead when our test vm
>> is
>> >> > on
>> >> > the slow Hypervisors and over 500 on the faster ones.  The
>> performance
>> >> > doesn't seem to be affected much by the cpu, memory that are in the
>> >> > hypervisors.  I have tried a couple of really old boxes and got over
>> 500
>> >> > MB/s.  The common thread seems to be that the poorly perfoming hosts
>> all
>> >> > have Dell's Idrac 7 Enterprise.  I have one Hypervisor that has
>> Idrac 7
>> >> > express and it performs well.  We've compared system packages and
>> >> > versions
>> >> > til we're blue in the face and have been struggling with this for a
>> >> > couple
>> >> > months but that seems to be the only common denominator.  I've tried
>> on
>> >> > one
>> >> > of those Idrac 7 hosts to disable the nic, virtual drive, etc, etc.
>> but
>> >> > no
>> >> > change in performance.  In addition, I tried 5 new hosts and all are
>> >> > complying to the Idrac enterprise theory.  Anyone else had this
>> issue?!
>> >> >
>> >> >
>> >> >
>> >> > ___
>> >> > Gluster-users mailing list
>> >> > Gluster-users@gluster.org
>> >> > http://lists.gluster.org/mailman/listinfo/gluster-users
>> >
>> >
>>
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
> --
> Alvin Starr   ||   land:  (905)513-7688
> Netvel Inc.   ||   Cell:  (416)806-0133
> al...@netvel.net  ||
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Gluster performance / Dell Idrac enterprise conflict

2018-02-26 Thread Alvin Starr

I would be really surprised if the problem was related to iDRAC.

The iDRAC processor is a standalone CPU with its own NIC and runs
independently of the main CPU.


That being said, it does have visibility into the whole system.

Try using dmidecode to compare the systems and take a close look at the
RAID controllers and what size and form of cache they have.
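
Something along these lines is usually enough for a quick comparison (run it
on a fast host and a slow host and diff the output):

  # compare platform, firmware, CPU and memory layout between hosts
  dmidecode -t system -t baseboard -t bios > /tmp/dmi-$(hostname).txt
  dmidecode -t processor -t memory >> /tmp/dmi-$(hostname).txt

  # identify the RAID controller model
  lspci | grep -i raid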



On 02/26/2018 11:34 AM, Ryan Wilkinson wrote:
I've tested about 12 different Dell servers.  Ony a couple of them 
have Idrac express and all the others have Idrac Enterprise.  All the 
boxes with Enterprise perform poorly and the couple that have express 
perform well.  I use the disks in raid mode on all of them.  I've 
tried a few non-Dell boxes and they all perform well even though some 
of them are very old. I've also tried disabling Idrac, the Idrac nic, 
virtual storage for Idrac with no sucess..


On Mon, Feb 26, 2018 at 9:28 AM, Serkan Çoban wrote:


I don't think it is related with iDRAC itself but some configuration
is wrong or there is some hw error.
Did you check battery of raid controller? Do you use disks in jbod
mode or raid mode?

On Mon, Feb 26, 2018 at 6:12 PM, Ryan Wilkinson wrote:
> Thanks for the suggestion.  I tried both of these with no
difference in
> performance.I have tried several other Dell hosts with Idrac
Enterprise and
> getting the same results.  I also tried a new Dell T130 with
Idrac express
> and was getting over 700 MB/s.  Any other users had this issues
with Idrac
> Enterprise??
>
>
> On Thu, Feb 22, 2018 at 12:16 AM, Serkan Çoban wrote:
>>
>> "Did you check the BIOS/Power settings? They should be set for high
>> performance.
>> Also you can try to boot "intel_idle.max_cstate=0" kernel
command line
>> option to be sure CPUs not entering power saving states.
>>
>> On Thu, Feb 22, 2018 at 9:59 AM, Ryan Wilkinson wrote:
>> >
>> >
>> > I have a 3 host gluster replicated cluster that is providing
storage for
>> > our
>> > RHEV environment.  We've been having issues with inconsistent
>> > performance
>> > from the VMs depending on which Hypervisor they are running
on.  I've
>> > confirmed throughput to be ~9Gb/s to each of the storage
hosts from the
>> > hypervisors.  I'm getting ~300MB/s disk read spead when our
test vm is
>> > on
>> > the slow Hypervisors and over 500 on the faster ones.  The
performance
>> > doesn't seem to be affected much by the cpu, memory that are
in the
>> > hypervisors.  I have tried a couple of really old boxes and
got over 500
>> > MB/s.  The common thread seems to be that the poorly
perfoming hosts all
>> > have Dell's Idrac 7 Enterprise.  I have one Hypervisor that
has Idrac 7
>> > express and it performs well.  We've compared system packages and
>> > versions
>> > til we're blue in the face and have been struggling with this
for a
>> > couple
>> > months but that seems to be the only common denominator. 
I've tried on
>> > one
>> > of those Idrac 7 hosts to disable the nic, virtual drive,
etc, etc. but
>> > no
>> > change in performance.  In addition, I tried 5 new hosts and
all are
>> > complying to the Idrac enterprise theory.  Anyone else had
this issue?!
>> >
>> >
>> >
>> > ___
>> > Gluster-users mailing list
>> > Gluster-users@gluster.org 
>> > http://lists.gluster.org/mailman/listinfo/gluster-users

>
>




___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


--
Alvin Starr   ||   land:  (905)513-7688
Netvel Inc.   ||   Cell:  (416)806-0133
al...@netvel.net  ||

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster performance / Dell Idrac enterprise conflict

2018-02-26 Thread Ryan Wilkinson
I've tested about 12 different Dell servers.  Only a couple of them have
iDRAC Express and all the others have iDRAC Enterprise.  All the boxes with
Enterprise perform poorly and the couple that have Express perform well.  I
use the disks in RAID mode on all of them.  I've tried a few non-Dell boxes
and they all perform well even though some of them are very old.  I've also
tried disabling iDRAC, the iDRAC NIC, and virtual storage for iDRAC, with no
success.

On Mon, Feb 26, 2018 at 9:28 AM, Serkan Çoban  wrote:

> I don't think it is related with iDRAC itself but some configuration
> is wrong or there is some hw error.
> Did you check battery of raid controller? Do you use disks in jbod
> mode or raid mode?
>
> On Mon, Feb 26, 2018 at 6:12 PM, Ryan Wilkinson 
> wrote:
> > Thanks for the suggestion.  I tried both of these with no difference in
> > performance.I have tried several other Dell hosts with Idrac Enterprise
> and
> > getting the same results.  I also tried a new Dell T130 with Idrac
> express
> > and was getting over 700 MB/s.  Any other users had this issues with
> Idrac
> > Enterprise??
> >
> >
> > On Thu, Feb 22, 2018 at 12:16 AM, Serkan Çoban 
> > wrote:
> >>
> >> "Did you check the BIOS/Power settings? They should be set for high
> >> performance.
> >> Also you can try to boot "intel_idle.max_cstate=0" kernel command line
> >> option to be sure CPUs not entering power saving states.
> >>
> >> On Thu, Feb 22, 2018 at 9:59 AM, Ryan Wilkinson 
> >> wrote:
> >> >
> >> >
> >> > I have a 3 host gluster replicated cluster that is providing storage
> for
> >> > our
> >> > RHEV environment.  We've been having issues with inconsistent
> >> > performance
> >> > from the VMs depending on which Hypervisor they are running on.  I've
> >> > confirmed throughput to be ~9Gb/s to each of the storage hosts from
> the
> >> > hypervisors.  I'm getting ~300MB/s disk read spead when our test vm is
> >> > on
> >> > the slow Hypervisors and over 500 on the faster ones.  The performance
> >> > doesn't seem to be affected much by the cpu, memory that are in the
> >> > hypervisors.  I have tried a couple of really old boxes and got over
> 500
> >> > MB/s.  The common thread seems to be that the poorly perfoming hosts
> all
> >> > have Dell's Idrac 7 Enterprise.  I have one Hypervisor that has Idrac
> 7
> >> > express and it performs well.  We've compared system packages and
> >> > versions
> >> > til we're blue in the face and have been struggling with this for a
> >> > couple
> >> > months but that seems to be the only common denominator.  I've tried
> on
> >> > one
> >> > of those Idrac 7 hosts to disable the nic, virtual drive, etc, etc.
> but
> >> > no
> >> > change in performance.  In addition, I tried 5 new hosts and all are
> >> > complying to the Idrac enterprise theory.  Anyone else had this
> issue?!
> >> >
> >> >
> >> >
> >> > ___
> >> > Gluster-users mailing list
> >> > Gluster-users@gluster.org
> >> > http://lists.gluster.org/mailman/listinfo/gluster-users
> >
> >
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster performance / Dell Idrac enterprise conflict

2018-02-26 Thread Serkan Çoban
I don't think it is related to iDRAC itself; more likely some configuration
is wrong or there is some hardware error.
Did you check the battery of the RAID controller? Do you use the disks in
JBOD mode or RAID mode?
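
If Dell OpenManage (omreport) or the MegaCli utility happens to be installed,
something like the following shows the battery state and the virtual disk /
cache configuration; adjust to whatever tooling is actually on the hosts
(binary names vary):

  # Dell OMSA, if installed
  omreport storage battery
  omreport storage vdisk

  # or with MegaCli (MegaCli / MegaCli64 / megacli depending on the package)
  MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL
  MegaCli64 -LDInfo -Lall -aALL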

On Mon, Feb 26, 2018 at 6:12 PM, Ryan Wilkinson  wrote:
> Thanks for the suggestion.  I tried both of these with no difference in
> performance.I have tried several other Dell hosts with Idrac Enterprise and
> getting the same results.  I also tried a new Dell T130 with Idrac express
> and was getting over 700 MB/s.  Any other users had this issues with Idrac
> Enterprise??
>
>
> On Thu, Feb 22, 2018 at 12:16 AM, Serkan Çoban 
> wrote:
>>
>> "Did you check the BIOS/Power settings? They should be set for high
>> performance.
>> Also you can try to boot "intel_idle.max_cstate=0" kernel command line
>> option to be sure CPUs not entering power saving states.
>>
>> On Thu, Feb 22, 2018 at 9:59 AM, Ryan Wilkinson 
>> wrote:
>> >
>> >
>> > I have a 3 host gluster replicated cluster that is providing storage for
>> > our
>> > RHEV environment.  We've been having issues with inconsistent
>> > performance
>> > from the VMs depending on which Hypervisor they are running on.  I've
>> > confirmed throughput to be ~9Gb/s to each of the storage hosts from the
>> > hypervisors.  I'm getting ~300MB/s disk read spead when our test vm is
>> > on
>> > the slow Hypervisors and over 500 on the faster ones.  The performance
>> > doesn't seem to be affected much by the cpu, memory that are in the
>> > hypervisors.  I have tried a couple of really old boxes and got over 500
>> > MB/s.  The common thread seems to be that the poorly perfoming hosts all
>> > have Dell's Idrac 7 Enterprise.  I have one Hypervisor that has Idrac 7
>> > express and it performs well.  We've compared system packages and
>> > versions
>> > til we're blue in the face and have been struggling with this for a
>> > couple
>> > months but that seems to be the only common denominator.  I've tried on
>> > one
>> > of those Idrac 7 hosts to disable the nic, virtual drive, etc, etc. but
>> > no
>> > change in performance.  In addition, I tried 5 new hosts and all are
>> > complying to the Idrac enterprise theory.  Anyone else had this issue?!
>> >
>> >
>> >
>> > ___
>> > Gluster-users mailing list
>> > Gluster-users@gluster.org
>> > http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster performance / Dell Idrac enterprise conflict

2018-02-26 Thread Ryan Wilkinson
Thanks for the suggestion.  I tried both of these with no difference in
performance.  I have tried several other Dell hosts with iDRAC Enterprise and
am getting the same results.  I also tried a new Dell T130 with iDRAC Express
and was getting over 700 MB/s.  Have any other users had this issue with
iDRAC Enterprise?

On Thu, Feb 22, 2018 at 12:16 AM, Serkan Çoban 
wrote:

> "Did you check the BIOS/Power settings? They should be set for high
> performance.
> Also you can try to boot "intel_idle.max_cstate=0" kernel command line
> option to be sure CPUs not entering power saving states.
>
> On Thu, Feb 22, 2018 at 9:59 AM, Ryan Wilkinson 
> wrote:
> >
> >
> > I have a 3 host gluster replicated cluster that is providing storage for
> our
> > RHEV environment.  We've been having issues with inconsistent performance
> > from the VMs depending on which Hypervisor they are running on.  I've
> > confirmed throughput to be ~9Gb/s to each of the storage hosts from the
> > hypervisors.  I'm getting ~300MB/s disk read spead when our test vm is on
> > the slow Hypervisors and over 500 on the faster ones.  The performance
> > doesn't seem to be affected much by the cpu, memory that are in the
> > hypervisors.  I have tried a couple of really old boxes and got over 500
> > MB/s.  The common thread seems to be that the poorly perfoming hosts all
> > have Dell's Idrac 7 Enterprise.  I have one Hypervisor that has Idrac 7
> > express and it performs well.  We've compared system packages and
> versions
> > til we're blue in the face and have been struggling with this for a
> couple
> > months but that seems to be the only common denominator.  I've tried on
> one
> > of those Idrac 7 hosts to disable the nic, virtual drive, etc, etc. but
> no
> > change in performance.  In addition, I tried 5 new hosts and all are
> > complying to the Idrac enterprise theory.  Anyone else had this issue?!
> >
> >
> >
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] rpc/glusterd-locks error

2018-02-26 Thread Vineet Khandpur
Good morning.

We have a 6 node cluster. 3 nodes are participating in a replica 3 volume.
Naming convention:
xx01 - 3 nodes participating in ovirt_vol
xx02 - 3 nodes NOT participating in ovirt_vol

Last week, we restarted glusterd on each node in the cluster to update (one at
a time).
The three xx01 nodes all show the following in glusterd.log:

[2018-02-26 14:31:47.330670] E [socket.c:2020:__socket_read_frag] 0-rpc:
wrong MSG-TYPE (29386) received from 172.26.30.9:24007
[2018-02-26 14:31:47.330879] W
[glusterd-locks.c:843:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.12.5/xlator/mgmt/glusterd.so(+0x2322a)
[0x7f46020e922a]
-->/usr/lib64/glusterfs/3.12.5/xlator/mgmt/glusterd.so(+0x2d198)
[0x7f46020f3198]
-->/usr/lib64/glusterfs/3.12.5/xlator/mgmt/glusterd.so(+0xe4755)
[0x7f46021aa755] ) 0-management: Lock for vol ovirtprod_vol not held
[2018-02-26 14:31:47.331066] E [rpc-clnt.c:350:saved_frames_unwind] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f460d64dedb] (-->
/lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f460d412e6e] (-->
/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f460d412f8e] (-->
/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x90)[0x7f460d414710] (-->
/lib64/libgfrpc.so.0(rpc_clnt_notify+0x2a0)[0x7f460d415200] )
0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called
at 2018-02-26 14:31:47.330496 (xid=0x72e0)
[2018-02-26 14:31:47.333993] E [socket.c:2020:__socket_read_frag] 0-rpc:
wrong MSG-TYPE (84253) received from 172.26.30.8:24007
[2018-02-26 14:31:47.334148] W
[glusterd-locks.c:843:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.12.5/xlator/mgmt/glusterd.so(+0x2322a)
[0x7f46020e922a]
-->/usr/lib64/glusterfs/3.12.5/xlator/mgmt/glusterd.so(+0x2d198)
[0x7f46020f3198]
-->/usr/lib64/glusterfs/3.12.5/xlator/mgmt/glusterd.so(+0xe4755)
[0x7f46021aa755] ) 0-management: Lock for vol ovirtprod_vol not held
[2018-02-26 14:31:47.334317] E [rpc-clnt.c:350:saved_frames_unwind] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f460d64dedb] (-->
/lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f460d412e6e] (-->
/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f460d412f8e] (-->
/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x90)[0x7f460d414710] (-->
/lib64/libgfrpc.so.0(rpc_clnt_notify+0x2a0)[0x7f460d415200] )
0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called
at 2018-02-26 14:31:47.333824 (xid=0x1494b)
[2018-02-26 14:31:48.511390] E [socket.c:2632:socket_poller]
0-socket.management: poll error on socket

Additionally, all show connectivity to 2 of the three hosts (itself and a
second). None of the 3 shows connectivity to the same host (xx01 shows
connectivity to itself and yy01, yy01 shows connectivity to itself and zz01,
and zz01 shows itself and xx01).

However, the xx02 hosts (not participating in the volume, same cluster) show
the volume info as fine, with all xx01 hosts participating in the volume.

In our dev environment, we had to stop the volume and restart glusterd on all
hosts; however, for prod that would mean a system-wide outage and downtime,
which needs to be avoided.
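
For reference, the dev-environment sequence was essentially the following
(volume name shown as a placeholder), which is exactly the kind of outage we
want to avoid in prod:

  # what we did in dev (full outage)
  gluster volume stop <volname>
  systemctl restart glusterd      # on every node
  gluster volume start <volname>

  # (the asymmetric connectivity above was compared with commands like these)
  gluster peer status
  gluster volume status <volname>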

Any suggestions? Thanks.

vk

Vineet Khandpur
UNIX System Administrator
Information Technology Services
University of Alberta Libraries
+1-780-492-4718
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Quorum in distributed-replicate volume

2018-02-26 Thread Dave Sherohman
On Mon, Feb 26, 2018 at 05:45:27PM +0530, Karthik Subrahmanya wrote:
> > "In a replica 2 volume... If we set the client-quorum option to
> > auto, then the first brick must always be up, irrespective of the
> > status of the second brick. If only the second brick is up, the
> > subvolume becomes read-only."
> >
> By default client-quorum is "none" in replica 2 volume.

I'm not sure where I saw the directions saying to set it, but I do have
"cluster.quorum-type: auto" in my volume configuration.  (And I think
that's client quorum, but feel free to correct me if I've misunderstood
the docs.)

> It applies to all replica 2 volumes, whether they have just 2 bricks or more.
> Total brick count in the volume doesn't matter for the quorum, what matters
> is the number of bricks which are up in the particular replica subvol.

Thanks for confirming that.

> If I understood your configuration correctly it should look something like
> this:
> (Please correct me if I am wrong)
> replica-1:  bricks 1 & 2
> replica-2: bricks 3 & 4
> replica-3: bricks 5 & 6

Yes, that's correct.

> Since quorum is per replica, if it is set to auto then it needs the first
> brick of the particular replica subvol to be up to perform the fop.
> 
> In replica 2 volumes you can end up in split-brains.

How would that happen if bricks which are not in (cluster-wide) quorum
refuse to accept writes?  I'm not seeing the reason for using individual
subvolume quorums instead of full-volume quorum.

> It would be great if you can consider configuring an arbiter or
> replica 3 volume.

I can.  My bricks are 2x850G and 4x11T, so I can repurpose the small
bricks as arbiters with minimal effect on capacity.  What would be the
sequence of commands needed to:

1) Move all data off of bricks 1 & 2
2) Remove that replica from the cluster
3) Re-add those two bricks as arbiters

(And did I miss any additional steps?)

Unfortunately, I've been running a few months already with the current
configuration and there are several virtual machines running off the
existing volume, so I'll need to reconfigure it online if possible.

-- 
Dave Sherohman
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Quorum in distributed-replicate volume

2018-02-26 Thread Karthik Subrahmanya
Hi Dave,

On Mon, Feb 26, 2018 at 4:45 PM, Dave Sherohman  wrote:

> I've configured 6 bricks as distributed-replicated with replica 2,
> expecting that all active bricks would be usable so long as a quorum of
> at least 4 live bricks is maintained.
>
The client quorum is configured per replica subvolume and not for the
entire volume.
Since you have a distributed-replicated volume with replica 2, the data
will have 2 copies, and taking the quorum on the total number of bricks,
as in your scenario, can lead to split-brains.

>
> However, I have just found
>
> http://docs.gluster.org/en/latest/Administrator%20Guide/
> Split%20brain%20and%20ways%20to%20deal%20with%20it/
>
> Which states that "In a replica 2 volume... If we set the client-quorum
> option to auto, then the first brick must always be up, irrespective of
> the status of the second brick. If only the second brick is up, the
> subvolume becomes read-only."
>
By default client-quorum is "none" in replica 2 volume.

>
> Does this apply only to a two-brick replica 2 volume or does it apply to
> all replica 2 volumes, even if they have, say, 6 bricks total?
>
It applies to all replica 2 volumes, whether they have just 2 bricks or more.
Total brick count in the volume doesn't matter for the quorum, what matters
is the number of bricks which are up in the particular replica subvol.

>
> If it does apply to distributed-replicated volumes with >2 bricks,
> what's the reasoning for it?  I would expect that, if the cluster splits
> into brick 1 by itself and bricks 2-3-4-5-6 still together, then brick 1
> will recognize that it doesn't have volume-wide quorum and reject
> writes, thus allowing brick 2 to remain authoritative and able to accept
> writes.
>
If I understood your configuration correctly it should look something like
this:
(Please correct me if I am wrong)
replica-1:  bricks 1 & 2
replica-2: bricks 3 & 4
replica-3: bricks 5 & 6
Since quorum is per replica, if it is set to auto then it needs the first
brick of the particular replica subvol to be up to perform the fop.

In replica 2 volumes you can end up in split-brains. It would be great if
you could consider configuring an arbiter or replica 3 volume.
You can find more details about their advantages over replica 2 volume in
the same document.
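
For anyone unfamiliar with the syntax, an arbiter volume is created along
these lines (host and brick names here are just placeholders); every third
brick in the list becomes the arbiter of its replica set:

  gluster volume create <volname> replica 3 arbiter 1 \
      host1:/bricks/b1 host2:/bricks/b2 host3:/bricks/arb1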

Regards,
Karthik

>
> --
> Dave Sherohman
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] NFS Ganesha HA w/ GlusterFS

2018-02-26 Thread Kaleb S. KEITHLEY
On 02/25/2018 08:29 PM, TomK wrote:
> Hey Guy's,
> 
> A success story instead of a question.
> 
> With your help, managed to get the HA component working with HAPROXY and
> keepalived to build a fairly resilient NFS v4 VM cluster.  ( Used
> Gluster, NFS Ganesha v2.60, HAPROXY, keepalived w/ selinux enabled )
> 
> If someone needs or it could help your work, please PM me for the
> written up post or I could just post here if the lists allow it.
>

Hi,

I strongly encourage you to write a blog post about it. And if not that
at least write about it and post it to the list(s).

I'm not sure why your post to nfs-ganesha-support was blocked. Maybe
it's waiting for some moderator attention. (But I don't believe that
list is moderated.)

Thanks for sharing.

-- 

Kaleb
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Quorum in distributed-replicate volume

2018-02-26 Thread Dave Sherohman
I've configured 6 bricks as distributed-replicated with replica 2,
expecting that all active bricks would be usable so long as a quorum of
at least 4 live bricks is maintained.
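
For context, the volume is the usual 6-brick distributed-replicated replica 2
layout, i.e. something created along these lines (names here are
illustrative), where consecutive pairs of bricks in the list form the replica
subvolumes:

  gluster volume create <volname> replica 2 \
      host1:/data/b1 host2:/data/b2 \
      host3:/data/b3 host4:/data/b4 \
      host5:/data/b5 host6:/data/b6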

However, I have just found

http://docs.gluster.org/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/

Which states that "In a replica 2 volume... If we set the client-quorum
option to auto, then the first brick must always be up, irrespective of
the status of the second brick. If only the second brick is up, the
subvolume becomes read-only."

Does this apply only to a two-brick replica 2 volume or does it apply to
all replica 2 volumes, even if they have, say, 6 bricks total?

If it does apply to distributed-replicated volumes with >2 bricks,
what's the reasoning for it?  I would expect that, if the cluster splits
into brick 1 by itself and bricks 2-3-4-5-6 still together, then brick 1
will recognize that it doesn't have volume-wide quorum and reject
writes, thus allowing brick 2 to remain authoritative and able to accept
writes.

-- 
Dave Sherohman
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users