Re: [Gluster-users] Kernel NFS on GlusterFS

2018-03-07 Thread Ondrej Valousek
Are you saying that accessing Gluster via NFS is actually faster than the native (FUSE)
client?
Still, I would like to know why we can't use the kernel NFS server on the data
bricks. I understand we can't use it on the MDS, as it can't support pNFS.

Ondrej

From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Jim Kinney
Sent: Wednesday, March 07, 2018 11:47 PM
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] Kernel NFS on GlusterFS

Gluster does the sync part better than corosync. It's not an active/passive
failover system; it's more all-active. Gluster handles the recovery once all
nodes are back online.

That requires the client tool chain to understand that a write goes to all
storage devices, not just the active one.

3.10 is a long-term support release. Upgrading to 3.12 or 4 is not a
significant issue once a replacement for NFS-Ganesha stabilizes.

Kernel NFS doesn't understand "write to two IP addresses"; that's what
NFS-Ganesha does. The gluster-fuse client works but is slower than most people
would like. I use the FUSE client in my setup at work and will be changing to
NFS-Ganesha as part of the upgrade to 3.10.

On Wed, 2018-03-07 at 14:50 -0500, Ben Mason wrote:
Hello,

I'm designing a 2-node, HA NAS that must support NFS. I had planned on using 
GlusterFS native NFS until I saw that it is being deprecated. Then, I was going 
to use GlusterFS + NFS-Ganesha until I saw that the Ganesha HA support ended 
after 3.10 and its replacement is still a WIP. So, I landed on GlusterFS + 
kernel NFS + corosync & pacemaker, which seems to work quite well. Are there 
any performance issues or other concerns with using GlusterFS as a replication 
layer and kernel NFS on top of that?

Thanks!

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

--
James P. Kinney III

Every time you stop a school, you will have to build a jail. What you
gain at one end you lose at the other. It's like feeding a dog on his
own tail. It won't fatten the dog.
- Speech 11/23/1900 Mark Twain

http://heretothereideas.blogspot.com/

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gluster for home directories?

2018-03-07 Thread Ondrej Valousek
OK,
I agree Gluster fits better in certain scenarios - but you just can't expect the
same performance you get from a NAS-based solution. This is especially true
when you deal with lots of relatively small files.
Ondrej


-Original Message-
From: Rik Theys [mailto:rik.th...@esat.kuleuven.be] 
Sent: Wednesday, March 07, 2018 7:38 PM
To: Ondrej Valousek 
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] gluster for home directories?

Hi,

On 2018-03-07 16:35, Ondrej Valousek wrote:
> Why do you need to replace your existing solution?
> If you don't need to scale out for capacity reasons, an async
> NFS server will always outperform GlusterFS.

The current solution is 8 years old and is reaching its end of life.

The reason we are also looking into Gluster is that we like that it uses
standard components and that we can avoid forklift upgrades every
5-7 years by replacing a few bricks each year. Next to providing storage for
home directories, we would also like to use the hosts to run VMs in a
hyperconverged setup (with their storage as an additional gluster volume on
those bricks).

Regards,

Rik


--
Rik Theys
System Engineer
KU Leuven - Dept. Elektrotechniek (ESAT)
Kasteelpark Arenberg 10 bus 2440 - B-3001 Leuven-Heverlee
+32(0)16/32.11.07


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Fixing a rejected peer

2018-03-07 Thread Jamie Lawrence

> On Mar 7, 2018, at 4:39 AM, Atin Mukherjee  wrote:
> 
> Please run 'gluster v get all cluster.max-op-version' and whatever value it
> throws up should be used to bump up the cluster.op-version (gluster v set all
> cluster.op-version ). With that, if you restart the rejected peer, I
> believe the problem should go away; if it doesn't, I'd need to investigate
> further once you can pass along the glusterd and cmd_history log files and the
> content of /var/lib/glusterd from all the nodes.

Thanks so much - that worked.

I clearly need to catch up on my Gluster reading - I thought I understood 
op-version, but clearly don't.

Anyway, thanks again for putting up with me.

Cheers,

-j
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Kernel NFS on GlusterFS

2018-03-07 Thread Jim Kinney
Gluster does the sync part better than corosync. It's not an
active/passive failover system; it's more all-active. Gluster handles the
recovery once all nodes are back online.
That requires the client tool chain to understand that a write goes to
all storage devices, not just the active one.
3.10 is a long-term support release. Upgrading to 3.12 or 4 is not a
significant issue once a replacement for NFS-Ganesha stabilizes.
Kernel NFS doesn't understand "write to two IP addresses"; that's what
NFS-Ganesha does. The gluster-fuse client works but is slower than most
people would like. I use the FUSE client in my setup at work and will be
changing to NFS-Ganesha as part of the upgrade to 3.10.
On Wed, 2018-03-07 at 14:50 -0500, Ben Mason wrote:
> Hello,
> I'm designing a 2-node, HA NAS that must support NFS. I had planned
> on using GlusterFS native NFS until I saw that it is being
> deprecated. Then, I was going to use GlusterFS + NFS-Ganesha until I
> saw that the Ganesha HA support ended after 3.10 and its replacement
> is still a WIP. So, I landed on GlusterFS + kernel NFS + corosync &
> pacemaker, which seems to work quite well. Are there any performance
> issues or other concerns with using GlusterFS as a replication layer
> and kernel NFS on top of that?
> 
> Thanks!
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
-- 
James P. Kinney III

Every time you stop a school, you will have to build a jail. What you
gain at one end you lose at the other. It's like feeding a dog on his
own tail. It won't fatten the dog.
- Speech 11/23/1900 Mark Twain

http://heretothereideas.blogspot.com/
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Kernel NFS on GlusterFS

2018-03-07 Thread Ben Mason
Hello,

I'm designing a 2-node, HA NAS that must support NFS. I had planned on
using GlusterFS native NFS until I saw that it is being deprecated. Then, I
was going to use GlusterFS + NFS-Ganesha until I saw that the Ganesha HA
support ended after 3.10 and its replacement is still a WIP. So, I landed
on GlusterFS + kernel NFS + corosync & pacemaker, which seems to work quite
well. Are there any performance issues or other concerns with using
GlusterFS as a replication layer and kernel NFS on top of that?
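
For concreteness, roughly the shape of what I'm testing - hostnames, IPs, paths
and resource names below are placeholders, the pcs syntax may vary between
pacemaker versions, and this assumes each NAS head mounts the gluster volume
locally via FUSE and re-exports it with kernel NFS:

# /etc/fstab on each NAS head: mount the gluster volume locally via FUSE
localhost:/tank  /export/tank  glusterfs  defaults,_netdev  0 0

# /etc/exports on each NAS head: export the FUSE mount over kernel NFS
# (an explicit fsid is usually needed when exporting a FUSE filesystem)
/export/tank  192.168.10.0/24(rw,sync,no_subtree_check,fsid=10)

# pacemaker: a floating IP that clients use to reach whichever head is active
pcs resource create nas_vip ocf:heartbeat:IPaddr2 \
    ip=192.168.10.50 cidr_netmask=24 op monitor interval=10s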

Thanks!
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gluster for home directories?

2018-03-07 Thread Rik Theys

Hi,

On 2018-03-07 16:35, Ondrej Valousek wrote:

Why do you need to replace your existing solution?
If you don't need to scale out for capacity reasons, an async
NFS server will always outperform GlusterFS.


The current solution is 8 years old and is reaching its end of life.

The reason we are also looking into Gluster is that we like that it uses 
standard components and that we can avoid forklift upgrades every 
5-7 years by replacing a few bricks each year. Next to providing 
storage for home directories, we would also like to use the hosts to run 
VMs in a hyperconverged setup (with their storage as an additional 
gluster volume on those bricks).


Regards,

Rik


--
Rik Theys
System Engineer
KU Leuven - Dept. Elektrotechniek (ESAT)
Kasteelpark Arenberg 10 bus 2440  - B-3001 Leuven-Heverlee
+32(0)16/32.11.07

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Intermittent mount disconnect due to socket poller error

2018-03-07 Thread Ryan Lee
I happened to review the status of volume clients and realized they were 
reporting a mix of different op-versions: 3.13 clients were still 
connecting to the downgraded 3.12 server (likely a timing issue between 
downgrading clients and mounting volumes).  Remounting the reported 
clients has resulted in the correct op-version all around and about a 
week free of these errors.
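
In case it helps anyone else, the checks were along these lines (volume name is
a placeholder; the exact columns shown depend on the gluster release):

# list the clients connected to a volume, including the op-version they report
gluster volume status myvol clients

# the op-version the cluster itself is currently running at
gluster volume get all cluster.op-version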


On 2018-03-01 12:38, Ryan Lee wrote:
Thanks for your response - is there more that would be useful in 
addition to what I already attached?  We're logging at default level on 
the brick side and at error on clients.  I could turn it up for a few 
days to try to catch this problem in action (it's happened several more 
times since I first wrote).


On 2018-02-28 18:38, Raghavendra Gowdappa wrote:

Is it possible to attach logfiles of problematic client and bricks?

On Thu, Mar 1, 2018 at 3:00 AM, Ryan Lee wrote:


    We've been on the Gluster 3.7 series for several years with things
    pretty stable.  Given that it's reached EOL, yesterday I upgraded to
    3.13.2.  Every Gluster mount and server was disabled then brought
    back up after the upgrade, changing the op-version to 31302 and then
    trying it all out.

    It went poorly.  Every sizable read and write (100's of MB) led to
    'Transport endpoint not connected' errors on the command line and
    immediate unavailability of the mount.  After unsuccessfully trying
    to search for similar problems with solutions, I ended up
    downgrading to 3.12.6 and changing the op-version to 31202.  That
    brought us back to usability with the majority of those operations
    succeeding enough to consider it online, but there are still
    occasional mount disconnects that we never saw with 3.7 - about 6 in
    the past 18 hours.  It seems these disconnects would never come
    back, either, unless manually re-mounted.  Manually remounting
    reconnects immediately.  They only disconnect the affected client,
    though some simultaneous disconnects have occurred due to
    simultaneous activity.  The lower-level log info seems to indicate a
    socket problem, potentially broken on the client side based on
    timing (but the timing is narrow, and I won't claim the clocks are
    that well synchronized across all our servers).  The client and one
    server claim a socket polling error with no data available, and the
    other server claims a writev error.  This seems to lead the client
    to the 'all subvolumes are down' state, even though all other
    clients are still connected.  Has anybody run into this?  Did I miss
    anything moving so many versions ahead?

    I've included the output of volume info and some excerpts from the
    logs.   We have two servers running glusterd and two replica volumes
    with a brick on each server.  Both experience disconnects; there are
    about 10 clients for each, with one using both.  We use SSL over
    internal IPv4. Names in all caps were replaced, as were IP addresses.

    Let me know if there's anything else I can provide.

    % gluster v info VOL
    Volume Name: VOL
    Type: Replicate
    Volume ID: 3207155f-02c6-447a-96c4-5897917345e0
    Status: Started
    Snapshot Count: 0
    Number of Bricks: 1 x 2 = 2
    Transport-type: tcp
    Bricks:
    Brick1: SERVER1:/glusterfs/VOL-brick1/data
    Brick2: SERVER2:/glusterfs/VOL-brick2/data
    Options Reconfigured:
    config.transport: tcp
    features.selinux: off
    transport.address-family: inet
    nfs.disable: on
    client.ssl: on
    performance.readdir-ahead: on
    auth.ssl-allow: [NAMES, including CLIENT]
    server.ssl: on
    ssl.certificate-depth: 3

    Log excerpts (there was nothing related in glusterd.log):

    CLIENT:/var/log/glusterfs/mnt-VOL.log
    [2018-02-28 19:35:58.378334] E [socket.c:2648:socket_poller]
    0-VOL-client-1: socket_poller SERVER2:49153 failed (No data 
available)

    [2018-02-28 19:35:58.477154] E [MSGID: 108006]
    [afr-common.c:5164:__afr_handle_child_down_event] 0-VOL-replicate-0:
    All subvolumes are down. Going offline until atleast one of them
    comes back up.
    [2018-02-28 19:35:58.486146] E [MSGID: 101046]
    [dht-common.c:1501:dht_lookup_dir_cbk] 0-VOL-dht: dict is null <67
    times>
    
    [2018-02-28 19:38:06.428607] E [socket.c:2648:socket_poller]
    0-VOL-client-1: socket_poller SERVER2:24007 failed (No data 
available)

    [2018-02-28 19:40:12.548650] E [socket.c:2648:socket_poller]
    0-VOL-client-1: socket_poller SERVER2:24007 failed (No data 
available)


    


    SERVER2:/var/log/glusterfs/bricks/VOL-brick2.log
    [2018-02-28 19:35:58.379953] E [socket.c:2632:socket_poller]
    0-tcp.VOL-server: poll error on socket
    [2018-02-28 19:35:58.380530] I [MSGID: 115036]
    [server.c:527:server_rpc_notify] 0-VOL-server: disconnecting
    connection from 
CLIENT-30688-2018/02/28-03:11:39:784734-VOL-client-1-0-0

    [2018-02-28 19:35:58.380932] I 

Re: [Gluster-users] gluster debian build repo redirection loop on apt-get update on docker

2018-03-07 Thread Paul Anderson
Thanks for the feedback! I was able to fix my problem... it turns out the
standard Docker debian and php:cli images don't include a critical
package: apt-transport-https.

So to test the GlusterFS client using Docker, start with a Dockerfile like this:

# you can also use 'FROM debian'
FROM php:cli
# use up to date gluster.org built packages
RUN apt-get update

# this is key - you can't install gluster packages without it:
RUN apt-get --assume-yes install apt-transport-https
RUN apt-get --assume-yes install wget gnupg gnupg1 gnupg2
RUN wget -O - http://download.gluster.org/pub/gluster/glusterfs/3.13/rsa.pub \
    | apt-key add -
RUN echo "deb [arch=amd64] https://download.gluster.org/pub/gluster/glusterfs/3.13/LATEST/Debian/stretch/amd64/apt stretch main" \
    > /etc/apt/sources.list.d/gluster.list
RUN apt-get update

# now install the glusterfs client code and a few other useful tools
RUN apt-get --assume-yes install glusterfs-client
RUN apt-get --assume-yes install iputils-ping
RUN apt-get --assume-yes install iproute2
RUN apt-get --assume-yes install strace

# setup to mount the gluster volume
RUN mkdir -pv /etc/glusterfs
COPY fstab /etc/fstab
COPY dockerstore.vol /etc/glusterfs

# copy our test php file in
COPY sqlite_tester_flock.php /usr/bin
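
Roughly how I build and run it, for reference (the image name is arbitrary, and
mounting a FUSE filesystem inside a container generally needs extra privileges,
so the exact flags may vary with your Docker setup):

docker build -t gluster-client-test .
docker run --rm -it --cap-add SYS_ADMIN --device /dev/fuse \
    gluster-client-test bash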



Also, as you noted, my package path was botched - I'd tried a lot of
different ones and got the same redirect loop for all of them, but that
was due to the apt-transport-https package not being installed. On
Ubuntu there is a symlink for the HTTPS transport file, but on
Debian it appears to have to be a real file.

Paul


On Tue, Mar 6, 2018 at 10:28 PM, Kaleb S. KEITHLEY  wrote:
> On 03/06/2018 05:50 PM, Paul Anderson wrote:
>> When I follow the directions at
>> http://docs.gluster.org/en/latest/Install-Guide/Install/ to install
>> the latest gluster on a debian 9 docker container, I get the following
>> error:
>
> Files in the .../3.13/3.13.2 directory had the wrong owner/group,
> (rsync_aide). I'm not sure why, maybe an incomplete rsync? I've fixed
> the owners and reset the selinux context, although nobody else has
> complained about it not being able to install Debian packages from the
> repos.
>
> But independent of that, your package path looks wrong. You have:
>   https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/...
> but I believe it should be:
>   https://download.gluster.org/pub/gluster/glusterfs/3.13/LATEST/Debian/...
>
> Double check that your /etc/apt/sources.list.d/gluster.list file
> consists of the single line:
>
>   deb [arch=amd64]
> https://download.gluster.org/pub/gluster/glusterfs/3.13/LATEST/Debian/stretch/amd64/apt
> stretch main
>
>
>>
>> Step 6/15 : RUN echo deb [arch=amd64]
>> https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/stretch/amd64/apt/
>> stretch main > /etc/apt/sources.list.d/gluster.list
>>  ---> Running in 1ef386afb192
>> Removing intermediate container 1ef386afb192
>>  ---> f3c99b2a7c7a
>> Step 7/15 : RUN apt-get update
>>  ---> Running in 3a4744736ab0
>> Ign:1 
>> https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/stretch/amd64/apt
>> stretch InRelease
>> Ign:2 
>> https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/stretch/amd64/apt
>> stretch Release
>> Ign:3 
>> https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/stretch/amd64/apt
>> stretch/main all Packages
>> Ign:4 
>> https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/stretch/amd64/apt
>> stretch/main amd64 Packages
>> Ign:3 
>> https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/stretch/amd64/apt
>> stretch/main all Packages
>> Hit:5 http://security.debian.org stretch/updates InRelease
>> Ign:4 
>> https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/stretch/amd64/apt
>> stretch/main amd64 Packages
>> Ign:3 
>> https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/stretch/amd64/apt
>> stretch/main all Packages
>> Ign:4 
>> https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/stretch/amd64/apt
>> stretch/main amd64 Packages
>> Ign:3 
>> https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/stretch/amd64/apt
>> stretch/main all Packages
>> Ign:4 
>> https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/stretch/amd64/apt
>> stretch/main amd64 Packages
>> Ign:3 
>> https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/stretch/amd64/apt
>> stretch/main all Packages
>> Ign:4 
>> https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/stretch/amd64/apt
>> stretch/main amd64 Packages
>> Ign:3 
>> https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/stretch/amd64/apt
>> stretch/main all Packages
>> Ign:6 http://cdn-fastly.deb.debian.org/debian stretch InRelease
>> Err:4 
>> https://download.gluster.org/pub/gluster/glusterfs/3.13/9/Debian/stretch/amd64/apt
>> stretch/main amd64 Packages
>>   Redirection loop encountered
>> Hit:7 http://cdn-fastly.deb.debian.org/debian stretch-updates InRelease
>> Hit:8 

Re: [Gluster-users] gluster for home directories?

2018-03-07 Thread Ondrej Valousek
Hi,
Why do you need to replace your existing solution?
If you don't need to scale out for capacity reasons, an async NFS
server will always outperform GlusterFS.

Ondrej

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] gluster for home directories?

2018-03-07 Thread Rik Theys
Hi,

We are looking into replacing our current storage solution and are
evaluating gluster for this purpose. Our current solution uses a SAN
with two servers attached that serve Samba and NFS 4. Clients connect to
those servers using NFS or SMB. All users' home directories live on this
storage.

I would like to have some insight into who else is using gluster for home
directories for about 500 users and what performance they get out of the
solution. Which connectivity method are you using on the clients
(gluster native, NFS, SMB)? Which volume options do you have configured
for your gluster volume? What hardware are you using? Are you using
snapshots and/or quota? If so, any numbers on the performance impact?

The solution I had in mind for our setup is multiple servers/bricks with a
replica 3 arbiter 1 volume, where each server is also running nfs-ganesha
and Samba in HA. Clients would connect to one of the NFS servers
(DNS round robin). In this case the NFS servers would be the gluster
clients. Gluster traffic would go over a dedicated network with 10G and
jumbo frames.
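
To make that layout concrete, roughly the volume creation I have in mind
(hostnames, brick paths and the volume name are placeholders):

# replica 3 arbiter 1, two subvolumes -> 2 x (2 + 1) = 6 bricks
gluster volume create homes replica 3 arbiter 1 \
    srv1:/bricks/b1/homes srv2:/bricks/b1/homes srv3:/bricks/arb1/homes \
    srv3:/bricks/b2/homes srv4:/bricks/b2/homes srv1:/bricks/arb2/homes
gluster volume start homes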

I'm currently testing gluster (3.12, now 3.13) on older machines[1] and
have created a replica 3 arbiter 1 volume 2x(2+1). I seem to run into all
sorts of (performance) problems. I must be doing something wrong but
I've tried all sorts of benchmarks and nothing seems to make my setup
live up to what I would expect from this hardware.

* I understand that gluster only starts to work well when multiple
clients are connecting in parallel, but I did expect the single client
performance to be better.

* Unpacking the linux-4.15.7.tar.xz file on the brick XFS filesystem
followed by a sync takes about 1 minute. Doing the same on the gluster
volume using the fuse client (client is one of the brick servers) takes
over 9 minutes and neither disk nor cpu nor network are reaching their
bottleneck. Doing the same over NFS-ganesha (client is a workstation
connected through gbit) takes even longer (more than 30min!?).

I understand that unpacking a lot of small files may be the worst
workload for a distributed filesystem, but when I look at the file sizes
of the files in our users' home directories, more than 90% is smaller
than 1MB.

* A file copy of a 300GB file over NFS 4 (nfs-ganesha) starts fast
(90MB/s) and then drops to 20MB/s. When I look at the servers during the
copy, I don't see where the bottleneck is as the cpu, disk and network
are not maxing out (on none of the bricks). When the same client copies
the file to our current NFS storage it is limited by the gbit network
connection of the client.

* I had the 'cluster.optimize-lookup' option enabled but ran into all
sorts of issues where ls is showing either the wrong files (content of a
different directory), or claiming a directory does not exist when mkdir
says it already exists... I currently have the following options set:

server.outstanding-rpc-limit: 256
client.event-threads: 4
performance.io-thread-count: 16
performance.parallel-readdir: on
server.event-threads: 4
performance.cache-size: 2GB
performance.rda-cache-limit: 128MB
performance.write-behind-window-size: 8MB
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
network.inode-lru-limit: 50
performance.nl-cache-timeout: 600
performance.nl-cache: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
transport.address-family: inet
nfs.disable: on
cluster.enable-shared-storage: enable

The brick servers have two dual-core CPUs, so I've set the client and
server event threads to 4.
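
For completeness, the options above are applied with plain volume-set commands
like these (the volume name is a placeholder):

gluster volume set homes client.event-threads 4
gluster volume set homes server.event-threads 4
gluster volume set homes performance.parallel-readdir on
gluster volume set homes performance.cache-size 2GB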

* When using nfs-ganesha I run into bugs that make me wonder who is
using nfs-ganesha with gluster and why they are not hitting these bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1543996
https://bugzilla.redhat.com/show_bug.cgi?id=1405147

* nfs-ganesha does not have the 'async' option that kernel nfs has. I
can understand why they don't want to implement this feature, but I do
wonder how others are increasing their nfs-ganesha performance. I've put
some SSDs in each brick server and have them configured as lvmcache for
the bricks. This setup only increases throughput once the data is on the
SSD and not for just-written data.
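
The lvmcache setup is roughly the following (VG, LV and device names are
placeholders; check lvmcache(7) for the exact options of your LVM version):

# add the SSD to the brick VG and attach it as a cache to the brick LV
pvcreate /dev/nvme0n1
vgextend vg_bricks /dev/nvme0n1
lvcreate -L 200G -n brick1_cache vg_bricks /dev/nvme0n1
lvcreate -L 2G -n brick1_cache_meta vg_bricks /dev/nvme0n1
lvconvert --type cache-pool --poolmetadata vg_bricks/brick1_cache_meta \
    vg_bricks/brick1_cache
lvconvert --type cache --cachepool vg_bricks/brick1_cache vg_bricks/brick1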

Regards,

Rik

[1] 4 servers with two 1Gbit NICs (one for the client traffic, one for
server-to-server traffic with jumbo frames enabled). Each server has two
disks (bricks).

[2] ioping from the nfs client shows the following latencies:
min/avg/max/mdev = 695.2 us / 2.17 ms / 7.05 ms / 1.92 ms

ping rtt from client to nfs-ganesha server:
rtt min/avg/max/mdev = 0.106/1.551/6.195/2.098 ms

ioping on the volume fuse mounted from a brick:
min/avg/max/mdev = 557.0 us / 824.4 us / 2.68 ms / 421.9 us

ioping on the brick xfs filesystem:
min/avg/max/mdev = 275.2 us / 515.2 us / 12.4 ms / 1.21 ms

Are these normal numbers?


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Fixing a rejected peer

2018-03-07 Thread Atin Mukherjee
Please run 'gluster v get all cluster.max-op-version' and whatever value
it throws up should be used to bump up the cluster.op-version (gluster v
set all cluster.op-version ). With that, if you restart the rejected
peer, I believe the problem should go away; if it doesn't, I'd need to
investigate further once you can pass along the glusterd and cmd_history log
files and the content of /var/lib/glusterd from all the nodes.
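
Roughly, assuming a systemd-managed glusterd on the rejected peer:

# on any node: find the highest op-version the cluster can support
gluster v get all cluster.max-op-version

# bump the cluster to that value (substitute the number reported above)
gluster v set all cluster.op-version <max-op-version>

# on the rejected peer only: restart glusterd
systemctl restart glusterd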

On Wed, Mar 7, 2018 at 4:13 AM, Jamie Lawrence wrote:

>
> > On Mar 5, 2018, at 6:41 PM, Atin Mukherjee  wrote:
>
> > I'm tempted to repeat - down things, copy the checksum the "good" ones
> agree on, start things; but given that this has turned into a
> balloon-squeezing exercise, I want to make sure I'm not doing this the
> wrong way.
> >
> > Yes, that's the way. Copy /var/lib/glusterd/vols// from the
> good node to the rejected one and restart glusterd service on the rejected
> peer.
>
>
> My apologies for the multiple messages - I'm having to work on this
> episodically.
>
> I've tried again to reset state on the bad peer, to no avail. This time I
> downed all of the peers, copied things over, ensuring that the tier-enabled
> line was absent, and started back up; the cksum immediately changed to a
> bad value, the two good nodes added that line in, and the bad node didn't
> have it.
>
> Just to have a clear view of this, I did it yet again, this time ensuring
> the tier-enabled line was present everywhere. Same result, except that it
> didn't add the tier-enabled line, which I suppose makes some sense.
>
> One oddity -  I see:
>
> # gluster v get all cluster.op-version
> Option  Value
> --  -
> cluster.op-version  30800
>
> but from one of the `info` files:
>
> op-version=30712
> client-op-version=30712
>
> I don't know what it means that the cluster is at one version but
> apparently the volume is set for another - I thought that was a
> cluster-level setting. (Client.op-version theoretically makes more sense -
> I can see Ovirt wanting an older version.)
>
> I'm at a loss to fix this - copying /var/lib/glusterd/vol/ over
> doesn't fix the problem. I'd be somewhat OK with trashing the volume and
> starting over, if it weren't for two things: (1) Ovirt was also a massive
> pain to set up, and it is configured on this volume. And (2), perhaps more
> importantly, I'm concerned with this happening again once this is in
> production, which would be Bad, especially if I don't have a fix.
>
> So at this point, I'm unclear on how to move forward or even where more to
> look for potential problems.
>
> -j
>
> - - - -
>
> [2018-03-06 22:30:32.421530] I [MSGID: 106490] [glusterd-handler.c:2540:__
> glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from
> uuid: 77cdfbba-348c-43fe-ab3d-00621904ea9c
> [2018-03-06 22:30:32.422582] E [MSGID: 106010] [glusterd-utils.c:3374:
> glusterd_compare_friend_volume] 0-management: Version of Cksums
> sc5-ovirt_engine differ. local cksum = 3949237931, remote cksum =
> 2068896937 on peer sc5-gluster-10g-1.squaretrade.com
> [2018-03-06 22:30:32.422774] I [MSGID: 106493] 
> [glusterd-handler.c:3800:glusterd_xfer_friend_add_resp]
> 0-glusterd: Responded to sc5-gluster-10g-1.squaretrade.com (0), ret: 0,
> op_ret: -1
> [2018-03-06 22:30:32.424621] I [MSGID: 106493] 
> [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk]
> 0-glusterd: Received RJT from uuid: 77cdfbba-348c-43fe-ab3d-00621904ea9c,
> host: sc5-gluster-10g-1.squaretrade.com, port: 0
> [2018-03-06 22:30:32.425563] I [MSGID: 106493] 
> [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk]
> 0-glusterd: Received RJT from uuid: c1877e0d-ccb2-401e-83a6-e4a680af683a,
> host: sc5-gluster-2.squaretrade.com, port: 0
> [2018-03-06 22:30:32.426706] I [MSGID: 106163]
> [glusterd-handshake.c:1316:__glusterd_mgmt_hndsk_versions_ack]
> 0-management: using the op-version 30800
> [2018-03-06 22:30:32.428075] I [MSGID: 106490] [glusterd-handler.c:2540:__
> glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from
> uuid: c1877e0d-ccb2-401e-83a6-e4a680af683a
> [2018-03-06 22:30:32.428325] E [MSGID: 106010] [glusterd-utils.c:3374:
> glusterd_compare_friend_volume] 0-management: Version of Cksums
> sc5-ovirt_engine differ. local cksum = 3949237931, remote cksum =
> 2068896937 on peer sc5-gluster-2.squaretrade.com
> [2018-03-06 22:30:32.428468] I [MSGID: 106493] 
> [glusterd-handler.c:3800:glusterd_xfer_friend_add_resp]
> 0-glusterd: Responded to sc5-gluster-2.squaretrade.com (0), ret: 0,
> op_ret: -1
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] [Gluster-Maintainers] Release 4.0: RC1 tagged

2018-03-07 Thread Shyam Ranganathan
On 03/05/2018 09:05 AM, Javier Romero wrote:
>> I am about halfway through my own upgrade testing (using centOS7
>> containers), and it is patterned around this [1], in case that helps.
> Taking a look at this.
> 
> 


Thanks for confirming the install of the bits.

On the upgrade front, I did find some issues that have since been fixed. We
are in the process of rolling out the GA (general availability)
packages for 4.0.0, and if you have not started on the upgrades, I would
wait till these are announced before testing them out.

We usually test the upgrades (and package sanity all over again on the
GA bits) before announcing the release.

Thanks again,
Shyam
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Write speed of gluster volume reduced

2018-03-07 Thread Sherin George
Hi Guys,

I have a gluster volume with the following configuration.
~
Number of Bricks: 9
Transport-type: tcp

Hot Tier :
Hot Tier Type : Replicate
Number of Bricks: 1 x 3 = 3

Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 3 = 6

performance.flush-behind: on
performance.cache-max-file-size: 128MB
performance.cache-size: 25GB

diagnostics.count-fop-hits: off
diagnostics.latency-measurement: off
cluster.tier-mode: cache
features.ctr-enabled: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
~

Without the write cache, bricks in the hot tier got around 1.5 GB/s write speed
and bricks in the cold tier got 700 MB/s. However, I am getting only around 35
MB/s write speed on the gluster volume mounted via the glusterfs (FUSE) client.
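
For anyone who wants to reproduce the measurement, a simple sequential write
test on the mount point would be something like this (path and size are just
examples; conv=fdatasync makes dd flush the data before reporting a rate):

dd if=/dev/zero of=/mnt/glustervol/ddtest bs=1M count=4096 conv=fdatasync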

I checked and confirmed that I can get network throughput of as much as 1.4
Gbit/s.

The volume had around 175 MB/s write speed initially; however, the speed dropped
as more data was stored in the volume. I also set up a second volume with a
similar configuration, and on it I am getting around 175 MB/s write speed again.

Could someone please advise on what could be done to improve the write speed?

Is NFS-Ganesha a solution to improve write speed?

Thanks in advance.
--
Kind Regards,
Sherin

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users