[Gluster-users] Fwd: Re: [Gluster-Maintainers] [gluster-packaging] glusterfs-4.0.0 released

2018-03-08 Thread Shyam Ranganathan
Forwarding to the devel and users groups as well.

We have tagged 4.0.0 branch as GA, and are in the process of building
packages.

It would be a good time to run final install/upgrade tests on these
packages if you get a chance (I am running off to do the same now).

Thanks,
Shyam

 Forwarded Message 
Subject: Re: [Gluster-Maintainers] [gluster-packaging] glusterfs-4.0.0
released
Date: Thu, 8 Mar 2018 18:06:40 -0500
From: Kaleb S. KEITHLEY 
To: GlusterFS Maintainers , packag...@gluster.org


On 03/06/2018 10:25 AM, jenk...@build.gluster.org wrote:
> SRC: 
> https://build.gluster.org/job/release-new/45/artifact/glusterfs-4.0.0.tar.gz
> HASH: 
> https://build.gluster.org/job/release-new/45/artifact/glusterfs-4.0.0.sha512sum
> 
> This release is made off jenkins-release-45
> 

Status update:

I've made some progress on Debian packaging of glusterd2. Debian golang
packaging using dh-make-golang is strongly biased toward downloading the
$HEAD from github and building from that. I haven't been able to find
anything for building from a source (-vendor or not) tarball. Mostly
it's trial and error trying to defeat the dh-helper voodoo magic. If
anyone knows a better way, please speak up. Have I mentioned that I hate
Debian packaging?
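
For anyone curious, the gap I keep hitting looks roughly like this (the
tarball name below is illustrative, not the exact artifact):

# what dh-make-golang wants to do: clone the current HEAD from github and package that
dh-make-golang github.com/gluster/glusterd2

# what we actually need: start from the release (-vendor) tarball and drive
# the build from the unpacked tree once a debian/ directory is in place
tar xf glusterd2-v4.0.0-vendor.tar.xz   # illustrative name
cd glusterd2-v4.0.0
dpkg-buildpackage -us -uc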

In the meantime:

glusterfs-4.0.0 packages for:

* Fedora 26, 27, and 28 are on download.gluster.org at [1]. Fedora 29
packages are in the Fedora Rawhide repo. Use `dnf` to install.

* Debian Stretch/9 and Buster/10 (Sid) are on download.gluster.org at [1]

* Xenial/16.04, Artful/17.10, and Bionic/18.04 are on Launchpad at [2]

* SuSE SLES12SP3, Leap42.3, and Tumbleweed are on OpenSuSE Build Service
at [3].

* RHEL/CentOS el7 and el6 (el6 client-side only) in CentOS Storage SIG
at [4].


glusterd2-4.0.0 packages for:

* Fedora 26, 27, 28, and 29 are on download.gluster.org at [5].
Eventually rpms will be available in Fedora (29 probably) pending
completion of package review.

* RHEL/CentOS el7 in CentOS Storage SIG at [4].

* SuSE SLES12SP3, Leap42.3, and Tumbleweed are on OpenSuSE Build Service
at [3]. glusterd2 rpms are in the same repos with the matching glusterfs
rpms.

All the LATEST and STM-4.0 symlinks have been created or updated to
point to the 4.0.0 release.

Please test the CentOS packages and give feedback so that packages can
be tagged for release.

And of course the Debian and Ubuntu glusterfs packages are usable
without glusterd2, so go ahead and start using them now.
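
For example, once the relevant repo is enabled, installs should be roughly:

# Fedora, with the repo from [1] (or Rawhide for F29):
dnf install glusterfs-server

# Ubuntu, from the Launchpad PPA at [2]:
add-apt-repository ppa:gluster/glusterfs-4.0
apt update && apt install glusterfs-server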

[1] https://download.gluster.org/pub/gluster/glusterfs/4.0
[2] https://launchpad.net/~gluster/+archive/ubuntu/glusterfs-4.0
[3] https://build.opensuse.org/project/subprojects/home:glusterfs
[4] https://buildlogs.centos.org/centos/7/storage/$arch/gluster-4.0
[5] https://download.gluster.org/pub/gluster/glusterd2/4.0

___
maintainers mailing list
maintain...@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Kernel NFS on GlusterFS

2018-03-08 Thread Joe Julian



On 03/07/18 14:47, Jim Kinney wrote:
[snip]. 
The gluster-fuse client works but is slower than most people like. I 
use the fuse process in my setup at work. ...


That depends on the use case and configuration. With client-side caching 
and cache invalidation, a good number of the performance complaints can 
be addressed in a way similar to (and sometimes better than) how NFS makes 
things fast.
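
For reference, the client-side caching I mean is the md-cache plus upcall
cache-invalidation feature set, enabled per volume with something like the
following (<volname> is a placeholder, and the timeouts/limits are just
typical values, not a tuned recommendation):

gluster volume set <volname> features.cache-invalidation on
gluster volume set <volname> features.cache-invalidation-timeout 600
gluster volume set <volname> performance.stat-prefetch on
gluster volume set <volname> performance.cache-invalidation on
gluster volume set <volname> performance.md-cache-timeout 600
gluster volume set <volname> network.inode-lru-limit 50000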




On Wed, 2018-03-07 at 14:50 -0500, Ben Mason wrote:

Hello,

I'm designing a 2-node, HA NAS that must support NFS. I had planned 
on using GlusterFS native NFS until I saw that it is being 
deprecated. Then, I was going to use GlusterFS + NFS-Ganesha until I 
saw that the Ganesha HA support ended after 3.10 and its replacement 
is still a WIP. So, I landed on GlusterFS + kernel NFS + corosync & 
pacemaker, which seems to work quite well. Are there any performance 
issues or other concerns with using GlusterFS as a replication layer 
and kernel NFS on top of that?


Thanks!
___
Gluster-users mailing list
Gluster-users@gluster.org 
http://lists.gluster.org/mailman/listinfo/gluster-users

--
James P. Kinney III

Every time you stop a school, you will have to build a jail. What you
gain at one end you lose at the other. It's like feeding a dog on his
own tail. It won't fatten the dog.
- Speech 11/23/1900 Mark Twain

http://heretothereideas.blogspot.com/


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users



Re: [Gluster-users] Kernel NFS on GlusterFS

2018-03-08 Thread Joe Julian
There has been a deadlock problem in the past where the knfs module and 
the fuse module each need more memory to satisfy a fop and neither 
can acquire that memory due to competing locks. This caused an infinite 
wait. I'm not sure whether anything was ever done in the kernel to remedy that.
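
For what it's worth, the corosync/pacemaker piece of the setup described
below usually reduces to a floating IP plus an nfsserver resource; a minimal
sketch (the IP, netmask, group name, and state directory are placeholders,
and the gluster volume is assumed to be fuse-mounted at /mnt/gluster on both
nodes):

pcs resource create nfs_daemon ocf:heartbeat:nfsserver \
    nfs_shared_infodir=/mnt/gluster/nfs_state --group nfs_ha
pcs resource create nfs_vip ocf:heartbeat:IPaddr2 \
    ip=192.0.2.10 cidr_netmask=24 --group nfs_ha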



On 03/07/18 11:50, Ben Mason wrote:

Hello,

I'm designing a 2-node, HA NAS that must support NFS. I had planned on 
using GlusterFS native NFS until I saw that it is being deprecated. 
Then, I was going to use GlusterFS + NFS-Ganesha until I saw that the 
Ganesha HA support ended after 3.10 and its replacement is still a 
WIP. So, I landed on GlusterFS + kernel NFS + corosync & pacemaker, 
which seems to work quite well. Are there any performance issues or 
other concerns with using GlusterFS as a replication layer and kernel 
NFS on top of that?


Thanks!


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users



Re: [Gluster-users] Write speed of gluster volume reduced

2018-03-08 Thread Vlad Kopylov
http://lists.gluster.org/pipermail/gluster-users/2017-July/031788.html
http://lists.gluster.org/pipermail/gluster-users/2017-September/032385.html
Also try setting disperse.eager-lock off.
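
That is an ordinary volume option, e.g. (with your volume name in place of
<volname>):

gluster volume set <volname> disperse.eager-lock off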

On Tue, Mar 6, 2018 at 7:40 AM, Sherin George  wrote:
> Hi Guys,
>
> I have a gluster volume with the following configuration.
> ~
> Number of Bricks: 9
> Transport-type: tcp
>
> Hot Tier :
> Hot Tier Type : Replicate
> Number of Bricks: 1 x 3 = 3
>
> Cold Tier:
> Cold Tier Type : Distributed-Replicate
> Number of Bricks: 2 x 3 = 6
>
> performance.flush-behind: on
> performance.cache-max-file-size: 128MB
> performance.cache-size: 25GB
>
> diagnostics.count-fop-hits: off
> diagnostics.latency-measurement: off
> cluster.tier-mode: cache
> features.ctr-enabled: on
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
> ~
>
> Without write cache, bricks in the hot tier got around 1.5 GB/s write speed
> and bricks in the cold tier got 700 MB/s. However, I am getting only around
> 35 MB/s write speed on the gluster volume mounted via glusterfs.
>
> I checked and confirmed that I can get network throughput of as much as 1.4
> Gbit/s.
>
> The volume had around 175 MB/s write speed initially; however, the speed
> decreased as more data was stored in the volume. I also implemented a second
> volume with a similar configuration, and I am getting write speed of around
> 175 MB/s again.
>
> Could someone please advise on what could be done to improve the write speed?
>
> Is NFS Ganesha a solution to improve write speed?
>
> Thanks in advance.
> --
> Kind Regards,
> Sherin
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] SQLite3 on 3 node cluster FS?

2018-03-08 Thread Paul Anderson
I was able to get the Docker containers I'm using for testing to
install the latest builds from gluster.org.

So client/server versions are both 3.13.2

I am testing two main cases, both using sqlite3. With a php program
wrapping all database operations with an flock(), it now works as
expected. I ran the same test 500 times (or so) yesterday afternoon,
and it worked every time.

I repeated that same test both with and without
performance.flush-behind/write-behind enabled with the same result.

So that's great!

When I ran my other test case, just letting sqlite3's fcntl()-style
locks manage the data, the test failed with either performance setting.

So it could be that sqlite3 is not correctly managing its lock and
flush operations, or it is possible gluster has a data integrity
problem in the case when fcntl() style locks are used. I have no way
of knowing which is more likely...

I think I've got what I need, so someone else is going to need to pick
up the ball if they want a sqlite3 lock to work on its own with
gluster. I will say that it is slow if a bunch of writers are trying
to update individual records at the same time, since the database is
ping-ponging all over the cluster as different clients get and hold
the lock.

I've updated my github repo with my latest changes if anyone feels
like trying it on their own: https://github.com/powool/gluster.git

My summary is: sqlite3's built-in locks don't appear to work nicely with
gluster, so you have to put an flock() around the database operations
to prevent data loss. You also can't do any caching in your volume
mount on the client side. The server-side performance settings appear
not to matter, provided you're up to date on client/server code.
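
If anyone wants to reproduce the serialized-access pattern without the PHP
wrapper, flock(1) around the sqlite3 CLI gives roughly the same effect (the
paths below are made up):

# take an exclusive lock for the duration of each write, so only one client
# at a time touches the database on the gluster mount
flock /mnt/gluster/test.db.lock \
    sqlite3 /mnt/gluster/test.db 'INSERT INTO t(v) VALUES (1);'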

I hope this helps someone!

Paul

On Tue, Mar 6, 2018 at 12:32 PM, Raghavendra Gowdappa
 wrote:
>
>
> On Tue, Mar 6, 2018 at 10:58 PM, Raghavendra Gowdappa 
> wrote:
>>
>>
>>
>> On Tue, Mar 6, 2018 at 10:22 PM, Paul Anderson  wrote:
>>>
>>> Raghavendra,
>>>
>>> I've committed my test case to https://github.com/powool/gluster.git -
>>> it's grungy, and a work in progress, but I am happy to take change
>>> suggestions, especially if it will save folks significant time.
>>>
>>> For the rest, I'll reply inline below...
>>>
>>> On Mon, Mar 5, 2018 at 10:39 PM, Raghavendra Gowdappa
>>>  wrote:
>>> > +Csaba.
>>> >
>>> > On Tue, Mar 6, 2018 at 2:52 AM, Paul Anderson  wrote:
>>> >>
>>> >> Raghavendra,
>>> >>
>>> >> Thanks very much for your reply.
>>> >>
>>> >> I fixed our data corruption problem by disabling the volume
>>> >> performance.write-behind flag as you suggested, and simultaneously
>>> >> disabling caching in my client side mount command.
>>> >
>>> >
>>> > Good to know it worked. Can you give us the output of
>>> > # gluster volume info
>>>
>>> [root@node-1 /]# gluster volume info
>>>
>>> Volume Name: dockerstore
>>> Type: Replicate
>>> Volume ID: fb08b9f4-0784-4534-9ed3-e01ff71a0144
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x 3 = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: 172.18.0.4:/data/glusterfs/store/dockerstore
>>> Brick2: 172.18.0.3:/data/glusterfs/store/dockerstore
>>> Brick3: 172.18.0.2:/data/glusterfs/store/dockerstore
>>> Options Reconfigured:
>>> performance.client-io-threads: off
>>> nfs.disable: on
>>> transport.address-family: inet
>>> locks.mandatory-locking: optimal
>>> performance.flush-behind: off
>>> performance.write-behind: off
>>>
>>> >
>>> > We would like to debug the problem in write-behind. Some questions:
>>> >
>>> > 1. What version of Glusterfs are you using?
>>>
>>> On the server nodes:
>>>
>>> [root@node-1 /]# gluster --version
>>> glusterfs 3.13.2
>>> Repository revision: git://git.gluster.org/glusterfs.git
>>>
>>> On the docker container sqlite test node:
>>>
>>> root@b4055d8547d2:/# glusterfs --version
>>> glusterfs 3.8.8 built on Jan 11 2017 14:07:11
>>
>>
>> I guess this is where the client is mounted. If I am correct about where the
>> glusterfs client is mounted, the client is running quite an old version. There
>> have been a significant number of fixes between 3.8.8 and the current master.
>
>
> ... significant number of fixes to write-behind...
>
>> I would suggest trying out 3.13.2 patched with [1]. If you get a chance to
>> try this out, please report back how the tests went.
>
>
> I would suggest trying out 3.13.2 patched with [1] and running the tests with
> write-behind turned on.
>
>>
>> [1] https://review.gluster.org/19673
>>
>>>
>>> I recognize that version skew could be an issue.
>>>
>>> > 2. Were you able to figure out whether it's stale data or metadata that
>>> > is causing the issue?
>>>
>>> I lean towards stale data based on the only real observation I have:
>>>
>>> While debugging, I put log messages in as to when the flock() is
>>> acquired, and when it is released. There is no instance where two
>>> different processes ever hold the same flock()'d file. 

[Gluster-users] fuse vs libgfapi LIO performances comparison: how to make tests?

2018-03-08 Thread luca giacomelli
Dear support, I need to export a gluster volume with LIO for a 
virtualization system. At the moment I have a very basic test 
configuration: 2x HP 380 G7 (2x Intel X5670 (six cores @ 2.93 GHz), 72 GB 
RAM, RAID10 of 6x 10k rpm SAS disks, Intel X540-T2 10 GbE NIC), directly 
interconnected. The Gluster configuration is replica 2. The OS is Fedora 27.


For my tests I used dd and I found strange results. Apparently the 
volume mounted locally and exported with LIO is faster than the 
same volume exported directly with LIO.


For my tests I exported the volume from the first server, and mounted and 
tested the volume on the second.


This is the dd command:

echo 3 > /proc/sys/vm/drop_caches

dd if=/dev/urandom of=file_1G bs=1M count=1000 oflag=direct

with fuse

1073741824 bytes (1.1 GB, 1.0 GiB) copied, 9.5597 s, 112 MB/s

direct with user:glfs

1073741824 bytes (1.1 GB, 1.0 GiB) copied, 27.0812 s, 39.6 MB/s

I think there is something strange here. I don't really know whether dd is a 
good tool for obtaining meaningful numbers. I read that some people use 
iozone, and others fio or iometer...
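
If dd is not trustworthy, maybe a multi-job fio run against the mounted
volume is a fairer comparison; something like this (the mount point and job
parameters are only an example):

fio --name=seqwrite --directory=/mnt/testvol --rw=write --bs=1M \
    --size=1G --numjobs=4 --direct=1 --group_reporting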


Before asking for more information, I'd like to be sure that the procedure 
and the tools used to check the configuration are correct.


I also ran other tests, and they all suggest that the fuse approach is 
faster than the direct one. Is that possible?


Any help would be appreciated.

Thanks, Luca



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] gluster for home directories?

2018-03-08 Thread Rik Theys
Hi,

On 03/08/2018 10:52 AM, Manoj Pillai wrote:
> On Wed, Mar 7, 2018 at 8:29 PM, Rik Theys  > wrote:
> 
> I'm currently testing gluster (3.12, now 3.13) on older machines[1] and
> have created a replica 3 arbiter 1 volume 2x(2+1). I seem to run in all
> sorts of (performance) problems. I must be doing something wrong but
> I've tried all sorts of benchmarks and nothing seems to make my setup
> live up to what I would expect from this hardware.
> 
> * I understand that gluster only starts to work well when multiple
> clients are connecting in parallel, but I did expect the single client
> performance to be better.
> 
> * Unpacking the linux-4.15.7.tar.xz file on the brick XFS filesystem
> followed by a sync takes about 1 minute. Doing the same on the gluster
> volume using the fuse client (client is one of the brick servers) takes
> over 9 minutes and neither disk nor cpu nor network are reaching their
> bottleneck. Doing the same over NFS-ganesha (client is a workstation
> connected through gbit) takes even longer (more than 30min!?).
> 
> I understand that unpacking a lot of small files may be the worst
> workload for a distributed filesystem, but when I look at the file sizes
> of the files in our users' home directories, more than 90% is smaller
> than 1MB.
> 
> * A file copy of a 300GB file over NFS 4 (nfs-ganesha) starts fast
> (90MB/s) and then drops to 20MB/s. When I look at the servers during the
> copy, I don't see where the bottleneck is as the cpu, disk and network
> are not maxing out (on none of the bricks). When the same client copies
> the file to our current NFS storage it is limited by the gbit network
> connection of the client.
> 
> 
> Both untar and cp are single-threaded, which means throughput is mostly
> dictated by latency. Latency is generally higher in a distributed FS;
> nfs-ganesha has an extra hop to the backend, and hence higher latency
> for most operations compared to glusterfs-fuse.
> 
> You don't necessarily need multiple clients for good performance with
> gluster. Many multi-threaded benchmarks give good performance from a
> single client. Here for e.g., if you run multiple copy commands in
> parallel from the same client, I'd expect your aggregate transfer rate
> to improve.
> 
> Been a long while since I looked at nfs-ganesha. But in terms of upper
> bounds for throughput tests: data needs to flow over the
> client->nfs-server link, and then, depending on which servers the file
> is located on, either 1x (if the nfs-ganesha node is also hosting one
> copy of the file, and neglecting arbiter) or 2x over the s2s link. With
> 1Gbps links, that means an upper bound between 125 MB/s and 62.5 MB/s,
> in the steady state, unless I miscalculated.

Yes, you are correct, but the speeds I'm seeing are far below 62.5MB/s.

In the untar case, I fully understand the overhead, as there are a lot of
small files and therefore a lot of metadata overhead.

For the sequential write, the speed should be much better, as latency
is (or should be) less of an issue there?

I've been trying to find some documentation on nfs-ganesha but
everything I find seems to be outdated :-(.

The documentation on their wiki states:

"
Version 2.0

This version is in active development and is not considered stable
enough for production use. Its documentation is still incomplete.
"

Their latest version is 2.6.0...

Also, I cannot find what changed between 2.5 and 2.6. Sure, I can look at
the git commits, but there is no maintained changes/changelog...

From what I've read, nfs-ganesha should be able to cache a lot of data, but
I can't find any good documentation on how to configure this.

Regards,

Rik

-- 
Rik Theys
System Engineer
KU Leuven - Dept. Elektrotechniek (ESAT)
Kasteelpark Arenberg 10 bus 2440  - B-3001 Leuven-Heverlee
+32(0)16/32.11.07

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] gluster for home directories?

2018-03-08 Thread Manoj Pillai
Hi Rik,

Nice clarity and detail in the description. Thanks!

inline...

On Wed, Mar 7, 2018 at 8:29 PM, Rik Theys 
wrote:

> Hi,
>
> We are looking into replacing our current storage solution and are
> evaluating gluster for this purpose. Our current solution uses a SAN
> with two servers attached that serve samba and NFS 4. Clients connect to
> those servers using NFS or SMB. All users' home directories live on this
> server.
>
> I would like to have some insight in who else is using gluster for home
> directories for about 500 users and what performance they get out of the
> solution. Which connectivity method are you using on the clients
> (gluster native, nfs, smb)? Which volume options do you have configured
> for your gluster volume? What hardware are you using? Are you using
> snapshots and/or quota? If so, any number on performance impact?
>
> The solution I had in mind for our setup is multiple servers/bricks with
> replica 3 arbiter 1 volume where each server is also running nfs-ganesha
> and samba in HA. Clients would be connecting to one of the nfs servers
> (dns round robin). In this case the nfs servers would be the gluster
> clients. Gluster traffic would go over a dedicated network with 10G and
> jumbo frames.
>
> I'm currently testing gluster (3.12, now 3.13) on older machines[1] and
> have created a replica 3 arbiter 1 volume 2x(2+1). I seem to run in all
> sorts of (performance) problems. I must be doing something wrong but
> I've tried all sorts of benchmarks and nothing seems to make my setup
> live up to what I would expect from this hardware.
>
> * I understand that gluster only starts to work well when multiple
> clients are connecting in parallel, but I did expect the single client
> performance to be better.
>
> * Unpacking the linux-4.15.7.tar.xz file on the brick XFS filesystem
> followed by a sync takes about 1 minute. Doing the same on the gluster
> volume using the fuse client (client is one of the brick servers) takes
> over 9 minutes and neither disk nor cpu nor network are reaching their
> bottleneck. Doing the same over NFS-ganesha (client is a workstation
> connected through gbit) takes even longer (more than 30min!?).
>
> I understand that unpacking a lot of small files may be the worst
> workload for a distributed filesystem, but when I look at the file sizes
> of the files in our users' home directories, more than 90% is smaller
> than 1MB.
>
> * A file copy of a 300GB file over NFS 4 (nfs-ganesha) starts fast
> (90MB/s) and then drops to 20MB/s. When I look at the servers during the
> copy, I don't see where the bottleneck is as the cpu, disk and network
> are not maxing out (on none of the bricks). When the same client copies
> the file to our current NFS storage it is limited by the gbit network
> connection of the client.
>

Both untar and cp are single-threaded, which means throughput is mostly
dictated by latency. Latency is generally higher in a distributed FS;
nfs-ganesha has an extra hop to the backend, and hence higher latency for
most operations compared to glusterfs-fuse.

You don't necessarily need multiple clients for good performance with
gluster. Many multi-threaded benchmarks give good performance from a single
client. Here for e.g., if you run multiple copy commands in parallel from
the same client, I'd expect your aggregate transfer rate to improve.

Been a long while since I looked at nfs-ganesha. But in terms of upper
bounds for throughput tests: data needs to flow over the client->nfs-server
link, and then, depending on which servers the file is located on, either
1x (if the nfs-ganesha node is also hosting one copy of the file, and
neglecting arbiter) or 2x over the s2s link. With 1Gbps links, that means
an upper bound between 125 MB/s and 62.5 MB/s, in the steady state, unless
I miscalculated.
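
(Back of the envelope: 1 Gbps is roughly 125 MB/s of payload; if every byte
written also has to cross the s2s link twice, once per remote copy, that link
caps the client at about 125/2 = 62.5 MB/s.)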

-- Manoj


>
> * I had the 'cluster.optimize-lookup' option enabled but ran into all
> sorts of issues where ls is showing either the wrong files (content of a
> different directory), or claiming a directory does not exist when mkdir
> says it already exists... I currently have the following options set:
>
> server.outstanding-rpc-limit: 256
> client.event-threads: 4
> performance.io-thread-count: 16
> performance.parallel-readdir: on
> server.event-threads: 4
> performance.cache-size: 2GB
> performance.rda-cache-limit: 128MB
> performance.write-behind-window-size: 8MB
> performance.md-cache-timeout: 600
> performance.cache-invalidation: on
> performance.stat-prefetch: on
> network.inode-lru-limit: 50
> performance.nl-cache-timeout: 600
> performance.nl-cache: on
> features.cache-invalidation-timeout: 600
> features.cache-invalidation: on
> transport.address-family: inet
> nfs.disable: on
> cluster.enable-shared-storage: enable
>
> The brick servers have 2 dual-core CPUs, so I've set the client and
> server event threads to 4.
>
> * When using nfs-ganesha I run into bugs that makes me wonder who is
> using nfs-ganesha with