[Gluster-users] Enabling quotas on gluster

2019-04-03 Thread Lindolfo Meira
Hi folks.

Does anyone know how significant the performance penalty is for enabling 
directory-level quotas on a Gluster filesystem, compared to the case with no 
quotas at all?
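
(For context, directory-level quotas are enabled per volume and then per
directory with the quota CLI; a minimal sketch, with purely illustrative
volume and path names:)

# Enable the quota feature on the volume
gluster volume quota myvol enable

# Put a 10 GB limit on a directory inside the volume
gluster volume quota myvol limit-usage /projects 10GB

# Review usage against the configured limits
gluster volume quota myvol list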


Lindolfo Meira, MSc
Diretor Geral, Centro Nacional de Supercomputação
Universidade Federal do Rio Grande do Sul
+55 (51) 3308-3139

Re: [Gluster-users] [Gluster-devel] Upgrade testing to gluster 6

2019-04-03 Thread Darrell Budic
Hari-

I was upgrading my test cluster from 5.5 to 6 and I hit this bug 
(https://bugzilla.redhat.com/show_bug.cgi?id=1694010) or something similar. In 
my case, the workaround did not work, and I was left with a cluster that had 
gone into no-quorum mode and stopped all the bricks. There wasn't much in the 
logs either, but I noticed my /etc/glusterfs/glusterd.vol files were not the 
same as the newer versions, so I updated them, restarted glusterd, and suddenly 
the updated node showed as peer-in-cluster again. Once I updated the other 
nodes the same way, things started working again. Maybe a place to look?

My old config (all nodes):
volume management
type mgmt/glusterd
option working-directory /var/lib/glusterd
option transport-type socket
option transport.socket.keepalive-time 10
option transport.socket.keepalive-interval 2
option transport.socket.read-fail-log off
option ping-timeout 10
option event-threads 1
option rpc-auth-allow-insecure on
#   option transport.address-family inet6
#   option base-port 49152
end-volume

changed to:
volume management
type mgmt/glusterd
option working-directory /var/lib/glusterd
option transport-type socket,rdma
option transport.socket.keepalive-time 10
option transport.socket.keepalive-interval 2
option transport.socket.read-fail-log off
option transport.socket.listen-port 24007
option transport.rdma.listen-port 24008
option ping-timeout 0
option event-threads 1
option rpc-auth-allow-insecure on
#   option lock-timer 180
#   option transport.address-family inet6
#   option base-port 49152
option max-port  60999
end-volume
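
(For reference, a minimal sketch of applying such a glusterd.vol change; the
.rpmnew comparison assumes an RPM-based install that left the packaged default
next to the edited file:)

# Compare the running config with the packaged default, if one was left behind
diff /etc/glusterfs/glusterd.vol /etc/glusterfs/glusterd.vol.rpmnew

# After editing glusterd.vol on a node, restart the management daemon
systemctl restart glusterd

# Verify the upgraded node shows up as "Peer in Cluster" again
gluster peer status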

The only thing I found in the glusterd logs that looks relevant was the 
following (repeated for both of the other nodes in this cluster), so no clue 
why it happened:
[2019-04-03 20:19:16.802638] I [MSGID: 106004] 
[glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer 
 (<0ecbf953-681b-448f-9746-d1c1fe7a0978>), in state , has disconnected from glusterd.


> On Apr 2, 2019, at 4:53 AM, Atin Mukherjee  wrote:
> 
> 
> 
> On Mon, 1 Apr 2019 at 10:28, Hari Gowtham wrote:
> Comments inline.
> 
> On Mon, Apr 1, 2019 at 5:55 AM Sankarshan Mukhopadhyay wrote:
> >
> > Quite a considerable amount of detail here. Thank you!
> >
> > On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham wrote:
> > >
> > > Hello Gluster users,
> > >
> > > As you are all aware, glusterfs-6 is out. We would like to inform you
> > > that we have spent a significant amount of time testing glusterfs-6 in
> > > upgrade scenarios. We have done upgrade testing to glusterfs-6 from
> > > various releases like 3.12, 4.1 and 5.3.
> > >
> > > As glusterfs-6 brings in a lot of changes, we wanted to test those 
> > > portions.
> > > There were xlators (and respective options to enable/disable them)
> > > added and deprecated in glusterfs-6 from various versions [1].
> > >
> > > We had to check the following upgrade scenarios for all such options
> > > identified in [1]:
> > > 1) option never enabled and upgraded
> > > 2) option enabled and then upgraded
> > > 3) option enabled, then disabled, and then upgraded
> > >
> > > We weren't able to manually check all the combinations of all the 
> > > options,
> > > so the options involving enabling and disabling xlators were prioritized.
> > > Below are the results of the ones tested.
> > >
> > > Never enabled and upgraded:
> > > Checked upgrades from 3.12, 4.1 and 5.3 to 6; the upgrade works.
> > >
> > > Enabled and upgraded:
> > > Tested for tier, which is deprecated; it is not a recommended upgrade.
> > > As expected, the volume won't be consumable and will have a few more
> > > issues as well.
> > > Tested for the 3.12, 4.1 and 5.3 to 6 upgrades.
> > >
> > > Enabled, then disabled before upgrade:
> > > Tested for tier with 3.12 and the upgrade went fine.
> > >
> > > There is one common issue to note in every upgrade: the node being
> > > upgraded goes into a disconnected state. You have to flush the iptables
> > > rules and then restart glusterd on all nodes to fix this.
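
(A minimal sketch of that workaround, assuming systemd-managed nodes and plain
iptables rather than firewalld:)

# Run on every node of the cluster
iptables -F                   # flush the iptables rules
systemctl restart glusterd    # restart the management daemon

# Then confirm the upgraded node is connected again
gluster peer status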
> > >
> >
> > Is this something that is written in the upgrade notes? I do not seem
> > to recall; if not, I'll send a PR.
> 
> No, this wasn't mentioned in the release notes. PRs are welcome.
> 
> >
> > > The testing for enabling new options is still pending. The new options
> > > won't cause as many issues as the deprecated ones, so this was put at
> > > the end of the priority list. It would be nice to get contributions
> > > for this.
> > >
> >
> > Did the range of tests lead to any new issues?
> 
> Yes. In the first round of testing we found an issue and had to postpone the
> release of 6 until the fix was made available.
> https://bugzilla.redhat.com/show_bug.cgi?id=1684029 
> 
> 
> And then we tested it 

Re: [Gluster-users] [ovirt-users] Re: Announcing Gluster release 5.5

2019-04-03 Thread Olaf Buitelaar
Dear Mohit,

Thanks for backporting this issue. Hopefully we can address the others as
well; if I can do anything, let me know.
On my side I've tested with: gluster volume reset 
cluster.choose-local, but haven't really noticed a change in performance.
On the good side, the brick processes didn't crash after updating this
config.
I'll experiment with the other changes as well, and see how the
combinations affect performance.
I also saw this commit: https://review.gluster.org/#/c/glusterfs/+/21333/
which looks very useful. Will this become a recommended option for VM/block
workloads?
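
(For reference, a minimal sketch of resetting an option and confirming the
value in effect; <VOLNAME> is a placeholder for the actual volume name:)

# Reset the option back to its default for the volume
gluster volume reset <VOLNAME> cluster.choose-local

# Confirm the value the volume is now running with
gluster volume get <VOLNAME> cluster.choose-local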

Thanks Olaf


Re: [Gluster-users] [ovirt-users] Re: Announcing Gluster release 5.5

2019-04-03 Thread Mohit Agrawal
Hi,

Thanks Olaf for sharing the relevant logs.

@Atin,
You are right, patch https://review.gluster.org/#/c/glusterfs/+/22344/ will
resolve the issue of multiple brick instances running for the same brick.

As we can see in the logs below, glusterd is trying to start the same brick
instance twice at the same time:

[2019-04-01 10:23:21.752401] I [glusterd-utils.c:6301:glusterd_brick_start]
0-management: starting a fresh brick process for brick
/data/gfs/bricks/brick1/ovirt-engine
[2019-04-01 10:23:30.348091] I [glusterd-utils.c:6301:glusterd_brick_start]
0-management: starting a fresh brick process for brick
/data/gfs/bricks/brick1/ovirt-engine
[2019-04-01 10:24:13.353396] I [glusterd-utils.c:6301:glusterd_brick_start]
0-management: starting a fresh brick process for brick
/data/gfs/bricks/brick1/ovirt-engine
[2019-04-01 10:24:24.253764] I [glusterd-utils.c:6301:glusterd_brick_start]
0-management: starting a fresh brick process for brick
/data/gfs/bricks/brick1/ovirt-engine

We are seeing the below message between the starting of the two instances:
The message "E [MSGID: 101012] [common-utils.c:4075:gf_is_service_running]
0-: Unable to read pidfile:
/var/run/gluster/vols/ovirt-engine/10.32.9.5-data-gfs-bricks-brick1-ovirt-engine.pid"
repeated 2 times between [2019-04-01 10:23:21.748492] and [2019-04-01
10:23:21.752432]

I will backport the same.
Thanks,
Mohit Agrawal
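
(For illustration, one way to confirm the symptom described above - more than
one glusterfsd instance for the same brick path - on an affected node; the
brick and volume names are the ones from the logs above:)

# List running brick processes and look for duplicates of the same brick path
pgrep -af glusterfsd | grep ovirt-engine

# Cross-check the PID and port glusterd believes this brick is using
gluster volume status ovirt-engine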

On Wed, Apr 3, 2019 at 3:58 PM Olaf Buitelaar wrote:

> Dear Mohit,
>
> Sorry, I thought Krutika was referring to the ovirt-kube brick logs. Due to
> the large size (18 MB compressed), I've placed the files here:
> https://edgecastcdn.net/0004FA/files/bricklogs.tar.bz2
> Also, I see I've attached the wrong files; I intended to
> attach profile_data4.txt | profile_data3.txt.
> Sorry for the confusion.
>
> Thanks Olaf
>
> On Wed, 3 Apr 2019 at 04:56, Mohit Agrawal wrote:
>
>> Hi Olaf,
>>
>>   The attached "multi-glusterfsd-vol3.txt | multi-glusterfsd-vol4.txt"
>> show multiple processes running for the "ovirt-core" and "ovirt-engine"
>> brick names, but there are no logs available in bricklogs.zip specific
>> to these bricks; bricklogs.zip has a dump of the ovirt-kube logs only.
>>
>>   Kindly share the brick logs specific to the bricks "ovirt-core" and
>> "ovirt-engine", and share the glusterd logs as well.
>>
>> Regards
>> Mohit Agrawal
>>
>> On Tue, Apr 2, 2019 at 9:18 PM Olaf Buitelaar wrote:
>>
>>> Dear Krutika,
>>>
>>> 1.
>>> I've changed the volume settings; write performance seems to have increased
>>> somewhat, however the profile doesn't really support that, since latencies
>>> increased. Read performance, however, has diminished, which does seem to be
>>> supported by the profile runs (attached).
>>> Also the IO does seem to behave more consistently than before.
>>> I don't really understand the idea behind these settings; maybe you can
>>> explain why these suggestions are good?
>>> They seem to avoid as much local caching and access as possible and push
>>> everything to the gluster processes, while I would expect local access and
>>> local caches to be a good thing, since they would lead to less network and
>>> disk access.
>>> I tried to investigate these settings a bit more, and this is what I
>>> understood of them:
>>> - network.remote-dio: when on, it seems to ignore the O_DIRECT flag in
>>> the client, thus causing the files to be cached and buffered in the page
>>> cache on the client. I would expect this to be a good thing, especially if
>>> the server process would access the same page cache?
>>> At least that is what I grasp from this commit:
>>> https://review.gluster.org/#/c/glusterfs/+/4206/2/xlators/protocol/client/src/client.c
>>> line 867.
>>> I also found this commit:
>>> https://github.com/gluster/glusterfs/commit/06c4ba589102bf92c58cd9fba5c60064bc7a504e#diff-938709e499b4383c3ed33c3979b9080c
>>> which suggests remote-dio actually improves performance; I'm not sure
>>> whether it is a write or read benchmark.
>>> When a file is opened with O_DIRECT it will also disable the
>>> write-behind functionality.
>>>
>>> - performance.strict-o-direct: when on, the AFR will not ignore the
>>> O_DIRECT flag and will invoke fop_writev_stub with the wb_writev_helper,
>>> which seems to stack the operation; no idea why that is. But generally I
>>> suppose not ignoring the O_DIRECT flag in the AFR is a good thing when a
>>> process requests O_DIRECT. So this makes sense to me.
>>>
>>> - cluster.choose-local: when off, it doesn't prefer the local node, but
>>> would always choose a brick. Since it's a 9-node cluster with 3
>>> subvolumes, only 1/3 could end up local and the other 2/3 would be
>>> pushed to external nodes anyway. Or am I making a totally wrong assumption
>>> here?
>>>
>>> It seems this config is moving toward the gluster-block config side of
>>> things, which does make sense.
>>> Since we're running quite a few MySQL instances, which open their files
>>> with O_DIRECT I believe, it would mean the only layer of cache is within
>>> MySQL itself. 
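
(For reference, a minimal sketch of how the three options discussed above are
inspected and toggled on a volume; <VOLNAME> is a placeholder and the values
shown are only examples, not a recommendation:)

# Show the value currently in effect for each option
gluster volume get <VOLNAME> network.remote-dio
gluster volume get <VOLNAME> performance.strict-o-direct
gluster volume get <VOLNAME> cluster.choose-local

# Change one option at a time, then re-run the profile/benchmark
gluster volume set <VOLNAME> performance.strict-o-direct on

# Return an option to its default
gluster volume reset <VOLNAME> cluster.choose-local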

Re: [Gluster-users] Help: gluster-block

2019-04-03 Thread Prasanna Kalever
On Tue, Apr 2, 2019 at 1:34 AM Karim Roumani wrote:

> Actually we have a question.
>
> We did two tests as follows.
>
> Test 1 - iSCSI target on the glusterFS server
> Test 2 - iSCSI target on a separate server with gluster client
>
> Test 2 performed at a read speed of just under 1 GB/s, while Test 1 managed
> only about 300 MB/s.
>
> Any reason you can see why this may be the case?
>

For the Test 1 case:

1. The ops between
* the iSCSI initiator <-> iSCSI target, and
* tcmu-runner <-> the gluster server

are all using the same NIC resource.

2. Also, the node might be facing high resource usage (e.g. CPU high and/or
memory low), as everything is on the same node.

You can also check the gluster profile info to narrow down some of these.
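
(A minimal sketch of that profiling step; <VOLNAME> stands for the gluster
volume backing the block device:)

# Start collecting per-brick FOP and latency statistics
gluster volume profile <VOLNAME> start

# ... run the read test from the iSCSI initiator ...

# Dump the accumulated statistics and look for the dominant FOPs / latencies
gluster volume profile <VOLNAME> info

# Stop profiling when done
gluster volume profile <VOLNAME> stop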

Thanks!
--
Prasanna


>
> On Mon, Apr 1, 2019 at 1:00 PM Karim Roumani wrote:
>
>> Thank you, Prasanna, for your quick response; very much appreciated. We will
>> review and get back to you.
>>
>> On Mon, Mar 25, 2019 at 9:00 AM Prasanna Kalever wrote:
>>
>>> [ adding +gluster-users for archive purpose ]
>>>
>>> On Sat, Mar 23, 2019 at 1:51 AM Jeffrey Chin wrote:
>>> >
>>> > Hello Mr. Kalever,
>>>
>>> Hello Jeffrey,
>>>
>>> >
>>> > I am currently working on a project to utilize GlusterFS for VMWare
>>> VMs. In our research, we found that utilizing block devices with GlusterFS
>>> would be the best approach for our use case (correct me if I am wrong). I
>>> saw the gluster utility that you are a contributor for called gluster-block
>>> (https://github.com/gluster/gluster-block), and I had a question about
>>> the configuration. From what I understand, gluster-block only works on the
>>> servers that are serving the gluster volume. Would it be possible to run
>>> the gluster-block utility on a client machine that has a gluster volume
>>> mounted to it?
>>>
>>> Yes, that is right! At the moment gluster-block is coupled with
>>> glusterd for simplicity.
>>> But we have made some changes here [1] to provide a way to specify a
>>> server address (volfile-server) which is outside the gluster-blockd
>>> node; please take a look.
>>>
>>> Although it is not a complete solution, it should at least help for
>>> some use cases. Feel free to raise an issue [2] with the details about
>>> your use case, or submit a PR yourself :-)
>>> We never picked it up, as we never had a use case needing separation of
>>> gluster-blockd and glusterd.
>>>
>>> >
>>> > I also have another question: how do I make the iSCSI targets persist
>>> if all of the gluster nodes were rebooted? It seems like once all of the
>>> nodes reboot, I am unable to reconnect to the iSCSI targets created by the
>>> gluster-block utility.
>>>
>>> do you mean rebooting the iSCSI initiator, or the gluster-block/gluster
>>> target/server nodes?
>>>
>>> 1. For the initiator to automatically connect to block devices post
>>> reboot, we need to make the below change in /etc/iscsi/iscsid.conf:
>>> node.startup = automatic
>>>
>>> 2. If you mean the case where all the gluster nodes go down: on the
>>> initiator all the available HA paths will be down, but if we still
>>> want the IO to be queued on the initiator until one of the paths
>>> (gluster nodes) is available:
>>>
>>> for this, in the gluster-block specific section of multipath.conf you need
>>> to replace 'no_path_retry 120' with 'no_path_retry queue'.
>>> Note: refer to the README for the current multipath.conf setting recommendations.
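
(Putting the two quoted changes together, an illustrative sketch only; the
gluster-block README remains the authoritative reference for the full
multipath.conf stanza:)

# /etc/iscsi/iscsid.conf on the initiator: reconnect automatically after reboot
node.startup = automatic

# /etc/multipath.conf, inside the gluster-block specific device section
# described in the README: queue IO while all paths are down instead of
# giving up after roughly 120 seconds
#     no_path_retry 120      <- replace this
no_path_retry queue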
>>>
>>> [1] https://github.com/gluster/gluster-block/pull/161
>>> [2] https://github.com/gluster/gluster-block/issues/new
>>>
>>> BRs,
>>> --
>>> Prasanna
>>>
>>
>>
>> --
>>
>> Thank you,
>>
>> *Karim Roumani*
>> Director of Technology Solutions
>>
>> TekReach Solutions / Albatross Cloud
>> 714-916-5677
>> karim.roum...@tekreach.com
>> Albatross.cloud  - One Stop Cloud Solutions
>> Portalfronthosting.com  - Complete
>> SharePoint Solutions
>>
>
>
> --
>
> Thank you,
>
> *Karim Roumani*
> Director of Technology Solutions
>
> TekReach Solutions / Albatross Cloud
> 714-916-5677
> karim.roum...@tekreach.com
> Albatross.cloud  - One Stop Cloud Solutions
> Portalfronthosting.com  - Complete
> SharePoint Solutions
>

Re: [Gluster-users] Gluster and LVM

2019-04-03 Thread Dmitry Melekhov


On 03.04.2019 18:20, kbh-admin wrote:

> Hello Gluster-Community,
>
> we are considering building several gluster servers and have a question
> regarding LVM and GlusterFS.
>
> Scenario 1: Snapshots
>
> Of course, taking snapshots is a good capability and we want to use
> LVM for that.
>
> Scenario 2: Increase gluster volume
>
> We want to increase the gluster volume by adding HDDs and/or by adding
> Dell PowerVaults later. We got the recommendation to set up a new
> gluster volume for the PowerVaults and not use LVM (lvresize) in that case.
>
> What would you suggest, and how do you manage both LVM and GlusterFS
> together?

If you already have storage, why do you need gluster?

Just use it :-)

> Thanks in advance.
>
> Felix


[Gluster-users] Gluster and LVM

2019-04-03 Thread kbh-admin

Hello Gluster-Community,

we are considering building several gluster servers and have a question
regarding LVM and GlusterFS.

Scenario 1: Snapshots

Of course, taking snapshots is a good capability and we want to use LVM
for that.

Scenario 2: Increase gluster volume

We want to increase the gluster volume by adding HDDs and/or by adding
Dell PowerVaults later. We got the recommendation to set up a new
gluster volume for the PowerVaults and not use LVM (lvresize) in that case.

What would you suggest, and how do you manage both LVM and GlusterFS
together?
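
(For context, the two growth paths being weighed look roughly like this;
device, VG/LV, volume and brick names are purely illustrative, and for
replicated volumes bricks must be added in multiples of the replica count:)

# Path A: grow an existing LVM-backed brick in place, then grow its XFS filesystem
pvcreate /dev/sdX
vgextend vg_bricks /dev/sdX
lvextend -L +2T /dev/vg_bricks/brick1
xfs_growfs /bricks/brick1

# Path B: keep the new storage (e.g. a PowerVault LUN) as separate bricks
# and expand the gluster volume instead
gluster volume add-brick myvol server1:/bricks/new_brick1
gluster volume rebalance myvol start

Note that Gluster's own volume snapshots require the bricks to sit on thinly
provisioned LVs, which is worth keeping in mind for Scenario 1.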



Thanks in advance.


Felix


Re: [Gluster-users] Gluster GEO replication fault after write over nfs-ganesha

2019-04-03 Thread Jiffin Tony Thottan

CCing sunn as well.

On 28/03/19 4:05 PM, Soumya Koduri wrote:



On 3/27/19 7:39 PM, Alexey Talikov wrote:

I have two clusters with dispersed volumes (2+1) with geo-replication.
It works fine as long as I use glusterfs-fuse, but as soon as even one file 
is written over nfs-ganesha, replication goes to Faulty and recovers after I 
remove this file (sometimes after a stop/start).
I think nfs-ganesha writes the file in some way that produces a problem 
with replication.




I am not very familiar with geo-rep and not sure what/why exactly 
failed here. I'd request Kotresh (cc'ed) to take a look and provide his 
insights on the issue.


Thanks,
Soumya

OSError: [Errno 61] No data available: 
'.gfid/9c9514ce-a310-4a1c-a87b-a800a32a99f8'


but if I check over glusterfs mounted with aux-gfid-mount:

getfattr -n trusted.glusterfs.pathinfo -e text /mnt/TEST/.gfid/9c9514ce-a310-4a1c-a87b-a800a32a99f8
getfattr: Removing leading '/' from absolute path names
# file: mnt/TEST/.gfid/9c9514ce-a310-4a1c-a87b-a800a32a99f8
trusted.glusterfs.pathinfo="( ( ))"


File exists
Details available here 
https://github.com/nfs-ganesha/nfs-ganesha/issues/408
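
(For reference, the kind of mount used for the gfid-based check above; host,
volume and mount point are placeholders:)

# Mount the volume with gfid access enabled so paths can be resolved by gfid
mount -t glusterfs -o aux-gfid-mount server1:/TEST /mnt/TEST

# Resolve a gfid to its brick locations / path information
getfattr -n trusted.glusterfs.pathinfo -e text \
    /mnt/TEST/.gfid/9c9514ce-a310-4a1c-a87b-a800a32a99f8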





[Gluster-users] performance - what can I expect

2019-04-03 Thread Pascal Suter

Hi all

I am currently testing gluster on a single server. I have three bricks, 
each a hardware RAID6 volume with thin provisioned LVM that was aligned 
to the RAID and then formatted with xfs.


I've created a distributed volume so that entire files get distributed 
across my three bricks.

First I ran an iozone benchmark against each brick, testing the read and 
write performance of a single large file per brick.

I then mounted my gluster volume locally and ran another iozone run with 
the same parameters, writing a single file. The file went to brick 1 
which, when used directly, would write at 2.3 GB/s and read at 1.5 GB/s. 
However, through gluster I got only 800 MB/s read and 750 MB/s 
write throughput.

Another run with two processes, each writing a file, where one file went 
to the first brick and the other file to the second brick (which by 
itself, when directly accessed, wrote at 2.8 GB/s and read at 2.7 GB/s), 
resulted in 1.2 GB/s of aggregated write and also of aggregated read 
throughput.

Is this the normal performance I can expect out of GlusterFS, or is it 
worth tuning in order to get closer to the actual brick filesystem 
performance?

Here are the iozone commands I use for writing and reading. Note that I 
am using direct IO in order to make sure I don't get fooled by the cache :)


./iozone -i 0 -t 1 -F /mnt/brick${b}/thread1 -+n -c -C -e -I -w -+S 0 -s 
$filesize -r $recordsize > iozone-brick${b}-write.txt


./iozone -i 1 -t 1 -F /mnt/brick${b}/thread1 -+n -c -C -e -I -w -+S 0 -s 
$filesize -r $recordsize > iozone-brick${b}-read.txt


cheers

Pascal



Re: [Gluster-users] Gluster 5.5 slower than 3.12.15

2019-04-03 Thread Amar Tumballi Suryanarayan
Strahil,

With some basic testing, we are noticing similar behavior too.

One of the issues we identified was increased network usage in the 5.x series
(being addressed by https://review.gluster.org/#/c/glusterfs/+/22404/), and
there are a few other features which write extended attributes that caused
some delay.

We are in the process of publishing some numbers with a release-3.12.x,
release-5 and release-6 comparison soon. With the numbers we have so far,
release-6 is already giving really good performance in many configurations,
especially for the 1x3 replicate volume type.

While we continue to identify and fix issues in the 5.x series, one request
is to validate release-6.x (6.0 or 6.1, which would happen on April 10th),
so you can see the difference in your workload.

Regards,
Amar



On Wed, Apr 3, 2019 at 5:57 AM Strahil Nikolov wrote:

> Hi Community,
>
> I have the feeling that with gluster v5.5 I have poorer performance than
> I used to have on 3.12.15. Did you observe something like that?
>
> I have a 3-node hyperconverged cluster (oVirt + GlusterFS with replica 3
> arbiter 1 volumes) with NFS-Ganesha, and since I upgraded to v5 the
> issues came up.
> First it was the notorious 5.3 experience, and now with 5.5 my sanlock is
> having problems and higher latency than it used to. I have switched from
> NFS-Ganesha to pure FUSE, but the latency problems do not go away.
>
> Of course, this is partially due to the consumer hardware, but as the
> hardware has not changed I was hoping that the performance would remain as
> is.
>
> So, do you expect 5.5 to perform worse than 3.12?
>
> Some info:
> Volume Name: engine
> Type: Replicate
> Volume ID: 30ca1cc2-f2f7-4749-9e2e-cee9d7099ded
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: ovirt1:/gluster_bricks/engine/engine
> Brick2: ovirt2:/gluster_bricks/engine/engine
> Brick3: ovirt3:/gluster_bricks/engine/engine (arbiter)
> Options Reconfigured:
> performance.client-io-threads: off
> nfs.disable: on
> transport.address-family: inet
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.low-prio-threads: 32
> network.remote-dio: off
> cluster.eager-lock: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-max-threads: 8
> cluster.shd-wait-qlength: 1
> features.shard: on
> user.cifs: off
> storage.owner-uid: 36
> storage.owner-gid: 36
> network.ping-timeout: 30
> performance.strict-o-direct: on
> cluster.granular-entry-heal: enable
> cluster.enable-shared-storage: enable
>
> Network: 1 gbit/s
>
> Filesystem:XFS
>
> Best Regards,
> Strahil Nikolov
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users



-- 
Amar Tumballi (amarts)

Re: [Gluster-users] Is "replica 4 arbiter 1" allowed to tweak client-quorum?

2019-04-03 Thread Ravishankar N



On 03/04/19 12:18 PM, Ingo Fischer wrote:

> Hi All,
>
> I had a replica 2 cluster to host my VM images from my Proxmox cluster.
> I got around split-brain scenarios a bit by using "nufa" to make sure
> the files are located on the host where the machine normally runs.
> So in fact one replica could fail and I still had the VM working.
>
> But then I thought about doing better and decided to add a node to
> increase the replica count, and I decided against the arbiter approach.
> During this I also decided to move away from nufa to make it a more
> standard setup.
>
> But in fact, by adding the third replica and removing nufa, I'm not really
> better off on availability - only on split-brain chance. I'm still at the
> point where only one node is allowed to fail, because otherwise the now
> active client quorum is no longer met and the FS goes read-only (which in
> fact is not really better than failing completely, as it was before).
>
> So I thought about adding arbiter bricks as a "kind of 4th replica" (but
> without the space needs) ... but then I read in the docs that only
> "replica 3 arbiter 1" is allowed as a combination. Is this still true?
Yes, this is still true. Slightly off-topic: the 'replica 3 arbiter 1' syntax
was supposed to mean there are 3 bricks, out of which 1 is an arbiter. This
supposedly caused some confusion, where people thought there were 4 bricks
involved. The CLI syntax was changed in the newer releases to
'replica 2 arbiter 1' to mean there are 2 data bricks and 1 arbiter brick.
For backward compatibility, the older syntax still works though. The
documentation needs to be updated. :-)
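
(For illustration, the two equivalent CLI forms for the 2+1 case, with
hypothetical hosts and brick paths:)

# Older syntax: 3 bricks in total, the last one being the arbiter
gluster volume create testvol replica 3 arbiter 1 \
    host1:/bricks/b1 host2:/bricks/b2 host3:/bricks/arb

# Newer, equivalent syntax: 2 data bricks plus 1 arbiter brick
gluster volume create testvol replica 2 arbiter 1 \
    host1:/bricks/b1 host2:/bricks/b2 host3:/bricks/arb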

> If the docs are right: why is an arbiter not allowed for higher replica counts?
The main motivation for the arbiter feature was to solve a specific case:
people who wanted to avoid the split-brains associated with replica 2 but did
not want to add another full-blown data brick to make it replica 3, for cost
reasons.

> It would allow improving on client quorum, in my understanding.

Agreed, but the current implementation is only for a 2+1 configuration.
Perhaps it is something we could work on in the future to make it generic,
like you say.


> Thank you for your opinion and/or facts :-)
I don't think NUFA is being worked on/tested actively. If you can afford
a 3rd data brick, making it replica 3 is definitely better than a 2+1 arbiter,
since there is more availability by virtue of the 3rd brick also storing data.
Both of them prevent split-brains and are used successfully by oVirt / VM
storage / hyperconvergence use cases.
Even without NUFA, for reads, AFR anyway serves them from the local copy
(writes still need to go to all bricks).


Regards,
Ravi


> Ingo




[Gluster-users] Is "replica 4 arbiter 1" allowed to tweak client-quorum?

2019-04-03 Thread Ingo Fischer
Hi All,

I had a replica 2 cluster to host my VM images from my Proxmox cluster.
I got around split-brain scenarios a bit by using "nufa" to make sure
the files are located on the host where the machine normally runs.
So in fact one replica could fail and I still had the VM working.

But then I thought about doing better and decided to add a node to
increase the replica count, and I decided against the arbiter approach.
During this I also decided to move away from nufa to make it a more
standard setup.

But in fact, by adding the third replica and removing nufa, I'm not really
better off on availability - only on split-brain chance. I'm still at the
point where only one node is allowed to fail, because otherwise the now
active client quorum is no longer met and the FS goes read-only (which in
fact is not really better than failing completely, as it was before).

So I thought about adding arbiter bricks as a "kind of 4th replica" (but
without the space needs) ... but then I read in the docs that only
"replica 3 arbiter 1" is allowed as a combination. Is this still true?
If the docs are right: why is an arbiter not allowed for higher replica counts?
It would allow improving on client quorum, in my understanding.

Thank you for your opinion and/or facts :-)

Ingo

-- 
Ingo Fischer
Technical Director of Platform

Gameforge 4D GmbH
Albert-Nestler-Straße 8
76131 Karlsruhe
Germany

Tel. +49 721 354 808-2269

ingo.fisc...@gameforge.com

http://www.gameforge.com
Amtsgericht Mannheim, Handelsregisternummer 718029
USt-IdNr.: DE814330106
Geschäftsführer Alexander Rösner, Jeffrey Brown