Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Dmitry Antipov

On 11/26/20 8:14 PM, Gionatan Danti wrote:

So I think you simply are CPU limited. I remember doing some tests with loopback RAM disks and finding that Gluster used 100% CPU (ie: full load on an entire core) when doing 4K random writes. Side 
note: using synchronized (ie: fsync) 4k writes, I only get ~600 IOPs even when running both bricks on the same machine and backing them with RAM disks (in other words, with no network or disk 
bottleneck).


Thanks, it seems you're right. Running a local replica 3 volume on 3x1Gb
ramdisks, I'm seeing:

top - 08:44:35 up 1 day, 11:51,  1 user,  load average: 2.34, 1.94, 1.00
Tasks: 237 total,   2 running, 235 sleeping,   0 stopped,   0 zombie
%Cpu(s): 38.7 us, 29.4 sy,  0.0 ni, 23.6 id,  0.0 wa,  0.4 hi,  7.9 si,  0.0 st
MiB Mem :  15889.8 total,   1085.7 free,   1986.3 used,  12817.8 buff/cache
MiB Swap:  0.0 total,  0.0 free,  0.0 used.  12307.3 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
63651 root  20   0  664124  41676   9600 R 166.7   0.3   0:24.20 fio
63282 root  20   0 1235336  21484   8768 S 120.4   0.1   2:43.73 glusterfsd
63298 root  20   0 1235368  20512   8856 S 120.0   0.1   2:42.43 glusterfsd
63314 root  20   0 1236392  21396   8684 S 119.8   0.1   2:41.94 glusterfsd
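
For reference, a 4K random-write job along these lines generates this kind of load
(a minimal sketch; the mount point and the exact option values are placeholders,
not necessarily what was used for the numbers above):

# 4K random writes against the FUSE mount of the volume (path is a placeholder)
fio --name=randwrite4k --directory=/mnt/test0 --rw=randwrite --bs=4k \
    --size=1g --ioengine=libaio --iodepth=32 --numjobs=1 \
    --time_based --runtime=60 --group_reporting
# add --fsync=1 to mimic the synchronized-write case mentioned above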

So a 32-core server-class system with a lot of RAM can't perform much faster for an
individual I/O client - it just scales better if there are a lot of clients, right?

Dmitry




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Ewen Chan
Silly question to all though -

Akin to the problems that Linus Tech Tips experienced with ZFS and a multi-disk
NVMe SSD array -- is GlusterFS written so that it takes into account how NVMe
SSDs operate?

(i.e. might the code itself wait, and/or wait for synchronous commands to
finish, before executing the next command?)

cf. 
https://forum.level1techs.com/t/fixing-slow-nvme-raid-performance-on-epyc/151909

I'm not a programmer or a developer, so I don't really understand the software
internals, but I am just wondering whether this might be a similar issue with
GlusterFS as it is with ZFS on NVMe storage devices, because the underlying
code/system was written with mechanically rotating disks in mind or, at best,
SATA 3.0 6 Gbps SSDs, as opposed to NVMe SSDs.

Could this be a possible reason/cause, by analogy?




From: gluster-users-boun...@gluster.org  on 
behalf of Dmitry Antipov 
Sent: November 26, 2020 8:36 AM
To: gluster-users@gluster.org List 
Subject: Re: [Gluster-users] Poor performance on a server-class system vs. 
desktop

To whoever may find it interesting: this paper says that ~80K IOPS (4K random
writes) is achievable:

https://archive.fosdem.org/2018/schedule/event/optimizing_sds/attachments/slides/2300/export/events/attachments/optimizing_sds/slides/2300/GlusterOnNVMe_FOSDEM2018.pdf

On the same class of server hardware, following their tuning recommendations, etc.,
I just run 8 times slower.
So it seems that RH insiders are the only people who know how to set up a real
GlusterFS installation properly :(.

Dmitry






Re: [Gluster-users] missing files on FUSE mounts

2020-11-26 Thread James Hammett
Yes, I compared the client count like this:

gluster volume status  clients |grep -B1 connected

I ran the find command on each client before and after shutting down the 
problematic daemon to determine any file count differences:

find /mount/point |wc -l
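
To compare several clients at once, a small loop over the same mount point also
works (a sketch; the client hostnames are placeholders):

for h in client1 client2 client3; do
    printf '%s: ' "$h"
    ssh "$h" 'find /mount/point | wc -l'
done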

After my last post I discovered that one of the clients had somehow been 
blocked by iptables from connecting to one of the bricks. So for an extended 
period any file creation from that one client was perpetuating an imbalance 
between bricks, causing different files to be visible for different clients. 
What baffles me is that gluster wouldn't automatically fix an imbalance between 
replicas like that.

On Fri, Nov 20, 2020, at 1:24 AM, Benedikt Kaleß wrote:
> Dear James,

> we have exactly the same problems.

> Could you describe what you did to discover which of your bricks had the 
> worst file count discrepancy, and how you found out that all clients matched 
> after shutting down this daemon?

> 

> Best regards

> Benedikt

> Am 02.11.20 um 17:30 schrieb James H:
>> I found a solution after making a discovery. I logged into the brick with 
>> the worst file count discrepancy - odroid4 - and killed the gluster daemon 
>> there. All file counts across all clients then matched. So I started the 
>> daemon and ran this command to try to fix it up:

>> 

>> gluster volume replace-brick gvol0 odroid4:/srv/gfs-brick/gvol0 
>> odroid4:/srv/gfs-brick/gvol0_2 commit force

>> 

>> ...and that fixed it. It's disconcerting that it's possible for Gluster to 
>> merrily hum along without any problems showing up in the various status 
>> summaries yet show vastly different directory listings to different clients. 
>> Is this a known problem or shall I open a bug report? Are there any 
>> particular error logs I should monitor to be alerted to this bad state?

>> 
>> On Thu, Oct 29, 2020 at 8:39 PM James H  wrote:
>>> Hi folks, I'm struggling to find a solution to missing files on FUSE 
>>> mounts. Which files are missing is different on different clients. I can 
>>> stat or ls the missing files directly when called by filename but listing 
>>> directories won't show them.
>>> 
>>> So far I've:
>>>  * verified heal info shows no files in need of healing and no split brain 
>>> condition
>>>  * verified the same number of clients are connected to each brick 
>>>  * verified the file counts on the bricks match
>>>  * upgraded Gluster server and clients from 3.x to 6.x and 7.x
>>>  * run a stat on all files
>>>  * run a heal full
>>>  * rebooted / remounted FUSE clients
>>> File count from running a 'find' command on FUSE mounts on the bricks 
>>> themselves. These counts should all be the same:
>>> 38823 fuse-odroid1-share2
>>> 38823 fuse-odroid2-share2
>>> 60962 fuse-odroid3-share2
>>> 7202  fuse-odroid4-share2
>>> 
>>> ...and a FUSE mount on a separate server:
>>> 38823 fuse-phn2dsm-share2
>>> 
>>> File count from running a 'find' command on the brick directories themselves:
>>> 43382 brick-odroid1-share2
>>> 43382 brick-odroid2-share2
>>> 43382 brick-arbiter-odroid3-share2
>>> 23075 brick-odroid3-share2
>>> 23075 brick-odroid4-share2
>>> 23075 brick-arbiter-odroid2-share2
>>> 
>>> Here's some info about the setup:
>>> 
>>> # gluster --version | head -1; cat /etc/lsb-release; uname -r
>>> glusterfs 7.8
>>> DISTRIB_ID=Ubuntu
>>> DISTRIB_RELEASE=18.04
>>> DISTRIB_CODENAME=bionic
>>> DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS"
>>> 4.14.157-171
>>> 
>>> 
>>> # gluster volume info
>>> Volume Name: gvol0
>>> Type: Distributed-Replicate
>>> Volume ID: 57e3a085-5fb7-417d-a71a-fed5cd0ae2d9
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 2 x (2 + 1) = 6
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: odroid1:/srv/gfs-brick/gvol0
>>> Brick2: odroid2:/srv/gfs-brick/gvol0
>>> Brick3: odroid3:/srv/gfs-brick/gvol0-arbiter2 (arbiter)
>>> Brick4: odroid3:/srv/gfs-brick/gvol0_2
>>> Brick5: odroid4:/srv/gfs-brick/gvol0
>>> Brick6: odroid2:/srv/gfs-brick/gvol0-arbiter2 (arbiter)
>>> Options Reconfigured:
>>> cluster.self-heal-daemon: enable
>>> performance.readdir-ahead: yes
>>> performance.cache-invalidation: on
>>> performance.stat-prefetch: on
>>> performance.quick-read: on
>>> cluster.shd-max-threads: 4
>>> performance.parallel-readdir: on
>>> cluster.server-quorum-type: server
>>> server.event-threads: 4
>>> client.event-threads: 4
>>> performance.nl-cache-timeout: 600
>>> performance.nl-cache: on
>>> network.inode-lru-limit: 20
>>> performance.md-cache-timeout: 600
>>> performance.cache-samba-metadata: on
>>> features.cache-invalidation-timeout: 600
>>> features.cache-invalidation: on
>>> storage.fips-mode-rchecksum: on
>>> performance.client-io-threads: off
>>> nfs.disable: on
>>> transport.address-family: inet
>>> features.bitrot: on
>>> features.scrub: Active
>>> features.scrub-throttle: lazy
>>> features.scrub-freq: daily
>>> cluster.min-free-disk: 10%
>>> 
>>> # gluster volume status gvol0 detail
>>> Status of volume: gvol0

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Gionatan Danti

On 2020-11-26 09:47, Dmitry Antipov wrote:

On 11/26/20 11:29 AM, Gionatan Danti wrote:


Can you detail your exact client and server CPU model?


Desktop is 8x of:
model name  : Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz

Server is 32x of:
model name  : Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz


Your desktop CPU's single-thread performance is significantly higher than 
the server CPU's: the former turbos to 3.5 GHz, while the latter only to 
3.0 GHz. Moreover, for single-thread workloads Skylake client is 3-5% 
faster than Skylake server at the same frequency.


So I think you simply are CPU limited. I remember doing some tests with 
loopback RAM disks and finding that Gluster used 100% CPU (ie: full load 
on an entire core) when doing 4K random writes. Side note: using 
synchronized (ie: fsync) 4k writes, I only get ~600 IOPs even when 
running both bricks on the same machine and backing them with RAM disks 
(in other words, with no network or disk bottleneck).
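
An easy way to confirm whether a single brick thread is pegging a core is a
per-thread CPU view while the benchmark runs (a sketch, assuming the brick
processes are named glusterfsd):

top -H -p $(pgrep -d, glusterfsd)        # per-thread CPU usage of the bricks
pidstat -t -p $(pgrep -d, glusterfsd) 1  # same data, sampled every second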


Regards.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.da...@assyoma.it - i...@assyoma.it
GPG public key ID: FF5F32A8






Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Strahil Nikolov
Erm... that's not correct.
Put them on the same line:
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4 trick

Best Regards,
Strahil Nikolov

On 26.11.2020 (Thu) at 12:00 +0300, Dmitry Antipov wrote:
> On 11/26/20 11:42 AM, Strahil Nikolov wrote:
> 
> > And you gluster bricks are localhost:/brick1 , localhost:/brick2
> > and
> > localhost:/brick3 ?
> > If not, add the hostname used for the bricks on the line starting
> > with
> > 127.0.0.1 and try again.
> 
> Same thing with:
> 
> 127.0.0.1   trick trick.localdomain trick4 trick4.localdomain4
> 127.0.0.1   localhost localhost.localdomain localhost4
> localhost4.localdomain4
> ::1 localhost localhost.localdomain localhost6
> localhost6.localdomain6
> 
> and:
> 
> Volume Name: test0
> Type: Replicate
> Volume ID: 2699e6fd-3898-4912-b4de-2d3850c53fb9
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: trick:/glusterfs/test0-000
> Brick2: trick:/glusterfs/test0-001
> Brick3: trick:/glusterfs/test0-002
> 
> When running the workload, the per-interface RX/TX counters (as shown by
> ifconfig) grow rapidly on 'lo' but remain nearly the same on the other
> interfaces.
> So I'm pretty sure that loopback is in action and the problem is
> somewhere else.
> 
> Dmitry







Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Dmitry Antipov

To whoever may find it interesting: this paper says that ~80K IOPS (4K random
writes) is achievable:

https://archive.fosdem.org/2018/schedule/event/optimizing_sds/attachments/slides/2300/export/events/attachments/optimizing_sds/slides/2300/GlusterOnNVMe_FOSDEM2018.pdf

On the same class of server hardware, following their tuning recommendations, etc.,
I just run 8 times slower.
So it seems that RH insiders are the only people who know how to set up a real
GlusterFS installation properly :(.
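
As an illustration of what such tuning looks like (these are options that also
appear elsewhere in this thread, applied to the test volume used here; not
necessarily the slide deck's exact recipe):

gluster volume set test0 server.event-threads 4
gluster volume set test0 client.event-threads 4
gluster volume set test0 performance.client-io-threads on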

Dmitry






Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Yaniv Kaul
On Thu, Nov 26, 2020 at 2:31 PM Dmitry Antipov  wrote:

> On 11/26/20 12:49 PM, Yaniv Kaul wrote:
>
> > I run a slightly different command, which hides the kernel stuff and
> focuses on the user mode functions:
> > sudo perf record --call-graph dwarf -j any --buildid-all --all-user -p
> `pgrep -d\, gluster` -F 2000 -ag
>
> Thanks.
>
> BTW, how much overhead is there in passing data between xlators? Even if
> most of their features are disabled, just passing through all of the below
> is unlikely to have near-to-zero overhead:
>

Very good question. I was always suspicious of that flow, and I do believe
we could do some optimizations, but here's the response I received back then:

Here's some data from some tests I was running last week - the average
round-trip time spent by fops in the brick stack, from the top translator
(io-stats) down to posix before it is executed on disk, is less than 20
microseconds. And this stack includes both the translators that are enabled
and used in RHHI and the do-nothing xlators you mention. In contrast, the
round-trip time spent by these fops between the client and server translators
is on the order of a few hundred microseconds to sometimes even 1 ms.
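
If you want comparable per-fop latency numbers from your own volume, the
built-in io-stats profiling reports them per brick (a sketch; the volume name
is a placeholder):

gluster volume profile test0 start
# run the fio workload, then:
gluster volume profile test0 info
gluster volume profile test0 stop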


> Thread 14 (Thread 0x7f2c0e7fc640 (LWP 19482) "glfs_rpcrqhnd"):
> #0  data_unref (this=0x7f2bfc032e68) at dict.c:768
> #1  0x7f2c290b90b9 in dict_deln (keylen=,
> key=0x7f2c163d542e "glusterfs.inodelk-dom-count", this=0x7f2bfc0bb1c8) at
> dict.c:645
> #2  dict_deln (this=0x7f2bfc0bb1c8, key=0x7f2c163d542e
> "glusterfs.inodelk-dom-count", keylen=) at dict.c:614
> #3  0x7f2c163c87ee in pl_get_xdata_requests (local=0x7f2bfc0ea658,
> xdata=0x7f2bfc0bb1c8) at posix.c:238
> #4  0x7f2c163b3267 in pl_get_xdata_requests (xdata=0x7f2bfc0bb1c8,
> local=) at posix.c:213
>

For example, https://github.com/gluster/glusterfs/issues/1707
optimizes pl_get_xdata_requests()
a bit.
Y.

#5  pl_writev (frame=0x7f2bfc0d5348, this=0x7f2c08014830,
> fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1, offset=108306432,
> flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at posix.c:2299
> #6  0x7f2c16395e31 in worm_writev (frame=0x7f2bfc0d5348,
> this=, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1,
> offset=108306432, flags=0, iobref=0x7f2c080820d0,
> xdata=0x7f2bfc0bb1c8) at worm.c:429
> #7  0x7f2c1638a55f in ro_writev (frame=frame@entry=0x7f2bfc0d5348,
> this=, fd=fd@entry=0x7f2bfc0bc768, 
> vector=vector@entry=0x7f2bfc105478,
> count=count@entry=1,
> off=off@entry=108306432, flags=0, iobref=0x7f2c080820d0,
> xdata=0x7f2bfc0bb1c8) at read-only-common.c:374
> #8  0x7f2c163705ac in leases_writev (frame=frame@entry=0x7f2bfc0bf148,
> this=0x7f2c0801a230, fd=fd@entry=0x7f2bfc0bc768, 
> vector=vector@entry=0x7f2bfc105478,
> count=count@entry=1,
> off=off@entry=108306432, flags=0, iobref=0x7f2c080820d0,
> xdata=0x7f2bfc0bb1c8) at leases.c:132
> #9  0x7f2c1634f6a8 in up_writev (frame=0x7f2bfc067508,
> this=0x7f2c0801bf00, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1,
> off=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8)
> at upcall.c:124
> #10 0x7f2c2913e6c2 in default_writev (frame=0x7f2bfc067508,
> this=, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1,
> off=108306432, flags=0, iobref=0x7f2c080820d0,
> xdata=0x7f2bfc0bb1c8) at defaults.c:2550
> #11 0x7f2c2913e6c2 in default_writev (frame=frame@entry=0x7f2bfc067508,
> this=, fd=fd@entry=0x7f2bfc0bc768, 
> vector=vector@entry=0x7f2bfc105478,
> count=count@entry=1,
> off=off@entry=108306432, flags=0, iobref=0x7f2c080820d0,
> xdata=0x7f2bfc0bb1c8) at defaults.c:2550
> #12 0x7f2c16315eb7 in marker_writev (frame=frame@entry=0x7f2bfc119e48,
> this=this@entry=0x7f2c08021440, fd=fd@entry=0x7f2bfc0bc768,
> vector=vector@entry=0x7f2bfc105478, count=count@entry=1,
> offset=offset@entry=108306432, flags=0, iobref=0x7f2c080820d0,
> xdata=0x7f2bfc0bb1c8) at marker.c:940
> #13 0x7f2c162fc0ab in barrier_writev (frame=0x7f2bfc119e48,
> this=, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1,
> off=108306432, flags=0, iobref=0x7f2c080820d0,
> xdata=0x7f2bfc0bb1c8) at barrier.c:248
> #14 0x7f2c2913e6c2 in default_writev (frame=0x7f2bfc119e48,
> this=, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1,
> off=108306432, flags=0, iobref=0x7f2c080820d0,
> xdata=0x7f2bfc0bb1c8) at defaults.c:2550
> #15 0x7f2c162c5cda in quota_writev (frame=frame@entry=0x7f2bfc119e48,
> this=, fd=fd@entry=0x7f2bfc0bc768, 
> vector=vector@entry=0x7f2bfc105478,
> count=count@entry=1,
> off=off@entry=108306432, flags=0, iobref=0x7f2c080820d0,
> xdata=0x7f2bfc0bb1c8) at quota.c:1947
> #16 0x7f2c16299c89 in io_stats_writev (frame=frame@entry=0x7f2bfc0e4358,
> this=this@entry=0x7f2c08029df0, fd=0x7f2bfc0bc768, 
> vector=vector@entry=0x7f2bfc105478,
> count=1, offset=108306432, flags=0,
> iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at io-stats.c:2893
> #17 0x7f2c161f01ac in server4_writev_resume (frame=0x7f2bfc0ef5c8,
> bound_xl=0x7f2c08029df0) at 

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Strahil Nikolov
And you gluster bricks are localhost:/brick1 , localhost:/brick2 and
localhost:/brick3 ?
If not, add the hostname used for the bricks on the line starting with
127.0.0.1 and try again.

Best Regards,
Strahil Nikolov

On 26.11.2020 (Thu) at 11:18 +0300, Dmitry Antipov wrote:
> On 11/26/20 9:05 AM, Strahil Nikolov wrote:
> > Can you test by adding entries in /etc/hosts for the loopback ip
> > (127.0.0.1)
> > 
> > something like this:
> > 127.0.0.1   localhost localhost.localdomain
> > localhost4 localhost4.localdomain4 server
> 
> On both systems, my /etc/hosts is:
> 
> 127.0.0.1   localhost localhost.localdomain localhost4
> localhost4.localdomain4
> ::1 localhost localhost.localdomain localhost6
> localhost6.localdomain6
> 
> Dmitry
> 







Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Dmitry Antipov

On 11/26/20 12:49 PM, Yaniv Kaul wrote:


I run a slightly different command, which hides the kernel stuff and focuses on 
the user mode functions:
sudo perf record --call-graph dwarf -j any --buildid-all --all-user -p `pgrep 
-d\, gluster` -F 2000 -ag


Thanks.

BTW, how much overhead is there in passing data between xlators? Even if most
of their features are disabled, just passing through all of the below is
unlikely to have near-to-zero overhead:

Thread 14 (Thread 0x7f2c0e7fc640 (LWP 19482) "glfs_rpcrqhnd"):
#0  data_unref (this=0x7f2bfc032e68) at dict.c:768
#1  0x7f2c290b90b9 in dict_deln (keylen=, key=0x7f2c163d542e 
"glusterfs.inodelk-dom-count", this=0x7f2bfc0bb1c8) at dict.c:645
#2  dict_deln (this=0x7f2bfc0bb1c8, key=0x7f2c163d542e "glusterfs.inodelk-dom-count", 
keylen=) at dict.c:614
#3  0x7f2c163c87ee in pl_get_xdata_requests (local=0x7f2bfc0ea658, 
xdata=0x7f2bfc0bb1c8) at posix.c:238
#4  0x7f2c163b3267 in pl_get_xdata_requests (xdata=0x7f2bfc0bb1c8, 
local=) at posix.c:213
#5  pl_writev (frame=0x7f2bfc0d5348, this=0x7f2c08014830, fd=0x7f2bfc0bc768, 
vector=0x7f2bfc105478, count=1, offset=108306432, flags=0, 
iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at posix.c:2299
#6  0x7f2c16395e31 in worm_writev (frame=0x7f2bfc0d5348, this=, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1, offset=108306432, flags=0, iobref=0x7f2c080820d0, 
xdata=0x7f2bfc0bb1c8) at worm.c:429
#7  0x7f2c1638a55f in ro_writev (frame=frame@entry=0x7f2bfc0d5348, this=, fd=fd@entry=0x7f2bfc0bc768, vector=vector@entry=0x7f2bfc105478, count=count@entry=1, 
off=off@entry=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at read-only-common.c:374
#8  0x7f2c163705ac in leases_writev (frame=frame@entry=0x7f2bfc0bf148, this=0x7f2c0801a230, fd=fd@entry=0x7f2bfc0bc768, vector=vector@entry=0x7f2bfc105478, count=count@entry=1, 
off=off@entry=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at leases.c:132
#9  0x7f2c1634f6a8 in up_writev (frame=0x7f2bfc067508, this=0x7f2c0801bf00, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1, off=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) 
at upcall.c:124
#10 0x7f2c2913e6c2 in default_writev (frame=0x7f2bfc067508, this=, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1, off=108306432, flags=0, iobref=0x7f2c080820d0, 
xdata=0x7f2bfc0bb1c8) at defaults.c:2550
#11 0x7f2c2913e6c2 in default_writev (frame=frame@entry=0x7f2bfc067508, this=, fd=fd@entry=0x7f2bfc0bc768, vector=vector@entry=0x7f2bfc105478, count=count@entry=1, 
off=off@entry=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at defaults.c:2550
#12 0x7f2c16315eb7 in marker_writev (frame=frame@entry=0x7f2bfc119e48, this=this@entry=0x7f2c08021440, fd=fd@entry=0x7f2bfc0bc768, vector=vector@entry=0x7f2bfc105478, count=count@entry=1, 
offset=offset@entry=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at marker.c:940
#13 0x7f2c162fc0ab in barrier_writev (frame=0x7f2bfc119e48, this=, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1, off=108306432, flags=0, iobref=0x7f2c080820d0, 
xdata=0x7f2bfc0bb1c8) at barrier.c:248
#14 0x7f2c2913e6c2 in default_writev (frame=0x7f2bfc119e48, this=, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1, off=108306432, flags=0, iobref=0x7f2c080820d0, 
xdata=0x7f2bfc0bb1c8) at defaults.c:2550
#15 0x7f2c162c5cda in quota_writev (frame=frame@entry=0x7f2bfc119e48, this=, fd=fd@entry=0x7f2bfc0bc768, vector=vector@entry=0x7f2bfc105478, count=count@entry=1, 
off=off@entry=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at quota.c:1947
#16 0x7f2c16299c89 in io_stats_writev (frame=frame@entry=0x7f2bfc0e4358, this=this@entry=0x7f2c08029df0, fd=0x7f2bfc0bc768, vector=vector@entry=0x7f2bfc105478, count=1, offset=108306432, flags=0, 
iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at io-stats.c:2893

#17 0x7f2c161f01ac in server4_writev_resume (frame=0x7f2bfc0ef5c8, 
bound_xl=0x7f2c08029df0) at server-rpc-fops_v2.c:3017
#18 0x7f2c161f901c in resolve_and_resume (fn=, frame=) at server-resolve.c:680
#19 server4_0_writev (req=) at server-rpc-fops_v2.c:3943
#20 0x7f2c290696e5 in rpcsvc_request_handler (arg=0x7f2c1614c0b8) at 
rpcsvc.c:2233
#21 0x7f2c28ffa3f9 in start_thread (arg=0x7f2c0e7fc640) at 
pthread_create.c:463
#22 0x7f2c28f25903 in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Dmitry






Re: [Gluster-users] possible memory leak in client/fuse mount

2020-11-26 Thread Ravishankar N


On 26/11/20 4:00 pm, Olaf Buitelaar wrote:

Hi Ravi,

I could try that, but I can only try a setup on VMs, and will not be
able to set up an environment like our production environment, which
runs on physical machines and has actual production load etc.
So the two setups would be quite different.
Personally I think it would be best to debug the actual machines instead
of trying to reproduce it, since reproducing the issue on the physical
machines is just a matter of swapping the repositories and upgrading the packages.

Let me know what you think?


Physical machines or VMs - anything is fine. The only thing is I cannot 
guarantee quick responses, so if it is a production machine, it will be 
an issue for you. So any setup you can use for experimenting is fine. 
You don't need any clients for the testing. Just create a 1x2 replica 
volume using 2 nodes and start it, then upgrade one node and see if the shd 
and bricks come up on that node.
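
Something along these lines is enough for that test volume (a sketch; hostnames
and brick paths are placeholders):

gluster peer probe node2
gluster volume create testvol replica 2 node1:/bricks/testvol node2:/bricks/testvol
# answer 'y' to the replica-2 split-brain warning, this is just a throwaway volume
gluster volume start testvol
gluster volume status testvol   # shd and brick processes should show as online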


-Ravi



Thanks Olaf

On Thu 26 Nov 2020 at 02:43, Ravishankar N <ravishan...@redhat.com> wrote:



On 25/11/20 7:17 pm, Olaf Buitelaar wrote:

Hi Ravi,

Thanks for checking. Unfortunately this is our production system.
What I've done is simply change the yum repo from gluster-6 to
http://mirror.centos.org/centos/$releasever/storage/$basearch/gluster-7/
and do a yum upgrade. I restarted the glusterd process several times,
and I've also tried rebooting the machine. I didn't touch the op-version
yet, which is still at 6; usually I only do this when all nodes are
upgraded and running stable.
We're running multiple volumes with different configurations, but
the shd doesn't start on the upgraded nodes for any of the volumes.
Is there anything further I could check/do to get to the bottom
of this?


Hi Olaf, like I said, would it be possible to create a test setup
to see if you can recreate it?

Regards,
Ravi


Thanks Olaf

On Wed 25 Nov 2020 at 14:14, Ravishankar N <ravishan...@redhat.com> wrote:


On 25/11/20 5:50 pm, Olaf Buitelaar wrote:

Hi Ashish,

Thank you for looking into this. I indeed also suspect it
has something to do with the 7.X client, because on the 6.X
clients the issue doesn't really seem to occur.
I would love to update everything to 7.X, but since the
self-heal daemons
(https://lists.gluster.org/pipermail/gluster-users/2020-November/038917.html)
won't start, I halted the full upgrade.


Olaf, based on your email, I did try to upgrade 1 node of a
3-node replica 3 setup from 6.10 to 7.8 on my test VMs, and I
found that the self-heal daemon (and the bricks) came online
after I restarted glusterd post-upgrade on that node (I did
not touch the op-version), and I did not spend time on it
further. So I don't think the problem is related to the shd
mux changes I referred to. But if you have a test setup where
you can reproduce this, please raise a GitHub issue with the
details.

Thanks,
Ravi

Hopefully that issue will be addressed in the upcoming
release. Once I have everything running on the same version
I'll check if the issue still occurs and reach out if
that's the case.

Thanks Olaf

On Wed 25 Nov 2020 at 10:42, Ashish Pandey <aspan...@redhat.com> wrote:


Hi,

I checked the statedump and found some very high memory
allocations.
grep -rwn "num_allocs" glusterdump.17317.dump.1605* |
cut -d'=' -f2 | sort

30003616
30003616
3305
3305
36960008
36960008
38029944
38029944
38450472
38450472
39566824
39566824
4
I did check the lines in the statedump and it could be
happening in protocol/client. However, I did not find
anything suspicious in my quick code exploration.
I would suggest upgrading all the nodes to the latest
version, then starting your work and seeing if there is any
high usage of memory.
That way it will also be easier to debug this issue.

---
Ashish



*From: *"Olaf Buitelaar" mailto:olaf.buitel...@gmail.com>>
*To: *"gluster-users" mailto:gluster-users@gluster.org>>
*Sent: *Thursday, November 19, 2020 10:28:57 PM
*Subject: *[Gluster-users] possible memory leak in
client/fuse mount

Dear Gluster Users,


Re: [Gluster-users] possible memory leak in client/fuse mount

2020-11-26 Thread Olaf Buitelaar
Hi Ravi,

I could try that, but I can only try a setup on VMs, and will not be able
to set up an environment like our production environment, which runs on
physical machines and has actual production load etc. So the
two setups would be quite different.
Personally I think it would be best to debug the actual machines instead of
trying to reproduce it, since reproducing the issue on the physical
machines is just a matter of swapping the repositories and upgrading the packages.
Let me know what you think?

Thanks Olaf

On Thu 26 Nov 2020 at 02:43, Ravishankar N wrote:

>
> On 25/11/20 7:17 pm, Olaf Buitelaar wrote:
>
> Hi Ravi,
>
> Thanks for checking. Unfortunately this is our production system. What
> I've done is simply change the yum repo from gluster-6 to
> http://mirror.centos.org/centos/$releasever/storage/$basearch/gluster-7/
> and do a yum upgrade. I restarted the glusterd process several times, and I've
> also tried rebooting the machine. I didn't touch the op-version yet,
> which is still at 6; usually I only do this when all nodes are
> upgraded and running stable.
> We're running multiple volumes with different configurations, but the shd
> doesn't start on the upgraded nodes for any of the volumes.
> Is there anything further I could check/do to get to the bottom of this?
>
> Hi Olaf, like I said, would it be possible to create a test setup to see
> if you can recreate it?
> Regards,
> Ravi
>
>
> Thanks Olaf
>
> On Wed 25 Nov 2020 at 14:14, Ravishankar N wrote:
>
>>
>> On 25/11/20 5:50 pm, Olaf Buitelaar wrote:
>>
>> Hi Ashish,
>>
>> Thank you for looking into this. I indeed also suspect it has something
>> to do with the 7.X client, because on the 6.X clients the issue doesn't
>> really seem to occur.
>> I would love to update everything to 7.X, but since the self-heal daemons
>> (https://lists.gluster.org/pipermail/gluster-users/2020-November/038917.html)
>> won't start, I halted the full upgrade.
>>
>> Olaf, based on your email, I did try to upgrade 1 node of a 3-node
>> replica 3 setup from 6.10 to 7.8 on my test VMs, and I found that the
>> self-heal daemon (and the bricks) came online after I restarted glusterd
>> post-upgrade on that node (I did not touch the op-version), and I did not
>> spend time on it further. So I don't think the problem is related to the
>> shd mux changes I referred to. But if you have a test setup where you can
>> reproduce this, please raise a GitHub issue with the details.
>> Thanks,
>> Ravi
>>
>> Hopefully that issue will be addressed in the upcoming release. Once I have
>> everything running on the same version I'll check if the issue still occurs
>> and reach out if that's the case.
>>
>> Thanks Olaf
>>
>> On Wed 25 Nov 2020 at 10:42, Ashish Pandey wrote:
>>
>>>
>>> Hi,
>>>
>>> I checked the statedump and found some very high memory allocations.
>>> grep -rwn "num_allocs" glusterdump.17317.dump.1605* | cut -d'=' -f2 |
>>> sort
>>>
>>> 30003616
>>> 30003616
>>> 3305
>>> 3305
>>> 36960008
>>> 36960008
>>> 38029944
>>> 38029944
>>> 38450472
>>> 38450472
>>> 39566824
>>> 39566824
>>> 4
>>> I did check the lines in the statedump and it could be happening in
>>> protocol/client. However, I did not find anything suspicious in my quick
>>> code exploration.
>>> I would suggest upgrading all the nodes to the latest version, then starting
>>> your work and seeing if there is any high usage of memory.
>>> That way it will also be easier to debug this issue.
>>>
>>> ---
>>> Ashish
>>>
>>> --
>>> *From: *"Olaf Buitelaar" 
>>> *To: *"gluster-users" 
>>> *Sent: *Thursday, November 19, 2020 10:28:57 PM
>>> *Subject: *[Gluster-users] possible memory leak in client/fuse mount
>>>
>>> Dear Gluster Users,
>>>
>>> I have a glusterfs process which consumes almost all of the machine's memory
>>> (~58GB):
>>>
>>> # ps -faxu|grep 17317
>>> root 17317  3.1 88.9 59695516 58479708 ?   Ssl  Oct31 839:36
>>> /usr/sbin/glusterfs --process-name fuse --volfile-server=10.201.0.1
>>> --volfile-server=10.201.0.8:10.201.0.5:10.201.0.6:10.201.0.7:10.201.0.9
>>> --volfile-id=/docker2 /mnt/docker2
>>>
>>> The gluster version on this machine is 7.8, but I'm currently running a
>>> mixed cluster of 6.10 and 7.8, while waiting to proceed with the upgrade
>>> because of the issue mentioned earlier with the self-heal daemon.
>>>
>>> The affected volume info looks like:
>>>
>>> # gluster v info docker2
>>>
>>> Volume Name: docker2
>>> Type: Distributed-Replicate
>>> Volume ID: 4e0670a0-3d00-4360-98bd-3da844cedae5
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 3 x (2 + 1) = 9
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: 10.201.0.5:/data0/gfs/bricks/brick1/docker2
>>> Brick2: 10.201.0.9:/data0/gfs/bricks/brick1/docker2
>>> Brick3: 10.201.0.3:/data0/gfs/bricks/bricka/docker2 (arbiter)
>>> Brick4: 10.201.0.6:/data0/gfs/bricks/brick1/docker2
>>> Brick5: 10.201.0.7:/data0/gfs/bricks/brick1/docker2
>>> Brick6: 10.201.0.4:/data0/gfs/bricks/bricka/docker2 (arbiter)
>>> Brick7: 

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Strahil Nikolov
Can you test by adding entries in /etc/hosts for the loopback ip
(127.0.0.1)

something like this:
127.0.0.1   localhost localhost.localdomain
localhost4 localhost4.localdomain4 server

Best Regards,
Strahil Nikolov

On 26.11.2020 (Thu) at 08:14 +0300, Dmitry Antipov wrote:
> On 11/26/20 6:33 AM, Ewen Chan wrote:
> 
> > Dmitry:
> > 
> > Is there a way to check and see if the GlusterFS write requests are
> > being routed through the network interface?
> > 
> > I am asking this because of your bricks/host definition as you
> > showed below.
> 
> In my test setup, all bricks and the client workload (fio) are running on
> the same host, so all network traffic should be routed through the
> loopback interface, which is CPU-bound.
> Since the server is 32-core and has plenty of RAM, loopback should be
> faster than even 10GbE.
> 
> Dmitry







Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Yaniv Kaul
On Thu, Nov 26, 2020 at 11:44 AM Dmitry Antipov  wrote:

> BTW, did someone try to profile the brick process? I did, and got this
> for the default replica 3 volume ('perf record -F 2500 -g -p [PID]'):
>

I run a slightly different command, which hides the kernel stuff and
focuses on the user mode functions:
sudo perf record --call-graph dwarf -j any --buildid-all --all-user -p
`pgrep -d\, gluster` -F 2000 -ag
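
To browse the recorded data afterwards, something like this works (a sketch,
reading the default perf.data output from the current directory):
sudo perf report --no-children -i perf.data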
Y.


> +3.29% 0.02%  glfs_epoll001[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +3.17% 0.01%  glfs_epoll001[kernel.kallsyms]  [k]
> do_syscall_64
> +3.17% 0.02%  glfs_epoll000[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +3.06% 0.02%  glfs_epoll000[kernel.kallsyms]  [k]
> do_syscall_64
> +2.75% 0.01%  glfs_iotwr00f[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.74% 0.01%  glfs_iotwr00b[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.74% 0.01%  glfs_iotwr001[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.73% 0.00%  glfs_iotwr003[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.72% 0.00%  glfs_iotwr000[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.72% 0.01%  glfs_iotwr00c[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.70% 0.01%  glfs_iotwr003[kernel.kallsyms]  [k]
> do_syscall_64
> +2.69% 0.00%  glfs_iotwr001[kernel.kallsyms]  [k]
> do_syscall_64
> +2.69% 0.01%  glfs_iotwr008[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.68% 0.00%  glfs_iotwr00b[kernel.kallsyms]  [k]
> do_syscall_64
> +2.68% 0.00%  glfs_iotwr00c[kernel.kallsyms]  [k]
> do_syscall_64
> +2.68% 0.00%  glfs_iotwr00f[kernel.kallsyms]  [k]
> do_syscall_64
> +2.68% 0.01%  glfs_iotwr000[kernel.kallsyms]  [k]
> do_syscall_64
> +2.67% 0.00%  glfs_iotwr00a[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.65% 0.00%  glfs_iotwr008[kernel.kallsyms]  [k]
> do_syscall_64
> +2.64% 0.00%  glfs_iotwr00e[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.64% 0.01%  glfs_iotwr00d[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.63% 0.01%  glfs_iotwr00a[kernel.kallsyms]  [k]
> do_syscall_64
> +2.63% 0.01%  glfs_iotwr007[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.63% 0.00%  glfs_iotwr005[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.63% 0.01%  glfs_iotwr006[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.63% 0.00%  glfs_iotwr009[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.61% 0.01%  glfs_iotwr004[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.61% 0.01%  glfs_iotwr00e[kernel.kallsyms]  [k]
> do_syscall_64
> +2.60% 0.00%  glfs_iotwr006[kernel.kallsyms]  [k]
> do_syscall_64
> +2.59% 0.00%  glfs_iotwr005[kernel.kallsyms]  [k]
> do_syscall_64
> +2.59% 0.00%  glfs_iotwr00d[kernel.kallsyms]  [k]
> do_syscall_64
> +2.58% 0.00%  glfs_iotwr002[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +2.58% 0.01%  glfs_iotwr007[kernel.kallsyms]  [k]
> do_syscall_64
> +2.58% 0.00%  glfs_iotwr004[kernel.kallsyms]  [k]
> do_syscall_64
> +2.57% 0.00%  glfs_iotwr009[kernel.kallsyms]  [k]
> do_syscall_64
> +2.54% 0.00%  glfs_iotwr002[kernel.kallsyms]  [k]
> do_syscall_64
> +1.65% 0.00%  glfs_epoll000[unknown]  [k]
> 0x0001
> +1.65% 0.00%  glfs_epoll001[unknown]  [k]
> 0x0001
> +1.48% 0.01%  glfs_rpcrqhnd[kernel.kallsyms]  [k]
> entry_SYSCALL_64_after_hwframe
> +1.44% 0.08%  glfs_rpcrqhndlibpthread-2.32.so [.]
> pthread_cond_wait@@GLIBC_2.3.2
> +1.40% 0.01%  glfs_rpcrqhnd[kernel.kallsyms]  [k]
> do_syscall_64
> +1.36% 0.01%  glfs_rpcrqhnd[kernel.kallsyms]  [k]
> __x64_sys_futex
> +1.35% 0.03%  glfs_rpcrqhnd[kernel.kallsyms]  [k] do_futex
> +1.34% 0.01%  glfs_iotwr00alibpthread-2.32.so [.]
> __libc_pwrite64
> +1.32% 0.00%  glfs_iotwr00a[kernel.kallsyms]  [k]
> __x64_sys_pwrite64
> +1.32% 0.00%  glfs_iotwr001libpthread-2.32.so [.]
> __libc_pwrite64
> +1.31% 0.01%  glfs_iotwr002libpthread-2.32.so [.]
> __libc_pwrite64
> +1.31% 0.00%  glfs_iotwr00blibpthread-2.32.so [.]
> __libc_pwrite64
> +1.31% 0.01%  glfs_iotwr00a[kernel.kallsyms]  [k] vfs_write
> +1.30% 0.00%  glfs_iotwr001[kernel.kallsyms]  [k]
> __x64_sys_pwrite64
> +1.30% 0.00%  

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Dmitry Antipov

BTW, did someone try to profile the brick process? I did, and got this
for the default replica 3 volume ('perf record -F 2500 -g -p [PID]'):

+3.29% 0.02%  glfs_epoll001[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+3.17% 0.01%  glfs_epoll001[kernel.kallsyms]  [k] do_syscall_64
+3.17% 0.02%  glfs_epoll000[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+3.06% 0.02%  glfs_epoll000[kernel.kallsyms]  [k] do_syscall_64
+2.75% 0.01%  glfs_iotwr00f[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.74% 0.01%  glfs_iotwr00b[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.74% 0.01%  glfs_iotwr001[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.73% 0.00%  glfs_iotwr003[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.72% 0.00%  glfs_iotwr000[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.72% 0.01%  glfs_iotwr00c[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.70% 0.01%  glfs_iotwr003[kernel.kallsyms]  [k] do_syscall_64
+2.69% 0.00%  glfs_iotwr001[kernel.kallsyms]  [k] do_syscall_64
+2.69% 0.01%  glfs_iotwr008[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.68% 0.00%  glfs_iotwr00b[kernel.kallsyms]  [k] do_syscall_64
+2.68% 0.00%  glfs_iotwr00c[kernel.kallsyms]  [k] do_syscall_64
+2.68% 0.00%  glfs_iotwr00f[kernel.kallsyms]  [k] do_syscall_64
+2.68% 0.01%  glfs_iotwr000[kernel.kallsyms]  [k] do_syscall_64
+2.67% 0.00%  glfs_iotwr00a[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.65% 0.00%  glfs_iotwr008[kernel.kallsyms]  [k] do_syscall_64
+2.64% 0.00%  glfs_iotwr00e[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.64% 0.01%  glfs_iotwr00d[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.63% 0.01%  glfs_iotwr00a[kernel.kallsyms]  [k] do_syscall_64
+2.63% 0.01%  glfs_iotwr007[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.63% 0.00%  glfs_iotwr005[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.63% 0.01%  glfs_iotwr006[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.63% 0.00%  glfs_iotwr009[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.61% 0.01%  glfs_iotwr004[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.61% 0.01%  glfs_iotwr00e[kernel.kallsyms]  [k] do_syscall_64
+2.60% 0.00%  glfs_iotwr006[kernel.kallsyms]  [k] do_syscall_64
+2.59% 0.00%  glfs_iotwr005[kernel.kallsyms]  [k] do_syscall_64
+2.59% 0.00%  glfs_iotwr00d[kernel.kallsyms]  [k] do_syscall_64
+2.58% 0.00%  glfs_iotwr002[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+2.58% 0.01%  glfs_iotwr007[kernel.kallsyms]  [k] do_syscall_64
+2.58% 0.00%  glfs_iotwr004[kernel.kallsyms]  [k] do_syscall_64
+2.57% 0.00%  glfs_iotwr009[kernel.kallsyms]  [k] do_syscall_64
+2.54% 0.00%  glfs_iotwr002[kernel.kallsyms]  [k] do_syscall_64
+1.65% 0.00%  glfs_epoll000[unknown]  [k] 
0x0001
+1.65% 0.00%  glfs_epoll001[unknown]  [k] 
0x0001
+1.48% 0.01%  glfs_rpcrqhnd[kernel.kallsyms]  [k] 
entry_SYSCALL_64_after_hwframe
+1.44% 0.08%  glfs_rpcrqhndlibpthread-2.32.so [.] 
pthread_cond_wait@@GLIBC_2.3.2
+1.40% 0.01%  glfs_rpcrqhnd[kernel.kallsyms]  [k] do_syscall_64
+1.36% 0.01%  glfs_rpcrqhnd[kernel.kallsyms]  [k] 
__x64_sys_futex
+1.35% 0.03%  glfs_rpcrqhnd[kernel.kallsyms]  [k] do_futex
+1.34% 0.01%  glfs_iotwr00alibpthread-2.32.so [.] 
__libc_pwrite64
+1.32% 0.00%  glfs_iotwr00a[kernel.kallsyms]  [k] 
__x64_sys_pwrite64
+1.32% 0.00%  glfs_iotwr001libpthread-2.32.so [.] 
__libc_pwrite64
+1.31% 0.01%  glfs_iotwr002libpthread-2.32.so [.] 
__libc_pwrite64
+1.31% 0.00%  glfs_iotwr00blibpthread-2.32.so [.] 
__libc_pwrite64
+1.31% 0.01%  glfs_iotwr00a[kernel.kallsyms]  [k] vfs_write
+1.30% 0.00%  glfs_iotwr001[kernel.kallsyms]  [k] 
__x64_sys_pwrite64
+1.30% 0.00%  glfs_iotwr008libpthread-2.32.so [.] 
__libc_pwrite64
+1.30% 0.00%  glfs_iotwr00a[kernel.kallsyms]  [k] new_sync_write
+1.30% 0.00%  glfs_iotwr00clibpthread-2.32.so [.] 
__libc_pwrite64
+1.29% 0.00%  glfs_iotwr00a[kernel.kallsyms]  [k] 
xfs_file_write_iter
+1.29% 0.01%  glfs_iotwr00a[kernel.kallsyms]  [k] 
xfs_file_dio_aio_write

And on replica 3 with storage.linux-aio 

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Dmitry Antipov

On 11/26/20 11:42 AM, Strahil Nikolov wrote:


And you gluster bricks are localhost:/brick1 , localhost:/brick2 and
localhost:/brick3 ?
If not, add the hostname used for the bricks on the line starting with
127.0.0.1 and try again.


Same thing with:

127.0.0.1   trick trick.localdomain trick4 trick4.localdomain4
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

and:

Volume Name: test0
Type: Replicate
Volume ID: 2699e6fd-3898-4912-b4de-2d3850c53fb9
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: trick:/glusterfs/test0-000
Brick2: trick:/glusterfs/test0-001
Brick3: trick:/glusterfs/test0-002

When running the workload, the per-interface RX/TX counters (as shown by ifconfig)
grow rapidly on 'lo' but remain nearly the same on the other interfaces.
So I'm pretty sure that loopback is in action and the problem is somewhere else.
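
For reference, this is easy to watch while the workload runs (a sketch):

watch -n 1 'ip -s link show lo'   # RX/TX byte counters of the loopback interface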

Dmitry






Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Dmitry Antipov

On 11/26/20 11:29 AM, Gionatan Danti wrote:


Can you detail your exact client and server CPU model?


Desktop is 8x of:

processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 94
model name  : Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
stepping: 3
microcode   : 0xe2
cpu MHz : 800.114
cache size  : 6144 KB
physical id : 0
siblings: 8
core id : 3
cpu cores   : 4
apicid  : 7
initial apicid  : 7
fpu : yes
fpu_exception   : yes
cpuid level : 22
wp  : yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts 
rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave 
avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm 
mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d

vmx flags   : vnmi preemption_timer invvpid ept_x_only ept_ad ept_1gb 
flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest ple pml
bugs: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds 
swapgs taa itlb_multihit srbds
bogomips: 5199.98
clflush size: 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual

and the NVMe disk is:

Model Number:   THNSN5128GPU7 TOSHIBA
Serial Number:  366S11IKTP6V
Firmware Version:   57XA4104
PCI Vendor/Subsystem ID:0x1179
IEEE OUI Identifier:0x00080d
Controller ID:  0
Number of Namespaces:   1
Namespace 1 Size/Capacity:  128,035,676,160 [128 GB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64:00080d 020004008b

Server is 32x of:

processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 85
model name  : Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz
stepping: 4
microcode   : 0x25a
cpu MHz : 800.057
cache size  : 11264 KB
physical id : 1
siblings: 16
core id : 7
cpu cores   : 8
apicid  : 31
initial apicid  : 31
fpu : yes
fpu_exception   : yes
cpuid level : 22
wp  : yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts 
rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes 
xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 
smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total 
cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req pku ospke flush_l1d
vmx flags	: vnmi preemption_timer posted_intr invvpid ept_x_only ept_ad ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid unrestricted_guest vapic_reg vid ple shadow_vmcs pml 
ept_mode_based_exec tsc_scaling

bugs: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds 
swapgs taa itlb_multihit
bogomips: 4201.60
clflush size: 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual

and the NVMe disk is:

Model Number:   INTEL SSDPE7KX020T7
Serial Number:  PHLF8103001B2P0LGN
Firmware Version:   QDV10170
PCI Vendor/Subsystem ID:0x8086
IEEE OUI Identifier:0x5cd2e4
Total NVM Capacity: 2,000,398,934,016 [2.00 TB]
Unallocated NVM Capacity:   0
Controller ID:  0
Number of Namespaces:   1
Namespace 1 Size/Capacity:  2,000,398,934,016 [2.00 TB]
Namespace 1 Formatted LBA Size: 512

Dmitry






Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Gionatan Danti

On 2020-11-26 06:14, Dmitry Antipov wrote:

In my test setup, all bricks and the client workload (fio) are running on
the same host, so all network traffic should be routed through the
loopback interface, which is CPU-bound.
Since the server is 32-core and has plenty of RAM, loopback should be
faster than even 10GbE.


Can you detail your exact client and server CPU model?
Thanks.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.da...@assyoma.it - i...@assyoma.it
GPG public key ID: FF5F32A8






Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Dmitry Antipov

On 11/26/20 9:05 AM, Strahil Nikolov wrote:

Can you test by adding entries in /etc/hosts for the loopback ip
(127.0.0.1)

something like this:
127.0.0.1   localhost localhost.localdomain
localhost4 localhost4.localdomain4 server


On both systems, my /etc/hosts is:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

Dmitry




