Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-27 Thread Gionatan Danti
On 2020-11-27 09:40, Amar Tumballi wrote: Let's get to longer look into performance: Amar, Xavi, thanks for your input - very appreciated. However, I found that when facing sync writes (i.e. fsync), Gluster performance is very low - too low to be explained by kernel/syscall overhead alone. For more info

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-27 Thread Dmitry Antipov
On 11/27/20 11:40 AM, Amar Tumballi wrote: Now, coming back to the subject: with more CPUs, the same test shows a smaller performance gain because lock contention becomes a larger share of the bottleneck than on your laptop. Can you try running the same test while restricting the number of cores the glusterfsd
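Restricting the cores available to a brick process, as suggested above, could be sketched roughly as follows. This is a minimal Linux-only illustration, not something posted in the thread: the helper name `restrict_to_cpus` is hypothetical, and the brick PID has to be looked up separately (for example with `pgrep`). It walks every thread of the process because `os.sched_setaffinity` acts on a single thread ID.

```python
import os


def restrict_to_cpus(pid, cpus):
    """Pin every thread of process `pid` to the CPU indices in `cpus`.

    Hypothetical helper for illustration; Linux only (relies on /proc).
    Returns the affinity mask of the main thread after the change.
    """
    cpus = set(cpus)
    # Each entry under /proc/<pid>/task is the ID of one thread.
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            os.sched_setaffinity(int(tid), cpus)
        except ProcessLookupError:
            # The thread exited between listing and pinning; skip it.
            continue
    return os.sched_getaffinity(pid)
```

For instance, `restrict_to_cpus(brick_pid, range(8))` would confine an already running brick process to the first eight cores, provided the caller has permission to change that process's affinity. Threads created afterwards inherit the mask of the thread that creates them.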

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-27 Thread Amar Tumballi
Top posting, as my observations are general and don't say anything specific about the problem at hand or our ideas to improve it. Thanks, Dmitry, for a good thread :-) I will try to break this into a long answer, but will give a short answer to the question first. Does a single thread user app tak

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-27 Thread Xavi Hernandez
Hi Dmitry, On Thu, Nov 26, 2020 at 10:44 AM Dmitry Antipov wrote: > BTW, did someone try to profile the brick process? I did, and got this > for the default replica 3 volume ('perf record -F 2500 -g -p [PID]'): > > +3.29% 0.02% glfs_epoll001 [kernel.kallsyms] [k] > entry_SYSCALL_

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-27 Thread Gionatan Danti
On 2020-11-27 06:53, Dmitry Antipov wrote: Thanks, it seems you're right. Running a local replica 3 volume on 3x1Gb ramdisks, I'm seeing: top - 08:44:35 up 1 day, 11:51, 1 user, load average: 2.34, 1.94, 1.00 Tasks: 237 total, 2 running, 235 sleeping, 0 stopped, 0 zombie %Cpu(s): 38.

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Dmitry Antipov
On 11/26/20 8:14 PM, Gionatan Danti wrote: So I think you simply are CPU limited. I remember doing some tests with loopback RAM disks and finding that Gluster used 100% CPU (ie: full load on an entire core) when doing 4K random writes. Side note: using synchronized (ie: fsync) 4k writes, I only

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Ewen Chan
half of Dmitry Antipov Sent: November 26, 2020 8:36 AM To: gluster-users@gluster.org List Subject: Re: [Gluster-users] Poor performance on a server-class system vs. desktop For anyone interested, this paper says that ~80K IOPS (4K random writes) is real: https://archive.fosdem.o

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Gionatan Danti
On 2020-11-26 09:47, Dmitry Antipov wrote: On 11/26/20 11:29 AM, Gionatan Danti wrote: Can you detail your exact client and server CPU model? Desktop is 8x of: model name : Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz Server is 32x of: model name : Intel(R) Xeon(R) Silver 4110 CP

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Strahil Nikolov
Erm... that's not correct. Put them on the same line: 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 trick. Best Regards, Strahil Nikolov At 12:00 +0300 on 26.11.2020 (Thu), Dmitry Antipov wrote: > On 11/26/20 11:42 AM, Strahil Nikolov wrote: > > > And you glus

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Dmitry Antipov
For anyone interested, this paper says that ~80K IOPS (4K random writes) is real: https://archive.fosdem.org/2018/schedule/event/optimizing_sds/attachments/slides/2300/export/events/attachments/optimizing_sds/slides/2300/GlusterOnNVMe_FOSDEM2018.pdf On the same-class server hardware, fo

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Yaniv Kaul
On Thu, Nov 26, 2020 at 2:31 PM Dmitry Antipov wrote: > On 11/26/20 12:49 PM, Yaniv Kaul wrote: > > > I run a slightly different command, which hides the kernel stuff and > focuses on the user mode functions: > > sudo perf record --call-graph dwarf -j any --buildid-all --all-user -p > `pgrep -d\,

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Strahil Nikolov
And your gluster bricks are localhost:/brick1, localhost:/brick2 and localhost:/brick3? If not, add the hostname used for the bricks on the line starting with 127.0.0.1 and try again. Best Regards, Strahil Nikolov At 11:18 +0300 on 26.11.2020 (Thu), Dmitry Antipov wrote: > On 11/26/20 9:05 AM, St

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Dmitry Antipov
On 11/26/20 12:49 PM, Yaniv Kaul wrote: I run a slightly different command, which hides the kernel stuff and focuses on the user mode functions: sudo perf record --call-graph dwarf -j any --buildid-all --all-user -p `pgrep -d\, gluster` -F 2000 -ag Thanks. BTW, how much is the overhead of pa

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Strahil Nikolov
Can you test by adding entries in /etc/hosts for the loopback IP (127.0.0.1), something like this: 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 server Best Regards, Strahil Nikolov At 08:14 +0300 on 26.11.2020 (Thu), Dmitry Antipov wrote: > On 11/26/20 6:33 AM, Ew
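To check whether the hostname used for the bricks already appears on the 127.0.0.1 line, as the suggestion above requires, a small parser over the hosts file contents is enough. The sketch below is illustrative only: `names_for_address` and `brick_host_is_loopback` are hypothetical helpers, not part of Gluster, and the sample line is the one quoted in the message.

```python
def names_for_address(hosts_text, address="127.0.0.1"):
    """Return every hostname mapped to `address` in hosts-file text."""
    names = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == address:
            names.extend(fields[1:])
    return names


def brick_host_is_loopback(hosts_text, brick_host):
    """True if `brick_host` is listed on a 127.0.0.1 line."""
    return brick_host in names_for_address(hosts_text)


sample = ("127.0.0.1 localhost localhost.localdomain "
          "localhost4 localhost4.localdomain4 server")
print(brick_host_is_loopback(sample, "server"))
```

On a real system the text would come from reading `/etc/hosts`, and `brick_host` would be whatever hostname appears in the volume's brick definitions.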

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Yaniv Kaul
On Thu, Nov 26, 2020 at 11:44 AM Dmitry Antipov wrote: > BTW, did someone try to profile the brick process? I did, and got this > for the default replica 3 volume ('perf record -F 2500 -g -p [PID]'): > I run a slightly different command, which hides the kernel stuff and focuses on the user mode f

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Dmitry Antipov
BTW, did someone try to profile the brick process? I did, and got this for the default replica 3 volume ('perf record -F 2500 -g -p [PID]'): +3.29% 0.02% glfs_epoll001 [kernel.kallsyms] [k] entry_SYSCALL_64_after_hwframe +3.17% 0.01% glfs_epoll001 [kernel.kallsyms]

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Dmitry Antipov
On 11/26/20 11:42 AM, Strahil Nikolov wrote: And your gluster bricks are localhost:/brick1, localhost:/brick2 and localhost:/brick3? If not, add the hostname used for the bricks on the line starting with 127.0.0.1 and try again. Same thing with: 127.0.0.1 trick trick.localdomain trick4 tri

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Dmitry Antipov
On 11/26/20 11:29 AM, Gionatan Danti wrote: Can you detail your exact client and server CPU model? Desktop is 8x of: processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 94 model name : Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz stepping: 3 mic

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Gionatan Danti
On 2020-11-26 06:14, Dmitry Antipov wrote: In my test setup, all bricks and client workload (fio) are running on the same host. So all network traffic should be routed through the loopback interface, which is CPU-bound. Since the server is 32-core and has plenty of RAM, loopback should be f

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-26 Thread Dmitry Antipov
On 11/26/20 9:05 AM, Strahil Nikolov wrote: Can you test by adding entries in /etc/hosts for the loopback ip (127.0.0.1) something like this: 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 server On both systems, my /etc/hosts is: 127.0.0.1 localhost localhos

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-25 Thread Dmitry Antipov
On 11/26/20 6:33 AM, Ewen Chan wrote: Dmitry: Is there a way to check and see if the GlusterFS write requests are being routed through the network interface? I am asking this because of your bricks/host definition as you showed below. In my test setup, all bricks and client workload (fio) ar

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-25 Thread Ewen Chan
on behalf of Strahil Nikolov Sent: November 25, 2020 12:42 PM To: Dmitry Antipov Cc: gluster-users Subject: Re: [Gluster-users] Poor performance on a server-class system vs. desktop Having the same performance on 2 very fast disks indicates that you are hitting a limit. You can start with

Re: [Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-25 Thread Strahil Nikolov
Having the same performance on 2 very fast disks indicates that you are hitting a limit. You can start with this article: https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/administration_guide/small_file_performance_enhancements Most probably increasing the performance.i

[Gluster-users] Poor performance on a server-class system vs. desktop

2020-11-25 Thread Dmitry Antipov
I'm trying to investigate the poor I/O performance results observed on a server-class system vs. the desktop-class one. The latter is an 8-core notebook with an NVMe disk. According to fio --name test --filename=XXX --bs=4k --rw=randwrite --ioengine=libaio --direct=1 \ --iodepth=128 --numjobs