Re: [ceph-users] Poor ceph cluster performance

2018-11-28 Thread Paul Emmerich
Cody :
>
> > And this exact problem was one of the reasons why we migrated
> > everything to PXE boot where the OS runs from RAM.
>
> Hi Paul,
>
> I totally agree with and admire your diskless approach. If I may ask,
> what kind of OS image do you use? 1GB footprint sounds really small.

It's based on Debian, because Debian makes live boot really easy with
squashfs + overlayfs.
We also have a half-finished CentOS/RHEL-based version somewhere, but
that requires way more RAM because it doesn't use overlayfs (or didn't
when we last checked; I guess we need to check RHEL 8 again).

The current image size is 400 MB plus 30 MB for the kernel and initrd, and it
comes with everything you need for Ceph. We don't even run aggressive
compression on the squashfs; it's just lzo.
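
If you want a rough idea of how such an image gets built, the generic
Debian live-boot recipe looks roughly like this (this is not our actual
build pipeline; the suite, package list and paths are just placeholders):

# debootstrap --variant=minbase stretch /srv/image
# chroot /srv/image apt-get install -y linux-image-amd64 live-boot ceph-osd ceph-mon
# mksquashfs /srv/image /srv/tftp/filesystem.squashfs -comp lzo

You then serve the kernel and initrd from /srv/image/boot over PXE and add
something like "boot=live fetch=http://pxe-server/filesystem.squashfs" to the
kernel command line; live-boot pulls the squashfs into RAM and stacks a
writable tmpfs overlay on top.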

You can test it for yourself in a VM: https://croit.io/croit-virtual-demo

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

>
> On Tue, Nov 27, 2018 at 1:53 PM Paul Emmerich  wrote:
> >
> > And this exact problem was one of the reasons why we migrated
> > everything to PXE boot where the OS runs from RAM.
> > That kind of failure is just the worst to debug...
> > Also, 1 GB of RAM is cheaper than a separate OS disk.
> >
> > --
> > Paul Emmerich
> >
> > Looking for help with your Ceph cluster? Contact us at https://croit.io
> >
> > croit GmbH
> > Freseniusstr. 31h
> > 81247 München
> > www.croit.io
> > Tel: +49 89 1896585 90
> >
> > Am Di., 27. Nov. 2018 um 19:22 Uhr schrieb Cody :
> > >
> > > Hi everyone,
> > >
> > > Many, many thanks to all of you!
> > >
> > > The root cause was a failed OS drive on one storage node. The
> > > server was responsive to ping, but I was unable to log in. After a reboot via
> > > IPMI, the docker daemon failed to start due to I/O errors and dmesg
> > > complained about the failing OS disk. I failed to catch the problem
> > > initially since 'ceph -s' kept showing HEALTH_OK and the cluster was
> > > "functional" despite the slow performance.
> > >
> > > I really appreciate all the tips and advice received from you all and
> > > learned a lot. I will carry your advice (e.g. using BlueStore,
> > > enterprise SSDs/HDDs, separating public and cluster traffic, etc.) into
> > > my next PoC round.
> > >
> > > Thank you very much!
> > >
> > > Best regards,
> > > Cody
> > >
> > > On Tue, Nov 27, 2018 at 6:31 AM Vitaliy Filippov  
> > > wrote:
> > > >
> > > > > CPU: 2 x E5-2603 @1.8GHz
> > > > > RAM: 16GB
> > > > Network: 1G port shared for Ceph public and cluster traffic
> > > > > Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
> > > > > OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)
> > > >
> > > > 0.84 MB/s sequential write is impossibly bad; it's not normal with any
> > > > kind of device, even with a 1G network. You probably have some kind of
> > > > problem in your setup - maybe the network RTT is very high, or maybe the
> > > > osd or mon nodes are shared with other running tasks and overloaded, or
> > > > maybe your disks are already dead... :))
> > > >
> > > > > As I moved on to test block devices, I got the following error message:
> > > > >
> > > > > # rbd map image01 --pool testbench --name client.admin
> > > >
> > > > You don't need to map it to run benchmarks, use `fio --ioengine=rbd`
> > > > (however you'll still need /etc/ceph/ceph.client.admin.keyring)
> > > >
> > > > --
> > > > With best regards,
> > > >Vitaliy Filippov
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Poor ceph cluster performance

2018-11-27 Thread Paul Emmerich
And this exact problem was one of the reasons why we migrated
everything to PXE boot where the OS runs from RAM.
That kind of failure is just the worst to debug...
Also, 1 GB of RAM is cheaper than a separate OS disk.

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

Am Di., 27. Nov. 2018 um 19:22 Uhr schrieb Cody :
>
> Hi everyone,
>
> Many, many thanks to all of you!
>
> The root cause was a failed OS drive on one storage node. The
> server was responsive to ping, but I was unable to log in. After a reboot via
> IPMI, the docker daemon failed to start due to I/O errors and dmesg
> complained about the failing OS disk. I failed to catch the problem
> initially since 'ceph -s' kept showing HEALTH_OK and the cluster was
> "functional" despite the slow performance.
>
> I really appreciate all the tips and advice received from you all and
> learned a lot. I will carry your advice (e.g. using BlueStore,
> enterprise SSDs/HDDs, separating public and cluster traffic, etc.) into
> my next PoC round.
>
> Thank you very much!
>
> Best regards,
> Cody
>
> On Tue, Nov 27, 2018 at 6:31 AM Vitaliy Filippov  wrote:
> >
> > > CPU: 2 x E5-2603 @1.8GHz
> > > RAM: 16GB
> > > Network: 1G port shared for Ceph public and cluster traffic
> > > Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
> > > OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)
> >
> > 0.84 MB/s sequential write is impossibly bad; it's not normal with any
> > kind of device, even with a 1G network. You probably have some kind of
> > problem in your setup - maybe the network RTT is very high, or maybe the osd
> > or mon nodes are shared with other running tasks and overloaded, or maybe
> > your disks are already dead... :))
> >
> > > As I moved on to test block devices, I got the following error message:
> > >
> > > # rbd map image01 --pool testbench --name client.admin
> >
> > You don't need to map it to run benchmarks, use `fio --ioengine=rbd`
> > (however you'll still need /etc/ceph/ceph.client.admin.keyring)
> >
> > --
> > With best regards,
> >Vitaliy Filippov
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Poor ceph cluster performance

2018-11-27 Thread Cody
Hi everyone,

Many, many thanks to all of you!

The root cause was a failed OS drive on one storage node. The
server was responsive to ping, but I was unable to log in. After a reboot via
IPMI, the docker daemon failed to start due to I/O errors and dmesg
complained about the failing OS disk. I failed to catch the problem
initially since 'ceph -s' kept showing HEALTH_OK and the cluster was
"functional" despite the slow performance.

I really appreciate all the tips and advice received from you all and
learned a lot. I will carry your advice (e.g. using BlueStore,
enterprise SSDs/HDDs, separating public and cluster traffic, etc.) into
my next PoC round.

Thank you very much!

Best regards,
Cody

On Tue, Nov 27, 2018 at 6:31 AM Vitaliy Filippov  wrote:
>
> > CPU: 2 x E5-2603 @1.8GHz
> > RAM: 16GB
> > Network: 1G port shared for Ceph public and cluster traffic
> > Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
> > OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)
>
> 0.84 MB/s sequential write is impossibly bad; it's not normal with any
> kind of device, even with a 1G network. You probably have some kind of
> problem in your setup - maybe the network RTT is very high, or maybe the osd
> or mon nodes are shared with other running tasks and overloaded, or maybe
> your disks are already dead... :))
>
> > As I moved on to test block devices, I got the following error message:
> >
> > # rbd map image01 --pool testbench --name client.admin
>
> You don't need to map it to run benchmarks, use `fio --ioengine=rbd`
> (however you'll still need /etc/ceph/ceph.client.admin.keyring)
>
> --
> With best regards,
>Vitaliy Filippov
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Poor ceph cluster performance

2018-11-27 Thread Vitaliy Filippov

CPU: 2 x E5-2603 @1.8GHz
RAM: 16GB
Network: 1G port shared for Ceph public and cluster traffic
Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)


0.84 MB/s sequential write is impossibly bad; it's not normal with any
kind of device, even with a 1G network. You probably have some kind of
problem in your setup - maybe the network RTT is very high, or maybe the osd or
mon nodes are shared with other running tasks and overloaded, or maybe your
disks are already dead... :))



As I moved on to test block devices, I got the following error message:

# rbd map image01 --pool testbench --name client.admin


You don't need to map it to run benchmarks, use `fio --ioengine=rbd`  
(however you'll still need /etc/ceph/ceph.client.admin.keyring)
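
For example, something along these lines (pool and image names taken from
your rbd command; this needs a fio build with rbd support):

# fio --ioengine=rbd --clientname=admin --pool=testbench --rbdname=image01 \
    --rw=write --bs=4M --numjobs=1 --iodepth=16 --runtime=60 --time_based \
    --name=rbd-seq-write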


--
With best regards,
  Vitaliy Filippov
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Poor ceph cluster performance

2018-11-27 Thread Darius Kasparavičius
Hi,


Most likely the issue is with your consumer-grade journal SSD. Run
this against your SSD to check how it performs: fio --filename=
--direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1
--runtime=60 --time_based --group_reporting --name=journal-test
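
For example, pointed at the journal SSD (the device path is just a
placeholder - and careful, writing to the raw device destroys whatever
is on it, so use a spare partition or a file on that SSD instead if the
journal is in use):

# fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 \
    --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test

Roughly speaking, a datacenter SSD with power-loss protection sustains
thousands of these sync writes per second, while consumer drives often
drop to a few hundred or less.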
On Tue, Nov 27, 2018 at 2:06 AM Cody  wrote:
>
> Hello,
>
> I have a Ceph cluster deployed together with OpenStack using TripleO.
> While the Ceph cluster shows a healthy status, its performance is
> painfully slow. After eliminating the possibility of network issues, I
> have zeroed in on the Ceph cluster itself, but have no experience in
> further debugging and tuning.
>
> The Ceph OSD part of the cluster uses 3 identical servers with the
> following specifications:
>
> CPU: 2 x E5-2603 @1.8GHz
> RAM: 16GB
> Network: 1G port shared for Ceph public and cluster traffic
> Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
> OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)
>
> This is not beefy enough in any way, but I am running it for a PoC only,
> with minimum utilization.
>
> Ceph-mon and ceph-mgr daemons are hosted on the OpenStack Controller
> nodes. Ceph-ansible version is 3.1 and is using Filestore with the
> non-collocated scenario (1 SSD for every 2 OSDs). Connection speed
> among Controllers, Computes, and OSD nodes can reach ~900Mbps, as tested
> using iperf.
>
> I followed the Red Hat Ceph 3 benchmarking procedure [1] and received
> the following results:
>
> Write Test:
>
> Total time run: 80.313004
> Total writes made:  17
> Write size: 4194304
> Object size:4194304
> Bandwidth (MB/sec): 0.846687
> Stddev Bandwidth:   0.320051
> Max bandwidth (MB/sec): 2
> Min bandwidth (MB/sec): 0
> Average IOPS:   0
> Stddev IOPS:0
> Max IOPS:   0
> Min IOPS:   0
> Average Latency(s): 66.6582
> Stddev Latency(s):  15.5529
> Max latency(s): 80.3122
> Min latency(s): 29.7059
>
> Sequential Read Test:
>
> Total time run:   25.951049
> Total reads made: 17
> Read size:4194304
> Object size:  4194304
> Bandwidth (MB/sec):   2.62032
> Average IOPS: 0
> Stddev IOPS:  0
> Max IOPS: 1
> Min IOPS: 0
> Average Latency(s):   24.4129
> Max latency(s):   25.9492
> Min latency(s):   0.117732
>
> Random Read Test:
>
> Total time run:   66.355433
> Total reads made: 46
> Read size:4194304
> Object size:  4194304
> Bandwidth (MB/sec):   2.77295
> Average IOPS: 0
> Stddev IOPS:  3
> Max IOPS: 27
> Min IOPS: 0
> Average Latency(s):   21.4531
> Max latency(s):   66.1885
> Min latency(s):   0.0395266
>
> Apparently, the results are pathetic...
>
> As I moved on to test block devices, I got the following error message:
>
> # rbd map image01 --pool testbench --name client.admin
> rbd: failed to add secret 'client.admin' to kernel
>
> Any suggestions on the above error and/or debugging would be greatly
> appreciated!
>
> Thank you very much to all.
>
> Cody
>
> [1] 
> https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/administration_guide/#benchmarking_performance
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Poor ceph cluster performance

2018-11-26 Thread Stefan Kooman
Quoting Cody (codeology@gmail.com):
> The Ceph OSD part of the cluster uses 3 identical servers with the
> following specifications:
> 
> CPU: 2 x E5-2603 @1.8GHz
> RAM: 16GB
> Network: 1G port shared for Ceph public and cluster traffic

This will hamper throughput a lot. 

> Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
> OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)

OK, let's stop here first: consumer-grade SSD. Percona did a nice
writeup about "fsync" speed on consumer-grade SSDs [1]. As I don't know
what drives you use, this might or might not be the issue.

> 
> This is not beefy enough in any way, but I am running it for a PoC only,
> with minimum utilization.
> 
> Ceph-mon and ceph-mgr daemons are hosted on the OpenStack Controller
> nodes. Ceph-ansible version is 3.1 and is using Filestore with the
> non-collocated scenario (1 SSD for every 2 OSDs). Connection speed
> among Controllers, Computes, and OSD nodes can reach ~900Mbps, as tested
> using iperf.

Why Filestore, if I may ask? I guess BlueStore with the BlueStore journal
(WAL/DB) on the SSD and data on SATA should give you better performance - if
the SSDs are suitable for the job at all.
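
With ceph-volume directly that layout would look something like this (device
names are just examples, one DB partition per OSD on the shared SSD;
ceph-ansible can be configured to do the equivalent):

# ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/sdd1
# ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/sdd2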

What version of Ceph are you using? Metrics can give you a lot of
insight. Did you take a look at those? For example, the Ceph mgr dashboard?

> 
> I followed the Red Hat Ceph 3 benchmarking procedure [1] and received
> the following results:
> 
> Write Test:
> 
> Total time run: 80.313004
> Total writes made:  17
> Write size: 4194304
> Object size:4194304
> Bandwidth (MB/sec): 0.846687
> Stddev Bandwidth:   0.320051
> Max bandwidth (MB/sec): 2
> Min bandwidth (MB/sec): 0
> Average IOPS:   0
> Stddev IOPS:0
> Max IOPS:   0
> Min IOPS:   0
> Average Latency(s): 66.6582
> Stddev Latency(s):  15.5529
> Max latency(s): 80.3122
> Min latency(s): 29.7059
> 
> Sequential Read Test:
> 
> Total time run:   25.951049
> Total reads made: 17
> Read size:4194304
> Object size:  4194304
> Bandwidth (MB/sec):   2.62032
> Average IOPS: 0
> Stddev IOPS:  0
> Max IOPS: 1
> Min IOPS: 0
> Average Latency(s):   24.4129
> Max latency(s):   25.9492
> Min latency(s):   0.117732
> 
> Random Read Test:
> 
> Total time run:   66.355433
> Total reads made: 46
> Read size:4194304
> Object size:  4194304
> Bandwidth (MB/sec):   2.77295
> Average IOPS: 0
> Stddev IOPS:  3
> Max IOPS: 27
> Min IOPS: 0
> Average Latency(s):   21.4531
> Max latency(s):   66.1885
> Min latency(s):   0.0395266
> 
> Apparently, the results are pathetic...
> 
> As I moved on to test block devices, I got the following error message:
> 
> # rbd map image01 --pool testbench --name client.admin
> rbd: failed to add secret 'client.admin' to kernel

What replication factor are you using?

Make sure you have the client.admin keyring on the node where you are issuing
this command. If the keyring is present where Ceph expects it to
be, then you can omit the --name client.admin. On a monitor node you can
extract the admin keyring with: ceph auth export client.admin. Put the output
of that in /etc/ceph/ceph.client.admin.keyring and this should work.
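
I.e. roughly, with the client host name being just an example:

(on a monitor node)
# ceph auth export client.admin > /etc/ceph/ceph.client.admin.keyring
# scp /etc/ceph/ceph.client.admin.keyring client-node:/etc/ceph/

(back on the client node)
# rbd map image01 --pool testbench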

> Any suggestions on the above error and/or debugging would be greatly
> appreciated!

Gr. Stefan

[1]:
https://www.percona.com/blog/2018/07/18/why-consumer-ssd-reviews-are-useless-for-database-performance-use-case/
> https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/administration_guide/#benchmarking_performance
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

-- 
| BIT BV   http://www.bit.nl/   Kamer van Koophandel 09090351
| GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Poor ceph cluster performance

2018-11-26 Thread Cody
Hello,

I have a Ceph cluster deployed together with OpenStack using TripleO.
While the Ceph cluster shows a healthy status, its performance is
painfully slow. After eliminating the possibility of network issues, I
have zeroed in on the Ceph cluster itself, but have no experience in
further debugging and tuning.

The Ceph OSD part of the cluster uses 3 identical servers with the
following specifications:

CPU: 2 x E5-2603 @1.8GHz
RAM: 16GB
Network: 1G port shared for Ceph public and cluster traffic
Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)

This is not beefy enough in any way, but I am running it for a PoC only,
with minimum utilization.

Ceph-mon and ceph-mgr daemons are hosted on the OpenStack Controller
nodes. Ceph-ansible version is 3.1 and is using Filestore with the
non-collocated scenario (1 SSD for every 2 OSDs). Connection speed
among Controllers, Computes, and OSD nodes can reach ~900Mbps, as tested
using iperf.
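
The iperf runs were roughly along these lines (the address is just an
example):

# iperf -s                         (on an OSD node)
# iperf -c 192.168.1.11 -t 30      (from a Controller/Compute node)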

I followed the Red Hat Ceph 3 benchmarking procedure [1] and received
the following results:
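
For reference, the guide's tests boil down to roughly these rados bench
invocations (run here against the same 'testbench' pool as the rbd test
below):

# rados bench -p testbench 10 write --no-cleanup
# rados bench -p testbench 10 seq
# rados bench -p testbench 10 rand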

Write Test:

Total time run: 80.313004
Total writes made:  17
Write size: 4194304
Object size:4194304
Bandwidth (MB/sec): 0.846687
Stddev Bandwidth:   0.320051
Max bandwidth (MB/sec): 2
Min bandwidth (MB/sec): 0
Average IOPS:   0
Stddev IOPS:0
Max IOPS:   0
Min IOPS:   0
Average Latency(s): 66.6582
Stddev Latency(s):  15.5529
Max latency(s): 80.3122
Min latency(s): 29.7059

Sequential Read Test:

Total time run:   25.951049
Total reads made: 17
Read size:4194304
Object size:  4194304
Bandwidth (MB/sec):   2.62032
Average IOPS: 0
Stddev IOPS:  0
Max IOPS: 1
Min IOPS: 0
Average Latency(s):   24.4129
Max latency(s):   25.9492
Min latency(s):   0.117732

Random Read Test:

Total time run:   66.355433
Total reads made: 46
Read size:4194304
Object size:  4194304
Bandwidth (MB/sec):   2.77295
Average IOPS: 0
Stddev IOPS:  3
Max IOPS: 27
Min IOPS: 0
Average Latency(s):   21.4531
Max latency(s):   66.1885
Min latency(s):   0.0395266

Apparently, the results are pathetic...

As I moved on to test block devices, I got the following error message:

# rbd map image01 --pool testbench --name client.admin
rbd: failed to add secret 'client.admin' to kernel

Any suggestions on the above error and/or debugging would be greatly
appreciated!

Thank you very much to all.

Cody

[1] 
https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/administration_guide/#benchmarking_performance
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com