Hi, thanks for the reply.


Off the top of my head, it is recommended to use 3 mons in
production. Also, for the 22 OSDs your number of PGs looks a bit low,
you should look at that.
I got it from http://ceph.com/docs/master/rados/operations/placement-groups/

(22 OSDs * 100) / 3 replicas = 733.33, rounded up to the next power of two = 1024 PGs
Please correct me if I'm wrong.
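
In case it helps, here is roughly how I arrived at 1024 (just a small sketch of the formula from that page; rounding up to a power of two is how I read the docs):

===========
# Sketch of the formula from the placement-groups docs:
# total PGs ~= (number of OSDs * 100) / replica count,
# rounded up to the nearest power of two.
import math

def recommended_pgs(num_osds, replicas, pgs_per_osd=100):
    raw = num_osds * pgs_per_osd / float(replicas)
    return 2 ** int(math.ceil(math.log(raw, 2)))

print(recommended_pgs(22, 3))   # 733.33... -> 1024
===========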

It will be 5 mons (on 6 hosts), but first we have to migrate some data off the servers that are currently in use.



"The performance of the cluster is poor" - this is too vague. What is
your current performance, what benchmarks have you tried, what is your
data workload and, most importantly, how is your cluster set up: what
disks, SSDs, network, RAM, etc.?

Please provide more information so that people can help you.

Andrei

Hardware information:
ceph15:
RAM: 4GB
Network: 4x 1GB NIC
OSD disks:
2x SATA Seagate ST31000524NS
2x SATA WDC WD1003FBYX-18Y7B0

ceph25:
RAM: 16GB
Network: 4x 1GB NIC
OSD disks:
2x SATA WDC WD7500BPKX-7
2x SATA WDC WD7500BPKX-2
2x SATA SSHD ST1000LM014-1EJ164

ceph30:
RAM: 16GB
Network: 4x 1GB NIC
OSD disks:
6x SATA SSHD ST1000LM014-1EJ164

ceph35:
RAM: 16GB
Network: 4x 1GB NIC
OSD disks:
6x SATA SSHD ST1000LM014-1EJ164


All journals are on the OSD disks themselves. Two NICs are for the backend/cluster network (10.20.4.0/22) and two NICs are for the frontend/public network (10.20.8.0/22).

We use this cluster as the storage backend for <100 VMs on KVM. I haven't run any benchmarks, but all of the VMs were migrated from Xen + GlusterFS (NFS). Before the migration every VM ran fine; now each VM hangs for a few seconds from time to time, and applications installed on the VMs take much longer to load. GlusterFS was running on 2 servers with 1x 1GB NIC and 2x8 WDC WD7500BPKX-7 disks.

I ran one recovery test: when a disk is marked out, recovery I/O runs at 150-200 MB/s, but all VMs hang until the recovery ends.
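
I haven't tried throttling recovery yet; if that is the right direction, something like the settings below is what I would experiment with (only a sketch, the values are a guess and not tested here):

===========
[osd]
# limit concurrent backfills per OSD
osd max backfills = 1
# limit active recovery operations per OSD
osd recovery max active = 1
# lower the priority of recovery ops relative to client I/O
osd recovery op priority = 1
===========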

The biggest load is on ceph35: IOPS on each disk are close to 150 and the CPU load is ~4-5.
On the other hosts the CPU load is <2 and the disks do about 120-130 IOPS.

Our ceph.conf

===========
[global]

fsid = a9d17295-62f2-46f6-8325-1cad7724e97f
mon initial members = ceph35, ceph30, ceph25, ceph15
mon host = 10.20.8.35, 10.20.8.30, 10.20.8.25, 10.20.8.15
public network = 10.20.8.0/22
cluster network = 10.20.4.0/22
osd journal size = 1024
filestore xattr use omap = true
osd pool default size = 3
osd pool default min size = 1
osd pool default pg num = 1024
osd pool default pgp num = 1024
osd crush chooseleaf type = 1
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
rbd default format = 2

##ceph35 osds
[osd.0]
cluster addr = 10.20.4.35
[osd.1]
cluster addr = 10.20.4.35
[osd.2]
cluster addr = 10.20.4.35
[osd.3]
cluster addr = 10.20.4.36
[osd.4]
cluster addr = 10.20.4.36
[osd.5]
cluster addr = 10.20.4.36

##ceph25 osds
[osd.6]
cluster addr = 10.20.4.25
public addr = 10.20.8.25
[osd.7]
cluster addr = 10.20.4.25
public addr = 10.20.8.25
[osd.8]
cluster addr = 10.20.4.25
public addr = 10.20.8.25
[osd.9]
cluster addr = 10.20.4.26
public addr = 10.20.8.26
[osd.10]
cluster addr = 10.20.4.26
public addr = 10.20.8.26
[osd.11]
cluster addr = 10.20.4.26
public addr = 10.20.8.26

##ceph15 osds
[osd.12]
cluster addr = 10.20.4.15
public addr = 10.20.8.15
[osd.13]
cluster addr = 10.20.4.15
public addr = 10.20.8.15
[osd.14]
cluster addr = 10.20.4.15
public addr = 10.20.8.15
[osd.15]
cluster addr = 10.20.4.16
public addr = 10.20.8.16

##ceph30 osds
[osd.16]
cluster addr = 10.20.4.30
public addr = 10.20.8.30
[osd.17]
cluster addr = 10.20.4.30
public addr = 10.20.8.30
[osd.18]
cluster addr = 10.20.4.30
public addr = 10.20.8.30
[osd.19]
cluster addr = 10.20.4.31
public addr = 10.20.8.31
[osd.20]
cluster addr = 10.20.4.31
public addr = 10.20.8.31
[osd.21]
cluster addr = 10.20.4.31
public addr = 10.20.8.31

[mon.ceph35]
host = ceph35
mon addr = 10.20.8.35:6789
[mon.ceph30]
host = ceph30
mon addr = 10.20.8.30:6789
[mon.ceph25]
host = ceph25
mon addr = 10.20.8.25:6789
[mon.ceph15]
host = ceph15
mon addr = 10.20.8.15:6789
================

Regards,
Mateusz

