[ceph-users] Poor performance with 2 million files flat

2014-02-15 Thread Samuel Terburg - Panther-IT BV

I have a performance problem I would like advice on.

I have the following sub-optimal setup:
* 2 Servers (WTFM008 WTFM009)
  * HP Proliant DL180
* SmartArray G6 P410 raid-controller
* 4x 500GB RAID5   (seq writes = 230MB/s)
* CentOS 6.5 x86_64
* 2,000,000 files (MS Word), with no directory structure
* Ceph
  * ceph-deploy mon create WTFM008 WTFM009
  * ceph-deploy mds create WTFM008 WTFM009
  * ceph-deploy osd activate WTFM008:/var/lib/ceph/osd/ceph-0 WTFM009:/var/lib/ceph/osd/ceph-1
    (the OSDs are using the root fs)
  * ceph-fuse /mnt/ceph
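
(For completeness, a sketch of the assumed OSD deployment steps; a directory-backed
OSD is normally prepared before it is activated, and "ceph osd tree" can confirm
that both OSDs came up:)

    ceph-deploy osd prepare  WTFM008:/var/lib/ceph/osd/ceph-0 WTFM009:/var/lib/ceph/osd/ceph-1
    ceph-deploy osd activate WTFM008:/var/lib/ceph/osd/ceph-0 WTFM009:/var/lib/ceph/osd/ceph-1
    ceph osd tree        # should list osd.0 on WTFM008 and osd.1 on WTFM009, both "up"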

I am currently trying to copy 2 million MS Word documents into Ceph.
When I started, it was copying about 10 files per second.
Now, 1 week later, it has done about 500,000 files and has slowed down
to 1 file per 10 seconds.


How can I improve this terrible performance?
* The hardware is a fixed configuration; I cannot add (SSD) disks or change
  the RAID setup.

* I could not find the CephFS kernel module, so I had to use ceph-fuse.
* I could have started with a degraded setup (1 OSD) for the initial load;
  would that have helped performance? (Ceph would not have had to do the
  distribution part.) See the sketch after this list.

* There is no load on the systems at all (no CPU, no memory, no disk I/O).
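
(One way to approximate the 1-OSD idea without pulling an OSD would be to drop the
replication factor to 1 for the bulk load and raise it again afterwards; a minimal
sketch, assuming the default "data" and "metadata" pools used by CephFS:)

    # write only a single replica during the initial bulk load
    ceph osd pool set data size 1
    ceph osd pool set metadata size 1

    # ... run the bulk copy into /mnt/ceph ...

    # restore two replicas afterwards; Ceph backfills the second copy in the background
    ceph osd pool set data size 2
    ceph osd pool set metadata size 2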

Below is my crush map.


Regards,


Samuel Terburg
Panther-IT BV



# begin crush map

# devices
device 0 osd.0
device 1 osd.1

# types
type 0 osd
type 1 host
type 2 rack
type 3 row
type 4 room
type 5 datacenter
type 6 root

# buckets
host WTFM008 {
   id -2    # do not change unnecessarily
   # weight 1.340
   alg straw
   hash 0   # rjenkins1
   item osd.0 weight 1.340
}
host WTFM009 {
   id -3    # do not change unnecessarily
   # weight 1.340
   alg straw
   hash 0   # rjenkins1
   item osd.1 weight 1.340
}
root default {
   id -1    # do not change unnecessarily
   # weight 2.680
   alg straw
   hash 0   # rjenkins1
   item WTFM008 weight 1.340
   item WTFM009 weight 1.340
}

# rules
rule data {
   ruleset 0
   type replicated
   min_size 1
   max_size 10
   step take default
   step chooseleaf firstn 0 type host
   step emit
}
rule metadata {
   ruleset 1
   type replicated
   min_size 1
   max_size 10
   step take default
   step chooseleaf firstn 0 type host
   step emit
}
rule rbd {
   ruleset 2
   type replicated
   min_size 1
   max_size 10
   step take default
   step chooseleaf firstn 0 type host
   step emit
}

# end crush map



# ceph -w
cluster 4f7bcb26-0cee-4472-abca-c200a999b686
 health HEALTH_OK
 monmap e1: 2 mons at {WTFM008=192.168.0.1:6789/0,WTFM009=192.168.0.2:6789/0},
  election epoch 4, quorum 0,1 WTFM008,WTFM009
 mdsmap e5: 1/1/1 up {0=WTFM008=up:active}, 1 up:standby
 osdmap e14: 2 osds: 2 up, 2 in
  pgmap v151668: 192 pgs, 3 pools, 31616 MB data, 956 kobjects
        913 GB used, 1686 GB / 2738 GB avail
        192 active+clean
  client io 40892 kB/s rd, 7370 B/s wr, 1 op/s



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Poor performance with 2 million files flat

2014-02-15 Thread Sage Weil
On Sat, 15 Feb 2014, Samuel Terburg - Panther-IT BV wrote:
 I have a performance problem I would like advice on.
 [...]
 I am currently trying to copy 2 million MS Word documents into Ceph.
 When I started, it was copying about 10 files per second.
 Now, 1 week later, it has done about 500,000 files and has slowed down to
 1 file per 10 seconds.

 How can I improve this terrible performance?

You probably need to add

mds frag = true

in the [mds] section
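
A sketch of how that would look (assuming the usual /etc/ceph/ceph.conf on the
MDS hosts, followed by an MDS restart to pick the setting up):

    [mds]
        mds frag = true

    # then, with the sysvinit scripts on CentOS 6:
    service ceph restart mds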

sage


