[ceph-users] Pools do not respond
Hi folks,
I am following the test installation step by step and checking some configuration before trying to deploy a production cluster. I now have a healthy cluster with 3 MONs + 4 OSDs. I have created one pool containing all the OSDs, plus two more pools: one for two of the servers and the other for the remaining two. The general pool works fine (I can create images and mount them on remote machines), but the other two do not work (the commands rados put, or rbd ls pool, hang forever).

This is the tree:

[ceph@cephadm ceph-cloud]$ sudo ceph osd tree
# id    weight  type name               up/down reweight
-7      5.4     root 4x1GbFCnlSAS
-3      2.7         host node04
1       2.7             osd.1           up      1
-4      2.7         host node03
2       2.7             osd.2           up      1
-6      8.1     root 4x4GbFCnlSAS
-5      5.4         host node01
3       2.7             osd.3           up      1
4       2.7             osd.4           up      1
-2      2.7         host node04
0       2.7             osd.0           up      1
-1      13.5    root default
-2      2.7         host node04
0       2.7             osd.0           up      1
-3      2.7         host node04
1       2.7             osd.1           up      1
-4      2.7         host node03
2       2.7             osd.2           up      1
-5      5.4         host node01
3       2.7             osd.3           up      1
4       2.7             osd.4           up      1

And this is the crushmap:
...
root 4x4GbFCnlSAS {
        id -6           # do not change unnecessarily
        alg straw
        hash 0          # rjenkins1
        item node01 weight 5.400
        item node04 weight 2.700
}
root 4x1GbFCnlSAS {
        id -7           # do not change unnecessarily
        alg straw
        hash 0          # rjenkins1
        item node04 weight 2.700
        item node03 weight 2.700
}

# rules
rule 4x4GbFCnlSAS {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take 4x4GbFCnlSAS
        step choose firstn 0 type host
        step emit
}
rule 4x1GbFCnlSAS {
        ruleset 2
        type replicated
        min_size 1
        max_size 10
        step take 4x1GbFCnlSAS
        step choose firstn 0 type host
        step emit
}
...

I did, of course, set the crush rulesets:

sudo ceph osd pool set cloud-4x1GbFCnlSAS crush_ruleset 2
sudo ceph osd pool set cloud-4x4GbFCnlSAS crush_ruleset 1

but something seems to be wrong (4x4GbFCnlSAS.pool is a 512 MB file):

sudo rados -p cloud-4x1GbFCnlSAS put 4x4GbFCnlSAS.object 4x4GbFCnlSAS.pool
!! HANGS forever !!

From the ceph client the same thing happens:

rbd ls cloud-4x1GbFCnlSAS
!! HANGS forever !!

[root@cephadm ceph-cloud]# ceph osd map cloud-4x1GbFCnlSAS 4x1GbFCnlSAS.object
osdmap e49 pool 'cloud-4x1GbFCnlSAS' (3) object '4x1GbFCnlSAS.object' -> pg 3.114ae7a9 (3.29) -> up ([], p-1) acting ([], p-1)

Any idea what I am doing wrong?

Thanks in advance, I

Bertrand Russell: "The problem with the world is that the stupid are sure of everything and the intelligent are full of doubt."
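The symptom above can also be checked without waiting on a hanging client: the ruleset assignment and the CRUSH rule itself can be exercised offline. A rough sketch (file names are purely illustrative):

# confirm which crush_ruleset each pool actually uses
ceph osd dump | grep '^pool'

# pull the live CRUSH map and test ruleset 2 for 2 replicas
ceph osd getcrushmap -o crushmap.bin
crushtool -i crushmap.bin --test --rule 2 --num-rep 2 --show-mappings
crushtool -i crushmap.bin --test --rule 2 --num-rep 2 --show-bad-mappings

With the "step choose ... type host" rules above, the test should report mappings with no OSDs in them, matching the empty "up ([], p-1) acting ([], p-1)" sets that ceph osd map prints.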
Re: [ceph-users] Pools do not respond
The PG in question isn't being properly mapped to any OSDs. There's a good chance that those trees (with 3 OSDs in 2 hosts) aren't going to map well anyway, but the immediate problem should resolve itself if you change the choose to chooseleaf in your rules.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Thu, Jul 3, 2014 at 4:17 AM, Iban Cabrillo cabri...@ifca.unican.es wrote:
> [...]
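In practice that change is a one-line edit in each of the two rules. A rough sketch of the full edit cycle, with file names purely illustrative:

# decompile the current map, edit it, recompile, and inject it back
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# in crushmap.txt, in both rules, change
#     step choose firstn 0 type host
# to
#     step chooseleaf firstn 0 type host
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new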
Re: [ceph-users] Pools do not respond
Hi Gregory,
Thanks a lot, I am beginning to understand how Ceph works. I have added a couple of OSD servers and balanced the disks between them.

[ceph@cephadm ceph-cloud]$ sudo ceph osd tree
# id    weight  type name               up/down reweight
-7      16.2    root 4x1GbFCnlSAS
-9      5.4         host node02
7       2.7             osd.7           up      1
8       2.7             osd.8           up      1
-4      5.4         host node03
2       2.7             osd.2           up      1
9       2.7             osd.9           up      1
-3      5.4         host node04
1       2.7             osd.1           up      1
10      2.7             osd.10          up      1
-6      16.2    root 4x4GbFCnlSAS
-5      5.4         host node01
3       2.7             osd.3           up      1
4       2.7             osd.4           up      1
-8      5.4         host node02
5       2.7             osd.5           up      1
6       2.7             osd.6           up      1
-2      5.4         host node04
0       2.7             osd.0           up      1
11      2.7             osd.11          up      1
-1      32.4    root default
-2      5.4         host node04
0       2.7             osd.0           up      1
11      2.7             osd.11          up      1
-3      5.4         host node04
1       2.7             osd.1           up      1
10      2.7             osd.10          up      1
-4      5.4         host node03
2       2.7             osd.2           up      1
9       2.7             osd.9           up      1
-5      5.4         host node01
3       2.7             osd.3           up      1
4       2.7             osd.4           up      1
-8      5.4         host node02
5       2.7             osd.5           up      1
6       2.7             osd.6           up      1
-9      5.4         host node02
7       2.7             osd.7           up      1
8       2.7             osd.8           up      1

The idea is to have at least 4 servers, with 3 disks (2.7 TB, SAN attached) per server, per pool. Now I have to adjust pg_num and pgp_num and run some performance tests (a rough sketch of those commands is below).

PS: what is the difference between choose and chooseleaf?

Thanks a lot!

2014-07-03 19:06 GMT+02:00 Gregory Farnum g...@inktank.com:
> [...]
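For the pg_num / pgp_num adjustment mentioned above, the commands are per pool. The values here are placeholders only (the right count depends on how many OSDs serve the pool and on its replica size), and pgp_num has to be raised after pg_num:

ceph osd pool set cloud-4x1GbFCnlSAS pg_num 512
ceph osd pool set cloud-4x1GbFCnlSAS pgp_num 512
ceph osd pool set cloud-4x4GbFCnlSAS pg_num 512
ceph osd pool set cloud-4x4GbFCnlSAS pgp_num 512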
Re: [ceph-users] Pools do not respond
On Thu, Jul 3, 2014 at 11:17 AM, Iban Cabrillo cabri...@ifca.unican.es wrote:
> [...]
> PS: what is the difference between choose and chooseleaf?

choose instructs the system to choose N different buckets of the given type (where N is specified by the "firstn 0" block to be the replication level, but could be 1: "firstn 1", or replication - 1: "firstn -1"). Since you're saying "choose firstn 0 type host", that's what you're getting out, and then you're emitting those 3 (by default) hosts. But they aren't valid devices (OSDs), so it's not a valid mapping; you're supposed to then say "choose firstn 1 device" or similar.

chooseleaf instead tells the system to choose N different buckets, and then descend from each of those buckets to a leaf (device) in the CRUSH hierarchy. It's a little more robust against different mappings and failure conditions, so generally a better choice than choose if you don't need the finer granularity provided by choose.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
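Put concretely, the two step sequences Greg describes would look roughly like this for the 4x1GbFCnlSAS root (sketch only; the leaf type is named "osd" in default CRUSH maps, which is what Greg's "device" refers to):

# (a) choose: pick N host buckets, then explicitly pick 1 leaf (OSD)
#     inside each of them
step take 4x1GbFCnlSAS
step choose firstn 0 type host
step choose firstn 1 type osd
step emit

# (b) chooseleaf: pick N host buckets and descend to a leaf in each,
#     in a single step
step take 4x1GbFCnlSAS
step chooseleaf firstn 0 type host
step emit

Both are meant to end up with one OSD on each of N distinct hosts; (b) is the form recommended in this thread.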