So it sounds like I should figure out at how many nodes I need to
increase pg_num to 4096, and again for 8192, and make those increases
incrementally as I add more hosts, correct?
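
(For reference only – my own sketch, not from the thread – the increase itself
would presumably look something like this, assuming the pool is still named
ec44pool; note that pg_num can only ever be raised, as the EEXIST error
further down shows:

ceph osd pool set ec44pool pg_num 4096
ceph osd pool set ec44pool pgp_num 4096
)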

On Wed, Mar 4, 2015 at 3:04 PM, Don Doerner <don.doer...@quantum.com> wrote:

>  Sorry, I missed your other questions, down at the bottom.  See here
> <http://ceph.com/docs/master/rados/operations/placement-groups/> (look
> for “number of replicas for replicated pools or the K+M sum for erasure
> coded pools”) for the formula; with 384 OSDs that's 38400/8 = 4800, which
> rounds up to the next power of 2, i.e. 8192.
>
>
>
> The thing is, you’ve got to think about how many ways you can form
> combinations of 8 unique OSDs (without replacement) that match your failure
> domain rules.  If you’ve only got 8 hosts, and your failure domain is
> host, that severely limits this number.  And I have read that too many
> isn’t good either – a serialization issue, I believe.
>
>
>
> -don-
>
>
>
> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
> Of *Don Doerner
> *Sent:* 04 March, 2015 12:49
> *To:* Kyle Hutson
> *Cc:* ceph-users@lists.ceph.com
>
> *Subject:* Re: [ceph-users] New EC pool undersized
>
>
>
> Hmmm, I just struggled through this myself.  How many racks do you have?  If
> you have no more than 8, you might want to make your failure domain smaller –
> i.e., maybe host?  That, at least, would allow you to debug the situation…
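>
> (Not part of the original message – a sketch of what that smaller failure
> domain might look like, reusing the profile name that appears later in this
> thread:
>
> # hypothetical: same 4+4 profile, but with host as the failure domain
> # (remove the old profile first if one with this name already exists)
> ceph osd erasure-code-profile rm ec44profile
> ceph osd erasure-code-profile set ec44profile k=4 m=4 ruleset-failure-domain=host
> )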
>
>
>
> -don-
>
>
>
> *From:* Kyle Hutson [mailto:kylehut...@ksu.edu]
> *Sent:* 04 March, 2015 12:43
> *To:* Don Doerner
> *Cc:* Ceph Users
> *Subject:* Re: [ceph-users] New EC pool undersized
>
>
>
> It wouldn't let me simply change the pg_num, giving
>
> Error EEXIST: specified pg_num 2048 <= current 8192
>
>
>
> But that's not a big deal; I just deleted the pool and recreated it with
> 'ceph osd pool create ec44pool 2048 2048 erasure ec44profile'
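>
> (A rough reconstruction of that delete-and-recreate sequence – my sketch,
> not the exact commands from the thread:
>
> # pool name is given twice to confirm the deletion
> ceph osd pool delete ec44pool ec44pool --yes-i-really-really-mean-it
> ceph osd pool create ec44pool 2048 2048 erasure ec44profile
> )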
>
> ...and the result is quite similar: 'ceph status' is now
>
> ceph status
>
>     cluster 196e5eb8-d6a7-4435-907e-ea028e946923
>
>      health HEALTH_WARN 4 pgs degraded; 4 pgs stuck unclean; 4 pgs
> undersized
>
>      monmap e1: 4 mons at {hobbit01=
> 10.5.38.1:6789/0,hobbit02=10.5.38.2:6789/0,hobbit13=10.5.38.13:6789/0,hobbit14=10.5.38.14:6789/0},
> election epoch 6, quorum 0,1,2,3 hobbit01,hobbit02,hobbit13,hobbit14
>
>      osdmap e412: 144 osds: 144 up, 144 in
>
>       pgmap v6798: 6144 pgs, 2 pools, 0 bytes data, 0 objects
>
>             90590 MB used, 640 TB / 640 TB avail
>
>                    4 active+undersized+degraded
>
>                 6140 active+clean
>
>
>
> 'ceph pg dump_stuck' results in
>
> ok
>
> pg_stat   objects   mip  degr misp unf  bytes     log  disklog     state
> state_stamp    v    reported  up   up_primary     acting    acting_primary
> last_scrub     scrub_stamp     last_deep_scrub     deep_scrub_stamp
>
> 2.296     0    0    0    0    0    0    0    0
> active+undersized+degraded    2015-03-04 14:33:26.672224     0'0  412:9
> [5,55,91,2147483647,83,135,53,26]  5     [5,55,91,2147483647,83,135,53,26]
> 5    0'0  2015-03-04 14:33:15.649911     0'0  2015-03-04 14:33:15.649911
>
> 2.69c     0    0    0    0    0    0    0    0
> active+undersized+degraded    2015-03-04 14:33:24.984802     0'0  412:9
> [93,134,1,74,112,28,2147483647,60] 93     [93,134,1,74,112,28,2147483647
> ,60] 93   0'0  2015-03-04 14:33:15.695747     0'0  2015-03-04
> 14:33:15.695747
>
> 2.36d     0    0    0    0    0    0    0    0
> active+undersized+degraded    2015-03-04 14:33:21.937620     0'0  412:9
> [12,108,136,104,52,18,63,2147483647]    12   [12,108,136,104,52,18,63,
> 2147483647]    12   0'0  2015-03-04 14:33:15.652480    0'0  2015-03-04
> 14:33:15.652480
>
> 2.5f7     0    0    0    0    0    0    0    0
> active+undersized+degraded    2015-03-04 14:33:26.169242     0'0  412:9
> [94,128,73,22,4,60,2147483647,113] 94     [94,128,73,22,4,60,2147483647
> ,113] 94   0'0  2015-03-04 14:33:15.687695     0'0  2015-03-04
> 14:33:15.687695
>
>
>
> I do have questions for you, even at this point, though.
>
> 1) Where did you find the formula (14400/(k+m))?
>
> 2) I was really trying to size this for when it goes to production, at
> which point it may have as many as 384 OSDs. Doesn't that imply I should
> have even more pgs?
>
>
>
> On Wed, Mar 4, 2015 at 2:15 PM, Don Doerner <don.doer...@quantum.com>
> wrote:
>
> Oh duh…  OK, then given a 4+4 erasure coding scheme, 14400/8 is 1800, so
> try 2048.
>
>
>
> -don-
>
>
>
> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
> Of *Don Doerner
> *Sent:* 04 March, 2015 12:14
> *To:* Kyle Hutson; Ceph Users
> *Subject:* Re: [ceph-users] New EC pool undersized
>
>
>
> In this case, that number means that there is no OSD that can be
> assigned.  What are your k and m for your erasure-coded pool?  You’ll need
> approximately (14400/(k+m)) PGs, rounded up to the next power of 2…
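>
> (A quick sketch of that arithmetic – my own illustration, assuming the
> usual ~100 PGs per OSD guideline from the placement-groups documentation,
> which is where the 14400 for 144 OSDs comes from:
>
> osds=144; k=4; m=4
> pgs=$(( osds * 100 / (k + m) ))   # 1800
> p=1; while [ $p -lt $pgs ]; do p=$(( p * 2 )); done
> echo $p                           # 2048 – the next power of 2
> )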
>
>
>
> -don-
>
>
>
> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
> Of *Kyle Hutson
> *Sent:* 04 March, 2015 12:06
> *To:* Ceph Users
> *Subject:* [ceph-users] New EC pool undersized
>
>
>
> Last night I blew away my previous ceph configuration (this environment is
> pre-production) and have 0.87.1 installed. I've manually edited the
> crushmap so it now looks like https://dpaste.de/OLEa
>
>
>
> I currently have 144 OSDs on 8 nodes.
>
>
>
> After increasing pg_num and pgp_num to a more suitable 1024 (due to the
> high number of OSDs), everything looked happy.
>
> So, now I'm trying to play with an erasure-coded pool.
>
> I did:
>
> ceph osd erasure-code-profile set ec44profile k=4 m=4
> ruleset-failure-domain=rack
>
> ceph osd pool create ec44pool 8192 8192 erasure ec44profile
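>
> (Not in the original message, but a couple of read-only commands that could
> be used afterwards to confirm what the profile and its CRUSH rule look like:
>
> ceph osd erasure-code-profile get ec44profile
> ceph osd crush rule dump
> )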
>
>
>
> After settling for a bit 'ceph status' gives
>
>     cluster 196e5eb8-d6a7-4435-907e-ea028e946923
>
>      health HEALTH_WARN 7 pgs degraded; 7 pgs stuck degraded; 7 pgs stuck
> unclean; 7 pgs stuck undersized; 7 pgs undersized
>
>      monmap e1: 4 mons at {hobbit01=
> 10.5.38.1:6789/0,hobbit02=10.5.38.2:6789/0,hobbit13=10.5.38.13:6789/0,hobbit14=10.5.38.14:6789/0},
> election epoch 6, quorum 0,1,2,3 hobbit01,hobbit02,hobbit13,hobbit14
>
>      osdmap e409: 144 osds: 144 up, 144 in
>
>       pgmap v6763: 12288 pgs, 2 pools, 0 bytes data, 0 objects
>
>             90598 MB used, 640 TB / 640 TB avail
>
>                    7 active+undersized+degraded
>
>                12281 active+clean
>
>
>
> So to troubleshoot the undersized pgs, I issued 'ceph pg dump_stuck'
>
> ok
>
> pg_stat   objects   mip  degr misp unf  bytes     log  disklog
> state     state_stamp    v    reported  up   up_primary     acting
> acting_primary last_scrub     scrub_stamp     last_deep_scrub
> deep_scrub_stamp
>
> 1.d77     0    0    0    0    0    0    0    0
> active+undersized+degraded    2015-03-04 11:33:57.502849     0'0  408:12
> [15,95,58,73,52,31,116,2147483647] 15     [15,95,58,73,52,31,116,
> 2147483647] 15   0'0  2015-03-04 11:33:42.100752     0'0  2015-03-04
> 11:33:42.100752
>
> 1.10fa    0    0    0    0    0    0    0    0
> active+undersized+degraded    2015-03-04 11:34:29.362554     0'0  408:12
> [23,12,99,114,132,53,56,2147483647]     23   [23,12,99,114,132,53,56,
> 2147483647]     23   0'0  2015-03-04 11:33:42.168571    0'0  2015-03-04
> 11:33:42.168571
>
> 1.1271    0    0    0    0    0    0    0    0
> active+undersized+degraded    2015-03-04 11:33:48.795742     0'0  408:12
> [135,112,69,4,22,95,2147483647,83] 135     [135,112,69,4,22,95,2147483647,83]
> 135  0'0  2015-03-04 11:33:42.139555     0'0  2015-03-04 11:33:42.139555
>
> 1.2b5     0    0    0    0    0    0    0    0
> active+undersized+degraded    2015-03-04 11:34:32.189738     0'0  408:12
> [11,115,139,19,76,52,94,2147483647]     11   [11,115,139,19,76,52,94,
> 2147483647]     11   0'0  2015-03-04 11:33:42.079673    0'0  2015-03-04
> 11:33:42.079673
>
> 1.7ae     0    0    0    0    0    0    0    0
> active+undersized+degraded    2015-03-04 11:34:26.848344     0'0  408:12
> [27,5,132,119,94,56,52,2147483647] 27     [27,5,132,119,94,56,52,
> 2147483647] 27   0'0  2015-03-04 11:33:42.109832     0'0  2015-03-04
> 11:33:42.109832
>
> 1.1a97    0    0    0    0    0    0    0    0
> active+undersized+degraded    2015-03-04 11:34:25.457454     0'0  408:12
> [20,53,14,54,102,118,2147483647,72]     20   [20,53,14,54,102,118,
> 2147483647,72]     20   0'0  2015-03-04 11:33:42.833850    0'0
> 2015-03-04 11:33:42.833850
>
> 1.10a6    0    0    0    0    0    0    0    0
> active+undersized+degraded    2015-03-04 11:34:30.059936     0'0  408:12
> [136,22,4,2147483647,72,52,101,55] 136     [136,22,4,2147483647,72,52,101,55]
> 136  0'0  2015-03-04 11:33:42.125871     0'0  2015-03-04 11:33:42.125871
>
>
>
> All of these contain a number (2147483647) that is way out of line from
> what I would expect.
>
>
>
> Thoughts?
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
