Oh duh…  OK, then given a 4+4 erasure coding scheme, 14400/8 is 1800, so try 
2048.
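
For reference, the arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the target of 100 PGs per OSD is an assumption inferred from the 14400 figure (144 OSDs x 100) used in this thread.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (minimum 1)."""
    if n < 1:
        return 1
    return 1 << (n - 1).bit_length()


def suggested_pg_num(num_osds: int, k: int, m: int, pgs_per_osd: int = 100) -> int:
    """Rule of thumb from this thread: (OSDs * PGs per OSD) / (k + m),
    rounded up to the next power of two.

    pgs_per_osd=100 is an assumption, not a value read from the cluster.
    """
    # Ceiling division without floating point.
    target = -(-(num_osds * pgs_per_osd) // (k + m))
    return next_power_of_two(target)


# 144 OSDs with a 4+4 profile: 14400 / 8 = 1800, rounded up to 2048.
print(suggested_pg_num(144, 4, 4))
```

With the numbers from this thread the function returns 2048, which matches the suggestion above.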

-don-

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Don 
Doerner
Sent: 04 March, 2015 12:14
To: Kyle Hutson; Ceph Users
Subject: Re: [ceph-users] New EC pool undersized

In this case, that number means that there is not an OSD that can be assigned.  
What are the k and m values for your erasure-coded pool?  You'll need approximately 
(14400/(k+m)) PGs, rounded up to the next power of 2…

-don-

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Kyle 
Hutson
Sent: 04 March, 2015 12:06
To: Ceph Users
Subject: [ceph-users] New EC pool undersized

Last night I blew away my previous ceph configuration (this environment is 
pre-production) and have 0.87.1 installed. I've manually edited the crushmap so 
it now looks like 
https://dpaste.de/OLEa

I currently have 144 OSDs on 8 nodes.

After increasing pg_num and pgp_num to a more suitable 1024 (due to the high 
number of OSDs), everything looked happy.
So, now I'm trying to play with an erasure-coded pool.
I did:
ceph osd erasure-code-profile set ec44profile k=4 m=4 ruleset-failure-domain=rack
ceph osd pool create ec44pool 8192 8192 erasure ec44profile

After settling for a bit 'ceph status' gives
    cluster 196e5eb8-d6a7-4435-907e-ea028e946923
     health HEALTH_WARN 7 pgs degraded; 7 pgs stuck degraded; 7 pgs stuck 
unclean; 7 pgs stuck undersized; 7 pgs undersized
     monmap e1: 4 mons at 
{hobbit01=10.5.38.1:6789/0,hobbit02=10.5.38.2:6789/0,hobbit13=10.5.38.13:6789/0,hobbit14=10.5.38.14:6789/0},
 election epoch 6, quorum 0,1,2,3 hobbit01,hobbit02,hobbit13,hobbit14
     osdmap e409: 144 osds: 144 up, 144 in
      pgmap v6763: 12288 pgs, 2 pools, 0 bytes data, 0 objects
            90598 MB used, 640 TB / 640 TB avail
                   7 active+undersized+degraded
               12281 active+clean

So to troubleshoot the undersized pgs, I issued 'ceph pg dump_stuck'
ok
pg_stat  objects  mip  degr  misp  unf  bytes  log  disklog  state  state_stamp  v  reported  up  up_primary  acting  acting_primary  last_scrub  scrub_stamp  last_deep_scrub  deep_scrub_stamp
1.d77  0  0  0  0  0  0  0  0  active+undersized+degraded  2015-03-04 11:33:57.502849  0'0  408:12  [15,95,58,73,52,31,116,2147483647]  15  [15,95,58,73,52,31,116,2147483647]  15  0'0  2015-03-04 11:33:42.100752  0'0  2015-03-04 11:33:42.100752
1.10fa  0  0  0  0  0  0  0  0  active+undersized+degraded  2015-03-04 11:34:29.362554  0'0  408:12  [23,12,99,114,132,53,56,2147483647]  23  [23,12,99,114,132,53,56,2147483647]  23  0'0  2015-03-04 11:33:42.168571  0'0  2015-03-04 11:33:42.168571
1.1271  0  0  0  0  0  0  0  0  active+undersized+degraded  2015-03-04 11:33:48.795742  0'0  408:12  [135,112,69,4,22,95,2147483647,83]  135  [135,112,69,4,22,95,2147483647,83]  135  0'0  2015-03-04 11:33:42.139555  0'0  2015-03-04 11:33:42.139555
1.2b5  0  0  0  0  0  0  0  0  active+undersized+degraded  2015-03-04 11:34:32.189738  0'0  408:12  [11,115,139,19,76,52,94,2147483647]  11  [11,115,139,19,76,52,94,2147483647]  11  0'0  2015-03-04 11:33:42.079673  0'0  2015-03-04 11:33:42.079673
1.7ae  0  0  0  0  0  0  0  0  active+undersized+degraded  2015-03-04 11:34:26.848344  0'0  408:12  [27,5,132,119,94,56,52,2147483647]  27  [27,5,132,119,94,56,52,2147483647]  27  0'0  2015-03-04 11:33:42.109832  0'0  2015-03-04 11:33:42.109832
1.1a97  0  0  0  0  0  0  0  0  active+undersized+degraded  2015-03-04 11:34:25.457454  0'0  408:12  [20,53,14,54,102,118,2147483647,72]  20  [20,53,14,54,102,118,2147483647,72]  20  0'0  2015-03-04 11:33:42.833850  0'0  2015-03-04 11:33:42.833850
1.10a6  0  0  0  0  0  0  0  0  active+undersized+degraded  2015-03-04 11:34:30.059936  0'0  408:12  [136,22,4,2147483647,72,52,101,55]  136  [136,22,4,2147483647,72,52,101,55]  136  0'0  2015-03-04 11:33:42.125871  0'0  2015-03-04 11:33:42.125871

This appears to have a number on all these (2147483647) that is way out of line 
from what I would expect.
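
As a side note on that number: 2147483647 is 2^31 - 1 (0x7fffffff), which Ceph uses as the CRUSH_ITEM_NONE placeholder when CRUSH could not map an OSD to that slot. The sketch below is a hypothetical helper, not a Ceph tool, that finds the unmapped shard position in a few of the up sets copied from the dump above.

```python
# 0x7fffffff == 2147483647; Ceph's placeholder for "no OSD mapped to this slot".
CRUSH_ITEM_NONE = 0x7FFFFFFF


def missing_shards(up_set):
    """Return the shard positions in an up/acting set that have no OSD assigned."""
    return [i for i, osd in enumerate(up_set) if osd == CRUSH_ITEM_NONE]


# Up sets copied from the 'ceph pg dump_stuck' output in this message.
up_sets = {
    "1.d77": [15, 95, 58, 73, 52, 31, 116, 2147483647],
    "1.1271": [135, 112, 69, 4, 22, 95, 2147483647, 83],
    "1.10a6": [136, 22, 4, 2147483647, 72, 52, 101, 55],
}

for pgid, up in up_sets.items():
    print(pgid, "unmapped shard positions:", missing_shards(up))
```

Each of these PGs has exactly one of its eight (k+m) slots unfilled, which is why they are reported as undersized.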

Thoughts?
