James,

If your replication factor is 3, then for every 1 GB of data you add, your available space (GB avail) will decrease by 3 GB.

Cary
-Dynamic
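For a rough sense of the numbers, here is a minimal sketch of that arithmetic using the figures from James's "ceph df" output below (an upper bound only; journals, filesystem overhead and full ratios will eat into it):

# raw cluster capacity divided by the replication factor ~= usable capacity
echo $((13680 / 3))    # => 4560 (GB usable out of 13680 GB raw)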
On Mon, Dec 18, 2017 at 6:18 PM, James Okken <james.ok...@dialogic.com> wrote:
> Thanks David. Thanks again Cary.
>
> If I have 682 GB used, 12998 GB / 13680 GB avail, then I still need to
> divide 13680/3 (my replication setting) to get what my total storage
> really is, right?
>
> Thanks!
>
> James Okken
> Lab Manager
> Dialogic Research Inc.
> 4 Gatehall Drive
> Parsippany
> NJ 07054
> USA
>
> Tel: 973 967 5179
> Email: james.ok...@dialogic.com
> Web: www.dialogic.com – The Network Fuel Company
>
> -----Original Message-----
> From: Cary [mailto:dynamic.c...@gmail.com]
> Sent: Friday, December 15, 2017 5:56 PM
> To: David Turner
> Cc: James Okken; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] add hard drives to 3 CEPH servers (3 server cluster)
>
> James,
>
> You can set these values in ceph.conf:
>
> [global]
> ...
> osd pool default size = 3
> osd pool default min size = 2
> ...
>
> New pools that are created will use those values.
>
> If you run "ceph -s" and look at the "usage" line, it shows how much
> space is (1) used, (2) available, and (3) total, e.g.:
>
> usage: 19465 GB used, 60113 GB / 79578 GB avail
>
> We choose to use Openstack with Ceph in this decade and do the other
> things, not because they are easy, but because they are hard... ;-p
>
> Cary
> -Dynamic
>
> On Fri, Dec 15, 2017 at 10:12 PM, David Turner <drakonst...@gmail.com> wrote:
>> In conjunction with increasing the pool size to 3, also increase the
>> pool min_size to 2. `ceph df` and `ceph osd df` will eventually show
>> the full size in use in your cluster. In particular, the available
>> size per pool in the output of `ceph df` takes the pool's replication
>> size into account.
>> Continue watching ceph -s or ceph -w to see when the backfilling for
>> your change to replication size finishes.
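The settings David and Cary describe can be confirmed per pool. A quick sketch that prints the size and min_size of every pool (standard pool commands; the loop itself is just a convenience):

for pool in $(ceph osd pool ls); do
    echo "== $pool =="
    ceph osd pool get "$pool" size        # prints e.g. "size: 3"
    ceph osd pool get "$pool" min_size    # prints e.g. "min_size: 2"
done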
>> On Fri, Dec 15, 2017 at 5:06 PM James Okken <james.ok...@dialogic.com> wrote:
>>>
>>> This whole effort went extremely well, thanks to Cary, and I'm not
>>> used to that with CEPH so far. (And Openstack ever...) Thank you Cary.
>>>
>>> I've upped the replication factor and now I see "replicated size 3"
>>> in each of my pools. Is this the only place to check the replication
>>> level? Is there a global setting, or only a setting per pool?
>>>
>>> ceph osd pool ls detail
>>> pool 0 'rbd' replicated size 3...
>>> pool 1 'images' replicated size 3...
>>> ...
>>>
>>> One last question!
>>> At this replication level, how can I tell how much total space I
>>> actually have now? Do I just take 1/3 of the GLOBAL size?
>>>
>>> ceph df
>>> GLOBAL:
>>>     SIZE       AVAIL      RAW USED     %RAW USED
>>>     13680G     12998G     682G         4.99
>>> POOLS:
>>>     NAME        ID     USED     %USED     MAX AVAIL     OBJECTS
>>>     rbd         0      0        0         6448G         0
>>>     images      1      216G     3.24      6448G         27745
>>>     backups     2      0        0         6448G         0
>>>     volumes     3      117G     1.79      6448G         30441
>>>     compute     4      0        0         6448G         0
>>>
>>> ceph osd df
>>> ID  WEIGHT   REWEIGHT  SIZE    USE     AVAIL   %USE   VAR   PGS
>>>  0  0.81689  1.00000   836G    36549M  800G    4.27   0.86  67
>>>  4  3.70000  1.00000   3723G   170G    3553G   4.58   0.92  270
>>>  1  0.81689  1.00000   836G    49612M  788G    5.79   1.16  56
>>>  5  3.70000  1.00000   3723G   192G    3531G   5.17   1.04  282
>>>  2  0.81689  1.00000   836G    33639M  803G    3.93   0.79  58
>>>  3  3.70000  1.00000   3723G   202G    3521G   5.43   1.09  291
>>>     TOTAL              13680G  682G    12998G  4.99
>>> MIN/MAX VAR: 0.79/1.16  STDDEV: 0.67
>>>
>>> Thanks!
>>>
>>> -----Original Message-----
>>> From: Cary [mailto:dynamic.c...@gmail.com]
>>> Sent: Friday, December 15, 2017 4:05 PM
>>> To: James Okken
>>> Cc: ceph-users@lists.ceph.com
>>> Subject: Re: [ceph-users] add hard drives to 3 CEPH servers (3 server cluster)
>>>
>>> James,
>>>
>>> Those errors are normal. Ceph creates the missing files. You can
>>> check "/var/lib/ceph/osd/ceph-6" before and after you run those
>>> commands to see what files are added there.
>>>
>>> Make sure you get the replication factor set.
>>>
>>> Cary
>>> -Dynamic
>>>
>>> On Fri, Dec 15, 2017 at 6:11 PM, James Okken <james.ok...@dialogic.com> wrote:
>>> > Thanks again Cary,
>>> >
>>> > Yes, once all the backfilling was done I was back to a healthy cluster.
>>> > I moved on to the same steps for the next server in the cluster; it
>>> > is backfilling now. Once that is done I will do the last server in
>>> > the cluster, and then I think I am done!
>>> >
>>> > Just checking on one thing: I get these messages when running this
>>> > command. I assume this is OK, right?
>>> >
>>> > root@node-54:~# ceph-osd -i 4 --mkfs --mkkey --osd-uuid 25c21708-f756-4593-bc9e-c5506622cf07
>>> > 2017-12-15 17:28:22.849534 7fd2f9e928c0 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
>>> > 2017-12-15 17:28:22.855838 7fd2f9e928c0 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
>>> > 2017-12-15 17:28:22.856444 7fd2f9e928c0 -1 filestore(/var/lib/ceph/osd/ceph-4) could not find #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
>>> > 2017-12-15 17:28:22.893443 7fd2f9e928c0 -1 created object store /var/lib/ceph/osd/ceph-4 for osd.4 fsid 2b9f7957-d0db-481e-923e-89972f6c594f
>>> > 2017-12-15 17:28:22.893484 7fd2f9e928c0 -1 auth: error reading file: /var/lib/ceph/osd/ceph-4/keyring: can't open /var/lib/ceph/osd/ceph-4/keyring: (2) No such file or directory
>>> > 2017-12-15 17:28:22.893662 7fd2f9e928c0 -1 created new key in keyring /var/lib/ceph/osd/ceph-4/keyring
>>> >
>>> > thanks
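As a quick confirmation that a run like the one above worked, you can list the OSD's data directory before and after, as Cary suggested earlier (the file names below are typical for a filestore OSD, not a guaranteed list):

ls /var/lib/ceph/osd/ceph-4
# after a successful --mkfs, expect entries such as:
# current/  fsid  journal  keyring  magic  superblock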
has the "objects misplaced >>> > (62.511%" >>> > changed to a lower %? >>> > >>> > Cary >>> > -Dynamic >>> > >>> > On Thu, Dec 14, 2017 at 10:52 PM, James Okken >>> > <james.ok...@dialogic.com> >>> > wrote: >>> >> Thanks Cary! >>> >> >>> >> Your directions worked on my first sever. (once I found the >>> >> missing carriage return in your list of commands, the email musta messed >>> >> it up. >>> >> >>> >> For anyone else: >>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4 ceph auth add osd.4 >>> >> osd 'allow *' mon 'allow profile osd' -i >>> >> /etc/ceph/ceph.osd.4.keyring really is 2 commands: >>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4 and ceph auth add >>> >> osd.4 osd 'allow *' mon 'allow profile osd' -i >>> >> /etc/ceph/ceph.osd.4.keyring >>> >> >>> >> Cary, what am I looking for in ceph -w and ceph -s to show the >>> >> status of the data moving? >>> >> Seems like the data is moving and that I have some issue... >>> >> >>> >> root@node-53:~# ceph -w >>> >> cluster 2b9f7957-d0db-481e-923e-89972f6c594f >>> >> health HEALTH_WARN >>> >> 176 pgs backfill_wait >>> >> 1 pgs backfilling >>> >> 27 pgs degraded >>> >> 1 pgs recovering >>> >> 26 pgs recovery_wait >>> >> 27 pgs stuck degraded >>> >> 204 pgs stuck unclean >>> >> recovery 10322/84644 objects degraded (12.195%) >>> >> recovery 52912/84644 objects misplaced (62.511%) >>> >> monmap e3: 3 mons at >>> >> {node-43=192.168.1.7:6789/0,node-44=192.168.1.5:6789/0,node-45=192.168.1.3:6789/0} >>> >> election epoch 138, quorum 0,1,2 node-45,node-44,node-43 >>> >> osdmap e206: 4 osds: 4 up, 4 in; 177 remapped pgs >>> >> flags sortbitwise,require_jewel_osds >>> >> pgmap v3936175: 512 pgs, 5 pools, 333 GB data, 58184 objects >>> >> 370 GB used, 5862 GB / 6233 GB avail >>> >> 10322/84644 objects degraded (12.195%) >>> >> 52912/84644 objects misplaced (62.511%) >>> >> 308 active+clean >>> >> 176 active+remapped+wait_backfill >>> >> 26 active+recovery_wait+degraded >>> >> 1 active+remapped+backfilling >>> >> 1 active+recovering+degraded recovery io 100605 >>> >> kB/s, 14 objects/s >>> >> client io 0 B/s rd, 92788 B/s wr, 50 op/s rd, 11 op/s wr >>> >> >>> >> 2017-12-14 22:45:57.459846 mon.0 [INF] pgmap v3936174: 512 pgs: 1 >>> >> activating, 1 active+recovering+degraded, 26 >>> >> active+recovery_wait+degraded, 1 active+remapped+backfilling, 307 >>> >> active+clean, 176 active+remapped+wait_backfill; 333 GB data, 369 >>> >> active+GB >>> >> used, 5863 GB / 6233 GB avail; 0 B/s rd, 101107 B/s wr, 19 op/s; >>> >> 10354/84644 objects degraded (12.232%); 52912/84644 objects >>> >> misplaced (62.511%); 12224 kB/s, 2 objects/s recovering >>> >> 2017-12-14 22:45:58.466736 mon.0 [INF] pgmap v3936175: 512 pgs: 1 >>> >> active+recovering+degraded, 26 active+recovery_wait+degraded, 1 >>> >> active+remapped+backfilling, 308 active+clean, 176 wait_backfill; >>> >> active+remapped+333 GB data, 370 GB used, 5862 GB / >>> >> 6233 GB avail; 0 B/s rd, 92788 B/s wr, 61 op/s; 10322/84644 >>> >> objects degraded (12.195%); 52912/84644 objects misplaced >>> >> (62.511%); 100605 kB/s, 14 objects/s recovering >>> >> 2017-12-14 22:46:00.474335 mon.0 [INF] pgmap v3936176: 512 pgs: 1 >>> >> active+recovering+degraded, 26 active+recovery_wait+degraded, 1 >>> >> active+remapped+backfilling, 308 active+clean, 176 wait_backfill; >>> >> active+remapped+333 GB data, 370 GB used, 5862 GB / >>> >> 6233 GB avail; 0 B/s rd, 434 kB/s wr, 45 op/s; 10322/84644 objects >>> >> degraded (12.195%); 52912/84644 objects misplaced (62.511%); 84234 >>> >> kB/s, 10 objects/s recovering >>> 
>>> >>
>>> >> -----Original Message-----
>>> >> From: Cary [mailto:dynamic.c...@gmail.com]
>>> >> Sent: Thursday, December 14, 2017 4:21 PM
>>> >> To: James Okken
>>> >> Cc: ceph-users@lists.ceph.com
>>> >> Subject: Re: [ceph-users] add hard drives to 3 CEPH servers (3 server cluster)
>>> >>
>>> >> Jim,
>>> >>
>>> >> I am not an expert, but I believe I can assist.
>>> >>
>>> >> Normally you will have only 1 OSD per drive. I have heard
>>> >> discussions about using multiple OSDs per disk when using SSDs, though.
>>> >>
>>> >> Once your drives have been installed you will have to format them,
>>> >> unless you are using Bluestore. My steps for formatting are below.
>>> >> Replace the sXX with your drive name.
>>> >>
>>> >> parted -a optimal /dev/sXX
>>> >> print
>>> >> mklabel gpt
>>> >> unit mib
>>> >> mkpart OSD4sdd1 1 -1
>>> >> quit
>>> >> mkfs.xfs -f /dev/sXX1
>>> >>
>>> >> # Run blkid, and copy the UUID for the newly formatted drive.
>>> >> blkid
>>> >> # Add the mount point/UUID to fstab. The mount point will be created later.
>>> >> vi /etc/fstab
>>> >> # For example:
>>> >> UUID=6386bac4-7fef-3cd2-7d64-13db51d83b12 /var/lib/ceph/osd/ceph-4 xfs rw,noatime,inode64,logbufs=8 0 0
>>> >>
>>> >> # You can then add the OSD to the cluster.
>>> >> uuidgen
>>> >> # Replace the UUID below with the UUID that was created with uuidgen.
>>> >> ceph osd create 23e734d7-96d8-4327-a2b9-0fbdc72ed8f1
>>> >> # Note which OSD number it creates; it is usually the lowest available OSD number.
>>> >>
>>> >> # Add osd.4 to ceph.conf on all Ceph nodes.
>>> >> vi /etc/ceph/ceph.conf
>>> >> ...
>>> >> [osd.4]
>>> >> public addr = 172.1.3.1
>>> >> cluster addr = 10.1.3.1
>>> >> ...
>>> >>
>>> >> # Now add the mount point.
>>> >> mkdir -p /var/lib/ceph/osd/ceph-4
>>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4
>>> >>
>>> >> # The command below mounts everything in fstab.
>>> >> mount -a
>>> >>
>>> >> # The number after -i below needs to be changed to the correct OSD ID,
>>> >> # and the osd-uuid needs to be changed to the UUID created with uuidgen
>>> >> # above. Your keyring location may be different and may need changing as well.
>>> >> ceph-osd -i 4 --mkfs --mkkey --osd-uuid 23e734d7-96d8-4327-a2b9-0fbdc72ed8f1
>>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4
>>> >> ceph auth add osd.4 osd 'allow *' mon 'allow profile osd' -i /etc/ceph/ceph.osd.4.keyring
>>> >>
>>> >> # Add the new OSD to its host in the crush map.
>>> >> ceph osd crush add osd.4 .0 host=YOURhostNAME
>>> >>
>>> >> # Since the weight used in the previous step was .0, you will need
>>> >> # to increase it. I use 1 for a 1TB drive and 5 for a 5TB drive. The
>>> >> # command below will reweight osd.4 to 1. You may need to ramp this
>>> >> # number up slowly, i.e. .10, then .20, etc.
>>> >> ceph osd crush reweight osd.4 1
>>> >>
>>> >> You should now be able to start the drive. You can watch the data
>>> >> move to the drive with a ceph -w. Once data has migrated to the
>>> >> drive, start the next.
>>> >>
>>> >> Cary
>>> >> -Dynamic
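The gradual ramp-up Cary describes can be scripted. A sketch under stated assumptions (osd.4 and the step values are illustrative; it pauses between steps so backfilling can finish, per the advice above):

for w in 0.2 0.4 0.6 0.8 1.0; do
    ceph osd crush reweight osd.4 "$w"
    # watch "ceph -s" in another terminal; continue once backfill is done
    read -p "weight is now $w; press Enter for the next step "
done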
>>> >> On Thu, Dec 14, 2017 at 5:34 PM, James Okken <james.ok...@dialogic.com> wrote:
>>> >>> Hi all,
>>> >>>
>>> >>> Please let me know if I am missing steps or using the wrong steps.
>>> >>>
>>> >>> I'm hoping to expand my small CEPH cluster by adding 4TB hard
>>> >>> drives to each of the 3 servers in the cluster.
>>> >>>
>>> >>> I also need to change my replication factor from 1 to 3.
>>> >>> This is part of an Openstack environment deployed by Fuel, and I
>>> >>> had foolishly set my replication factor to 1 in the Fuel settings
>>> >>> before deploying.
>>> >>> I know this would have been done better at the beginning. I do
>>> >>> want to keep the current cluster and not start over. I know this
>>> >>> is going to thrash my cluster for a while while it replicates, but
>>> >>> there isn't too much data on it yet.
>>> >>>
>>> >>> To start, I need to safely turn off each CEPH server and add in
>>> >>> the 4TB drive. To do that I am going to run:
>>> >>>
>>> >>> ceph osd set noout
>>> >>> systemctl stop ceph-osd@1   (or 2 or 3 on the other servers)
>>> >>> ceph osd tree               (to verify it is down)
>>> >>> poweroff, install the 4TB drive, boot up again
>>> >>> ceph osd unset noout
>>> >>>
>>> >>> The next step would be to get CEPH to use the 4TB drives. Each
>>> >>> CEPH server already has an 836GB OSD.
>>> >>>
>>> >>> ceph> osd df
>>> >>> ID  WEIGHT   REWEIGHT  SIZE   USE   AVAIL  %USE   VAR   PGS
>>> >>>  0  0.81689  1.00000   836G   101G  734G   12.16  0.90  167
>>> >>>  1  0.81689  1.00000   836G   115G  721G   13.76  1.02  166
>>> >>>  2  0.81689  1.00000   836G   121G  715G   14.49  1.08  179
>>> >>>     TOTAL             2509G   338G  2171G  13.47
>>> >>> MIN/MAX VAR: 0.90/1.08  STDDEV: 0.97
>>> >>>
>>> >>> ceph> df
>>> >>> GLOBAL:
>>> >>>     SIZE      AVAIL     RAW USED     %RAW USED
>>> >>>     2509G     2171G     338G         13.47
>>> >>> POOLS:
>>> >>>     NAME        ID     USED     %USED     MAX AVAIL     OBJECTS
>>> >>>     rbd         0      0        0         2145G         0
>>> >>>     images      1      216G     9.15      2145G         27745
>>> >>>     backups     2      0        0         2145G         0
>>> >>>     volumes     3      114G     5.07      2145G         29717
>>> >>>     compute     4      0        0         2145G         0
>>> >>>
>>> >>> Once I get the 4TB drive into each CEPH server, should I look at
>>> >>> increasing the current OSD (i.e. to 4836GB)?
>>> >>> Or create a second 4000GB OSD on each CEPH server?
>>> >>> If I am going to create a second OSD on each CEPH server, I hope
>>> >>> to use this doc:
>>> >>> http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/
>>> >>>
>>> >>> As far as changing the replication factor from 1 to 3, here are
>>> >>> my pools now:
>>> >>>
>>> >>> ceph osd pool ls detail
>>> >>> pool 0 'rbd' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
>>> >>> pool 1 'images' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 116 flags hashpspool stripe_width 0
>>> >>>         removed_snaps [1~3,b~6,12~8,20~2,24~6,2b~8,34~2,37~20]
>>> >>> pool 2 'backups' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 7 flags hashpspool stripe_width 0
>>> >>> pool 3 'volumes' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 73 flags hashpspool stripe_width 0
>>> >>>         removed_snaps [1~3]
>>> >>> pool 4 'compute' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 34 flags hashpspool stripe_width 0
>>> >>>
>>> >>> I plan on using these steps I saw online:
>>> >>>
>>> >>> ceph osd pool set rbd size 3
>>> >>> ceph -s   (verify that replication completes successfully)
>>> >>> ceph osd pool set images size 3
>>> >>> ceph -s
>>> >>> ceph osd pool set backups size 3
>>> >>> ceph -s
>>> >>> ceph osd pool set volumes size 3
>>> >>> ceph -s
>>> >>>
>>> >>> Please let me know any advice or better methods...
>>> >>>
>>> >>> Thanks
>>> >>>
>>> >>> --Jim
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com