James,

If your replication factor is 3, then for every 1 GB of data you add, your available space (GB avail) will decrease by 3 GB.

Cary
-Dynamic
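For a rough sense of the numbers, here is a minimal sketch of that arithmetic using the figures from James's "ceph df" output below (an upper bound only; journals, filesystem overhead and full ratios will eat into it):

# raw cluster capacity divided by the replication factor ~= usable capacity
echo $((13680 / 3))    # => 4560 (GB usable out of 13680 GB raw)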
On Mon, Dec 18, 2017 at 6:18 PM, James Okken <james.ok...@dialogic.com> wrote:
> Thanks David. Thanks again Cary.
>
> If I have 682 GB used, 12998 GB / 13680 GB avail, then I still need to
> divide 13680/3 (my replication setting) to get what my total storage
> really is, right?
>
> Thanks!
>
> James Okken
> Lab Manager
> Dialogic Research Inc.
> 4 Gatehall Drive
> Parsippany
> NJ 07054
> USA
>
> Tel: 973 967 5179
> Email: james.ok...@dialogic.com
> Web: www.dialogic.com – The Network Fuel Company
>
> -----Original Message-----
> From: Cary [mailto:dynamic.c...@gmail.com]
> Sent: Friday, December 15, 2017 5:56 PM
> To: David Turner
> Cc: James Okken; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] add hard drives to 3 CEPH servers (3 server cluster)
>
> James,
>
> You can set these values in ceph.conf:
>
> [global]
> ...
> osd pool default size = 3
> osd pool default min size = 2
> ...
>
> New pools that are created will use those values.
>
> If you run "ceph -s" and look at the "usage" line, it shows how much
> space is (1) used, (2) available, and (3) total, e.g.:
>
> usage: 19465 GB used, 60113 GB / 79578 GB avail
>
> We choose to use Openstack with Ceph in this decade and do the other
> things, not because they are easy, but because they are hard... ;-p
>
> Cary
> -Dynamic
>
> On Fri, Dec 15, 2017 at 10:12 PM, David Turner <drakonst...@gmail.com> wrote:
>> In conjunction with increasing the pool size to 3, also increase the
>> pool min_size to 2. `ceph df` and `ceph osd df` will eventually show
>> the full size in use in your cluster. In particular, the available
>> size per pool in the output of `ceph df` takes the pool's replication
>> size into account.
>> Continue watching ceph -s or ceph -w to see when the backfilling for
>> your change to replication size finishes.
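The settings David and Cary describe can be confirmed per pool. A quick sketch that prints the size and min_size of every pool (standard pool commands; the loop itself is just a convenience):

for pool in $(ceph osd pool ls); do
    echo "== $pool =="
    ceph osd pool get "$pool" size        # prints e.g. "size: 3"
    ceph osd pool get "$pool" min_size    # prints e.g. "min_size: 2"
done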
>> On Fri, Dec 15, 2017 at 5:06 PM James Okken <james.ok...@dialogic.com> wrote:
>>>
>>> This whole effort went extremely well, thanks to Cary, and I'm not
>>> used to that with CEPH so far. (And Openstack ever...) Thank you Cary.
>>>
>>> I've upped the replication factor and now I see "replicated size 3"
>>> in each of my pools. Is this the only place to check the replication
>>> level? Is there a global setting, or only a setting per pool?
>>>
>>> ceph osd pool ls detail
>>> pool 0 'rbd' replicated size 3...
>>> pool 1 'images' replicated size 3...
>>> ...
>>>
>>> One last question!
>>> At this replication level, how can I tell how much total space I
>>> actually have now? Do I just take 1/3 of the GLOBAL size?
>>>
>>> ceph df
>>> GLOBAL:
>>>     SIZE       AVAIL      RAW USED     %RAW USED
>>>     13680G     12998G     682G         4.99
>>> POOLS:
>>>     NAME        ID     USED     %USED     MAX AVAIL     OBJECTS
>>>     rbd         0      0        0         6448G         0
>>>     images      1      216G     3.24      6448G         27745
>>>     backups     2      0        0         6448G         0
>>>     volumes     3      117G     1.79      6448G         30441
>>>     compute     4      0        0         6448G         0
>>>
>>> ceph osd df
>>> ID  WEIGHT   REWEIGHT  SIZE    USE     AVAIL   %USE   VAR   PGS
>>>  0  0.81689  1.00000   836G    36549M  800G    4.27   0.86  67
>>>  4  3.70000  1.00000   3723G   170G    3553G   4.58   0.92  270
>>>  1  0.81689  1.00000   836G    49612M  788G    5.79   1.16  56
>>>  5  3.70000  1.00000   3723G   192G    3531G   5.17   1.04  282
>>>  2  0.81689  1.00000   836G    33639M  803G    3.93   0.79  58
>>>  3  3.70000  1.00000   3723G   202G    3521G   5.43   1.09  291
>>>     TOTAL              13680G  682G    12998G  4.99
>>> MIN/MAX VAR: 0.79/1.16  STDDEV: 0.67
>>>
>>> Thanks!
>>>
>>> -----Original Message-----
>>> From: Cary [mailto:dynamic.c...@gmail.com]
>>> Sent: Friday, December 15, 2017 4:05 PM
>>> To: James Okken
>>> Cc: ceph-users@lists.ceph.com
>>> Subject: Re: [ceph-users] add hard drives to 3 CEPH servers (3 server cluster)
>>>
>>> James,
>>>
>>> Those errors are normal. Ceph creates the missing files. You can
>>> check "/var/lib/ceph/osd/ceph-6" before and after you run those
>>> commands to see what files are added there.
>>>
>>> Make sure you get the replication factor set.
>>>
>>> Cary
>>> -Dynamic
>>>
>>> On Fri, Dec 15, 2017 at 6:11 PM, James Okken <james.ok...@dialogic.com> wrote:
>>> > Thanks again Cary,
>>> >
>>> > Yes, once all the backfilling was done I was back to a healthy cluster.
>>> > I moved on to the same steps for the next server in the cluster; it
>>> > is backfilling now. Once that is done I will do the last server in
>>> > the cluster, and then I think I am done!
>>> >
>>> > Just checking on one thing: I get these messages when running this
>>> > command. I assume this is OK, right?
>>> >
>>> > root@node-54:~# ceph-osd -i 4 --mkfs --mkkey --osd-uuid 25c21708-f756-4593-bc9e-c5506622cf07
>>> > 2017-12-15 17:28:22.849534 7fd2f9e928c0 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
>>> > 2017-12-15 17:28:22.855838 7fd2f9e928c0 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
>>> > 2017-12-15 17:28:22.856444 7fd2f9e928c0 -1 filestore(/var/lib/ceph/osd/ceph-4) could not find #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
>>> > 2017-12-15 17:28:22.893443 7fd2f9e928c0 -1 created object store /var/lib/ceph/osd/ceph-4 for osd.4 fsid 2b9f7957-d0db-481e-923e-89972f6c594f
>>> > 2017-12-15 17:28:22.893484 7fd2f9e928c0 -1 auth: error reading file: /var/lib/ceph/osd/ceph-4/keyring: can't open /var/lib/ceph/osd/ceph-4/keyring: (2) No such file or directory
>>> > 2017-12-15 17:28:22.893662 7fd2f9e928c0 -1 created new key in keyring /var/lib/ceph/osd/ceph-4/keyring
>>> >
>>> > thanks
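As a quick confirmation that a run like the one above worked, you can list the OSD's data directory before and after, as Cary suggested earlier (the file names below are typical for a filestore OSD, not a guaranteed list):

ls /var/lib/ceph/osd/ceph-4
# after a successful --mkfs, expect entries such as:
# current/  fsid  journal  keyring  magic  superblock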
has the "objects misplaced >>> > (62.511%" >>> > changed to a lower %? >>> > >>> > Cary >>> > -Dynamic >>> > >>> > On Thu, Dec 14, 2017 at 10:52 PM, James Okken >>> > <james.ok...@dialogic.com> >>> > wrote: >>> >> Thanks Cary! >>> >> >>> >> Your directions worked on my first sever. (once I found the >>> >> missing carriage return in your list of commands, the email musta messed >>> >> it up. >>> >> >>> >> For anyone else: >>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4 ceph auth add osd.4 >>> >> osd 'allow *' mon 'allow profile osd' -i >>> >> /etc/ceph/ceph.osd.4.keyring really is 2 commands: >>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4 and ceph auth add >>> >> osd.4 osd 'allow *' mon 'allow profile osd' -i >>> >> /etc/ceph/ceph.osd.4.keyring >>> >> >>> >> Cary, what am I looking for in ceph -w and ceph -s to show the >>> >> status of the data moving? >>> >> Seems like the data is moving and that I have some issue... >>> >> >>> >> root@node-53:~# ceph -w >>> >> cluster 2b9f7957-d0db-481e-923e-89972f6c594f >>> >> health HEALTH_WARN >>> >> 176 pgs backfill_wait >>> >> 1 pgs backfilling >>> >> 27 pgs degraded >>> >> 1 pgs recovering >>> >> 26 pgs recovery_wait >>> >> 27 pgs stuck degraded >>> >> 204 pgs stuck unclean >>> >> recovery 10322/84644 objects degraded (12.195%) >>> >> recovery 52912/84644 objects misplaced (62.511%) >>> >> monmap e3: 3 mons at >>> >> {node-43=192.168.1.7:6789/0,node-44=192.168.1.5:6789/0,node-45=192.168.1.3:6789/0} >>> >> election epoch 138, quorum 0,1,2 node-45,node-44,node-43 >>> >> osdmap e206: 4 osds: 4 up, 4 in; 177 remapped pgs >>> >> flags sortbitwise,require_jewel_osds >>> >> pgmap v3936175: 512 pgs, 5 pools, 333 GB data, 58184 objects >>> >> 370 GB used, 5862 GB / 6233 GB avail >>> >> 10322/84644 objects degraded (12.195%) >>> >> 52912/84644 objects misplaced (62.511%) >>> >> 308 active+clean >>> >> 176 active+remapped+wait_backfill >>> >> 26 active+recovery_wait+degraded >>> >> 1 active+remapped+backfilling >>> >> 1 active+recovering+degraded recovery io 100605 >>> >> kB/s, 14 objects/s >>> >> client io 0 B/s rd, 92788 B/s wr, 50 op/s rd, 11 op/s wr >>> >> >>> >> 2017-12-14 22:45:57.459846 mon.0 [INF] pgmap v3936174: 512 pgs: 1 >>> >> activating, 1 active+recovering+degraded, 26 >>> >> active+recovery_wait+degraded, 1 active+remapped+backfilling, 307 >>> >> active+clean, 176 active+remapped+wait_backfill; 333 GB data, 369 >>> >> active+GB >>> >> used, 5863 GB / 6233 GB avail; 0 B/s rd, 101107 B/s wr, 19 op/s; >>> >> 10354/84644 objects degraded (12.232%); 52912/84644 objects >>> >> misplaced (62.511%); 12224 kB/s, 2 objects/s recovering >>> >> 2017-12-14 22:45:58.466736 mon.0 [INF] pgmap v3936175: 512 pgs: 1 >>> >> active+recovering+degraded, 26 active+recovery_wait+degraded, 1 >>> >> active+remapped+backfilling, 308 active+clean, 176 wait_backfill; >>> >> active+remapped+333 GB data, 370 GB used, 5862 GB / >>> >> 6233 GB avail; 0 B/s rd, 92788 B/s wr, 61 op/s; 10322/84644 >>> >> objects degraded (12.195%); 52912/84644 objects misplaced >>> >> (62.511%); 100605 kB/s, 14 objects/s recovering >>> >> 2017-12-14 22:46:00.474335 mon.0 [INF] pgmap v3936176: 512 pgs: 1 >>> >> active+recovering+degraded, 26 active+recovery_wait+degraded, 1 >>> >> active+remapped+backfilling, 308 active+clean, 176 wait_backfill; >>> >> active+remapped+333 GB data, 370 GB used, 5862 GB / >>> >> 6233 GB avail; 0 B/s rd, 434 kB/s wr, 45 op/s; 10322/84644 objects >>> >> degraded (12.195%); 52912/84644 objects misplaced (62.511%); 84234 >>> >> kB/s, 10 objects/s recovering >>> 
>>> >>
>>> >> -----Original Message-----
>>> >> From: Cary [mailto:dynamic.c...@gmail.com]
>>> >> Sent: Thursday, December 14, 2017 4:21 PM
>>> >> To: James Okken
>>> >> Cc: ceph-users@lists.ceph.com
>>> >> Subject: Re: [ceph-users] add hard drives to 3 CEPH servers (3 server cluster)
>>> >>
>>> >> Jim,
>>> >>
>>> >> I am not an expert, but I believe I can assist.
>>> >>
>>> >> Normally you will have only 1 OSD per drive. I have heard
>>> >> discussions about using multiple OSDs per disk when using SSDs, though.
>>> >>
>>> >> Once your drives have been installed you will have to format them,
>>> >> unless you are using Bluestore. My steps for formatting are below.
>>> >> Replace the sXX with your drive name.
>>> >>
>>> >> parted -a optimal /dev/sXX
>>> >> print
>>> >> mklabel gpt
>>> >> unit mib
>>> >> mkpart OSD4sdd1 1 -1
>>> >> quit
>>> >> mkfs.xfs -f /dev/sXX1
>>> >>
>>> >> # Run blkid, and copy the UUID for the newly formatted drive.
>>> >> blkid
>>> >> # Add the mount point/UUID to fstab. The mount point will be created later.
>>> >> vi /etc/fstab
>>> >> # For example:
>>> >> UUID=6386bac4-7fef-3cd2-7d64-13db51d83b12 /var/lib/ceph/osd/ceph-4 xfs rw,noatime,inode64,logbufs=8 0 0
>>> >>
>>> >> # You can then add the OSD to the cluster.
>>> >> uuidgen
>>> >> # Replace the UUID below with the UUID that was created with uuidgen.
>>> >> ceph osd create 23e734d7-96d8-4327-a2b9-0fbdc72ed8f1
>>> >> # Note which OSD number it creates; it is usually the lowest available OSD number.
>>> >>
>>> >> # Add osd.4 to ceph.conf on all Ceph nodes.
>>> >> vi /etc/ceph/ceph.conf
>>> >> ...
>>> >> [osd.4]
>>> >> public addr = 172.1.3.1
>>> >> cluster addr = 10.1.3.1
>>> >> ...
>>> >>
>>> >> # Now add the mount point.
>>> >> mkdir -p /var/lib/ceph/osd/ceph-4
>>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4
>>> >>
>>> >> # The command below mounts everything in fstab.
>>> >> mount -a
>>> >>
>>> >> # The number after -i below needs to be changed to the correct OSD ID,
>>> >> # and the osd-uuid needs to be changed to the UUID created with uuidgen
>>> >> # above. Your keyring location may be different and may need changing as well.
>>> >> ceph-osd -i 4 --mkfs --mkkey --osd-uuid 23e734d7-96d8-4327-a2b9-0fbdc72ed8f1
>>> >> chown -R ceph:ceph /var/lib/ceph/osd/ceph-4
>>> >> ceph auth add osd.4 osd 'allow *' mon 'allow profile osd' -i /etc/ceph/ceph.osd.4.keyring
>>> >>
>>> >> # Add the new OSD to its host in the crush map.
>>> >> ceph osd crush add osd.4 .0 host=YOURhostNAME
>>> >>
>>> >> # Since the weight used in the previous step was .0, you will need
>>> >> # to increase it. I use 1 for a 1TB drive and 5 for a 5TB drive. The
>>> >> # command below will reweight osd.4 to 1. You may need to ramp this
>>> >> # number up slowly, i.e. .10, then .20, etc.
>>> >> ceph osd crush reweight osd.4 1
>>> >>
>>> >> You should now be able to start the drive. You can watch the data
>>> >> move to the drive with a ceph -w. Once data has migrated to the
>>> >> drive, start the next.
>>> >>
>>> >> Cary
>>> >> -Dynamic
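The gradual ramp-up Cary describes can be scripted. A sketch under stated assumptions (osd.4 and the step values are illustrative; it pauses between steps so backfilling can finish, per the advice above):

for w in 0.2 0.4 0.6 0.8 1.0; do
    ceph osd crush reweight osd.4 "$w"
    # watch "ceph -s" in another terminal; continue once backfill is done
    read -p "weight is now $w; press Enter for the next step "
done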
>>> >> On Thu, Dec 14, 2017 at 5:34 PM, James Okken <james.ok...@dialogic.com> wrote:
>>> >>> Hi all,
>>> >>>
>>> >>> Please let me know if I am missing steps or using the wrong steps.
>>> >>>
>>> >>> I'm hoping to expand my small CEPH cluster by adding 4TB hard
>>> >>> drives to each of the 3 servers in the cluster.
>>> >>>
>>> >>> I also need to change my replication factor from 1 to 3.
>>> >>> This is part of an Openstack environment deployed by Fuel, and I
>>> >>> had foolishly set my replication factor to 1 in the Fuel settings
>>> >>> before deploying.
>>> >>> I know this would have been done better at the beginning. I do
>>> >>> want to keep the current cluster and not start over. I know this
>>> >>> is going to thrash my cluster for a while while it replicates, but
>>> >>> there isn't too much data on it yet.
>>> >>>
>>> >>> To start, I need to safely turn off each CEPH server and add in
>>> >>> the 4TB drive. To do that I am going to run:
>>> >>>
>>> >>> ceph osd set noout
>>> >>> systemctl stop ceph-osd@1   (or 2 or 3 on the other servers)
>>> >>> ceph osd tree               (to verify it is down)
>>> >>> poweroff, install the 4TB drive, boot up again
>>> >>> ceph osd unset noout
>>> >>>
>>> >>> The next step would be to get CEPH to use the 4TB drives. Each
>>> >>> CEPH server already has an 836GB OSD.
>>> >>>
>>> >>> ceph> osd df
>>> >>> ID  WEIGHT   REWEIGHT  SIZE   USE   AVAIL  %USE   VAR   PGS
>>> >>>  0  0.81689  1.00000   836G   101G  734G   12.16  0.90  167
>>> >>>  1  0.81689  1.00000   836G   115G  721G   13.76  1.02  166
>>> >>>  2  0.81689  1.00000   836G   121G  715G   14.49  1.08  179
>>> >>>     TOTAL             2509G   338G  2171G  13.47
>>> >>> MIN/MAX VAR: 0.90/1.08  STDDEV: 0.97
>>> >>>
>>> >>> ceph> df
>>> >>> GLOBAL:
>>> >>>     SIZE      AVAIL     RAW USED     %RAW USED
>>> >>>     2509G     2171G     338G         13.47
>>> >>> POOLS:
>>> >>>     NAME        ID     USED     %USED     MAX AVAIL     OBJECTS
>>> >>>     rbd         0      0        0         2145G         0
>>> >>>     images      1      216G     9.15      2145G         27745
>>> >>>     backups     2      0        0         2145G         0
>>> >>>     volumes     3      114G     5.07      2145G         29717
>>> >>>     compute     4      0        0         2145G         0
>>> >>>
>>> >>> Once I get the 4TB drive into each CEPH server, should I look at
>>> >>> increasing the current OSD (i.e. to 4836GB)?
>>> >>> Or create a second 4000GB OSD on each CEPH server?
>>> >>> If I am going to create a second OSD on each CEPH server, I hope
>>> >>> to use this doc:
>>> >>> http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/
>>> >>>
>>> >>> As far as changing the replication factor from 1 to 3, here are
>>> >>> my pools now:
>>> >>>
>>> >>> ceph osd pool ls detail
>>> >>> pool 0 'rbd' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
>>> >>> pool 1 'images' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 116 flags hashpspool stripe_width 0
>>> >>>         removed_snaps [1~3,b~6,12~8,20~2,24~6,2b~8,34~2,37~20]
>>> >>> pool 2 'backups' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 7 flags hashpspool stripe_width 0
>>> >>> pool 3 'volumes' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 73 flags hashpspool stripe_width 0
>>> >>>         removed_snaps [1~3]
>>> >>> pool 4 'compute' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 34 flags hashpspool stripe_width 0
>>> >>>
>>> >>> I plan on using these steps I saw online:
>>> >>>
>>> >>> ceph osd pool set rbd size 3
>>> >>> ceph -s   (verify that replication completes successfully)
>>> >>> ceph osd pool set images size 3
>>> >>> ceph -s
>>> >>> ceph osd pool set backups size 3
>>> >>> ceph -s
>>> >>> ceph osd pool set volumes size 3
>>> >>> ceph -s
>>> >>>
>>> >>> Please let me know any advice or better methods...
>>> >>>
>>> >>> Thanks
>>> >>>
>>> >>> --Jim
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com