Re: [ceph-users] Upgrading and lost OSDs

2019-11-25 Thread Brent Kennedy
Update on this:

 

I was able to link the block softlink back to the original device for each offline 
drive.  I used ceph-bluestore-tool with show-label (“ceph-bluestore-tool 
show-label --dev /dev/disk/by-partuuid/”) on each drive (apparently the newer 
commands link them as ceph-uuid, but these were created with luminous and 
ceph-deploy 1.5.9).  I added the keyring, fsid, ceph_fsid, type and whoami 
files to each directory and set the owner to ceph.  The bluestore tool output 
has all the required data for each of those files.  Then I just started the 
service.  All 12 disks came back up this way.
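For reference, a minimal sketch of that per-drive sequence, assuming OSD id 80 and placeholder UUIDs (the real values come from the show-label output, and the key from `ceph auth get`):

OSD_ID=80                                                        # hypothetical OSD id
DEV=/dev/disk/by-partuuid/00000000-0000-0000-0000-000000000000   # placeholder partuuid
DIR=/var/lib/ceph/osd/ceph-$OSD_ID

ceph-bluestore-tool show-label --dev "$DEV"      # prints osd_uuid, whoami, ceph_fsid, ...
ln -sf "$DEV" "$DIR/block"                       # restore the block softlink
echo bluestore > "$DIR/type"
echo "$OSD_ID" > "$DIR/whoami"
echo 11111111-1111-1111-1111-111111111111 > "$DIR/fsid"       # osd_uuid from show-label
echo 22222222-2222-2222-2222-222222222222 > "$DIR/ceph_fsid"  # cluster fsid from show-label
ceph auth get osd.$OSD_ID -o "$DIR/keyring"      # recreate the cephx keyring for this OSD
chown -R ceph:ceph "$DIR"
systemctl start ceph-osd@$OSD_ID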

 

What worries me here is that there are other files in the 
/var/lib/ceph/osd/ceph- folders on other servers created at the same time with 
the same version.  Most of them contain simple content.  I wonder why separate 
files are used instead of a single JSON config file?

 

Thoughts?

 

-Brent

 

 

 

From: ceph-users  On Behalf Of Brent Kennedy
Sent: Friday, November 22, 2019 6:47 PM
To: 'Alfredo Deza' ; 'Bob R' 
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Upgrading and lost OSDs

 

I just ran into this today with a server we rebooted.  The server has been 
upgraded to Nautilus 14.2.2 for a few months.  Was originally installed as 
Jewel, then upgraded to Luminous ( then Nautilus ).   I have a whole server 
where all 12 OSDs have empty folders.  I recreated the keyring file and the 
type file, but now I have the “bluestore(/var/lib/ceph/osd/ceph-80/block) 
_read_bdev_label failed to open /var/lib/ceph/osd/ceph-80/block: (2) No such 
file or directory”  error.

 

Nov 22 23:10:58 ukosdhost15 systemd: Starting Ceph object storage daemon 
osd.80...

Nov 22 23:10:58 ukosdhost15 systemd: Started Ceph object storage daemon osd.80.

Nov 22 23:10:58 ukosdhost15 ceph-osd: 2019-11-22 23:10:58.662 7f86ebe92d80 -1 
bluestore(/var/lib/ceph/osd/ceph-80/block) _read_bdev_label failed to open 
/var/lib/ceph/osd/ceph-80/block: (2) No such file or directory

Nov 22 23:10:58 ukosdhost15 ceph-osd: 2019-11-22 23:10:58.662 7f86ebe92d80 -1 
#033[0;31m ** ERROR: unable to open OSD superblock on 
/var/lib/ceph/osd/ceph-80: (2) No such file or directory#033[0m

Nov 22 23:10:58 ukosdhost15 systemd: ceph-osd@80.service: main process exited, 
code=exited, status=1/FAILURE

 

Were you able to restore those OSDs?  I was adding 24 more OSDs when a network 
issue occurred and this server was rebooted as part of that ( and the OSDs died 
on it ).

 

-Brent

 

From: ceph-users <ceph-users-boun...@lists.ceph.com> On Behalf Of Alfredo Deza
Sent: Friday, July 26, 2019 12:48 PM
To: Bob R <b...@drinksbeer.org>
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Upgrading and lost OSDs

 

 

 

On Thu, Jul 25, 2019 at 7:00 PM Bob R <b...@drinksbeer.org> wrote:

I would try 'mv /etc/ceph/osd{,.old}' then run 'ceph-volume  simple scan' 
again. We had some problems upgrading due to OSDs (perhaps initially installed 
as firefly?) missing the 'type' attribute and iirc the 'ceph-volume simple 
scan' command refused to overwrite existing json files after I made some 
changes to ceph-volume. 
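For reference, the rename-and-rescan sequence being suggested, as a minimal sketch (default /etc/ceph/osd path assumed; the activate step comes from the Nautilus upgrade instructions quoted elsewhere in the thread):

mv /etc/ceph/osd /etc/ceph/osd.old    # move the stale JSON files out of the way
ceph-volume simple scan               # regenerate /etc/ceph/osd/*.json from the running system
ceph-volume simple activate --all     # mount and start every scanned OSD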

 

Ooof. I could swear that this issue was fixed already and it took me a while to 
find out that it wasn't at all. We saw this a few months ago in our Long 
Running Cluster used for dogfooding. 

 

I've created a ticket to track this work at http://tracker.ceph.com/issues/40987

 

But what you've done is exactly why we chose to persist the JSON files in 
/etc/ceph/osd/*.json, so that an admin could tell if anything is missing (or 
incorrect like in this case) and make the changes needed.
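As an illustration, a quick check of those persisted files might look like this (jq assumed to be installed; this assumes a bluestore entry carries top-level 'data' and 'block' keys, which is what the "Required devices (block and data)" error elsewhere in the thread is validating):

for f in /etc/ceph/osd/*.json; do
    # flag any OSD JSON that is missing either device entry
    jq -e 'has("data") and has("block")' "$f" >/dev/null || echo "incomplete: $f"
done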

 

 

 

Bob

 

On Wed, Jul 24, 2019 at 1:24 PM Alfredo Deza <ad...@redhat.com> wrote:

 

 

On Wed, Jul 24, 2019 at 4:15 PM Peter Eisch <peter.ei...@virginpulse.com> wrote:

Hi,

 

I appreciate the insistence that the directions be followed.  I wholly agree.  
The only liberty I took was to do a ‘yum update’ instead of just ‘yum update 
ceph-osd’ and then reboot.  (Also my MDS runs on the MON hosts, so it got 
updated a step early.)

 

As for the logs:

 

[2019-07-24 15:07:22,713][ceph_volume.main][INFO  ] Running command: 
ceph-volume  simple scan

[2019-07-24 15:07:22,714][ceph_volume.process][INFO  ] Running command: 
/bin/systemctl show --no-pager --property=Id --state=running ceph-osd@*

[2019-07-24 15:07:27,574][ceph_volume.main][INFO  ] Running command: 
ceph-volume  simple activate --all

[2019-07-24 15:07:27,575][ceph_volume.devices.simple.activate][INFO  ] 
activating OSD specified in 
/etc/ceph/osd/0-93fb5f2f-0273-4c87-a718-886d7e6db983.json

[2019-07-24 15:07:27,576][ceph_volume.devices.simple.activate][ERROR ] Required 
devices (block and data) not present for bluestore

[2019-07-24 15:07:27,576][ceph_volume.devices.simple.activate][ERROR ] 

Re: [ceph-users] Upgrading and lost OSDs

2019-11-22 Thread Brent Kennedy
I just ran into this today with a server we rebooted.  The server has been 
upgraded to Nautilus 14.2.2 for a few months.  Was originally installed as 
Jewel, then upgraded to Luminous ( then Nautilus ).   I have a whole server 
where all 12 OSDs have empty folders.  I recreated the keyring file and the 
type file, but now I have the “bluestore(/var/lib/ceph/osd/ceph-80/block) 
_read_bdev_label failed to open /var/lib/ceph/osd/ceph-80/block: (2) No such 
file or directory”  error.

 

Nov 22 23:10:58 ukosdhost15 systemd: Starting Ceph object storage daemon 
osd.80...

Nov 22 23:10:58 ukosdhost15 systemd: Started Ceph object storage daemon osd.80.

Nov 22 23:10:58 ukosdhost15 ceph-osd: 2019-11-22 23:10:58.662 7f86ebe92d80 -1 
bluestore(/var/lib/ceph/osd/ceph-80/block) _read_bdev_label failed to open 
/var/lib/ceph/osd/ceph-80/block: (2) No such file or directory

Nov 22 23:10:58 ukosdhost15 ceph-osd: 2019-11-22 23:10:58.662 7f86ebe92d80 -1 
#033[0;31m ** ERROR: unable to open OSD superblock on 
/var/lib/ceph/osd/ceph-80: (2) No such file or directory#033[0m

Nov 22 23:10:58 ukosdhost15 systemd: ceph-osd@80.service: main process exited, 
code=exited, status=1/FAILURE

 

Were you able to restore those OSDs?  I was adding 24 more OSDs when a network 
issue occurred and this server was rebooted as part of that ( and the OSDs died 
on it ).

 

-Brent

 

From: ceph-users  On Behalf Of Alfredo Deza
Sent: Friday, July 26, 2019 12:48 PM
To: Bob R 
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Upgrading and lost OSDs

 

 

 

On Thu, Jul 25, 2019 at 7:00 PM Bob R <b...@drinksbeer.org> wrote:

I would try 'mv /etc/ceph/osd{,.old}' then run 'ceph-volume  simple scan' 
again. We had some problems upgrading due to OSDs (perhaps initially installed 
as firefly?) missing the 'type' attribute and iirc the 'ceph-volume simple 
scan' command refused to overwrite existing json files after I made some 
changes to ceph-volume. 

 

Ooof. I could swear that this issue was fixed already and it took me a while to 
find out that it wasn't at all. We saw this a few months ago in our Long 
Running Cluster used for dogfooding. 

 

I've created a ticket to track this work at http://tracker.ceph.com/issues/40987

 

But what you've done is exactly why we chose to persist the JSON files in 
/etc/ceph/osd/*.json, so that an admin could tell if anything is missing (or 
incorrect like in this case) and make the changes needed.

 

 

 

Bob

 

On Wed, Jul 24, 2019 at 1:24 PM Alfredo Deza <ad...@redhat.com> wrote:

 

 

On Wed, Jul 24, 2019 at 4:15 PM Peter Eisch <peter.ei...@virginpulse.com> wrote:

Hi,

 

I appreciate the insistence that the directions be followed.  I wholly agree.  
The only liberty I took was to do a ‘yum update’ instead of just ‘yum update 
ceph-osd’ and then reboot.  (Also my MDS runs on the MON hosts, so it got 
updated a step early.)

 

As for the logs:

 

[2019-07-24 15:07:22,713][ceph_volume.main][INFO  ] Running command: 
ceph-volume  simple scan

[2019-07-24 15:07:22,714][ceph_volume.process][INFO  ] Running command: 
/bin/systemctl show --no-pager --property=Id --state=running ceph-osd@*

[2019-07-24 15:07:27,574][ceph_volume.main][INFO  ] Running command: 
ceph-volume  simple activate --all

[2019-07-24 15:07:27,575][ceph_volume.devices.simple.activate][INFO  ] 
activating OSD specified in 
/etc/ceph/osd/0-93fb5f2f-0273-4c87-a718-886d7e6db983.json

[2019-07-24 15:07:27,576][ceph_volume.devices.simple.activate][ERROR ] Required 
devices (block and data) not present for bluestore

[2019-07-24 15:07:27,576][ceph_volume.devices.simple.activate][ERROR ] 
bluestore devices found: [u'data']

[2019-07-24 15:07:27,576][ceph_volume][ERROR ] exception caught by decorator

Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 59, 
in newfunc

return f(*a, **kw)

  File "/usr/lib/python2.7/site-packages/ceph_volume/main.py", line 148, in main

terminal.dispatch(self.mapper, subcommand_args)

  File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in 
dispatch

instance.main()

  File "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/main.py", 
line 33, in main

terminal.dispatch(self.mapper, self.argv)

  File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in 
dispatch

instance.main()

  File 
"/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/activate.py", line 
272, in main

self.activate(args)

  File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 16, 
in is_root

return func(*a, **kw)

  File 
"/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/activate.py", line 
131, in activate

self.validate_devices(osd_metadata)

  File 
"/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/ac

Re: [ceph-users] Upgrading and lost OSDs

2019-07-26 Thread Alfredo Deza
me simple scan --stdout` so that it doesn't persist
>> data onto /etc/ceph/osd/ and inspect that the JSON produced is capturing
>> all the necessary details for OSDs.
>>
>> Alternatively, I would look into the JSON files already produced in
>> /etc/ceph/osd/ and check if the details are correct. The `scan` sub-command
>> does a tremendous effort to cover all cases where ceph-disk
>> created an OSD (filestore, bluestore, dmcrypt, etc...) but it is possible
>> that it may be hitting a problem. This is why the tool made these JSON
>> files available, so that they could be inspected and corrected if anything.
>>
>> The details of the scan sub-command can be found at
>> http://docs.ceph.com/docs/master/ceph-volume/simple/scan/ and the JSON
>> structure is described in detail below at
>> http://docs.ceph.com/docs/master/ceph-volume/simple/scan/#json-contents
>>
>> In this particular case the tool is refusing to activate what seems to be
>> a bluestore OSD. Is it really a bluestore OSD? if so, then it can't find
>> where is the data partition. What does that partition look like (for any of
>> the failing OSDs) ? Does it use dmcrypt, how was it created? (hopefully
>> with ceph-disk!)
>>
>> If you know the data partition for a given OSD, try and pass it onto
>> 'scan'. For example if it is /dev/sda1 you could do `ceph-volume simple
>> scan /dev/sda1` and check its output.
>>
>>
>>
>>>
>>> peter
>>>
>>>
>>>
>>>
>>> Peter Eisch
>>> Senior Site Reliability Engineer
>>>
>>>
>>> *From: *Alfredo Deza 
>>> *Date: *Wednesday, July 24, 2019 at 3:02 PM
>>> *To: *Peter Eisch 
>>> *Cc: *Paul Emmerich , "ceph-users@lists.ceph.com"
>>> 
>>> *Subject: *Re: [ceph-users] Upgrading and lost OSDs
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Jul 24, 2019 at 3:49 PM Peter Eisch 
>>> wrote:
>>>
>>>
>>>
>>> I’m at step 6.  I updated/rebooted the host to complete “installing the
>>> new packages and restarting the ceph-osd daemon” on the first OSD host.
>>> All the systemctl definitions to start the OSDs were deleted, all the
>>> properties in /var/lib/ceph/osd/ceph-* directories were deleted.  All the
>>> files in /var/lib/ceph/osd-lockbox, for comparison, were untouched and
>>> still present.
>>>
>>>
>>>
>>> Peeking into step 7 I can run ceph-volume:
>>>
>>>
>>>
>>> # ceph-volume simple scan /dev/sda1
>>>
>>> Running command: /usr/sbin/cryptsetup status /dev/sda1
>>>
>>> Running command: /usr/sbin/cryptsetup status
>>> 93fb5f2f-0273-4c87-a718-886d7e6db983
>>>
>>> Running command: /bin/mount -v /dev/sda5 /tmp/tmpF5F8t2
>>>
>>> stdout: mount: /dev/sda5 mounted on /tmp/tmpF5F8t2.
>>>
>>> Running command: /usr/sbin/cryptsetup status /dev/sda5
>>>
>>> Running command: /bin/ceph --cluster ceph --name
>>> client.osd-lockbox.93fb5f2f-0273-4c87-a718-886d7e6db983 --keyring
>>> /tmp/tmpF5F8t2/keyring config-key 

Re: [ceph-users] Upgrading and lost OSDs

2019-07-25 Thread Bob R
t really a bluestore OSD? if so, then it can't find
> where is the data partition. What does that partition look like (for any of
> the failing OSDs) ? Does it use dmcrypt, how was it created? (hopefully
> with ceph-disk!)
>
> If you know the data partition for a given OSD, try and pass it onto
> 'scan'. For example if it is /dev/sda1 you could do `ceph-volume simple
> scan /dev/sda1` and check its output.
>
>
>
>>
>> peter
>>
>>
>>
>>
>> Peter Eisch
>> Senior Site Reliability Engineer
>>
>>
>> *From: *Alfredo Deza 
>> *Date: *Wednesday, July 24, 2019 at 3:02 PM
>> *To: *Peter Eisch 
>> *Cc: *Paul Emmerich , "ceph-users@lists.ceph.com"
>> 
>> *Subject: *Re: [ceph-users] Upgrading and lost OSDs
>>
>>
>>
>>
>>
>> On Wed, Jul 24, 2019 at 3:49 PM Peter Eisch 
>> wrote:
>>
>>
>>
>> I’m at step 6.  I updated/rebooted the host to complete “installing the
>> new packages and restarting the ceph-osd daemon” on the first OSD host.
>> All the systemctl definitions to start the OSDs were deleted, all the
>> properties in /var/lib/ceph/osd/ceph-* directories were deleted.  All the
>> files in /var/lib/ceph/osd-lockbox, for comparison, were untouched and
>> still present.
>>
>>
>>
>> Peeking into step 7 I can run ceph-volume:
>>
>>
>>
>> # ceph-volume simple scan /dev/sda1
>>
>> Running command: /usr/sbin/cryptsetup status /dev/sda1
>>
>> Running command: /usr/sbin/cryptsetup status
>> 93fb5f2f-0273-4c87-a718-886d7e6db983
>>
>> Running command: /bin/mount -v /dev/sda5 /tmp/tmpF5F8t2
>>
>> stdout: mount: /dev/sda5 mounted on /tmp/tmpF5F8t2.
>>
>> Running command: /usr/sbin/cryptsetup status /dev/sda5
>>
>> Running command: /bin/ceph --cluster ceph --name
>> client.osd-lockbox.93fb5f2f-0273-4c87-a718-886d7e6db983 --keyring
>> /tmp/tmpF5F8t2/keyring config-key get
>> dm-crypt/osd/93fb5f2f-0273-4c87-a718-886d7e6db983/luks
>>
>> Running command: /bin/umount -v /tmp/tmpF5F8t2
>>
>> stderr: umount: /tmp/tmpF5F8t2 (/dev/sda5) unmounted
>>
>> Running command: /usr/sbin/cryptsetup --key-file - --allow-discards
>> luksOpen /dev/sda1 93fb5f2f-0273-4c87-a718-886d7e6db983
>>
>> Running command: /bin/mount -v
>> /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983 /tmp/tmpYK0WEV
>>
>> stdout: mount: /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983 mounted
>> on /tmp/tmpYK0WEV.
>>
>> --> broken symlink found /tmp/tmpYK0WEV/block ->
>> /dev/mapper/a05b447c-c901-4690-a249-cc1a2d62a110
>>
>> Running command: /usr/sbin/cryptsetup status /tmp/tmpYK0WEV/block_dmcrypt
>>
>> Running command: /usr/sbin/cryptsetup status
>> /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983
>>
>> Running command: /bin/umount -v /tmp/tmpYK0WEV
>>
>> stderr: umount: /tmp/tmpYK0WEV
>> (/dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983) unmounted
>>
>> Running command: /usr/sbin/cryptsetup remove
>> /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983
>>
>> --> OSD 0 got scanned and metadata persisted to file:
>> /etc/ceph/osd/0-93fb5f2f-0273-4c87-a718-886d7e6db983.json
>>
>> --> To take ov

Re: [ceph-users] Upgrading and lost OSDs

2019-07-24 Thread Alfredo Deza
>
> *From: *Alfredo Deza 
> *Date: *Wednesday, July 24, 2019 at 3:02 PM
> *To: *Peter Eisch 
> *Cc: *Paul Emmerich , "ceph-users@lists.ceph.com"
> 
> *Subject: *Re: [ceph-users] Upgrading and lost OSDs
>
>
>
>
>
> On Wed, Jul 24, 2019 at 3:49 PM Peter Eisch 
> wrote:
>
>
>
> I’m at step 6.  I updated/rebooted the host to complete “installing the
> new packages and restarting the ceph-osd daemon” on the first OSD host.
> All the systemctl definitions to start the OSDs were deleted, all the
> properties in /var/lib/ceph/osd/ceph-* directories were deleted.  All the
> files in /var/lib/ceph/osd-lockbox, for comparison, were untouched and
> still present.
>
>
>
> Peeking into step 7 I can run ceph-volume:
>
>
>
> # ceph-volume simple scan /dev/sda1
>
> Running command: /usr/sbin/cryptsetup status /dev/sda1
>
> Running command: /usr/sbin/cryptsetup status
> 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> Running command: /bin/mount -v /dev/sda5 /tmp/tmpF5F8t2
>
> stdout: mount: /dev/sda5 mounted on /tmp/tmpF5F8t2.
>
> Running command: /usr/sbin/cryptsetup status /dev/sda5
>
> Running command: /bin/ceph --cluster ceph --name
> client.osd-lockbox.93fb5f2f-0273-4c87-a718-886d7e6db983 --keyring
> /tmp/tmpF5F8t2/keyring config-key get
> dm-crypt/osd/93fb5f2f-0273-4c87-a718-886d7e6db983/luks
>
> Running command: /bin/umount -v /tmp/tmpF5F8t2
>
> stderr: umount: /tmp/tmpF5F8t2 (/dev/sda5) unmounted
>
> Running command: /usr/sbin/cryptsetup --key-file - --allow-discards
> luksOpen /dev/sda1 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> Running command: /bin/mount -v
> /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983 /tmp/tmpYK0WEV
>
> stdout: mount: /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983 mounted on
> /tmp/tmpYK0WEV.
>
> --> broken symlink found /tmp/tmpYK0WEV/block ->
> /dev/mapper/a05b447c-c901-4690-a249-cc1a2d62a110
>
> Running command: /usr/sbin/cryptsetup status /tmp/tmpYK0WEV/block_dmcrypt
>
> Running command: /usr/sbin/cryptsetup status
> /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983
>
> Running command: /bin/umount -v /tmp/tmpYK0WEV
>
> stderr: umount: /tmp/tmpYK0WEV
> (/dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983) unmounted
>
> Running command: /usr/sbin/cryptsetup remove
> /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983
>
> --> OSD 0 got scanned and metadata persisted to file:
> /etc/ceph/osd/0-93fb5f2f-0273-4c87-a718-886d7e6db983.json
>
> --> To take over management of this scanned OSD, and disable ceph-disk and
> udev, run:
>
> --> ceph-volume simple activate 0 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> #
>
> #
>
> # ceph-volume simple activate 0 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> --> Required devices (block and data) not present for bluestore
>
> --> bluestore devices found: [u'data']
>
> -->  RuntimeError: Unable to activate bluestore OSD due to missing devices
>
> #
>
>
>
> The tool detected bluestore, or rather, it failed to find a journal
> associated with /dev/sda1. Scanning a single partition can cause that.
> There is a flag to spit out the findings to STDOUT instead of persisting
> them in /etc/ceph/osd/
>
>
>
> Since this is a "whole system" upgrade, then the upgrade documentation
> instructions need to be followed:
>
>
>
> ceph-volume simple scan
> ceph-volume simple activate --all
>
>
>
> If the `scan` command doesn't display any information (not even with the
> --stdout flag) then the logs at /var/log/ceph/ceph-volume.log need to be
> inspected. It would be useful to check any findings in there
>
>
>
>
> Okay, this created /etc/ceph/osd/*.json.  This is cool.  Is there a
> command or option which will read these files and mount the devices

Re: [ceph-users] Upgrading and lost OSDs

2019-07-24 Thread Peter Eisch
Hi,

I appreciate the insistence that the directions be followed.  I wholly agree.  
The only liberty I took was to do a ‘yum update’ instead of just ‘yum update 
ceph-osd’ and then reboot.  (Also my MDS runs on the MON hosts, so it got 
updated a step early.)

As for the logs:

[2019-07-24 15:07:22,713][ceph_volume.main][INFO  ] Running command: 
ceph-volume  simple scan
[2019-07-24 15:07:22,714][ceph_volume.process][INFO  ] Running command: 
/bin/systemctl show --no-pager --property=Id --state=running ceph-osd@*
[2019-07-24 15:07:27,574][ceph_volume.main][INFO  ] Running command: 
ceph-volume  simple activate --all
[2019-07-24 15:07:27,575][ceph_volume.devices.simple.activate][INFO  ] 
activating OSD specified in 
/etc/ceph/osd/0-93fb5f2f-0273-4c87-a718-886d7e6db983.json
[2019-07-24 15:07:27,576][ceph_volume.devices.simple.activate][ERROR ] Required 
devices (block and data) not present for bluestore
[2019-07-24 15:07:27,576][ceph_volume.devices.simple.activate][ERROR ] 
bluestore devices found: [u'data']
[2019-07-24 15:07:27,576][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 59, 
in newfunc
return f(*a, **kw)
  File "/usr/lib/python2.7/site-packages/ceph_volume/main.py", line 148, in main
terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in 
dispatch
instance.main()
  File "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/main.py", 
line 33, in main
terminal.dispatch(self.mapper, self.argv)
  File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in 
dispatch
instance.main()
  File 
"/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/activate.py", line 
272, in main
self.activate(args)
  File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 16, 
in is_root
return func(*a, **kw)
  File 
"/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/activate.py", line 
131, in activate
self.validate_devices(osd_metadata)
  File 
"/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/activate.py", line 
62, in validate_devices
raise RuntimeError('Unable to activate bluestore OSD due to missing 
devices')
RuntimeError: Unable to activate bluestore OSD due to missing devices

(this is repeated for each of the 16 drives)

Any other thoughts?  (I’ll delete/create the OSDs with ceph-deploy otherwise.)

peter



Peter Eisch
Senior Site Reliability Engineer
From: Alfredo Deza 
Date: Wednesday, July 24, 2019 at 3:02 PM
To: Peter Eisch 
Cc: Paul Emmerich , "ceph-users@lists.ceph.com" 

Subject: Re: [ceph-users] Upgrading and lost OSDs


On Wed, Jul 24, 2019 at 3:49 PM Peter Eisch <peter.ei...@virginpulse.com> wrote:

I’m at step 6.  I updated/rebooted the host to complete “installing the new 
packages and restarting the ceph-osd daemon” on the first OSD host.  All the 
systemctl definitions to start the OSDs were deleted, all the properties in 
/var/lib/ceph/osd/ceph-* directories were deleted.  All the files in 
/var/lib/ceph/osd-lockbox, for comparison, were untouched and still present.

Peeking into step 7 I can run ceph-volume:

# ceph-volume simple scan /dev/sda1
Running command: /usr/sbin/cryptsetup status /dev/sda1
Running command: /usr/sbin/cryptsetup status 
93fb5f2f-0273-4c87-a718-886d7e6db983
Running command: /bin/mount -v /dev/sda5 /tmp/tmpF5F8t2
stdout: mount: /dev/sda5 mounted on /tmp/tmpF5F8t2.
Running command: /usr/sbin/cryptsetup status /dev/sda5
Running command: /bin/ceph --cluster ceph --name 
client.osd-lockbox.93fb5f2f-0273-4c87-a718-886d7e6db983 --keyring 
/tmp/tmpF5F8t2/keyring config-key get 
dm-crypt/osd/93fb5f2f-0273-4c87-a718-886d7e6db983/luks
Running command: /bin/umount -v /tmp/tmpF5F8t2
stderr: umount: /tmp/tmpF5F8t2 (/dev/sda5) unmounted
Running command: /

Re: [ceph-users] Upgrading and lost OSDs

2019-07-24 Thread Alfredo Deza
On Wed, Jul 24, 2019 at 3:49 PM Peter Eisch 
wrote:

>
>
> I’m at step 6.  I updated/rebooted the host to complete “installing the
> new packages and restarting the ceph-osd daemon” on the first OSD host.
> All the systemctl definitions to start the OSDs were deleted, all the
> properties in /var/lib/ceph/osd/ceph-* directories were deleted.  All the
> files in /var/lib/ceph/osd-lockbox, for comparison, were untouched and
> still present.
>
>
>
> Peeking into step 7 I can run ceph-volume:
>
>
>
> # ceph-volume simple scan /dev/sda1
>
> Running command: /usr/sbin/cryptsetup status /dev/sda1
>
> Running command: /usr/sbin/cryptsetup status
> 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> Running command: /bin/mount -v /dev/sda5 /tmp/tmpF5F8t2
>
> stdout: mount: /dev/sda5 mounted on /tmp/tmpF5F8t2.
>
> Running command: /usr/sbin/cryptsetup status /dev/sda5
>
> Running command: /bin/ceph --cluster ceph --name
> client.osd-lockbox.93fb5f2f-0273-4c87-a718-886d7e6db983 --keyring
> /tmp/tmpF5F8t2/keyring config-key get
> dm-crypt/osd/93fb5f2f-0273-4c87-a718-886d7e6db983/luks
>
> Running command: /bin/umount -v /tmp/tmpF5F8t2
>
> stderr: umount: /tmp/tmpF5F8t2 (/dev/sda5) unmounted
>
> Running command: /usr/sbin/cryptsetup --key-file - --allow-discards
> luksOpen /dev/sda1 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> Running command: /bin/mount -v
> /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983 /tmp/tmpYK0WEV
>
> stdout: mount: /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983 mounted on
> /tmp/tmpYK0WEV.
>
> --> broken symlink found /tmp/tmpYK0WEV/block ->
> /dev/mapper/a05b447c-c901-4690-a249-cc1a2d62a110
>
> Running command: /usr/sbin/cryptsetup status /tmp/tmpYK0WEV/block_dmcrypt
>
> Running command: /usr/sbin/cryptsetup status
> /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983
>
> Running command: /bin/umount -v /tmp/tmpYK0WEV
>
> stderr: umount: /tmp/tmpYK0WEV
> (/dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983) unmounted
>
> Running command: /usr/sbin/cryptsetup remove
> /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983
>
> --> OSD 0 got scanned and metadata persisted to file:
> /etc/ceph/osd/0-93fb5f2f-0273-4c87-a718-886d7e6db983.json
>
> --> To take over management of this scanned OSD, and disable ceph-disk and
> udev, run:
>
> --> ceph-volume simple activate 0 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> #
>
> #
>
> # ceph-volume simple activate 0 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> --> Required devices (block and data) not present for bluestore
>
> --> bluestore devices found: [u'data']
>
> -->  RuntimeError: Unable to activate bluestore OSD due to missing devices
>
> #
>

The tool detected bluestore, or rather, it failed to find a journal
associated with /dev/sda1. Scanning a single partition can cause that.
There is a flag to spit out the findings to STDOUT instead of persisting
them in /etc/ceph/osd/

Since this is a "whole system" upgrade, then the upgrade documentation
instructions need to be followed:

ceph-volume simple scan
ceph-volume simple activate --all


If the `scan` command doesn't display any information (not even with the
--stdout flag) then the logs at /var/log/ceph/ceph-volume.log need to be
inspected. It would be useful to check any findings in there
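A minimal sketch of that debugging loop (the /dev/sda1 device is just the example used above):

ceph-volume simple scan --stdout /dev/sda1   # print what scan would record, without writing /etc/ceph/osd/
tail -n 100 /var/log/ceph/ceph-volume.log    # then check what ceph-volume itself logged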


>
> Okay, this created /etc/ceph/osd/*.json.  This is cool.  Is there a
> command or option which will read these files and mount the devices?
>
>
>
> peter
>
>
>
>
>
>
> Peter Eisch
> Senior Site Reliability Engineer
>
Re: [ceph-users] Upgrading and lost OSDs

2019-07-24 Thread Peter Eisch

I’m at step 6.  I updated/rebooted the host to complete “installing the new 
packages and restarting the ceph-osd daemon” on the first OSD host.  All the 
systemctl definitions to start the OSDs were deleted, all the properties in 
/var/lib/ceph/osd/ceph-* directories were deleted.  All the files in 
/var/lib/ceph/osd-lockbox, for comparison, were untouched and still present.

Peeking into step 7 I can run ceph-volume:

# ceph-volume simple scan /dev/sda1
Running command: /usr/sbin/cryptsetup status /dev/sda1
Running command: /usr/sbin/cryptsetup status 
93fb5f2f-0273-4c87-a718-886d7e6db983
Running command: /bin/mount -v /dev/sda5 /tmp/tmpF5F8t2
stdout: mount: /dev/sda5 mounted on /tmp/tmpF5F8t2.
Running command: /usr/sbin/cryptsetup status /dev/sda5
Running command: /bin/ceph --cluster ceph --name 
client.osd-lockbox.93fb5f2f-0273-4c87-a718-886d7e6db983 --keyring 
/tmp/tmpF5F8t2/keyring config-key get 
dm-crypt/osd/93fb5f2f-0273-4c87-a718-886d7e6db983/luks
Running command: /bin/umount -v /tmp/tmpF5F8t2
stderr: umount: /tmp/tmpF5F8t2 (/dev/sda5) unmounted
Running command: /usr/sbin/cryptsetup --key-file - --allow-discards luksOpen 
/dev/sda1 93fb5f2f-0273-4c87-a718-886d7e6db983
Running command: /bin/mount -v /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983 
/tmp/tmpYK0WEV
stdout: mount: /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983 mounted on 
/tmp/tmpYK0WEV.
--> broken symlink found /tmp/tmpYK0WEV/block -> 
/dev/mapper/a05b447c-c901-4690-a249-cc1a2d62a110
Running command: /usr/sbin/cryptsetup status /tmp/tmpYK0WEV/block_dmcrypt
Running command: /usr/sbin/cryptsetup status 
/dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983
Running command: /bin/umount -v /tmp/tmpYK0WEV
stderr: umount: /tmp/tmpYK0WEV 
(/dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983) unmounted
Running command: /usr/sbin/cryptsetup remove 
/dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983
--> OSD 0 got scanned and metadata persisted to file: 
/etc/ceph/osd/0-93fb5f2f-0273-4c87-a718-886d7e6db983.json
--> To take over management of this scanned OSD, and disable ceph-disk and 
udev, run:
--> ceph-volume simple activate 0 93fb5f2f-0273-4c87-a718-886d7e6db983
#
#
# ceph-volume simple activate 0 93fb5f2f-0273-4c87-a718-886d7e6db983
--> Required devices (block and data) not present for bluestore
--> bluestore devices found: [u'data']
-->  RuntimeError: Unable to activate bluestore OSD due to missing devices
#

Okay, this created /etc/ceph/osd/*.json.  This is cool.  Is there a command or 
option which will read these files and mount the devices?

peter




Peter Eisch
Senior Site Reliability Engineer
From: Alfredo Deza 
Date: Wednesday, July 24, 2019 at 2:20 PM
To: Peter Eisch 
Cc: Paul Emmerich , "ceph-users@lists.ceph.com" 

Subject: Re: [ceph-users] Upgrading and lost OSDs

On Wed, Jul 24, 2019 at 2:56 PM Peter Eisch <peter.ei...@virginpulse.com> wrote:
Hi Paul,

To better answer your question, I'm following: 
http://docs.ceph.com/docs/nautilus/releases/nautilus/

At step 6, upgrade OSDs, I jumped on an OSD host and did a full 'yum update' 
for patching the host and rebooted to pick up the current centos kernel.

If you are at Step 6 then it is *crucial* to understand that the tooling used 
to create the OSDs is no longer available and Step 7 *is absolutely required*.

ceph-volume has to scan the system and give you the output of all OSDs found so 
that it can persist them in /etc/ceph/osd/*.json files and then can later be
"activated".


I didn't run any commands specific to just updating the ceph RPMs in 
this process.

It is not clear if you are at Step 6 and wondering why OSDs are not up, or you 
are past that and 

Re: [ceph-users] Upgrading and lost OSDs

2019-07-24 Thread Alfredo Deza
On Wed, Jul 24, 2019 at 2:56 PM Peter Eisch 
wrote:

> Hi Paul,
>
> To better answer your question, I'm following:
> http://docs.ceph.com/docs/nautilus/releases/nautilus/
>
> At step 6, upgrade OSDs, I jumped on an OSD host and did a full 'yum
> update' for patching the host and rebooted to pick up the current centos
> kernel.
>

If you are at Step 6 then it is *crucial* to understand that the tooling
used to create the OSDs is no longer available and Step 7 *is absolutely
required*.

ceph-volume has to scan the system and give you the output of all OSDs
found so that it can persist them in /etc/ceph/osd/*.json files and then
can later be
"activated".


> I didn't run any commands specific to just updating the ceph RPMs
> in this process.
>
>
It is not clear if you are at Step 6 and wondering why OSDs are not up, or
you are past that and ceph-volume wasn't able to detect anything.


> peter
>
> Peter Eisch
> Senior Site Reliability Engineer
>
>
> From: Paul Emmerich 
> Date: Wednesday, July 24, 2019 at 1:39 PM
> To: Peter Eisch 
> Cc: Xavier Trilla , "ceph-users@lists.ceph.com"
> 
> Subject: Re: [ceph-users] Upgrading and lost OSDs
>
> On Wed, Jul 24, 2019 at 8:36 PM Peter Eisch <peter.ei...@virginpulse.com> wrote:
> # lsblk
> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> sda 8:0 0 1.7T 0 disk
> ├─sda1 8:1 0 100M 0 part
> ├─sda2 8:2 0 1.7T 0 part
> └─sda5 8:5 0 10M 0 part
> sdb 8:16 0 1.7T 0 disk
> ├─sdb1 8:17 0 100M 0 part
> ├─sdb2 8:18 0 1.7T 0 part
> └─sdb5 8:21 0 10M 0 part
> sdc 8:32 0 1.7T 0 disk
> ├─sdc1 8:33 0 100M 0 part
>
> That's ceph-disk which was removed, run "ceph-volume simple scan"
>
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
>
> www.croit.io
> Tel: +49 89 1896585 90
>
>
> ...
> I'm thinking the OSD would start (I can recreate the .service definitions
> in systemctl) if the above were mounted in a way like they are on another
> of my hosts:
> # lsblk
> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> sda 8:0 0 1.7T 0 disk
> ├─sda1 8:1 0 100M 0 part
> │ └─97712be4-1234-4acc-8102-2265769053a5 253:17 0 98M 0 crypt
> /var/lib/ceph/osd/ceph-16
> ├─sda2 8:2 0 1.7T 0 part
> │ └─049b7160-1234-4edd-a5dc-fe00faca8d89 253:16 0 1.7T 0 crypt
> └─sda5 8:5 0 10M 0 part
> /var/lib/ceph/osd-lockbox/97712be4-9674-4acc-1234-2265769053a5
> sdb 8:1

Re: [ceph-users] Upgrading and lost OSDs

2019-07-24 Thread Peter Eisch
Hi Paul,

To better answer your question, I'm following: 
http://docs.ceph.com/docs/nautilus/releases/nautilus/

At step 6, upgrade OSDs, I jumped on an OSD host and did a full 'yum update' 
for patching the host and rebooted to pick up the current centos kernel.

I didn't run any commands specific to just updating the ceph RPMs in 
this process.

peter



Peter Eisch
Senior Site Reliability Engineer
From: Paul Emmerich 
Date: Wednesday, July 24, 2019 at 1:39 PM
To: Peter Eisch 
Cc: Xavier Trilla , "ceph-users@lists.ceph.com" 

Subject: Re: [ceph-users] Upgrading and lost OSDs

On Wed, Jul 24, 2019 at 8:36 PM Peter Eisch <peter.ei...@virginpulse.com> wrote:
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.7T 0 disk
├─sda1 8:1 0 100M 0 part
├─sda2 8:2 0 1.7T 0 part
└─sda5 8:5 0 10M 0 part
sdb 8:16 0 1.7T 0 disk
├─sdb1 8:17 0 100M 0 part
├─sdb2 8:18 0 1.7T 0 part
└─sdb5 8:21 0 10M 0 part
sdc 8:32 0 1.7T 0 disk
├─sdc1 8:33 0 100M 0 part

That's ceph-disk which was removed, run "ceph-volume simple scan"


--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

 
...
I'm thinking the OSD would start (I can recreate the .service definitions in 
systemctl) if the above were mounted in a way like they are on another of my 
hosts:
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.7T 0 disk
├─sda1 8:1 0 100M 0 part
│ └─97712be4-1234-4acc-8102-2265769053a5 253:17 0 98M 0 crypt 
/var/lib/ceph/osd/ceph-16
├─sda2 8:2 0 1.7T 0 part
│ └─049b7160-1234-4edd-a5dc-fe00faca8d89 253:16 0 1.7T 0 crypt
└─sda5 8:5 0 10M 0 part 
/var/lib/ceph/osd-lockbox/97712be4-9674-4acc-1234-2265769053a5
sdb 8:16 0 1.7T 0 disk
├─sdb1 8:17 0 100M 0 part
│ └─f03f0298-1234-42e9-8b28-f3016e44d1e2 253:26 0 98M 0 crypt 
/var/lib/ceph/osd/ceph-17
├─sdb2 8:18 0 1.7T 0 part
│ └─51177019-1234-4963-82d1-5006233f5ab2 253:30 0 1.7T 0 crypt
└─sdb5 8:21 0 10M 0 part 
/var/lib/ceph/osd-lockbox/f03f0298-1234-42e9-8b28-f3016e44d1e2
sdc 8:32 0 1.7T 0 disk
├─sdc1 8:33 0 100M 0 part
│ └─0184df0c-1234-404d-92de-cb71b1047abf 253:8 0 98M 0 crypt 
/var/lib/ceph/osd/ceph-18
├─sdc2 8:34 0 1.7T 0 part
│ └─fdad7618-1234-4021-a63e-40d973712e7b 253:13 0 1.7T 0 crypt
...

Thank you for your time on this,

peter

From: Xavier Trilla <xavier.tri...@clouding.io>
Date: Wednesday, July 24, 2019 at 1:25 PM
To: Peter Eisch <peter.ei...@virginpulse.com>
Cc: "ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] Upgrading and lost OSDs

Hi Peter,

I'm not sure, but maybe after some changes the OSDs are not being recognized by 
the ceph scripts.

Ceph used to use udev to detect the OSDs and then moved to LVM. Which kind of 
OSDs are you running? Bluestore or filestore? Which version did you use to 
create them?

Cheers!

On 24 Jul 2019, at 20:04, Peter Eisch <peter.ei...@virginpulse.com> wrote:
Hi,

I’m working through updating from 12.2.12/luminous to 14.2.2/nautilus on 
centos 7.6. The managers are updated alright:

# ceph -s
  cluster:
    id:     2fdb5976-1234-4b29-ad9c-1ca74a9466ec
    health: HEALTH_WARN
            Degraded data redundancy: 24177/9555955 objects degraded (0.253%), 
7 pgs degraded, 1285 pgs undersized
            3 monitors have not enabled msgr2
 ...

I updated ceph on an OSD host with 'yum update' and then rebooted to grab the 
current kernel. Along the way, the contents of all the directories in 
/var

Re: [ceph-users] Upgrading and lost OSDs

2019-07-24 Thread Peter Eisch
[2019-07-24 13:40:49,602][ceph_volume.process][INFO  ] Running command: 
/bin/systemctl show --no-pager --property=Id --state=running ceph-osd@*

This is the only log event.  At the prompt:

# ceph-volume simple scan
#

peter


Peter Eisch
Senior Site Reliability Engineer
From: Paul Emmerich 
Date: Wednesday, July 24, 2019 at 1:32 PM
To: Xavier Trilla 
Cc: Peter Eisch , "ceph-users@lists.ceph.com" 

Subject: Re: [ceph-users] Upgrading and lost OSDs

Did you use ceph-disk before?

Support for ceph-disk was removed, see Nautilus upgrade instructions. You'll 
need to run "ceph-volume simple scan" to convert them to ceph-volume

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Wed, Jul 24, 2019 at 8:25 PM Xavier Trilla <xavier.tri...@clouding.io> wrote:
Hi Peter,

I'm not sure, but maybe after some changes the OSDs are not being recognized by 
the ceph scripts.

Ceph used to use udev to detect the OSDs and then moved to LVM. Which kind of 
OSDs are you running? Bluestore or filestore? Which version did you use to 
create them?

Cheers!

On 24 Jul 2019, at 20:04, Peter Eisch <peter.ei...@virginpulse.com> wrote:
Hi,

I’m working through updating from 12.2.12/luminous to 14.2.2/nautilus on 
centos 7.6. The managers are updated alright:

# ceph -s
  cluster:
    id:     2fdb5976-1234-4b29-ad9c-1ca74a9466ec
    health: HEALTH_WARN
            Degraded data redundancy: 24177/9555955 objects degraded (0.253%), 
7 pgs degraded, 1285 pgs undersized
            3 monitors have not enabled msgr2
 ...

I updated ceph on an OSD host with 'yum update' and then rebooted to grab the 
current kernel. Along the way, the contents of all the directories in 
/var/lib/ceph/osd/ceph-*/ were deleted. Thus I have 16 OSDs down from this. I 
can manage the undersized but I'd like to get these drives working again 
without deleting each OSD and recreating them.

So far I've pulled the respective cephx key into the 'keyring' file and 
populated 'bluestore' into the 'type' files but I'm unsure how to get the 
lockboxes mounted to where I can get the OSDs running. The osd-lockbox 
directory is otherwise untouched from when the OSDs were deployed.

Is there a way to run ceph-deploy or some other tool to rebuild the mounts for 
the drives?

peter
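For reference, a rough sketch of the dm-crypt mount sequence for one of these OSDs, pieced together from the ceph-volume scan output quoted elsewhere in the thread (the UUID, device names and OSD id are illustrative; `ceph-volume simple scan` plus `activate` is what automates this):

LOCKBOX_UUID=93fb5f2f-0273-4c87-a718-886d7e6db983   # example uuid from this thread
OSD_ID=0

# mount the lockbox partition and use its keyring to fetch the LUKS key from the mons
mkdir -p /var/lib/ceph/osd-lockbox/$LOCKBOX_UUID
mount /dev/sda5 /var/lib/ceph/osd-lockbox/$LOCKBOX_UUID
ceph --cluster ceph --name client.osd-lockbox.$LOCKBOX_UUID \
     --keyring /var/lib/ceph/osd-lockbox/$LOCKBOX_UUID/keyring \
     config-key get dm-crypt/osd/$LOCKBOX_UUID/luks |
  cryptsetup --key-file - --allow-discards luksOpen /dev/sda1 $LOCKBOX_UUID

# mount the decrypted data partition where the OSD expects its metadata
mount /dev/mapper/$LOCKBOX_UUID /var/lib/ceph/osd/ceph-$OSD_ID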
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrading and lost OSDs

2019-07-24 Thread Paul Emmerich
On Wed, Jul 24, 2019 at 8:36 PM Peter Eisch 
wrote:

> # lsblk
> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> sda 8:0 0 1.7T 0 disk
> ├─sda1 8:1 0 100M 0 part
> ├─sda2 8:2 0 1.7T 0 part
> └─sda5 8:5 0 10M 0 part
> sdb 8:16 0 1.7T 0 disk
> ├─sdb1 8:17 0 100M 0 part
> ├─sdb2 8:18 0 1.7T 0 part
> └─sdb5 8:21 0 10M 0 part
> sdc 8:32 0 1.7T 0 disk
> ├─sdc1 8:33 0 100M 0 part
>

That's ceph-disk which was removed, run "ceph-volume simple scan"

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90



> ...
> I'm thinking the OSD would start (I can recreate the .service definitions
> in systemctl) if the above were mounted in a way like they are on another
> of my hosts:
> # lsblk
> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> sda 8:0 0 1.7T 0 disk
> ├─sda1 8:1 0 100M 0 part
> │ └─97712be4-1234-4acc-8102-2265769053a5 253:17 0 98M 0 crypt
> /var/lib/ceph/osd/ceph-16
> ├─sda2 8:2 0 1.7T 0 part
> │ └─049b7160-1234-4edd-a5dc-fe00faca8d89 253:16 0 1.7T 0 crypt
> └─sda5 8:5 0 10M 0 part
> /var/lib/ceph/osd-lockbox/97712be4-9674-4acc-1234-2265769053a5
> sdb 8:16 0 1.7T 0 disk
> ├─sdb1 8:17 0 100M 0 part
> │ └─f03f0298-1234-42e9-8b28-f3016e44d1e2 253:26 0 98M 0 crypt
> /var/lib/ceph/osd/ceph-17
> ├─sdb2 8:18 0 1.7T 0 part
> │ └─51177019-1234-4963-82d1-5006233f5ab2 253:30 0 1.7T 0 crypt
> └─sdb5 8:21 0 10M 0 part
> /var/lib/ceph/osd-lockbox/f03f0298-1234-42e9-8b28-f3016e44d1e2
> sdc 8:32 0 1.7T 0 disk
> ├─sdc1 8:33 0 100M 0 part
> │ └─0184df0c-1234-404d-92de-cb71b1047abf 253:8 0 98M 0 crypt
> /var/lib/ceph/osd/ceph-18
> ├─sdc2 8:34 0 1.7T 0 part
> │ └─fdad7618-1234-4021-a63e-40d973712e7b 253:13 0 1.7T 0 crypt
> ...
>
> Thank you for your time on this,
>
> peter
>
> Peter Eisch
> Senior Site Reliability Engineer
>
>
> From: Xavier Trilla 
> Date: Wednesday, July 24, 2019 at 1:25 PM
> To: Peter Eisch 
> Cc: "ceph-users@lists.ceph.com" 
> Subject: Re: [ceph-users] Upgrading and lost OSDs
>
> Hi Peter,
>
> I'm not sure, but maybe after some changes the OSDs are not being
> recognized by the ceph scripts.
>
> Ceph used to use udev to detect the OSDs and then moved to LVM. Which kind
> of OSDs are you running? Bluestore or filestore? Which version did you use
> to create them?
>
> Cheers!
>
> On 24 Jul 2019, at 20:04, Peter Eisch <peter.ei...@virginpulse.com> wrote:
> Hi,
>
> I’m working through updating from 12.2.12/luminous to 14.2.2/nautilus on
> centos 7.6. The managers are updated alright:
>
> # ceph -s
>   cluster:
> id: 2fdb5976-1234-4b29-ad9c-1ca74a9466ec
> health: HEALTH_WARN
> Degraded data redundancy: 24177/9555955 objects degraded
> (0.253%), 7 pgs degraded, 1285 pgs undersized
> 3 monitors have not enabled msgr2
>  ...
>
> I updated ceph on an OSD host with 'yum update' and then rebooted to grab
> the current kernel. Along the way, the contents of all the directories in
> /var/lib/ceph/osd/ceph-*/ were deleted. Thus I have 16 OSDs down from this.
> I can manage the undersized but I'd like to get these drives working again
> without deleting each OSD and recreating them.
>
> So far I've pulled the respective cephx key into the 'keyring' file and
> populated 'bluestore' into the 'type' files but I'm unsure how to get the
> lockboxe

Re: [ceph-users] Upgrading and lost OSDs

2019-07-24 Thread Peter Eisch
Bluestore created with 12.2.10/luminous.

The OSD startup generates logs like:

2019-07-24 12:39:46.483 7f4b27649d80  0 set uid:gid to 167:167 (ceph:ceph)
2019-07-24 12:39:46.483 7f4b27649d80  0 ceph version 14.2.2 
(4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable), process ceph-osd, 
pid 48553
2019-07-24 12:39:46.483 7f4b27649d80  0 pidfile_write: ignore empty --pid-file
2019-07-24 12:39:46.483 7f4b27649d80  0 set uid:gid to 167:167 (ceph:ceph)
2019-07-24 12:39:46.483 7f4b27649d80  0 ceph version 14.2.2 
(4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable), process ceph-osd, 
pid 48553
2019-07-24 12:39:46.483 7f4b27649d80  0 pidfile_write: ignore empty --pid-file
2019-07-24 12:39:46.505 7f4b27649d80 -1 
bluestore(/var/lib/ceph/osd/ceph-0/block) _read_bdev_label failed to open 
/var/lib/ceph/osd/ceph-0/block: (2) No such file or directory
2019-07-24 12:39:46.505 7f4b27649d80 -1  ** ERROR: unable to open OSD 
superblock on /var/lib/ceph/osd/ceph-0: (2) No such file or directory
2019-07-24 12:39:46.505 7f4b27649d80 -1 
bluestore(/var/lib/ceph/osd/ceph-0/block) _read_bdev_label failed to open 
/var/lib/ceph/osd/ceph-0/block: (2) No such file or directory
2019-07-24 12:39:46.505 7f4b27649d80 -1 
bluestore(/var/lib/ceph/osd/ceph-0/block) _read_bdev_label failed to open 
/var/lib/ceph/osd/ceph-0/block: (2) No such file or directory
2019-07-24 12:39:46.505 7f4b27649d80 -1  ** ERROR: unable to open OSD 
superblock on /var/lib/ceph/osd/ceph-0: (2) No such file or directory
2019-07-24 12:39:46.505 7f4b27649d80 -1  ** ERROR: unable to open OSD 
superblock on /var/lib/ceph/osd/ceph-0: (2) No such file or directory
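
The _read_bdev_label failure typically means only that the 'block' entry inside
/var/lib/ceph/osd/ceph-0 is missing, not that the BlueStore data itself is gone.
A hedged sketch of checking that for one OSD; the device path and OSD id below
are illustrative assumptions (for dm-crypt OSDs the label is only readable on
the opened /dev/mapper device, not on the raw partition):

# assumed example device; substitute the 1.7T crypt mapping (or raw partition) backing this OSD
DEV=/dev/mapper/049b7160-1234-4edd-a5dc-fe00faca8d89

# if this prints the OSD metadata (osd_uuid, whoami, osd_key, ...), the OSD is intact
# and only the /var/lib/ceph/osd/ceph-<id> directory needs to be reconstructed
ceph-bluestore-tool show-label --dev $DEV

# restore the missing block symlink and ownership, e.g. for osd.0
ln -s $DEV /var/lib/ceph/osd/ceph-0/block
chown -R ceph:ceph /var/lib/ceph/osd/ceph-0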
-
# lsblk
NAME     MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda        8:0    0  1.7T  0 disk
├─sda1     8:1    0  100M  0 part
├─sda2     8:2    0  1.7T  0 part
└─sda5     8:5    0   10M  0 part
sdb        8:16   0  1.7T  0 disk
├─sdb1     8:17   0  100M  0 part
├─sdb2     8:18   0  1.7T  0 part
└─sdb5     8:21   0   10M  0 part
sdc        8:32   0  1.7T  0 disk
├─sdc1     8:33   0  100M  0 part
...
I'm thinking the OSDs would start (I can recreate the .service definitions for 
systemd) if the partitions above were mounted the way they are on another of my 
hosts (see the sketch after this listing):
# lsblk
NAME                                      MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda                                         8:0    0  1.7T  0 disk
├─sda1                                      8:1    0  100M  0 part
│ └─97712be4-1234-4acc-8102-2265769053a5  253:17   0   98M  0 crypt /var/lib/ceph/osd/ceph-16
├─sda2                                      8:2    0  1.7T  0 part
│ └─049b7160-1234-4edd-a5dc-fe00faca8d89  253:16   0  1.7T  0 crypt
└─sda5                                      8:5    0   10M  0 part  /var/lib/ceph/osd-lockbox/97712be4-9674-4acc-1234-2265769053a5
sdb                                         8:16   0  1.7T  0 disk
├─sdb1                                      8:17   0  100M  0 part
│ └─f03f0298-1234-42e9-8b28-f3016e44d1e2  253:26   0   98M  0 crypt /var/lib/ceph/osd/ceph-17
├─sdb2                                      8:18   0  1.7T  0 part
│ └─51177019-1234-4963-82d1-5006233f5ab2  253:30   0  1.7T  0 crypt
└─sdb5                                      8:21   0   10M  0 part  /var/lib/ceph/osd-lockbox/f03f0298-1234-42e9-8b28-f3016e44d1e2
sdc                                         8:32   0  1.7T  0 disk
├─sdc1                                      8:33   0  100M  0 part
│ └─0184df0c-1234-404d-92de-cb71b1047abf  253:8    0   98M  0 crypt /var/lib/ceph/osd/ceph-18
├─sdc2                                      8:34   0  1.7T  0 part
│ └─fdad7618-1234-4021-a63e-40d973712e7b  253:13   0  1.7T  0 crypt
...
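
A hedged sketch, not a verified procedure, of rebuilding that layout by hand for
a single OSD, assuming the dm-crypt mappings (the 253:* "crypt" devices) can be
opened again; the device names, UUIDs and OSD id are copied from the working host
above and are illustrative only:

# values taken from the example listing; they differ for every OSD
OSD_ID=16
DATA_MAPPING=/dev/mapper/97712be4-1234-4acc-8102-2265769053a5   # the 98M crypt under sda1
LOCKBOX_UUID=97712be4-9674-4acc-1234-2265769053a5               # directory name under osd-lockbox

# the small lockbox partition (sdX5) mounts unencrypted and only carries key material
mkdir -p /var/lib/ceph/osd-lockbox/$LOCKBOX_UUID
mount /dev/sda5 /var/lib/ceph/osd-lockbox/$LOCKBOX_UUID

# once the dm-crypt mapping for sda1 is open, it holds the small metadata
# filesystem that belongs at /var/lib/ceph/osd/ceph-<id>
mkdir -p /var/lib/ceph/osd/ceph-$OSD_ID
mount $DATA_MAPPING /var/lib/ceph/osd/ceph-$OSD_ID
chown -R ceph:ceph /var/lib/ceph/osd/ceph-$OSD_ID

# recreate and start the unit instance
systemctl enable ceph-osd@$OSD_ID
systemctl start ceph-osd@$OSD_ID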

Thank you for your time on this,

peter



From: Xavier Trilla 
Date: Wednesday, July 24, 2019 at 1:25 PM
To: Peter Eisch 
Cc: "ceph-users@lists.ceph.com" 
Subject: Re: [ceph-users] Upgrading and lost OSDs

Hi Peter,

I'm not sure, but maybe after some changes the OSDs are no longer being recognized by the Ceph scripts.

Re: [ceph-users] Upgrading and lost OSDs

2019-07-24 Thread Paul Emmerich
Did you use ceph-disk before?

Support for ceph-disk was removed; see the Nautilus upgrade instructions.
You'll need to run "ceph-volume simple scan" to convert them to ceph-volume.
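
A rough example of that conversion (paths are illustrative, and I'm not certain
every 14.2.x release handled the dmcrypt/lockbox layout in "simple" mode, so
treat it as a sketch):

# persist one mounted OSD's metadata as /etc/ceph/osd/<id>-<uuid>.json, or scan everything:
ceph-volume simple scan /var/lib/ceph/osd/ceph-0
ceph-volume simple scan

# create the systemd units so the OSDs are mounted and started on boot:
ceph-volume simple activate --all
# (or one at a time, using the id and fsid printed in the scan's json output)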

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Wed, Jul 24, 2019 at 8:25 PM Xavier Trilla 
wrote:

> Hi Peter,
>
> I'm not sure, but maybe after some changes the OSDs are no longer being
> recognized by the Ceph scripts.
>
> Ceph used to use udev to detect the OSDs and then moved to LVM. Which kind
> of OSDs are you running, Bluestore or Filestore? Which version did you use
> to create them?
>
>
> Cheers!
>
> On 24 Jul 2019, at 20:04, Peter Eisch wrote:
>
> Hi,
>
> I’m working through updating from 12.2.12/luminous to 14.2.2/nautilus on
> CentOS 7.6. The managers are updated alright:
>
> # ceph -s
>   cluster:
> id: 2fdb5976-1234-4b29-ad9c-1ca74a9466ec
> health: HEALTH_WARN
> Degraded data redundancy: 24177/9555955 objects degraded
> (0.253%), 7 pgs degraded, 1285 pgs undersized
> 3 monitors have not enabled msgr2
>  ...
>
> I updated ceph on an OSD host with 'yum update' and then rebooted to grab
> the current kernel. Along the way, the contents of all the directories in
> /var/lib/ceph/osd/ceph-*/ were deleted. Thus I have 16 OSDs down from this.
> I can manage the undersized but I'd like to get these drives working again
> without deleting each OSD and recreating them.
>
> So far I've pulled the respective cephx key into the 'keyring' file and
> populated 'bluestore' into the 'type' files but I'm unsure how to get the
> lockboxes mounted to where I can get the OSDs running. The osd-lockbox
> directory is otherwise untouched from when the OSDs were deployed.
>
> Is there a way to run ceph-deploy or some other tool to rebuild the mounts
> for the drives?
>
> peter
>


Re: [ceph-users] Upgrading and lost OSDs

2019-07-24 Thread Xavier Trilla
Hi Peter,

I'm not sure, but maybe after some changes the OSDs are no longer being recognized 
by the Ceph scripts.

Ceph used to use udev to detect the OSDs and then moved to LVM. Which kind of 
OSDs are you running, Bluestore or Filestore? Which version did you use to 
create them?


Cheers!

On 24 Jul 2019, at 20:04, Peter Eisch <peter.ei...@virginpulse.com> wrote:

Hi,

I'm working through updating from 12.2.12/luminous to 14.2.2/nautilus on 
CentOS 7.6. The managers are updated alright:

# ceph -s
  cluster:
id: 2fdb5976-1234-4b29-ad9c-1ca74a9466ec
health: HEALTH_WARN
Degraded data redundancy: 24177/9555955 objects degraded (0.253%), 
7 pgs degraded, 1285 pgs undersized
3 monitors have not enabled msgr2
 ...

I updated ceph on an OSD host with 'yum update' and then rebooted to grab the 
current kernel. Along the way, the contents of all the directories in 
/var/lib/ceph/osd/ceph-*/ were deleted. Thus I have 16 OSDs down from this. I 
can manage the undersized but I'd like to get these drives working again 
without deleting each OSD and recreating them.

So far I've pulled the respective cephx key into the 'keyring' file and 
populated 'bluestore' into the 'type' files but I'm unsure how to get the 
lockboxes mounted to where I can get the OSDs running. The osd-lockbox 
directory is otherwise untouched from when the OSDs were deployed.

Is there a way to run ceph-deploy or some other tool to rebuild the mounts for 
the drives?

peter
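
A hedged sketch of rebuilding those small metadata files for one OSD, assuming
the cluster is reachable; the OSD id is an example, and the OSD's own uuid (the
'fsid' file) has to be recovered from the device itself, e.g. from the osd_uuid
field of ceph-bluestore-tool show-label or from 'ceph osd dump':

OSD_ID=0
OSD_DIR=/var/lib/ceph/osd/ceph-$OSD_ID

ceph auth get osd.$OSD_ID -o $OSD_DIR/keyring   # re-export the cephx key
echo bluestore > $OSD_DIR/type
echo $OSD_ID > $OSD_DIR/whoami
ceph fsid > $OSD_DIR/ceph_fsid                  # cluster fsid
# echo <osd uuid> > $OSD_DIR/fsid               # the OSD's own uuid, recovered as above
chown -R ceph:ceph $OSD_DIR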
