[ceph-users] Replacing an OSD Drive

2015-02-06 Thread Gaylord Holder
When the time comes to replace an OSD, I've used the following procedure: 1) Stop/down/out the osd and replace the drive; 2) Create the ceph osd directory: ceph-osd -i N --mkfs; 3) Copy the osd key out of the authorized keys list; 4) ceph osd crush rm osd.N; 5) ceph osd crush add osd.$i $osd_size root=

Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Sylvain Munaut
Hi, > Did you also recreate the journal?! It was a journal file and got re-created automatically. Cheers, Sylvain

Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Smart Weblications GmbH
On 01.07.2014 17:48, Sylvain Munaut wrote: > Hi, > > > As an exercise, I killed an OSD today, just killed the process and > removed its data directory. > > To recreate it, I recreated an empty data dir, then > > ceph-osd -c /etc/ceph/ceph.conf -i 3 --monmap /tmp/monmap --mkfs > > (I tried wi

Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Sylvain Munaut
Hi Loic, > By restoring the fsid file from the backup, presumably. I did not think of that > when you showed the ceph-osd mkfs line, but it makes sense. This is not the > ceph fsid. Yeah, I thought about that and I saw fsid and ceph_fsid, but I wasn't sure that just replacing the file would be en

Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Loic Dachary
Hi Sylvain, On 02/07/2014 11:13, Sylvain Munaut wrote: > Ah, I finally found something that looks like an error message: > > 2014-07-02 11:07:57.817269 7f0692e3a700 7 mon.a@0(leader).osd e1147 > preprocess_boot from osd.3 10.192.2.70:6807/9702 clashes with existing > osd: different fsid (ours: e

Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Sylvain Munaut
Just for future reference, you actually do need to remove the OSD even if you're going to re-add it 10 seconds later: $ ceph osd rm 3 removed osd.3 $ ceph osd create 3 Then it works fine. No need to remove it from the CRUSH map or remove the auth key (you can re-use both), but you need to remove/add
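The remove/re-create sequence from this message can be sketched as a small dry-run helper. This is only an illustration: the function name `recreate_osd_id` and the `RUN` switch are hypothetical, and the two `ceph` commands are copied as written in the message. By default it only prints the commands, so it can be reviewed before anything touches a cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph): replay the two commands quoted above
# for a given OSD id. With RUN unset or 0 it only prints them (dry run).
# With RUN=1 it executes them, which assumes a working `ceph` CLI and an
# admin keyring on this host.
recreate_osd_id() {
    id="$1"
    for cmd in "ceph osd rm $id" "ceph osd create $id"; do
        if [ "${RUN:-0}" = "1" ]; then
            # Intentionally unquoted so the string splits into command + args.
            $cmd || return 1
        else
            echo "$cmd"
        fi
    done
}
```

Calling `recreate_osd_id 3` without `RUN=1` prints the two commands for osd.3 in the order given in the message. As the message notes, the CRUSH map entry and the auth key are left alone.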

Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Sylvain Munaut
Ah, I finally found something that looks like an error message: 2014-07-02 11:07:57.817269 7f0692e3a700 7 mon.a@0(leader).osd e1147 preprocess_boot from osd.3 10.192.2.70:6807/9702 clashes with existing osd: different fsid (ours: e44c914a-23e9-4756-9713-166de401dec6 ; theirs: c1cfff2f-4f2e-4c1d-a
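To look for this condition in a monitor log, a fixed-string search for the message text quoted above is probably enough. The sketch below is a hypothetical helper (`find_fsid_clash` is not a Ceph tool), and the log file path passed to it is whatever your monitor writes to; no particular location is assumed here.

```shell
#!/bin/sh
# Hypothetical helper: print log lines that report an OSD fsid clash.
# The search string is taken verbatim from the monitor message quoted above.
# Arguments are one or more log files chosen by the caller.
find_fsid_clash() {
    grep -F 'clashes with existing osd: different fsid' "$@"
}
```

If the search returns a line for the OSD being replaced, the thread's conclusion applies: the monitor still holds the old OSD's fsid, and the OSD id has to be removed and created again before the new daemon can boot.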

Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Sylvain Munaut
Hi, > Does OSD 3 show when you run ceph pg dump? If so I would look in the logs of an > OSD which is participating in the same PG. It appears at the end but not in any PG; it has now been marked out and everything was redistributed. osdstat kbused kbavail kb hb in hb out 0 15602352158

Re: [ceph-users] Replacing an OSD

2014-07-01 Thread Loic Dachary
On 01/07/2014 18:21, Sylvain Munaut wrote: > Hi, > >>> And then I start the process, and it starts fine. >>> http://pastebin.com/TPzNth6P >>> I even see one active TCP connection to a mon from that process. >>> >>> But the osd never becomes "up" or does anything ... >> >> I suppose there are erro

Re: [ceph-users] Replacing an OSD

2014-07-01 Thread Sylvain Munaut
Hi, >> And then I start the process, and it starts fine. >> http://pastebin.com/TPzNth6P >> I even see one active TCP connection to a mon from that process. >> >> But the osd never becomes "up" or does anything ... > > I suppose there are error messages in logs somewhere regarding the fact that >

Re: [ceph-users] Replacing an OSD

2014-07-01 Thread Loic Dachary
Hi, On 01/07/2014 17:48, Sylvain Munaut wrote: > Hi, > > > As an exercise, I killed an OSD today, just killed the process and > removed its data directory. > > To recreate it, I recreated an empty data dir, then > > ceph-osd -c /etc/ceph/ceph.conf -i 3 --monmap /tmp/monmap --mkfs > > (I tried

[ceph-users] Replacing an OSD

2014-07-01 Thread Sylvain Munaut
Hi, As an exercise, I killed an OSD today, just killed the process and removed its data directory. To recreate it, I recreated an empty data dir, then ceph-osd -c /etc/ceph/ceph.conf -i 3 --monmap /tmp/monmap --mkfs (I tried with and without giving the monmap). I then restored the keyring fil

Re: [ceph-users] replacing an OSD or crush map sensitivity

2013-06-03 Thread Nigel Williams
On Tue, Jun 4, 2013 at 1:59 PM, Sage Weil wrote: > On Tue, 4 Jun 2013, Nigel Williams wrote: >> Something else I noticed: ... > > Does the monitor data directory share a disk with an OSD? If so, that > makes sense: compaction freed enough space to drop below the threshold... Of course! that is

Re: [ceph-users] replacing an OSD or crush map sensitivity

2013-06-03 Thread Sage Weil
On Tue, 4 Jun 2013, Nigel Williams wrote: > On 4/06/2013 9:16 AM, Chen, Xiaoxi wrote: > > my 0.02: you really don't need to wait for health_ok between your > > recovery steps, just go ahead. Every time a new map is generated and > > broadcast, the old map and in-progress recovery will be canceled >

Re: [ceph-users] replacing an OSD or crush map sensitivity

2013-06-03 Thread Nigel Williams
On 4/06/2013 9:16 AM, Chen, Xiaoxi wrote: > my 0.02: you really don't need to wait for health_ok between your > recovery steps, just go ahead. Every time a new map is generated and > broadcast, the old map and in-progress recovery will be canceled Thanks Xiaoxi, that is helpful to know. It seems to

Re: [ceph-users] replacing an OSD or crush map sensitivity

2013-06-03 Thread Chen, Xiaoxi
My 0.02: you really don't need to wait for health_ok between your recovery steps, just go ahead. Every time a new map is generated and broadcast, the old map and in-progress recovery will be canceled. Sent from my iPhone. On 2013-6-2, at 11:30, "Nigel Williams" wrote: > Could I have a critique of this approach pl

[ceph-users] replacing an OSD or crush map sensitivity

2013-06-01 Thread Nigel Williams
Could I have a critique of this approach please as to how I could have done it better or whether what I experienced simply reflects work still to be done. This is with Ceph 0.61.2 on a quite slow test cluster (logs shared with OSDs, no separate journals, using CephFS). I knocked the power co