Re: [ceph-users] Power outages!!! help!

2017-09-28 Thread hjcho616
table-tool all reset session http://docs.ceph.com/docs/jewel/cephfs/disaster-recovery/ Restarted MDS.  HEALTH_WARN no legacy OSD present but 'sortbitwise' flag is not set Mounted!  Thank you everyone for the help!  Learned a lot! Regards, Hong On Friday, September 22, 2017 1:01 AM, hj

Re: [ceph-users] Power outages!!! help!

2017-09-21 Thread hjcho616
On Thursday, September 21, 2017 1:46 AM, Ronny Aasen wrote: On 21. sep. 2017 00:35, hjcho616 wrote: > # rados list-inconsistent-pg data > ["0.0","0.5","0.a","0.e","0.1c","0.29","0.2c"] > # rados list-inconsis

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
I don't know how long "wait a bit" is; I just turned it back on after a minute or so, and it just returns to the same inconsistent message.. =P  Are we looking for the entire stopped OSD to map to a different OSD and get 3 replicas when running the stopped OSD again? Regards, Hong On Wedn

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
ll manually as on http://ceph.com/geen-categorie/ceph-manually-repair-object/ good luck Ronny Aasen On 20.09.2017 22:17, hjcho616 wrote: Thanks Ronny. I decided to try to tar everything under the current directory.  Is this the correct command for it?  Is there any directory we do not

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
degraded; no legacy OSD present but 'sortbitwise' flag is not set Regards, Hong On Wednesday, September 20, 2017 11:53 AM, Ronny Aasen wrote: On 20.09.2017 16:49, hjcho616 wrote: Anyone?  Can this page be saved?  If not what are my options? Regards, Hong On S

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
Anyone?  Can this page be saved?  If not what are my options? Regards, Hong On Saturday, September 16, 2017 1:55 AM, hjcho616 wrote: Looking better... working on scrubbing.. HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs incomplete; 12 pgs inconsistent; 2 pgs

Re: [ceph-users] Power outages!!! help!

2017-09-15 Thread hjcho616
Looking better... working on scrubbing.. HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs incomplete; 12 pgs inconsistent; 2 pgs repair; 1 pgs stuck inactive; 1 pgs stuck unclean; 109 scrub errors; too few PGs per OSD (29 < min 30); mds rank 0 has failed; mds cluster is degrad

Re: [ceph-users] Power outages!!! help!

2017-09-15 Thread hjcho616
er again using the method linked a few times in this thread. How did that go, were you successful in recovering those pg's? kind regards. Ronny Aasen On 15. sep. 2017 07:52, hjcho616 wrote: > I just did this and backfilling started.  Let's see where this takes me. > ceph osd

Re: [ceph-users] Power outages!!! help!

2017-09-14 Thread hjcho616
I just did this and backfilling started.  Let's see where this takes me. ceph osd lost 0 --yes-i-really-mean-it Regards, Hong On Friday, September 15, 2017 12:44 AM, hjcho616 wrote: Ronny, Working with all of the pgs shown in the "ceph health detail", I ran below

Re: [ceph-users] Power outages!!! help!

2017-09-14 Thread hjcho616
Ronny, Working with all of the pgs shown in the "ceph health detail", I ran the command below for each PG to export. ceph-objectstore-tool --op export --pgid 0.1c --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --skip-journal-replay --file 0.1c.export I have all PGs ex
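
For anyone repeating the per-PG export described above, here is a minimal sketch that only builds the command lines, using the same flags quoted in the post. The helper names `build_export_cmd` and `export_script` are made up for illustration, and the PG list is a placeholder that would really come from "ceph health detail". It does not run anything; the OSD has to be stopped before ceph-objectstore-tool can touch its data path.

```python
import shlex


def build_export_cmd(pgid, osd_id, skip_journal_replay=True):
    """Return the argv list for exporting one PG from a stopped OSD.

    Hypothetical helper: the flags mirror the command quoted in the post,
    and the paths assume the default /var/lib/ceph/osd/ceph-<id> layout.
    """
    data_path = f"/var/lib/ceph/osd/ceph-{osd_id}"
    cmd = [
        "ceph-objectstore-tool",
        "--op", "export",
        "--pgid", pgid,
        "--data-path", data_path,
        "--journal-path", f"{data_path}/journal",
    ]
    if skip_journal_replay:
        # The post used this flag; drop it if the journal is healthy.
        cmd.append("--skip-journal-replay")
    cmd += ["--file", f"{pgid}.export"]
    return cmd


def export_script(pgids, osd_id):
    """Join one export command per PG into a reviewable shell script."""
    return "\n".join(shlex.join(build_export_cmd(pg, osd_id)) for pg in pgids)


if __name__ == "__main__":
    # Placeholder PG ids; substitute the ones reported for your cluster.
    print(export_script(["0.1c", "0.29", "0.2c"], osd_id=0))
```

Printing the commands first, rather than executing them, leaves room to review each line before running it against a damaged OSD.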

Re: [ceph-users] Power outages!!! help!

2017-09-13 Thread hjcho616
Ronny, Just tried hooking osd.0 back up.  osd.0 seems to be better as I was able to run ceph-objectstore-tool export, so I decided to try hooking it up.  Looks like the journal is not happy.  Is there any way to get this running?  Or do I need to start getting data using ceph-objectstore-tool? 2017-09

Re: [ceph-users] Power outages!!! help!

2017-09-12 Thread hjcho616
urnal-path /var/lib/ceph/osd/ceph-0/journal --file 0.2c.export.0 Failure to read OSD superblock: (2) No such file or directory Regards, Hong On Tuesday, September 12, 2017 10:04 AM, hjcho616 wrote: Thank you for those references!  I'll have to go study some more.  Good portion

Re: [ceph-users] Power outages!!! help!

2017-09-12 Thread hjcho616
sible. if you manage to get one running, let it recover and stabilize. - recover and inject objects from osd's that do not run. start by doing one pg at a time. and once you get the hang of the method you can do multiple pg's at the same time. good luck Ronny Aasen On 11. sep. 2

Re: [ceph-users] Power outages!!! help!

2017-09-10 Thread hjcho616
It took a while.  It appears to have cleaned up quite a bit... but still has issues.  I've been seeing the message below for more than a day and cpu utilization and io utilization are low... looks like something is stuck...  I rebooted OSDs several times when it looked like it was stuck earlier and i

Re: [ceph-users] Power outages!!! help!

2017-09-04 Thread hjcho616
-tool --op export --pgid 2.2f --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --file 2.2f.export Failure to read OSD superblock: (2) No such file or directory Regards, Hong On Monday, September 4, 2017 2:29 AM, hjcho616 wrote: Ronny, While letting cl

Re: [ceph-users] Power outages!!! help!

2017-09-04 Thread hjcho616
be that means I need to export both and import both?  If I have to get both, is there a need to merge the two before importing?  Or would the tool know how to handle this? Regards, Hong On Monday, September 4, 2017 1:20 AM, hjcho616 wrote: Thank you Ronny.  I've added two OSDs to OSD

Re: [ceph-users] Power outages!!! help!

2017-09-03 Thread hjcho616
sd to the cluster. kind regards Ronny Aasen On 03.09.2017 06:20, hjcho616 wrote: I checked with ceph-2, 3, 4, 5 so I figured it was safe to assume that superblock file is the same.  I copied it over and started OSD.  It still fails with the same error message.  Looks like when I upda

Re: [ceph-users] Power outages!!! help!

2017-09-02 Thread hjcho616
Regards, Hong On Friday, September 1, 2017 11:10 PM, hjcho616 wrote: Just realized there is a file called superblock in the ceph directory.  ceph-1 and ceph-2's superblock files are identical, ceph-6 and ceph-7 are identical, but not between the two groups.  When I originally created the

Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
/2473662 objects misplaced (10.297%); recovery 103/2251966 unfound (0.005%); 7 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set Regards, Hong On Friday, September 1, 2017 10:37 PM, hjcho616 wrote: Tried connecting recovered osd.  Look

Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
journal_uuid  magic  superblock  whoami  active  fsid  keyring  ready  sysvinit  ceph_fsid  journal  lost+found  store_version  type Regards, Hong On Friday, September 1, 2017 2:59 PM, hjcho616 wrote: Found the partition, wasn't able to mount the part

Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
to use this drive if the data is missing? =)  Or am I being paranoid?   Just plug it? =) Regards, Hong On Friday, September 1, 2017 9:01 AM, hjcho616 wrote: Looks like it has been rescued... Only 1 error as we saw before in the smart log! # ddrescue -f /dev/sda /dev/sdc ./rescue.log GNU

Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
b with batteries is: - the closer to “proper temperature” you run them, the more life you get out of them - the more the battery is overpowered for your application, the longer it will survive. Get yourself an LSI 94** controller and use it as HBA and you will be fine. but get MORE DRIVES ! …  On 28 Aug 2017,

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread hjcho616
uck reasing osd_backfill_full_ratio to 92% it may fix things. /Maged On 2017-08-29 21:13, hjcho616 wrote: Nice!  Thank you for the explanation!  I feel like I can revive that OSD. =)   That does sound great.  I don't quite have another cluster so waiting for a drive to arrive! =)   After setti

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread hjcho616
controller and use it as HBA and you will be > fine. but get MORE DRIVES ! …  > > On 28 Aug 2017, at 23:10, hjcho616 wrote: > > > > Thank you Tomasz and Ronny.  I'll have to order some hdd soon and > > try these out.  Car battery idea is nice!  I may try that.. =

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
ck of space 3. Follow the advice of Ronny Aasen on how to recover data from hard drives 4. get cooling to drives or you will lose more!  On 28 Aug 2017, at 22:39, hjcho616 wrote: Tomasz, Those machines are behind a surge protector.  Doesn't appear to be a good one!   I do have a UPS... but it

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
Tomasz, Those machines are behind a surge protector.  Doesn't appear to be a good one!   I do have a UPS... but it is my fault... no battery.  Power was pretty reliable for a while... and UPS was just beeping every chance it had, disrupting some sleep.. =P  So running on surge protector only.  I

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
, August 28, 2017 3:24 PM, Tomasz Kusmierz wrote: I think you’ve got your answer: 197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       1 On 28 Aug 2017, at 21:22, hjcho616 wrote: Steve, I thought that was odd too..  Below is from the log, This captures

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
. | | Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation 380 Data Drive Suite 300 | Draper | Utah | 84020 Office: 801.871.2799 | | | If you are not the intended recipient of this message or received it erroneously, please notify the sender and delete it, together with a

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
8, 2017 12:53 PM, Ronny Aasen wrote: comments inline On 28.08.2017 18:31, hjcho616 wrote: I'll see what I can do on that... Looks like I may have to add another OSD host as I utilized all of the SATA ports on those boards. =P Ronny, I am running with size=2 min_siz

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
hat is that it seems that you’ve got issues everywhere and since you are running a production environment (at least it seems like that to me) data and downtime are the main priority. > On 28 Aug 2017, at 11:58, Ronny Aasen wrote: > > On 28. aug. 2017 08:01, hjcho616 wrote: >> Hell

[ceph-users] Power outages!!! help!

2017-08-27 Thread hjcho616
Hello! I've been using ceph for a long time, mostly for network CephFS storage, even before the Argonaut release!  It's been working very well for me.  Yes, I had some power outages before and asked a few questions on this list before and got them resolved happily!  Thank you all! Not sure why but we've been

Re: [ceph-users] infernalis and jewel upgrades...

2016-04-15 Thread hjcho616
.16024__0_4E98A1D9__none root@OSD2:/var/lib/ceph/osd# diff ./ceph-3/current/meta/DIR_9/DIR_D/osdmap.16024__0_4E98A1D9__none ./ceph-5/current/meta/DIR_9/DIR_D/osdmap.16024__0_4E98A1D9__none Regards, Hong On Saturday, April 16, 2016 12:35 AM, hjcho616 wrote: osd.3 did have a full version when I

[ceph-users] infernalis and jewel upgrades...

2016-04-15 Thread hjcho616
I've been successfully running cephfs on my Debian Jessies for a while and one day after a power outage, MDS wasn't happy.  MDS was crashing after it was done loading, increasing the memory utilization quite a bit.  I was running infernalis 9.2.0 and did a successful upgrade from Hammer before... so I t

Re: [ceph-users] Power Outage

2014-08-12 Thread hjcho616
, August 12, 2014 3:02 PM, Craig Lewis wrote: I can't really help with MDS.  Hopefully somebody else will chime in here. (Resending, because my last reply was too large.) On Tue, Aug 12, 2014 at 12:44 PM, hjcho616 wrote: Craig, > > >Thanks.  It turns out one of my memory stick

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-27 Thread hjcho616
Looks like client is waking up ok now.  Thanks. Will those fixes be included in next release? Firefly? Regards, Hong From: hjcho616 To: Gregory Farnum Cc: "ceph-users@lists.ceph.com" Sent: Tuesday, March 25, 2014 11:56 AM Subject: Re: [ceph-

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-25 Thread hjcho616
continued way before the wake event.   I'll monitor the sleep and wake a few more times and see if it is good. Thanks. Regards, Hong From: Gregory Farnum To: hjcho616 Cc: Mohd Bazli Ab Karim ; "Yan, Zheng" ; Sage Weil ; "ceph-users@lis

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-25 Thread hjcho616
Regards, Hong From: Gregory Farnum To: hjcho616 Cc: Mohd Bazli Ab Karim ; "Yan, Zheng" ; Sage Weil ; "ceph-users@lists.ceph.com" Sent: Tuesday, March 25, 2014 11:05 AM Subject: Re: [ceph-users] MDS crash when client goes to sleep On Mon, Mar 24, 2014 at 6:26

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-20 Thread hjcho616
ar in 0.72 emperor.  I am using debian packages. Client went to sleep for a while (like 8+ hours).  There was no I/O prior to the sleep other than the fact that cephfs was still mounted. Regards, Hong From: Luke Jing Yuan To: hjcho616 Cc: Mohd Bazli Ab Kar

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-20 Thread hjcho616
]: segfault at 200 ip 7f36c3d480b8 sp 7f36c07d3520 error 4 in libgcc_s.so.1[7f36c3d39000+15000] Regards, Hong From: Luke Jing Yuan To: hjcho616 Cc: Mohd Bazli Ab Karim ; "ceph-users@lists.ceph.com" Sent: Thursday, March 20, 2014 10:53

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-20 Thread hjcho616
: hjcho616 ; "ceph-users@lists.ceph.com" Sent: Thursday, March 20, 2014 9:40 PM Subject: RE: [ceph-users] MDS crash when client goes to sleep Hi Hong, May I know what has happened to your MDS once it crashed? Was it able to recover from replay? We are also facing this issue and I am int

[ceph-users] MDS crash when client goes to sleep

2014-03-20 Thread hjcho616
When CephFS is mounted on a client and the client decides to go to sleep, MDS segfaults.  Has anyone seen this?  Below is a part of the MDS log.  This happened in emperor and the recent 0.77 release.  I am running Debian Wheezy with testing kernels 3.13.  What can I do to not crash the whole system if