Re: [ceph-users] scrub errors continue with 0.80.4

2014-07-18 Thread Gregory Farnum
It's just because the PG hadn't been scrubbed since the error occurred; then you upgraded, it scrubbed, and the error was found. You can deep-scrub all your PGs to check them if you like, but as I've said elsewhere this issue -- while scary! -- shouldn't actually damage any of your user data, so ju
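One way to run that full check is to kick a deep scrub on every OSD; a rough sketch (it generates a lot of extra I/O, so stagger it on a busy cluster):

    for osd in $(ceph osd ls); do
        ceph osd deep-scrub "$osd"    # asks osd.N to deep-scrub its PGs
    done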

Re: [ceph-users] scrub errors continue with 0.80.4

2014-07-18 Thread Randy Smith
Greg, This error occurred AFTER the upgrade. I upgraded to 0.80.4 last night and this error cropped up this afternoon. I ran `ceph pg repair 3.7f` (after I copied the pgs) which returned the cluster to health. However, I'm concerned that this showed up again so soon after I upgraded to 0.80.4. Is
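For anyone wanting the same safety net, a rough sketch of copying a PG aside before repairing it (assumes the default filestore layout of that era; the PG id and OSD ids are from this thread, the paths are placeholders):

    ceph pg map 3.7f      # shows the acting set, here [0,4]
    # on each acting OSD's host:
    cp -a /var/lib/ceph/osd/ceph-0/current/3.7f_head/ /root/backup-3.7f-osd0/
    ceph pg repair 3.7f   # only after the copies are safely stashed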

Re: [ceph-users] Mon won't start, possibly due to corrupt disk?

2014-07-18 Thread Gregory Farnum
Keep in mind that this has thrown out all the auth info in your cluster, so if you ever do enable cephx you'll need to re-assign all the keys. And you might be in line for some other strangeness as well that I haven't foreseen down the line. In the meanwhile, I've forwarded things on to our full-t
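Should cephx be switched on later, the keys would have to be recreated; a hedged sketch of the sort of commands involved (entity names and capabilities here are illustrative, not taken from the thread):

    ceph auth get-or-create client.admin mon 'allow *' osd 'allow *' mds 'allow' \
        -o /etc/ceph/ceph.client.admin.keyring
    ceph auth get-or-create osd.0 mon 'allow profile osd' osd 'allow *' \
        -o /var/lib/ceph/osd/ceph-0/keyring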

Re: [ceph-users] health_err on osd full

2014-07-18 Thread James Eckersall
Thanks Greg. I appreciate the advice, and very quick replies too :) On 18 July 2014 23:35, Gregory Farnum wrote: > On Fri, Jul 18, 2014 at 3:29 PM, James Eckersall > wrote: > > Thanks Greg. > > > > Can I suggest that the documentation makes this much clearer? It might > just be me, but I cou

Re: [ceph-users] Mon won't start, possibly due to corrupt disk?

2014-07-18 Thread Lincoln Bryant
Thanks Greg. Just for posterity, "ceph-kvstore-tool /var/lib/ceph/mon/store.db set auth last_committed ver 0" did the trick and we're back to HEALTH_OK. Cheers, Lincoln Bryant On Jul 18, 2014, at 4:15 PM, Gregory Farnum wrote: > Hmm, this log is just leaving me with more questions. Could you ta
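Spelled out, the sequence was roughly the following (mon id and sysvinit-style service commands are illustrative; adjust to your init system and store path):

    service ceph stop mon.a
    ceph-kvstore-tool /var/lib/ceph/mon/store.db set auth last_committed ver 0
    service ceph start mon.a
    ceph health    # came back HEALTH_OK in this case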

Re: [ceph-users] health_err on osd full

2014-07-18 Thread Gregory Farnum
On Fri, Jul 18, 2014 at 3:29 PM, James Eckersall wrote: > Thanks Greg. > > Can I suggest that the documentation makes this much clearer? It might just > be me, but I couldn't glean this from the docs, so I expect I'm not the only > one. > > Also, can I clarify how many pg's you would suggest is

Re: [ceph-users] health_err on osd full

2014-07-18 Thread James Eckersall
Thanks Greg. Can I suggest that the documentation makes this much clearer? It might just be me, but I couldn't glean this from the docs, so I expect I'm not the only one. Also, can I clarify how many PGs you would suggest is a decent number for my setup? 80 OSDs across 4 nodes. 5 pools. I'
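For the PG-count question, the usual rule of thumb from the placement-group documentation (a general guideline, not a figure taken from Greg's reply) works out as follows for 80 OSDs at 3x replication:

    # total PGs ~= (OSDs * 100) / replica count, rounded up to a power of two
    echo $(( 80 * 100 / 3 ))    # 2666 -> round up to 4096, then divide across the 5 pools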

Re: [ceph-users] health_err on osd full

2014-07-18 Thread Gregory Farnum
Yes, that's expected behavior. Since the cluster can't move data around on its own, and lots of things will behave *very badly* if some of their writes go through but others don't, the cluster goes read-only once any OSD is full. That's why nearfull is a warn condition; you really want to even out
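The thresholds involved are ordinary config options; a ceph.conf sketch showing the stock defaults (raising them only buys a little headroom and is no substitute for evening out the data):

    [global]
        mon osd full ratio = .95        ; any single full OSD puts the cluster read-only
        mon osd nearfull ratio = .85    ; the HEALTH_WARN early warning Greg refers to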

[ceph-users] health_err on osd full

2014-07-18 Thread James Eckersall
Hi, I have a ceph cluster running on 0.80.1 with 80 OSDs. I've had fairly uneven distribution of the data and have been keeping it ticking along with "ceph osd reweight XX 0.x" commands on a few OSDs while I try to increase the pg count of the pools to hopefully better balance the data. Tonig
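For reference, the commands in play here look like the following (OSD id, pool name and PG counts are placeholders; pgp_num has to be raised after pg_num before placement actually changes):

    ceph osd reweight 12 0.8                # nudge data off an over-full OSD
    ceph osd pool set mypool pg_num 1024    # raise the PG count of a pool
    ceph osd pool set mypool pgp_num 1024   # then let placement use the new PGs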

Re: [ceph-users] scrub errors continue with 0.80.4

2014-07-18 Thread Gregory Farnum
The config option change in the upgrade will prevent *new* scrub errors from occurring, but it won't resolve existing ones. You'll need to run a scrub repair to fix those up. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Fri, Jul 18, 2014 at 2:59 PM, Randy Smith wrote: >

[ceph-users] scrub errors continue with 0.80.4

2014-07-18 Thread Randy Smith
Greetings, I upgraded to 0.80.4 last night to resolve the inconsistent pg scrub errors I was seeing. Unfortunately, they are continuing. $ ceph health detail HEALTH_ERR 1 pgs inconsistent; 1 scrub errors pg 3.7f is active+clean+inconsistent, acting [0,4] And here are the relevant log entries. 201
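To see what the scrub tripped over, the primary OSD's log is the place to look (osd.0 here, given the [0,4] acting set; default log path assumed):

    grep 3.7f /var/log/ceph/ceph-osd.0.log
    ceph pg 3.7f query    # current state and scrub stamps for the PG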

Re: [ceph-users] Mon won't start, possibly due to corrupt disk?

2014-07-18 Thread Gregory Farnum
Hmm, this log is just leaving me with more questions. Could you tar up the "/var/lib/ceph/mon/store.db" (substitute actual mon store path as necessary) and upload it for me? (you can use ceph-post-file to put it on our servers if you prefer.) Just from the log I don't have a great idea of what's go
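A sketch of the upload being asked for (ceph-post-file prints a tag that can be quoted back on the list; the store path is the one mentioned in this thread):

    tar czf mon-store.tar.gz /var/lib/ceph/mon/store.db
    ceph-post-file mon-store.tar.gz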

Re: [ceph-users] Not able to achieve active+clean state

2014-07-18 Thread Vincenzo Pii
You can change the pools' configuration using, for example, one of these two options: 1. define a different crush_ruleset for the pools to allow replicas to be placed on the same host (so 3 replicas can be accommodated on two hosts). 2. set the number of replicas to two. Option 2 is very easy: $ c
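A sketch of both options against the default pools of that release (data, metadata, rbd); option 1's CRUSH tweak normally goes into ceph.conf before the cluster is created:

    # option 2: two replicas, so two OSD hosts are enough
    ceph osd pool set data size 2
    ceph osd pool set metadata size 2
    ceph osd pool set rbd size 2
    # option 1 (alternative): let CRUSH place replicas on the same host,
    # e.g. "osd crush chooseleaf type = 0" in ceph.conf, per the quick-start
    # notes for small test clusters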

Re: [ceph-users] Possible to schedule deep scrub to nights?

2014-07-18 Thread Gregory Farnum
There's nothing built in to the system but I think some people have had success with scripts that set nobackfill during the day, and then trigger them regularly at night. Try searching the list archives. :) -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Fri, Jul 18, 2014 at
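A minimal cron sketch along the lines Greg describes, using the nodeep-scrub OSD flag (an assumption on my part; Greg's truncated reply mentions nobackfill, but nodeep-scrub is the flag that gates deep scrubs). In-flight scrubs still finish; this only blocks new ones:

    # /etc/cron.d/ceph-scrub-window (times are illustrative)
    0 7  * * *  root  ceph osd set nodeep-scrub      # block new deep scrubs for the day
    0 22 * * *  root  ceph osd unset nodeep-scrub    # allow them again overnight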

Re: [ceph-users] [openstack-dev] [Nova] [RBD] Copy-on-write cloning for RBD-backed disks

2014-07-18 Thread Russell Bryant
On 07/17/2014 03:07 PM, Dmitry Borodaenko wrote: > The meeting is in 2 hours, so you still have a chance to particilate > or at least lurk :) Note that this spec has 4 members of nova-core sponsoring it for an exception on the etherpad tracking potential exceptions. We'll review further in the me

Re: [ceph-users] v0.80.4 Firefly released

2014-07-18 Thread Dmitry Smirnov
Hi Sage, A little favour -- It will be helpful if you could CC release announcements to ceph-maintain...@lists.ceph.com please. It is much easier to notice there comparing to higher volume {devel,users} lists. Thank you very much. -- All the best, Dmitry Smirnov GPG key : 4096R/53968D1

Re: [ceph-users] Not able to achieve active+clean state

2014-07-18 Thread Joe Hewitt
Agree with Iban, adding one more OSD may help. I met the same problem when I created a cluster following the quick start guide. 2 OSD nodes gave me health_warn and then adding one more OSD made everything okay. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Iban Cabrill

Re: [ceph-users] Not able to achieve active+clean state

2014-07-18 Thread Iban Cabrillo
Hi Pratik, I am not an expert, but I think you need one more OSD server; the default pools (rbd, metadata, data) have 3 replicas by default. Regards, I On 18/07/2014 14:19, "Pratik Rupala" wrote: > Hi, > > I am deploying firefly version on CentOs 6.4. I am following quick > installation ins

Re: [ceph-users] Regarding ceph osd setmaxosd

2014-07-18 Thread Sage Weil
On Fri, 18 Jul 2014, Anand Bhat wrote: > I have question on intention of Ceph setmaxosd command. From source code, it > appears as if this is present as a way to limit the number of OSDs in the > Ceph cluster.  Yeah. It's basically sizing the array of OSDs in the OSDMap. It's a bit obsolete sin
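For reference, the command pair in question (the value is only an example):

    ceph osd getmaxosd       # prints the current max_osd in the OSDMap
    ceph osd setmaxosd 100   # resize that bound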

[ceph-users] Not able to achieve active+clean state

2014-07-18 Thread Pratik Rupala
Hi, I am deploying firefly version on CentOs 6.4. I am following quick installation instructions available at ceph.com. Kernel version in CentOs 6.4 is 2.6.32-358. I am using virtual machines for all the nodes. As per the setup, there are one admin-node, one monitor node and two OSD nodes. I
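A few commands that usually show why a fresh two-OSD-host cluster stops short of active+clean (the size-3 default that the replies in this thread point at):

    ceph -s                                   # overall cluster state
    ceph osd tree                             # confirm both OSDs are up and in
    ceph pg dump_stuck unclean                # which PGs never reached active+clean
    ceph osd dump | grep 'replicated size'    # default pools ship with size 3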

[ceph-users] Possible to schedule deep scrub to nights?

2014-07-18 Thread David
Are there any known workarounds to schedule deep scrubs to run nightly? Latency does go up a little bit when it runs so I’d rather that it didn’t affect our daily activities. Kind Regards, David