On Fri, Jan 9, 2015 at 7:17 AM, Robert LeBlanc wrote:
> Protect against bit rot. Checked on read and on deep scrub.
There are still issues (at least in firefly) with FDCache and scrub
completion having corrupted on-disk data, so thorough checksumming
will not cover every possible corruption case.
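For context, a deep scrub (and a follow-up repair) can also be triggered by
hand; the PG and OSD ids below are only placeholders:

    ceph pg deep-scrub 2.1f     # deep-scrub a single PG now
    ceph pg repair 2.1f         # ask the primary to repair inconsistencies found
    ceph osd deep-scrub 12      # or deep-scrub everything on osd.12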
On Thu, 8 Jan 2015 21:17:12 -0700 Robert LeBlanc wrote:
> On Thu, Jan 8, 2015 at 8:31 PM, Christian Balzer wrote:
> > On Thu, 8 Jan 2015 11:41:37 -0700 Robert LeBlanc wrote:
> > Which of course currently means a strongly consistent lockup in these
> > scenarios. ^o^
>
That is one way of putting it.
On Thu, Jan 8, 2015 at 8:31 PM, Christian Balzer wrote:
> On Thu, 8 Jan 2015 11:41:37 -0700 Robert LeBlanc wrote:
> Which of course currently means a strongly consistent lockup in these
> scenarios. ^o^
That is one way of putting it
> Slightly off-topic and snarky, that strong consistency is of
On Thu, 8 Jan 2015 11:41:37 -0700 Robert LeBlanc wrote:
> On Wed, Jan 7, 2015 at 10:55 PM, Christian Balzer wrote:
> > Which of course begs the question of why not having min_size at 1
> > permanently, so that in the (hopefully rare) case of losing 2 OSDs at
> > the same time your cluster still keeps working (as it should with a
> > size of 3).
On Wed, Jan 7, 2015 at 9:55 PM, Christian Balzer wrote:
> On Wed, 7 Jan 2015 17:07:46 -0800 Craig Lewis wrote:
>
>> On Mon, Dec 29, 2014 at 4:49 PM, Alexandre Oliva wrote:
>>
>> > However, I suspect that temporarily setting min size to a lower number
>> > could be enough for the PGs to recover.
On Wed, Jan 7, 2015 at 10:55 PM, Christian Balzer wrote:
> Which of course begs the question of why not having min_size at 1
> permanently, so that in the (hopefully rare) case of losing 2 OSDs at the
> same time your cluster still keeps working (as it should with a size of 3).
The idea is that
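For anyone following along, size and min_size are per-pool settings; checking
and changing them looks roughly like this (pool name "rbd" is only an example):

    ceph osd pool get rbd size
    ceph osd pool get rbd min_size
    ceph osd pool set rbd min_size 1    # temporarily accept I/O with a single replica
    ceph osd pool set rbd min_size 2    # restore once recovery has finished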
On Wed, 7 Jan 2015 17:07:46 -0800 Craig Lewis wrote:
> On Mon, Dec 29, 2014 at 4:49 PM, Alexandre Oliva wrote:
>
> > However, I suspect that temporarily setting min size to a lower number
> > could be enough for the PGs to recover. If "ceph osd pool set
> > min_size 1" doesn't get the PGs goin
On Mon, Dec 29, 2014 at 4:49 PM, Alexandre Oliva wrote:
> However, I suspect that temporarily setting min size to a lower number
> could be enough for the PGs to recover. If "ceph osd pool set
> min_size 1" doesn't get the PGs going, I suppose restarting at least one
> of the OSDs involved in t
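A rough sketch of that procedure (PG ids, the pool name and the OSD number are
placeholders, and the restart command depends on your distro/init system):

    ceph health detail | grep incomplete    # shows each stuck PG and its acting set, e.g. [12,45,103]
    ceph osd pool set rbd min_size 1
    service ceph restart osd.12             # sysvinit; or /etc/init.d/ceph restart osd.12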
Hi Eneko,
nope, new pool has all pgs active+clean, no errors during image
creation. The format command just hangs, without error.
Am 30.12.2014 12:33, schrieb Eneko Lacunza:
> Hi Christian,
>
> New pool's pgs also show as incomplete?
>
> Did you notice something remarkable in ceph logs in the new pool's image format?
Hi Christian,
New pool's pgs also show as incomplete?
Did you notice something remarkable in ceph logs in the new pool's image
format?
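(Listing which PGs are incomplete and where they map can be checked with
something like the following; output fields vary a bit between releases:

    ceph health detail | grep incomplete
    ceph pg dump_stuck inactive
)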
On 30/12/14 12:31, Christian Eichelmann wrote:
Hi Eneko,
I was trying an rbd cp before, but that was hanging as well. But I
couldn't find out if the source image was causing the hang or the
destination image.
Hi Eneko,
I was trying an rbd cp before, but that was hanging as well. But I
couldn't find out if the source image was causing the hang or the
destination image. That's why I decided to try a posix copy.
Our cluster is still nearly empty (12TB / 867TB). But as far as I
understood (If not, somebody p
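For completeness, copying an image between pools is normally either of these
(pool and image names are placeholders):

    rbd cp oldpool/image newpool/image
    rbd export oldpool/image - | rbd import - newpool/image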
Hi Christian,
Have you tried to migrate the disk from the old storage (pool) to the
new one?
I think it should show the same problem, but I think it'd be a much
easier path to recover than the posix copy.
How full is your storage?
Maybe you can customize the crushmap, so that some OSDs are
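Editing the crushmap by hand follows the usual round trip (file names are
arbitrary):

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # edit crush.txt, e.g. point the new pool's rule at specific hosts/OSDs
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new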
Hi Nico and all others who answered,
After some more trying to somehow get the pgs in a working state (I've
tried force_create_pg, which was putting them in creating state. But
that was obviously not true, since after rebooting one of the containing
OSDs it went back to incomplete), I decided to
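For reference, the force_create_pg mentioned above is invoked per PG (the PG
id here is just a placeholder):

    ceph pg force_create_pg 2.1f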
On Dec 29, 2014, Christian Eichelmann wrote:
> After we got everything up and running again, we still have 3 PGs in the
> state incomplete. I was checking one of them directly on the systems
> (replication factor is 3).
I have run into this myself at least twice before. I had not lost or
replac
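A sketch of what "checking directly on the systems" typically involves; PG id,
OSD number and path are examples (filestore layout of that era):

    ceph pg 2.1f query                                  # peering state and why the PG is stuck
    ls /var/lib/ceph/osd/ceph-12/current/2.1f_head/     # the PG's objects on one of the acting OSDs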
On Mon, Dec 29, 2014 at 12:56 PM, Christian Eichelmann wrote:
> Hi all,
>
> we have a ceph cluster, with currently 360 OSDs in 11 Systems. Last week
> we were replacing one OSD System with a new one. During that, we had a
> lot of problems with OSDs crashing on all of our systems. But that is
> not our current problem.
Hi Christian,
I had a similar problem about a month ago.
After trying lots of helpful suggestions, I found none of it worked and
I could only delete the affected pools and start over.
I opened a feature request in the tracker:
http://tracker.ceph.com/issues/10098
If you find a way, let
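For the record, deleting and recreating a pool is roughly this (pool name and
pg_num are placeholders; the pool name has to be given twice on delete):

    ceph osd pool delete mypool mypool --yes-i-really-really-mean-it
    ceph osd pool create mypool 2048 2048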
Hey Christian,
Christian Eichelmann [Mon, Dec 29, 2014 at 10:56:59AM +0100]:
> [incomplete PG / RBD hanging, osd lost also not helping]
that is very interesting to hear, because we had a similar situation
with ceph 0.80.7 and had to re-create a pool, after I deleted 3 pg
directories to allow OSDs
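The "osd lost" referred to above marks a dead OSD as permanently gone so
peering can proceed without it; roughly (OSD id is a placeholder):

    ceph osd lost 23 --yes-i-really-mean-it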
Hi all,
we have a ceph cluster, with currently 360 OSDs in 11 Systems. Last week
we were replacing one OSD System with a new one. During that, we had a
lot of problems with OSDs crashing on all of our systems. But that is
not our current problem.
After we got everything up and running again, we still have 3 PGs in the
state incomplete.