Hi HP.

Mine was not really a fix. It was just a hack to keep the OSD up long enough
to make sure I had a full backup; then I rebuilt the cluster from scratch
and restored the data.  The hack did stop the OSD from crashing, but the
crash is probably a symptom of some internal problem, so it may not be
"safe" to run like that in the long term.

The change was something like this:

Ref:  https://github.com/ceph/ceph/blob/master/src/osd/ReplicatedPG.cc

I changed this:

ObjectContextRef obc = get_object_context(oid, false);
assert(obc);
--ctx->delta_stats.num_objects;
--ctx->delta_stats.num_objects_hit_set_archive;
ctx->delta_stats.num_bytes -= obc->obs.oi.size;
ctx->delta_stats.num_bytes_hit_set_archive -= obc->obs.oi.size;

to this:

ObjectContextRef obc = 0; // get_object_context(oid, false); assert(obc);
--ctx->delta_stats.num_objects;
--ctx->delta_stats.num_objects_hit_set_archive;
if (obc)
{
  ctx->delta_stats.num_bytes -= obc->obs.oi.size;
  ctx->delta_stats.num_bytes_hit_set_archive -= obc->obs.oi.size;
}
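
If it helps to see the effect in isolation, here is a minimal standalone
sketch of the same idea. Everything in it is a stand-in I made up for
illustration (DeltaStats, ObjectContext, remove_archive_strict and
remove_archive_guarded are not Ceph's real types or functions), so treat it
as a rough model and not as the actual OSD code. Note that because the hack
forces obc to null, the byte counters are never adjusted, which is one reason
I would not trust the pool stats afterwards.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-ins for illustration only; NOT Ceph's real types.
struct ObjectInfo { uint64_t size = 0; };
struct ObjectContext { ObjectInfo oi; };
using ObjectContextRef = std::shared_ptr<ObjectContext>;

struct DeltaStats {
  int64_t num_objects = 0;
  int64_t num_objects_hit_set_archive = 0;
  int64_t num_bytes = 0;
  int64_t num_bytes_hit_set_archive = 0;
};

// Models the original behaviour: a missing object context trips the assert.
void remove_archive_strict(DeltaStats& s, const ObjectContextRef& obc) {
  assert(obc);
  --s.num_objects;
  --s.num_objects_hit_set_archive;
  s.num_bytes -= static_cast<int64_t>(obc->oi.size);
  s.num_bytes_hit_set_archive -= static_cast<int64_t>(obc->oi.size);
}

// Models the hack: object counts are always decremented, but the byte
// counts are only touched when a context is actually available.
void remove_archive_guarded(DeltaStats& s, const ObjectContextRef& obc) {
  --s.num_objects;
  --s.num_objects_hit_set_archive;
  if (obc) {
    s.num_bytes -= static_cast<int64_t>(obc->oi.size);
    s.num_bytes_hit_set_archive -= static_cast<int64_t>(obc->oi.size);
  }
}
```

Calling remove_archive_guarded(stats, nullptr) corresponds to what my patched
OSD did on every pass: the object counters go down by one and the byte
counters stay where they were.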


Good luck!
Blade.


On Sat, Aug 13, 2016 at 5:52 AM, Hein-Pieter van Braam <h...@tmm.cx> wrote:

> Hi Blade,
>
> I appear to be stuck in the same situation you were in. Do you still
> happen to have a patch to implement this workaround you described?
>
> Thanks,
>
> - HP
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
