On Fri, 1 Jun 2012, Alex Elder wrote:
> On 06/01/2012 11:20 AM, Sage Weil wrote:
> > The problem is that socket events queue work, which can take a while, and
> > race with, say, osd_client getting an osdmap and dropping its
> > struct ceph_osd.  The ->get and ->put ops just twiddle the containing
> > struct's refcount, in that case, so the con_work will find the (now
> > closed) ceph_connection and do nothing...
> 
> I think you're saying that the connection (or its socket) needs to
> be protected from its containing structure going away.  So the
> connection needs to hold a reference to its container.  If that's
> the case then the disposal of the ceph_osd needs to clean up
> the connection fully before it goes away.

Yeah.  I think it happens already before we drop the ref:

static void __remove_osd(struct ceph_osd_client *osdc, struct ceph_osd *osd)
{
        dout("__remove_osd %p\n", osd);
        BUG_ON(!list_empty(&osd->o_requests));
        rb_erase(&osd->o_node, &osdc->osds);
        list_del_init(&osd->o_osd_lru);
        ceph_con_close(&osd->o_con);
        put_osd(osd);
}

So it's just the con reference in the workqueue that matters.
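
A minimal userspace sketch of that point, under stated assumptions: the
names below (struct osd, struct connection, queue_con_work, con_work,
remove_osd) are hypothetical stand-ins, not the kernel definitions, and
the "workqueue" is just a pending flag.  It only illustrates the pattern
described above: ->get and ->put twiddle the containing struct's
refcount, queued work holds one of those references, and the work item
finds a closed connection and does nothing before dropping the last ref.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct connection;

/* Per-owner callbacks; the connection never frees itself directly. */
struct con_ops {
	void (*get)(struct connection *con);
	void (*put)(struct connection *con);
};

struct connection {
	const struct con_ops *ops;
	bool closed;        /* set when the owner closes the connection */
	bool work_pending;  /* stand-in for "queued on the workqueue" */
};

/* Hypothetical container, with the connection embedded in it. */
struct osd {
	int refcount;
	struct connection con;
};

static int osds_freed;  /* counts frees so the lifetime can be observed */

static struct osd *con_to_osd(struct connection *con)
{
	return (struct osd *)((char *)con - offsetof(struct osd, con));
}

/* ->get and ->put only adjust the containing struct's refcount. */
static void osd_get(struct connection *con)
{
	con_to_osd(con)->refcount++;
}

static void osd_put(struct connection *con)
{
	struct osd *o = con_to_osd(con);

	assert(o->refcount > 0);
	if (--o->refcount == 0) {
		free(o);
		osds_freed++;
	}
}

static const struct con_ops osd_con_ops = { osd_get, osd_put };

static struct osd *osd_create(void)
{
	struct osd *o = calloc(1, sizeof(*o));

	if (!o)
		return NULL;
	o->refcount = 1;  /* the owner's reference */
	o->con.ops = &osd_con_ops;
	return o;
}

/* Socket event: queue work and pin the container until it has run. */
static void queue_con_work(struct connection *con)
{
	if (con->work_pending)
		return;  /* already queued; that instance holds the ref */
	con->ops->get(con);
	con->work_pending = true;
}

/* Work item: returns true if it would do I/O, false if con was closed. */
static bool con_work(struct connection *con)
{
	bool did_io = !con->closed;

	con->work_pending = false;
	con->ops->put(con);  /* may free the container; con is dead after */
	return did_io;
}

/* Owner teardown: close the connection first, then drop the owner ref. */
static void remove_osd(struct osd *o)
{
	o->con.closed = true;
	osd_put(&o->con);
}
```

If work is queued before remove_osd() runs, the container survives the
owner's put, and the later con_work() sees the closed connection, does
no I/O, and drops the final reference.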

sage



> 
> Anyway, I think I see why there might be a need for the ref counts
> and they obviously won't go away if they're needed...
> 
>                                       -Alex
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 