On Wed, 29 Sep 2010, Gregory Farnum wrote:

> On Wed, Sep 29, 2010 at 9:29 AM, Henry C Chang
> <[email protected]> wrote:
> > On Thu, Sep 23, 2010 at 1:14 AM, Gregory Farnum <[email protected]> 
> > wrote:
> >> On Wed, Sep 22, 2010 at 8:29 AM, Henry C Chang
> >> <[email protected]> wrote:
> >>> 1. 
> >>> http://github.com/tcloud/ceph/commit/3c91d0c0e20425f94eaa374f39b2a1f398270255
> >>>
> >>> This fixed a monitor bug we hit several times. When some osds report
> >>> that other osds are down,
> >>> there is a chance that the leading monitor would enter an infinite loop 
> >>> (you can
> >>> see the CPU usage of cmon is 100% from "top").
> >> You diagnosed the problem correctly, but it looks like your patch can
> >> make the monitor discard the wrong report! Fixed this by sticking an
> >> "&& i != fail_notes.end()" into the while loop, in
> >> commit:2c5a3d99aa3be5ce114072e84f73a0a6426e63fd.
> >
> > With this commit, however, if i == fail_notes.end() and leaves the while 
> > loop,
> > it will cause segfault/abort in the following if-else when
> > fail_notes.erase(i) is called.
> > (I wonder if my patch is right.)
> Whoops, I missed that in the branching.
> The basic problem here is that if you reach end() then attempting to
> deref either of the values returns 0 -- which in this case can be a
> valid value. So, also, check for end() in the if clause.
> commit:ba39fd782375dc1bc909519fa3f5307a9696032b

I think the order in the if () should be reversed... it's not legal to 
dereference the iterator if it's == end().

Thanks!
sage
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to