On Fri, Oct 7, 2011 at 9:12 AM, dave jones  wrote:
> 2011/10/4 Mikolaj Golub :
>>
>> On Sat, 1 Oct 2011 14:15:45 +0800 dave jones wrote:
>>
>>  dj> On Fri, Sep 30, 2011 at 9:41 PM, Robert Watson wrote:
>>  >>
>>  >> On Wed, 28 Sep 2011, Mikolaj Golub wrote:
>>  >>
>>  >>> On Mon, 26 Sep 2011 16:12:55 +0200 K. Macy wrote:
>>  >>>
>>  >>> KM> Sorry, didn't look at the images (limited bw), I've seen something 
>> KM>
>>  >>> like this before in timewait. This "can't happen" with UDP so will be 
>> KM>
>>  >>> interested in learning more about the bug.
>>  >>>
>>  >>> The panic can be easily triggered by this:
>>  >>
>>  >> Hi:
>>  >>
>>  >> Just catching up on this thread.  I think the analysis here is generally
>>  >> right: in 9.0, you're much more likely to see an inpcb with its in_socket
>>  >> pointer cleared in the hash list than in prior releases, and
>>  >> in_pcbbind_setup() trips over this.
>>  >>
>>  >> However, at least on first glance (and from the perspective of invariants
>>  >> here), I think the bug is actualy that in_pcbbind_setup() is asking
>>  >> in_pcblookup_local() for an inpcb and then access the returned inpcb's
>>  >> in_socket pointer without acquiring a lock on the inpcb.  Structurally, 
>> it
>>  >> can't acquire this lock for lock order reasons -- it already holds the 
>> lock
>>  >> on its own inpcb.  Therefore, we should only access fields that are safe 
>> to
>>  >> follow in an inpcb when you hold a reference via the hash lock and not a
>>  >> lock on the inpcb itself, which appears generally OK (+/-) for all the
>>  >> fields in that clause but the t->inp_socket->so_options dereference.
>>  >>
>>  >> A preferred fix would cache the SO_REUSEPORT flag in an inpcb-layer 
>> field,
>>  >> such as inp_flags2, giving us access to its value without having to walk
>>  >> into the attached (or not) socket.
>>  >>
>>  >> This raises another structural question, which is whether we need a new
>>  >> inp_foo flags field that is protected explicitly by the hash lock, and 
>> not
>>  >> by the inpcb lock, which could hold fields relevant to address binding.  
>> I
>>  >> don't think we need to solve that problem in this context, as a slightly
>>  >> race on SO_REUSEPORT is likely acceptable.
>>  >>
>>  >> The suggested fix does perform the desired function of explicitly 
>> detaching
>>  >> the inpcb from the hash list before the socket is disconnected from the
>>  >> inpcb. However, it's incomplete in that the invariant that's being 
>> broken is
>>  >> also relied on for other protocols (such as raw sockets).  The correct
>>  >> invariant is that inp_socket is safe to follow unconditionally if an 
>> inpcb
>>  >> is locked and INP_DROPPED isn't set -- the bug is in "locked" not in
>>  >> "INP_DROPPED", which is why I think this is the wrong fix, even though it
>>  >> prevents a panic :-).
>>
>>  dj> Hello Robert,
>>
>>  dj> Thank you for taking your valuable time to find out the problem.
>>  dj> Since I don't have idea about network internals, would you have a patch
>>  dj> about this? I'd be glad to test it, thanks again.
>>
>> Here is the patch that implements what Robert suggests.
>>
>> Dave, could you test it?
>
> Sure. Thanks for cooking the patch.
> Machines have been running two days now without panic.

Is there any plan to commit your fix? Thank you.
I'd upgrade to 9.0-release from beta-2 once it's released.

Best regards,
Dave.
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

Reply via email to