Hi,

I've sent a RFC patch trying to resolve this use-after-free issue.
just to collect the comments.

Here is some thought on this issues:

first, we cannot wait until ofproto ref reduces to 1 to do the real
destruct, as this
leads to the "egg and chicken" situation:

if we do not destroy ofproto we cannot decrease ref, if not decrease ref,
ofproto not
being destroyed.

Second, we cannot put close_dpif_backer into the deferred part of ofproto
desruction,
as it will call udpif_destroy which stop revalidator threads that might
hold some ref to ofproto
in its xlate cache.

So I am guessing maybe we can introduce a rcu barrier just like kernels and
wait for
all the meter were destroyed and perform the close-dpif-backer.

NOTE: we cannot do this for rule destruction, because it's difficult to
find out the precision
timing that all the rules have been destroyed, as rule destruction can
endure for 2 grace periods.


Peng He <xnhp0...@gmail.com> 于2022年3月10日周四 08:57写道:

> or what we really need here is not ofproto but backer refcount inc.
> I see backer itself has refcount.
>
> Ilya Maximets <i.maxim...@ovn.org> 于2022年3月10日周四 01:44写道:
>
>> On 3/9/22 18:23, David Marchand wrote:
>> > On Tue, Mar 8, 2022 at 8:44 PM Ilya Maximets <i.maxim...@ovn.org>
>> wrote:
>> >>
>> >> On 3/8/22 17:39, David Marchand wrote:
>> >>> On Mon, Mar 7, 2022 at 11:28 PM Ilya Maximets <i.maxim...@ovn.org>
>> wrote:
>> >>>>
>> >>>> 'ofproto' should still be in place while freeing meter id, but this
>> is
>> >>>> not ensured in any way leading to the use-after-free:
>> >>>>
>> >>>>   ==13702==ERROR: AddressSanitizer: heap-use-after-free
>> >>>>   READ of size 8 at 0x614000000c50 thread T3 (urcu2)
>> >>>>     0 0x536656 in free_meter_id ofproto/ofproto-dpif.c:6780:37
>> >>>>     1 0x7356d0 in ovsrcu_call_postponed lib/ovs-rcu.c:346:13
>> >>>>     2 0x735b21 in ovsrcu_postpone_thread lib/ovs-rcu.c:362:14
>> >>>>     3 0x73a30c in ovsthread_wrapper lib/ovs-thread.c:422:12
>> >>>>     4 0x7f7b1a7876da in start_thread
>> >>>>     5 0x7f7b19d0671e in clone
>> >>>>
>> >>>>   0x614000000c50 is located 16 bytes inside of 400-byte region
>> >>>>   freed by thread T0 here:
>> >>>>     0 0x49640d in free (vswitchd/ovs-vswitchd+0x49640d)
>> >>>>     1 0x517e38 in destruct ofproto/ofproto-dpif.c:1851:5
>> >>>>     2 0x4f0bb0 in ofproto_destroy ofproto/ofproto.c:1773:5
>> >>>>     3 0x4c71e4 in bridge_destroy vswitchd/bridge.c:3608:9
>> >>>>     4 0x4c6f4a in bridge_exit vswitchd/bridge.c:553:9
>> >>>>     5 0x4e109a in main vswitchd/ovs-vswitchd.c:146:5
>> >>>>     6 0x7f7b19c06bf6 in __libc_start_main
>> >>>>
>> >>>> Since introduction of the ofproto refcount this issue can be fixed
>> >>>> by taking the 'ofproto' reference before postponing the operation.
>> >>>> This will guarantee that 'ofproto' and 'baker' are not destroyed
>> until
>> >>>> all ids are freed.
>> >>>
>> >>> We prevent ofproto from getting shot thanks to refcount, but I don't
>> >>> understand what prevents backer content from being uninitialised (like
>> >>> destroying the meter id pool) in close_dpif_backer().
>> >>
>> >> Successfully opened backer is closed only as part of the destruction
>> >> of the last ofproto instance that uses it.  So, if we're holding the
>> >
>> > Yes, I think that's what happens here.
>> > We are closing the last ofproto instance: backer gets shot in
>> > close_dpif_backer()/destroy().
>> > And there is a rcu postponed free_meter_id() with reference to
>> ofproto->backer.
>>
>> Ugh.  You're right.  ofproto_class->destruct() is called unconditionally
>> regardless of ofproto still being referenced.  That doesn't sound right.
>> _______________________________________________
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>
>
> --
> hepeng
>


-- 
hepeng
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to