On 2/23/22 09:24, Dumitru Ceara wrote:
> On 2/23/22 07:27, Han Zhou wrote:
>> On Fri, Feb 11, 2022 at 12:52 AM Dumitru Ceara <dce...@redhat.com> wrote:
>>>
>>> On 2/11/22 00:54, Numan Siddique wrote:
>>>> On Thu, Jan 27, 2022 at 10:00 AM Dumitru Ceara <dce...@redhat.com>
>> wrote:
>>>>>
>>>>> We still need to try to ovsdb_idl_loop_commit_and_wait() on instances
>>>>> that are in standby mode.  That's because they need to try to take the
>>>>> lock.  But if they're in standby they won't actually build a
>> transaction
>>>>> json and will return early because they don't own the SB lock.
>>>>>
>>>>> That's reported as an error by ovsdb_idl_loop_commit_and_wait() but we
>>>>> shouldn't act on it.
>>>>>
>>>>> Also, to ensure that no DB changes are missed, ovsdb_idl_track_clear()
>>>>> should be called only on active instances.  The standby or paused ones
>>>>> will get the incremental updates when they become active.  Otherwise we
>>>>> would be forced to perform a full recompute when the instance becomes
>>>>> active.
>>>>
>>>> Hi Dumitru,
>>>>
>>>
>>> Hi Numan,
>>>
>>> Thanks for the review!
>>>
>>>> I've a question on the track clear being moved out of the standby
>> instances.
>>>> To ensure correctness,  I suppose it's better to trigger a full
>> recompute when a
>>>> standby instance becomes active. What do you think?
>>>>
>>>
>>> I might be wrong but I don't think that's necessary.  It may also be
>>> quite costly as full recomputes can take quite long.
>>>
>>>> Also lets say CMS does the below operations
>>>>      - Add a logical switch S1
>>>>      - Add a  logical port p1 in S1
>>>>      - Add a logical port p2 in S1
>>>>      - Delete logical port p2
>>>>      - Delete a logical switch.
>>>>
>>>> With this patch since we are not clearing the tracking information,
>>>> how does ovn-northd
>>>> process the tracked changes when it becomes active ?
>>>
>>> When ovn-northd becomes active, from a Northbound database perspective,
>>> there were no changes; that is, S1 didn't exist when it was last active
>>> and it doesn't exist now either.
>>>
>>> So, there should be no NB change to process.  Accumulating tracked
>>> changes without calling clear() on the standby has exactly this effect.
>>
>> Hi Dumitru,
>>
> 
> Hi Han,
> 
>> I wonder how accumulating tracked changes without calling clear() would
>> work.
>>
>> Firstly, I was under the impression that ovsdb_idl_track_clear() must be
>> called before the next ovsdb_idl_run(), and the current change tracking
>> implementation cannot safely carry tracking information across the
>> iterations. This was why in ovn-controller whenever (!engine_has_run() &&
>> engine_need_run()) we force recompute in the next iteration. If changes
>> could be carried over we would have incrementally processed the accumulated
>> changes without forcing recompute. I can't recall the details, and I
>> checked the IDL again briefly but I didn't find the exact reason why it is
>> not safe. But I believe it was never used this way before. I should have
>> added a TODO for this (but somehow forgot to, sorry).
>>
> 
> I've been looking at that code too and I don't see any reason why
> accumulating changes wouldn't work.  The IDL code is written such that
> it processes multiple jsonrpc updates in a single run anyway:
> 
> ovsdb_idl_run() -> ovsdb_cs_run() -> which can receive up to 50
> (hardcoded) jsonrpc messages.
> 
> It's possible that I missed something though, the IDL functionality is
> complex, so maybe it's worth documenting that we recommend calling
> ovsdb_idl_track_clear() every time ovsdb_idl_run() is called, and why.
> What do you think?
> 
>> Secondly, even if it is safe to accumulate changes, it is going to be a
>> memory problem. Active-standby failover happens very rarely in a healthy
>> production environment. So even if the DB size is small, the change
>> tracking can grow without any limit. I tried with this patch by doing
>> simply a loop of adding and deleting a single logical switch 10k times on
>> top of an empty NB DB, and the standby northd's memory grew to 1.1GB.
>>
> 
> This is surprising, sounds like an issue in the IDL, I would've expected
> the delete to just remove the record from the track list.  I'll look
> into it.
> 

I had a quick look, it's because deleting a row clears its datum, sets
tracked_old_datum and adds it to the deleted track list.  The subsequent
"add" will not find this row, and will insert a new one.

We probably need to fix this anyway because even in the current
implementation the IDL can process up to 50 consecutive
"add/delete/add/delete/.." updates.  This can potentially cause issues
due to "spurious" delete events when the client finally processes the
tracked changes.

>> Thirdly, processing all the tracked changes when standby becomes active is
>> likely taking higher cost than recompute, if the tracked change size is
>> bigger than the DB size.
>>
> 
> If changes are accumulated correctly, the total number of the tracked
> record changes should never be larger than the size of the DB the last
> time changes were processed + the size of the DB now, resulting in a
> cost to process of O(size-of-DB), the same as for full recompute.  Or am
> I missing something?
> 
>> So for now I would suggest keeping the logic of clearing tracking on every
>> iteration and force recompute when failover.
>>
> 
> At least until we figure out why the memory increase in the IDL, I agree
> to keep forcing a recompute on failover.  That's also because we
> currently don't really incrementally process much in ovn-northd.
> 
> I'll send a v2.
> 
>> Thanks,
>> Han
>>
> 
> Thanks,
> Dumitru
> 
>>>
>>> From a Southbound database perspective there are two cases:
>>>
>>> a. The former active northd processed some (but not all) of the NB
>>> changes and executed their corresponding SB transactions.  In this case,
>>> the standby northd also receives update messages for the SB records that
>>> were changed.  The standby northd tracks these changes.
>>>
>>> When the standby northd becomes active it will:
>>> - determine that NB state didn't change
>>> - SB state changed and needs to be reconciled (today we do this with the
>>> help of a NULL change handler for SB_* tables which will trigger a full
>>> recompute).
>>>
>>> b. The former active northd processed all of the NB changes and executed
>>> their corresponding SB transactions.  In this case, the final state of
>>> the NB and SB databases should be equivalent to their initial states.
>>> NB/SB changes will be accumulated by the change tracking mechanism on
>>> the standby resulting in empty tracked changes lists.  This is fine
>>> because the new active northd doesn't need to do anything, the DB
>>> contents are already consistent.
>>>
>>> c. The former active northd processed none of the NB changes yet.  This
>>> is very similar to case "b" above, the new active northd doesn't need to
>>> change anything in the NB/SB and it won't do that either.
>>>
>>>>
>>>> Thanks
>>>> Numan
>>>>
>>>
>>> Thanks,
>>> Dumitru
>>>
>>> _______________________________________________
>>> dev mailing list
>>> d...@openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to