On 8/1/25 10:55 AM, Adrián Moreno wrote:
> On Thu, Jul 31, 2025 at 10:48:08PM +0200, Ilya Maximets wrote:
>> On 7/31/25 8:41 PM, Aaron Conole wrote:
>>> Ilya Maximets <i.maxim...@ovn.org> writes:
>>>
>>>> On 7/30/25 4:47 PM, Aaron Conole via dev wrote:
>>>>> When adjusting bond parameters, any adjustment is considered sufficient
>>>>> for triggering a rebalance.  This is a very simplistic config update
>>>>> scheme that triggers complete rebalancing even if the time adjustment
>>>>> would move the next expiration out beyond the last calculated expiration.
>>>>>
>>>>> For the interval parameter only, we can simply recalculate the expiry
>>>>> deadline and let the next bond_run() event do the rebalance if needed.
>>>>> Even if the recalculation would cause the deadline to have occurred in
>>>>> the past, it should execute on the next bond_run() anyway.  This is
>>>>> still okay, as the rebalance interval timeout may not result in a
>>>>> full rebalance anyway.
>>>>>
>>>>> Reported-at: 
>>>>> https://www.mail-archive.com/ovs-discuss@openvswitch.org/msg10409.html
>>>>> Signed-off-by: Aaron Conole <acon...@redhat.com>
>>>>> ---
>>>>>  ofproto/bond.c | 8 +++++++-
>>>>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/ofproto/bond.c b/ofproto/bond.c
>>>>> index 3859ddca08..86e21607e5 100644
>>>>> --- a/ofproto/bond.c
>>>>> +++ b/ofproto/bond.c
>>>>> @@ -459,8 +459,14 @@ bond_reconfigure(struct bond *bond, const struct 
>>>>> bond_settings *s)
>>>>>      }
>>>>>
>>>>>      if (bond->rebalance_interval != s->rebalance_interval) {
>>>>> +        /* Recompute the next rebalance interval by moving the 
>>>>> next_rebalance
>>>>> +         * to be offset by the new interval.  Then let the rebalance code
>>>>> +         * trigger a rebalance based on the new details.
>>>>
>>>> This sentence doesn't seem necessary.
>>>
>>> Removed.
>>>
>>>>>  In this case, if
>>>>> +         * all that was updated is the rebalance interval, we can skip
>>>>> +         * triggering the rest of the port reconfigure mechanism. */
>>>>
>>>> s/reconfigure mechanism/reconfiguration/
>>>
>>> Ack.
>>>
>>>>> +        int old_start_time = bond->next_rebalance - 
>>>>> bond->rebalance_interval;
>>>>>          bond->rebalance_interval = s->rebalance_interval;
>>>>> -        revalidate = true;
>>>>> +        bond->next_rebalance = old_start_time + bond->rebalance_interval;
>>>>>      }
>>>>
>>>> We should likely still revalidate when the value goes to or from zero,
>>>> because that impacts if the bond considered balanced or not.  Even if
>>>> that doesn't cause any problems today, it will inevitably be forgotten
>>>> in the future.
>>>
>>> That makes sense to me - I'll add a case for it.
>>>
>>>> Also, the subtraction may technically overflow if the next_rebalance is
>>>> not initialized.  E.g. if it is not initialized yet when the bond is
>>>> going from active-backup to balance-tcp.  Which may be fine, given those
>>>> are signed integers, but still doesn't feel right.
>>>
>>> I'm trying to envision a scenario that would make this happen.  For
>>> example, it gets reset when going from ab -> balance-{tcp,slb}.  And
>>> that will be reconfigured later in this very code path as well (since we
>>> will call the bond_entry_reset if the config got changed for any config
>>> option other than the interval).  I can put a guard on !next_rebalance
>>> and tell it to trigger revalidation then (since the very first execution
>>> or something should be initialized to '0' and obviously that could
>>> probably find a trigger).  But still struggling to see a place where
>>> changing bond mode won't also force the whole thing to be reinitialized
>>> in bond_entry_reset() since revalidate will be 'true' via other means.
>>
>> It doesn't cause any issues, I agree, but it just didn't feel right. :)
>> Feel free to keep this code as-is.
>>
>>>
>>>> And this patch needs a Fixes tag.
>>>
>>> I didn't put one because I'm not sure if it's really a bugfix or an
>>> improvement.  If you consider it one, I'll add the fixes tag.
>>
>> Looking closer to the code, there are two separate issues here:
>>
>> 1. Traffic is re-shuffled among bond members on the rebalance interval
>>    change.
>>
>> 2. Traffic is being dropped for some time after the bond is reconfigured
>>    until the next bond_run().  This happens if other config options
>>    are changed as well.
>>
> 
> Exactly, that's what I was trying to say in the other thread, issue #2
> is more problematic.
> 
>> And, I think, the latter is what the original reporter was concerned about.
>> This is a behavior introduced in:
>>
>>   586adfd047cb ("bond: Avoid deadlock while updating post recirculation 
>> rules.")
>>
>> Before that change we would not update post-recirc rules while the
>> configuration is cleared and not re-constructed yet.  But after the
>> 586adfd047cb fix, we're now calling the update after we zeroed out the
>> configuration, effectively just removing all the post recirculation rules
>> or the lb-output mappings.
>>
>> Saying that, maybe the real fix is to use bond_update_post_recirc_rules__
>> with force instead of direct call to update_recirc_rules().  That will
>> ensure that the configuration is not blank and we actually have real
>> post-recirc rules / lb-output mapping at all times.  I do not remember
>> why I went with the direct call in that fix though.
>>
>> What do you think?
> 
> I agree but I think this is not the only problem here.
> 
> AFAICS, another issue is that, when traffic is moved from one member to
> another during rebalancing, we set bond->bond_revalidate to true.
> Soon after that we call bond_update_post_recirc_rules__ so we update the
> rules.

It feels like setting bond_revalidate to true can just be removed from
the bond_shift_load().  Not sure what's the purpose of it.

> 
> However, if something in the configuration changes (even with this
> patch), and bond_reconfigure is called, we will unnecessarily call
> bond_entry_reset() which blanks the hashes we have just carefully rebalanced.
> 
> In addition to wiping the result of the rebalance, as Ilya points out,
> we leave the zeroed hashes until the next bond_run(), which would be
> fixed if we called bond_update_post_recirc_rules__ instead of
> update_recirc_rules directly.
> 
> Thanks.
> Adrián
> 
>>
>> Not re-setting the config on timeout change is still needed, as that
>> avoids re-shuffling the traffic among bond members (issue 1).
>>
>> The behavior in point 1 was always this way, AFAIU, so that can be treated
>> as just an enhancement, I suppose.
>>
>> Best regards, Ilya Maximets.
>>
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to