Hi Dumitru, I just wanted to ask wether you need any help (maybe, testing) in this? I’m ready to check this on my dataset if you were successful to implement a fix.
> On 12 Jul 2023, at 12:15, Dumitru Ceara <dce...@redhat.com> wrote: > > On 7/12/23 00:01, Ilya Maximets wrote: >> On 7/11/23 19:01, Dumitru Ceara wrote: >>> On 7/11/23 18:33, Vladislav Odintsov wrote: >>>> Hi Dumitru, >>>> >>>> The system on which I reproduced this issue is running 22.09.x version. >>>> I’ve tried to upgrade ovn-controller to main branch + your patch. Please, >>>> note that it has test error: [1]. >>>> After two minutes after upgrade it still consumed 3.3G. >>>> >>> >>> Ack, I need to re-think the patch then. Maybe a hard deadline to run >>> malloc_trim() at least once every X seconds. I'll see what I can come >>> up with. >>> >>>> I tried to backport your patch to 22.09, it required to backport also this >>>> commit: [2] and it failed some tests: [3]. >>>> >>>> But I’ve got general question: prior to commit that I mentioned in initial >>>> mail, ovn-controller even didn’t try load such amount of data. And now it >>>> does and IIUC, your patch just releases memory that was freed after >>>> ovn-controller fully loaded. >>>> I’m wonder wether it should load that excess data at all? Seems like it >>>> did. >>>> >>> >>> Looking again at 1b0dbde94070 ("ovn-controller: Only set monitor >>> conditions on available tables.") it's kind of expected indeed: >>> >>> Initially all tables are "unavailable" because we didn't get the schema >>> so we don't set any condition for any table. >>> >>> After ovn-controller connects to the SB for the first time it will >>> determine that the SB tables are in the schema so it will explicitly add >>> them to the monitor condition and restrict the SB data it is interested in. >>> >>> Maybe we need to change the IDL/CS modules to wait with the >>> monitor_cond/monitor_cond_since until instructed by the client >>> (ovn-controller). Ilya do you have any thoughts on this matter? >> >> So, AFAICT, the issue is that we're running with >> 'monitor_everything_by_default' >> option, the default condition is 'true' and the monitor request for the main >> database is sent out immediately after receiving the schema, so the >> application >> has no time to react. >> >> I think, there are few possible solutions for this: >> >> 1. Introduce a new state in the CS state machine, e.g. >> CS_S_SERVER_SCHEMA_RCEIVED, and move out from this state in the run() >> callback. This way the application will have a chance to set up conditions >> before they are sent. Slightly not intuitive. >> >> 2. A variation on what you suggested, i.e. enter the >> CS_S_SERVER_SCHEMA_RCEIVED >> state and wait for some sort of the signal from the application to proceed. >> Sounds a bit counter-intuitive for an IDL user. >> >> 3. Introduce an application callback that can be called from the >> ovsdb_idl_compose_monitor_request() the same way as this function is >> getting >> called form the ovsdb_cs_send_monitor_request(). An application will be >> able to influence conditions before they are sent. >> Might be tricky due to new->req->ack state transition. >> >> 4. Make the default condition configurable, e.g. by an additional argument >> 'default_condition' = true/false for an ovsdb_idl_create(). This way the >> application will not get any data until conditions are actually set. >> >> 5. Or it maybe just a separate config function that will set default >> conditions >> to 'false' and will need to be called before the first run(). >> >> 6. Change behavior of 'monitor_everything_by_default' argument. Make it >> actually add all the tables to the monitor, but with the 'false' condition. >> Result should technically be the same. Might be tricky to get right though >> with all the backward compatibility. >> >> Option 5 might be the better option of these. >> >> What do you think? >> > > I think option 5 sounds the simplest to implement indeed. It also > doesn't induce any compatibility issues as you mentioned. > > The only "issue" is we'd probably want this backported to stable OVN > releases so it means we need to bump the submodule version to an > unreleased version of OVS. But that's an OVN problem and we discussed > similar instances of it before. > > I'll prepare a patch soon. > > Best regards, > Dumitru Regards, Vladislav Odintsov _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev