Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Martin Sivak
> then why don’t you handle the connection state as well? isn’t that a simple > fix? VDSM socket availability during startup is probably the most important requirement for MOM and the whole service is based around that assumption. We could handle that differently, but letting the service crash sa

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Michal Skrivanek
> On 18 Nov 2016, at 12:35, Martin Sivak wrote: > >> I don't think it is related to version X or Y. It is a race, so might be >> related to other factors. > > It never (seriously: NEVER) happened with xml-rpc before 4.0.5. that is surprising but we also didn’t have lago before;-) > >> likely

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Martin Sivak
> But even if vdsm did this, mom still needs a retry mechanism. No it doesn't. Systemd handles that just fine (I might increase the retry count in the service file though). I rather like the supervised way of handling errors in "let it crash" projects. It simplifies the code tremendously. MOM doe

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Michal Skrivanek
> On 18 Nov 2016, at 12:12, Oved Ourfali wrote: > > I don't think it is related to version X or Y. It is a race, so might be > related to other factors. > likely because json-rpc is initialized after xml-rpc….or indeed whatever else;-) either way it needs to be solved. Either by improving th

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Nir Soffer
On Fri, Nov 18, 2016 at 12:21 PM, Martin Sivak wrote: > What about making vdsm ready to answer connections when it returns to > systemd instead? I hate workarounds and this always worked fine. This is clearly a mom bug. Mom must have a retry mechanism and must not expect that vdsm is ready to ac

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Oved Ourfali
I don't think it is related to version X or Y. It is a race, so might be related to other factors. On Nov 18, 2016 12:59 PM, "Martin Sivak" wrote: > > Are we / can we use systemd socket activation there? > > That actually requires systemd specific code iirc (to take over the > standing by socket

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Anton Marchukov
Hello Martin. It does. But from 4.0 as I see we support only systemd enabled distros so it might make sense. Anton. On Fri, Nov 18, 2016 at 11:59 AM, Martin Sivak wrote: > > Are we / can we use systemd socket activation there? > > That actually requires systemd specific code iirc (to take over

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Martin Sivak
> Are we / can we use systemd socket activation there? That actually requires systemd specific code iirc (to take over the standing by socket). I am actually wondering why the xml-rpc in 4.0.4 was fine and json-rpc in 4.0.6 is too slow. Martin On Fri, Nov 18, 2016 at 11:53 AM, Anton Marchukov w
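Editor's sketch of the "systemd specific code" mentioned above, assuming Python 3 and the standard sd_listen_fds(3) convention (LISTEN_PID and LISTEN_FDS environment variables, inherited descriptors starting at 3). The helper names inherited_fds and listening_socket are hypothetical; this is not taken from vdsm or MOM.

```python
import os
import socket

# First descriptor passed by systemd, per the sd_listen_fds(3) convention.
SD_LISTEN_FDS_START = 3


def inherited_fds(environ=None, pid=None):
    """Return the descriptor numbers systemd handed to this process, or []."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        # LISTEN_PID must match our own PID, otherwise the fds are not for us.
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        # Variables missing or malformed: not socket-activated.
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + max(count, 0)))


def listening_socket():
    """Wrap the first inherited descriptor in a socket object, if there is one."""
    fds = inherited_fds()
    if fds:
        # Python 3.7+ detects family and type from the descriptor itself.
        return socket.socket(fileno=fds[0])
    return None
```

With this arrangement the socket exists before the daemon starts, so a client connecting early would queue on the socket instead of being refused. Whether that suits vdsm's startup and recovery flow is exactly what the thread is debating.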

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Martin Sivak
> I am not so sure whether it will be so simple to do it. Recovery can take > some time > and during this time vdsm is not functional. Interesting issue found [1]. > > > [1] https://bugzilla.redhat.com/1396183 MOM has no issue with VDSM that reports the recovering or reinitializing code (99 iirc).

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Anton Marchukov
Hello All. Are we / can we use systemd socket activation there? Anton. On Fri, Nov 18, 2016 at 11:21 AM, Martin Sivak wrote: > What about making vdsm ready to answer connections when it returns to > systemd instead? I hate workarounds and this always worked fine. > > Martin > > On Fri, Nov 18,

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Piotr Kliczewski
In my opinion this issue is protocol agnostic. We may want to have a retry logic to make sure client can "talk" to vdsm. It seems that we could have this logic in the client code (disabled by default) and client code could enable it if needed. On Fri, Nov 18, 2016 at 11:25 AM, Oved Ourfali wro
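Editor's sketch of the opt-in client-side retry suggested above, assuming a plain TCP or Unix socket connection. The function connect_with_retry and its parameters are hypothetical and not part of the vdsm client API; retries=0 keeps the current fail-fast behaviour, so a caller has to enable retrying explicitly.

```python
import socket
import time


def connect_with_retry(address, retries=0, delay=1.0,
                       connect=socket.create_connection, sleep=time.sleep):
    """Connect to address, retrying up to `retries` extra times on OSError.

    With the default retries=0 the first failure is raised immediately,
    which matches a "let it crash" service supervised by systemd.
    """
    attempt = 0
    while True:
        try:
            return connect(address)
        except OSError:
            # ConnectionRefusedError is a subclass of OSError.
            if attempt >= retries:
                raise
            attempt += 1
            sleep(delay)
```

A caller that wants to ride out the startup race could call connect_with_retry(addr, retries=5, delay=2.0); one that prefers supervision by systemd simply leaves the default.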

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Oved Ourfali
Discuss it with the infra guys and I'm sure you'll get the reasons, and will figure out a solution together. On Nov 18, 2016 12:21 PM, "Martin Sivak" wrote: > What about making vdsm ready to answer connections when it returns to > systemd instead? I hate workarounds and this always worked fine.

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Martin Sivak
What about making vdsm ready to answer connections when it returns to systemd instead? I hate workarounds and this always worked fine. Martin On Fri, Nov 18, 2016 at 11:13 AM, Oved Ourfali wrote: > Seems like a race regardless of the protocol. > Should you add a retry? > > > On Nov 18, 2016 11:5

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Oved Ourfali
Seems like a race regardless of the protocol. Should you add a retry? On Nov 18, 2016 11:52 AM, "Martin Sivak" wrote: > Yes, because VDSM is supposed to be up (there is systemd dependency). > This always worked fine with xml-rpc. > > Martin > > On Fri, Nov 18, 2016 at 10:14 AM, Nir Soffer wrote

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Martin Sivak
Yes, because VDSM is supposed to be up (there is systemd dependency). This always worked fine with xml-rpc. Martin On Fri, Nov 18, 2016 at 10:14 AM, Nir Soffer wrote: > On Fri, Nov 18, 2016 at 10:45 AM, Martin Sivak wrote: >> This happens because MOM can't connect to VDSM and so it quits. > > S

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Nir Soffer
On Fri, Nov 18, 2016 at 10:45 AM, Martin Sivak wrote: > This happens because MOM can't connect to VDSM and so it quits. So mom tries once to connect, and if the connection fails it quits? > We > discussed it on the mailing list > > https://lists.fedoraproject.org/archives/list/vdsm-de...@lists.fedor

Re: [ovirt-devel] Where's MOM (on latest master)

2016-11-18 Thread Martin Sivak
This happens because MOM can't connect to VDSM and so it quits. We discussed it on the mailing list: https://lists.fedoraproject.org/archives/list/vdsm-de...@lists.fedorahosted.org/thread/MZ7UJUWO5KFRDJJDNXX7VIYU5PWSXF62/ http://lists.ovirt.org/pipermail/devel/2016-November/014101.html This issue n

[ovirt-devel] Where's MOM (on latest master)

2016-11-17 Thread Yaniv Kaul
I've recently seen, including now on Master, the following warnings: Nov 17 13:33:25 lago-basic-suite-master-host0 systemd[1]: Started MOM instance configured for VDSM purposes. Nov 17 13:33:25 lago-basic-suite-master-host0 systemd[1]: Starting MOM instance configured for VDSM purposes... Nov 17 13