Dominik, Here is the exception which caused "Not able to update response"
2017-02-09 06:30:41,828-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-13) [1abe9b01] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM lago-basic-suite-master-host1 command HostSetupNetworksVDS failed: 880 > 879 2017-02-09 06:30:41,829-05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (org.ovirt.thread.pool-7-thread-13) [1abe9b01] Command 'HostSetupNetworksVDSCommand(HostName = lago-basic-suite-master-host1, HostSetupNetworksVdsCommandParameters:{runAsync='true', hostId='42fa76b6-7bf9-463e-80fa-fcca9b0d3e81', vds='Host[lago-basic-suite-master-host1,42fa76b6-7bf9-463e-80fa-fcca9b0d3e81]', rollbackOnFailure='true', connectivityTimeout='120', networks='[HostNetwork:{defaultRoute='false', bonding='false', networkName='Labeled_Network', nicName='eth0', vlan='600', mtu='0', vmNetwork='false', stp='false', properties='null', ipv4BootProtocol='NONE', ipv4Address='null', ipv4Netmask='null', ipv4Gateway='null', ipv6BootProtocol='NONE', ipv6Address='null', ipv6Prefix='null', ipv6Gateway='null'}]', removedNetworks='[]', bonds='[]', removedBonds='[]', clusterSwitchType='LEGACY'})' execution failed: VDSGenericException: VDSNetworkException: 880 > 879 2017-02-09 06:30:41,829-05 DEBUG [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (org.ovirt.thread.pool-7-thread-13) [1abe9b01] Exception: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: 880 > 879 at org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.proceedProxyReturnValue(BrokerCommandBase.java:188) [vdsbroker.jar:] at org.ovirt.engine.core.vdsbroker.vdsbroker.FutureVDSCommand.get(FutureVDSCommand.java:70) [vdsbroker.jar:] at org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand.executeCommand(HostSetupNetworksCommand.java:290) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1251) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1391) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:2055) [bll.jar:] at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:164) [utils.jar:] at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:103) [utils.jar:] at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1451) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:397) [bll.jar:] This could be connected with the specific version of the jackson that we use. On Tue, Feb 21, 2017 at 4:42 PM, Dominik Holler <dhol...@redhat.com> wrote: > A deep analysis of the logfiles gives details about the > unexpected behavior, but I regret to not provide the fault causing the > unexpected behavior. > > To get this fault, the help of someone familiar with > org.ovirt.vdsm.jsonrpc.client.JsonRpcClient is needed. > > In the failing test "assign_labeled_network" a (labeld) network is > assigned to the cluster. For this reason the network has to be added to > the hosts. After that, the test "assign_labeled_network" checks, if the > engine acknowledges that hosts are in the labeld network. This > execution of the test failed, because this acknowledge of the engine > is missing after 180 seconds [3]. > > There are two hosts lago-basic-suite-master-host0 and > lago-basic-suite-master-host1 in the scenario. > lago-basic-suite-master-host1 fails and > lago-basic-suite-master-host0 succeeds, so only > lago-basic-suite-master-host1 is analyzed below. > > Please find here the most relevant steps causing this error: > 1. The engine sends Host.setupNetworks to the hosts in > line 40279 - 40295 in [1] with > "id":"02298344-165f-47e4-9ea4-7c17a55d37f8". > 2. The host executes the Host.setupNetworks RPC call successfully in > line 1286 in [2]. > 3. The engine receives the acknowledgment of the successful execution > in line 40716 and 40717 in [1]. > 4. The error occurs in line 40718: > '[org.ovirt.vdsm.jsonrpc.client.JsonRpcClient] (ResponseWorker) [] > Not able to update response for > "02298344-165f-47e4-9ea4-7c17a55d37f8"'. This means the engine can > not process the acknowledgment of the successful execution. > 5. The command HostSetupNetworksVDS is aborted. > So Host.getCapabilities is skipped and the engine database is not > updated with the new network configuration of the host. > 6. Since the test script relays on the information from database > about host network configuration, it does not see that > Host.setupNetworks is successfully executed and stops with the > error "False != True after 180 seconds" [3] > > So the fault happens before or in step 4 and is around the jsonrpc > communication. > > It is an open action item to precise the location of the fault. > > > > [1] > http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_ > master/5217/artifact/exported-artifacts/basic-suit-master- > el7/test_logs/basic-suite-master/post-005_network_by_ > label.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/engine.log > > [2] > http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_ > master/5217/artifact/exported-artifacts/basic-suit-master- > el7/test_logs/basic-suite-master/post-005_network_by_ > label.py/lago-basic-suite-master-host1/_var_log/vdsm/vdsm.log > > [3] > http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_ > master/5217/testReport/junit/(root)/005_network_by_label/ > assign_labeled_network/ > > > > On Thu, 9 Feb 2017 14:52:52 +0200 > Shlomo Ben David <sbend...@redhat.com> wrote: > > > Hi, > > > > > > *Test failed:* [test-repo_ovirt_experimental_master] > > > > *Link to suspected patches:* n/a > > > > *Link to Job:* > > http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/5217 > > > > *Link to all logs:* > > http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_ > master/5217/artifact/exported-artifacts/basic-suit-master- > el7/test_logs/basic-suite-master/post-005_network_by_label.py/ > > > > *Error snippet from the log: * > > > > <error> > > > > ifup/VLAN100_Network::ERROR::2017-02-09 > > 06:21:15,236::concurrent::189::root::(run) FINISH thread > > <Thread(ifup/VLAN100_Network, started daemon 140189099861760)> failed > > Traceback (most recent call last): > > File "/usr/lib/python2.7/site-packages/vdsm/concurrent.py", line > > 185, in run ret = func(*args, **kwargs) > > File > > "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", > > line 949, in _exec_ifup _exec_ifup_by_name(iface.name, cgroup) > > File > > "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", > > line 935, in _exec_ifup_by_name raise > > ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if out else '') > > ConfigNetworkError: (29, 'Determining IPv6 information for > > VLAN100_Network... failed.') > > > > </error> > > > > Best Regards, > > > > Shlomi Ben-David | Software Engineer | Red Hat ISRAEL > > RHCSA | RHCVA | RHCE > > IRC: shlomibendavid (on #rhev-integ, #rhev-dev, #rhev-ci) > > > > OPEN SOURCE - 1 4 011 && 011 4 1 > >
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra