Re: [ovirt-devel] [OST Failure Report] [oVirt master] [09.02.2017] [test-repo_ovirt_experimental_master]
Dominik, Here is the exception which caused "Not able to update response" 2017-02-09 06:30:41,828-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-13) [1abe9b01] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM lago-basic-suite-master-host1 command HostSetupNetworksVDS failed: 880 > 879 2017-02-09 06:30:41,829-05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (org.ovirt.thread.pool-7-thread-13) [1abe9b01] Command 'HostSetupNetworksVDSCommand(HostName = lago-basic-suite-master-host1, HostSetupNetworksVdsCommandParameters:{runAsync='true', hostId='42fa76b6-7bf9-463e-80fa-fcca9b0d3e81', vds='Host[lago-basic-suite-master-host1,42fa76b6-7bf9-463e-80fa-fcca9b0d3e81]', rollbackOnFailure='true', connectivityTimeout='120', networks='[HostNetwork:{defaultRoute='false', bonding='false', networkName='Labeled_Network', nicName='eth0', vlan='600', mtu='0', vmNetwork='false', stp='false', properties='null', ipv4BootProtocol='NONE', ipv4Address='null', ipv4Netmask='null', ipv4Gateway='null', ipv6BootProtocol='NONE', ipv6Address='null', ipv6Prefix='null', ipv6Gateway='null'}]', removedNetworks='[]', bonds='[]', removedBonds='[]', clusterSwitchType='LEGACY'})' execution failed: VDSGenericException: VDSNetworkException: 880 > 879 2017-02-09 06:30:41,829-05 DEBUG [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (org.ovirt.thread.pool-7-thread-13) [1abe9b01] Exception: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: 880 > 879 at org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.proceedProxyReturnValue(BrokerCommandBase.java:188) [vdsbroker.jar:] at org.ovirt.engine.core.vdsbroker.vdsbroker.FutureVDSCommand.get(FutureVDSCommand.java:70) [vdsbroker.jar:] at org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand.executeCommand(HostSetupNetworksCommand.java:290) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1251) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1391) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:2055) [bll.jar:] at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:164) [utils.jar:] at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:103) [utils.jar:] at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1451) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:397) [bll.jar:] This could be connected with the specific version of the jackson that we use. On Tue, Feb 21, 2017 at 4:42 PM, Dominik Hollerwrote: > A deep analysis of the logfiles gives details about the > unexpected behavior, but I regret to not provide the fault causing the > unexpected behavior. > > To get this fault, the help of someone familiar with > org.ovirt.vdsm.jsonrpc.client.JsonRpcClient is needed. > > In the failing test "assign_labeled_network" a (labeld) network is > assigned to the cluster. For this reason the network has to be added to > the hosts. After that, the test "assign_labeled_network" checks, if the > engine acknowledges that hosts are in the labeld network. This > execution of the test failed, because this acknowledge of the engine > is missing after 180 seconds [3]. > > There are two hosts lago-basic-suite-master-host0 and > lago-basic-suite-master-host1 in the scenario. > lago-basic-suite-master-host1 fails and > lago-basic-suite-master-host0 succeeds, so only > lago-basic-suite-master-host1 is analyzed below. > > Please find here the most relevant steps causing this error: > 1. The engine sends Host.setupNetworks to the hosts in >line 40279 - 40295 in [1] with >"id":"02298344-165f-47e4-9ea4-7c17a55d37f8". > 2. The host executes the Host.setupNetworks RPC call successfully in >line 1286 in [2]. > 3. The engine receives the acknowledgment of the successful execution >in line 40716 and 40717 in [1]. > 4. The error occurs in line 40718: >'[org.ovirt.vdsm.jsonrpc.client.JsonRpcClient] (ResponseWorker) [] >Not able to update response for >"02298344-165f-47e4-9ea4-7c17a55d37f8"'. This means the engine can >not process the acknowledgment of the successful execution. > 5. The command HostSetupNetworksVDS is aborted. >So Host.getCapabilities is skipped and the engine database is not >updated with the new network configuration of the host. > 6. Since the test script relays on the information from database >about host network configuration, it does not see that >Host.setupNetworks is successfully executed and stops with the >error
Re: [ovirt-devel] [OST Failure Report] [oVirt master] [09.02.2017] [test-repo_ovirt_experimental_master]
A deep analysis of the logfiles gives details about the unexpected behavior, but I regret to not provide the fault causing the unexpected behavior. To get this fault, the help of someone familiar with org.ovirt.vdsm.jsonrpc.client.JsonRpcClient is needed. In the failing test "assign_labeled_network" a (labeld) network is assigned to the cluster. For this reason the network has to be added to the hosts. After that, the test "assign_labeled_network" checks, if the engine acknowledges that hosts are in the labeld network. This execution of the test failed, because this acknowledge of the engine is missing after 180 seconds [3]. There are two hosts lago-basic-suite-master-host0 and lago-basic-suite-master-host1 in the scenario. lago-basic-suite-master-host1 fails and lago-basic-suite-master-host0 succeeds, so only lago-basic-suite-master-host1 is analyzed below. Please find here the most relevant steps causing this error: 1. The engine sends Host.setupNetworks to the hosts in line 40279 - 40295 in [1] with "id":"02298344-165f-47e4-9ea4-7c17a55d37f8". 2. The host executes the Host.setupNetworks RPC call successfully in line 1286 in [2]. 3. The engine receives the acknowledgment of the successful execution in line 40716 and 40717 in [1]. 4. The error occurs in line 40718: '[org.ovirt.vdsm.jsonrpc.client.JsonRpcClient] (ResponseWorker) [] Not able to update response for "02298344-165f-47e4-9ea4-7c17a55d37f8"'. This means the engine can not process the acknowledgment of the successful execution. 5. The command HostSetupNetworksVDS is aborted. So Host.getCapabilities is skipped and the engine database is not updated with the new network configuration of the host. 6. Since the test script relays on the information from database about host network configuration, it does not see that Host.setupNetworks is successfully executed and stops with the error "False != True after 180 seconds" [3] So the fault happens before or in step 4 and is around the jsonrpc communication. It is an open action item to precise the location of the fault. [1] http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/5217/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-005_network_by_label.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/engine.log [2] http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/5217/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-005_network_by_label.py/lago-basic-suite-master-host1/_var_log/vdsm/vdsm.log [3] http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/5217/testReport/junit/(root)/005_network_by_label/assign_labeled_network/ On Thu, 9 Feb 2017 14:52:52 +0200 Shlomo Ben Davidwrote: > Hi, > > > *Test failed:* [test-repo_ovirt_experimental_master] > > *Link to suspected patches:* n/a > > *Link to Job:* > http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/5217 > > *Link to all logs:* > http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/5217/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-005_network_by_label.py/ > > *Error snippet from the log: * > > > > ifup/VLAN100_Network::ERROR::2017-02-09 > 06:21:15,236::concurrent::189::root::(run) FINISH thread > failed > Traceback (most recent call last): > File "/usr/lib/python2.7/site-packages/vdsm/concurrent.py", line > 185, in run ret = func(*args, **kwargs) > File > "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", > line 949, in _exec_ifup _exec_ifup_by_name(iface.name, cgroup) > File > "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", > line 935, in _exec_ifup_by_name raise > ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if out else '') > ConfigNetworkError: (29, 'Determining IPv6 information for > VLAN100_Network... failed.') > > > > Best Regards, > > Shlomi Ben-David | Software Engineer | Red Hat ISRAEL > RHCSA | RHCVA | RHCE > IRC: shlomibendavid (on #rhev-integ, #rhev-dev, #rhev-ci) > > OPEN SOURCE - 1 4 011 && 011 4 1 ___ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
[OST Failure Report] [oVirt master] [09.02.2017] [test-repo_ovirt_experimental_master]
Hi, *Test failed:* [test-repo_ovirt_experimental_master] *Link to suspected patches:* n/a *Link to Job:* http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/5217 *Link to all logs:* http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/5217/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-005_network_by_label.py/ *Error snippet from the log: * ifup/VLAN100_Network::ERROR::2017-02-09 06:21:15,236::concurrent::189::root::(run) FINISH threadfailed Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/concurrent.py", line 185, in run ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 949, in _exec_ifup _exec_ifup_by_name(iface.name, cgroup) File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 935, in _exec_ifup_by_name raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if out else '') ConfigNetworkError: (29, 'Determining IPv6 information for VLAN100_Network... failed.') Best Regards, Shlomi Ben-David | Software Engineer | Red Hat ISRAEL RHCSA | RHCVA | RHCE IRC: shlomibendavid (on #rhev-integ, #rhev-dev, #rhev-ci) OPEN SOURCE - 1 4 011 && 011 4 1 ___ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra