[Yahoo-eng-team] [Bug 1839920] [NEW] Macvtap CI fails on Train

2019-08-12 Thread Lenny
Public bug reported:


MacVtap CI [1] started to fail after commit [2] was merged.

We think it is related to this libvirt commit:
https://github.com/libvirt/libvirt/commit/b91a33638476cf57d910b6056a8fc11921edd029#diff-28bc83a0c3470bba712dfa6824a79c9d
It changes libvirt from setting the admin MAC to setting the effective MAC.
The problem is that the sriov-nic agent relies on the admin MAC to send RPC
to the neutron server. If the MAC and the PCI slot don't match, the agent
ignores the device and the VM is stuck in spawn until it times out.
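
To make the failure mode concrete, here is a minimal sketch, not the agent's
actual code. It assumes the agent compares (MAC, PCI slot) pairs read from the
host against the pairs Neutron has bound. The names filter_known_devices,
host_vfs and expected_ports, and the sample addresses, are hypothetical.

```python
def filter_known_devices(host_vfs, expected_ports):
    """Return the (mac, pci_slot) pairs that the host and Neutron agree on.

    host_vfs:       dict of pci_slot -> admin MAC read from the host (assumed)
    expected_ports: set of (mac, pci_slot) pairs bound in Neutron (assumed)
    """
    matched = set()
    for pci_slot, mac in host_vfs.items():
        pair = (mac.lower(), pci_slot)
        if pair in expected_ports:
            matched.add(pair)
    return matched


expected = {("fa:16:3e:aa:bb:cc", "0000:03:00.2")}

# Admin MAC set on the VF: the pair matches and the port would be reported.
before = filter_known_devices({"0000:03:00.2": "fa:16:3e:aa:bb:cc"}, expected)

# Admin MAC not updated (shown here as all zeros, an assumption): nothing
# matches, so no update is sent and the VM would wait until the timeout.
after = filter_known_devices({"0000:03:00.2": "00:00:00:00:00:00"}, expected)
```

In this sketch `before` contains the one bound pair and `after` is empty, which
is the kind of mismatch described above.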


[1] https://wiki.openstack.org/wiki/ThirdPartySystems/Mellanox_CI
[2] https://review.opendev.org/#/c/31/

** Affects: nova
 Importance: Undecided
 Assignee: Lenny (lennyb)
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1839920

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1839920/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1708920] [NEW] Cold migration fails

2017-08-06 Thread Lenny
Public bug reported:


Nova cold migration intermittently fails due to a broken connection to the
SQL cell database.

Setup: devstack master (pike)
  allinone physical server
  compute physical server
  SR-IOV over Mellanox ConnectX-4 NICs

Scenario:
Running the tempest cold migration test several times; it fails on the 3rd run.
# testr run tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_cold_migration

Issue:
One of the computes loses its SQL connection to novacell [1].
Cold migration fails because migration to the same node is not allowed [2].

Logs:
AllinOne http://52.169.200.208/tmp/cold_migration_bug_20170806/controller/
Compute  http://52.169.200.208/tmp/cold_migration_bug_20170806/compute

[1] novacell error:
http://52.169.200.208/tmp/cold_migration_bug_20170806/controller/logs/n-cond-cell1.log
http://paste.openstack.org/show/617598/

[2] Compute error:
http://52.169.200.208/tmp/cold_migration_bug_20170806/compute/logs/n-cpu.log
http://paste.openstack.org/show/617599/

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1708920

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1708920/+subscriptions



[Yahoo-eng-team] [Bug 1618382] [NEW] test_update_instance_port_admin_state fails sometimes with DB update error

2016-08-30 Thread Lenny
Public bug reported:

Sometimes in Mellanox CI we see that
test_update_instance_port_admin_state[1] fails with error [2]
"StaleDataError: UPDATE statement on table 'standardattributes' expected
to update 1 row(s); 0 were matched." [3]

[1] http://13.69.151.247/Test_Neutron_SRIOV_cloudx25/233_cloudx-25//testr_results.html.gz
[2] http://paste.openstack.org/show/564796/
[3] http://13.69.151.247/Test_Neutron_SRIOV_cloudx25/233_cloudx-25/logs/q-svc.log.gz
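
For readers unfamiliar with this error: "expected to update 1 row(s); 0 were
matched" is the kind of message an ORM raises when a version-checked UPDATE
finds no row with the revision it read earlier. The sketch below reproduces
only that mechanism with the standard sqlite3 module. It assumes, without
confirmation from the logs above, that two writers raced on the same row; the
table layout, the revision_number column and the versioned_update helper are
illustrative, not Neutron's schema or code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE standardattributes ("
    "id INTEGER PRIMARY KEY, description TEXT, revision_number INTEGER)"
)
conn.execute("INSERT INTO standardattributes VALUES (1, 'port', 0)")


def versioned_update(conn, row_id, seen_revision, description):
    """Update a row only if its revision is still the one the caller read.

    Returns the number of rows the UPDATE actually matched.
    """
    cur = conn.execute(
        "UPDATE standardattributes "
        "SET description = ?, revision_number = ? "
        "WHERE id = ? AND revision_number = ?",
        (description, seen_revision + 1, row_id, seen_revision),
    )
    return cur.rowcount


# Two writers both read revision 0, then both try to write.
first = versioned_update(conn, 1, 0, "written by first writer")
second = versioned_update(conn, 1, 0, "written by second writer")
```

Here the first write matches one row and bumps the revision, so the second
write matches none. A caller that expects exactly one matched row would treat
the second result as stale data.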

** Affects: neutron
 Importance: Undecided
 Status: New

** Description changed:

- 
- Sometimes tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_update_instance_port_admin_state [1]
- fails with error [2] "StaleDataError: UPDATE statement on table 'standardattributes' expected to update 1 row(s); 0 were matched." [3]
- 
+ Sometimes in Mellanox CI we see that
+ test_update_instance_port_admin_state[1] fails with error [2]
+ "StaleDataError: UPDATE statement on table 'standardattributes' expected
+ to update 1 row(s); 0 were matched." [3]
  
  [1] http://13.69.151.247/Test_Neutron_SRIOV_cloudx25/233_cloudx-25//testr_results.html.gz
  [2] http://paste.openstack.org/show/564796/
  [3] http://13.69.151.247/Test_Neutron_SRIOV_cloudx25/233_cloudx-25/logs/q-svc.log.gz

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1618382

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1618382/+subscriptions



[Yahoo-eng-team] [Bug 1560860] [NEW] mellanox infiniband SR-IOV(ib_hostdev vif) detach port fails

2016-03-23 Thread Lenny
Public bug reported:

Detaching a direct SR-IOV port causes an exception.

# neutron port-create --binding:vnic_type=direct private
# nova boot --flavor m1.small --image cirros-mellanox-x86_64-disk-ib --nic port-id=a247d89e-dae5-4d65-b414-e7bf3a26bfd1 vm1
# nova suspend vm1

logs:
https://review.openstack.org/#/c/286668
http://144.76.193.39/ci-artifacts/286668/3/Neutron-Networking-MLNX-ML2/

Traceback message:

2016-03-03 04:45:42.775 1801 ERROR nova.compute.manager [instance: cdf2e34d-bc2e-4edb-aff7-516b97487730] Traceback (most recent call last):
2016-03-03 04:45:42.775 1801 ERROR nova.compute.manager [instance: cdf2e34d-bc2e-4edb-aff7-516b97487730]   File "/opt/stack/nova/nova/compute/manager.py", line 6515, in _error_out_instance_on_exception
2016-03-03 04:45:42.775 1801 ERROR nova.compute.manager [instance: cdf2e34d-bc2e-4edb-aff7-516b97487730]     yield
2016-03-03 04:45:42.775 1801 ERROR nova.compute.manager [instance: cdf2e34d-bc2e-4edb-aff7-516b97487730]   File "/opt/stack/nova/nova/compute/manager.py", line 4172, in suspend_instance
2016-03-03 04:45:42.775 1801 ERROR nova.compute.manager [instance: cdf2e34d-bc2e-4edb-aff7-516b97487730]     self.driver.suspend(context, instance)
2016-03-03 04:45:42.775 1801 ERROR nova.compute.manager [instance: cdf2e34d-bc2e-4edb-aff7-516b97487730]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2638, in suspend
2016-03-03 04:45:42.775 1801 ERROR nova.compute.manager [instance: cdf2e34d-bc2e-4edb-aff7-516b97487730]     self._detach_sriov_ports(context, instance, guest)
2016-03-03 04:45:42.775 1801 ERROR nova.compute.manager [instance: cdf2e34d-bc2e-4edb-aff7-516b97487730]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3425, in _detach_sriov_ports
2016-03-03 04:45:42.775 1801 ERROR nova.compute.manager [instance: cdf2e34d-bc2e-4edb-aff7-516b97487730]     if vif['vnic_type'] in network_model.VNIC_TYPES_SRIOV
2016-03-03 04:45:42.775 1801 ERROR nova.compute.manager [instance: cdf2e34d-bc2e-4edb-aff7-516b97487730] AttributeError: 'LibvirtConfigGuestHostdevPCI' object has no attribute 'source_dev'
2016-03-03 04:45:42.775 1801 ERROR nova.compute.manager [instance: cdf2e34d-bc2e-4edb-aff7-516b97487730]
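
The traceback ends in an AttributeError on 'source_dev', which suggests that
the code walking the guest's devices assumed every device object carries that
attribute. The sketch below shows the general pattern with stand-in classes
only; FakeInterface, FakeHostdevPCI and source_devs are hypothetical names and
this is not Nova's code or its eventual fix.

```python
class FakeInterface:
    """Stand-in for an interface-style device config that has source_dev."""

    def __init__(self, source_dev):
        self.source_dev = source_dev


class FakeHostdevPCI:
    """Stand-in for a hostdev-style device config without source_dev."""

    def __init__(self, address):
        self.address = address


def source_devs(devices):
    """Collect source_dev from devices that define it, skipping the rest."""
    found = []
    for dev in devices:
        name = getattr(dev, "source_dev", None)
        if name is not None:
            found.append(name)
    return found


devices = [FakeInterface("eth3"), FakeHostdevPCI("0000:03:00.2")]
result = source_devs(devices)

# Direct attribute access on the hostdev stand-in fails the same way as the
# traceback above.
try:
    devices[1].source_dev
    raised = False
except AttributeError:
    raised = True
```

With mixed device types, `getattr` with a default lets the loop skip objects
that lack the attribute instead of aborting the whole suspend operation.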

** Affects: nova
 Importance: Undecided
 Assignee: Moshe Levi (moshele)
 Status: New


** Tags: pci

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1560860

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1560860/+subscriptions


[Yahoo-eng-team] [Bug 1535367] [NEW] Failure on SR-IOV . Missing 'parent_addr

2016-01-18 Thread Lenny
Public bug reported:

Mellanox CI fails on SR-IOV hardware.
1. Running nova from master commit ffa07781ab47baf096854cd6c22a3e433eab3f0d
2. Full logs: http://144.76.193.39/ci-artifacts/269109/1/Nova-ML2-Sriov/
3. Reproduce:
   ./stack.sh
   neutron port-create --binding:vnic_type=direct private
   nova boot --flavor m1.small --image mellanox_eth --nic port-id= vm1
4. Port binding fails:
   nova fails to find an appropriate host

** Affects: nova
 Importance: Critical
 Status: Confirmed


** Tags: sr-iov

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1535367

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1535367/+subscriptions
