[Yahoo-eng-team] [Bug 1545584] [NEW] OVN devstack: Network creation fails when a VM with provider and private network interfaces is activated

2016-02-14 Thread Mala Anand
Public bug reported:

We have a 5-node OVN devstack installation. We created networks, subnets,
and routers, and activated VMs on the private network. We then added a
provider network and activated VMs with both private and provider network
interfaces. In this devstack setup we also started two ovsdb-server
instances, one listening on port 6640 and another on port 6641; the OVN
controller plug-in connects to the ovsdb server on port 6641.
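For reference, the provider network was created along the following lines
(a rough sketch only; the credentials, physical network name, VLAN id and
CIDR below are placeholders, not the exact values we used):

    from neutronclient.v2_0 import client

    # Credentials and endpoint are placeholders
    neutron = client.Client(username='admin', password='PASSWORD',
                            tenant_name='admin',
                            auth_url='http://CONTROLLER:5000/v2.0')

    # Provider network; the network type, physnet name and VLAN id are
    # illustrative values, not the ones used in the actual setup
    provider_net = neutron.create_network({'network': {
        'name': 'provider-net',
        'provider:network_type': 'vlan',
        'provider:physical_network': 'physnet1',
        'provider:segmentation_id': 100,
    }})['network']

    # Subnet so that ports on the provider network get addresses
    neutron.create_subnet({'subnet': {
        'network_id': provider_net['id'],
        'ip_version': 4,
        'cidr': '172.16.0.0/24',
    }})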

When a VM with both a private and a provider interface is activated, the
API returns an internal server error, and the neutron server log shows the
connection being lost in the middle of a MySQL operation.

The Rally benchmark was enhanced to activate a VM with both network
interfaces, roughly as sketched below.
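A sketch of what the enhanced scenario does when booting the VM (the image,
flavor and network ids below are placeholders; the real code lives in
cnps_ovn.py, which is not shown here):

    from novaclient import client as nova_client

    # Credentials and endpoint are placeholders
    nova = nova_client.Client('2', 'admin', 'PASSWORD', 'admin',
                              'http://CONTROLLER:5000/v2.0')

    # Placeholders: UUIDs of the image, flavor and the two networks
    IMAGE_ID = '<glance image uuid>'
    FLAVOR_ID = '<flavor id>'
    PRIVATE_NET_ID = '<private network uuid>'
    PROVIDER_NET_ID = '<provider network uuid>'

    # Boot one VM with a NIC on the private (overlay) network and a
    # second NIC on the provider network
    server = nova.servers.create(name='vm-dual-nic',
                                 image=IMAGE_ID,
                                 flavor=FLAVOR_ID,
                                 nics=[{'net-id': PRIVATE_NET_ID},
                                       {'net-id': PROVIDER_NET_ID}])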

Rally errors:
2016-02-12 13:46:36.403 28528 DEBUG neutronclient.client [-] RESP: 500 {'Date': 
'Fri, 12 Feb 2016 19:46:36 GMT', 'Connection': 'keep-alive', 'Content-Type':
 'application/json; charset=UTF-8', 'Content-Length': '150', 
'X-Openstack-Request-Id': 'req-a5d49508-8501-4802-a46b-674d36a46d23'} 
{"NeutronError": {"message": 
 "Request Failed: internal server error while processing your request.", 
"type": "HTTPInternalServerError", "detail": ""}} http_log_resp 
/usr/lib/python2.7/site-packages/neutronclient/common/utils.py:146
2016-02-12 13:46:36.403 28528 DEBUG neutronclient.v2_0.client [-] Error 
message: {"NeutronError": {"message": "Request Failed: internal server error 
while processing your request.", "type": "HTTPInternalServerError", "detail": 
""}} _handle_fault_response 
/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py:176
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner [-] Request Failed: 
internal server error while processing your request.
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner Traceback (most recent 
call last):
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner   File 
"/usr/lib/python2.7/site-packages/rally/task/runner.py", line 64, in 
_run_scenario_once
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner 
method_name)(**kwargs) or scenario_output
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner   File 
"/home/stack/sahil/OVN/rally_runs/cnps_ovn.py", line 100, in 
boot_server_overlay_network
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner 
self.wait_for_dhcp_port_up()
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner   File 
"/home/stack/sahil/OVN/rally_runs/cnps_ovn.py", line 200, in 
wait_for_dhcp_port_up
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner dhcp_port_id = 
self._get_dhcp_port(network_id, poll_count=poll_count)["id"]
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner   File 
"/usr/lib/python2.7/site-packages/rally/cnp/cnp_base_scenario.py", line 510, in 
_get_dhcp_port
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner 
device_owner=device_owner)
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner   File 
"/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 102, in 
with_params
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner ret = 
self.function(instance, *args, **kwargs)
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner   File 
"/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 547, in 
list_ports
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner **_params)
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner   File 
"/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 307, in 
list
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner for r in 
self._pagination(collection, path, **params):
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner   File 
"/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 320, in 
_pagination
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner res = self.get(path, 
params=params)
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner   File 
"/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 293, in 
get
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner headers=headers, 
params=params)
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner   File 
"/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 270, in 
retry_request
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner headers=headers, 
params=params)
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner res = self.get(path, 
params=params)
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner   File 
"/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 293, in 
get
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner headers=headers, 
params=params)
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner   File 
"/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 270, in 
retry_request
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner headers=headers, 
params=params)
2016-02-12 13:46:36.405 28528 ERROR rally.task.runner   File 
"/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 211, in 
do_request
2016-02-12 13:46:36.405 28528 ERROR 
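The wait_for_dhcp_port_up() step in the traceback above amounts to polling
list_ports until the DHCP port shows up on the new network. A rough
equivalent is sketched below (the cnps_ovn.py helpers are not public, and
the 'network:dhcp' device owner is an assumption based on Neutron's
standard DHCP port owner):

    import time

    def wait_for_dhcp_port(neutron, network_id, poll_count=30, interval=2):
        """Poll until Neutron reports a DHCP port on the given network.

        `neutron` is a neutronclient.v2_0.client.Client instance.
        """
        for _ in range(poll_count):
            ports = neutron.list_ports(network_id=network_id,
                                       device_owner='network:dhcp')['ports']
            if ports:
                return ports[0]['id']
            time.sleep(interval)
        raise RuntimeError('DHCP port never appeared on network %s' % network_id)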

[Yahoo-eng-team] [Bug 1544999] [NEW] Encountered database ACL.", "error":"referential integrity violation" running OVN stack

2016-02-12 Thread Mala Anand
Public bug reported:

Scenario tested: each tenant has 2 networks, each with a subnet, connected 
through a router. Each tenant gets two VMs, one in each network. When the two 
VMs boot up, they send iperf traffic across the router.
The test scales tenants to 500 and measures scaling performance.
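Per tenant, the topology is built roughly as follows (a sketch only; names,
CIDRs and credentials are placeholders):

    from neutronclient.v2_0 import client

    neutron = client.Client(username='TENANT_USER', password='PASSWORD',
                            tenant_name='TENANT',
                            auth_url='http://CONTROLLER:5000/v2.0')

    router = neutron.create_router({'router': {'name': 'r1'}})['router']

    for name, cidr in [('net1', '10.0.1.0/24'), ('net2', '10.0.2.0/24')]:
        net = neutron.create_network({'network': {'name': name}})['network']
        subnet = neutron.create_subnet({'subnet': {
            'network_id': net['id'],
            'ip_version': 4,
            'cidr': cidr,
        }})['subnet']
        # Connect both subnets to the same router so the two VMs can
        # reach each other (and run iperf) across it
        neutron.add_interface_router(router['id'], {'subnet_id': subnet['id']})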

Results: scaled up to 330 VMs successfully, but VMs failed to boot beyond
330 VMs.

The neutron servers, including the plugin, were pegged at 100% CPU, and the 
following database integrity violation appears:
2016-01-28 04:55:04.723 2757 WARNING requests.packages.urllib3.connectionpool 
[req-7964fc18-a95c-4464-a568-038a453d006e 7a0b7c6414734a93b5dffbc666534690 
e53d6cd40c5140c58ec014eb56070917 - - -] Connection pool is full, discarding 
connection: identity.open.softlayer.com
2016-01-28 04:55:04.772 2757 WARNING requests.packages.urllib3.connectionpool 
[-] Connection pool is full, discarding connection: identity.open.softlayer.com
2016-01-28 04:55:05.776 2757 WARNING requests.packages.urllib3.connectionpool 
[req-6b80241f-0928-4d93-a80e-988a7b3e9690 7a0b7c6414734a93b5dffbc666534690 
e53d6cd40c5140c58ec014eb56070917 - - -] Connection pool is full, discarding 
connection: identity.open.softlayer.com
2016-01-28 04:55:55.380 2757 ERROR neutron.agent.ovsdb.impl_idl [-] OVSDB Error:
{"details":"Table Logical_Switch column acls row 
36d940b2-26cc-426a-bda6-dd2491f18397 references nonexistent row 
0644cffd-71ca-467f-8e1d-6652968870ef in table ACL.","error":"referential 
integrity violation"}

2016-01-28 04:55:55.443 2757 ERROR neutron.agent.ovsdb.impl_idl 
[req-0bd44851-ecd9-4957-978f-350b52ada25b cc27c50b17fc4954905db5f3f3eed730 
e53d6cd40c5140c58ec014eb56070917 - - -] Traceback (most recent call last):
File 
"/opt/neutron/lib/python2.7/site-packages/neutron/agent/ovsdb/native/connection.py",
 line 99, in run
txn.results.put(txn.do_commit())
File 
"/opt/neutron/lib/python2.7/site-packages/neutron/agent/ovsdb/impl_idl.py", 
line 106, in do_commit
raise RuntimeError(msg)
RuntimeError: OVSDB Error:
{"details":"Table Logical_Switch column acls row 
36d940b2-26cc-426a-bda6-dd2491f18397 references nonexistent row 
0644cffd-71ca-467f-8e1d-6652968870ef in table ACL.","error":"referential 
integrity violation"} 

This problem occurs consistently when scaling to 500 VMs.

We have also seen this problem in a 5-node devstack installation; in that
case we increased the number of Neutron API threads to 18 to recreate it.

** Affects: neutron
 Importance: Undecided
 Status: New

https://bugs.launchpad.net/bugs/1544999