[Openstack] Cannot delete stack after update failed

2017-12-18 Thread Samuel Monderer
I tried to update an existing stack by just changing a parameter that sets
the number of server groups, but the update failed because it exceeded the
server group quota.
I then tried to delete the stack, but that failed too; heat-engine.log shows
the following error:

2017-12-18 23:34:37.755 10017 ERROR heat.engine.resource Traceback (most recent call last):
2017-12-18 23:34:37.755 10017 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 770, in _action_recorder
2017-12-18 23:34:37.755 10017 ERROR heat.engine.resource     yield
2017-12-18 23:34:37.755 10017 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 1707, in delete
2017-12-18 23:34:37.755 10017 ERROR heat.engine.resource     *action_args)
2017-12-18 23:34:37.755 10017 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/scheduler.py", line 352, in wrapper
2017-12-18 23:34:37.755 10017 ERROR heat.engine.resource     step = next(subtask)
2017-12-18 23:34:37.755 10017 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 823, in action_handler_task
2017-12-18 23:34:37.755 10017 ERROR heat.engine.resource     done = check(handler_data)
2017-12-18 23:34:37.755 10017 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/stack_resource.py", line 549, in check_delete_complete
2017-12-18 23:34:37.755 10017 ERROR heat.engine.resource     return self._check_status_complete(self.DELETE)
2017-12-18 23:34:37.755 10017 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/stack_resource.py", line 415, in _check_status_complete
2017-12-18 23:34:37.755 10017 ERROR heat.engine.resource     action=action)
2017-12-18 23:34:37.755 10017 ERROR heat.engine.resource ResourceFailure: resources[0]: resources.pairs.Resource DELETE failed: resources.core: Stack DELETE cancelled
2017-12-18 23:34:37.755 10017 ERROR heat.engine.resource

Which is strange, because all other stack commands succeed.

Any idea what could cause this problem?
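
In case it helps anyone hitting the same thing, a rough recovery sequence
(a sketch, assuming a reasonably recent python-openstackclient and
python-heatclient; "my-stack" is a placeholder, and stack-abandon only
works if the operator has set enable_stack_abandon=true in heat.conf):

# Walk the nested stacks to find which resource is actually stuck
openstack stack resource list --nested-depth 5 my-stack

# Stack deletes are idempotent, so re-issuing the delete often succeeds
# once the cancelled nested DELETE has settled
openstack stack delete --yes --wait my-stack

# Last resort: remove the stack from Heat without touching its resources
heat stack-abandon my-stack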


Re: [Openstack] need help understanding neutron/OVS

2017-12-18 Thread Remo Mattei
I think you are looking at the wrong file. If you are using OVS you need to
look at the OVS agent config instead. Example:

[root@os-poc-controller-0 ml2]# grep -i mtu openvswitch_agent.ini
# MTU size of veth interfaces (integer value)
#veth_mtu = 9000
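
Note that veth_mtu only covers the veth pairs the agent creates. On Mitaka
and later, the MTU that tenant networks inherit is normally driven by the
options below (just a sketch for a 9000-byte fabric; the paths and values
are assumptions, adjust to your deployment):

# /etc/neutron/neutron.conf
[DEFAULT]
global_physnet_mtu = 9000

# /etc/neutron/plugins/ml2/ml2_conf.ini
[ml2]
# Cap for networks that cross L3 tunnels; VXLAN adds ~50 bytes of
# encapsulation, so VXLAN tenant networks end up at 8950
path_mtu = 9000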



> On Dec 18, 2017, at 20:06, Manuel Sopena Ballesteros 
> <manuel...@garvan.org.au> wrote:
> 
> Yes,
>  
> This is my neutron ml2 config on the compute node
>  
> [ml2]
> type_drivers = flat,vlan,vxlan
> tenant_network_types = vxlan
> path_mtu = 9000
> mechanism_drivers = openvswitch,l2population
> extension_drivers = port_security
>  
> [ml2_type_vlan]
> network_vlan_ranges =
>  
> [ml2_type_flat]
> flat_networks = physnet1
>  
> [ml2_type_vxlan]
> vni_ranges = 1:1000
> vxlan_group = 239.1.1.1
>  
> [securitygroup]
> firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver
>  
> [agent]
> tunnel_types = vxlan
> l2_population = true
> arp_responder = true
>  
> [ovs]
> ovsdb_connection = tcp:127.0.0.1:6640
> local_ip = 10.0.32.23
>  
> thank you
>  
> Manuel
>  
> From: Remo Mattei [mailto:r...@italy1.com]
> Sent: Tuesday, December 19, 2017 12:56 PM
> To: Manuel Sopena Ballesteros
> Cc: openstack@lists.openstack.org 
> Subject: Re: [Openstack] need help understanding neutron/OVS
>  
> Did you change the mtu on neutron config?
> 
> Sent from my iPad
> 
> On Dec 18, 2017, at 5:38 PM, Manuel Sopena Ballesteros 
> <manuel...@garvan.org.au> wrote:
> 
> Dear Openstack community,
>  
> Openstack environment:
>  
> 2 compute nodes each has:
> · 1x AMD Opteron(tm) Processor 6282 SE (56 cpus)
> · 512GB RAM
> · 1x dual-port 25Gbps Mellanox ConnectX-4 XOR bond
>  
> Physical network:
> · Compute nodes connected through a mellanox 2700 switch (100Gbps)
> · MTU 9000
>  
> Openstack network:
> · OVS + VxLan
> · Vxlan offloading setup on physical nics
> · MTU on vms 8950
>  
> Compute nodes nics:
>  
> [root@hercules-23 ~]# ip a
> …
> 6: p2p1:  mtu 9000 qdisc mq master 
> bond0 state UP qlen 1000
> link/ether 7c:fe:90:12:23:c4 brd ff:ff:ff:ff:ff:ff
> 7: p2p2:  mtu 9000 qdisc mq master 
> bond0 state UP qlen 1000
> link/ether 7c:fe:90:12:23:c4 brd ff:ff:ff:ff:ff:ff
> 8: bond0:  mtu 9000 qdisc noqueue 
> state UP qlen 1000
> link/ether 7c:fe:90:12:23:c4 brd ff:ff:ff:ff:ff:ff
> inet 10.0.32.23/16 brd 10.0.255.255 scope global bond0
>valid_lft forever preferred_lft forever
> inet6 fe80::3826:9daa:ef1b:43b0/64 scope link
>valid_lft forever preferred_lft forever
> …
>  
>  
> VM nics:
>  
> [centos@centos2 ~]$ ip a
> …
> 2: eth0:  mtu 8950 qdisc pfifo_fast state UP 
> qlen 1000
> link/ether fa:16:3e:a3:d4:85 brd ff:ff:ff:ff:ff:ff
> inet 192.168.1.116/24 brd 192.168.1.255 scope global dynamic eth0
>valid_lft 57885sec preferred_lft 57885sec
> inet6 fe80::f816:3eff:fea3:d485/64 scope link
>valid_lft forever preferred_lft forever
>  
>  
> host to host bandwidth using nuttcp (1 thread):
>  
> [root@hercules-22 ~]# nuttcp -l65536 -fparse 10.0.32.23
> megabytes=19844.1759 real_seconds=10.00 rate_Mbps=16643.4189 tx_cpu=61 
> rx_cpu=99 retrans=0 rtt_ms=0.28
>  
> VMTP Output:
>  
> ==
> Total Scenarios:   29
> Passed Scenarios:  17 [100.00%]
> Failed Scenarios:  0 [0.00%]
> Skipped Scenarios: 12
> +----------+------------------------------------------------+-------------------+------+
> | Scenario | Scenario Name                                  | Functional Status | Data |
> +----------+------------------------------------------------+-------------------+------+
> | 1.1      | Same Network, Fixed IP, Intra-node, TCP       | PASSED            | {'tp_kbps': '7613130', 'rtt_ms': '0.46'} |
> | 1.2      | Same Network, Fixed IP, Intra-node, UDP       | PASSED            | {128: {'tp_kbps': 59954, 'loss_rate': 0.02}, 1024: {'tp_kbps': 444264, 'loss_rate': 0.05}, 8192: {'tp_kbps': 2877674, 'loss_rate': 0.0}} |
> | 1.3      | Same Network, Fixed IP, Intra-node, ICMP      | PASSED            | {'rtt avg/min/max/stddev msec': {'391-byte': '0.376/0.285/0.439/0.047', '64-byte': '0.617/0.361/1.513/0.388', '1500-byte': '0.369/0.322/0.452/0.048'}} |
> | 1.4      | Same Network, Fixed IP, Intra-node, Multicast | SKIPPED           | {} |
> | 2.1

Re: [Openstack] need help understanding neutron/OVS

2017-12-18 Thread Manuel Sopena Ballesteros
Ok,

This is it:

(neutron-openvswitch-agent)[root@hercules-23 /]# grep -i mtu /etc/neutron/plugins/ml2/openvswitch_agent.ini
# MTU size of veth interfaces (integer value)
#veth_mtu = 9000
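
Since veth_mtu is commented out, the agent is on its default. As a sanity
check you can read the MTU actually in effect along the OVS path; a sketch,
assuming the stock Neutron bridge names, with the network name as a
placeholder:

# Kernel MTU of the OVS bridges and the bond carrying the VXLAN traffic
ip link show br-int | head -1
ip link show br-tun | head -1
ip link show bond0 | head -1

# MTU Neutron advertises to instances on the tenant network
openstack network show my-vxlan-net -c mtu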

Thank you

Manuel

From: Remo Mattei [mailto:r...@italy1.com]
Sent: Tuesday, December 19, 2017 3:09 PM
To: Manuel Sopena Ballesteros
Cc: openstack@lists.openstack.org
Subject: Re: [Openstack] need help understanding neutron/OVS

I think you are looking at the wrong file. If you are using OVS you need to
look at the OVS agent config instead. Example:

[root@os-poc-controller-0 ml2]# grep -i mtu openvswitch_agent.ini
# MTU size of veth interfaces (integer value)
#veth_mtu = 9000



On Dec 18, 2017, at 20:06, Manuel Sopena Ballesteros 
<manuel...@garvan.org.au> wrote:

Yes,

This is my neutron ml2 config on the compute node

[ml2]
type_drivers = flat,vlan,vxlan
tenant_network_types = vxlan
path_mtu = 9000
mechanism_drivers = openvswitch,l2population
extension_drivers = port_security

[ml2_type_vlan]
network_vlan_ranges =

[ml2_type_flat]
flat_networks = physnet1

[ml2_type_vxlan]
vni_ranges = 1:1000
vxlan_group = 239.1.1.1

[securitygroup]
firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver

[agent]
tunnel_types = vxlan
l2_population = true
arp_responder = true

[ovs]
ovsdb_connection = tcp:127.0.0.1:6640
local_ip = 10.0.32.23

thank you

Manuel

From: Remo Mattei [mailto:r...@italy1.com]
Sent: Tuesday, December 19, 2017 12:56 PM
To: Manuel Sopena Ballesteros
Cc: openstack@lists.openstack.org
Subject: Re: [Openstack] need help understanding neutron/OVS

Did you change the mtu on neutron config?
Sent from my iPad

On Dec 18, 2017, at 5:38 PM, Manuel Sopena Ballesteros 
<manuel...@garvan.org.au> wrote:
Dear Openstack community,

Openstack environment:

2 compute nodes each has:
• 1x AMD Opteron(tm) Processor 6282 SE (56 cpus)
• 512GB RAM
• 1x dual-port 25Gbps Mellanox ConnectX-4 XOR bond

Physical network:
• Compute nodes connected through a mellanox 2700 switch (100Gbps)
• MTU 9000

Openstack network:
• OVS + VxLan
• Vxlan offloading setup on physical nics
• MTU on vms 8950

Compute nodes nics:

[root@hercules-23 ~]# ip a
…
6: p2p1:  mtu 9000 qdisc mq master bond0 
state UP qlen 1000
link/ether 7c:fe:90:12:23:c4 brd ff:ff:ff:ff:ff:ff
7: p2p2:  mtu 9000 qdisc mq master bond0 
state UP qlen 1000
link/ether 7c:fe:90:12:23:c4 brd ff:ff:ff:ff:ff:ff
8: bond0:  mtu 9000 qdisc noqueue state 
UP qlen 1000
link/ether 7c:fe:90:12:23:c4 brd ff:ff:ff:ff:ff:ff
inet 10.0.32.23/16 brd 10.0.255.255 scope global bond0
   valid_lft forever preferred_lft forever
inet6 fe80::3826:9daa:ef1b:43b0/64 scope link
   valid_lft forever preferred_lft forever
…


VM nics:

[centos@centos2 ~]$ ip a
…
2: eth0:  mtu 8950 qdisc pfifo_fast state UP 
qlen 1000
link/ether fa:16:3e:a3:d4:85 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.116/24 brd 192.168.1.255 scope global dynamic eth0
   valid_lft 57885sec preferred_lft 57885sec
inet6 fe80::f816:3eff:fea3:d485/64 scope link
   valid_lft forever preferred_lft forever


host to host bandwidth using nuttcp (1 thread):

[root@hercules-22 ~]# nuttcp -l65536 -fparse 10.0.32.23
megabytes=19844.1759 real_seconds=10.00 rate_Mbps=16643.4189 tx_cpu=61 
rx_cpu=99 retrans=0 rtt_ms=0.28

VMTP Output:

==
Total Scenarios:   29
Passed Scenarios:  17 [100.00%]
Failed Scenarios:  0 [0.00%]
Skipped Scenarios: 12
+----------+------------------------------------------------+-------------------+------+
| Scenario | Scenario Name                                  | Functional Status | Data |
+----------+------------------------------------------------+-------------------+------+
| 1.1      | Same Network, Fixed IP, Intra-node, TCP       | PASSED            | {'tp_kbps': '7613130', 'rtt_ms': '0.46'} |
| 1.2      | Same Network, Fixed IP, Intra-node, UDP       | PASSED            | {128: {'tp_kbps': 59954, 'loss_rate': 0.02}, 1024: {'tp_kbps': 444264, 'loss_rate': 0.05}, 8192: {'tp_kbps': 2877674, 'loss_rate': 0.0}} |
| 1.3      | Same Network, Fixed IP, Intra-node, ICMP      | PASSED            | {'rtt avg/min/max/stddev msec': {'391-byte': '0.376/0.285/0.439/0.047', '64-byte': '0.617/0.361/1.513/0.388', '1500-byte': '0.369/0.322/0.452/0.048'}} |
| 1.4      | Same Network, Fixed IP, Intra-node, Multicast | SKIPP

Re: [Openstack] need help understanding neutron/OVS

2017-12-18 Thread Manuel Sopena Ballesteros
Yes,

This is my neutron ml2 config on the compute node

[ml2]
type_drivers = flat,vlan,vxlan
tenant_network_types = vxlan
path_mtu = 9000
mechanism_drivers = openvswitch,l2population
extension_drivers = port_security

[ml2_type_vlan]
network_vlan_ranges =

[ml2_type_flat]
flat_networks = physnet1

[ml2_type_vxlan]
vni_ranges = 1:1000
vxlan_group = 239.1.1.1

[securitygroup]
firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver

[agent]
tunnel_types = vxlan
l2_population = true
arp_responder = true

[ovs]
ovsdb_connection = tcp:127.0.0.1:6640
local_ip = 10.0.32.23

thank you

Manuel

From: Remo Mattei [mailto:r...@italy1.com]
Sent: Tuesday, December 19, 2017 12:56 PM
To: Manuel Sopena Ballesteros
Cc: openstack@lists.openstack.org
Subject: Re: [Openstack] need help understanding neutron/OVS

Did you change the mtu on neutron config?
Sent from my iPad

On Dec 18, 2017, at 5:38 PM, Manuel Sopena Ballesteros 
<manuel...@garvan.org.au> wrote:

Dear Openstack community,



Openstack environment:



2 compute nodes each has:

· 1x AMD Opteron(tm) Processor 6282 SE (56 cpus)

· 512GB RAM

· 1x dual-port 25Gbps Mellanox ConnectX-4 XOR bond



Physical network:

· Compute nodes connected through a mellanox 2700 switch (100Gbps)

· MTU 9000



Openstack network:

· OVS + VxLan

· Vxlan offloading setup on physical nics

· MTU on vms 8950



Compute nodes nics:



[root@hercules-23 ~]# ip a

…

6: p2p1:  mtu 9000 qdisc mq master bond0 
state UP qlen 1000

link/ether 7c:fe:90:12:23:c4 brd ff:ff:ff:ff:ff:ff

7: p2p2:  mtu 9000 qdisc mq master bond0 
state UP qlen 1000

link/ether 7c:fe:90:12:23:c4 brd ff:ff:ff:ff:ff:ff

8: bond0:  mtu 9000 qdisc noqueue state 
UP qlen 1000

link/ether 7c:fe:90:12:23:c4 brd ff:ff:ff:ff:ff:ff

inet 10.0.32.23/16 brd 10.0.255.255 scope global bond0

   valid_lft forever preferred_lft forever

inet6 fe80::3826:9daa:ef1b:43b0/64 scope link

   valid_lft forever preferred_lft forever

…





VM nics:



[centos@centos2 ~]$ ip a

…

2: eth0:  mtu 8950 qdisc pfifo_fast state UP 
qlen 1000

link/ether fa:16:3e:a3:d4:85 brd ff:ff:ff:ff:ff:ff

inet 192.168.1.116/24 brd 192.168.1.255 scope global dynamic eth0

   valid_lft 57885sec preferred_lft 57885sec

inet6 fe80::f816:3eff:fea3:d485/64 scope link

   valid_lft forever preferred_lft forever





host to host bandwidth using nuttcp (1 thread):



[root@hercules-22 ~]# nuttcp -l65536 -fparse 10.0.32.23

megabytes=19844.1759 real_seconds=10.00 rate_Mbps=16643.4189 tx_cpu=61 
rx_cpu=99 retrans=0 rtt_ms=0.28



VMTP Output:



==

Total Scenarios:   29

Passed Scenarios:  17 [100.00%]

Failed Scenarios:  0 [0.00%]

Skipped Scenarios: 12

+----------+------------------------------------------------+-------------------+------+
| Scenario | Scenario Name                                  | Functional Status | Data |
+----------+------------------------------------------------+-------------------+------+
| 1.1      | Same Network, Fixed IP, Intra-node, TCP       | PASSED            | {'tp_kbps': '7613130', 'rtt_ms': '0.46'} |
| 1.2      | Same Network, Fixed IP, Intra-node, UDP       | PASSED            | {128: {'tp_kbps': 59954, 'loss_rate': 0.02}, 1024: {'tp_kbps': 444264, 'loss_rate': 0.05}, 8192: {'tp_kbps': 2877674, 'loss_rate': 0.0}} |
| 1.3      | Same Network, Fixed IP, Intra-node, ICMP      | PASSED            | {'rtt avg/min/max/stddev msec': {'391-byte': '0.376/0.285/0.439/0.047', '64-byte': '0.617/0.361/1.513/0.388', '1500-byte': '0.369/0.322/0.452/0.048'}} |
| 1.4      | Same Network, Fixed IP, Intra-node, Multicast | SKIPPED           | {} |
| 2.1      | Same Network, Fixed IP, Inter-node, TCP       | PASSED            | {'tp_kbps': '6864266', 'rtt_ms': '0.65'} |
| 2.2      | Same Network, Fixed IP, Inter-node, UDP       | PASSED            | {128: {'tp_kbps': 80955, 'loss_rate': 0.46}, 1024: {'tp_kbps': 166186, 'loss_rate': 36.2}, 8192: {'tp_kbps': 2189389, 'loss_rate': 0.05}} |
| 2.3      | Same Network, Fixed IP, Inter-node, ICMP      | PASSED            | {'r

Re: [Openstack] need help understanding neutron/OVS

2017-12-18 Thread Remo Mattei
Did you change the mtu on neutron config?

Sent from my iPad

> On Dec 18, 2017, at 5:38 PM, Manuel Sopena Ballesteros 
> <manuel...@garvan.org.au> wrote:
> 
> Dear Openstack community,
>  
> Openstack environment:
>  
> 2 compute nodes each has:
> · 1x AMD Opteron(tm) Processor 6282 SE (56 cpus)
> · 512GB RAM
> · 1x dual-port 25Gbps Mellanox ConnectX-4 XOR bond
>  
> Physical network:
> · Compute nodes connected through a mellanox 2700 switch (100Gbps)
> · MTU 9000
>  
> Openstack network:
> · OVS + VxLan
> · Vxlan offloading setup on physical nics
> · MTU on vms 8950
>  
> Compute nodes nics:
>  
> [root@hercules-23 ~]# ip a
> …
> 6: p2p1:  mtu 9000 qdisc mq master 
> bond0 state UP qlen 1000
> link/ether 7c:fe:90:12:23:c4 brd ff:ff:ff:ff:ff:ff
> 7: p2p2:  mtu 9000 qdisc mq master 
> bond0 state UP qlen 1000
> link/ether 7c:fe:90:12:23:c4 brd ff:ff:ff:ff:ff:ff
> 8: bond0:  mtu 9000 qdisc noqueue 
> state UP qlen 1000
> link/ether 7c:fe:90:12:23:c4 brd ff:ff:ff:ff:ff:ff
> inet 10.0.32.23/16 brd 10.0.255.255 scope global bond0
>valid_lft forever preferred_lft forever
> inet6 fe80::3826:9daa:ef1b:43b0/64 scope link
>valid_lft forever preferred_lft forever
> …
>  
>  
> VM nics:
>  
> [centos@centos2 ~]$ ip a
> …
> 2: eth0:  mtu 8950 qdisc pfifo_fast state UP 
> qlen 1000
> link/ether fa:16:3e:a3:d4:85 brd ff:ff:ff:ff:ff:ff
> inet 192.168.1.116/24 brd 192.168.1.255 scope global dynamic eth0
>valid_lft 57885sec preferred_lft 57885sec
> inet6 fe80::f816:3eff:fea3:d485/64 scope link
>valid_lft forever preferred_lft forever
>  
>  
> host to host bandwidth using nuttcp (1 thread):
>  
> [root@hercules-22 ~]# nuttcp -l65536 -fparse 10.0.32.23
> megabytes=19844.1759 real_seconds=10.00 rate_Mbps=16643.4189 tx_cpu=61 
> rx_cpu=99 retrans=0 rtt_ms=0.28
>  
> VMTP Output:
>  
> ==
> Total Scenarios:   29
> Passed Scenarios:  17 [100.00%]
> Failed Scenarios:  0 [0.00%]
> Skipped Scenarios: 12
> +----------+------------------------------------------------+-------------------+------+
> | Scenario | Scenario Name                                  | Functional Status | Data |
> +----------+------------------------------------------------+-------------------+------+
> | 1.1      | Same Network, Fixed IP, Intra-node, TCP       | PASSED            | {'tp_kbps': '7613130', 'rtt_ms': '0.46'} |
> | 1.2      | Same Network, Fixed IP, Intra-node, UDP       | PASSED            | {128: {'tp_kbps': 59954, 'loss_rate': 0.02}, 1024: {'tp_kbps': 444264, 'loss_rate': 0.05}, 8192: {'tp_kbps': 2877674, 'loss_rate': 0.0}} |
> | 1.3      | Same Network, Fixed IP, Intra-node, ICMP      | PASSED            | {'rtt avg/min/max/stddev msec': {'391-byte': '0.376/0.285/0.439/0.047', '64-byte': '0.617/0.361/1.513/0.388', '1500-byte': '0.369/0.322/0.452/0.048'}} |
> | 1.4      | Same Network, Fixed IP, Intra-node, Multicast | SKIPPED           | {} |
> | 2.1      | Same Network, Fixed IP, Inter-node, TCP       | PASSED            | {'tp_kbps': '6864266', 'rtt_ms': '0.65'} |
> | 2.2      | Same Network, Fixed IP, Inter-node, UDP       | PASSED            | {128: {'tp_kbps': 80955, 'loss_rate': 0.46}, 1024: {'tp_kbps': 166186, 'loss_rate': 36.2}, 8192: {'tp_kbps': 2189389, 'loss_rate': 0.05}} |
> | 2.3      | Same Network, Fixed IP, Inter-node, ICMP      | PASSED            | {'rtt avg/min/max/stddev msec': {'391-byte': '0.484/0.315/0.561/0.068', '64-byte': '0.719/0.421/2.714/0.666', '1500-byte': '0.465/0.358/0.553/0.055'}} |
> | 2.4      | Same Network, Fixed IP, Inter-node, Multicast | SKIPPED           | {} |
> | 3.1      | Different Network, Fixed IP, Intra-node, TCP  | PASSED            | {'tp_kbps': '5207719', 'rtt_ms': '0.896667'} |
> | 3.2      | Different Network, Fixed IP, Intra-node, UDP  | PASSED            | {128:

[Openstack] need help understanding neutron/OVS

2017-12-18 Thread Manuel Sopena Ballesteros
Dear Openstack community,



Openstack environment:



2 compute nodes each has:

* 1x AMD Opteron(tm) Processor 6282 SE (56 cpus)

* 512GB RAM

* 1x dual-port 25Gbps Mellanox ConnectX-4 XOR bond



Physical network:

* Compute nodes connected through a mellanox 2700 switch (100Gbps)

* MTU 9000



Openstack network:

* OVS + VxLan

* Vxlan offloading setup on physical nics

* MTU on vms 8950



Compute nodes nics:



[root@hercules-23 ~]# ip a

...

6: p2p1:  mtu 9000 qdisc mq master bond0 
state UP qlen 1000

link/ether 7c:fe:90:12:23:c4 brd ff:ff:ff:ff:ff:ff

7: p2p2:  mtu 9000 qdisc mq master bond0 
state UP qlen 1000

link/ether 7c:fe:90:12:23:c4 brd ff:ff:ff:ff:ff:ff

8: bond0:  mtu 9000 qdisc noqueue state 
UP qlen 1000

link/ether 7c:fe:90:12:23:c4 brd ff:ff:ff:ff:ff:ff

inet 10.0.32.23/16 brd 10.0.255.255 scope global bond0

   valid_lft forever preferred_lft forever

inet6 fe80::3826:9daa:ef1b:43b0/64 scope link

   valid_lft forever preferred_lft forever

...





VM nics:



[centos@centos2 ~]$ ip a

...

2: eth0:  mtu 8950 qdisc pfifo_fast state UP 
qlen 1000

link/ether fa:16:3e:a3:d4:85 brd ff:ff:ff:ff:ff:ff

inet 192.168.1.116/24 brd 192.168.1.255 scope global dynamic eth0

   valid_lft 57885sec preferred_lft 57885sec

inet6 fe80::f816:3eff:fea3:d485/64 scope link

   valid_lft forever preferred_lft forever





host to host bandwidth using nuttcp (1 thread):



[root@hercules-22 ~]# nuttcp -l65536 -fparse 10.0.32.23

megabytes=19844.1759 real_seconds=10.00 rate_Mbps=16643.4189 tx_cpu=61 
rx_cpu=99 retrans=0 rtt_ms=0.28
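
Before comparing throughput numbers it is worth confirming that the
8950-byte VM MTU really holds end to end. A quick check from inside a
guest (a sketch; the target is one of the fixed IPs above, and ping -s
takes the ICMP payload size, so 8922 + 28 bytes of headers probes a full
8950-byte packet):

# Largest packet that fits an 8950 MTU, with fragmentation forbidden
ping -M do -s 8922 -c 3 192.168.1.116

# One byte more should fail with "message too long" if 8950 is the limit
ping -M do -s 8923 -c 3 192.168.1.116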



VMTP Output:



==

Total Scenarios:   29

Passed Scenarios:  17 [100.00%]

Failed Scenarios:  0 [0.00%]

Skipped Scenarios: 12

+----------+------------------------------------------------+-------------------+------+
| Scenario | Scenario Name                                  | Functional Status | Data |
+----------+------------------------------------------------+-------------------+------+
| 1.1      | Same Network, Fixed IP, Intra-node, TCP       | PASSED            | {'tp_kbps': '7613130', 'rtt_ms': '0.46'} |
| 1.2      | Same Network, Fixed IP, Intra-node, UDP       | PASSED            | {128: {'tp_kbps': 59954, 'loss_rate': 0.02}, 1024: {'tp_kbps': 444264, 'loss_rate': 0.05}, 8192: {'tp_kbps': 2877674, 'loss_rate': 0.0}} |
| 1.3      | Same Network, Fixed IP, Intra-node, ICMP      | PASSED            | {'rtt avg/min/max/stddev msec': {'391-byte': '0.376/0.285/0.439/0.047', '64-byte': '0.617/0.361/1.513/0.388', '1500-byte': '0.369/0.322/0.452/0.048'}} |
| 1.4      | Same Network, Fixed IP, Intra-node, Multicast | SKIPPED           | {} |
| 2.1      | Same Network, Fixed IP, Inter-node, TCP       | PASSED            | {'tp_kbps': '6864266', 'rtt_ms': '0.65'} |
| 2.2      | Same Network, Fixed IP, Inter-node, UDP       | PASSED            | {128: {'tp_kbps': 80955, 'loss_rate': 0.46}, 1024: {'tp_kbps': 166186, 'loss_rate': 36.2}, 8192: {'tp_kbps': 2189389, 'loss_rate': 0.05}} |
| 2.3      | Same Network, Fixed IP, Inter-node, ICMP      | PASSED            | {'rtt avg/min/max/stddev msec': {'391-byte': '0.484/0.315/0.561/0.068', '64-byte': '0.719/0.421/2.714/0.666', '1500-byte': '0.465/0.358/0.553/0.055'}} |
| 2.4      | Same Network, Fixed IP, Inter-node, Multicast | SKIPPED           | {} |
| 3.1      | Different Network, Fixed IP, Intra-node, TCP  | PASSED            | {'tp_kbps': '5207719', 'rtt_ms': '0.896667'} |
| 3.2      | Different Network, Fixed IP, Intra-node, UDP  | PASSED            | {128: {'tp_kbps': 76569, 'loss_rate': 0.9}, 1024: {'tp_kbps': 571847, 'loss_rate': 2.75}, 8192: {'tp_kbps': 1855166, 'loss_rate': 0.0}} |
| 3.3      | Different Network,

Re: [Openstack] compute nodes down

2017-12-18 Thread Paras pradhan
It might not be the compute nodes. I would check the rabbitmq, neutron and
nova logs on the controllers.
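
For example (a sketch; log paths and rabbitmqctl access vary by
deployment and are assumptions here):

# RabbitMQ health and queue depth; a backed-up 'conductor' queue is a
# common reason compute services flap to "down"
rabbitmqctl cluster_status
rabbitmqctl list_queues name messages consumers | sort -k2 -rn | head

# Recent errors on the controller side
grep -i error /var/log/nova/nova-conductor.log | tail
grep -i error /var/log/neutron/server.log | tail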


Paras

On Dec 18, 2017 5:30 PM, "Volodymyr Litovka"  wrote:

> Hi Jim,
>
> switch debug to true in nova.conf and check *also* other logs -
> nova-scheduler, nova-placement, nova-conductor.
>
> On 12/19/17 12:54 AM, Jim Okken wrote:
>
> hi list,
>
> hoping someone could shed some light on this issue I just started seeing
> today
>
> all my compute nodes started showing as "Down" in the Horizon ->
> Hypervisors -> Compute Nodes tab
>
>
> root@node-1:~# nova service-list
> +-----+--------------+-------------------+------+---------+-------+----------------------------+-----------------+
> | Id  | Binary       | Host              | Zone | Status  | State | Updated_at                 | Disabled Reason |
> +-----+--------------+-------------------+------+---------+-------+----------------------------+-----------------+
> | 325 | nova-compute | node-9.mydom.com  | nova | enabled | down  | 2017-12-18T21:59:38.00     | -               |
> | 448 | nova-compute | node-14.mydom.com | nova | enabled | up    | 2017-12-18T22:41:42.00     | -               |
> | 451 | nova-compute | node-17.mydom.com | nova | enabled | up    | 2017-12-18T22:42:04.00     | -               |
> | 454 | nova-compute | node-11.mydom.com | nova | enabled | up    | 2017-12-18T22:42:02.00     | -               |
> | 457 | nova-compute | node-12.mydom.com | nova | enabled | up    | 2017-12-18T22:42:12.00     | -               |
> | 472 | nova-compute | node-16.mydom.com | nova | enabled | down  | 2017-12-18T00:16:01.00     | -               |
> | 475 | nova-compute | node-10.mydom.com | nova | enabled | down  | 2017-12-18T00:26:09.00     | -               |
> | 478 | nova-compute | node-13.mydom.com | nova | enabled | down  | 2017-12-17T23:54:06.00     | -               |
> | 481 | nova-compute | node-15.mydom.com | nova | enabled | up    | 2017-12-18T22:41:34.00     | -               |
> | 484 | nova-compute | node-8.mydom.com  | nova | enabled | down  | 2017-12-17T23:55:50.00     | -               |
>
>
> if I stop and then start nova-compute on the down nodes, the stop takes
> several minutes and the start is quick and fine, but after about 2 hours
> the nova-compute service shows down again.
>
> I am not seeing any ERRORs in the nova logs.
>
> I get this for the status of a node that is showing as "UP"
>
>
>
> root@node-14:~# systemctl status nova-compute.service
> ● nova-compute.service - OpenStack Compute
>Loaded: loaded (/lib/systemd/system/nova-compute.service; enabled;
> vendor preset: enabled)
>Active: active (running) since Mon 2017-12-18 21:57:10 UTC; 35min ago
>  Docs: man:nova-compute(1)
>   Process: 32193 ExecStartPre=/bin/chown nova:adm /var/log/nova
> (code=exited, status=0/SUCCESS)
>   Process: 32190 ExecStartPre=/bin/chown nova:nova /var/lock/nova
> /var/lib/nova (code=exited, status=0/SUCCESS)
>   Process: 32187 ExecStartPre=/bin/mkdir -p /var/lock/nova /var/log/nova
> /var/lib/nova (code=exited, status=0/SUCCESS)
>  Main PID: 32196 (nova-compute)
>CGroup: /system.slice/nova-compute.service
> └─32196 /usr/bin/python /usr/bin/nova-compute
> --config-file=/etc/nova/nova-compute.conf --config-file=/etc/nova/nova.conf
> --log-file=/var/log/nova/nova-compute.log
>
> Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18
> 22:31:47.570 32196 DEBUG oslo_messaging._drivers.amqpdriver
> [req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] CALL msg_id:
> 2877b9707da144f3a91e7b80e2705fb3 exchange 'nova' topic 'conductor' _send
> /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:448
> Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18
> 22:31:47.604 32196 DEBUG oslo_messaging._drivers.amqpdriver [-] received
> reply msg_id: 2877b9707da144f3a91e7b80e2705fb3 __call__
> /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:296
> Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18
> 22:31:47.605 32196 INFO nova.compute.resource_tracker
> [req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] Total usable vcpus:
> 40, total allocated vcpus: 0
> Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18
> 22:31:47.606 32196 INFO nova.compute.resource_tracker
> [req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] Final resource view:
> name=node-14.mydom.com phys_ram=128812MB used_ram=512MB phys_disk=6691GB
> used_disk=0GB total_vcpus=40 used_vcpus=0 pci_stats=[]
> Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18
> 22:31:47.610 32196 DEBUG oslo_messaging._drivers.amqpdriver
> [req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] CALL msg_id:
> ad32abe833f4440d86c15b911aa35c43 exchange 'nova' topic 'conductor' _send
> /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:448
> Dec 

Re: [Openstack] compute nodes down

2017-12-18 Thread Volodymyr Litovka

Hi Jim,

switch debug to true in nova.conf and check *also* other logs - 
nova-scheduler, nova-placement, nova-conductor.
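
The relevant nova.conf pieces, as a sketch (the two timing options are
shown at their defaults purely to explain the mechanics: a service is
listed "down" once its last heartbeat, sent every report_interval seconds,
is older than service_down_time):

[DEFAULT]
debug = true
report_interval = 10
service_down_time = 60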


On 12/19/17 12:54 AM, Jim Okken wrote:

hi list,

hoping someone could shed some light on this issue I just started 
seeing today


all my compute nodes started showing as "Down" in the Horizon -> 
Hypervisors -> Compute Nodes tab



root@node-1:~# nova service-list
+-----+--------------+-------------------+------+---------+-------+----------------------------+-----------------+
| Id  | Binary       | Host              | Zone | Status  | State | Updated_at                 | Disabled Reason |
+-----+--------------+-------------------+------+---------+-------+----------------------------+-----------------+
| 325 | nova-compute | node-9.mydom.com  | nova | enabled | down  | 2017-12-18T21:59:38.00     | -               |
| 448 | nova-compute | node-14.mydom.com | nova | enabled | up    | 2017-12-18T22:41:42.00     | -               |
| 451 | nova-compute | node-17.mydom.com | nova | enabled | up    | 2017-12-18T22:42:04.00     | -               |
| 454 | nova-compute | node-11.mydom.com | nova | enabled | up    | 2017-12-18T22:42:02.00     | -               |
| 457 | nova-compute | node-12.mydom.com | nova | enabled | up    | 2017-12-18T22:42:12.00     | -               |
| 472 | nova-compute | node-16.mydom.com | nova | enabled | down  | 2017-12-18T00:16:01.00     | -               |
| 475 | nova-compute | node-10.mydom.com | nova | enabled | down  | 2017-12-18T00:26:09.00     | -               |
| 478 | nova-compute | node-13.mydom.com | nova | enabled | down  | 2017-12-17T23:54:06.00     | -               |
| 481 | nova-compute | node-15.mydom.com | nova | enabled | up    | 2017-12-18T22:41:34.00     | -               |
| 484 | nova-compute | node-8.mydom.com  | nova | enabled | down  | 2017-12-17T23:55:50.00     | -               |



if I stop and then start nova-compute on the down nodes, the stop takes
several minutes and the start is quick and fine, but after about 2 hours
the nova-compute service shows down again.


I am not seeing any ERRORs in the nova logs.

I get this for the status of a node that is showing as "UP"



root@node-14:~# systemctl status nova-compute.service
● nova-compute.service - OpenStack Compute
   Loaded: loaded (/lib/systemd/system/nova-compute.service; enabled; 
vendor preset: enabled)

   Active: active (running) since Mon 2017-12-18 21:57:10 UTC; 35min ago
     Docs: man:nova-compute(1)
  Process: 32193 ExecStartPre=/bin/chown nova:adm /var/log/nova 
(code=exited, status=0/SUCCESS)
  Process: 32190 ExecStartPre=/bin/chown nova:nova /var/lock/nova 
/var/lib/nova (code=exited, status=0/SUCCESS)
  Process: 32187 ExecStartPre=/bin/mkdir -p /var/lock/nova 
/var/log/nova /var/lib/nova (code=exited, status=0/SUCCESS)

 Main PID: 32196 (nova-compute)
   CGroup: /system.slice/nova-compute.service
           └─32196 /usr/bin/python /usr/bin/nova-compute 
--config-file=/etc/nova/nova-compute.conf 
--config-file=/etc/nova/nova.conf 
--log-file=/var/log/nova/nova-compute.log


Dec 18 22:31:47 node-14.mydom.com  
nova-compute[32196]: 2017-12-18 22:31:47.570 32196 DEBUG 
oslo_messaging._drivers.amqpdriver 
[req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] CALL msg_id: 
2877b9707da144f3a91e7b80e2705fb3 exchange 'nova' topic 'conductor' 
_send 
/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:448
Dec 18 22:31:47 node-14.mydom.com  
nova-compute[32196]: 2017-12-18 22:31:47.604 32196 DEBUG 
oslo_messaging._drivers.amqpdriver [-] received reply msg_id: 
2877b9707da144f3a91e7b80e2705fb3 __call__ 
/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:296
Dec 18 22:31:47 node-14.mydom.com  
nova-compute[32196]: 2017-12-18 22:31:47.605 32196 INFO 
nova.compute.resource_tracker 
[req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] Total usable 
vcpus: 40, total allocated vcpus: 0
Dec 18 22:31:47 node-14.mydom.com  
nova-compute[32196]: 2017-12-18 22:31:47.606 32196 INFO 
nova.compute.resource_tracker 
[req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] Final resource 
view: name=node-14.mydom.com  
phys_ram=128812MB used_ram=512MB phys_disk=6691GB used_disk=0GB 
total_vcpus=40 used_vcpus=0 pci_stats=[]
Dec 18 22:31:47 node-14.mydom.com  
nova-compute[32196]: 2017-12-18 22:31:47.610 32196 DEBUG 
oslo_messaging._drivers.amqpdriver 
[req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] CALL msg_id: 
ad

[Openstack] compute nodes down

2017-12-18 Thread Jim Okken
hi list,

hoping someone could shed some light on this issue I just started seeing
today

all my compute nodes started showing as "Down" in the Horizon ->
Hypervisors -> Compute Nodes tab


root@node-1:~# nova service-list
+-----+--------------+-------------------+------+---------+-------+----------------------------+-----------------+
| Id  | Binary       | Host              | Zone | Status  | State | Updated_at                 | Disabled Reason |
+-----+--------------+-------------------+------+---------+-------+----------------------------+-----------------+
| 325 | nova-compute | node-9.mydom.com  | nova | enabled | down  | 2017-12-18T21:59:38.00     | -               |
| 448 | nova-compute | node-14.mydom.com | nova | enabled | up    | 2017-12-18T22:41:42.00     | -               |
| 451 | nova-compute | node-17.mydom.com | nova | enabled | up    | 2017-12-18T22:42:04.00     | -               |
| 454 | nova-compute | node-11.mydom.com | nova | enabled | up    | 2017-12-18T22:42:02.00     | -               |
| 457 | nova-compute | node-12.mydom.com | nova | enabled | up    | 2017-12-18T22:42:12.00     | -               |
| 472 | nova-compute | node-16.mydom.com | nova | enabled | down  | 2017-12-18T00:16:01.00     | -               |
| 475 | nova-compute | node-10.mydom.com | nova | enabled | down  | 2017-12-18T00:26:09.00     | -               |
| 478 | nova-compute | node-13.mydom.com | nova | enabled | down  | 2017-12-17T23:54:06.00     | -               |
| 481 | nova-compute | node-15.mydom.com | nova | enabled | up    | 2017-12-18T22:41:34.00     | -               |
| 484 | nova-compute | node-8.mydom.com  | nova | enabled | down  | 2017-12-17T23:55:50.00     | -               |


if I stop and then start nova-compute on the down nodes, the stop takes
several minutes and the start is quick and fine, but after about 2 hours
the nova-compute service shows down again.

I am not seeing any ERRORs in the nova logs.

I get this for the status of a node that is showing as "UP"



root@node-14:~# systemctl status nova-compute.service
● nova-compute.service - OpenStack Compute
   Loaded: loaded (/lib/systemd/system/nova-compute.service; enabled;
vendor preset: enabled)
   Active: active (running) since Mon 2017-12-18 21:57:10 UTC; 35min ago
 Docs: man:nova-compute(1)
  Process: 32193 ExecStartPre=/bin/chown nova:adm /var/log/nova
(code=exited, status=0/SUCCESS)
  Process: 32190 ExecStartPre=/bin/chown nova:nova /var/lock/nova
/var/lib/nova (code=exited, status=0/SUCCESS)
  Process: 32187 ExecStartPre=/bin/mkdir -p /var/lock/nova /var/log/nova
/var/lib/nova (code=exited, status=0/SUCCESS)
 Main PID: 32196 (nova-compute)
   CGroup: /system.slice/nova-compute.service
   └─32196 /usr/bin/python /usr/bin/nova-compute
--config-file=/etc/nova/nova-compute.conf --config-file=/etc/nova/nova.conf
--log-file=/var/log/nova/nova-compute.log

Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18
22:31:47.570 32196 DEBUG oslo_messaging._drivers.amqpdriver
[req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] CALL msg_id:
2877b9707da144f3a91e7b80e2705fb3 exchange 'nova' topic 'conductor' _send
/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:448
Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18
22:31:47.604 32196 DEBUG oslo_messaging._drivers.amqpdriver [-] received
reply msg_id: 2877b9707da144f3a91e7b80e2705fb3 __call__
/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:296
Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18
22:31:47.605 32196 INFO nova.compute.resource_tracker
[req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] Total usable vcpus:
40, total allocated vcpus: 0
Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18
22:31:47.606 32196 INFO nova.compute.resource_tracker
[req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] Final resource view:
name=node-14.mydom.com phys_ram=128812MB used_ram=512MB phys_disk=6691GB
used_disk=0GB total_vcpus=40 used_vcpus=0 pci_stats=[]
Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18
22:31:47.610 32196 DEBUG oslo_messaging._drivers.amqpdriver
[req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] CALL msg_id:
ad32abe833f4440d86c15b911aa35c43 exchange 'nova' topic 'conductor' _send
/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:448
Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18
22:31:47.632 32196 DEBUG oslo_messaging._drivers.amqpdriver [-] received
reply msg_id: ad32abe833f4440d86c15b911aa35c43 __call__
/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:296
Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18
22:31:47.633 32196 WARNING nova.scheduler.client.report
[req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] Unable to refresh my
resource provider record
Dec 

[Openstack] [FEMDC] Edge sessions during the next PTG

2017-12-18 Thread lebre . adrien
Dear all, 

As briefly discussed during the last telco on edge challenges [1], we have
the opportunity to hold sessions during the next PTG in Dublin.
If you are interested in such sessions and you plan to attend the PTG,
please put your name and the questions/topics you would like to discuss at
the bottom of [2].

Please keep in mind that the PTG is a place to discuss technical aspects (and
an opportunity to exchange with core devs of the different projects).
The objective is to dive into details. To that end, we would like to identify
concrete actions we can take before the PTG in order to have fruitful exchanges.

Best regards, 
ad_ri3n_
PS: I need to inform the foundation about the number of participants and how
many slots we would like to have. Please complete the pad [2] ASAP, thanks ;-)

[1] https://etherpad.openstack.org/p/2017_edge_computing_working_sessions
[2] https://etherpad.openstack.org/p/edge-openstack-related
