Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
It is perhaps worth mentioning that there is an effort to implement a generic synchronization mechanism (between Neutron and backend controllers/devices) in the ML2 plugin [1]. A possible framework for its eventual implementation is in an early discussion/proof-of-concept WIP state [2].
-Mohammad
[1] https://blueprints.launchpad.net/neutron/+spec/ml2-driver-sync
[2] https://review.openstack.org/#/c/154333/
From: Cory Benfield cory.benfi...@metaswitch.com
To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Date: 03/16/2015 04:48 AM
Subject: Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
On Sun, Mar 15, 2015 at 22:37:59, Sławek Kapłoński wrote:
Maybe a good idea for a start would be to implement a periodic task, called from the agent, that checks the DB config and compares it with the real state on the host? What do you think? Or maybe I'm completely wrong with this idea and it should be done in a different way?
This is almost exactly what we do in our Calico ML2 driver. Each of our agents will periodically request its complete state from a neutron-server node and will ensure that its local state matches that expected state. This interval is configurable, to allow administrators to make a trade-off between DB/network load and convergence time.
With reliable transport this is in principle almost never needed (messages only really get lost on agent crash, and the agent will resynchronize when it starts back up anyway), but it provides assurances that the fabric is capable of bringing itself into consistency without administrator intervention. Having a similar function in other neutron agents would be valuable for the exact same reasons, but do bear in mind the potentially increased load this kind of resynchronization can place on databases and servers.
Cory __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
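The periodic resynchronization Cory describes can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Calico or Neutron code: the names `reconcile`, `resync_once` and `resync_forever` are hypothetical, state is modelled as a plain set of identifiers, and `fetch_expected` stands in for whatever call an agent would make to a neutron-server node.

```python
import time


def reconcile(expected, actual):
    """Return (to_add, to_remove) such that applying both makes actual == expected."""
    to_add = expected - actual
    to_remove = actual - expected
    return to_add, to_remove


def resync_once(fetch_expected, local_state):
    """One resynchronization pass. local_state is a set and is mutated in place.

    fetch_expected is a hypothetical callable returning the complete
    expected state for this agent (an assumption for the sketch).
    """
    to_add, to_remove = reconcile(fetch_expected(), local_state)
    local_state |= to_add
    local_state -= to_remove
    return to_add, to_remove


def resync_forever(fetch_expected, local_state, interval_seconds):
    """Repeat the pass every interval_seconds; the interval is the configurable knob."""
    while True:
        resync_once(fetch_expected, local_state)
        time.sleep(interval_seconds)
```

A single pass with a stale local entry and a missing one would be called as `resync_once(lambda: {"port-a", "port-b"}, local)` with `local = {"port-a", "port-stale"}`. A shorter interval repairs divergence sooner at the cost of more full-state requests to the server.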
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
Hello,
I read the blueprint you sent, but I don't see how it would solve problems like the one that can occur in the l2pop mechanism. It sends a message as a fanout cast and forgets about it. No exception is raised in the port_update_postcommit operation, but the message may not be consumed by some agents (or maybe I'm wrong and that can't happen?), and then those agents are not in sync with the neutron DB.
-- Best regards, Sławek Kapłoński sla...@kaplonski.pl
On Monday, 16 March 2015 11:05:45 Mohammad Banikazemi wrote:
It is perhaps worth mentioning that there is an effort to implement a generic synchronization mechanism (between Neutron and backend controllers/devices) in the ML2 plugin [1]. A possible framework for its eventual implementation is in an early discussion/proof-of-concept WIP state [2].
-Mohammad
[1] https://blueprints.launchpad.net/neutron/+spec/ml2-driver-sync
[2] https://review.openstack.org/#/c/154333/
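The fire-and-forget concern raised in this message can be illustrated with a toy model. Everything here is an assumption made for the sketch: `Agent`, `fanout_cast` and `out_of_sync` are hypothetical names, and no real messaging library or l2pop code is involved. The point is only that the sender gets neither a return value nor an exception when one consumer misses the message.

```python
class Agent:
    """Hypothetical stand-in for an agent consuming fanout messages."""

    def __init__(self, name):
        self.name = name
        self.connected = True
        self.fdb_entries = set()

    def deliver(self, entry):
        self.fdb_entries.add(entry)


def fanout_cast(agents, entry):
    """Fire-and-forget: deliver to whoever is listening and report nothing back."""
    for agent in agents:
        if agent.connected:
            agent.deliver(entry)
    # No return value and no exception: the caller cannot tell who missed it.


def out_of_sync(agents, server_entries):
    """Names of agents whose local entries differ from the server's view."""
    return [a.name for a in agents if a.fdb_entries != server_entries]


agents = [Agent("compute-1"), Agent("compute-2")]
server_entries = set()

agents[1].connected = False                 # compute-2 briefly loses its connection
server_entries.add("tunnel-to-compute-3")   # the server-side state changes
fanout_cast(agents, "tunnel-to-compute-3")  # the cast completes without any error

stale = out_of_sync(agents, server_entries)
```

In this model only a later comparison against the server state reveals the divergence, which is the gap a periodic check would close.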
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
Hello,
Thanks. I didn't find it before. When we upgrade our infrastructure we will see whether this problem is still present. I hope it was caused by that bug and will be fixed by then :)
-- Best regards, Sławek Kapłoński sla...@kaplonski.pl
On Monday, 16 March 2015 00:14:57 Mathieu Rohon wrote:
Hi Sławek, maybe you're hitting this l2pop bug: https://bugs.launchpad.net/neutron/+bug/1372438
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
On Sun, Mar 15, 2015 at 22:37:59, Sławek Kapłoński wrote:
Maybe a good idea for a start would be to implement a periodic task, called from the agent, that checks the DB config and compares it with the real state on the host? What do you think? Or maybe I'm completely wrong with this idea and it should be done in a different way?
This is almost exactly what we do in our Calico ML2 driver. Each of our agents will periodically request its complete state from a neutron-server node and will ensure that its local state matches that expected state. This interval is configurable, to allow administrators to make a trade-off between DB/network load and convergence time.
With reliable transport this is in principle almost never needed (messages only really get lost on agent crash, and the agent will resynchronize when it starts back up anyway), but it provides assurances that the fabric is capable of bringing itself into consistency without administrator intervention. Having a similar function in other neutron agents would be valuable for the exact same reasons, but do bear in mind the potentially increased load this kind of resynchronization can place on databases and servers.
Cory
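The trade-off between load and convergence time mentioned above can be made concrete with a little arithmetic. This is a back-of-the-envelope sketch under simplifying assumptions: every agent issues exactly one full-state request per interval, requests are spread evenly, and processing time is ignored. The function name and the numbers in the usage note are hypothetical.

```python
def resync_tradeoff(num_agents, interval_seconds):
    """Return (full-state requests per second hitting the servers,
    worst-case seconds until a lost update is repaired)."""
    requests_per_second = num_agents / interval_seconds
    worst_case_convergence = interval_seconds
    return requests_per_second, worst_case_convergence
```

For example, under these assumptions 500 agents resyncing every 60 seconds average about 8.3 full-state requests per second, while stretching the interval to 600 seconds cuts that to about 0.83 per second but lets a lost update linger for up to ten minutes.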
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
On 14 March 2015 at 11:19, Sławek Kapłoński sla...@kaplonski.pl wrote:
Hello, I'm using OVS agents with the L2 population mechanism in the ML2 plugin. I noticed that sometimes agents don't receive the proper RPC to add new VXLAN tunnel OpenFlow rules, and then the VXLAN network between some compute nodes is not working. I'm still using the Havana release but want to upgrade to Juno. I checked the Juno code of the l2 population mechanism driver and the OVS plugin, and I didn't find anything like a periodic check that the OpenFlow rules are properly set, or a resync. Maybe it would also be a good idea to add something like that to the OVS agent?
It would surely be a good idea to add some form of reliability to communications between server and agents. So far there are still several instances where the server sends a fire-and-forget notification to the agent and does not take any step to ensure that the state change associated with that notification has actually been applied to the agent. This also applies to some messages from the agent side, such as status change notifications. This is something that can be beneficial to any neutron implementation which depends on one or more agents, not just to those using the ovs/linux bridge agents with the l2-population driver.
Salvatore
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
The L2 agent, for instance, has logic to perform full synchronisations with the server. These happen in two cases: 1) upon agent restart, as some messages from the server side might have been lost; 2) whenever a failure is detected on the agent side (this is probably a bit too conservative).
Salvatore
On 14 March 2015 at 10:51, Leo Y minh...@gmail.com wrote:
Hello Rossella, I meant something different: less conventional changes. Right now, the network topology state is stored in the neutron DB, and each compute node learns about it by querying the neutron API on request. "The node knows" here means that the neutron agents hold this data in in-memory structures. If this synchronization is broken, due to a software bug or an (un)intentional change in the neutron DB, I'd like to understand whether re-synchronization is possible. As far as I know, the L3 agent (I'm not sure whether this works for all L3 agents) has a periodic task that refreshes NIC information from the neutron server. However, L2 agents don't have this mechanism. I don't know about agents that implement SDN. So I'm looking to learn how the current neutron implementation deals with that problem.
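The two full-synchronisation triggers Salvatore lists at the top of this message (agent restart, and a failure detected on the agent side) can be sketched with a single flag. This is a simplified illustration, not the actual L2 agent code: `SketchAgent`, `fetch_all_ports` and the shape of the updates are assumptions made for the example.

```python
class SketchAgent:
    """Hypothetical agent that falls back to a full sync via one flag."""

    def __init__(self, fetch_all_ports):
        # fetch_all_ports: assumed callable returning {port_id: details}
        self.fetch_all_ports = fetch_all_ports
        self.ports = {}
        self.fullsync = True      # case 1: every (re)start begins with a full sync
        self.full_syncs_done = 0

    def apply_update(self, port_id, details):
        if details is None:
            raise ValueError("malformed update for %s" % port_id)
        self.ports[port_id] = details

    def loop_iteration(self, pending_updates):
        if self.fullsync:
            # Discard local state and rebuild it from the server's view.
            self.ports = dict(self.fetch_all_ports())
            self.fullsync = False
            self.full_syncs_done += 1
        try:
            for port_id, details in pending_updates:
                self.apply_update(port_id, details)
        except Exception:
            # case 2: any failure schedules a full sync for the next iteration
            self.fullsync = True
```

Note that neither trigger fires when a notification is silently lost while the agent keeps running, which is the gap the rest of the thread discusses.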
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
Hello,
On Sunday, 15 March 2015 17:45:05 Salvatore Orlando wrote:
It would surely be a good idea to add some form of reliability to communications between server and agents. So far there are still several instances where the server sends a fire-and-forget notification to the agent and does not take any step to ensure that the state change associated with that notification has actually been applied to the agent. This also applies to some messages from the agent side, such as status change notifications.
Maybe a good idea for a start would be to implement a periodic task, called from the agent, that checks the DB config and compares it with the real state on the host? What do you think? Or maybe I'm completely wrong with this idea and it should be done in a different way?
This is something that can be beneficial to any neutron implementation which depends on one or more agents, not just to those using the ovs/linux bridge agents with the l2-population driver.
Probably yes, but so far I have had this problem only with the l2-population driver, and that's why I wrote about it :)
-- Best regards, Sławek Kapłoński sla...@kaplonski.pl
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
Hi Sławek, maybe you're hitting this l2pop bug: https://bugs.launchpad.net/neutron/+bug/1372438
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
Hello Rossella, I meant something different: less conventional changes. Right now, the network topology state is stored in the neutron DB, and each compute node learns about it by querying the neutron API on request. "The node knows" here means that the neutron agents hold this data in in-memory structures. If this synchronization is broken, due to a software bug or an (un)intentional change in the neutron DB, I'd like to understand whether re-synchronization is possible. As far as I know, the L3 agent (I'm not sure whether this works for all L3 agents) has a periodic task that refreshes NIC information from the neutron server. However, L2 agents don't have this mechanism. I don't know about agents that implement SDN. So I'm looking to learn how the current neutron implementation deals with that problem.
-- Regards, Leo
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
Hello,
I'm using OVS agents with the L2 population mechanism in the ML2 plugin. I noticed that sometimes agents don't receive the proper RPC to add new VXLAN tunnel OpenFlow rules, and then the VXLAN network between some compute nodes is not working. I'm still using the Havana release but want to upgrade to Juno. I checked the Juno code of the l2 population mechanism driver and the OVS plugin, and I didn't find anything like a periodic check that the OpenFlow rules are properly set, or a resync. Maybe it would also be a good idea to add something like that to the OVS agent?
-- Best regards, Sławek Kapłoński sla...@kaplonski.pl
On Friday, 13 March 2015 11:18:28 YAMAMOTO Takashi wrote:
However, I briefly looked through the L2 agent code and didn't see a periodic task to resync the port information to protect from a neutron server that failed to send a notification because it crashed or lost its AMQP connection. The L3 agent has a periodic sync routers task that helps in this regard. Maybe another neutron developer more familiar with the L2 agent can chime in here if I'm missing anything.
i don't think you are missing anything. periodic sync would be a good improvement.
YAMAMAOTO Takashi
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
On 03/07/2015 01:10 PM, Leo Y wrote:
What happens when the neutron DB is updated to change network settings (e.g. via Dashboard or manually) while there are communication sessions open on compute nodes? Does it influence those sessions? When is the update propagated to compute nodes?
Hi Leo, when you say "change network settings" I think you mean a change in the security group; is my assumption correct? In that case the Neutron server will notify all the L2 agents (they reside on each compute node) about the change. The Neutron server sends different kinds of messages depending on the type of update: security_groups_rule_updated, security_groups_member_updated, security_groups_provider_updated. Each L2 agent will process the message and apply the required modification on the host. In the default implementation we use iptables to implement security groups, so the update consists of some modifications to the iptables rules. Existing connections on the compute nodes might not be affected by the change, which is a problem already discussed in this mail thread [1], and there's a patch in review to fix that [2]. Hope that answers your question.
cheers, Rossella
[1] http://lists.openstack.org/pipermail/openstack-dev/2014-October/049055.html
[2] https://review.openstack.org/#/c/147713/
On 03/13/2015 04:10 AM, Kevin Benton wrote:
Yeah, I was making a bad assumption for the L2 and L3. Sorry about that. It sounds like we don't have any protection against servers failing to send notifications.
On Mar 12, 2015 7:41 PM, Assaf Muller amul...@redhat.com wrote:
- Original Message -
However, I briefly looked through the L2 agent code and didn't see a periodic task to resync the port information to protect against a neutron server that failed to send a notification because it crashed or lost its AMQP connection. The L3 agent has a periodic sync routers task that helps in this regard.
The L3 agent periodic sync only runs if the full_sync flag was turned on, which is a result of an error.
Maybe another neutron developer more familiar with the L2 agent can chime in here if I'm missing anything.
i don't think you are missing anything. periodic sync would be a good improvement.
YAMAMAOTO Takashi
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
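For illustration only, here is a minimal sketch of how an agent might dispatch the three security group notifications Rossella names above. The class ToyL2Agent, its attributes, and the handle() method are hypothetical and are not Neutron code. The sketch also assumes that a provider update touches every group on the host, which the message above does not state.

```python
class ToyL2Agent:
    """Hypothetical stand-in for an L2 agent; not actual Neutron code."""

    # Notification names as listed in the message above.
    HANDLED = (
        "security_groups_rule_updated",
        "security_groups_member_updated",
        "security_groups_provider_updated",
    )

    def __init__(self, local_groups):
        # Security groups used by ports on this (imaginary) host.
        self.local_groups = set(local_groups)
        # Record of refreshes that would translate into iptables changes.
        self.refreshed = []

    def handle(self, method, security_groups=None):
        """Return the sorted local groups affected by one notification."""
        if method not in self.HANDLED:
            raise ValueError("unknown notification: %s" % method)
        if method == "security_groups_provider_updated":
            # Assumption: provider rules apply to every group on the host.
            affected = set(self.local_groups)
        else:
            # Only groups that are actually present on this host matter.
            affected = self.local_groups & set(security_groups or [])
        if affected:
            self.refreshed.append((method, sorted(affected)))
        return sorted(affected)
```

A notification for a group that has no ports on the host results in no local work, which is why the intersection is taken before anything is recorded.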
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
What is meant by "if that notification is lost, the agent will eventually resynchronize"? Is it proven/guaranteed? By what means? Can you please describe the process in more detail, or point me to resources that describe it? Thank you
On Mon, Mar 9, 2015 at 2:11 AM, Kevin Benton blak...@gmail.com wrote:
Port changes will result in an update message being sent on the AMQP message bus. When the agent receives it, the change will affect communications at that point. If that notification is lost, the agent will eventually resynchronize. So during normal operations, the change should take effect within a few seconds.
On Sat, Mar 7, 2015 at 4:10 AM, Leo Y minh...@gmail.com wrote:
Hello, What happens when the neutron DB is updated to change network settings (e.g. via Dashboard or manually) while there are communication sessions open on compute nodes? Does it influence those sessions? When is the update propagated to compute nodes? Thank you
-- Kevin Benton
-- Regards, Leo - I enjoy the massacre of ads. This sentence will slaughter ads without a messy bloodbath
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
- Original Message -
However, I briefly looked through the L2 agent code and didn't see a periodic task to resync the port information to protect against a neutron server that failed to send a notification because it crashed or lost its AMQP connection. The L3 agent has a periodic sync routers task that helps in this regard.
The L3 agent periodic sync only runs if the full_sync flag was turned on, which is a result of an error.
Maybe another neutron developer more familiar with the L2 agent can chime in here if I'm missing anything.
i don't think you are missing anything. periodic sync would be a good improvement.
YAMAMAOTO Takashi
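To make the point about the full_sync flag concrete, here is a rough sketch of a periodic task that only does work when such a flag is set. ToyL3Agent, fetch_routers, and periodic_sync are hypothetical names chosen for illustration, and the initial sync at startup is an assumption; only the flag name full_sync comes from the message above.

```python
class ToyL3Agent:
    """Hypothetical agent whose periodic task is gated on a flag."""

    def __init__(self, fetch_routers):
        # fetch_routers: hypothetical callable returning {router_id: config}.
        self.fetch_routers = fetch_routers
        # Assumption: the agent syncs once when it starts.
        self.full_sync = True
        self.routers = {}

    def periodic_sync(self):
        """Run on a timer; returns True only if a sync actually happened."""
        if not self.full_sync:
            # No error was recorded, so the timer tick is a no-op.
            return False
        try:
            # Copy the server's view so later server changes are not shared.
            self.routers = dict(self.fetch_routers())
        except Exception:
            # Leave the flag set so the next tick retries.
            return False
        self.full_sync = False
        return True
```

With this shape, a change on the server whose notification never arrived is not picked up by later timer ticks, because nothing on the agent side sets the flag again.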
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
If the agent hits any errors connecting to the message bus or retrieving messages, an exception will be thrown in the main rpc_loop. That exception is caught and a sync flag is set to true, which triggers the sync on the next loop. However, I briefly looked through the L2 agent code and didn't see a periodic task to resync the port information to protect against a neutron server that failed to send a notification because it crashed or lost its AMQP connection. The L3 agent has a periodic sync routers task that helps in this regard. Maybe another neutron developer more familiar with the L2 agent can chime in here if I'm missing anything.
On Thu, Mar 12, 2015 at 6:19 AM, Leo Y minh...@gmail.com wrote:
What is meant by "if that notification is lost, the agent will eventually resynchronize"? Is it proven/guaranteed? By what means? Can you please describe the process in more detail, or point me to resources that describe it? Thank you
On Mon, Mar 9, 2015 at 2:11 AM, Kevin Benton blak...@gmail.com wrote:
Port changes will result in an update message being sent on the AMQP message bus. When the agent receives it, the change will affect communications at that point. If that notification is lost, the agent will eventually resynchronize. So during normal operations, the change should take effect within a few seconds.
On Sat, Mar 7, 2015 at 4:10 AM, Leo Y minh...@gmail.com wrote:
Hello, What happens when the neutron DB is updated to change network settings (e.g. via Dashboard or manually) while there are communication sessions open on compute nodes? Does it influence those sessions? When is the update propagated to compute nodes? Thank you
-- Kevin Benton
-- Regards, Leo
-- Kevin Benton
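The error path Kevin describes at the top of this message (exception in the main loop, sync flag set, resync on the next pass) can be sketched roughly as follows. This is not the Neutron rpc_loop; BusError, poll, and full_resync are hypothetical, and starting with the flag set is an assumption.

```python
class BusError(Exception):
    """Hypothetical error for a failed read from the message bus."""


def rpc_loop(poll, full_resync, iterations):
    """Run a bounded loop and return a log of what happened.

    poll() returns a list of updates or raises BusError;
    full_resync() reloads all state. Both are hypothetical callables.
    """
    sync = True  # assumption: resynchronize once at startup
    log = []
    for _ in range(iterations):
        if sync:
            # A previous pass failed (or we just started): reload everything.
            full_resync()
            log.append("resync")
            sync = False
        try:
            updates = poll()
            log.append("applied %d" % len(updates))
        except BusError:
            # Catch the error and ask for a full sync on the next pass.
            sync = True
            log.append("error")
    return log
```

Note what this does not cover: if the server never sends a message at all, poll() raises nothing and the flag is never set, which is the gap raised in the rest of the message.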
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
Yeah, I was making a bad assumption for the L2 and L3. Sorry about that. It sounds like we don't have any protection against servers failing to send notifications.
On Mar 12, 2015 7:41 PM, Assaf Muller amul...@redhat.com wrote:
- Original Message -
However, I briefly looked through the L2 agent code and didn't see a periodic task to resync the port information to protect against a neutron server that failed to send a notification because it crashed or lost its AMQP connection. The L3 agent has a periodic sync routers task that helps in this regard.
The L3 agent periodic sync only runs if the full_sync flag was turned on, which is a result of an error.
Maybe another neutron developer more familiar with the L2 agent can chime in here if I'm missing anything.
i don't think you are missing anything. periodic sync would be a good improvement.
YAMAMAOTO Takashi
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
However, I briefly looked through the L2 agent code and didn't see a periodic task to resync the port information to protect against a neutron server that failed to send a notification because it crashed or lost its AMQP connection. The L3 agent has a periodic sync routers task that helps in this regard. Maybe another neutron developer more familiar with the L2 agent can chime in here if I'm missing anything.
i don't think you are missing anything. periodic sync would be a good improvement.
YAMAMAOTO Takashi
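As a sketch of what one pass of such a periodic sync might compute, the function below compares the state the server expects with the state found on the host. The name reconcile and the dict-of-ports representation are assumptions made for illustration; nothing here reflects an existing Neutron interface.

```python
def reconcile(expected, actual):
    """Diff expected (server) state against actual (host) state.

    Both arguments are hypothetical {port_id: settings} dicts.
    Returns (to_add, to_update, to_remove).
    """
    # Ports the server knows about but the host does not have yet.
    to_add = {k: v for k, v in expected.items() if k not in actual}
    # Ports present on both sides whose settings have drifted.
    to_update = {
        k: v for k, v in expected.items()
        if k in actual and actual[k] != v
    }
    # Ports on the host that the server no longer knows about.
    to_remove = sorted(k for k in actual if k not in expected)
    return to_add, to_update, to_remove
```

Running this on a timer would trade extra load on the server and database for a bound on how long a lost notification can go unnoticed.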
Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
Port changes will result in an update message being sent on the AMQP message bus. When the agent receives it, the change will affect communications at that point. If that notification is lost, the agent will eventually resynchronize. So during normal operations, the change should take effect within a few seconds.
On Sat, Mar 7, 2015 at 4:10 AM, Leo Y minh...@gmail.com wrote:
Hello, What happens when the neutron DB is updated to change network settings (e.g. via Dashboard or manually) while there are communication sessions open on compute nodes? Does it influence those sessions? When is the update propagated to compute nodes? Thank you
-- Kevin Benton
[openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB
Hello, What happens when the neutron DB is updated to change network settings (e.g. via Dashboard or manually) while there are communication sessions open on compute nodes? Does it influence those sessions? When is the update propagated to compute nodes? Thank you