Re: [ovs-dev] MUSL netpacket/packet.h

2019-03-07 Thread Fernando Casas Schössow
Thanks Stuart for confirming this.

On mié, mar 6, 2019 at 10:30 AM, develo...@it-offshore.co.uk wrote:
Hello,

The musl-if_packet.patch will always be needed by Alpine due to musl
libc / glibc incompatibilities. See:
https://wiki.musl-libc.org/faq.html#Q:-Why-am-I-getting-

Kind Regards,
Stuart Cardall.


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [crash] trigger a panic when flush-conntrack

2019-03-07 Thread Darrell Ball
On Wed, Mar 6, 2019 at 11:33 PM solomon  wrote:

> Darrell Ball wrote:
> > +LIST_FOR_EACH_SAFE (conn, next, exp_node,
> &ctb->exp_lists[j]) {
> > +if (!zone || *zone == conn->key.zone) {
> > +ovs_assert(conn->conn_type == CT_CONN_TYPE_DEFAULT);
>
> why do we need this assert?
> Clean the CT_CONN_TYPE_DEFAULT type in conntrack_flush(), and clean
> CT_CONN_TYPE_UN_NAT in nat_clean() like following:
> +if ((!zone || *zone == conn->key.zone) &&
> +(conn->conn_type == CT_CONN_TYPE_DEFAULT)) {
> +//ovs_assert(conn->conn_type == CT_CONN_TYPE_DEFAULT);
>

The assert should always be true; pls leave it in.


>
>
> with the above code, I catch another panic, this time in the ct_clean thread.
>

You are saying that with the patch applied, you crash from sweep_bucket().
What about conntrack_flush()?
Does this occur when you issue the flush-conntrack command?


> I have seen the same panic without changing the code.


You are saying that the same crash occurs both from
conntrack_flush() and sweep_bucket()?
Does this occur when you issue the flush-conntrack command?



> Both ct_clean and flush-conntrack can hit the bad bucket value.
>
> #0  0x562ae7402553 in hmap_remove (node=0x7f871bdc4258,
> hmap=0x7f9498701c68) at ./include/openvswitch/hmap.h:287
> 287 while (*bucket != node) {
> [Current thread is 1 (Thread 0x7f948700 (LWP 2085796))]
> (gdb) bt
> #0  0x562ae7402553 in hmap_remove (node=0x7f871bdc4258,
> hmap=0x7f9498701c68) at ./include/openvswitch/hmap.h:287
> #1  conn_clean (ct=ct@entry=0x7f9498700d98, conn=0x7f871bdc41b0,
> ctb=ctb@entry=0x7f9498701c38) at lib/conntrack.c:815
> #2  0x562ae7402a28 in sweep_bucket (limit=3906, now=1223168469,
> ctb=, ct=0x7f9498700d98) at lib/conntrack.c:1396
> #3  conntrack_clean (now=1223168469, ct=0x7f9498700d98) at
> lib/conntrack.c:1433
> #4  clean_thread_main (f_=0x7f9498700d98) at lib/conntrack.c:1488
> #5  0x562ae737a48f in ovsthread_wrapper (aux_=) at
> lib/ovs-thread.c:354
> #6  0x7f9497715494 in start_thread (arg=0x7f948700) at
> pthread_create.c:333
> #7  0x7f9496d09acf in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
> (gdb) p bucket
> $1 = (struct hmap_node **) 0x8  <== why is bucket not a valid pointer value?
> (gdb) p *(struct hmap *) 0x7f9498701c68
> $2 = {buckets = 0x7f8609f8fc00, one = 0x0, mask = 32767, n = 31040}
> (gdb) p *(struct hmap_node *) 0x7f871bdc4258
> $3 = {hash = 203059667, next = 0x7f8707cfe6c8}
> (gdb) p 203059667&32767
> $4 = 29139
> (gdb) p &hmap->buckets[29139]
> $5 = (struct hmap_node **) 0x7f8609fc8a98
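As a sanity check of the arithmetic in the gdb session above (a sketch, not OVS code): hmap_remove() computes the slot as &hmap->buckets[node->hash & hmap->mask], so the index can be recomputed from the printed values.

```python
# Values printed by gdb above.
node_hash = 203059667   # $3.hash
mask = 32767            # $2.mask; the buckets array has mask + 1 slots

index = node_hash & mask
print(index)  # -> 29139, matching "$4 = 29139"
```

The manually computed slot ($5) is a plausible pointer, while the crashing frame saw bucket = 0x8, suggesting the hmap pointer was stale or corrupted by the time hmap_remove() ran.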
>
>
> > +conn_clean(ct, conn, ctb);
> > +}
> >  }
> >  }
> > -ct_lock_unlock(&ct->buckets[i].lock);
> > +ct_lock_unlock(&ctb->lock);
> >  }
> >
> >  return 0;
> >
> >
> > which yields conntrack_flush(...) as
> >
> > int
> > conntrack_flush(struct conntrack *ct, const uint16_t *zone)
> > {
> > for (unsigned i = 0; i < CONNTRACK_BUCKETS; i++) {
> > struct conntrack_bucket *ctb = &ct->buckets[i];
> > ct_lock_lock(&ctb->lock);
> > for (unsigned j = 0; j < N_CT_TM; j++) {
> > struct conn *conn, *next;
> > LIST_FOR_EACH_SAFE (conn, next, exp_node,
> &ctb->exp_lists[j]) {
> > if (!zone || *zone == conn->key.zone) {
> > ovs_assert(conn->conn_type == CT_CONN_TYPE_DEFAULT);
> > conn_clean(ct, conn, ctb);
> > }
> > }
> > }
> > ct_lock_unlock(&ctb->lock);
> > }
> >
> > return 0;
> > }
> >
> > Thanks Darrell
> >
> >
> >
> > On Wed, Mar 6, 2019 at 8:06 PM solomon  wrote:
> >
> >>
> >> When I test conntrack, I catch a panic in ovs.
> >>
> >> Core was generated by `ovs-vswitchd unix:/var/run/openvswitch/db.sock
> >> -vconsole:emer -vsyslog:err -vfi'.
> >> Program terminated with signal SIGSEGV, Segmentation fault.
> >> #0  0x5605c5cd7553 in hmap_remove (node=0x7f734cde0218,
> >> hmap=0x7f8178c7fd70) at ./include/openvswitch/hmap.h:287
> >> 287 while (*bucket != node) {
> >> [Current thread is 1 (Thread 0x7f8178dccb00 (LWP 2024338))]
> >> (gdb) bt
> >> #0  0x5605c5cd7553 in hmap_remove (node=0x7f734cde0218,
> >> hmap=0x7f8178c7fd70) at ./include/openvswitch/hmap.h:287
> >> #1  conn_clean (ct=ct@entry=0x7f8178c75d98, conn=0x7f734cde0170,
> >> ctb=ctb@entry=0x7f8178c7fd40) at lib/conntrack.c:815
> >> #2  0x5605c5cdd66a in conntrack_flush (ct=0x7f8178c75d98, zone=0x0)
> at
> >> lib/conntrack.c:2549
> >> #3  0x5605c5bc2b39 in ct_dpif_flush (dpif=0x5605c68a6430,
> >> zone=zone@entry=0x0, tuple=tuple@entry=0x0) at lib/ct-dpif.c:140
> >> #4  0x5605c5ce17a0 in dpctl_flush_conntrack (argc=argc@entry=1,
> >> argv=argv@entry=0x5605c697ec30, dpctl_p=dpctl_p@entry=0x7fffee718110)
> at
> >> lib/dpctl.c:1388
> >> #5  0x5605c5cdeb78 in d

Re: [ovs-dev] Phy-VM connectivity issue

2019-03-07 Thread ppnaik

Hi Ilya,

Thanks for your suggestion.
We tried pinning the PMD threads to cores, but the issue still exists:
only a single queue is working instead of multiqueue.
We tried using 'dpdkvhostuserclient' instead of 'dpdkvhostuser', but we
are getting the following error and the VM is not able to find the
interface.


ovs-vswitchd log

2019-03-07T07:42:40.517Z|00212|dpdk|INFO|VHOST_CONFIG: vhost-user 
client: socket created, fd: 1091
2019-03-07T07:42:40.517Z|00213|netdev_dpdk|INFO|vHost User device 
'dpdkvhostuser0' created in 'client' mode, using client socket 
'/tmp/dpdkvhostclient0'
2019-03-07T07:42:40.517Z|00214|dpdk|WARN|VHOST_CONFIG: failed to 
connect to /tmp/dpdkvhostclient0: No such file or directory
2019-03-07T07:42:40.517Z|00215|dpdk|INFO|VHOST_CONFIG: 
/tmp/dpdkvhostclient0: reconnecting...
2019-03-07T07:42:40.517Z|00216|dpif_netdev|WARN|There's no available 
(non-isolated) pmd thread on numa node 1. Queue 0 on port 'dpdk-p0' will 
be assigned to the pmd on core 1 (numa node 0). Expect reduced 
performance.


The output of the following commands now is:


ovs-appctl dpif-netdev/pmd-rxq-show

pmd thread numa_id 0 core_id 1:
  isolated : false
  port: dpdk-p0   queue-id:  0  pmd usage:  0 %
pmd thread numa_id 0 core_id 2:
  isolated : false
  port: dpdkvhostuser0queue-id:  0  pmd usage:  0 %


ovs-vsctl show:

356b0bfb-979c-408a-be1e-4567d52ef57f
Bridge "br0"
Port "dpdk-p0"
Interface "dpdk-p0"
type: dpdk
options: {dpdk-devargs=":81:00.1,n_rxq=2"}
Port "br0"
Interface "br0"
type: internal
Port "dpdkvhostuser0"
Interface "dpdkvhostuser0"
type: dpdkvhostuserclient
options: {vhost-server-path="/tmp/dpdkvhostuser0"}

ovs-vswitchd --version
 ovs-vswitchd (Open vSwitch) 2.11.90
 DPDK 18.11.0

qemu-system-x86_64 --version
  QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.34), 
Copyright (c) 2003-2008 Fabrice Bellard


Can you please help us with what we are missing?

Thanks,
Priyanka


On 2019-03-06 18:13, Ilya Maximets wrote:

On 06.03.2019 15:18, ppnaik wrote:

Thanks for the response Ilya.
We could get this setup working now.

However, we could not get it working when we want to give two queues 
to the VM interface.


We added the queue option when creating the interface on OVS.


There are no 'n_rxq'/'n_txq' options for vhost interfaces. The number of
queues is obtained from the virtio device when QEMU connects. You could
have this in ovsdb, but it doesn't affect anything.
OTOH, physical ports have 'n_rxq' configurable.

We also enabled multiqueue in the VM XML for the interface and set
the vectors too.


ethtool inside the VM shows:

ethtool -l ens3
Channel parameters for ens3:
Pre-set maximums:
RX:    0
TX:    0
Other:    0
Combined:    2
Current hardware settings:
RX:    0
TX:    0
Other:    0
Combined:    2

However, a DPDK application inside the VM is not able to get packets 
from both queues. It still works with one queue.


I assume that you have only one PMD thread that polls both the physical
and the virtual port. Each PMD thread uses only one Tx queue per port.
So, if you want to utilize more queues in the virtual interface, you need
to create more PMD threads using pmd-cpu-mask.
For example, with pmd-cpu-mask=0x30 there will be 2 PMD threads: one on
CPU #4 and another on CPU #5. The thread on core #4 will send packets to,
let's say, Tx queue #0, and the thread on core #5 will send packets to
Tx queue #1.
These packets will appear on Rx queue #0 and Rx queue #1 accordingly
inside the VM.
But you need to be sure that both threads have packets to send, i.e. they
both poll some rx queues of other ports. For example, you may configure
2 queues on the physical port and assign them to different PMD threads
(this should be done automatically). You may check the distribution of rx
queues between threads with 'ovs-appctl dpif-netdev/pmd-rxq-show'.
In this case, packets that appear on different rx queues of the hw port
will go to different queues of the vhost-user port.
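For reference, the set of cores a pmd-cpu-mask value selects can be read off its set bits (a quick sketch, assuming bit N of the mask corresponds to core N, 0-indexed):

```python
def cores_from_mask(mask):
    # Each set bit in the hex mask pins one PMD thread to that core id.
    return [bit for bit in range(mask.bit_length()) if mask >> bit & 1]

print(cores_from_mask(0x30))  # 0b110000 -> [4, 5]
```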



Please help us resolve this issue.

Thanks,
Priyanka


On 2019-03-06 16:48, Ilya Maximets wrote:

Hi.

At first you need to look at ovs-vswitchd.log and the log of qemu.
There might be some errors.

Some thoughts inline.

Best regards, Ilya Maximets.


Hi All,

Our setup is as follows:

We have two servers which are connected peer to peer over 40G
interfaces.

On one server we have set up OVS and added the physical 40G interface
as a DPDK interface to the ovs bridge.

We created another dpdkvhostuser interface for the VM. We added this
interface to the VM (by editing the XML). We are able to see this
interface inside the VM and have configured an IP on the interface.

We want to communicate between the other server and VM inside this
server through the OVS interface created for the VM.

The steps we followed (on the server with OVS) are:

modprobe uio


IMHO, it's 

Re: [ovs-dev] Phy-VM connectivity issue

2019-03-07 Thread Ilya Maximets
On 07.03.2019 13:29, ppnaik wrote:
> Hi Ilya,
> 
> Thanks for your suggestion.
> We tried pinning the PMD threads to cores, but the issue still exists:
> only a single queue is working instead of multiqueue.
> We tried using 'dpdkvhostuserclient' instead of 'dpdkvhostuser', but we are
> getting the following error and the VM is not able to find the interface.

Hi.
Thanks for logs. See comments inline.

> 
> ovs-vswitchd log
> 
> 2019-03-07T07:42:40.517Z|00212|dpdk|INFO|VHOST_CONFIG: vhost-user client: 
> socket created, fd: 1091
> 2019-03-07T07:42:40.517Z|00213|netdev_dpdk|INFO|vHost User device 
> 'dpdkvhostuser0' created in 'client' mode, using client socket 
> '/tmp/dpdkvhostclient0'
> 2019-03-07T07:42:40.517Z|00214|dpdk|WARN|VHOST_CONFIG: failed to connect to 
> /tmp/dpdkvhostclient0: No such file or directory
> 2019-03-07T07:42:40.517Z|00215|dpdk|INFO|VHOST_CONFIG: /tmp/dpdkvhostclient0: 
> reconnecting...
> 2019-03-07T07:42:40.517Z|00216|dpif_netdev|WARN|There's no available 
> (non-isolated) pmd thread on numa node 1. Queue 0 on port 'dpdk-p0' will be 
> assigned to the pmd on core 1 (numa node 0). Expect reduced performance.
> 
> The output of the following commands now is:
> 
> 
> ovs-appctl dpif-netdev/pmd-rxq-show
> 
> pmd thread numa_id 0 core_id 1:
>   isolated : false
>   port: dpdk-p0   queue-id:  0  pmd usage:  0 %
> pmd thread numa_id 0 core_id 2:
>   isolated : false
>   port: dpdkvhostuser0    queue-id:  0  pmd usage:  0 %
> 
> 
> ovs-vsctl show:
> 
> 356b0bfb-979c-408a-be1e-4567d52ef57f
>     Bridge "br0"
>     Port "dpdk-p0"
>     Interface "dpdk-p0"
>     type: dpdk
>     options: {dpdk-devargs=":81:00.1,n_rxq=2"}

You configured 'n_rxq' as part of 'dpdk-devargs'. It must be a separate
option to take effect:

ovs-vsctl set Interface dpdk-p0 type=dpdk options:n_rxq=2 
options:dpdk-devargs=":81:00.1"
  

>     Port "br0"
>     Interface "br0"
>     type: internal
>     Port "dpdkvhostuser0"
>     Interface "dpdkvhostuser0"
>     type: dpdkvhostuserclient
>     options: {vhost-server-path="/tmp/dpdkvhostuser0"}
> 
> ovs-vswitchd --version
>  ovs-vswitchd (Open vSwitch) 2.11.90
>  DPDK 18.11.0
> 
> qemu-system-x86_64 --version
>   QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.34), Copyright 
> (c) 2003-2008 Fabrice Bellard

Oh. Unfortunately vhost-user-client is supported in qemu only since version 2.7.
You need to switch back to the usual vhost-user ports or update qemu.

> 
> Can you please help us with what we are missing?
> 
> Thanks,
> Priyanka
> 
> 
> On 2019-03-06 18:13, Ilya Maximets wrote:
>> On 06.03.2019 15:18, ppnaik wrote:
>>> Thanks for the response Ilya.
>>> We could get this setup working now.
>>>
>>> However, we could not get it working when we want to give two queues to the 
>>> VM interface.
>>>
>>> We added the queue option when creating the interface on OVS.
>>
>> There are no 'n_rxq'/'n_txq' options for vhost interfaces. The number of
>> queues is obtained from the virtio device when QEMU connects. You could
>> have this in ovsdb, but it doesn't affect anything.
>> OTOH, physical ports have 'n_rxq' configurable.
>>
>>> We also enabled multiqueue in the VM XML for the interface and set the 
>>> vectors too.
>>>
>>> ethtool inside the VM shows:
>>>
>>> ethtool -l ens3
>>> Channel parameters for ens3:
>>> Pre-set maximums:
>>> RX:    0
>>> TX:    0
>>> Other:    0
>>> Combined:    2
>>> Current hardware settings:
>>> RX:    0
>>> TX:    0
>>> Other:    0
>>> Combined:    2
>>>
>>> However, a DPDK application inside the VM is not able to get packets from 
>>> both queues. It still works with one queue.
>>
>> I assume that you have only one PMD thread that polls both the physical
>> and the virtual port. Each PMD thread uses only one Tx queue per port.
>> So, if you want to utilize more queues in the virtual interface, you need to
>> create more PMD threads using pmd-cpu-mask.
>> For example, with pmd-cpu-mask=0x30 there will be 2 PMD threads: one on
>> CPU #4 and another on CPU #5. The thread on core #4 will send packets to,
>> let's say, Tx queue #0, and the thread on core #5 will send packets to Tx queue #1.
>> These packets will appear on Rx queue #0 and Rx queue #1 accordingly
>> inside the VM.
>> But you need to be sure that both threads have packets to send, i.e. they
>> both poll some rx queues of other ports. For example, you may configure
>> 2 queues on the physical port and assign them to different PMD threads (this
>> should be done automatically). You may check the distribution of rx queues
>> between threads with 'ovs-appctl dpif-netdev/pmd-rxq-show'.
>> In this case, packets that appear on different rx queues of the hw port will
>> go to different queues of the vhost-user port.
>>
>>>
>>> Please help us resolve this issue.
>>>
>>> Thanks,
>>> Priyanka
>>>
>>>
>>> On 2019-03-06 16:48, Ilya Maximets wrote:
 Hi.

 At first you need to l

[ovs-dev] [PATCH v2] ovs-tcpdump: add dump_cmd checker before _doexec()

2019-03-07 Thread txfh2007 via dev
Sorry, I forgot to add signed-off messages.

v1->v2: add signed-off messages.

diff --git a/utilities/ovs-tcpdump.in b/utilities/ovs-tcpdump.in
index 269c252f8..f28ecf5b2 100755
--- a/utilities/ovs-tcpdump.in
+++ b/utilities/ovs-tcpdump.in
@@ -416,6 +416,10 @@ def main():
 print("Error: must at least specify an interface with '-i' option")
 sys.exit(1)

+if os.system("command -v %s" % (dump_cmd)):
+print("Error: %s is not installed !!"%dump_cmd)
+sys.exit(1)
+
 if '-l' not in tcpdargs:
 tcpdargs.insert(0, '-l')

Signed-off-by: Liu Chang 






Re: [ovs-dev] [PATCH V5 0/2] Do not rewrite fields with the same values as matched

2019-03-07 Thread Eli Britstein
I would really like it to get merged.

How do you suggest we progress?


On 3/1/2019 7:40 AM, Eli Britstein wrote:
> ping
>
> On 2/25/2019 9:42 PM, Eli Britstein wrote:
>>
>> On 2/25/2019 7:31 PM, Ben Pfaff wrote:
>>> On Sun, Feb 24, 2019 at 09:45:21AM +, Eli Britstein wrote:
 On 2/22/2019 6:42 PM, Ben Pfaff wrote:
> On Sun, Feb 17, 2019 at 09:18:46AM +, Eli Britstein wrote:
>> This patch set avoids unnecessary rewrite actions to fields with the
>> same values as matched on.
>>
>> Patch 1 is a pre-step of generating ovs key fields macros
>> Patch 2 avoids the unnecessary rewrites and adapts the tests 
>> accordingly
> Thanks for the revision.
>
> Do you foresee other uses of OVS_KEY_FIELD in the future? As is,
> there's a lot of duplication here from the numerous declarations like
>
>   struct ovs_key_field_properties 
> ovs_key_nd_extensions_properties[] = {
> #define OVS_KEY_FIELD(type, name) {offsetof(struct 
> ovs_key_nd_extensions, name), sizeof(type)},
>   OVS_KEY_ND_EXTENSIONS_FIELDS
>   {0, 0}
> #undef OVS_KEY_FIELD
>   };
>
> If this is the only currently foreseen use, it would be better to 
> have
> the code generator just generate the declarations directly instead of
> forcing these later duplications.
 Please note the definitions are not exactly duplicated - each define
 refers to a different struct. Currently I don't know what other uses
 will be needed in the future. However, I think we should keep the code
 generic and flexible, which will allow easier use of these macros in
 the future than specific macros would.
>>> What uses do you foresee in the future?
>>
>> I can't foresee the future. I just think we should keep the code as 
>> generic as possible, so that future implementations can more easily 
>> reuse the same infrastructure.
>>
>> Just for example, I can think of a log/debug implementation to print 
>> a struct:
>>
>> suppose there is a function: void print_hex(char *buf, int len)
>>
>> and you want to print a struct ovs_key_ipv4 ip:
>>
>> #define OVS_KEY_FIELD(type, name) 
>> print_hex(((char*)ip)+offsetof(struct ovs_key_ipv4, name), sizeof(type))
>>
>> OVS_KEY_IPV4_FIELDS - will print all the fields, and you can also add 
>> a title etc.
>>


Re: [ovs-dev] [ovs-dev, v2] ovs-tcpdump: add dump_cmd checker before _doexec()

2019-03-07 Thread 0-day Robot
Bleep bloop.  Greetings txfh2007 via dev, I am a robot and I have tried out 
your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
ERROR: Author should not be mailing list.
Lines checked: 33, Warnings: 0, Errors: 1


Please check this out.  If you feel there has been an error, please email 
acon...@bytheb.org

Thanks,
0-day Robot


Re: [ovs-dev] [PATCH v2] ovs-tcpdump: add dump_cmd checker before _doexec()

2019-03-07 Thread Aaron Conole
Hi Liu,

Thanks for working on this issue.

txfh2007 via dev  writes:

> Sorry , I forgot to add signed-off messages.
>
> v1->v2: add signed-off messages.
>
> diff --git a/utilities/ovs-tcpdump.in b/utilities/ovs-tcpdump.in
> index 269c252f8..f28ecf5b2 100755
> --- a/utilities/ovs-tcpdump.in
> +++ b/utilities/ovs-tcpdump.in
> @@ -416,6 +416,10 @@ def main():
>  print("Error: must at least specify an interface with '-i' option")
>  sys.exit(1)
>
> +if os.system("command -v %s" % (dump_cmd)):
> +print("Error: %s is not installed !!"%dump_cmd)
> +sys.exit(1)
> +
>  if '-l' not in tcpdargs:
>  tcpdargs.insert(0, '-l')
>
> Signed-off-by: Liu Chang 

Typically commits are formatted with the 'Signed-off-by: ...' lines
coming before the diff.  While I think it's possible for a
maintainer/committer to fix this before applying when dealing with the
mail file, it's nicer when it is formatted correctly from the beginning.
In the future, using 'git format-patch' (and using 'git commit --amend'
to fix the commit message first) could make this easier for you.

OTOH, your proposal will generate a new flake8 error:

   utilities/ovs-tcpdump.in:420:46: E228 missing whitespace around modulo 
operator

Additionally, rather than spawning another process to search the
path, Python provides a mechanism to do the search directly.  I suggest
something like the following (please feel free to test+submit).

---
diff --git a/utilities/ovs-tcpdump.in b/utilities/ovs-tcpdump.in
index 269c252f8..11624c54b 100755
--- a/utilities/ovs-tcpdump.in
+++ b/utilities/ovs-tcpdump.in
@@ -372,6 +372,11 @@ def argv_tuples(lst):
 pass
 
 
+def py_which(executable):
+return any(os.access(os.path.join(path, executable), os.X_OK)
+   for path in os.environ["PATH"].split(os.pathsep))
+
+
 def main():
 db_sock = 'unix:@RUNDIR@/db.sock'
 interface = None
@@ -416,6 +421,10 @@ def main():
 print("Error: must at least specify an interface with '-i' option")
 sys.exit(1)
 
+if not py_which(dump_cmd):
+print("Error: unable to execute '%s' (check PATH)" % dump_cmd)
+sys.exit(1)
+
 if '-l' not in tcpdargs:
 tcpdargs.insert(0, '-l')
 
---
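For what it's worth, Python 3.3+ already ships this search in the standard library as shutil.which(); a self-contained sketch of both approaches (py_which follows the patch above; py_which3 is a hypothetical name for the stdlib variant, not part of the patch):

```python
import os
import shutil


def py_which(executable):
    # True if 'executable' is an executable entry in some PATH directory.
    return any(os.access(os.path.join(path, executable), os.X_OK)
               for path in os.environ["PATH"].split(os.pathsep))


def py_which3(executable):
    # shutil.which() additionally requires a regular file and returns
    # the resolved path (or None), so it is slightly stricter.
    return shutil.which(executable) is not None
```

Note that os.access(..., os.X_OK) also matches directories on PATH, which shutil.which() rules out.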


Re: [ovs-dev] [PATCH] meta-flow.xml: Fix typos of flow-based tunnel command examples.

2019-03-07 Thread Ben Pfaff
On Wed, Mar 06, 2019 at 10:04:50PM -0800, Han Zhou wrote:
> From: Han Zhou 
> 
> Signed-off-by: Han Zhou 

Thanks!  I applied this to master.


Re: [ovs-dev] :....

2019-03-07 Thread Liu Shiyu
Can I have a word with you?


Re: [ovs-dev] [PATCH v2] ovsdb-idl: Fix memory leak of idl->remote.

2019-03-07 Thread David Marchand
On Wed, Mar 6, 2019 at 7:23 PM Han Zhou  wrote:

> From: Han Zhou 
>
> Reported by Address Sanitizer.
>
> Fixes: 5e07b8f93f03 ("ovsdb-idl: New function
> ovsdb_idl_create_unconnected().")
> Signed-off-by: Han Zhou 
> ---
>
> Notes:
> v1->v2: address comments from David Marchand.
>
>  lib/ovsdb-idl.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
> index 8cfb201..ed2b30a 100644
> --- a/lib/ovsdb-idl.c
> +++ b/lib/ovsdb-idl.c
> @@ -571,6 +571,7 @@ ovsdb_idl_destroy(struct ovsdb_idl *idl)
>  ovsdb_idl_db_destroy(&idl->server);
>  ovsdb_idl_db_destroy(&idl->data);
>  json_destroy(idl->request_id);
> +free(idl->remote);
>  free(idl);
>  }
>  }
>

Reviewed-by: David Marchand 

-- 
David Marchand


[ovs-dev] [PATCH 1/2] OVN: Add support for DHCP option 210 - path prefix

2019-03-07 Thread lmartins
From: Lucas Alvares Gomes 

OpenStack Ironic relies on a few DHCP options [0] that are not yet supported
in OVN; one of them is 210 (PATH PREFIX, RFC 5071 [1]).

[0]
https://github.com/openstack/ironic/blob/3f6d4c6a789b12512d6cc67cdbc93ba5fbf29848/ironic/common/pxe_utils.py#L44-L54
[1] https://tools.ietf.org/html/rfc5071#section-5

Signed-off-by: Lucas Alvares Gomes 
---
 ovn/lib/ovn-l7.h| 1 +
 ovn/northd/ovn-northd.c | 1 +
 ovn/ovn-nb.xml  | 8 
 tests/ovn.at| 6 +++---
 tests/test-ovn.c| 1 +
 5 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/ovn/lib/ovn-l7.h b/ovn/lib/ovn-l7.h
index 76b26e70f..a2224d019 100644
--- a/ovn/lib/ovn-l7.h
+++ b/ovn/lib/ovn-l7.h
@@ -72,6 +72,7 @@ struct gen_opts_map {
 
 #define DHCP_OPT_BOOTFILE DHCP_OPTION("bootfile_name", 67, "str")
 #define DHCP_OPT_WPAD DHCP_OPTION("wpad", 252, "str")
+#define DHCP_OPT_PATH_PREFIX DHCP_OPTION("path_prefix", 210, "str")
 
 static inline uint32_t
 gen_opt_hash(char *opt_name)
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index d554d6794..1a59c9fa9 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -7388,6 +7388,7 @@ static struct gen_opts_map supported_dhcp_opts[] = {
 DHCP_OPT_T2,
 DHCP_OPT_WPAD,
 DHCP_OPT_BOOTFILE,
+DHCP_OPT_PATH_PREFIX,
 };
 
 static struct gen_opts_map supported_dhcpv6_opts[] = {
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index 18396507d..b084c821e 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -2170,6 +2170,14 @@
 to identify a bootfile.
   
 
+
+
+  
+The DHCPv4 option code for this option is 210. In PXELINUX'
+case this option is used to set a common path prefix,
+instead of deriving it from the bootfile name.
+  
+
   
 
 
diff --git a/tests/ovn.at b/tests/ovn.at
index 2af225a67..a5a6c6652 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -1104,9 +1104,9 @@ put_arp(inport, arp.spa, arp.sha);
 # put_dhcp_opts
 reg1[0] = put_dhcp_opts(offerip = 1.2.3.4, router = 10.0.0.1);
 encodes as 
controller(userdata=00.00.00.02.00.00.00.00.00.01.de.10.00.00.00.40.01.02.03.04.03.04.0a.00.00.01,pause)
-reg2[5] = 
put_dhcp_opts(offerip=10.0.0.4,router=10.0.0.1,netmask=255.255.254.0,mtu=1400,domain="ovn.org",wpad="https://example.org",bootfile_name="https://127.0.0.1/boot.ipxe";);
-formats as reg2[5] = put_dhcp_opts(offerip = 10.0.0.4, router = 10.0.0.1, 
netmask = 255.255.254.0, mtu = 1400, domain = "ovn.org", wpad = 
"https://example.org";, bootfile_name = "https://127.0.0.1/boot.ipxe";);
-encodes as 
controller(userdata=00.00.00.02.00.00.00.00.00.01.de.10.00.00.00.25.0a.00.00.04.03.04.0a.00.00.01.01.04.ff.ff.fe.00.1a.02.05.78.0f.07.6f.76.6e.2e.6f.72.67.fc.13.68.74.74.70.73.3a.2f.2f.65.78.61.6d.70.6c.65.2e.6f.72.67.43.1b.68.74.74.70.73.3a.2f.2f.31.32.37.2e.30.2e.30.2e.31.2f.62.6f.6f.74.2e.69.70.78.65,pause)
+reg2[5] = 
put_dhcp_opts(offerip=10.0.0.4,router=10.0.0.1,netmask=255.255.254.0,mtu=1400,domain="ovn.org",wpad="https://example.org",bootfile_name="https://127.0.0.1/boot.ipxe",path_prefix="/tftpboot";);
+formats as reg2[5] = put_dhcp_opts(offerip = 10.0.0.4, router = 10.0.0.1, 
netmask = 255.255.254.0, mtu = 1400, domain = "ovn.org", wpad = 
"https://example.org";, bootfile_name = "https://127.0.0.1/boot.ipxe";, 
path_prefix = "/tftpboot");
+encodes as 
controller(userdata=00.00.00.02.00.00.00.00.00.01.de.10.00.00.00.25.0a.00.00.04.03.04.0a.00.00.01.01.04.ff.ff.fe.00.1a.02.05.78.0f.07.6f.76.6e.2e.6f.72.67.fc.13.68.74.74.70.73.3a.2f.2f.65.78.61.6d.70.6c.65.2e.6f.72.67.43.1b.68.74.74.70.73.3a.2f.2f.31.32.37.2e.30.2e.30.2e.31.2f.62.6f.6f.74.2e.69.70.78.65.d2.09.2f.74.66.74.70.62.6f.6f.74,pause)
 reg0[15] = 
put_dhcp_opts(offerip=10.0.0.4,router=10.0.0.1,netmask=255.255.255.0,mtu=1400,ip_forward_enable=1,default_ttl=121,dns_server={8.8.8.8,7.7.7.7},classless_static_route={30.0.0.0/24,10.0.0.4,40.0.0.0/16,10.0.0.6,0.0.0.0/0,10.0.0.1},ethernet_encap=1,router_discovery=0);
 formats as reg0[15] = put_dhcp_opts(offerip = 10.0.0.4, router = 10.0.0.1, 
netmask = 255.255.255.0, mtu = 1400, ip_forward_enable = 1, default_ttl = 121, 
dns_server = {8.8.8.8, 7.7.7.7}, classless_static_route = {30.0.0.0/24, 
10.0.0.4, 40.0.0.0/16, 10.0.0.6, 0.0.0.0/0, 10.0.0.1}, ethernet_encap = 1, 
router_discovery = 0);
 encodes as 
controller(userdata=00.00.00.02.00.00.00.00.00.01.de.10.00.00.00.6f.0a.00.00.04.03.04.0a.00.00.01.01.04.ff.ff.ff.00.1a.02.05.78.13.01.01.17.01.79.06.08.08.08.08.08.07.07.07.07.79.14.18.1e.00.00.0a.00.00.04.10.28.00.0a.00.00.06.00.0a.00.00.01.24.01.01.1f.01.00,pause)
diff --git a/tests/test-ovn.c b/tests/test-ovn.c
index bafe7bbc8..8e8c38a66 100644
--- a/tests/test-ovn.c
+++ b/tests/test-ovn.c
@@ -184,6 +184,7 @@ create_gen_opts(struct hmap *dhcp_opts, struct hmap 
*dhcpv6_opts,
 dhcp_opt_add(dhcp_opts, "lease_time",  51, "uint32");
 dhcp_opt_add(dhcp_opts, "wpad", 252, "str");
   

[ovs-dev] [PATCH 2/2] OVN: Add support for DHCP option 150 - TFTP server address

2019-03-07 Thread lmartins
From: Lucas Alvares Gomes 

OpenStack Ironic relies on a few DHCP options [0] that were not yet
supported in OVN. This patch adds the last one, option 150 (TFTP server
address, RFC 5859 [1]).

Note that this option is Cisco proprietary; the IETF standard that
matches this requirement is option 66. The difference is that 150
allows multiple IPs to be specified while 66 allows only one.

[0]
https://github.com/openstack/ironic/blob/3f6d4c6a789b12512d6cc67cdbc93ba5fbf29848/ironic/common/pxe_utils.py#L44-L54
[1] https://tools.ietf.org/html/rfc5859

Signed-off-by: Lucas Alvares Gomes 
---
 ovn/lib/ovn-l7.h| 2 ++
 ovn/northd/ovn-northd.c | 1 +
 ovn/ovn-nb.xml  | 9 +
 tests/ovn.at| 6 +++---
 tests/test-ovn.c| 1 +
 5 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/ovn/lib/ovn-l7.h b/ovn/lib/ovn-l7.h
index a2224d019..fe3104e1f 100644
--- a/ovn/lib/ovn-l7.h
+++ b/ovn/lib/ovn-l7.h
@@ -73,6 +73,8 @@ struct gen_opts_map {
 #define DHCP_OPT_BOOTFILE DHCP_OPTION("bootfile_name", 67, "str")
 #define DHCP_OPT_WPAD DHCP_OPTION("wpad", 252, "str")
 #define DHCP_OPT_PATH_PREFIX DHCP_OPTION("path_prefix", 210, "str")
+#define DHCP_OPT_TFTP_SERVER_ADDRESS \
+DHCP_OPTION("tftp_server_address", 150, "ipv4")
 
 static inline uint32_t
 gen_opt_hash(char *opt_name)
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 1a59c9fa9..2843969f4 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -7389,6 +7389,7 @@ static struct gen_opts_map supported_dhcp_opts[] = {
 DHCP_OPT_WPAD,
 DHCP_OPT_BOOTFILE,
 DHCP_OPT_PATH_PREFIX,
+DHCP_OPT_TFTP_SERVER_ADDRESS,
 };
 
 static struct gen_opts_map supported_dhcpv6_opts[] = {
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index b084c821e..61a57110a 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -2178,6 +2178,15 @@
 instead of deriving it from the bootfile name.
   
 
+
+
+  
+The DHCPv4 option code for this option is 150. The option
+contains one or more IPv4 addresses that the client MAY
use. This option is Cisco proprietary; the IETF standard
that matches this requirement is option 66 (tftp_server).
+  
+
   
 
 
diff --git a/tests/ovn.at b/tests/ovn.at
index a5a6c6652..f2f2bc405 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -1107,9 +1107,9 @@ reg1[0] = put_dhcp_opts(offerip = 1.2.3.4, router = 
10.0.0.1);
 reg2[5] = 
put_dhcp_opts(offerip=10.0.0.4,router=10.0.0.1,netmask=255.255.254.0,mtu=1400,domain="ovn.org",wpad="https://example.org",bootfile_name="https://127.0.0.1/boot.ipxe",path_prefix="/tftpboot";);
 formats as reg2[5] = put_dhcp_opts(offerip = 10.0.0.4, router = 10.0.0.1, 
netmask = 255.255.254.0, mtu = 1400, domain = "ovn.org", wpad = 
"https://example.org";, bootfile_name = "https://127.0.0.1/boot.ipxe";, 
path_prefix = "/tftpboot");
 encodes as 
controller(userdata=00.00.00.02.00.00.00.00.00.01.de.10.00.00.00.25.0a.00.00.04.03.04.0a.00.00.01.01.04.ff.ff.fe.00.1a.02.05.78.0f.07.6f.76.6e.2e.6f.72.67.fc.13.68.74.74.70.73.3a.2f.2f.65.78.61.6d.70.6c.65.2e.6f.72.67.43.1b.68.74.74.70.73.3a.2f.2f.31.32.37.2e.30.2e.30.2e.31.2f.62.6f.6f.74.2e.69.70.78.65.d2.09.2f.74.66.74.70.62.6f.6f.74,pause)
-reg0[15] = 
put_dhcp_opts(offerip=10.0.0.4,router=10.0.0.1,netmask=255.255.255.0,mtu=1400,ip_forward_enable=1,default_ttl=121,dns_server={8.8.8.8,7.7.7.7},classless_static_route={30.0.0.0/24,10.0.0.4,40.0.0.0/16,10.0.0.6,0.0.0.0/0,10.0.0.1},ethernet_encap=1,router_discovery=0);
-formats as reg0[15] = put_dhcp_opts(offerip = 10.0.0.4, router = 10.0.0.1, 
netmask = 255.255.255.0, mtu = 1400, ip_forward_enable = 1, default_ttl = 121, 
dns_server = {8.8.8.8, 7.7.7.7}, classless_static_route = {30.0.0.0/24, 
10.0.0.4, 40.0.0.0/16, 10.0.0.6, 0.0.0.0/0, 10.0.0.1}, ethernet_encap = 1, 
router_discovery = 0);
-encodes as 
controller(userdata=00.00.00.02.00.00.00.00.00.01.de.10.00.00.00.6f.0a.00.00.04.03.04.0a.00.00.01.01.04.ff.ff.ff.00.1a.02.05.78.13.01.01.17.01.79.06.08.08.08.08.08.07.07.07.07.79.14.18.1e.00.00.0a.00.00.04.10.28.00.0a.00.00.06.00.0a.00.00.01.24.01.01.1f.01.00,pause)
+reg0[15] = 
put_dhcp_opts(offerip=10.0.0.4,router=10.0.0.1,netmask=255.255.255.0,mtu=1400,ip_forward_enable=1,default_ttl=121,dns_server={8.8.8.8,7.7.7.7},classless_static_route={30.0.0.0/24,10.0.0.4,40.0.0.0/16,10.0.0.6,0.0.0.0/0,10.0.0.1},ethernet_encap=1,router_discovery=0,tftp_server_address={10.0.0.4,10.0.0.5});
+formats as reg0[15] = put_dhcp_opts(offerip = 10.0.0.4, router = 10.0.0.1, 
netmask = 255.255.255.0, mtu = 1400, ip_forward_enable = 1, default_ttl = 121, 
dns_server = {8.8.8.8, 7.7.7.7}, classless_static_route = {30.0.0.0/24, 
10.0.0.4, 40.0.0.0/16, 10.0.0.6, 0.0.0.0/0, 10.0.0.1}, ethernet_encap = 1, 
router_discovery = 0, tftp_server_address = {10.0.0.4, 10.0.0.5});
+encodes as 
controller(userdata=00.00.00.02.00.00.00.00.00.01.de.10.00.
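For readers decoding these payloads by hand: after the fixed controller header, the userdata bytes above are plain DHCP option TLVs - a one-byte option code, a one-byte length, then the value (for example, `03.04.0a.00.00.01` is option 3, router, carrying the 4-byte address 10.0.0.1). A minimal, illustrative walker over such a buffer (not OVN's actual parser) might look like:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Minimal walker over DHCP option TLVs (code, len, value...), like the
 * option bytes embedded in the userdata payloads above.  Illustrative
 * sketch only; it ignores pad/end options and is not OVN's parser.
 * Returns the number of options, or -1 if the buffer is malformed. */
static int
count_dhcp_opts(const uint8_t *buf, size_t len)
{
    size_t i = 0;
    int n = 0;

    while (i + 2 <= len) {
        uint8_t code = buf[i];
        uint8_t optlen = buf[i + 1];

        if (i + 2 + optlen > len) {
            return -1;          /* Truncated option. */
        }
        printf("option %u, %u bytes\n", code, optlen);
        i += 2 + optlen;
        n++;
    }
    return i == len ? n : -1;
}
```

Feeding it, say, a netmask option (code 1, length 4) followed by an MTU option (code 26, length 2) yields two options; chopping the last byte makes it report truncation.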

Re: [ovs-dev] [PATCH v3 2/5] ovn: Add generic HA chassis group

2019-03-07 Thread Numan Siddique
On Tue, Mar 5, 2019 at 11:34 PM Han Zhou  wrote:

> > > +
> > > +static void
> > > +add_to_ha_ref_chassis_info(struct ha_ref_chassis_info *ref_ch_info,
> > > +   const struct sbrec_chassis *chassis)
> > > +{
> > > +for (size_t j = 0; j < ref_ch_info->n_ref_chassis; j++) {
> > > +if (ref_ch_info->ref_chassis[j] == chassis) {
> > > +   return;
> > > +}
> > > +}
> > > +
> > > +ref_ch_info->ref_chassis = xrealloc(ref_ch_info->ref_chassis,
> > > +sizeof
> *ref_ch_info->ref_chassis *
> > > +
> (++ref_ch_info->n_ref_chassis));
> >
> > This may be inefficient, considering the number of ref chassis to be
> > added for each HA group. It is better to xrealloc to original_size *
> > 2 each time and expand only when more space is needed.
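The growth strategy suggested here - track a capacity separately from the count and double it on demand - could be sketched as follows; the struct and function names are illustrative, not the actual patch:

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative sketch of geometric array growth: instead of realloc'ing
 * on every append, keep a 'capacity' field and double it only when the
 * array is full.  Names loosely mirror the patch; 'capacity' is new. */
struct ref_list {
    const void **items;
    size_t n;
    size_t capacity;
};

static void
ref_list_add(struct ref_list *list, const void *item)
{
    for (size_t i = 0; i < list->n; i++) {
        if (list->items[i] == item) {
            return;             /* Already referenced. */
        }
    }
    if (list->n >= list->capacity) {
        list->capacity = list->capacity ? 2 * list->capacity : 4;
        list->items = realloc(list->items,
                              list->capacity * sizeof *list->items);
        assert(list->items);
    }
    list->items[list->n++] = item;
}
```

With this shape, appending k items costs O(k) amortized reallocation work instead of O(k) separate reallocs.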
> >
> > > +ref_ch_info->ref_chassis[ref_ch_info->n_ref_chassis - 1] =
> > > +CONST_CAST(struct sbrec_chassis *, chassis);
> > > +}
> > > +
> > > +static void
> > > +update_sb_ha_group_ref_chassis(struct shash *ha_ref_chassis_map)
> > > +{
> > > +struct shash_node *node, *next;
> > > +SHASH_FOR_EACH_SAFE (node, next, ha_ref_chassis_map) {
> > > +struct ha_ref_chassis_info *ha_ref_info = node->data;
> > > +
> sbrec_ha_chassis_group_set_ref_chassis(ha_ref_info->ha_chassis_group,
> > > +
>  ha_ref_info->ref_chassis,
> > > +
>  ha_ref_info->n_ref_chassis);
> > > +free(ha_ref_info->ref_chassis);
> > > +free(ha_ref_info);
> > > +shash_delete(ha_ref_chassis_map, node);
> > > +}
> > > +}
> > > +
> > > +/* This function returns logical router datapath (with a distributed
> > > + * gateway port) to which 'od' is connected to - either directly
> > > + * or indirectly (via transit logical switches).
> > > + * Returns NULL if no logical router with gw port found.
> > > + */
> > > +static struct ovn_datapath *
> > > +get_router_dp_with_gw_port(struct hmap *ports,
> > > +   struct ovn_datapath *od,
> > > +   struct ovn_datapath *peer_od)
> > > +{
> > > +if (!od) {
> > > +return NULL;
> > > +}
> > > +
> > > +if (od->nbs) {
> > > +/* It's a logical switch datapath. */
> > > +if (peer_od) {
> > > +/* If peer datapath is not logical router, then
> > > + * something is wrong. */
> > > +ovs_assert(peer_od->nbr);
> > > +}
> > > +
> > > +for (size_t i = 0; i < od->n_router_ports; i++) {
> > > +if (!od->router_ports[i]->peer) {
> > > +/* If there is no peer port connecting to the
> > > + * router datapath, ignore it. */
> > > +continue;
> > > +}
> > > +
> > > +struct ovn_datapath *router_dp =
> od->router_ports[i]->peer->od;
> > > +if (router_dp->l3dgw_port && router_dp->l3dgw_port->nbrp)
> {
> > > +/* Router datapath has a distributed gateway router
> port. */
> > > +return router_dp;
> >
> > I think we can't return when just one router_dp is found. There can be
> > more than one connected router that has gateway router ports. So the
> > return value of this function should be a set.
> >
> > > +}
> > > +}
> > > +
> > > +/* The logical switch datapath is not connected to any
> > > + * logical router with a distributed gateway port. Check if it
> > > + * is indirectly connected to a logical router with a gw
> port. */
> > > +for (size_t i = 0; i < od->n_router_ports; i++) {
> > > +if (!od->router_ports[i]->peer) {
> > > +continue;
> > > +}
> > > +
> > > +struct ovn_datapath *router_dp =
> > > +od->router_ports[i]->peer->od;
> > > +
> > > +/* If we don't check this, we will be in an infinite
> loop. */
> > > +if (router_dp != peer_od) {
> >
> > peer_od should also be a set, it should skip checking any datapath
> > that has already been checked. Consider the case LR1 is connected with
> > LS1, LS2 and LS3.
> >
> > > +router_dp = get_router_dp_with_gw_port(ports,
> router_dp,
> > > +   od);
> > > +if (router_dp) {
> > > +/* Found a logical router with gw port indirectly
> connected
> > > + * to 'od'. */
> > > +return router_dp;
> > > +}
> > > +}
> > > +}
> > > +} else if (od->nbr) {
> > > +/* It's a logical router datapath. */
> > > +if (peer_od) {
> > > +/* If peer datapath is not logical switch, then
> > > + * something is wrong. */
> > > +ovs_assert(peer_od->nbs);
> >
> > A router port can be peered with another router port directly, so this
> > assert is not true.
> >
> > > +}
> > > +
> > > +/* Check if th

Re: [ovs-dev] [PATCH v3 2/5] ovn: Add generic HA chassis group

2019-03-07 Thread Han Zhou
On Thu, Mar 7, 2019 at 8:37 AM Numan Siddique  wrote:
>
>
>
> On Tue, Mar 5, 2019 at 11:34 PM Han Zhou  wrote:
>>
>> > > +
>> > > +static void
>> > > +add_to_ha_ref_chassis_info(struct ha_ref_chassis_info *ref_ch_info,
>> > > +   const struct sbrec_chassis *chassis)
>> > > +{
>> > > +for (size_t j = 0; j < ref_ch_info->n_ref_chassis; j++) {
>> > > +if (ref_ch_info->ref_chassis[j] == chassis) {
>> > > +   return;
>> > > +}
>> > > +}
>> > > +
>> > > +ref_ch_info->ref_chassis = xrealloc(ref_ch_info->ref_chassis,
>> > > +sizeof 
>> > > *ref_ch_info->ref_chassis *
>> > > +(++ref_ch_info->n_ref_chassis));
>> >
>> > This may be inefficient, considering the number of ref chassis to be
>> > added for each HA group. It is better to xrealloc to original_size *
>> > 2 each time and expand only when more space is needed.
>> >
>> > > +ref_ch_info->ref_chassis[ref_ch_info->n_ref_chassis - 1] =
>> > > +CONST_CAST(struct sbrec_chassis *, chassis);
>> > > +}
>> > > +
>> > > +static void
>> > > +update_sb_ha_group_ref_chassis(struct shash *ha_ref_chassis_map)
>> > > +{
>> > > +struct shash_node *node, *next;
>> > > +SHASH_FOR_EACH_SAFE (node, next, ha_ref_chassis_map) {
>> > > +struct ha_ref_chassis_info *ha_ref_info = node->data;
>> > > +
>> > > sbrec_ha_chassis_group_set_ref_chassis(ha_ref_info->ha_chassis_group,
>> > > +   ha_ref_info->ref_chassis,
>> > > +   
>> > > ha_ref_info->n_ref_chassis);
>> > > +free(ha_ref_info->ref_chassis);
>> > > +free(ha_ref_info);
>> > > +shash_delete(ha_ref_chassis_map, node);
>> > > +}
>> > > +}
>> > > +
>> > > +/* This function returns logical router datapath (with a distributed
>> > > + * gateway port) to which 'od' is connected to - either directly
>> > > + * or indirectly (via transit logical switches).
>> > > + * Returns NULL if no logical router with gw port found.
>> > > + */
>> > > +static struct ovn_datapath *
>> > > +get_router_dp_with_gw_port(struct hmap *ports,
>> > > +   struct ovn_datapath *od,
>> > > +   struct ovn_datapath *peer_od)
>> > > +{
>> > > +if (!od) {
>> > > +return NULL;
>> > > +}
>> > > +
>> > > +if (od->nbs) {
>> > > +/* It's a logical switch datapath. */
>> > > +if (peer_od) {
>> > > +/* If peer datapath is not logical router, then
>> > > + * something is wrong. */
>> > > +ovs_assert(peer_od->nbr);
>> > > +}
>> > > +
>> > > +for (size_t i = 0; i < od->n_router_ports; i++) {
>> > > +if (!od->router_ports[i]->peer) {
>> > > +/* If there is no peer port connecting to the
>> > > + * router datapath, ignore it. */
>> > > +continue;
>> > > +}
>> > > +
>> > > +struct ovn_datapath *router_dp = 
>> > > od->router_ports[i]->peer->od;
>> > > +if (router_dp->l3dgw_port && router_dp->l3dgw_port->nbrp) {
>> > > +/* Router datapath has a distributed gateway router 
>> > > port. */
>> > > +return router_dp;
>> >
>> > I think we can't return when just one router_dp is found. There can be
>> > more than one connected router that has gateway router ports. So the
>> > return value of this function should be a set.
>> >
>> > > +}
>> > > +}
>> > > +
>> > > +/* The logical switch datapath is not connected to any
>> > > + * logical router with a distributed gateway port. Check if it
>> > > + * is indirectly connected to a logical router with a gw port. 
>> > > */
>> > > +for (size_t i = 0; i < od->n_router_ports; i++) {
>> > > +if (!od->router_ports[i]->peer) {
>> > > +continue;
>> > > +}
>> > > +
>> > > +struct ovn_datapath *router_dp =
>> > > +od->router_ports[i]->peer->od;
>> > > +
>> > > +/* If we don't check this, we will be in an infinite loop. 
>> > > */
>> > > +if (router_dp != peer_od) {
>> >
>> > peer_od should also be a set, it should skip checking any datapath
>> > that has already been checked. Consider the case LR1 is connected with
>> > LS1, LS2 and LS3.
>> >
>> > > +router_dp = get_router_dp_with_gw_port(ports, router_dp,
>> > > +   od);
>> > > +if (router_dp) {
>> > > +/* Found a logical router with gw port indirectly 
>> > > connected
>> > > + * to 'od'. */
>> > > +return router_dp;
>> > > +}
>> > > +}
>> > > +}
>> > > +} else if (od->nbr) {
>> > > +/* It's a logical router datapath. */

Re: [ovs-dev] [PATCH v3 2/5] ovn: Add generic HA chassis group

2019-03-07 Thread Numan Siddique
On Thu, Mar 7, 2019, 10:46 PM Han Zhou  wrote:

> On Thu, Mar 7, 2019 at 8:37 AM Numan Siddique  wrote:
> >
> >
> >
> > On Tue, Mar 5, 2019 at 11:34 PM Han Zhou  wrote:
> >>
> >> > > +
> >> > > +static void
> >> > > +add_to_ha_ref_chassis_info(struct ha_ref_chassis_info *ref_ch_info,
> >> > > +   const struct sbrec_chassis *chassis)
> >> > > +{
> >> > > +for (size_t j = 0; j < ref_ch_info->n_ref_chassis; j++) {
> >> > > +if (ref_ch_info->ref_chassis[j] == chassis) {
> >> > > +   return;
> >> > > +}
> >> > > +}
> >> > > +
> >> > > +ref_ch_info->ref_chassis = xrealloc(ref_ch_info->ref_chassis,
> >> > > +sizeof
> *ref_ch_info->ref_chassis *
> >> > > +
> (++ref_ch_info->n_ref_chassis));
> >> >
> >> > This may be inefficient, considering the number of ref chassis to be
> >> > added for each HA group. It is better to xrealloc to original_size *
> >> > 2 each time and expand only when more space is needed.
> >> >
> >> > > +ref_ch_info->ref_chassis[ref_ch_info->n_ref_chassis - 1] =
> >> > > +CONST_CAST(struct sbrec_chassis *, chassis);
> >> > > +}
> >> > > +
> >> > > +static void
> >> > > +update_sb_ha_group_ref_chassis(struct shash *ha_ref_chassis_map)
> >> > > +{
> >> > > +struct shash_node *node, *next;
> >> > > +SHASH_FOR_EACH_SAFE (node, next, ha_ref_chassis_map) {
> >> > > +struct ha_ref_chassis_info *ha_ref_info = node->data;
> >> > > +
> sbrec_ha_chassis_group_set_ref_chassis(ha_ref_info->ha_chassis_group,
> >> > > +
>  ha_ref_info->ref_chassis,
> >> > > +
>  ha_ref_info->n_ref_chassis);
> >> > > +free(ha_ref_info->ref_chassis);
> >> > > +free(ha_ref_info);
> >> > > +shash_delete(ha_ref_chassis_map, node);
> >> > > +}
> >> > > +}
> >> > > +
> >> > > +/* This function returns logical router datapath (with a
> distributed
> >> > > + * gateway port) to which 'od' is connected to - either directly
> >> > > + * or indirectly (via transit logical switches).
> >> > > + * Returns NULL if no logical router with gw port found.
> >> > > + */
> >> > > +static struct ovn_datapath *
> >> > > +get_router_dp_with_gw_port(struct hmap *ports,
> >> > > +   struct ovn_datapath *od,
> >> > > +   struct ovn_datapath *peer_od)
> >> > > +{
> >> > > +if (!od) {
> >> > > +return NULL;
> >> > > +}
> >> > > +
> >> > > +if (od->nbs) {
> >> > > +/* It's a logical switch datapath. */
> >> > > +if (peer_od) {
> >> > > +/* If peer datapath is not logical router, then
> >> > > + * something is wrong. */
> >> > > +ovs_assert(peer_od->nbr);
> >> > > +}
> >> > > +
> >> > > +for (size_t i = 0; i < od->n_router_ports; i++) {
> >> > > +if (!od->router_ports[i]->peer) {
> >> > > +/* If there is no peer port connecting to the
> >> > > + * router datapath, ignore it. */
> >> > > +continue;
> >> > > +}
> >> > > +
> >> > > +struct ovn_datapath *router_dp =
> od->router_ports[i]->peer->od;
> >> > > +if (router_dp->l3dgw_port &&
> router_dp->l3dgw_port->nbrp) {
> >> > > +/* Router datapath has a distributed gateway
> router port. */
> >> > > +return router_dp;
> >> >
> >> > I think we can't return when just one router_dp is found. There can be
> >> > more than one connected router that has gateway router ports. So the
> >> > return value of this function should be a set.
> >> >
> >> > > +}
> >> > > +}
> >> > > +
> >> > > +/* The logical switch datapath is not connected to any
> >> > > + * logical router with a distributed gateway port. Check
> if it
> >> > > + * is indirectly connected to a logical router with a gw
> port. */
> >> > > +for (size_t i = 0; i < od->n_router_ports; i++) {
> >> > > +if (!od->router_ports[i]->peer) {
> >> > > +continue;
> >> > > +}
> >> > > +
> >> > > +struct ovn_datapath *router_dp =
> >> > > +od->router_ports[i]->peer->od;
> >> > > +
> >> > > +/* If we don't check this, we will be in an infinite
> loop. */
> >> > > +if (router_dp != peer_od) {
> >> >
> >> > peer_od should also be a set, it should skip checking any datapath
> >> > that has already been checked. Consider the case LR1 is connected with
> >> > LS1, LS2 and LS3.
> >> >
> >> > > +router_dp = get_router_dp_with_gw_port(ports,
> router_dp,
> >> > > +   od);
> >> > > +if (router_dp) {
> >> > > +/* Found a logical router with gw port
> indirectly connected
> >> > > + * to 'od'. */
> >> > > +return router_dp;
> >> > > +}
> >> > > +}
> >> > >

Re: [ovs-dev] [PATCH v2] ovsdb-idl: Fix memory leak of idl->remote.

2019-03-07 Thread Ben Pfaff
On Wed, Mar 06, 2019 at 09:01:21AM -0800, Han Zhou wrote:
> From: Han Zhou 
> 
> Reported by Address Sanitizer.
> 
> Fixes: 5e07b8f93f03 ("ovsdb-idl: New function 
> ovsdb_idl_create_unconnected().")
> Signed-off-by: Han Zhou 
> ---
> 
> Notes:
> v1->v2: address comments from David Marchand.

Thanks, applied to branch-2.11
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [crash] trigger a panic when flush-conntrack

2019-03-07 Thread Darrell Ball
Thanks for your help Solomon

Can you give the following debug patch a spin?
I will continue to try to repro locally.

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 5410ab4..6968c03 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -250,6 +250,22 @@ conn_key_cmp(const struct conn_key *key1, const struct
conn_key *key2)
 return 1;
 }

+static bool
+conn_key_present(struct conntrack *ct, struct conntrack_bucket *ctb,
+ const struct conn_key *key)
+OVS_REQUIRES(ctb->lock)
+{
+struct conn *conn;
+uint32_t hash = conn_key_hash(key, ct->hash_basis);
+
+HMAP_FOR_EACH_WITH_HASH (conn, node, hash, &ctb->connections) {
+if (!conn_key_cmp(&conn->key, key)) {
+return true;
+}
+}
+return false;
+}
+
 static void
 ct_print_conn_info(const struct conn *c, const char *log_msg,
enum vlog_level vll, bool force, bool rl_on)
@@ -812,7 +828,14 @@ conn_clean(struct conntrack *ct, struct conn *conn,
 expectation_clean(ct, &conn->key, ct->hash_basis);
 }
 ovs_list_remove(&conn->exp_node);
-hmap_remove(&ctb->connections, &conn->node);
+
+if (conn_key_present(ct, ctb, &conn->key)) {
+hmap_remove(&ctb->connections, &conn->node);
+} else {
+char *log_msg = xasprintf("conn_clean: conn not present in hmap");
+ct_print_conn_info(conn, log_msg, VLL_WARN, true, false);
+free(log_msg);
+}
 atomic_count_dec(&ct->n_conn);
 if (conn->nat_info) {
 nat_clean(ct, conn, ctb);
@@ -1383,19 +1406,18 @@ sweep_bucket(struct conntrack *ct, struct
conntrack_bucket *ctb,

 for (unsigned i = 0; i < N_CT_TM; i++) {
 LIST_FOR_EACH_SAFE (conn, next, exp_node, &ctb->exp_lists[i]) {
-if (conn->conn_type == CT_CONN_TYPE_DEFAULT) {
-if (!conn_expired(conn, now) || count >= limit) {
-min_expiration = MIN(min_expiration, conn->expiration);
-if (count >= limit) {
-/* Do not check other lists. */
-COVERAGE_INC(conntrack_long_cleanup);
-return min_expiration;
-}
-break;
+ovs_assert(conn->conn_type == CT_CONN_TYPE_DEFAULT);
+if (!conn_expired(conn, now) || count >= limit) {
+min_expiration = MIN(min_expiration, conn->expiration);
+if (count >= limit) {
+/* Do not check other lists. */
+COVERAGE_INC(conntrack_long_cleanup);
+return min_expiration;
 }
-conn_clean(ct, conn, ctb);
-count++;
+break;
 }
+conn_clean(ct, conn, ctb);
+count++;
 }
 }
 return min_expiration;
@@ -2540,16 +2562,18 @@ int
 conntrack_flush(struct conntrack *ct, const uint16_t *zone)
 {
 for (unsigned i = 0; i < CONNTRACK_BUCKETS; i++) {
-struct conn *conn, *next;
-
-ct_lock_lock(&ct->buckets[i].lock);
-HMAP_FOR_EACH_SAFE (conn, next, node, &ct->buckets[i].connections) {
-if ((!zone || *zone == conn->key.zone) &&
-(conn->conn_type == CT_CONN_TYPE_DEFAULT)) {
-conn_clean(ct, conn, &ct->buckets[i]);
+struct conntrack_bucket *ctb = &ct->buckets[i];
+ct_lock_lock(&ctb->lock);
+for (unsigned j = 0; j < N_CT_TM; j++) {
+struct conn *conn, *next;
+LIST_FOR_EACH_SAFE (conn, next, exp_node, &ctb->exp_lists[j]) {
+if (!zone || *zone == conn->key.zone) {
+ovs_assert(conn->conn_type == CT_CONN_TYPE_DEFAULT);
+conn_clean(ct, conn, ctb);
+}
 }
 }
-ct_lock_unlock(&ct->buckets[i].lock);
+ct_lock_unlock(&ctb->lock);
 }

 return 0;


On Wed, Mar 6, 2019 at 11:33 PM solomon  wrote:

> Darrell Ball wrote:
> > +LIST_FOR_EACH_SAFE (conn, next, exp_node,
> &ctb->exp_lists[j]) {
> > +if (!zone || *zone == conn->key.zone) {
> > +ovs_assert(conn->conn_type == CT_CONN_TYPE_DEFAULT);
>
> why do we need this assert?
> Clean the CT_CONN_TYPE_DEFAULT type in conntrack_flush(), and clean
> CT_CONN_TYPE_UN_NAT in nat_clean() like following:
> +if ((!zone || *zone == conn->key.zone) &&
> +(conn->conn_type == CT_CONN_TYPE_DEFAULT)) {
> +//ovs_assert(conn->conn_type == CT_CONN_TYPE_DEFAULT);
>
>
> with the above code, catch an another panic which in ct_clean thread.
> I have seen the same panic without changing the code.
> Both ct_clean and flush-conntrack can catch the bad bucket value.
>
> #0  0x562ae7402553 in hmap_remove (node=0x7f871bdc4258,
> hmap=0x7f9498701c68) at ./include/openvswitch/hmap.h:287
> 287 while (*bucket != node) {
> [Current thread is 1 (Thread 0x7f948700 (LWP 20
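For context on the crash site: hmap_remove() walks the node's bucket chain with `while (*bucket != node)` and has no not-found exit, so removing a node that is no longer on its chain (e.g. a double remove after a flush) walks off the list and dereferences a garbage pointer. A simplified single-chain model of the hazard, with an explicit not-found check playing the role of the `conn_key_present()` precheck in the debug patch above:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of the intrusive bucket chain that hmap_remove()
 * walks.  Removing a node that is not on the chain makes the walk fall
 * off the end; this sketch returns 0 in that case, whereas the real
 * hmap_remove() would dereference a garbage pointer. */
struct node {
    struct node *next;
};

static int
bucket_remove(struct node **bucket, struct node *node)
{
    while (*bucket && *bucket != node) {
        bucket = &(*bucket)->next;
    }
    if (!*bucket) {
        return 0;               /* Not present: a double remove. */
    }
    *bucket = node->next;       /* Unlink. */
    return 1;
}
```

The precheck-before-remove pattern in the debug patch detects exactly this case instead of crashing.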


Re: [ovs-dev] [PATCH 2/2] OVN: Add support for DHCP option 150 - TFTP server address

2019-03-07 Thread Ben Pfaff
On Thu, Mar 07, 2019 at 04:28:35PM +, lmart...@redhat.com wrote:
> From: Lucas Alvares Gomes 
> 
> OpenStack Ironic relies on a few DHCP options [0] that were not
> supported in OVN yet. This patch is adding the last one which is the
> option 150 (TFTP server address, RFC5859 [1]).
> 
> Note that this option is Cisco proprietary; the IETF standard that
> matches this requirement is Option 66. The difference is that 150
> allows multiple IPs to be specified while 66 allows only one.
> 
> [0]
> https://github.com/openstack/ironic/blob/3f6d4c6a789b12512d6cc67cdbc93ba5fbf29848/ironic/common/pxe_utils.py#L44-L54
> [1] https://tools.ietf.org/html/rfc5859
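For reference, RFC 5859 defines option 150 as the option code followed by a length and a list of 4-byte IPv4 addresses, which is why it can carry several TFTP servers while option 66 carries a single hostname string. A hedged encoding sketch (illustrative only; OVN has its own DHCP option machinery):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sketch of encoding DHCP option 150 (RFC 5859): code, length, then one
 * 4-byte IPv4 address per TFTP server.  Illustrative only.
 * Returns bytes written, or 0 on bad input or overflow. */
static size_t
pack_opt_150(uint8_t *out, size_t cap,
             const uint8_t (*servers)[4], size_t n_servers)
{
    size_t need = 2 + 4 * n_servers;

    if (n_servers == 0 || 4 * n_servers > 255 || need > cap) {
        return 0;
    }
    out[0] = 150;                       /* TFTP server address. */
    out[1] = (uint8_t) (4 * n_servers); /* Length: multiple of 4. */
    memcpy(out + 2, servers, 4 * n_servers);
    return need;
}
```

Encoding the two servers from the test above (10.0.0.4 and 10.0.0.5) yields code 150, length 8, then the eight address bytes.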
> 
> Signed-off-by: Lucas Alvares Gomes 

I applied this series.  Thank you!


Re: [ovs-dev] [PATCH V5 0/2] Do not rewrite fields with the same values as matched

2019-03-07 Thread Ben Pfaff
It's really hard for me to figure out what to do with a patch series
where the author rejects my feedback.

On Thu, Mar 07, 2019 at 02:53:31PM +, Eli Britstein wrote:
> I would really want it to get merged.
> 
> How do you suggest we proceed?
> 
> 
> On 3/1/2019 7:40 AM, Eli Britstein wrote:
> > ping
> >
> > On 2/25/2019 9:42 PM, Eli Britstein wrote:
> >>
> >> On 2/25/2019 7:31 PM, Ben Pfaff wrote:
> >>> On Sun, Feb 24, 2019 at 09:45:21AM +, Eli Britstein wrote:
>  On 2/22/2019 6:42 PM, Ben Pfaff wrote:
> > On Sun, Feb 17, 2019 at 09:18:46AM +, Eli Britstein wrote:
> >> This patch set avoids unnecessary rewrite actions to fields with the
> >> same values as matched on.
> >>
> >> Patch 1 is a pre-step of generating ovs key fields macros
> >> Patch 2 avoids the unnecessary rewrites and adapts the tests 
> >> accordingly
> > Thanks for the revision.
> >
> > Do you foresee other uses of OVS_KEY_FIELD in the future? As is,
> > there's a lot of duplication here from the numerous declarations like
> >
> >   struct ovs_key_field_properties 
> > ovs_key_nd_extensions_properties[] = {
> > #define OVS_KEY_FIELD(type, name) {offsetof(struct 
> > ovs_key_nd_extensions, name), sizeof(type)},
> >   OVS_KEY_ND_EXTENSIONS_FIELDS
> >   {0, 0}
> > #undef OVS_KEY_FIELD
> >   };
> >
> > If this is the only currently foreseen use, it would be better to 
> > have
> > the code generator just generate the declarations directly instead of
> > forcing these later duplications.
>  Please note the definitions are not exactly duplicated - each define
>  refers to a different struct. Currently I don't know what other uses
>  will be needed/done in the future. However, I think we should keep the
>  code generic and flexible, to allow easier reuse of those macros in the
>  future, rather than adding specific macros.
> >>> What uses do you foresee in the future?
> >>
> >> I don't foresee the future. I just think we should keep the code as
> >> generic as possible, so future implementations can more easily reuse
> >> the same infrastructure.
> >>
> >> Just for example, I can think of a log/debug implementation to print 
> >> a struct:
> >>
> >> suppose there is a function: void print_hex(char *buf, int len)
> >>
> >> and you want to print a struct ovs_key_ipv4 ip:
> >>
> >> #define OVS_KEY_FIELD(type, name) 
> >> print_hex(((char*)ip)+offsetof(struct ovs_key_ipv4, name), sizeof(type))
> >>
> >> OVS_KEY_IPV4_FIELDS - will print all the fields, and you can also add 
> >> a title etc.
> >>


Re: [ovs-dev] [PATCH 2/2] ovsdb-idl: Fix memory leak of ovsdb_idl_db_clear.

2019-03-07 Thread Ben Pfaff
When I apply this series I get lots of segfaults in the testsuite, e.g.:

=
==27505==ERROR: AddressSanitizer: SEGV on unknown address 0x0008 (pc 
0x5557321b0f98 bp 0x7fffa7a2d230 sp 0x7fffa7a2d220 T0)
==27505==The signal is caused by a READ memory access.
==27505==Hint: address points to the zero page.
#0 0x5557321b0f97 in hmap_remove ../include/openvswitch/hmap.h:287
#1 0x5557321b0f97 in ovsdb_idl_row_destroy ../lib/ovsdb-idl.c:3106
#2 0x5557321b1ac0 in ovsdb_idl_row_destroy ../lib/ovsdb-idl.c:3104
#3 0x5557321b1ac0 in ovsdb_idl_row_clear_arcs ../lib/ovsdb-idl.c:3043
#4 0x5557321b2198 in ovsdb_idl_delete_row ../lib/ovsdb-idl.c:3207
#5 0x5557321b2e3a in ovsdb_idl_process_update2 ../lib/ovsdb-idl.c:2444
#6 0x5557321b2e3a in ovsdb_idl_db_parse_update__ ../lib/ovsdb-idl.c:2304
#7 0x5557321b2e3a in ovsdb_idl_db_parse_update ../lib/ovsdb-idl.c:2356
#8 0x5557321b429a in ovsdb_idl_db_parse_update_rpc ../lib/ovsdb-idl.c:2187
#9 0x5557321b557a in ovsdb_idl_db_parse_update_rpc ../lib/ovsdb-idl.c:2143
#10 0x5557321b557a in ovsdb_idl_process_msg ../lib/ovsdb-idl.c:822
#11 0x5557321b557a in ovsdb_idl_run ../lib/ovsdb-idl.c:901
#12 0x5557321b557a in ovsdb_idl_run ../lib/ovsdb-idl.c:866
#13 0x555731ed7015 in bridge_run ../vswitchd/bridge.c:2957
#14 0x555731eb9464 in main ../vswitchd/ovs-vswitchd.c:125
#15 0x7f4666eab09a in __libc_start_main ../csu/libc-start.c:308
#16 0x555731ebbe99 in _start 
(/home/blp/nicira/ovs/_build/vswitchd/ovs-vswitchd+0x119e99)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ../include/openvswitch/hmap.h:287 in hmap_remove
==27505==ABORTING


Re: [ovs-dev] [PATCH 2/2] ovsdb raft: Precheck prereq before proposing commit.

2019-03-07 Thread Ben Pfaff
On Mon, Mar 04, 2019 at 01:53:38PM -0800, Han Zhou wrote:
> On Mon, Mar 4, 2019 at 1:31 PM Ben Pfaff  wrote:
> >
> > On Fri, Mar 01, 2019 at 10:56:37AM -0800, Han Zhou wrote:
> > > From: Han Zhou 
> > >
> > > In current OVSDB Raft design, when there are multiple transactions
> > > pending, either from same server node or different nodes in the
> > > cluster, only the first one can be successful at once, and following
> > > ones will fail at the prerequisite check on leader node, because
> > > the first one will update the expected prerequisite eid on leader
> > > node, and the prerequisite used for proposing a commit has to be
> > > committed eid, so it is not possible for a node to use the latest
> > > prerequisite expected by the leader to propose a commit until the
> > > lastest transaction is committed by the leader and updated the
> > > committed_index on the node.
> > >
> > > Current implementation proposes the commit as soon as the transaction
> > > is requested by the client, which results in continously retry which
> > > causes high CPU load and waste.
> > >
> > > Particularly, even if all clients are using leader_only to connect to
> > > only the leader, the prereq check failure still happens a lot when
> > > a batch of transactions are pending on the leader node - the leader
> > > node proposes a batch of commits using the same committed eid as
> > > prerequisite and it updates the expected prereq as soon as the first
> > > one is in progress, but it needs time to append to followers and wait
> > > until majority replies to update the committed_index, which results in
> > > continuous useless retries of the following transactions proposed by
> > > the leader itself.
> > >
> > > This patch doesn't change the design but simply pre-checks whether the
> > > current eid is the same as the prereq before proposing the commit, to
> > > avoid wasting CPU cycles on both leader and followers. When clients use leader_only
> > > mode, this patch completely eliminates the prereq check failures.
> > >
> > > In scale test of OVN with 1k HVs and creating and binding 10k lports,
> > > the patch resulted in 90% CPU cost reduction on leader and >80% CPU cost
> > > reduction on followers. (The test was with leader election base time
> > > set to 1ms, because otherwise the test couldn't complete because
> > > of the frequent leader re-election.)
> > >
> > > This is just one of the related performance problems of the prereq
> > > checking mechanism discussed at:
> > >
> > > https://mail.openvswitch.org/pipermail/ovs-discuss/2019-February/048243.html
> > > Signed-off-by: Han Zhou 
> >
> > I *think* that this patch is going to be unreliable.  It appears to me
> > that what it does is wait until the current eid presented by the raft
> > storage is the one that we want.  But I don't think it's guaranteed that
> > that will ever happen.  What if we lose the raft connection, reconnect,
> > and skip past that particular eid?  I think in that kind of a case we'd
> > keep the trigger around forever and never discard it.
> 
> The function ovsdb_txn_precheck_prereq() compares the db->prereq with
> the current eid from raft storage. Both values can change from
> iteration to iteration. If raft reconnected and skipped the previous
> eid, it shouldn't matter because the function checks the new prereq
> against the new *current* eid.
> 
> In fact, the prereq is the last applied entry, so it should eventually
> catch up with the current eid, unless new changes are always being
> appended to the log before the current node catches up. In that case,
> even without this change, the current node cannot propose any commit
> successfully because it will encounter a prereq check failure. This
> commit just avoids the waste of CPU and bandwidth in that same
> situation - when the current node cannot catch up with the latest
> appended entry.

After reading more code, and your explanation, I now understand.

I applied this to master.
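The precheck the patch adds boils down to comparing the transaction's prerequisite (the last entry this node has applied) against the storage's current eid, and deferring the proposal when they differ. A schematic sketch, with eids modeled as plain integers rather than the UUIDs the real code uses:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Schematic version of the precheck described above: only propose a
 * commit when the expected prerequisite eid matches the storage's
 * current eid; otherwise keep the trigger pending and retry later.
 * eids are modeled as integers instead of UUIDs for brevity. */
struct storage {
    uint64_t current_eid;   /* Last entry appended to the log. */
};

struct txn {
    uint64_t prereq;        /* Last entry this node has applied. */
    int proposals;          /* How many proposals we actually sent. */
};

static bool
txn_try_propose(struct txn *txn, const struct storage *s)
{
    if (txn->prereq != s->current_eid) {
        return false;       /* Would fail the leader's prereq check;
                             * don't waste a round trip retrying now. */
    }
    txn->proposals++;       /* Stand-in for the real proposal path. */
    return true;
}
```

Each main-loop iteration re-evaluates this cheaply, so the trigger is only actually proposed once the node has caught up - which is the CPU saving the commit message measures.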


Re: [ovs-dev] [PATCH 2/2] ovsdb-idl: Fix memory leak of ovsdb_idl_db_clear.

2019-03-07 Thread Han Zhou
On Thu, Mar 7, 2019 at 2:22 PM Ben Pfaff  wrote:
>
> When I apply this series I get lots of segfaults in the testsuite, e.g.:
>
> =
> ==27505==ERROR: AddressSanitizer: SEGV on unknown address 0x0008 (pc 
> 0x5557321b0f98 bp 0x7fffa7a2d230 sp 0x7fffa7a2d220 T0)
> ==27505==The signal is caused by a READ memory access.
> ==27505==Hint: address points to the zero page.
> #0 0x5557321b0f97 in hmap_remove ../include/openvswitch/hmap.h:287
> #1 0x5557321b0f97 in ovsdb_idl_row_destroy ../lib/ovsdb-idl.c:3106
> #2 0x5557321b1ac0 in ovsdb_idl_row_destroy ../lib/ovsdb-idl.c:3104
> #3 0x5557321b1ac0 in ovsdb_idl_row_clear_arcs ../lib/ovsdb-idl.c:3043
> #4 0x5557321b2198 in ovsdb_idl_delete_row ../lib/ovsdb-idl.c:3207
> #5 0x5557321b2e3a in ovsdb_idl_process_update2 ../lib/ovsdb-idl.c:2444
> #6 0x5557321b2e3a in ovsdb_idl_db_parse_update__ ../lib/ovsdb-idl.c:2304
> #7 0x5557321b2e3a in ovsdb_idl_db_parse_update ../lib/ovsdb-idl.c:2356
> #8 0x5557321b429a in ovsdb_idl_db_parse_update_rpc ../lib/ovsdb-idl.c:2187
> #9 0x5557321b557a in ovsdb_idl_db_parse_update_rpc ../lib/ovsdb-idl.c:2143
> #10 0x5557321b557a in ovsdb_idl_process_msg ../lib/ovsdb-idl.c:822
> #11 0x5557321b557a in ovsdb_idl_run ../lib/ovsdb-idl.c:901
> #12 0x5557321b557a in ovsdb_idl_run ../lib/ovsdb-idl.c:866
> #13 0x555731ed7015 in bridge_run ../vswitchd/bridge.c:2957
> #14 0x555731eb9464 in main ../vswitchd/ovs-vswitchd.c:125
> #15 0x7f4666eab09a in __libc_start_main ../csu/libc-start.c:308
> #16 0x555731ebbe99 in _start 
> (/home/blp/nicira/ovs/_build/vswitchd/ovs-vswitchd+0x119e99)
>
> AddressSanitizer can not provide additional info.
> SUMMARY: AddressSanitizer: SEGV ../include/openvswitch/hmap.h:287 in 
> hmap_remove
> ==27505==ABORTING

My apologies. It is caused by the first patch in this series. I can
reproduce easily by running idl tests such as: 2439: set, simple3
idl-compound-index-with-ref, initially populated - C FAILED
(ovsdb-idl.at:1811)

Not sure how I missed this in the first place. Probably I have run
irrelevant tests with "-k" and declared success. Anyways, stupid
mistake.

In fact the two patches in this series are independent. I ran test
with -k idl & -k ovn for patch 2/2 with & without address sanitizer,
and there is only one failure in ovn tests when running with address
sanitizer:

2745: ovn -- 1 LR with distributed router gateway port FAILED (ovn.at:8745)

However, this fails even with master code. Without address sanitizer
it doesn't fail. Running tests with address sanitizer is much slower,
so it could be a timing problem in the test cases. So I should say the patch
2/2 is ready for review.

I will figure out the problem of patch 1/2.

Thanks,
Han


Re: [ovs-dev] [PATCH 2/2] ovsdb-idl: Fix memory leak of ovsdb_idl_db_clear.

2019-03-07 Thread Ben Pfaff
On Thu, Mar 07, 2019 at 04:15:20PM -0800, Han Zhou wrote:
> On Thu, Mar 7, 2019 at 2:22 PM Ben Pfaff  wrote:
> My apologies. It is caused by the first patch in this series. I can
> reproduce easily by running idl tests such as: 2439: set, simple3
> idl-compound-index-with-ref, initially populated - C FAILED
> (ovsdb-idl.at:1811)
> 
> Not sure how I missed this in the first place. Probably I have run
> irrelevant tests with "-k" and declared success. Anyways, stupid
> mistake.

I do that sometimes too.

> In fact the two patches in this series are independent. I ran test
> with -k idl & -k ovn for patch 2/2 with & without address sanitizer,
> and there is only one failure in ovn tests when running with address
> sanitizer:
> 
> 2745: ovn -- 1 LR with distributed router gateway port FAILED (ovn.at:8745)
> 
> However, this fails even with master code. Without address sanitizer
> it doesn't fail. Running tests with address sanitizer is much slower,
> so it could be timing problem in test cases. So I should say the patch
> 2/2 is ready for review.

I do not get a failure with this test, running with (or presumably
without) Address Sanitizer, so I agree that it is likely a timing issue.

> I will figure out the problem of patch 1/2.

OK.  I will review patch 2.


Re: [ovs-dev] [PATCH 2/2] ovsdb-idl: Fix memory leak of ovsdb_idl_db_clear.

2019-03-07 Thread Ben Pfaff
On Tue, Mar 05, 2019 at 06:16:50PM -0800, Han Zhou wrote:
> From: Han Zhou 
> 
> ovsdb_idl_row_destroy() doesn't free the memory of row structure itself.
> This is because of the ovsdb change tracking feature: the deleted row
> may be accessed in the current iteration of main loop. The function
> ovsdb_idl_row_destroy_postprocess() is called at the end of
> ovsdb_idl_run() to free the deleted rows that are not tracked; the
> function ovsdb_idl_db_track_clear() is called (indirectly) by user
> at the end of each main loop iteration to free the deleted rows that
> are tracked. However, in ovsdb_idl_db_clear(), which may be called when
> a session is reset, or when the idl is destroyed, it didn't call
> ovsdb_idl_row_destroy_postprocess(), which would result in all the
> untracked rows leaked. This patch fixes that.
> 
> Signed-off-by: Han Zhou 

Thanks.  I applied this patch (but not patch 1).
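
The lifecycle described in the commit message — deleted rows parked until either ovsdb_idl_run() finishes (untracked rows) or the user clears change tracking (tracked rows) — can be modeled in a few lines of Python. This is an illustrative sketch under invented names (`Idl`, `delete_row`, etc.), not the real IDL API:

```python
class Idl:
    """Toy model of deferred row destruction in the OVSDB IDL
    (invented names; not the real API)."""
    def __init__(self):
        self.deleted_untracked = []  # freed by run()'s postprocess
        self.deleted_tracked = []    # freed by track_clear()
        self.freed = []

    def delete_row(self, row, tracked):
        (self.deleted_tracked if tracked
         else self.deleted_untracked).append(row)

    def run_postprocess(self):
        # models ovsdb_idl_row_destroy_postprocess() at the end of
        # ovsdb_idl_run(): free the untracked deleted rows
        self.freed += self.deleted_untracked
        self.deleted_untracked = []

    def track_clear(self):
        # models ovsdb_idl_db_track_clear(): free the tracked deleted rows
        self.freed += self.deleted_tracked
        self.deleted_tracked = []

    def clear(self):
        # The fix: db_clear must also run the postprocess step, or the
        # untracked deleted rows leak on session reset / idl destroy.
        self.run_postprocess()
        self.track_clear()

idl = Idl()
idl.delete_row("r1", tracked=False)
idl.delete_row("r2", tracked=True)
idl.clear()
```

Without the `run_postprocess()` call inside `clear()`, everything in `deleted_untracked` would never reach `freed` — which is the leak the patch fixes.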


Re: [ovs-dev] [crash] trigger a panic when flush-conntrack

2019-03-07 Thread Li Wei
Hi Darrell:

I set the NAT action with 
actions=ct(nat(src=172.16.1.1-172.255.255.255),commit,table=40)

With your patch, a new double-free panic happens in conntrack_flush() and 
sweep_bucket():

==1st panic==
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `ovs-vswitchd unix:/var/run/openvswitch/db.sock 
-vconsole:emer -vsyslog:err -vfi'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f92b122cb00 (LWP 2387347))]
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x7f92af62a42a in __GI_abort () at abort.c:89
#2  0x7f92af666c00 in __libc_message (do_abort=do_abort@entry=2, 
fmt=fmt@entry=0x7f92af75bd98 "*** Error in `%s': %s: 0x%s ***\n")
at ../sysdeps/posix/libc_fatal.c:175
#3  0x7f92af66cfc6 in malloc_printerr (action=3, str=0x7f92af75bef0 "double 
free or corruption (fasttop)", ptr=, ar_ptr=)
at malloc.c:5049
#4  0x7f92af66d80e in _int_free (av=0x7f8bb420, p=0x7f8bb70ef770, 
have_lock=0) at malloc.c:3905
#5  0x5622735d3f90 in delete_conn (conn=conn@entry=0x7f8bb70ef670) at 
lib/conntrack.c:2380
#6  0x5622735d52ad in nat_clean (ctb=0x7f92b10da7f0, conn=0x7f8bb70ef670, 
ct=0x7f92b10d5d98) at lib/conntrack.c:816
#7  conn_clean (ct=ct@entry=0x7f92b10d5d98, conn=0x7f8bb70ef670, 
ctb=ctb@entry=0x7f92b10da7f0) at lib/conntrack.c:842
#8  0x5622735da710 in conntrack_flush (ct=0x7f92b10d5d98, zone=0x0) at 
lib/conntrack.c:2574
#9  0x5622734bfb39 in ct_dpif_flush (dpif=0x5622758671d0, 
zone=zone@entry=0x0, tuple=tuple@entry=0x0) at lib/ct-dpif.c:140
#10 0x5622735de860 in dpctl_flush_conntrack (argc=argc@entry=1, 
argv=argv@entry=0x56227589cc40, dpctl_p=dpctl_p@entry=0x7fff9087fe90) at 
lib/dpctl.c:1388
#11 0x5622735dbc38 in dpctl_unixctl_handler (conn=0x56227589bc90, argc=1, 
argv=0x56227589cc40, aux=0x5622735de6d0 ) at 
lib/dpctl.c:2312
#12 0x56227357d6ea in process_command (request=, 
conn=0x56227589bc90) at lib/unixctl.c:308
#13 run_connection (conn=0x56227589bc90) at lib/unixctl.c:342
#14 unixctl_server_run (server=0x5622757af230) at lib/unixctl.c:393
#15 0x562273101217 in main (argc=, argv=) at 
vswitchd/ovs-vswitchd.c:126

The debug info in /var/log/openvswitch/ovs-vswitchd.log:
70 2019-03-08T00:54:31.278Z|00227|conntrack|WARN|conn_clean: conn not present 
in hmap: src ip 32.248.14.183 dst ip 222.15.63.163 rev src ip 222.15.63.163 rev 
dst ip 172.112.98.138 src/dst ports 54957/80 rev src/dst ports 80/54957 
zone/rev zone 0/0 nw_proto/rev nw_proto 6/6

==2nd panic==
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x7faece4b642a in __GI_abort () at abort.c:89
#2  0x7faece4f2c00 in __libc_message (do_abort=do_abort@entry=2, 
fmt=fmt@entry=0x7faece5e7d98 "*** Error in `%s': %s: 0x%s ***\n")
at ../sysdeps/posix/libc_fatal.c:175
#3  0x7faece4f8fc6 in malloc_printerr (action=3, str=0x7faece5e7e60 "double 
free or corruption (out)", ptr=, ar_ptr=)
at malloc.c:5049
#4  0x7faece4f980e in _int_free (av=0x7faece81bb00 , 
p=0x7fa11c50ee30, have_lock=0) at malloc.c:3905
#5  0x562c63ac82ad in nat_clean (ctb=0x7faecff65cf8, conn=0x7fa11c50ee40, 
ct=0x7faecff61d98) at lib/conntrack.c:816
#6  conn_clean (ct=ct@entry=0x7faecff61d98, conn=0x7fa11c50ee40, 
ctb=ctb@entry=0x7faecff65cf8) at lib/conntrack.c:842
#7  0x562c63ac8639 in sweep_bucket (limit=3906, now=1287899232, 
ctb=, ct=0x7faecff61d98) at lib/conntrack.c:1421
#8  conntrack_clean (now=1287899232, ct=0x7faecff61d98) at lib/conntrack.c:1457
#9  clean_thread_main (f_=0x7faecff61d98) at lib/conntrack.c:1512
#10 0x562c63a3f48f in ovsthread_wrapper (aux_=) at 
lib/ovs-thread.c:354
#11 0x7faecef76494 in start_thread (arg=0x7faec77fe700) at 
pthread_create.c:333
#12 0x7faece56aacf in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:97

2019-03-08T01:15:16.929Z|00055|conntrack(ct_clean3)|WARN|conn_clean: conn not 
present in hmap: src ip 2.92.142.188 dst ip 222.15.63.163 rev src ip 
222.15.63.163 rev dst ip 172.116.154.125 src/dst ports 23446/80 rev src/dst 
ports 80/23446 zone/rev zone 0/0 nw_proto/rev nw_proto 6/6
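
The two backtraces above show the same connection being freed once from conntrack_flush() and once from the ct_clean sweep. A toy Python model of that pattern (purely illustrative — `Conn`, `conn_clean`, and the `cleaned` flag are invented names, not the OVS data structures): without some guard, two racing cleanup paths free the same entry twice, while a guard makes the second call a no-op.

```python
class Conn:
    """Toy stand-in for a conntrack entry (not OVS's struct conn)."""
    def __init__(self, key):
        self.key = key
        self.cleaned = False   # guard: has cleanup already run?

def conn_clean(table, conn):
    """Remove `conn` from `table` exactly once.

    Returns True if this call performed the cleanup, False if a
    concurrent path (flush vs. sweep) already did it.
    """
    if conn.cleaned:
        return False           # second caller: nothing left to free
    conn.cleaned = True
    table.pop(conn.key, None)  # the "hmap_remove" step
    return True                # the "delete_conn / free" step

table = {}
c = Conn("src/dst 5-tuple")
table[c.key] = c

first = conn_clean(table, c)   # e.g. conntrack_flush()
second = conn_clean(table, c)  # e.g. sweep_bucket() hitting the same conn
```

In the real code the equivalent guard is exclusive ownership of the conn under ctb->lock; this sketch only shows why freeing without such a guard produces the double free seen in the traces.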


Darrell Ball wrote:
> Thanks for your help Solomon
> 
> Can you give the following debug patch a spin ?
> I will continue to try to repo locally.
> 
> diff --git a/lib/conntrack.c b/lib/conntrack.c
> index 5410ab4..6968c03 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -250,6 +250,22 @@ conn_key_cmp(const struct conn_key *key1, const struct conn_key *key2)
>      return 1;
>  }
> 
> +static bool
> +conn_key_present(struct conntrack *ct, struct conntrack_bucket *ctb,
> +                 const struct conn_key *key)
> +    OVS_REQUIRES(ctb->lock)
> +{
> +    struct conn *conn;
> +    uint32_t hash = conn

[ovs-dev] [PATCH] faq: Explain why select groups don't sort out packets evenly.

2019-03-07 Thread Ben Pfaff
This keeps coming up.

Signed-off-by: Ben Pfaff 
---
 Documentation/faq/openflow.rst | 16 
 1 file changed, 16 insertions(+)

diff --git a/Documentation/faq/openflow.rst b/Documentation/faq/openflow.rst
index 4a20255daf21..561d793a6692 100644
--- a/Documentation/faq/openflow.rst
+++ b/Documentation/faq/openflow.rst
@@ -478,6 +478,22 @@ Q: How does OVS divide flows among buckets in an OpenFlow 
"select" group?
 Documentation/group-selection-method-property.txt in the Open vSwitch
 source tree.  (OpenFlow 1.5 support in Open vSwitch is still experimental.)
 
+Q: An OpenFlow "select" group isn't dividing packets evenly among the buckets.
+
+A: When a packet passes through a "select" group, Open vSwitch hashes a
+subset of the fields in the packet, then it maps the hash value to a
+bucket.  This means that packets whose hashed fields are the same will
+always go to the same bucket[*].  More specifically, if you test with a
+single traffic flow, only one bucket will receive any traffic[**].
+Furthermore, statistics and probability mean that testing with a small
+number of flows may still yield an uneven distribution.
+
+[*] Unless its bucket has a watch port or group whose liveness changes
+during the test.
+
+[**] Unless the hash includes fields that vary within a traffic flow, such
+as tcp_flags.
+
 Q: I added a flow to accept packets on VLAN 123 and output them on VLAN 456,
 like so::
 
-- 
2.20.1
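
The behavior the new FAQ entry describes can be demonstrated with a short Python sketch. This is a simplified stand-in — the real datapath computes its own hash over the selected packet fields, not CRC32, and the field names below are illustrative:

```python
import zlib

def bucket_for(packet_fields, n_buckets):
    """Map a packet's hashed header fields to a select-group bucket.

    Packets with identical hashed fields always map to the same bucket,
    which is why a single traffic flow exercises only one bucket.
    """
    key = repr(sorted(packet_fields.items())).encode()
    return zlib.crc32(key) % n_buckets

# One TCP flow: every packet carries the same 5-tuple -> one bucket.
flow = {"ip_src": "10.0.0.1", "ip_dst": "10.0.0.2",
        "tp_src": 34567, "tp_dst": 80, "proto": 6}
buckets = {bucket_for(flow, 4) for _ in range(1000)}
assert len(buckets) == 1

# Many flows do spread across buckets, but a small sample of flows can
# still land unevenly (statistics, not a bug).
spread = {bucket_for(dict(flow, tp_src=p), 4) for p in range(34567, 34587)}
```

Sending more packets on the same flow never changes the bucket; only varying a hashed field (here `tp_src`) moves traffic to other buckets.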



Re: [ovs-dev] [PATCH 1/2] ovsdb-idl: Fix memory leak of ovsdb_idl_delete_row.

2019-03-07 Thread Han Zhou
On Tue, Mar 5, 2019 at 6:17 PM Han Zhou  wrote:
>
> From: Han Zhou 
>
> When a row is deleted, if it is referenced by another row, the
> function ovsdb_idl_row_reparse_backrefs() is called but in that
> function it doesn't destroy the row, because parameter destroy_dsts
> is false when ovsdb_idl_row_reparse_backrefs() calls
> ovsdb_idl_row_clear_arcs().
>
> Signed-off-by: Han Zhou 
> ---
>  lib/ovsdb-idl.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
> index 8cfb201..49fd45c 100644
> --- a/lib/ovsdb-idl.c
> +++ b/lib/ovsdb-idl.c
> @@ -3208,6 +3208,7 @@ ovsdb_idl_delete_row(struct ovsdb_idl_row *row)
>          ovsdb_idl_row_destroy(row);
>      } else {
>          ovsdb_idl_row_reparse_backrefs(row);
> +        ovsdb_idl_row_destroy(row);
>      }
>  }
>
> --
> 2.1.0
>

This patch fails many tests, e.g. set, simple3
idl-compound-index-with-ref, initially populated - C

It causes a SIGSEGV when processing update notifications for deletions
when there are references. E.g. column C1 in row R1 in table T1 refers
to C2 in R2 in T2. When R1 is deleted, both R1 and R2 are deleted
and sent in the same update notification. The IDL processes the R2
deletion first. With this patch, R2 is destroyed after calling
ovsdb_idl_row_reparse_backrefs(row). However, when processing the R1
deletion, the function ovsdb_idl_delete_row() calls
ovsdb_idl_row_clear_arcs(destroy=true) to free the orphan rows.
Somehow R1 still has an arc pointing to R2, so
ovsdb_idl_row_destroy() is called again for R2, which causes the
SIGSEGV, with the backtrace below.

(gdb) bt
#0  0x0043fdd3 in hmap_remove (node=0x16bce30, hmap=0x16b7d88)
at ../include/openvswitch/hmap.h:287
#1  ovsdb_idl_row_destroy (row=0x16bce30) at ../lib/ovsdb-idl.c:3104
#2  0x004400d6 in ovsdb_idl_row_clear_arcs
(row=row@entry=0x16bcf10, destroy_dsts=destroy_dsts@entry=true) at
../lib/ovsdb-idl.c:3041
#3  0x0044028e in ovsdb_idl_delete_row
(row=row@entry=0x16bcf10) at ../lib/ovsdb-idl.c:3205
#4  0x00441392 in ovsdb_idl_process_update2
(json_row=0x16c2370, operation=0x469d7e "delete", uuid=0x70219980,
table=0x16b7c90) at ../lib/ovsdb-idl.c:2442
#5  ovsdb_idl_db_parse_update__ (method=(unknown: 23866864),
table_updates=, db=) at
../lib/ovsdb-idl.c:2302
#6  ovsdb_idl_db_parse_update (db=db@entry=0x16b7348,
table_updates=0x16bd1a0,
method=method@entry=OVSDB_IDL_MM_MONITOR_COND_SINCE) at
../lib/ovsdb-idl.c:2354
#7  0x00441b6e in ovsdb_idl_db_parse_update_rpc
(db=db@entry=0x16b7348, msg=msg@entry=0x16bd290) at
../lib/ovsdb-idl.c:2185
#8  0x00441f0c in ovsdb_idl_db_parse_update_rpc
(msg=0x16bd290, db=0x16b7348) at ../lib/ovsdb-idl.c:900
#9  ovsdb_idl_process_msg (msg=0x16bd290, idl=0x16b7290) at
../lib/ovsdb-idl.c:820
#10 ovsdb_idl_run (idl=0x16b7290) at ../lib/ovsdb-idl.c:899
#11 0x0044682c in ovsdb_idl_txn_commit_block
(txn=txn@entry=0x16bb600) at ../lib/ovsdb-idl.c:4285
#12 0x00407f8a in do_idl_compound_index_with_ref
(ctx=) at ../tests/test-ovsdb.c:2804
#13 0x00430a84 in ovs_cmdl_run_command__
(ctx=ctx@entry=0x70219b80, commands=commands@entry=0x69a800
, read_only=read_only@entry=false)
at ../lib/command-line.c:223
#14 0x00431047 in ovs_cmdl_run_command
(ctx=ctx@entry=0x70219b80, commands=commands@entry=0x69a800
) at ../lib/command-line.c:254
#15 0x0040530c in main (argc=,
argv=0x70219c98) at ../tests/test-ovsdb.c:76

My original understanding was that ovsdb_idl_row_reparse_backrefs(row)
cuts the link between R1 and R2 when R2 is being deleted.
However, it turns out this function unparses R1, clears the arcs from
R1 to R2 without destroying R2 (i.e.
ovsdb_idl_row_clear_arcs(destroy=false)), and then parses R1 again,
which adds the arcs back. So it seems it doesn't cut any arcs, and I am
totally confused about the purpose of ovsdb_idl_row_reparse_backrefs()
if it doesn't change anything at all.

And still, it seems a memory leak would happen if R2 (the one being
referenced) is deleted but R1 never is, because the only
chance for R2 to get destroyed is when R1 is deleted, through
ovsdb_idl_row_clear_arcs(destroy=true).

Did I misunderstand anything? Could someone who is familiar with this
code help explain a bit?

Thanks,
Han
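
The failure mode described above is an unconditional second destroy, reached through R1's surviving arc. A toy Python model of it (illustrative only — `Row`, `clear_arcs`, and the `destroyed` flag are invented, and the guard shown is one way such a double destroy could be avoided, not necessarily how the IDL should be fixed):

```python
class Row:
    """Toy model of an IDL row (invented; not struct ovsdb_idl_row)."""
    def __init__(self, name):
        self.name = name
        self.arcs = []          # forward references, e.g. R1 -> R2
        self.destroyed = False

    def destroy(self):
        # The real ovsdb_idl_row_destroy() does an hmap_remove() here;
        # reaching it twice for one row is the reported SIGSEGV.
        assert not self.destroyed, "double destroy of " + self.name
        self.destroyed = True

def clear_arcs(row, destroy_dsts):
    """Drop row's forward arcs, optionally destroying the targets."""
    for dst in row.arcs:
        if destroy_dsts and not dst.destroyed:
            dst.destroy()       # guard skips rows destroyed earlier
    row.arcs = []

r2 = Row("R2")
r1 = Row("R1")
r1.arcs.append(r2)

r2.destroy()                       # eager destroy, as in patch 1
clear_arcs(r1, destroy_dsts=True)  # R1's deletion reaches R2 again
```

Without the `not dst.destroyed` check, the second `destroy()` would trip — the same shape as calling ovsdb_idl_row_destroy() twice for R2.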


[ovs-dev] [PATCH v3] ovs-tcpdump: add dump_cmd checker before _doexec()

2019-03-07 Thread txfh2007 via dev
Hi Aaron:
Thanks for your kind suggestion. I have tried your proposal and tested it in 
my repository. This is the new version.

Add dump_cmd executable checker in ovs-tcpdump

Signed-off-by: Liu Chang 

diff --git a/utilities/ovs-tcpdump.in b/utilities/ovs-tcpdump.in
index 269c252..911c608 100755
--- a/utilities/ovs-tcpdump.in
+++ b/utilities/ovs-tcpdump.in
@@ -371,6 +371,9 @@ def argv_tuples(lst):
         except StopIteration:
             pass
 
+def py_which(executable):
+    return any(os.access(os.path.join(path, executable), os.X_OK)
+               for path in os.environ["PATH"].split(os.pathsep))
 
 def main():
     db_sock = 'unix:@RUNDIR@/db.sock'
@@ -416,6 +419,10 @@ def main():
         print("Error: must at least specify an interface with '-i' option")
         sys.exit(1)
 
+    if not py_which(dump_cmd):
+        print("Error: unable to execute '%s' (check PATH)" % dump_cmd)
+        sys.exit(1)
+
     if '-l' not in tcpdargs:
         tcpdargs.insert(0, '-l')


Re: [ovs-dev] [PATCH 1/2] ovsdb-idl: Fix memory leak of ovsdb_idl_delete_row.

2019-03-07 Thread Han Zhou
On Thu, Mar 7, 2019 at 7:45 PM Han Zhou  wrote:
>
> On Tue, Mar 5, 2019 at 6:17 PM Han Zhou  wrote:
> >
> > From: Han Zhou 
> >
> > When a row is deleted, if it is referenced by another row, the
> > function ovsdb_idl_row_reparse_backrefs() is called but in that
> > function it doesn't destroy the row, because parameter destroy_dsts
> > is false when ovsdb_idl_row_reparse_backrefs() calls
> > ovsdb_idl_row_clear_arcs().
> >
> > Signed-off-by: Han Zhou 
> > ---
> >  lib/ovsdb-idl.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
> > index 8cfb201..49fd45c 100644
> > --- a/lib/ovsdb-idl.c
> > +++ b/lib/ovsdb-idl.c
> > @@ -3208,6 +3208,7 @@ ovsdb_idl_delete_row(struct ovsdb_idl_row *row)
> >  ovsdb_idl_row_destroy(row);
> >  } else {
> >  ovsdb_idl_row_reparse_backrefs(row);
> > +ovsdb_idl_row_destroy(row);
> >  }
> >  }
> >
> > --
> > 2.1.0
> >
>
> This patch fails many tests, e.g. set, simple3
> idl-compound-index-with-ref, initially populated - C
>
> It causes SIGSEGV when processing update notification for deletions
> when there are references. E.g. column C1 in row R1 in table T1 refers
> to C2 in R2 in T2. When R1 is deleted, both R1 and R2 will be deleted
> and sent in the same update notification. The IDL processes R2
> deletion first. With this patch, it is destroyed after calling
> ovsdb_idl_row_reparse_backrefs(row). However, when processing R1
> deletion, the function ovsdb_idl_delete_row() calls
> ovsdb_idl_row_clear_arcs(destroy=true) to free the orphan rows.
> Somehow R1 still has arc pointing to R2, and so it is calling
> ovsdb_idl_row_destroy() again for R2, which causes SIGSEGV, with below
> backtrace.
>
> (gdb) bt
> #0  0x0043fdd3 in hmap_remove (node=0x16bce30, hmap=0x16b7d88)
> at ../include/openvswitch/hmap.h:287
> #1  ovsdb_idl_row_destroy (row=0x16bce30) at ../lib/ovsdb-idl.c:3104
> #2  0x004400d6 in ovsdb_idl_row_clear_arcs
> (row=row@entry=0x16bcf10, destroy_dsts=destroy_dsts@entry=true) at
> ../lib/ovsdb-idl.c:3041
> #3  0x0044028e in ovsdb_idl_delete_row
> (row=row@entry=0x16bcf10) at ../lib/ovsdb-idl.c:3205
> #4  0x00441392 in ovsdb_idl_process_update2
> (json_row=0x16c2370, operation=0x469d7e "delete", uuid=0x70219980,
> table=0x16b7c90) at ../lib/ovsdb-idl.c:2442
> #5  ovsdb_idl_db_parse_update__ (method=(unknown: 23866864),
> table_updates=, db=) at
> ../lib/ovsdb-idl.c:2302
> #6  ovsdb_idl_db_parse_update (db=db@entry=0x16b7348,
> table_updates=0x16bd1a0,
> method=method@entry=OVSDB_IDL_MM_MONITOR_COND_SINCE) at
> ../lib/ovsdb-idl.c:2354
> #7  0x00441b6e in ovsdb_idl_db_parse_update_rpc
> (db=db@entry=0x16b7348, msg=msg@entry=0x16bd290) at
> ../lib/ovsdb-idl.c:2185
> #8  0x00441f0c in ovsdb_idl_db_parse_update_rpc
> (msg=0x16bd290, db=0x16b7348) at ../lib/ovsdb-idl.c:900
> #9  ovsdb_idl_process_msg (msg=0x16bd290, idl=0x16b7290) at
> ../lib/ovsdb-idl.c:820
> #10 ovsdb_idl_run (idl=0x16b7290) at ../lib/ovsdb-idl.c:899
> #11 0x0044682c in ovsdb_idl_txn_commit_block
> (txn=txn@entry=0x16bb600) at ../lib/ovsdb-idl.c:4285
> #12 0x00407f8a in do_idl_compound_index_with_ref
> (ctx=) at ../tests/test-ovsdb.c:2804
> #13 0x00430a84 in ovs_cmdl_run_command__
> (ctx=ctx@entry=0x70219b80, commands=commands@entry=0x69a800
> , read_only=read_only@entry=false)
> at ../lib/command-line.c:223
> #14 0x00431047 in ovs_cmdl_run_command
> (ctx=ctx@entry=0x70219b80, commands=commands@entry=0x69a800
> ) at ../lib/command-line.c:254
> #15 0x0040530c in main (argc=,
> argv=0x70219c98) at ../tests/test-ovsdb.c:76
>
> My original understanding was that ovsdb_idl_row_reparse_backrefs(row)
> is to cut the link between R1 and R2 when R2 is being deleted.
> However, it turns out in this function it unparse R1, clear arcs from
> R1 to R2 without destroying R2 (i.e.
> ovsdb_idl_row_clear_arcs(destroy=false)), and then it parse R1 again,
> which adds the arcs back. So it seems doesn't cut any arcs, and I am
> totally confused about the purpose of this function
> ovsdb_idl_row_reparse_backrefs() if it doesn't change anything at all.
>
> And still, it seems the memory leak would happen if R2 (the one being
> referenced) is deleted but R1 is never deleted, because the only
> chance that R2 gets destroyed is when R1 is deleted, though
> ovsdb_idl_row_clear_arcs(destroy=true).
>
> Did I misunderstand anything? Could someone who is familiar with this
> code help explain a little bit?
>
I think I understand it now. There is no memory leak here, but the
logic is a bit tricky. The purpose of ovsdb_idl_row_reparse_backrefs() is
not to cut the arc between R1 and R2, but to make sure R2 disappears
from R1's C structure. The actual destruction of R2 is deferred to when R1
is deleted. If R1 -> R2 is a weak reference and R2 is deleted without
deleting R1, then there will be a "modify" operation for R1, involved
in the same update notification, whic