Re: [ovs-dev] OVN meeting report

2017-04-14 Thread Valentine Sinitsyn

Hi Ben,

On 13.04.2017 20:53, Ben Pfaff wrote:

On Wed, Apr 12, 2017 at 06:09:28PM +0500, Valentine Sinitsyn wrote:

Hi,

On 04.04.2017 15:29, Valentine Sinitsyn wrote:

On 03.04.2017 20:29, Valentine Sinitsyn wrote:

Hi Ben,

On 23.03.2017 08:11, Ben Pfaff wrote:

Hello everyone.  I am not sure whether I am going to be able to attend
the OVN meeting tomorrow, because I will be in another possibly
distracting meeting, so I'm going to give my report here.

Toward the end of last week I did a full pass of reviews through
patchwork.  The most notable result, I think, is that I applied patches
that add 802.1ad support.  For OVN, this makes it more reasonable to
consider adding support for tagged logical ports--currently, OVN drops
all tagged logical packets--which I've heard requested once or twice,
because it means that they can now be gatewayed to physical ports within
an outer VLAN.  I don't have any plans to work on that, but I think that
it is worth pointing out.

The OVS "Open Source Day" talks have been scheduled at OpenStack
Boston.  They are all on Wednesday:
https://www.openstack.org/summit/boston-2017/summit-schedule/#track=135

I've been spending what dev time I have on database clustering.  Today,
I managed to get it working, with many caveats.  It will take weeks or
months longer to get it finished, tested, and ready for posting.  (If
you want what I have, check out the raft3 branch in my ovs-reviews repo
at github.)

I've checked out your raft3 branch, and even learned how to create an
OVSDB cluster. Thanks for the docs!

What I don't get though is how do I instruct IDL to connect to the
cluster now? Do I just connect to a random server, or there should be
some dispatcher, or whatever?

OK I see this is an ongoing work in your branch.


I had some time to play with raft3 branch last week.

I added very basic and hacky replica set support to IDL and brought up an
OVN setup with clustered southbound database. It works to some extent, yet
if I try to throw several hundreds of logical ports into the mix, the
database becomes inconsistent. The reason is probably the race window
between when the raft leader appends a log entry to other nodes (so a client
such as ovn-northd already sees it) and the entry really appears in the
leader's log itself. Not sure if it is my bug or not. The original code had
some minor issues as well (which is absolutely normal for WIP) - I can send
my (rather trivial) patches if there is any interest.


I'm not surprised that there are inconsistency bugs.  The testing I've
done so far is really sketchy.  Let me assure you that I will implement
much more thorough testing before I will propose anything to be merged.

Sure, I didn't expect it to be bug free either.




Is there some design outline for the missing implementation bits?
Specifically, it would be good to know the following:

1. With clustered OVSDB, a client such as IDL needs two JSON RPC
connections: to the leader (to commit transactions), and a read-only one to
an arbitrary replica set (scaling reads). Will it be implemented on
ovsdb_idl level or encapsulated inside jsonrpc_session? The former seems
natural yet multiple remotes support went to jsonrpc_session already.


There are multiple possible approaches here.  The one that I am planning
to try out first is to have a client connect to only one randomly
selected server, and then have that server be responsible for relaying
write transactions to the leader.
Yes, this is an option. However, our tests suggest that ovsdb-server 
doesn't scale well with respect to (hundreds to thousands) connections. 
This relay approach adds at most one new connection within the cluster 
per new client connection, which could be a bottleneck.


Thanks,
Valentine




2. How does the client know which replica set member is currently a leader?
I just loop over remotes until one accepts the transaction (which is an
awful idea). It would be nice to send some sort of cluster metadata snapshot
to JSON RPC client during initial handshake. Alternatively, one can extend
the "not leader" error object with a leader URL.


If we do adopt the idea that followers relay write transactions to the
leader, then the client doesn't need to know the leader.  But if that
isn't practical, then the Raft thesis, section 6.2, suggests the same
idea as you did, of having the follower point to the leader if it knows
it.
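
To illustrate the "not leader" redirect idea from item 2, here is a minimal sketch in Python. It is not OVSDB or IDL code: FakeServer, NotLeader, leader_hint and commit() are hypothetical names made up for this example, and the cluster is simulated in memory. It only shows a client following a leader hint instead of probing every remote.

```python
class NotLeader(Exception):
    """Raised by a follower; may carry a hint naming the current leader."""

    def __init__(self, leader_hint=None):
        super().__init__("not leader")
        self.leader_hint = leader_hint


class FakeServer:
    """Stand-in for one cluster member (hypothetical, illustration only)."""

    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster  # shared dict such as {"leader": "s2"}

    def transact(self, txn):
        leader = self.cluster["leader"]
        if leader != self.name:
            # A follower refuses the write but says who the leader is.
            raise NotLeader(leader_hint=leader)
        return {"committed_by": self.name, "txn": txn}


def commit(servers, txn, start, max_hops=3):
    """Send txn to `start`, then follow leader hints for up to max_hops tries."""
    target = start
    for _ in range(max_hops):
        try:
            return servers[target].transact(txn)
        except NotLeader as err:
            if err.leader_hint is None:
                raise  # follower does not know the leader; caller must retry
            target = err.leader_hint
    raise RuntimeError("leader kept changing; giving up")


cluster = {"leader": "s2"}
servers = {name: FakeServer(name, cluster) for name in ("s1", "s2", "s3")}
result = commit(servers, {"op": "insert"}, start="s1")
print(result["committed_by"])
```

With a hint, the client needs at most one extra round trip while the leader is stable; without one, it falls back to trying the remotes in turn.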


3. For eventual consistency reasons, if an IDL reads from one member (A) but
writes to another one (B), it can try to delete a row not yet in A's
database. This would make all further requests fail with "inconsistent data"
error and basically is what I observe in my tests. How do you plan to
overcome this?


This sounds like a bug in the existing code (not too surprising).  What
is supposed to happen is that the client waits until it receives updated
data from the server, which it knows will eventually arrive because it
knows that its write wa

Re: [ovs-dev] OVN meeting report

2017-04-12 Thread Valentine Sinitsyn

Hi,

On 04.04.2017 15:29, Valentine Sinitsyn wrote:

On 03.04.2017 20:29, Valentine Sinitsyn wrote:

Hi Ben,

On 23.03.2017 08:11, Ben Pfaff wrote:

Hello everyone.  I am not sure whether I am going to be able to attend
the OVN meeting tomorrow, because I will be in another possibly
distracting meeting, so I'm going to give my report here.

Toward the end of last week I did a full pass of reviews through
patchwork.  The most notable result, I think, is that I applied patches
that add 802.1ad support.  For OVN, this makes it more reasonable to
consider adding support for tagged logical ports--currently, OVN drops
all tagged logical packets--which I've heard requested once or twice,
because it means that they can now be gatewayed to physical ports within
an outer VLAN.  I don't have any plans to work on that, but I think that
it is worth pointing out.

The OVS "Open Source Day" talks have been scheduled at OpenStack
Boston.  They are all on Wednesday:
https://www.openstack.org/summit/boston-2017/summit-schedule/#track=135

I've been spending what dev time I have on database clustering.  Today,
I managed to get it working, with many caveats.  It will take weeks or
months longer to get it finished, tested, and ready for posting.  (If
you want what I have, check out the raft3 branch in my ovs-reviews repo
at github.)

I've checked out your raft3 branch, and even learned how to create an
OVSDB cluster. Thanks for the docs!

What I don't get though is how do I instruct IDL to connect to the
cluster now? Do I just connect to a random server, or there should be
some dispatcher, or whatever?

OK I see this is an ongoing work in your branch.


I had some time to play with raft3 branch last week.

I added very basic and hacky replica set support to IDL and brought up 
an OVN setup with clustered southbound database. It works to some 
extent, yet if I try to throw several hundreds of logical ports into the 
mix, the database becomes inconsistent. The reason is probably the race 
window between when the raft leader appends a log entry to other nodes 
(so a client such as ovn-northd already sees it) and the entry really 
appears in the leader's log itself. Not sure if it is my bug or not. The 
original code had some minor issues as well (which is absolutely normal 
for WIP) - I can send my (rather trivial) patches if there is any interest.


Is there some design outline for the missing implementation bits? 
Specifically, it would be good to know the following:


1. With clustered OVSDB, a client such as IDL needs two JSON RPC 
connections: to the leader (to commit transactions), and a read-only one 
to an arbitrary replica set (scaling reads). Will it be implemented on 
ovsdb_idl level or encapsulated inside jsonrpc_session? The former seems 
natural yet multiple remotes support went to jsonrpc_session already.


2. How does the client know which replica set member is currently a 
leader? I just loop over remotes until one accepts the transaction 
(which is an awful idea). It would be nice to send some sort of cluster 
metadata snapshot to JSON RPC client during initial handshake. 
Alternatively, one can extend the "not leader" error object with a 
leader URL.


3. For eventual consistency reasons, if an IDL reads from one member (A) 
but writes to another one (B), it can try to delete a row not yet in A's 
database. This would make all further requests fail with "inconsistent 
data" error and basically is what I observe in my tests. How do you plan 
to overcome this?
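
The stale-read situation in item 3 can be pictured with a toy model. The sketch below is only an assumption about one possible guard, not how the raft3 branch or the IDL works: the client remembers the log index of its last acknowledged write and treats its read replica as current only once the replica has applied that index. Replica, Client, applied_index and last_write_index are hypothetical names.

```python
class Replica:
    """Toy replica that applies log entries strictly in order."""

    def __init__(self):
        self.rows = set()
        self.applied_index = 0

    def apply(self, index, op, row):
        if index != self.applied_index + 1:
            raise ValueError("log entries must be applied in order")
        if op == "insert":
            self.rows.add(row)
        else:
            self.rows.discard(row)
        self.applied_index = index


class Client:
    """Reads from one replica (A) while its writes are committed elsewhere (B)."""

    def __init__(self, read_replica):
        self.read_replica = read_replica
        self.last_write_index = 0

    def note_commit(self, index):
        # Called when the write side acknowledges a transaction at `index`.
        self.last_write_index = max(self.last_write_index, index)

    def view_is_current(self):
        # Safe to act on local data only after A has caught up with our writes.
        return self.read_replica.applied_index >= self.last_write_index


log = [(1, "insert", "port1"), (2, "insert", "port2")]
a = Replica()
client = Client(a)
client.note_commit(2)              # write acknowledged at log index 2
a.apply(*log[0])
stale = client.view_is_current()   # A has applied only index 1
a.apply(*log[1])
fresh = client.view_is_current()   # A has caught up
print(stale, fresh)
```

While view_is_current() is false, the client would hold back any transaction that depends on rows it has written but not yet seen.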


Thanks in advance!

Valentine



Best,
Valentine



Thanks,
Valentine


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev



--
Best regards,
Valentine Sinitsyn


Re: [ovs-dev] OVN meeting report

2017-04-04 Thread Valentine Sinitsyn

On 03.04.2017 20:29, Valentine Sinitsyn wrote:

Hi Ben,

On 23.03.2017 08:11, Ben Pfaff wrote:

Hello everyone.  I am not sure whether I am going to be able to attend
the OVN meeting tomorrow, because I will be in another possibly
distracting meeting, so I'm going to give my report here.

Toward the end of last week I did a full pass of reviews through
patchwork.  The most notable result, I think, is that I applied patches
that add 802.1ad support.  For OVN, this makes it more reasonable to
consider adding support for tagged logical ports--currently, OVN drops
all tagged logical packets--which I've heard requested once or twice,
because it means that they can now be gatewayed to physical ports within
an outer VLAN.  I don't have any plans to work on that, but I think that
it is worth pointing out.

The OVS "Open Source Day" talks have been scheduled at OpenStack
Boston.  They are all on Wednesday:
https://www.openstack.org/summit/boston-2017/summit-schedule/#track=135

I've been spending what dev time I have on database clustering.  Today,
I managed to get it working, with many caveats.  It will take weeks or
months longer to get it finished, tested, and ready for posting.  (If
you want what I have, check out the raft3 branch in my ovs-reviews repo
at github.)

I've checked out your raft3 branch, and even learned how to create an
OVSDB cluster. Thanks for the docs!

What I don't get though is how do I instruct IDL to connect to the
cluster now? Do I just connect to a random server, or there should be
some dispatcher, or whatever?

OK I see this is an ongoing work in your branch.

Best,
Valentine



Thanks,
Valentine




Re: [ovs-dev] OVN meeting report

2017-04-03 Thread Valentine Sinitsyn

Hi Ben,

On 23.03.2017 08:11, Ben Pfaff wrote:

Hello everyone.  I am not sure whether I am going to be able to attend
the OVN meeting tomorrow, because I will be in another possibly
distracting meeting, so I'm going to give my report here.

Toward the end of last week I did a full pass of reviews through
patchwork.  The most notable result, I think, is that I applied patches
that add 802.1ad support.  For OVN, this makes it more reasonable to
consider adding support for tagged logical ports--currently, OVN drops
all tagged logical packets--which I've heard requested once or twice,
because it means that they can now be gatewayed to physical ports within
an outer VLAN.  I don't have any plans to work on that, but I think that
it is worth pointing out.

The OVS "Open Source Day" talks have been scheduled at OpenStack
Boston.  They are all on Wednesday:
https://www.openstack.org/summit/boston-2017/summit-schedule/#track=135

I've been spending what dev time I have on database clustering.  Today,
I managed to get it working, with many caveats.  It will take weeks or
months longer to get it finished, tested, and ready for posting.  (If
you want what I have, check out the raft3 branch in my ovs-reviews repo
at github.)
I've checked out your raft3 branch, and even learned how to create an 
OVSDB cluster. Thanks for the docs!


What I don't get though is how do I instruct IDL to connect to the 
cluster now? Do I just connect to a random server, or there should be 
some dispatcher, or whatever?


Thanks,
Valentine




Re: [ovs-dev] [PATCH] ofproto-dpif-xlate: Don't save pkt_mark in compose_output_action__().

2017-03-18 Thread Valentine Sinitsyn

Hi,

On 17.03.2017 22:55, Ben Pfaff wrote:

Previously, this function could modify the pkt_mark field as part of IPsec
integration.  It no longer does that, so there's no longer any need for it
to save and restore pkt_mark, and this commit removes that.
Does it mean that now there is no way to send a bit of information 
across a pair of patch ports, that is, mark a packet on bridge A and 
check the mark on bridge B?


Thanks,
Valentine



CC: Ansis Atteka 
Signed-off-by: Ben Pfaff 
---
 ofproto/ofproto-dpif-xlate.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 1a82b8d569be..9fe778a32857 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3265,7 +3265,6 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port,
 struct flow *flow = &ctx->xin->flow;
 struct flow_tnl flow_tnl;
 union flow_vlan_hdr flow_vlans[FLOW_MAX_VLAN_HEADERS];
-uint32_t flow_pkt_mark;
 uint8_t flow_nw_tos;
 odp_port_t out_port, odp_port;
 bool tnl_push_pop_send = false;
@@ -3460,7 +3459,6 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port,
 }

 memcpy(flow_vlans, flow->vlans, sizeof flow_vlans);
-flow_pkt_mark = flow->pkt_mark;
 flow_nw_tos = flow->nw_tos;

 if (count_skb_priorities(xport)) {
@@ -3588,7 +3586,6 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port,
  out:
 /* Restore flow */
 memcpy(flow->vlans, flow_vlans, sizeof flow->vlans);
-flow->pkt_mark = flow_pkt_mark;
 flow->nw_tos = flow_nw_tos;
 }





Re: [ovs-dev] Reproducing ovn-scale-test results

2017-03-18 Thread Valentine Sinitsyn

Hi Han,

On 17.03.2017 23:36, Han Zhou wrote:


On Fri, Mar 17, 2017 at 2:50 AM, Valentine Sinitsyn
<valentine.sinit...@gmail.com> wrote:

Did you restart controllers or ovn-northd after running the full test
before binding more ports? It would be interesting to learn how long
it takes to warm up the IDL in controllers/northd in your setup.



No need to restart. It will cool down when test ends. On top of the
scale, we can just test creating & binding 500 lports and then tear down
the 500 lports before next test, which takes much less time than the
full run from empty.



Moreover, in current ovs-scale-test code, the step "wait 100 lport up"
is updated utilizing a new feature (wait for HVs to catch up) that was


Are you referring to ovn-nbctl --wait-until, or something else? If you
were not using it for the talk, what exactly does "create + bind" mean
on the graph?



--wait-until has been used always, to wait lport state become "up" in
NB, which means port binding is reflected in NB. The new change [1] was
using the new feature of ovn-nbctl: "sync --wait=hv" which will wait
until the port binding to be processed on all HVs. This feature was not
there yet when we had the talk, and no reasonable alternative to achieve
the same.

Got you now. This may indeed affect the test negatively, as OVSDB seems
to scale poorly with the number of clients, as our tests suggest.


Thank you again.

Valentine


[1]
https://github.com/openvswitch/ovn-scale-test/commit/0ece1038de45f05f461b45162b21a8bde2793010



Re: [ovs-dev] Reproducing ovn-scale-test results

2017-03-17 Thread Valentine Sinitsyn

Hi,

On 17.03.2017 02:24, Han Zhou wrote:



On Thu, Mar 16, 2017 at 1:06 PM, Valentine Sinitsyn
<valentine.sinit...@gmail.com> wrote:


Hi Han,

Thanks for the quick answer.

On 17.03.2017 00:34, Han Zhou wrote:


On Thu, Mar 16, 2017 at 3:58 AM, Valentine Sinitsyn
<valentine.sinit...@gmail.com> wrote:



Hi all,

We are doing some stress testing on OVN 2.7, and wanted to reproduce
results from the talk [1]. Looking at ovn-scale-test sources, I have two
questions:

Hi Valentine,

Thanks for picking this up.

- Do I get correctly that the benchmark always starts with the empty
northbound db. Then lswitches are added, then you add ports to each
lswitch?


Yes, the test result shown in the talk was started from empty to
gradually reach 20k lports on 200 lswitches.


- What is the batch size in port_create_args?



I remember it was 100. In addition, there were 5 jobs running in parallel.
+Lei to confirm.

Could you recall approximately how long it takes to create and
bind 20K ports with these settings? This would be really helpful.




I don't have the raw data now, but it took around 1 - 2 hours.
We don't always run the full test, but after the full test is completed,
we can just run another task to create and bind 1k more lports to
evaluate the optimizations in each iteration on top of the existing scale.

Did you restart controllers or ovn-northd after running the full test
before binding more ports? It would be interesting to learn how long
it takes to warm up the IDL in controllers/northd in your setup.




One more thing, the graph shared also involved sandbox (simulated HV)
creation and lswitch creation. They were all created gradually during
the test run.
The flow was like:
1. create 50 sandboxes
2. (5 jobs in parallel) create 1 lswitch, create 100 lports, bind 100
lports, wait 100 lport up
3. if there are 100 sandboxes already on the BM, switch to another BM
4. goto step1, until it is done for all 20 BMs.
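
As a sanity check on the flow above, the loop can be simulated to see what totals it implies. This is a sketch in Python under one reading of the steps (an assumption, not taken from the ovn-scale-test sources): each round creates 50 sandboxes and each of the 5 parallel jobs creates one lswitch with 100 lports. The constant names are made up for the example.

```python
BMS = 20                   # bare-metal hosts ("BMs") in the test
SANDBOXES_PER_BM = 100     # step 3: switch BM once it holds 100 sandboxes
SANDBOXES_PER_ROUND = 50   # step 1
PARALLEL_JOBS = 5          # step 2
LPORTS_PER_LSWITCH = 100   # step 2

rounds = sandboxes = lswitches = lports = 0
for _bm in range(BMS):
    on_this_bm = 0
    while on_this_bm < SANDBOXES_PER_BM:
        # Step 1: create a batch of sandboxes on the current BM.
        on_this_bm += SANDBOXES_PER_ROUND
        sandboxes += SANDBOXES_PER_ROUND
        # Step 2: every job creates one lswitch and binds its lports.
        lswitches += PARALLEL_JOBS
        lports += PARALLEL_JOBS * LPORTS_PER_LSWITCH
        rounds += 1

print(rounds, sandboxes, lswitches, lports)
```

Under this reading the run ends with 200 lswitches and 20000 lports, which agrees with the "20k lports on 200 lswitches" figure quoted earlier in the thread.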

Moreover, in current ovs-scale-test code, the step "wait 100 lport up"
is updated utilizing a new feature (wait for HVs to catch up) that was
Are you referring to ovn-nbctl --wait-until, or something else? If you 
were not using it for the talk, what exactly does "create + bind" mean 
on the graph?


Many thanks again.

Valentine


added after the report, and we didn't run the test again yet with this
change. I would expect it to impact the test result slightly negatively,
but it would be more accurate.





In short: is it true that for the setup involving (say) 10000 ports
spread over 100 lswitches in the aforementioned test, a Rally task
would look like this?



{
"version": 2,
"title": "Create and bind port",
"subtasks": [{
"title": "Create and bind port",
"workloads": [{
"name": "OvnNetwork.create_and_bind_ports",
"args": {
"network_create_args": {
"amount": 100,
"batch": 1,
"start_cidr": "172.16.1.0/24",
"physical_network": "providernet"
},
"port_create_args" : {"batch": 2},
"ports_per_network": 100,
"port_bind_args": {"wait_up": true}
},
"runner": {
"type": "serial","times": 1},
"context": {
   "ovn_multihost" : {
"controller": "ovn-controller-node"
},
"sandbox":{ "tag": "ToR1"}
}
}]
}]
}

1. https://youtu.be/okralc7LrZo?t=1185

Thanks,
Valentine





--
Best regards,
Valentine Sinitsyn





Re: [ovs-dev] Multi-threaded OVSDB

2017-03-17 Thread Valentine Sinitsyn

Hi Andy,

On 17.03.2017 02:28, Andy Zhou wrote:

On Thu, Mar 16, 2017 at 1:52 PM, Ben Pfaff <b...@ovn.org> wrote:

On Thu, Mar 16, 2017 at 11:38:19PM +0500, Valentine Sinitsyn wrote:

On 16.03.2017 20:56, Ben Pfaff wrote:

On Tue, Mar 14, 2017 at 07:08:54PM +0500, Valentine Sinitsyn wrote:

Recently, I was evaluating a multi-threaded OVSDB/ovn-northd design, and
came across the patchset [1].

Looks like this RFC patchset was received well, but never completed. What's
the reason? No real performance benefits, lack of interest, other
high-priority tasks or whatever?


It's kind of a combination of those.  Andy got preempted by other
higher-priority work, plus it's unclear whether threading ovsdb-server
solves an important problem at this time.  I'm currently working on
adding clustering support to OVSDB, which ought to allow scaling out
reads, which are most of the OVN workload, so that might solve the same
problem in a different way.

This sounds promising. Are you planning something Mongo-like, that is, one
server writes should be directed to, and all servers serving reads?


That's essentially the planned approach.  This should allow better
scaling out reads.  Half an hour isn't really acceptable and OVN should
aim to do much better than that.


In our tests, it takes about half an hour (and a few hundred
reconnects) to send an initial snapshot of a large southbound database
to 1000+ OVN 2.7 controllers. This makes disaster recovery plan a
pain. Should we expect things to get better here (we can probably
contribute to this, if feasible)?


I'd expect that the clustered database design should scale pretty well
for reads, which are most of the OVN workload.  I'll have to have
something actually working before we can test and tune it, though.


As for multi-threaded OVSDB, the latest patch series I found in Andy's fork
segfaults just after startup, so we can't even do a quick test to check if
it makes things better for us or not.


I don't know whether Andy thought it was ready for testing.


It was work-in-progress, not ready for testing. I have since worked a
bit more to multi-thread all OVSDB server features and found the changes
will make OVSDB server considerably more complex.

Given that Ben is working on clustering, it may not be wise to make
two major changes
at the same time. I plan to revisit multi-threading after the
clustering changes are in.
This sounds sane. As things are going to change significantly, perhaps 
I'd stop trying to put mt6 branch in the Andy's fork into testable 
state. I've fixed a use-after-free bug (see the patch attached), but it 
still smashes the stack or ends up with garbage in barrier->seq instead 
of a pointer in ovs_barrier_block(). I haven't figured out the reason.


As for the clustering, we are currently looking for ways to scale OVSDB, 
and will be happy to be early adopters for this. As I mentioned 
previously, we can also contribute code if it would make things go 
quicker, so I can persuade my manager this is a worthwhile investment of 
time :)


Thanks,
Valentine


Re: [ovs-dev] Reproducing ovn-scale-test results

2017-03-16 Thread Valentine Sinitsyn

Hi Han,

Thanks for the quick answer.

On 17.03.2017 00:34, Han Zhou wrote:

On Thu, Mar 16, 2017 at 3:58 AM, Valentine Sinitsyn
<valentine.sinit...@gmail.com> wrote:


Hi all,

We are doing some stress testing on OVN 2.7, and wanted to reproduce
results from the talk [1]. Looking at ovn-scale-test sources, I have two
questions:




Hi Valentine,

Thanks for picking this up.



- Do I get correctly that the benchmark always starts with the empty
northbound db. Then lswitches are added, then you add ports to each lswitch?




Yes, the test result shown in the talk was started from empty to
gradually reach 20k lports on 200 lswitches.


- What is the batch size in port_create_args?


I remember it was 100. In addition, there were 5 jobs running in parallel.
+Lei to confirm.

Could you recall approximately how long it takes to create and
bind 20K ports with these settings? This would be really helpful.


Thanks,
Valentine





In short: is it true that for the setup involving (say) 10000 ports
spread over 100 lswitches in the aforementioned test, a Rally task
would look like this?


{
"version": 2,
"title": "Create and bind port",
"subtasks": [{
"title": "Create and bind port",
"workloads": [{
"name": "OvnNetwork.create_and_bind_ports",
"args": {
"network_create_args": {
"amount": 100,
"batch": 1,
"start_cidr": "172.16.1.0/24",
"physical_network": "providernet"
},
"port_create_args" : {"batch": 2},
"ports_per_network": 100,
"port_bind_args": {"wait_up": true}
},
"runner": {
"type": "serial","times": 1},
"context": {
   "ovn_multihost" : {
"controller": "ovn-controller-node"
},
"sandbox":{ "tag": "ToR1"}
}
}]
}]
}

1. https://youtu.be/okralc7LrZo?t=1185

Thanks,
Valentine




--
Best regards,
Valentine Sinitsyn


[ovs-dev] Reproducing ovn-scale-test results

2017-03-16 Thread Valentine Sinitsyn

Hi all,

We are doing some stress testing on OVN 2.7, and wanted to reproduce 
results from the talk [1]. Looking at ovn-scale-test sources, I have two 
questions:


- Do I get correctly that the benchmark always starts with the empty 
northbound db. Then lswitches are added, then you add ports to each lswitch?


- What is the batch size in port_create_args?

In short: is it true that for the setup involving (say) 10000 ports
spread over 100 lswitches in the aforementioned test, a Rally task
would look like this?


{
"version": 2,
"title": "Create and bind port",
"subtasks": [{
"title": "Create and bind port",
"workloads": [{
"name": "OvnNetwork.create_and_bind_ports",
"args": {
"network_create_args": {
"amount": 100,
"batch": 1,
"start_cidr": "172.16.1.0/24",
"physical_network": "providernet"
},
"port_create_args" : {"batch": 2},
"ports_per_network": 100,
"port_bind_args": {"wait_up": true}
},
"runner": {
"type": "serial","times": 1},
"context": {
   "ovn_multihost" : {
"controller": "ovn-controller-node"
},
"sandbox":{ "tag": "ToR1"}
}
}]
}]
}
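
For reference, the scale such a task describes can be read off its arguments. The Python sketch below copies only the relevant fields from the task above and multiplies them out. Treating "batch" as the number of ports created per request is an assumption made for this example, not something confirmed from the ovn-scale-test sources.

```python
import json

# Only the fields needed for the arithmetic, copied from the task above.
task = json.loads("""
{
  "args": {
    "network_create_args": {"amount": 100, "batch": 1},
    "port_create_args": {"batch": 2},
    "ports_per_network": 100
  }
}
""")

args = task["args"]
networks = args["network_create_args"]["amount"]
ports_per_network = args["ports_per_network"]
batch = args["port_create_args"]["batch"]

total_ports = networks * ports_per_network

# Assumption: "batch" ports are created per request, so round up.
requests_per_network = -(-ports_per_network // batch)

print(total_ports, requests_per_network)
```

So this task would create 100 lswitches with 100 ports each, and under the batch assumption each lswitch is filled in 50 requests.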

1. https://youtu.be/okralc7LrZo?t=1185

Thanks,
Valentine


[ovs-dev] Multi-threaded OVSDB

2017-03-14 Thread Valentine Sinitsyn

Hi all,

Recently, I was evaluating a multi-threaded OVSDB/ovn-northd design, and 
came across the patchset [1].


Looks like this RFC patchset was received well, but never completed. 
What's the reason? No real performance benefits, lack of interest, other 
high-priority tasks or whatever?


1. https://mail.openvswitch.org/pipermail/ovs-dev/2016-March/310673.html

Thanks,
Valentine


[ovs-dev] -fomit-frame-pointer in OVS

2017-02-27 Thread Valentine Sinitsyn

Hi,

Currently, OVS seems to disable frame pointer (-fomit-frame-pointer) in 
non-debug builds.


While I do know this is a common optimization, it makes run-time 
profiling substantially less straightforward. So I wonder, if there are 
any benchmarks showing the effect of omitting frame pointer, especially 
on x86-64?


Thanks,
Valentine


Re: [ovs-dev] [branch-2.7] Set release date for 2.7.0.

2017-02-24 Thread Valentine Sinitsyn

Hi,

Open vSwitch 2.7 won't be an LTS release, right?

Thanks,
Valentine


[ovs-dev] OVN: Preserving state between logical datapaths

2017-02-16 Thread Valentine Sinitsyn

Hi all,

Imagine you want to mark a packet in logical switch datapath then use 
this mark in logical router datapath somehow (an artificial use-case 
would be policy routing based on VM port, not destination IP address).


Is there a better way than using packet mark (which also doesn't seem to 
survive "output" action, yet it's easily fixable)? I assume OVS/OVN 2.6 
on Linux with in-kernel datapath, if this matters.


Many thanks,
Valentine Sinitsyn


Re: [ovs-dev] Quick question on Linux expo 2017

2017-02-15 Thread Valentine Sinitsyn

On 16.02.2017 11:52, Russell Bryant wrote:


On Thu, Feb 16, 2017 at 12:53 AM, Valentine Sinitsyn
<valentine.sinit...@gmail.com> wrote:

Hi all,

I feel strange about replying to seemingly spam emails, but is it
the same Southern California Linux Expo as in [1]?


Yes, I believe this was spam.

Yes, but what worries me is an attempt to sell private data, which IMO
compromises a respected event, not the spam itself. We all know this
happens, but I have never seen it happen on a public mailing list before
(by accident, I suppose).


Sorry if I got things wrong.

Valentine




--
Russell Bryant



Re: [ovs-dev] Quick question on Linux expo 2017

2017-02-15 Thread Valentine Sinitsyn

Hi all,

I feel strange about replying to seemingly spam emails, but is it the 
same Southern California Linux Expo as in [1]?


If so, isn't it a brute violation of the Terms and Conditions [2], 
quoted below?



Privacy Policy

The Linux Expo of Southern California gathers personal information from people 
who register to attend the So Cal Linux Expo. This information is ONLY used by 
SCALE to improve future Expos. If other groups that participate in SCALE ask 
for information about attendees to improve their offering at SCALE, and we 
agree to share it, it will be only those demographics that are not identifiable 
to any individual.


Although I heard of the event just a few minutes ago, this doesn't look 
good for me.


1. https://www.socallinuxexpo.org/scale/15x
2. https://www.socallinuxexpo.org/scale/15x/policies

Valentine


Re: [ovs-dev] how to create new ovsdb tables and code c funtion related?

2017-02-14 Thread Valentine Sinitsyn

Hi,

On 15.02.2017 10:08, lg.yue wrote:

Hi, everyone:

I. After I change ovn/ovn-[sn]b.ovsschema, how do I apply the changes?
The makefile compiles ovn/ovn-[sn]b.ovsschema to
ovn/lib/ovn-[s,n]b-idl.ovsidl, but I cannot find anything that uses
ovn-[s,n]b-idl.ovsidl.

As the name suggests, *.ovsschema defines schema only. You still need to 
write code which uses your new tables. It should be possible to fill 
them with ovs-vsctl, but without something reading them, this would be 
dead data.



II. Let's take sb for example.
 ovsdb-server --detach --monitor -vconsole:off
--log-file=/var/log/openvswitch/ovsdb-server-sb.log
--remote=punix:/run/openvswitch/ovnsb_db.sock
--pidfile=/run/openvswitch/ovnsb_db.pid
--remote=db:OVN_Southbound,SB_Global,connections --unixctl=ovnsb_db.ctl
--private-key=db:OVN_Southbound,SSL,private_key
--certificate=db:OVN_Southbound,SSL,certificate
--ca-cert=db:OVN_Southbound,SSL,ca_cert /var/lib/openvswitch/ovnsb_db.db
1. who launches the ovsdb-server process?  and when?
Usually this happens from the init scripts. The real machinery is under 
${PREFIX}/share/openvswitch/scripts.



2. how  /var/lib/openvswitch/ovnsb_db.db is created?
You can use ovsdb-tool create, see INSTALL.md. ovn-ctl script called 
from init scripts should handle this automatically.



3. 'ovn-sbctl set-connection ptcp:6642'  how does this instruction
associate port 6642 with ovsdb-server? (please point me to the source code)

III. Supposing the new db is created, do I need to write a C
function like sbrec_idl_class in ovn/lib/ovn-sb-idl.c?

This is basically a compiled schema definition. You should start at 
ovn/utilities/ovn-sbctl.c





 too many questions, please help me figure it out.
thanks very much
Just don't forget to consult the documentation, some of your questions 
are already answered there ;-)


Valentine








At 2017-02-14 21:15:42, "Valentine Sinitsyn" <valentine.sinit...@gmail.com> 
wrote:

Hi,

Look at ovn/ovn-[sn]b.ovsschema. It's JSON. You'll also need to update
the documentation and the cksum; make explains how to do it.

Best,
Valentine


Re: [ovs-dev] how to set up ovn's nb and sb table and column structure?

2017-02-14 Thread Valentine Sinitsyn

Hi,

Look at ovn/ovn-[sn]b.ovsschema. It's JSON. You'll also need to update 
the documentation and the cksum; make explains how to do it.
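
For illustration, the general shape of such a schema file can be sketched in Python. This is only a minimal sketch: the database name Example_DB, the table My_Table and its columns are hypothetical and are not part of ovn-nb or ovn-sb. The top-level keys (name, version, cksum, tables) and the column type syntax follow the OVSDB schema format described in RFC 7047; the cksum value is a placeholder, not a real checksum.

```python
import json


def make_schema(db_name, version, tables):
    """Build a dict laid out like an OVSDB schema (see RFC 7047)."""
    return {
        "name": db_name,
        "version": version,
        # Placeholder only; as noted above, make explains how to update it.
        "cksum": "0 0",
        "tables": tables,
    }


# Hypothetical table used purely for illustration.
my_table = {
    "columns": {
        # Atomic string column.
        "name": {"type": "string"},
        # Integer column restricted to a range.
        "priority": {
            "type": {"key": {"type": "integer",
                             "minInteger": 0,
                             "maxInteger": 65535}}
        },
        # String-to-string map with any number of entries.
        "external_ids": {
            "type": {"key": "string", "value": "string",
                     "min": 0, "max": "unlimited"}
        },
    },
    "indexes": [["name"]],
    "isRoot": True,
}

schema = make_schema("Example_DB", "1.0.0", {"My_Table": my_table})
schema_text = json.dumps(schema, indent=4, sort_keys=True)
print(schema_text)
```

Adding a table to the real schema would mean adding an entry of this shape under "tables" in ovn/ovn-[sn]b.ovsschema and bumping the version.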


Best,
Valentine


Re: [ovs-dev] [PATCH v3 00/16] port Jiri Benc's L3 patchset to ovs

2017-02-13 Thread Valentine Sinitsyn

Hi Jan,

On 10.02.2017 04:14, Jan Scheurich wrote:

Hi Valentine,

On 2017-02-09 08:58, Valentine Sinitsyn wrote:

This L3 patchset looks similar to what we did internally with OVS 2.6
to add support for IPv6 tunnels.

Could you please confirm that ovs-dpctl reports correct statistics
with this patchset when one uses in-kernel Linux datapath? We had some
issues with this (the counters were always zero). Largely, this was
because userspace code (I refer to the tools and the daemon, not DPDK
datapath here) assumes a plugged network interface is always L2, and I
don't see this patch touching these files.


The most recent user-space code in vswitchd that deals with the L3
tunnels in netdev and kernel datapath is contained in another patch
series:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-February/328391.html

You could help in reviewing that. It may not be complete with respect to
handling kernel datapath tunnels as we were not able to test yet due to
a lack of patches to configure L3 tunnel ports in the kernel. But
similar problems with counters might also exist with netdev datapath.
I had a quick look at the patch series; thanks for the links. I'll try to 
have a more in-depth look this week.


As for the "counters issue" I mentioned, it seems tiny given the scope 
of the patchset. Moreover, a get_etheraddr() change in [1] should fix 
it, although I haven't checked yet.


[1] https://mail.openvswitch.org/pipermail/ovs-dev/2017-February/328392.html

Best,
Valentine



Thanks, Jan




Re: [ovs-dev] [PATCH v3 00/16] port Jiri Benc's L3 patchset to ovs

2017-02-08 Thread Valentine Sinitsyn

Hi all,

This L3 patchset looks similar to what we did internally with OVS 2.6 to 
add support for IPv6 tunnels.


Could you please confirm that ovs-dpctl reports correct statistics with 
this patchset when one uses in-kernel Linux datapath? We had some issues 
with this (the counters were always zero). Largely, this was because 
userspace code (I refer to the tools and the daemon, not DPDK datapath 
here) assumes a plugged network interface is always L2, and I don't see 
this patch touching these files.


Thanks for your co-operation.

Best regards,
Valentine Sinitsyn


Re: [ovs-dev] Flow key update in conntrack/nat

2017-01-11 Thread Valentine Sinitsyn

Hi Joe,

On 11.01.2017 23:30, Joe Stringer wrote:

On 11 January 2017 at 02:47, Valentine Sinitsyn
<valentine.sinit...@gmail.com> wrote:

Hi all,

I'm struggling to find an answer to a seemingly simple question: why does
"ct(nat)" action need to update the flow key after NAT (see
ovs_nat_update_key())?

My confusion comes from the following scenario. Consider the first
to-be-NATed packet coming. There is no datapath flow installed, so this
results in an upcall. The userspace part will then install a new datapath
flow (using original, unmodified flow key it got) and execute the action.
Subsequent packets will be handled in the kernel automatically, but again,
the ovs_nat_update_key() flow key will be silently discarded in
ovs_vport_receive().

So it looks like the modified flow key is never used. What am I missing
here?


This depends on your flow table. If another lookup needs to occur (eg,
ct(table=N,...) option), or the packet is sent to userspace
(sflow,ipfix, etc), then the updated flow key needs to be provided -
in datapath, recirc (if it triggers upcall) or userspace actions. Most
OVS actions in the datapath modify the key in-place so that it is
correct whenever it needs to be used; the key doesn't need to be
completely repopulated afresh when it is needed.

Thanks for answering.

So the point I was missing is that there could be other actions 
following 'ct(nat)', which may use the flow. Makes sense now.
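
To make the point concrete, here is a toy Python model of the situation Joe describes. It is not OVS code: the flow table, the field names and the helper functions are all hypothetical, and "ct_nat" merely stands in for the key rewrite done by ovs_nat_update_key(). It shows that a second lookup after NAT only finds the right flow if the key was updated in place.

```python
def ct_nat(key, new_src):
    """Toy stand-in for ct(nat): rewrite the source address in the key."""
    key["ip_src"] = new_src


def lookup(table, key):
    """Return the action of the first flow whose match fields equal the key's."""
    for match, action in table:
        if all(key.get(field) == value for field, value in match.items()):
            return action
    return "drop"


# Hypothetical table consulted after recirculation, e.g. ct(table=N,nat).
table_after_nat = [
    ({"ip_src": "203.0.113.1"}, "output:2"),
]


def process(packet_key, update_key):
    """Apply NAT, optionally updating the key, then do the second lookup."""
    key = dict(packet_key)  # work on a copy of the original key
    if update_key:
        ct_nat(key, "203.0.113.1")
    # Without the update, the key still describes the pre-NAT headers.
    return lookup(table_after_nat, key)


original = {"ip_src": "10.0.0.5", "ip_dst": "198.51.100.7"}
print(process(original, update_key=True))   # output:2
print(process(original, update_key=False))  # drop
```

With a single-table pipeline and no userspace actions the updated key would indeed go unused, which is the scenario in the original question.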


Valentine


[ovs-dev] Flow key update in conntrack/nat

2017-01-11 Thread Valentine Sinitsyn

Hi all,

I'm struggling to find an answer to a seemingly simple question: why 
does "ct(nat)" action need to update the flow key after NAT (see 
ovs_nat_update_key())?


My confusion comes from the following scenario. Consider the first 
to-be-NATed packet coming. There is no datapath flow installed, so this 
results in an upcall. The userspace part will then install a new 
datapath flow (using original, unmodified flow key it got) and execute 
the action. Subsequent packets will be handled in the kernel 
automatically, but again, the ovs_nat_update_key() flow key will be 
silently discarded in ovs_vport_receive().


So it looks like the modified flow key is never used. What am I missing 
here?


Thanks,
Valentine Sinitsyn


[ovs-dev] Per-switch configuration in datapath actions

2016-12-12 Thread Valentine Sinitsyn

Hi all,

Suppose you are implementing a custom OpenFlow action, and you need some 
per-bridge configuration to translate it into a datapath action.


Which would be the architecturally correct way to promote this bit of 
information from OVSDB to somewhere inside struct xlate_ctx?


Thanks for your suggestions!

--
Best regards,
Valentine Sinitsyn


Re: [ovs-dev] [RFC] [PATCH] ovn: Support sample action in logical datapath

2016-11-30 Thread Valentine Sinitsyn

On 30.11.2016 06:18, Ben Pfaff wrote:

On Tue, Nov 29, 2016 at 07:22:32PM +0500, Valentine Sinitsyn wrote:

On 29.11.2016 05:21, Ben Pfaff wrote:

On Fri, Oct 14, 2016 at 04:35:46PM +0500, Valentine Sinitsyn wrote:

This is a quick attempt to implement a sample action at the logical port
level. The goal is to export IPFIX flows for logical ports, yet it is
easy to extend this approach to logical switches as well.

Nothing is done to provision OVS instances with required
Flow_Sample_Collector_Set and IPFIX entries at this point.



This is pretty cool!  The integration among OVS and OVN and IPFIX is
graceful.

The part that worries me is the CMS integration.  Have you actually
built that integration already (for which CMS)?  I have two concerns.
First, I'd prefer to see at least one CMS (probably OpenStack) support
this at or around the time that it goes into OVN.  Second, I have some
skepticism around the idea that the CMS should configure the
Flow_Sample_Collector_Set, etc., because OVN doesn't currently require
the CMS to have any connectivity to OVSDB on each of the hypervisors and
this would require the CMS to add that support.

I agree that this particular bit is somewhat hacky. We plan to follow this
route for an in-house CMS we are building, but I doubt the OpenStack community
would pick up the idea. What alternatives do you see here? Having the collector
config at the southbound db level doesn't seem clear-cut either. Suppose I want
to configure a collector at 127.0.0.1:5900: which localhost does this entry
refer to?


Is this a common way to configure IPFIX?  I had been under the
impression that generally there's one or a few collectors in a network,
to which each switch forwards packets.  If it's common to use a
per-hypervisor collector, then that might actually make things easier,
since that would be easy for ovn-controller to configure into OVS on
each hypervisor.
Running collectors local to hypervisors is what we do here. I can't say 
if it's a common scenario, but given that IPFIX is most often UDP which 
can be lost, it usually makes sense to keep collectors and exporters as 
close as possible.




Otherwise, I'm inclined to at least learn what the requirements would be
for common deployments of IPFIX.  Even if we don't implement them (or
all of them), it's important to me to know what we're leaving out so
that what we add now is built in a way that it's gracefully extensible
later.

For example: if a packet should be sent to a collector, should the
collector be chosen based on the packet's logical network, or based on
the packet's physical network (the hypervisor it's ingressing or
egressing), or on some combination of those?

I also find myself wondering whether logical port level is the right
level at which to choose whether to sample packets.  Will OVN users want
finer-grained control over sampling and, if so, would it make more sense
to add an ACL-like table for that purpose at the northbound level?
You mean using an lflow match to control when the "sample" action will 
trigger, rather than hard-wiring these actions to logical ports via 
"ipfix_options"? This sounds reasonable and not too hard to implement, 
given that we already have these tables in the southbound db.
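
As a rough illustration of the ACL-like idea, the Python sketch below picks a collector set by matching packets against prioritized rules instead of a per-port setting. Everything in it is hypothetical: the SampleRule fields, the match keys and the port names are made up for the example and do not reflect any actual OVN table or the syntax of the proposed sample action.

```python
from dataclasses import dataclass


@dataclass
class SampleRule:
    """Hypothetical ACL-like sampling rule."""
    priority: int
    match: dict            # field name -> required value
    collector_set_id: int  # which collector set receives the samples


def pick_collector_set(rules, packet):
    """Return the collector set of the highest-priority matching rule."""
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if all(packet.get(f) == v for f, v in rule.match.items()):
            return rule.collector_set_id
    return None  # no rule matched: do not sample


rules = [
    # Sample only TCP traffic entering logical port lp1.
    SampleRule(priority=100, match={"inport": "lp1", "ip_proto": 6},
               collector_set_id=1),
    # Sample everything entering logical port lp2.
    SampleRule(priority=50, match={"inport": "lp2"}, collector_set_id=2),
]

print(pick_collector_set(rules, {"inport": "lp1", "ip_proto": 6}))
print(pick_collector_set(rules, {"inport": "lp1", "ip_proto": 17}))
print(pick_collector_set(rules, {"inport": "lp2", "ip_proto": 17}))
```

A rule that matches only on the input port reproduces the per-port behaviour, so the finer-grained scheme would subsume the "ipfix_options" approach.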





If the sample() integration looks good, CMS assumptions aside, is there a
chance to merge it as a stand-alone action? It's true that no publicly
available CMS would use it for a while, but when they decide to, the code
would already be there. And the code is not dead, as we'll be using it as
well.


It's better than no users at all.


Do you have any thoughts about supporting other monitoring technology
that OVS supports (e.g. sFlow) using similar techniques?

I haven't targeted any of them specifically, but it doesn't seem to be a
daunting task. One only needs some way to associate a sample() instance with
an sFlow receiver, the same way collector_set_id does for IPFIX.

I'd suggest generalizing Flow_Sample_Collector_Set somehow, but we agreed that
configuring things through this table in the OVN scenario is suboptimal. Any
thoughts?


Did we have an earlier discussion?  I've spent a few minutes searching
my email archive and I don't see one.  If there was one, can you point
it out?

No, no prior discussion, sorry for being unclear.
I was referring to your concerns regarding CMS integration in the 
beginning of this thread.


Thanks,
Valentine