Hi Nithin,

Overall, the changes to the design document look good. I have some
comments; please see inline.

Thanks,
Sorin 

-----Original Message-----
From: dev [mailto:[email protected]] On Behalf Of Nithin Raju
Sent: Wednesday, 12 November, 2014 22:01
To: [email protected]
Subject: [ovs-dev] [PATCH] datapath-windows: update DESIGN document

In this patch, we update the design document to reflect the netlink based 
kernel-userspace interface implementation and a few other changes.
I have covered at a high level.

Please feel free to extend the document with more details that you think got 
missed out.

Signed-off-by: Nithin Raju <[email protected]>
---
 datapath-windows/DESIGN |  180 ++++++++++++++++++++++++++++++++--------------
 1 files changed, 125 insertions(+), 55 deletions(-)

diff --git a/datapath-windows/DESIGN b/datapath-windows/DESIGN
index b438c44..f81dad0 100644
--- a/datapath-windows/DESIGN
+++ b/datapath-windows/DESIGN
@@ -1,20 +1,14 @@
                        OVS-on-Hyper-V Design Document
                        ==============================
-There has been an effort in the recent past to develop the Open vSwitch (OVS)
-solution onto multiple hypervisor platforms such as FreeBSD and Microsoft
-Hyper-V. VMware has been working on a OVS solution for Microsoft Hyper-V for
-the past few months and has successfully completed the implementation.
-
-This document provides details of the development effort. We believe this 
-document should give enough information to members of the community who are 
-curious about the developments of OVS on Hyper-V. The community should also be 
-able to get enough information to make plans to leverage the deliverables of 
-this effort.
-
-The userspace portion of the OVS has already been ported to Hyper-V and 
-committed to the openvswitch repo. So, this document will mostly emphasize on 
-the kernel driver, though we touch upon some of the aspects of userspace as 
-well.
+There has been an community effort to develop a port of Open vSwitch on 
[Sorin]: 
"There has been an community effort to develop a port of Open vSwitch on" -->
"There has been a community effort to develop Open vSwitch on"

+Microsoft Hyper-V. In this document, we provide details of the 
+development effort. We believe this document should give enough 
+information to understand the overall design.
+
+The userspace portion of the OVS has been ported to Hyper-V in a 
+separate effort, and committed to the openvswitch repo. So, this 
+document will mostly emphasize on the kernel driver, though we touch 
+upon some of the aspects of userspace as well.
 
 We cover the following topics:
 1. Background into relevant Hyper-V architecture
@@ -80,17 +74,18 @@ has been used to retrieve some of the configuration information that OVS needs.
   |      | |  DAEMON/CTL  |       | |           |  |            | |
   +------+-++---+---------+       | +--+------+-+  +----+------++ | +--------+
   |  DPIF-  |   | netdev- |       |    |VIF #1|         |VIF #2|  | |Physical|
-  | Windows |<=>| Windows |       |    +------+         +------+  | |  NIC   |
+  | Netlink |   | Windows |       |    +------+         +------+  | |  NIC   |
   +---------+   +---------+       |      ||                   /\  | +--------+
-User     /\                       |      || *#1*         *#4* ||  |     /\
-=========||=======================+------||-------------------||--+     ||
-Kernel   ||                              \/                   ||  ||=====/
-         \/                           +-----+                 +-----+ *#5*
+User     /\         /\            |      || *#1*         *#4* ||  |     /\
+=========||=========||============+------||-------------------||--+     ||
+Kernel   ||         ||                   \/                   ||  ||=====/
+         \/         \/                +-----+                 +-----+ *#5*
  +-------------------------------+    |     |                 |     |
  |   +----------------------+    |    |     |                 |     |
  |   |   OVS Pseudo Device  |    |    |     |                 |     |
- |   +----------------+-----+    |    |     |                 |     |
- |                               |    |  I  |                 |     |
+ |   +----------------------+    |    |     |                 |     |
+ |      | Netlink Impl. |        |    |     |                 |     |
+ |      -----------------        |    |  I  |                 |     |
  | +------------+                |    |  N  |                 |  E  |
  | |  Flowtable | +------------+ |    |  G  |                 |  G  |
  | +------------+ |  Packet    | |*#2*|  R  |                 |  R  |
@@ -140,7 +135,9 @@ are:
  * Interface between the userspace and the kernel module.
  * Event notifications are significantly different.
  * The communication interface between DPIF and the kernel module need not be
-   implemented in the way OVS on Linux does.
+   implemented in the way OVS on Linux does. That said, it would be
+   advantageous to have a similar interface to the kernel module for reasons of
+   readibility and maintenance.
[Sorin]: "readibility and maintenance" --> "readability and maintainability"

  * Any licensing issues of using Linux kernel code directly.
 
 Due to these differences, it was a straightforward decision to develop the
@@ -159,13 +156,17 @@ called ovs-wind. At a high level ovs-wind manages keeps the ovsdb used by
 userspace in sync with the kernel state. More details in the userspace section.
[Sorin]: Please also rephrase the sentence above, for example:
"At a high level, ovs-wind keeps the ovsdb used by userspace in sync with the
kernel state. More details are given in the userspace section."

 As explained in the OVS porting design document [7], DPIF is the portion of 
-userspace that interfaces with the kernel portion of the OVS. Each platform can
-have its own implementation of the DPIF provider whose interface is defined in
-dpif-provider.h [3]. For OVS on Hyper-V, we have an implementation of DPIF
-provider for Hyper-V. The communication interface between userspace and the
-kernel is a pseudo device and is different from that of the Linux’s DPIF
-provider which uses netlink. But, as long as the DPIF provider interface is the
-same, the callers should be agnostic of the underlying communication interface.
+userspace that interfaces with the kernel portion of the OVS. The 
+interface that each DPIF provider has to provide is defined in dpif-provider.h 
[3].
+Though each platform is allowed to have its own implementation of the 
+DPIF provider, it was found out via community feedback than it is a 
+good idea to share code whenever possible. Thus, the DPIF provider for 
+OVS on Hyper-V shares code with the DPIF provider on Linux. This 
+interface is implemented in dpif-netlink.c (formerly dpif-linux.c).
[Sorin]: I would phrase the latter part as follows:
"The interface that each DPIF provider has to implement is defined in 
dpif-provider.h [3]. Though each platform is allowed to have its own 
implementation of the DPIF provider, it has been found, via community feedback, 
that it is desirable to share code whenever possible. Thus, the DPIF provider 
for OVS on Hyper-V shares code with the DPIF provider on Linux. This interface 
is implemented in dpif-netlink.c, formerly dpif-linux.c."

+
+We'll elaborate more on Kernel-Userspace interface in a dedicated 
+section below. Here it suffices to say that the the DPIF provider 
[Sorin]: "suffices to say that the the DPIF provider" --> "suffices to say that 
the DPIF provider"

+implementation for Windows is netlink based and shares code with the Linux one.
[Sorin]: " implementation for Windows is netlink based" should be " 
implementation for Windows is Netlink-based".

 2.a) Kernel module (datapath)
 -----------------------------
@@ -208,6 +209,35 @@ the OVS kernel module. This is equivalent to the typical character device
 interface on POSIX platforms. The pseudo device supports a whole bunch of
 ioctls that netdev and DPIF on OVS userspace make use of.
 
+Netlink messages
+----------------
+The communication between OVS userspace and OVS kernel datapath is in 
+the form of Netlink messages [1]. More on this in the section on 
+Kernel-userspace interface (#2.c). In the kernel, a full fledged 
[Sorin]: "More on this in the section on Kernel-userspace interface (#2.c)." 
--> "More details are provided in section 2.c, Kernel-Userspace interface."

+netlink message parser has been implemented along the lines of the 
+netlink message parser in OVS userspace. In fact, a lot of the code is ported 
code.
+
+On the lines of 'struct ofpbuf' in OVS userspace, a managed buffer has 
+been implemented in the kernel datapath to make it easier to parse and 
+construct netlink messages.
+
+Netlink sockets
+---------------
+On Linux, OVS userspace utilizes netlink sockets to pass back and forth 
+netlink messages. Since much of userspace code including DPIF provider 
+in dpif-netlink.c (formerly dpif-linux.c) has been reused, 
[Sorin]: The history of dpif-netlink.c is already given above, so the duplicate 
explanation in parentheses should be removed.

+pseudo-netlink sockets have been implemented in OVS userspace. As it is 
+known, Windows lacks native netlink socket support, and also the socket family 
is not extensible either.
+Hence it is not possible to provide a native implementaion of netlink socket.
+We implement pseudo-netlink sockets in lib/netlink-socket.c that appear 
+to be netlink sockets from higher levels. However, the implementation 
+opens a handle to the pseudo device for each pseudo-netlink socket. 
+More on this in the section on later sections.
[Sorin]: "More on this in the section on later sections." --> "Additional 
information about netlink sockets is provided in the following sections."

+
+Typical netlink semantics of read message, write message, dump, and 
+transaction have been implemented so that higher level layers are not 
+affected by the netlink implementation not being native.
+
 Switch/Datapath management
 --------------------------
 As explained above, we hook onto the management callback functions in the NDIS 
@@ -279,36 +309,72 @@ interface to the OVS kernel driver.
 
 2.c) Kernel-Userspace interface
 -------------------------------
-DPIF-Windows
-------------
-DPIF-Windows is the Windows implementation of the interface defined in dpif- 
-provider.h, and provides an interface into the OVS kernel driver. We implement 
-most of the callbacks required by the DPIF provider. A quick summary of the 
-functionality implemented is as follows:
- * dp_dump, dp_get: dump all datapath information or get information for a
-   particular datapath.  Currently we only support one datapath.
- * flow_dump, flow_put, flow_get, flow_flush: These functions retrieve all
-   flows in the kernel, add a flow to the kernel, get a specific flow and
-   delete all the flows in the kernel.
- * recv_set, recv, recv_wait, recv_purge: these poll packets for upcalls.
- * execute: This is used to send packets from userspace to the kernel. The
-   packets could be either flow miss packet punted from kernel earlier or
-   userspace generated packets.
- * vport_dump, vport_get, ext_info: These functions dump all ports in the
-   kernel, get a specific port in the kernel, or get extended information
-   about a port.
- * event_subscribe, wait, poll: These functions subscribe, wait and poll the
-   events that kernel posts.  A typical example is kernel notices a port has
-   gone up/down, and would like to notify the userspace.
+As explained earlier, OVS on Hyper-V shares the DPIF provider 
+implementation with Linux. The DPIF provider on Linux uses Netlink 
+sockets and Netlink messages. Netlink sockets and messages are 
+extensively used on Linux to exchange information between userspace and 
+kernel. In order to satisfy these depdendencies, netlink socket (pseudo 
+and non-native) and netlink messages are implemented on Hyper-V.
[Sorin]: We should decide which form to use: "Netlink 
implementation/messages/semantics/sockets" or "netlink 
implementation/messages/semantics/sockets". I propose capitalizing "Netlink" 
wherever it appears inside a sentence. Either way, it should be consistent 
across the whole document.
Also "satisfy these depdendencies" should be "satisfy these dependencies".

+
+The following are the major advantages of sharing DPIF provider code:
+1. Maintenance is simpler:
+   Any change made to the interface defined in dpif-provider.h need not be
+   propagated to multiple implementations. Also, developers familiar with the
+   Linux implementation of the DPIF provider can easily ramp on the Hyper-V
+   implementation as well.
+2. Netlink messages provides inherent advantages:
+   Netlink messages are known for their extensiblity. Each message is
+   versioned, so the data structures provide mechanisms to version checking and
+   providing forwards and backwards compatiblity with the kernel module.
[Sorin]: I would rephrase the last sentence as follows:
"Each message is versioned, so the provided data structures offer a mechanism 
for version checking and forward/backward compatibility with the kernel module."
Also "known for their extensiblity" --> "known for their extensibility".
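
For readers of the thread, the extensibility point can be illustrated with a
minimal sketch in C. It is not code from the patch or from OVS: the 4-byte
attribute header and 4-byte padding follow the well-known Netlink attribute
layout, but the names attr_hdr, attr_put, attr_find_u32, demo_lookup_port and
both DEMO_ATTR_* types are hypothetical and exist only for illustration. The
parser skips attribute types it does not recognize, which is what lets an
older reader cope with a message produced by a newer sender.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Netlink attributes are type-length-value records padded to 4 bytes. */
#define ATTR_ALIGN(len) (((len) + 3u) & ~3u)

/* Mirrors the 4-byte Netlink attribute header (length, then type). */
struct attr_hdr {
    uint16_t len;   /* Header plus payload, excluding padding. */
    uint16_t type;
};

/* Hypothetical attribute types, for illustration only. */
enum { DEMO_ATTR_PORT_NO = 1, DEMO_ATTR_NEWER_FIELD = 99 };

/* Appends one attribute at offset 'off'. Returns the new offset,
 * or 0 if the buffer has no room. */
static size_t
attr_put(uint8_t *buf, size_t cap, size_t off, uint16_t type,
         const void *data, uint16_t data_len)
{
    struct attr_hdr h;
    size_t total = ATTR_ALIGN(sizeof h + data_len);

    if (off > cap || total > cap - off) {
        return 0;
    }
    h.len = (uint16_t) (sizeof h + data_len);
    h.type = type;
    memset(buf + off, 0, total);            /* Zero the padding bytes. */
    memcpy(buf + off, &h, sizeof h);
    memcpy(buf + off + sizeof h, data, data_len);
    return off + total;
}

/* Looks up a 32-bit attribute by type, skipping every attribute whose
 * type does not match. Returns 1 if found, 0 otherwise. */
static int
attr_find_u32(const uint8_t *buf, size_t len, uint16_t type, uint32_t *out)
{
    size_t off = 0;

    while (off <= len && len - off >= sizeof(struct attr_hdr)) {
        struct attr_hdr h;

        memcpy(&h, buf + off, sizeof h);
        if (h.len < sizeof h || h.len > len - off) {
            return 0;                       /* Malformed attribute. */
        }
        if (h.type == type && h.len == sizeof h + sizeof *out) {
            memcpy(out, buf + off + sizeof h, sizeof *out);
            return 1;
        }
        off += ATTR_ALIGN(h.len);           /* Skip to the next attribute. */
    }
    return 0;
}

/* Builds a payload in which an unknown "newer" attribute precedes the
 * port number, then reads the port number back. */
static uint32_t
demo_lookup_port(void)
{
    uint8_t buf[64];
    uint32_t newer = 7, port = 42, found = 0;
    size_t end;

    end = attr_put(buf, sizeof buf, 0, DEMO_ATTR_NEWER_FIELD,
                   &newer, sizeof newer);
    end = attr_put(buf, sizeof buf, end, DEMO_ATTR_PORT_NO,
                   &port, sizeof port);
    assert(end == 16);                      /* Two 8-byte attributes. */

    return attr_find_u32(buf, end, DEMO_ATTR_PORT_NO, &found) ? found : 0;
}
```

In this sketch demo_lookup_port() still finds the port number even though an
attribute it does not know about comes first in the buffer.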
+
+openvswitch.h and OvsDpInterfaceExt.h
+-------------------------------------
+Since the DPIF provider is shared with Linux, the kernel datapath 
+provides the same interface as the Linux datapath. The interface is 
+defined in datapath/linux/compat/include/linux/openvswitch.h. 
+Derivatives of this interface file are created during OVS userspace 
+complation. The derivative for the kernel datpath on Hyper-V is in the 
following location:
[Sorin]: 
"OVS userspace complation" --> "OVS userspace compilation"
"kernel datpath on Hyper-V is in the" --> "kernel datapath on Hyper-V is 
provided in the"

+datapath-windows/include/OvsDpInterface.h
+
+That said, there are Windows specific extensions that are defined in 
+the interface file:
+datapath-windows/include/OvsDpInterfaceExt.h
+
+Netlink sockets
+---------------
+As explained in other sections, a version of netlink sockets has been 
+implemented in lib/netlink-socket.c for Windows. The implementation 
+creates a handle to the OVS pseudo device, and emulates netlink socket 
+semantics of receive message, send message, dump, and transact. Most of 
+the nl_* functions are supported.
+
+The fact that the implementation is non-native is demonstrated in various ways.
+One example is that PID for the netlink socket is not automatically 
+created when a handle is created to the OVS pseudo device. There's an 
+extra command (defined in OvsDpInterfaceExt.h) that is used to grab the 
+PID generated in t he kernel.
[Sorin]: 
"PID for the netlink socket is not automatically created" --> "PID for the 
netlink socket is not automatically assigned".
"PID generated in t he kernel" --> "PID generated in the kernel".

+
+DPIF provider
+--------------
+As has been alluded to in earlier sections, the netlink socket and 
+netlink message based DPIF provider on Linux has been ported to Windows.
+Correspondingly, the file is called lib/dpif-netlink.c now from its 
+former name of lib/dpif-linux.c.
[Sorin]: I propose the following formulation:
"As mentioned in earlier sections, the DPIF provider on Linux, which is based 
on netlink sockets and netlink messages, has been ported to Windows.
Correspondingly, the file lib/dpif-linux.c was renamed to lib/dpif-netlink.c."

+
+Most of the code is common. Some divergence is in the code to receive 
+packets. The Linux implementation uses epoll() which is not natively 
+supported on Windows.
 
 Netdev-Windows
 --------------
-We have a Windows implementation of the the interface defined in lib/netdev- 
-provider.h. The implementation provided functionality to get extended
+We have a Windows implementation of the interface defined in 
+lib/netdev- provider.h. The implementation provides functionality to 
[Sorin]: "lib/netdev- provider.h" --> "lib/netdev-provider.h"

+get extended
 information about an interface. It is limited in functionality compared to the 
 Linux implementation of the netdev provider and cannot be used to add any 
-interfaces in the kernel such as a tap interface.
-
+interfaces in the kernel such as a tap interface or to send/receive packets.
+The netdev-windows implementation uses the datapath interface 
+extensions defined in:
+datapath-windows/include/OvsDpInterfaceExt.h
 
 2.d) Flow of a packet
 ---------------------
@@ -369,3 +435,7 @@ http://msdn.microsoft.com/en-us/library/windows/desktop/aa366510(v=vs.85).aspx
 http://msdn.microsoft.com/en-us/library/windows/hardware/ff557015(v=vs.85).aspx
 7. How to Port Open vSwitch to New Software or Hardware
 http://git.openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=blob;f=PORTING
+8. Netlink
+http://en.wikipedia.org/wiki/Netlink
+9. epoll
+http://en.wikipedia.org/wiki/Epoll
--
1.7.4.1

_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev