Re: [PATCH v3 net-next 0/3] rds: IPv6 support

2018-07-18 Thread Ka-Cheong Poon

On 07/19/2018 01:31 AM, David Miller wrote:

From: Sowmini Varadhan 
Date: Wed, 18 Jul 2018 03:33:40 -0700


On (07/18/18 15:19), Ka-Cheong Poon wrote:

bind() and connect() are using the sa_family/ss_family to have
the application signal to the kernel about whether ipv4 or ipv6 is
desired. (and bind and connect are doing the right thing for
v4mapped, so that doesnt seem to be a problem there)

In this case you want the application to signal that info via
the optlen.  (And the reason for this inconsistency is that you dont
want to deal with the user->kernel copy in the same way?)



Because doing that can break existing RDS apps.  Existing code
does not check the address family in processing this socket
option.  It only cares about the address and port.  If the new


I'll leave this up to DaveM. Existing code only handles IPv4,

everywhere else, we always check the sa_family or ss_family
first and verify the length afterward. This was DaveM's original
point about bind/connect/sendmsg. I dont know why rds sockopts have
to be special.


Yes, but the above point is valid.

If the code never verified the sa_family value before, it is a very
real possibility that code exists out that which is not initializing
it or setting it incorrectly.

Those apps have worked for a long time, and suddenly will break.

We often have to deal with unfortunate mistakes like this.

But for now, I guess the check can be added but we have to look out
for any regressions this causes and revert if necessary.



Is it OK not to do the check for this patch?  From a
customer's perspective, breaking working apps is a
really bad thing unless there is a very special
reason, such as security issue.  Do you see a very
important problem, such as security issue, for not
adding the check in this patch?

Thanks.


--
K. Poon
ka-cheong.p...@oracle.com




Re: [PATCH v3 net-next 0/3] rds: IPv6 support

2018-07-18 Thread Santosh Shilimkar

On 7/13/2018 4:02 AM, Ka-Cheong Poon wrote:

This patch set adds IPv6 support to the kernel RDS and related
modules.  Existing RDS apps using IPv4 address continue to run without
any problem.  New RDS apps which want to use IPv6 address can do so by
passing the address in struct sockaddr_in6 to bind(), connect() or
sendmsg().  And those apps also need to use the new IPv6 equivalents
of some of the existing socket options as the existing options use a
32 bit integer to store IP address.


[...]



Ka-Cheong Poon (3):
   rds: Changing IP address internal representation to struct in6_addr
   rds: Enable RDS IPv6 support
   rds: Extend RDS API for IPv6 support


Could you please post v4 and as aligned, please drop the SO_TRANSPORT
change from the set.

regards,
Santosh


Re: [PATCH v3 net-next 0/3] rds: IPv6 support

2018-07-18 Thread David Miller
From: Sowmini Varadhan 
Date: Wed, 18 Jul 2018 03:33:40 -0700

> On (07/18/18 15:19), Ka-Cheong Poon wrote:
>> >bind() and connect() are using the sa_family/ss_family to have
>> >the application signal to the kernel about whether ipv4 or ipv6 is
>> >desired. (and bind and connect are doing the right thing for
>> >v4mapped, so that doesnt seem to be a problem there)
>> >
>> >In this case you want the application to signal that info via
>> >the optlen.  (And the reason for this inconsistency is that you dont
>> >want to deal with the user->kernel copy in the same way?)
>> 
>> 
>> Because doing that can break existing RDS apps.  Existing code
>> does not check the address family in processing this socket
>> option.  It only cares about the address and port.  If the new
> 
> I'll leave this up to DaveM. Existing code only handles IPv4,
> 
> everywhere else, we always check the sa_family or ss_family
> first and verify the length afterward. This was DaveM's original
> point about bind/connect/sendmsg. I dont know why rds sockopts have
> to be special.

Yes, but the above point is valid.

If the code never verified the sa_family value before, it is a very
real possibility that code exists out that which is not initializing
it or setting it incorrectly.

Those apps have worked for a long time, and suddenly will break.

We often have to deal with unfortunate mistakes like this.

But for now, I guess the check can be added but we have to look out
for any regressions this causes and revert if necessary.

Thanks.


Re: [PATCH v3 net-next 0/3] rds: IPv6 support

2018-07-18 Thread Sowmini Varadhan
On (07/18/18 15:19), Ka-Cheong Poon wrote:
> >bind() and connect() are using the sa_family/ss_family to have
> >the application signal to the kernel about whether ipv4 or ipv6 is
> >desired. (and bind and connect are doing the right thing for
> >v4mapped, so that doesnt seem to be a problem there)
> >
> >In this case you want the application to signal that info via
> >the optlen.  (And the reason for this inconsistency is that you dont
> >want to deal with the user->kernel copy in the same way?)
> 
> 
> Because doing that can break existing RDS apps.  Existing code
> does not check the address family in processing this socket
> option.  It only cares about the address and port.  If the new

I'll leave this up to DaveM. Existing code only handles IPv4,

everywhere else, we always check the sa_family or ss_family
first and verify the length afterward. This was DaveM's original
point about bind/connect/sendmsg. I dont know why rds sockopts have
to be special.

> code suddenly checks that and use it to decide on the address
> family, working app will break.  As I mentioned before, this
> patch set does not change existing behavior.  And doing what
> you mentioned will change existing behavior and break apps.

thank you.

--Sowmini


Re: [PATCH v3 net-next 0/3] rds: IPv6 support

2018-07-18 Thread Ka-Cheong Poon

On 07/17/2018 07:27 PM, Sowmini Varadhan wrote:

On (07/17/18 13:32), Ka-Cheong Poon wrote:


The app can use either structures to make the call.  When the
app fills in the structure, it knows what it is filling in,
either sockaddr_in or sockaddr_in6.  So it knows the right size
to use.  The app can also use IPv4 mapped address in a sockaddr_in6
without a problem.


tupical applications that I have seen in routing applicaitons
will use a union like
   union {
struct sockaddr_in sin;
struct sockaddr_in sin5;
   }
Or they will use sockadd_storage. Passing down the sizeoof that structure
will do the worng thing thing in the existing code for ipv4 (even
though it will not generate EFAOIT)..



What a typical socket (not routing) app will do is to pass in
the right sizeof based on the socket address structure in doing
things like connect() or bind().  The app needs to fill in the
structure correctly according to which one after all.  And this
avoids the unnecessary copy of garbage data from user to kernel.



Could you please explain the inconsistency?  An app can use IPv4
mapped address in a sockaddr_in6 to operate on an IPv4 connection,
in case you are thinking of this new addition in v3 of the patch.


bind() and connect() are using the sa_family/ss_family to have
the application signal to the kernel about whether ipv4 or ipv6 is
desired. (and bind and connect are doing the right thing for
v4mapped, so that doesnt seem to be a problem there)

In this case you want the application to signal that info via
the optlen.  (And the reason for this inconsistency is that you dont
want to deal with the user->kernel copy in the same way?)



Because doing that can break existing RDS apps.  Existing code
does not check the address family in processing this socket
option.  It only cares about the address and port.  If the new
code suddenly checks that and use it to decide on the address
family, working app will break.  As I mentioned before, this
patch set does not change existing behavior.  And doing what
you mentioned will change existing behavior and break apps.


--
K. Poon
ka-cheong.p...@oracle.com




Re: [PATCH v3 net-next 0/3] rds: IPv6 support

2018-07-17 Thread Sowmini Varadhan
On (07/17/18 13:32), Ka-Cheong Poon wrote:
> 
> The app can use either structures to make the call.  When the
> app fills in the structure, it knows what it is filling in,
> either sockaddr_in or sockaddr_in6.  So it knows the right size
> to use.  The app can also use IPv4 mapped address in a sockaddr_in6
> without a problem.

tupical applications that I have seen in routing applicaitons
will use a union like
  union {
struct sockaddr_in sin;
struct sockaddr_in sin5;
  }
Or they will use sockadd_storage. Passing down the sizeoof that structure
will do the worng thing thing in the existing code for ipv4 (even
though it will not generate EFAOIT)..

> Could you please explain the inconsistency?  An app can use IPv4
> mapped address in a sockaddr_in6 to operate on an IPv4 connection,
> in case you are thinking of this new addition in v3 of the patch.

bind() and connect() are using the sa_family/ss_family to have
the application signal to the kernel about whether ipv4 or ipv6 is 
desired. (and bind and connect are doing the right thing for
v4mapped, so that doesnt seem to be a problem there)

In this case you want the application to signal that info via
the optlen.  (And the reason for this inconsistency is that you dont 
want to deal with the user->kernel copy in the same way?)

--Sowmini



Re: [PATCH v3 net-next 0/3] rds: IPv6 support

2018-07-16 Thread Ka-Cheong Poon

On 07/17/2018 12:20 AM, Sowmini Varadhan wrote:


-  Looks like rds_connect() is checking things in the right order (thanks)
However, rds_cancel_sent_to is still looking at the len to figure
out the family.. as we move to ipv6,  it would be better if we allow
the caller to specify struct sockaddr_storage, or even a union of
sockaddr_in/sockaddr_in6, rather than require them to hint at which
one of ipv4/ipv6 through the optlen.



The app can use either structures to make the call.  When the
app fills in the structure, it knows what it is filling in,
either sockaddr_in or sockaddr_in6.  So it knows the right size
to use.  The app can also use IPv4 mapped address in a sockaddr_in6
without a problem.



Please see __sys_connect and move_addr_to_kernel if the user-kernel
copy is the reason you are not doing this. Similar to inet_dgram_connect
you can then check the sa_family and use that to figure out the
"Assume IPv4" etc stuff.

This would also make the CANCEL_SEND_TO API consistent with the bind/
connect etc semantics.



Could you please explain the inconsistency?  An app can use IPv4
mapped address in a sockaddr_in6 to operate on an IPv4 connection,
in case you are thinking of this new addition in v3 of the patch.



-  net/rds/rds.h: thanks for moving RDS_CM_PORT to the rdma specific file.

I am guessing (?) that you want to update the comment to talk about
the non-existent "RDS over UDP" based on the title of the IANA registration?
I would just like to re-iterate that this is actually inaccurate
(and confusing to someone looking at this for the first time, since
there is no RDS-over-UDP today). If it were up to me, I would update
the comment to say

/* The following ports, 16385, 18634, 18635, are registered with IANA as
  * the ports to be used for "RDS over TCP and UDP".
  * The current linux implementation supports RDS over TCP and IB, and uses
  * the ports as follows: 18634 is the historical value used for the
  * RDMA_CM listener port.  RDS/TCP uses port 16385.  After
  * IPv6 work, RDMA_CM also uses 16385 as the listener port.  18634 is kept
  * to ensure compatibility with older RDS modules.  Those ports are defined
  * in each transport's header file.



Will update it to


/* The following ports, 16385, 18634, 18635, are registered with IANA as 

 * the ports to be used for RDS over TCP and UDP.  Currently, only RDS 
over
 * TCP and RDS over IB/RDMA are implemented.  18634 is the historical 
value
 * used for the RDMA_CM listener port.  RDS/TCP uses port 16385.  After 

 * IPv6 work, RDMA_CM also uses 16385 as the listener port.  18634 is 
kept
 * to ensure compatibility with older RDS modules.  Those ports are 
defined
 * in each transport's header file. 


 */



--
K. Poon
ka-cheong.p...@oracle.com




Re: [PATCH v3 net-next 0/3] rds: IPv6 support

2018-07-16 Thread Sowmini Varadhan


-  Looks like rds_connect() is checking things in the right order (thanks)
   However, rds_cancel_sent_to is still looking at the len to figure
   out the family.. as we move to ipv6,  it would be better if we allow
   the caller to specify struct sockaddr_storage, or even a union of
   sockaddr_in/sockaddr_in6, rather than require them to hint at which 
   one of ipv4/ipv6 through the optlen.

   Please see __sys_connect and move_addr_to_kernel if the user-kernel
   copy is the reason you are not doing this. Similar to inet_dgram_connect
   you can then check the sa_family and use that to figure out the
   "Assume IPv4" etc stuff.

   This would also make the CANCEL_SEND_TO API consistent with the bind/
   connect etc semantics.
   
-  net/rds/rds.h: thanks for moving RDS_CM_PORT to the rdma specific file.

   I am guessing (?) that you want to update the comment to talk about
   the non-existent "RDS over UDP" based on the title of the IANA registration?
   I would just like to re-iterate that this is actually inaccurate
   (and confusing to someone looking at this for the first time, since
   there is no RDS-over-UDP today). If it were up to me, I would update
   the comment to say

/* The following ports, 16385, 18634, 18635, are registered with IANA as
 * the ports to be used for "RDS over TCP and UDP".
 * The current linux implementation supports RDS over TCP and IB, and uses
 * the ports as follows: 18634 is the historical value used for the
 * RDMA_CM listener port.  RDS/TCP uses port 16385.  After
 * IPv6 work, RDMA_CM also uses 16385 as the listener port.  18634 is kept
 * to ensure compatibility with older RDS modules.  Those ports are defined
 * in each transport's header file.

IMHO that makes the comment look a little less odd (I've already explained
to you why RDS-over-UDP does not make much practical sense for the RDS
use-cases we anticipate). YMMV.

Thanks,

--Sowmini


[PATCH v3 net-next 0/3] rds: IPv6 support

2018-07-13 Thread Ka-Cheong Poon
This patch set adds IPv6 support to the kernel RDS and related
modules.  Existing RDS apps using IPv4 address continue to run without
any problem.  New RDS apps which want to use IPv6 address can do so by
passing the address in struct sockaddr_in6 to bind(), connect() or
sendmsg().  And those apps also need to use the new IPv6 equivalents
of some of the existing socket options as the existing options use a
32 bit integer to store IP address.

All RDS code now use struct in6_addr to store IP address.  IPv4
address is stored as an IPv4 mapped address.

Header file changes

There are many data structures (RDS socket options) used by RDS apps
which use a 32 bit integer to store IP address. To support IPv6,
struct in6_addr needs to be used. To ensure backward compatibility, a
new data structure is introduced for each of those data structures
which use a 32 bit integer to represent an IP address. And new socket
options are introduced to use those new structures. This means that
existing apps should work without a problem with the new RDS module.
For apps which want to use IPv6, those new data structures and socket
options can be used. IPv4 mapped address is used to represent IPv4
address in the new data structures.

Internally, all RDS data structures which contain an IP address are
changed to use struct in6_addr to store the address. IPv4 address is
stored as an IPv4 mapped address. All the functions which take an IP
address as argument are also changed to use struct in6_addr.

RDS/RDMA/IB uses a private data (struct rds_ib_connect_private)
exchange between endpoints at RDS connection establishment time to
support RDMA. This private data exchange uses a 32 bit integer to
represent an IP address. This needs to be changed in order to support
IPv6. A new private data struct rds6_ib_connect_private is introduced
to handle this. To ensure backward compatibility, an IPv6 capable RDS
stack uses another RDMA listener port (RDS_CM_PORT) to accept IPv6
connection. And it continues to use the original RDS_PORT for IPv4 RDS
connections. When it needs to communicate with an IPv6 peer, it uses
the RDS_TCP_PORT to send the connection set up request.

RDS/TCP changes

TCP related code is changed to support IPv6.  Note that only an IPv6
TCP listener on port RDS_TCP_PORT is created as it can accept both
IPv4 and IPv6 connection requests.

IB/RDMA changes

The initial private data exchange between IB endpoints using RDMA is
changed to support IPv6 address instead, if the peer address is IPv6.
To ensure backward compatibility, annother RDMA listener port
(RDS_CM_PORT) is used to accept IPv6 connection. An IPv6 capable RDS
module continues to use the original RDS_PORT for IPv4 RDS
connections. When it needs to communicate with an IPv6 peer, it uses
the RDS_CM_PORT to send the connection set up request.

Ka-Cheong Poon (3):
  rds: Changing IP address internal representation to struct in6_addr
  rds: Enable RDS IPv6 support
  rds: Extend RDS API for IPv6 support

 include/uapi/linux/rds.h |  71 ++-
 net/rds/af_rds.c | 201 --
 net/rds/bind.c   | 136 -
 net/rds/cong.c   |  23 ++--
 net/rds/connection.c | 259 ++-
 net/rds/ib.c | 114 +++--
 net/rds/ib.h |  51 ++--
 net/rds/ib_cm.c  | 309 +++
 net/rds/ib_mr.h  |   2 +
 net/rds/ib_rdma.c|  24 ++--
 net/rds/ib_recv.c|  18 +--
 net/rds/ib_send.c|  10 +-
 net/rds/loop.c   |   7 +-
 net/rds/rdma.c   |   6 +-
 net/rds/rdma_transport.c |  84 ++---
 net/rds/rdma_transport.h |   5 +
 net/rds/rds.h|  87 -
 net/rds/recv.c   |  76 +---
 net/rds/send.c   | 114 ++---
 net/rds/tcp.c| 128 
 net/rds/tcp.h|   2 +-
 net/rds/tcp_connect.c|  68 ---
 net/rds/tcp_listen.c |  74 +---
 net/rds/tcp_recv.c   |   9 +-
 net/rds/tcp_send.c   |   4 +-
 net/rds/threads.c|  69 +--
 net/rds/transport.c  |  15 ++-
 27 files changed, 1543 insertions(+), 423 deletions(-)

-- 
1.8.3.1