Re: [PATCH v3 net-next 0/3] rds: IPv6 support
On 07/19/2018 01:31 AM, David Miller wrote: From: Sowmini Varadhan Date: Wed, 18 Jul 2018 03:33:40 -0700 On (07/18/18 15:19), Ka-Cheong Poon wrote: bind() and connect() are using the sa_family/ss_family to have the application signal to the kernel about whether ipv4 or ipv6 is desired. (and bind and connect are doing the right thing for v4mapped, so that doesnt seem to be a problem there) In this case you want the application to signal that info via the optlen. (And the reason for this inconsistency is that you dont want to deal with the user->kernel copy in the same way?) Because doing that can break existing RDS apps. Existing code does not check the address family in processing this socket option. It only cares about the address and port. If the new I'll leave this up to DaveM. Existing code only handles IPv4, everywhere else, we always check the sa_family or ss_family first and verify the length afterward. This was DaveM's original point about bind/connect/sendmsg. I dont know why rds sockopts have to be special. Yes, but the above point is valid. If the code never verified the sa_family value before, it is a very real possibility that code exists out that which is not initializing it or setting it incorrectly. Those apps have worked for a long time, and suddenly will break. We often have to deal with unfortunate mistakes like this. But for now, I guess the check can be added but we have to look out for any regressions this causes and revert if necessary. Is it OK not to do the check for this patch? From a customer's perspective, breaking working apps is a really bad thing unless there is a very special reason, such as security issue. Do you see a very important problem, such as security issue, for not adding the check in this patch? Thanks. -- K. Poon ka-cheong.p...@oracle.com
Re: [PATCH v3 net-next 0/3] rds: IPv6 support
On 7/13/2018 4:02 AM, Ka-Cheong Poon wrote: This patch set adds IPv6 support to the kernel RDS and related modules. Existing RDS apps using IPv4 address continue to run without any problem. New RDS apps which want to use IPv6 address can do so by passing the address in struct sockaddr_in6 to bind(), connect() or sendmsg(). And those apps also need to use the new IPv6 equivalents of some of the existing socket options as the existing options use a 32 bit integer to store IP address. [...] Ka-Cheong Poon (3): rds: Changing IP address internal representation to struct in6_addr rds: Enable RDS IPv6 support rds: Extend RDS API for IPv6 support Could you please post v4 and as aligned, please drop the SO_TRANSPORT change from the set. regards, Santosh
Re: [PATCH v3 net-next 0/3] rds: IPv6 support
From: Sowmini Varadhan Date: Wed, 18 Jul 2018 03:33:40 -0700 > On (07/18/18 15:19), Ka-Cheong Poon wrote: >> >bind() and connect() are using the sa_family/ss_family to have >> >the application signal to the kernel about whether ipv4 or ipv6 is >> >desired. (and bind and connect are doing the right thing for >> >v4mapped, so that doesnt seem to be a problem there) >> > >> >In this case you want the application to signal that info via >> >the optlen. (And the reason for this inconsistency is that you dont >> >want to deal with the user->kernel copy in the same way?) >> >> >> Because doing that can break existing RDS apps. Existing code >> does not check the address family in processing this socket >> option. It only cares about the address and port. If the new > > I'll leave this up to DaveM. Existing code only handles IPv4, > > everywhere else, we always check the sa_family or ss_family > first and verify the length afterward. This was DaveM's original > point about bind/connect/sendmsg. I dont know why rds sockopts have > to be special. Yes, but the above point is valid. If the code never verified the sa_family value before, it is a very real possibility that code exists out that which is not initializing it or setting it incorrectly. Those apps have worked for a long time, and suddenly will break. We often have to deal with unfortunate mistakes like this. But for now, I guess the check can be added but we have to look out for any regressions this causes and revert if necessary. Thanks.
Re: [PATCH v3 net-next 0/3] rds: IPv6 support
On (07/18/18 15:19), Ka-Cheong Poon wrote: > >bind() and connect() are using the sa_family/ss_family to have > >the application signal to the kernel about whether ipv4 or ipv6 is > >desired. (and bind and connect are doing the right thing for > >v4mapped, so that doesnt seem to be a problem there) > > > >In this case you want the application to signal that info via > >the optlen. (And the reason for this inconsistency is that you dont > >want to deal with the user->kernel copy in the same way?) > > > Because doing that can break existing RDS apps. Existing code > does not check the address family in processing this socket > option. It only cares about the address and port. If the new I'll leave this up to DaveM. Existing code only handles IPv4, everywhere else, we always check the sa_family or ss_family first and verify the length afterward. This was DaveM's original point about bind/connect/sendmsg. I dont know why rds sockopts have to be special. > code suddenly checks that and use it to decide on the address > family, working app will break. As I mentioned before, this > patch set does not change existing behavior. And doing what > you mentioned will change existing behavior and break apps. thank you. --Sowmini
Re: [PATCH v3 net-next 0/3] rds: IPv6 support
On 07/17/2018 07:27 PM, Sowmini Varadhan wrote: On (07/17/18 13:32), Ka-Cheong Poon wrote: The app can use either structures to make the call. When the app fills in the structure, it knows what it is filling in, either sockaddr_in or sockaddr_in6. So it knows the right size to use. The app can also use IPv4 mapped address in a sockaddr_in6 without a problem. tupical applications that I have seen in routing applicaitons will use a union like union { struct sockaddr_in sin; struct sockaddr_in sin5; } Or they will use sockadd_storage. Passing down the sizeoof that structure will do the worng thing thing in the existing code for ipv4 (even though it will not generate EFAOIT).. What a typical socket (not routing) app will do is to pass in the right sizeof based on the socket address structure in doing things like connect() or bind(). The app needs to fill in the structure correctly according to which one after all. And this avoids the unnecessary copy of garbage data from user to kernel. Could you please explain the inconsistency? An app can use IPv4 mapped address in a sockaddr_in6 to operate on an IPv4 connection, in case you are thinking of this new addition in v3 of the patch. bind() and connect() are using the sa_family/ss_family to have the application signal to the kernel about whether ipv4 or ipv6 is desired. (and bind and connect are doing the right thing for v4mapped, so that doesnt seem to be a problem there) In this case you want the application to signal that info via the optlen. (And the reason for this inconsistency is that you dont want to deal with the user->kernel copy in the same way?) Because doing that can break existing RDS apps. Existing code does not check the address family in processing this socket option. It only cares about the address and port. If the new code suddenly checks that and use it to decide on the address family, working app will break. As I mentioned before, this patch set does not change existing behavior. And doing what you mentioned will change existing behavior and break apps. -- K. Poon ka-cheong.p...@oracle.com
Re: [PATCH v3 net-next 0/3] rds: IPv6 support
On (07/17/18 13:32), Ka-Cheong Poon wrote: > > The app can use either structures to make the call. When the > app fills in the structure, it knows what it is filling in, > either sockaddr_in or sockaddr_in6. So it knows the right size > to use. The app can also use IPv4 mapped address in a sockaddr_in6 > without a problem. tupical applications that I have seen in routing applicaitons will use a union like union { struct sockaddr_in sin; struct sockaddr_in sin5; } Or they will use sockadd_storage. Passing down the sizeoof that structure will do the worng thing thing in the existing code for ipv4 (even though it will not generate EFAOIT).. > Could you please explain the inconsistency? An app can use IPv4 > mapped address in a sockaddr_in6 to operate on an IPv4 connection, > in case you are thinking of this new addition in v3 of the patch. bind() and connect() are using the sa_family/ss_family to have the application signal to the kernel about whether ipv4 or ipv6 is desired. (and bind and connect are doing the right thing for v4mapped, so that doesnt seem to be a problem there) In this case you want the application to signal that info via the optlen. (And the reason for this inconsistency is that you dont want to deal with the user->kernel copy in the same way?) --Sowmini
Re: [PATCH v3 net-next 0/3] rds: IPv6 support
On 07/17/2018 12:20 AM, Sowmini Varadhan wrote: - Looks like rds_connect() is checking things in the right order (thanks) However, rds_cancel_sent_to is still looking at the len to figure out the family.. as we move to ipv6, it would be better if we allow the caller to specify struct sockaddr_storage, or even a union of sockaddr_in/sockaddr_in6, rather than require them to hint at which one of ipv4/ipv6 through the optlen. The app can use either structures to make the call. When the app fills in the structure, it knows what it is filling in, either sockaddr_in or sockaddr_in6. So it knows the right size to use. The app can also use IPv4 mapped address in a sockaddr_in6 without a problem. Please see __sys_connect and move_addr_to_kernel if the user-kernel copy is the reason you are not doing this. Similar to inet_dgram_connect you can then check the sa_family and use that to figure out the "Assume IPv4" etc stuff. This would also make the CANCEL_SEND_TO API consistent with the bind/ connect etc semantics. Could you please explain the inconsistency? An app can use IPv4 mapped address in a sockaddr_in6 to operate on an IPv4 connection, in case you are thinking of this new addition in v3 of the patch. - net/rds/rds.h: thanks for moving RDS_CM_PORT to the rdma specific file. I am guessing (?) that you want to update the comment to talk about the non-existent "RDS over UDP" based on the title of the IANA registration? I would just like to re-iterate that this is actually inaccurate (and confusing to someone looking at this for the first time, since there is no RDS-over-UDP today). If it were up to me, I would update the comment to say /* The following ports, 16385, 18634, 18635, are registered with IANA as * the ports to be used for "RDS over TCP and UDP". * The current linux implementation supports RDS over TCP and IB, and uses * the ports as follows: 18634 is the historical value used for the * RDMA_CM listener port. RDS/TCP uses port 16385. After * IPv6 work, RDMA_CM also uses 16385 as the listener port. 18634 is kept * to ensure compatibility with older RDS modules. Those ports are defined * in each transport's header file. Will update it to /* The following ports, 16385, 18634, 18635, are registered with IANA as * the ports to be used for RDS over TCP and UDP. Currently, only RDS over * TCP and RDS over IB/RDMA are implemented. 18634 is the historical value * used for the RDMA_CM listener port. RDS/TCP uses port 16385. After * IPv6 work, RDMA_CM also uses 16385 as the listener port. 18634 is kept * to ensure compatibility with older RDS modules. Those ports are defined * in each transport's header file. */ -- K. Poon ka-cheong.p...@oracle.com
Re: [PATCH v3 net-next 0/3] rds: IPv6 support
- Looks like rds_connect() is checking things in the right order (thanks) However, rds_cancel_sent_to is still looking at the len to figure out the family.. as we move to ipv6, it would be better if we allow the caller to specify struct sockaddr_storage, or even a union of sockaddr_in/sockaddr_in6, rather than require them to hint at which one of ipv4/ipv6 through the optlen. Please see __sys_connect and move_addr_to_kernel if the user-kernel copy is the reason you are not doing this. Similar to inet_dgram_connect you can then check the sa_family and use that to figure out the "Assume IPv4" etc stuff. This would also make the CANCEL_SEND_TO API consistent with the bind/ connect etc semantics. - net/rds/rds.h: thanks for moving RDS_CM_PORT to the rdma specific file. I am guessing (?) that you want to update the comment to talk about the non-existent "RDS over UDP" based on the title of the IANA registration? I would just like to re-iterate that this is actually inaccurate (and confusing to someone looking at this for the first time, since there is no RDS-over-UDP today). If it were up to me, I would update the comment to say /* The following ports, 16385, 18634, 18635, are registered with IANA as * the ports to be used for "RDS over TCP and UDP". * The current linux implementation supports RDS over TCP and IB, and uses * the ports as follows: 18634 is the historical value used for the * RDMA_CM listener port. RDS/TCP uses port 16385. After * IPv6 work, RDMA_CM also uses 16385 as the listener port. 18634 is kept * to ensure compatibility with older RDS modules. Those ports are defined * in each transport's header file. IMHO that makes the comment look a little less odd (I've already explained to you why RDS-over-UDP does not make much practical sense for the RDS use-cases we anticipate). YMMV. Thanks, --Sowmini
[PATCH v3 net-next 0/3] rds: IPv6 support
This patch set adds IPv6 support to the kernel RDS and related modules. Existing RDS apps using IPv4 address continue to run without any problem. New RDS apps which want to use IPv6 address can do so by passing the address in struct sockaddr_in6 to bind(), connect() or sendmsg(). And those apps also need to use the new IPv6 equivalents of some of the existing socket options as the existing options use a 32 bit integer to store IP address. All RDS code now use struct in6_addr to store IP address. IPv4 address is stored as an IPv4 mapped address. Header file changes There are many data structures (RDS socket options) used by RDS apps which use a 32 bit integer to store IP address. To support IPv6, struct in6_addr needs to be used. To ensure backward compatibility, a new data structure is introduced for each of those data structures which use a 32 bit integer to represent an IP address. And new socket options are introduced to use those new structures. This means that existing apps should work without a problem with the new RDS module. For apps which want to use IPv6, those new data structures and socket options can be used. IPv4 mapped address is used to represent IPv4 address in the new data structures. Internally, all RDS data structures which contain an IP address are changed to use struct in6_addr to store the address. IPv4 address is stored as an IPv4 mapped address. All the functions which take an IP address as argument are also changed to use struct in6_addr. RDS/RDMA/IB uses a private data (struct rds_ib_connect_private) exchange between endpoints at RDS connection establishment time to support RDMA. This private data exchange uses a 32 bit integer to represent an IP address. This needs to be changed in order to support IPv6. A new private data struct rds6_ib_connect_private is introduced to handle this. To ensure backward compatibility, an IPv6 capable RDS stack uses another RDMA listener port (RDS_CM_PORT) to accept IPv6 connection. And it continues to use the original RDS_PORT for IPv4 RDS connections. When it needs to communicate with an IPv6 peer, it uses the RDS_TCP_PORT to send the connection set up request. RDS/TCP changes TCP related code is changed to support IPv6. Note that only an IPv6 TCP listener on port RDS_TCP_PORT is created as it can accept both IPv4 and IPv6 connection requests. IB/RDMA changes The initial private data exchange between IB endpoints using RDMA is changed to support IPv6 address instead, if the peer address is IPv6. To ensure backward compatibility, annother RDMA listener port (RDS_CM_PORT) is used to accept IPv6 connection. An IPv6 capable RDS module continues to use the original RDS_PORT for IPv4 RDS connections. When it needs to communicate with an IPv6 peer, it uses the RDS_CM_PORT to send the connection set up request. Ka-Cheong Poon (3): rds: Changing IP address internal representation to struct in6_addr rds: Enable RDS IPv6 support rds: Extend RDS API for IPv6 support include/uapi/linux/rds.h | 71 ++- net/rds/af_rds.c | 201 -- net/rds/bind.c | 136 - net/rds/cong.c | 23 ++-- net/rds/connection.c | 259 ++- net/rds/ib.c | 114 +++-- net/rds/ib.h | 51 ++-- net/rds/ib_cm.c | 309 +++ net/rds/ib_mr.h | 2 + net/rds/ib_rdma.c| 24 ++-- net/rds/ib_recv.c| 18 +-- net/rds/ib_send.c| 10 +- net/rds/loop.c | 7 +- net/rds/rdma.c | 6 +- net/rds/rdma_transport.c | 84 ++--- net/rds/rdma_transport.h | 5 + net/rds/rds.h| 87 - net/rds/recv.c | 76 +--- net/rds/send.c | 114 ++--- net/rds/tcp.c| 128 net/rds/tcp.h| 2 +- net/rds/tcp_connect.c| 68 --- net/rds/tcp_listen.c | 74 +--- net/rds/tcp_recv.c | 9 +- net/rds/tcp_send.c | 4 +- net/rds/threads.c| 69 +-- net/rds/transport.c | 15 ++- 27 files changed, 1543 insertions(+), 423 deletions(-) -- 1.8.3.1