Re: [Int-area] [tcpm] draft-williams-overlaypath-ip-tcp-rfc
Brandon,

> You are correct that the option could be problematic if added to a full-sized packet, or even a nearly full one. I can see that the document should have some discussion of this issue.

Yes.

> In a case like ours, where the overlay network uses tunneling, transparently adding the option is not a critical problem to be solved. It is already the case that the overlay entry point must advertise a reduced MSS in order to accommodate the tunnel overhead. The amount of space consumed by the option will always be smaller than the tunnel overhead, and the option can be added at OVRLY_OUT, so the two are not additive. That said, I can see that an overlay network that does not use tunnels internally, or one that does apply the option at OVRLY_IN, would have a bigger problem.

So the new TCP option is basically required only between OVRLY_OUT and the receiver/server, because the relevant information is already somehow transported within the overlay, right?

This raises another question (sorry if it is naive): why can't the overlay tunnel simply be extended to the server? This implies that OVRLY_OUT would be more or less co-located with the server; obviously, there can still be further routers/overlay nodes in between. I am asking because processing the information contained in the TCP option will in any case require a modified TCP stack in the server, i.e., the server will not be fully backward compatible if it has to process the proposed option. But if the TCP/IP stack has to be modified anyway, I could imagine just adding to the server whatever encap/decap is required for the overlay transport. Then, I have the impression, the proposed TCP option would not be needed at all.

I don't want to dig into the overlay design, because that is not really in scope for TCPM. But if there is a system architecture that does not require adding TCP options in middleboxes, and thus does not affect TCP end-to-end semantics, it would really be important to understand why such an architecture cannot be used.

Thanks

Michael

> The issue of the proposed fast-open scheme is one that we have not considered, but I don't think it adds any problems for the TCP option that aren't already a problem for tunneled connectivity in general. I will have to spend some time with that proposal and think about how they interrelate.
>
> --Brandon
>
> On 12/21/2012 08:34 AM, Scharf, Michael (Michael) wrote:
>> Brandon,
>>
>>>> If there were tunnels between the OVRLY_IN and OVRLY_OUT boxes, then the inner IP headers would have the HOST_X and SERVER addresses, and the outer ones in the tunnel would have the overlay headers. Since the inner packets would ultimately be delivered after egressing the tunnels, the HOST_X addresses are totally visible to the server, and vice versa.
>>>
>>> There are indeed tunnels between OVRLY_IN and OVRLY_OUT, and the inner IP headers will typically use either the client-side addresses or the server-side addresses. However, neither OVRLY_IN nor OVRLY_OUT can be assumed to be reliably in-path between HOST and SERVER, which means that internet routing cannot be relied upon to cause packets to arrive at the overlay ingress. Instead, HOST_1 must directly address OVRLY_IN_1 in order to send its packets into the tunnel, and SERVER must directly address OVRLY_OUT in order to send the return traffic into the tunnel.
>>
>> Thanks for this explanation - it indeed helps in understanding the architecture. But I still don't fully understand the motivation for bypassing Internet routing this way. As a non-expert on routing, it looks to me like reinventing source routing - but this is outside my core expertise.
>>
>> Regarding TCPM's business: if I correctly understand the approach, OVRLY_IN will transparently add and remove TCP options. This is kind of dangerous from an end-to-end perspective... Sorry if this has been answered before, but I really wonder what to do if OVRLY_IN can't add the option, either because of a lack of TCP option space, or because the resulting IP packet would exceed the path MTU. (In fact, I think this problem does not apply only to TCP options.) Unless I am missing something, the latter case could soon become much more relevant: TCPM is currently working on the fast-open scheme, which adds data to SYNs. With that, I think it is possible that all data packets from a sender to a receiver are either full sized or large enough that the proposed option does not fit. Given that this option can carry full-sized IPv6 addresses, this likelihood is much higher than for other existing TCP options, right? In some cases, I believe the proposed TCP option cannot be added in the overlay without either IP fragmentation, which is unlikely to be a good idea with NATs, or TCP segment splitting, which can probably cause harm as well. For
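Michael's headroom concern can be made concrete with a little arithmetic. Below is a minimal Python sketch of the check a middlebox would have to make; the option layout and byte counts are illustrative assumptions, not the draft's normative encoding. TCP allows at most 40 bytes of options, and the option can only be added if both the option space and the path MTU leave room.

    TCP_MAX_OPTION_SPACE = 40   # TCP allows at most 40 bytes of options

    def option_fits(existing_options_len, payload_len, path_mtu,
                    ip_header_len=20, tcp_base_header_len=20,
                    hostid_option_len=2 + 16):   # kind + length + one IPv6 address
        """True if a middlebox could still add the option to this segment."""
        total = existing_options_len + hostid_option_len
        padded = (total + 3) // 4 * 4            # options pad to 32-bit words
        if padded > TCP_MAX_OPTION_SPACE:
            return False                         # no option space left
        segment = ip_header_len + tcp_base_header_len + padded + payload_len
        return segment <= path_mtu               # otherwise the packet won't fit

    # A full-sized segment (e.g. a fast-open SYN carrying data) leaves no headroom:
    print(option_fits(existing_options_len=20, payload_len=1440, path_mtu=1500))  # False

With an IPv6 address in the option value, the option alone consumes nearly half the available option space, which is the point Michael is making about its likelihood of not fitting.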
Re: [Int-area] [tcpm] draft-williams-overlaypath-ip-tcp-rfc
Michael,

Extending the overlay all the way to the application server would mean that existing solutions for load balancing, SSL offload, intrusion detection, diagnostic logging, etc. would not work. In other words, there are many systems in a common enterprise environment that would benefit from more accurate host identification, and all of them would require changes in order for the mechanism to work. On the other hand, there is existing middleware that can already handle an arbitrary TCP option, using its value for the purposes listed above. So using a TCP option for this purpose is deployable today, while extending the overlay is not.

At the same time, use of the option does not carry a significant risk of breaking existing connectivity, even in cases where the option is not understood by the TCP stack. Testing has shown that only about 1.7% of the top 100,000 web servers fail to establish connections when the option is included (see draft-abdo-hostid-tcpopt-implementation). This is most likely a characteristic of the common TCP stacks in use today, and so probably extends to non-HTTP application servers, too.

--Brandon

On 12/21/2012 02:14 PM, Scharf, Michael (Michael) wrote:
> Brandon,
>
>> You are correct that the option could be problematic if added to a full-sized packet, or even a nearly full one. I can see that the document should have some discussion of this issue.
>
> Yes.
>
>> In a case like ours, where the overlay network uses tunneling, transparently adding the option is not a critical problem to be solved. It is already the case that the overlay entry point must advertise a reduced MSS in order to accommodate the tunnel overhead. The amount of space consumed by the option will always be smaller than the tunnel overhead, and the option can be added at OVRLY_OUT, so the two are not additive. That said, I can see that an overlay network that does not use tunnels internally, or one that does apply the option at OVRLY_IN, would have a bigger problem.
>
> So the new TCP option is basically required only between OVRLY_OUT and the receiver/server, because the relevant information is already somehow transported within the overlay, right?
>
> This raises another question (sorry if it is naive): why can't the overlay tunnel simply be extended to the server? This implies that OVRLY_OUT would be more or less co-located with the server; obviously, there can still be further routers/overlay nodes in between. I am asking because processing the information contained in the TCP option will in any case require a modified TCP stack in the server, i.e., the server will not be fully backward compatible if it has to process the proposed option. But if the TCP/IP stack has to be modified anyway, I could imagine just adding to the server whatever encap/decap is required for the overlay transport. Then, I have the impression, the proposed TCP option would not be needed at all.
>
> I don't want to dig into the overlay design, because that is not really in scope for TCPM. But if there is a system architecture that does not require adding TCP options in middleboxes, and thus does not affect TCP end-to-end semantics, it would really be important to understand why such an architecture cannot be used.
>
> Thanks
>
> Michael
>
>> The issue of the proposed fast-open scheme is one that we have not considered, but I don't think it adds any problems for the TCP option that aren't already a problem for tunneled connectivity in general. I will have to spend some time with that proposal and think about how they interrelate.
>>
>> --Brandon
>>
>> On 12/21/2012 08:34 AM, Scharf, Michael (Michael) wrote:
>>> Brandon,
>>>
>>>>> If there were tunnels between the OVRLY_IN and OVRLY_OUT boxes, then the inner IP headers would have the HOST_X and SERVER addresses, and the outer ones in the tunnel would have the overlay headers. Since the inner packets would ultimately be delivered after egressing the tunnels, the HOST_X addresses are totally visible to the server, and vice versa.
>>>>
>>>> There are indeed tunnels between OVRLY_IN and OVRLY_OUT, and the inner IP headers will typically use either the client-side addresses or the server-side addresses. However, neither OVRLY_IN nor OVRLY_OUT can be assumed to be reliably in-path between HOST and SERVER, which means that internet routing cannot be relied upon to cause packets to arrive at the overlay ingress. Instead, HOST_1 must directly address OVRLY_IN_1 in order to send its packets into the tunnel, and SERVER must directly address OVRLY_OUT in order to send the return traffic into the tunnel.
>>>
>>> Thanks for this explanation - it indeed helps in understanding the architecture. But I still don't fully understand the motivation for bypassing Internet routing this way. As a non-expert on routing, it looks to me like reinventing source routing - but this is outside my core expertise.
>>>
>>> Regarding TCPM's business: if I correctly understand the approach, OVRLY_IN will
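Brandon's earlier point that the option size and the tunnel overhead are not additive comes down to simple arithmetic: the ingress clamps the MSS for the tunnel, and the option is only added after the tunnel header is stripped at OVRLY_OUT. A small Python sketch with illustrative byte counts (the real overheads depend on the encapsulation in use):

    PATH_MTU = 1500
    IP_HDR = 20
    TCP_HDR = 20
    TUNNEL_OVERHEAD = 40    # e.g. outer IP header plus encapsulation header
    HOSTID_OPTION = 20      # assumed smaller than the tunnel overhead

    # MSS the overlay ingress advertises so encapsulated packets fit the path MTU:
    clamped_mss = PATH_MTU - IP_HDR - TCP_HDR - TUNNEL_OVERHEAD   # 1420

    # After OVRLY_OUT strips the tunnel, adding the (smaller) option still fits,
    # which is why the two overheads are never paid on the same hop:
    egress_len = IP_HDR + TCP_HDR + HOSTID_OPTION + clamped_mss   # 1480
    assert egress_len <= PATH_MTU
    print(clamped_mss, egress_len)   # 1420 1480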
Re: [Int-area] [tcpm] draft-williams-overlaypath-ip-tcp-rfc
On 12/20/2012 12:21 PM, Brandon Williams wrote:
> Dear all,
>
> A new version of this draft has been submitted that attempts to lay out a clearer argument for the use of both TCP and IP options, with references to other efforts in the same arena. Comments are welcome.
>
> (Note: I've cross-posted to INTAREA and TCPM, since similar announcements went to each list.)

Hi Brandon, *many* thanks for writing this; it does help me (at least) to understand what you're doing with this option. As I now understand it, instead of the tunneling approach that would normally be applied for building overlay networks, this approach pushes and pops IP addresses from the protocol options fields.

Can you discuss why normal tunneling protocols aren't used to build the overlay? Since those are easily and widely available, I wonder why they aren't used and why something more elaborate, fragile, and less compatible with the Internet architecture is really needed or felt to be a good idea? I understand that it basically *works* ... but I'm just not seeing how it makes sense?

--
Wes Eddy
MTI Systems
Re: [Int-area] [tcpm] draft-williams-overlaypath-ip-tcp-rfc
Hi Wes,

Thanks for your comments. It looks like I might have managed to make the use of the proposed option less clear, rather than more clear. Or maybe I'm misunderstanding the point you're making.

The mechanics of our system are tunnel-based, as with most overlay architectures I've looked at. The tunneling starts at an overlay ingress machine close to one of the endpoints (i.e., the client or server) and ends at an overlay egress machine close to the other endpoint. Since the ingress and egress are on the public internet, the overlay does not extend all the way onto the endpoints' LANs. This means that standard internet routing cannot be used to drive connections into the overlay. Instead, NAT is used on both sides of the overlay, which leaves the server with no way to reliably identify the client.

The proposed options are not intended to be used as part of the mechanics of the overlay. The overlay is fully functional without the options. Instead, the options are intended to provide the client's connection-identifying information to the server for use in load balancing, diagnostics, etc.

Does this clarify things? Further muddy the waters? Or simply indicate that I missed your point?

--Brandon

PS: Thanks for cross-posting your comments. I should have done that to begin with. I primarily posted to TCPM for informational purposes, since TCPM has not shown much interest in this or similar drafts in the past. The INTAREA list has been more actively engaged in discussion related to client identification. Still, if I was going to cross-post, I should have done it with a single thread.

On 12/20/2012 02:16 PM, Wesley Eddy wrote:
> On 12/20/2012 12:21 PM, Brandon Williams wrote:
>> Dear all,
>>
>> A new version of this draft has been submitted that attempts to lay out a clearer argument for the use of both TCP and IP options, with references to other efforts in the same arena. Comments are welcome.
>>
>> (Note: I've cross-posted to INTAREA and TCPM, since similar announcements went to each list.)
>
> Hi Brandon, *many* thanks for writing this; it does help me (at least) to understand what you're doing with this option. As I now understand it, instead of the tunneling approach that would normally be applied for building overlay networks, this approach pushes and pops IP addresses from the protocol options fields.
>
> Can you discuss why normal tunneling protocols aren't used to build the overlay? Since those are easily and widely available, I wonder why they aren't used and why something more elaborate, fragile, and less compatible with the Internet architecture is really needed or felt to be a good idea? I understand that it basically *works* ... but I'm just not seeing how it makes sense?

--
Brandon Williams; Principal Software Engineer
Cloud Engineering; Akamai Technologies Inc.
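To make "connection-identifying information" concrete: the value carried to the server is essentially the pre-NAT client address and port. A hypothetical Python sketch of encoding and decoding such a value (this layout is made up for illustration; it is not the encoding defined in draft-williams-overlaypath-ip-tcp-rfc):

    import ipaddress
    import struct

    def encode_host_id(client_ip, client_port):
        # version byte + original port + packed original address
        addr = ipaddress.ip_address(client_ip)
        return struct.pack("!BH", addr.version, client_port) + addr.packed

    def decode_host_id(blob):
        version, port = struct.unpack("!BH", blob[:3])
        addr = ipaddress.ip_address(blob[3:])   # 4 or 16 bytes, per the version byte
        assert addr.version == version
        return str(addr), port

    blob = encode_host_id("192.0.2.7", 51823)
    print(decode_host_id(blob))   # ('192.0.2.7', 51823)

A load balancer or logging system on the server side could consume a value like this without any change to the overlay itself, which is the deployment property Brandon describes.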
Re: [Int-area] [tcpm] draft-williams-overlaypath-ip-tcp-rfc
On 12/20/2012 3:49 PM, Brandon Williams wrote:
> Hi Wes,
>
> Thanks for your comments. It looks like I might have managed to make the use of the proposed option less clear, rather than more clear. Or maybe I'm misunderstanding the point you're making.
>
> The mechanics of our system are tunnel-based, as with most overlay architectures I've looked at. The tunneling starts at an overlay ingress machine close to one of the endpoints (i.e., the client or server) and ends at an overlay egress machine close to the other endpoint. Since the ingress and egress are on the public internet, the overlay does not extend all the way onto the endpoints' LANs. This means that standard internet routing cannot be used to drive connections into the overlay. Instead, NAT is used on both sides of the overlay, which leaves the server with no way to reliably identify the client.
>
> The proposed options are not intended to be used as part of the mechanics of the overlay. The overlay is fully functional without the options. Instead, the options are intended to provide the client's connection-identifying information to the server for use in load balancing, diagnostics, etc.

Ah, so are there additional devices beyond what's shown in your Figure 1? I ask because if the overlay ingress and egress are simple tunnel endpoints, then the endpoint addresses would be totally visible to one another.

Your Figure 1 is:

+--------+     +------------+
| HOST_1 |-----| OVRLY_IN_1 |----+
+--------+     +------------+    |
                                 |
+--------+     +------------+    |    +-----------+     +--------+
| HOST_2 |-----| OVRLY_IN_2 |----+----| OVRLY_OUT |-----| SERVER |
+--------+     +------------+    |    +-----------+     +--------+
                                 |
+--------+     +------------+    |
| HOST_3 |-----| OVRLY_IN_3 |----+
+--------+     +------------+

     (the OVRLY_IN and OVRLY_OUT nodes sit inside the INTERNET cloud)

                                Figure 1

If there were tunnels between the OVRLY_IN and OVRLY_OUT boxes, then the inner IP headers would have the HOST_X and SERVER addresses, and the outer ones in the tunnel would have the overlay headers. Since the inner packets would ultimately be delivered after egressing the tunnels, the HOST_X addresses are totally visible to the server, and vice versa.

Your document shows instead:

           ip hdr contains:             ip hdr contains:
SENDER --- src = sender --- OVERLAY --- src = overlay2 --- RECEIVER
           dst = overlay1               dst = receiver

So, this is not really showing tunnels to me ... this is rewriting (NAT) of the destination address. Or am I misunderstanding?

--
Wes Eddy
MTI Systems
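The tunnel behavior Wes describes is easy to sketch. A minimal Python illustration of plain encapsulation (field names are placeholders, not a real packet format): the outer header names only the overlay hop, so once the egress strips it, the inner endpoint addresses are visible end to end.

    def encapsulate(inner_packet, ingress, egress):
        # The outer header carries only the overlay hop; the inner packet,
        # with the real endpoint addresses, rides through untouched.
        return {"src": ingress, "dst": egress, "payload": inner_packet}

    def decapsulate(outer_packet):
        return outer_packet["payload"]

    inner = {"src": "HOST_1", "dst": "SERVER", "data": "tcp segment"}
    tunneled = encapsulate(inner, "OVRLY_IN_1", "OVRLY_OUT")
    assert decapsulate(tunneled)["src"] == "HOST_1"   # client address survives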
Re: [Int-area] [tcpm] draft-williams-overlaypath-ip-tcp-rfc
On 12/20/2012 04:04 PM, Wesley Eddy wrote:
> On 12/20/2012 3:49 PM, Brandon Williams wrote:
>> Hi Wes,
>>
>> Thanks for your comments. It looks like I might have managed to make the use of the proposed option less clear, rather than more clear. Or maybe I'm misunderstanding the point you're making.
>>
>> The mechanics of our system are tunnel-based, as with most overlay architectures I've looked at. The tunneling starts at an overlay ingress machine close to one of the endpoints (i.e., the client or server) and ends at an overlay egress machine close to the other endpoint. Since the ingress and egress are on the public internet, the overlay does not extend all the way onto the endpoints' LANs. This means that standard internet routing cannot be used to drive connections into the overlay. Instead, NAT is used on both sides of the overlay, which leaves the server with no way to reliably identify the client.
>>
>> The proposed options are not intended to be used as part of the mechanics of the overlay. The overlay is fully functional without the options. Instead, the options are intended to provide the client's connection-identifying information to the server for use in load balancing, diagnostics, etc.
>
> Ah, so are there additional devices beyond what's shown in your Figure 1? I ask because if the overlay ingress and egress are simple tunnel endpoints, then the endpoint addresses would be totally visible to one another.

Yes. There are additional devices between the HOST and OVRLY_IN, and also between OVRLY_OUT and SERVER, but those devices are just the internet's standard routing infrastructure. There are also potential intermediate devices between OVRLY_IN and OVRLY_OUT that can be used for optimized routing between the overlay's ingress and egress.

> Your Figure 1 is:
>
> [Figure 1, as above]
>
> If there were tunnels between the OVRLY_IN and OVRLY_OUT boxes, then the inner IP headers would have the HOST_X and SERVER addresses, and the outer ones in the tunnel would have the overlay headers. Since the inner packets would ultimately be delivered after egressing the tunnels, the HOST_X addresses are totally visible to the server, and vice versa.

There are indeed tunnels between OVRLY_IN and OVRLY_OUT, and the inner IP headers will typically use either the client-side addresses or the server-side addresses. However, neither OVRLY_IN nor OVRLY_OUT can be assumed to be reliably in-path between HOST and SERVER, which means that internet routing cannot be relied upon to cause packets to arrive at the overlay ingress. Instead, HOST_1 must directly address OVRLY_IN_1 in order to send its packets into the tunnel, and SERVER must directly address OVRLY_OUT in order to send the return traffic into the tunnel.

> Your document shows instead:
>
> [address-rewriting diagram, as above]
>
> So, this is not really showing tunnels to me ... this is rewriting (NAT) of the destination address. Or am I misunderstanding?

As noted above, the use of tunnels and NAT in this case is not mutually exclusive. NAT is used to allow the overlay ingress to intercept packets, which are then tunneled to the overlay egress, where NAT is used again to deliver the packets to the receiver and to ensure that return traffic also uses the overlay.

--Brandon

PS: Sorry to double-send this to you, Wes. It was bounced by the IETF lists the first time.

--
Brandon Williams; Principal Software Engineer
Cloud Engineering; Akamai Technologies Inc.
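Brandon's last paragraph can be put in pseudo-packet form. A toy Python sketch of the combined NAT-plus-tunnel flow (all names are illustrative placeholders); note how the client's address is gone by the time the packet reaches the server, which is exactly the gap the proposed option is meant to fill:

    def ovrly_in(pkt):
        # The client addressed the ingress directly; rewrite the destination
        # to the real server (DNAT), then tunnel the packet toward the egress.
        inner = dict(pkt, dst="SERVER")
        return {"src": "OVRLY_IN_1", "dst": "OVRLY_OUT", "payload": inner}

    def ovrly_out(tunneled):
        inner = tunneled["payload"]   # strip the tunnel header
        # Source NAT so the server's return traffic flows back via the overlay.
        return dict(inner, src="OVRLY_OUT")

    pkt = {"src": "HOST_1", "dst": "OVRLY_IN_1"}
    delivered = ovrly_out(ovrly_in(pkt))
    print(delivered)   # {'src': 'OVRLY_OUT', 'dst': 'SERVER'} -- client address is gone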