Re: [Taps] MTU / equivalent at the transport layer
On 12/9/2016 1:38 PM, Michael Tuexen wrote:
>> On 9 Dec 2016, at 22:30, Joe Touch wrote:
>>
>> On 12/9/2016 1:26 PM, Michael Tuexen wrote:
>>> Not sure what the reassembly limit is... SCTP handles arbitrary sized
>>> user messages at the receiver side by using partial delivery.
>>>
>>> The SCTP_MAXSEG allows a user to limit the size of DATA chunks without
>>> reducing the pmtu.
>> Yes, but that size can actually be larger than the PMTU, not just smaller.
> Hmm. https://tools.ietf.org/html/rfc6458#section-8.1.16 states:
>
>    Note that the underlying SCTP implementation may fragment into
>    smaller sized chunks when the PMTU of the underlying association
>    is smaller than the value set by the user.
>
> So this means the user cannot rely on this option to turn off SCTP
> fragmentation and have SCTP pass IP packets down the stack so that
> IP does the fragmentation.
>
> That is why I said the user can use this option to ask the SCTP
> layer to use a smaller value than the one deduced from the PMTU.
> That is something you can do safely.

It seems like this setting is independent of PMTU. It could be larger
than PMTU, in which case SCTP *or* IP could do fragmentation (and I
don't see whether there's a way to know or force that decision).

Yes, you can set it smaller too.

Joe
___
Taps mailing list
Taps@ietf.org
https://www.ietf.org/mailman/listinfo/taps
Re: [Taps] MTU / equivalent at the transport layer
> On 9 Dec 2016, at 22:30, Joe Touch wrote:
>
> On 12/9/2016 1:26 PM, Michael Tuexen wrote:
>>
>> Not sure what the reassembly limit is... SCTP handles arbitrary sized
>> user messages at the receiver side by using partial delivery.
>>
>> The SCTP_MAXSEG allows a user to limit the size of DATA chunks without
>> reducing the pmtu.
> Yes, but that size can actually be larger than the PMTU, not just smaller.

Hmm. https://tools.ietf.org/html/rfc6458#section-8.1.16 states:

   Note that the underlying SCTP implementation may fragment into
   smaller sized chunks when the PMTU of the underlying association
   is smaller than the value set by the user.

So this means the user cannot rely on this option to turn off SCTP
fragmentation and have SCTP pass IP packets down the stack so that
IP does the fragmentation.

That is why I said the user can use this option to ask the SCTP
layer to use a smaller value than the one deduced from the PMTU.
That is something you can do safely.

Best regards
Michael

>> Please note that this only affects DATA chunks, not
>> the whole packet.
> I was referring to the size the layer above SCTP deals with, not the
> layer below.
>
> Joe
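The behaviour described in the RFC 6458 text quoted above can be sketched as a simple cap: the user data carried per DATA chunk is limited by both the user's SCTP_MAXSEG value and the limit derived from the PMTU. This is only an illustration, not a socket API. The function names are hypothetical, the header sizes assume IPv4 without options, and treating the SCTP_MAXSEG value as a limit on user data per chunk (with 0 meaning "no user limit") is an assumption.

```python
import math

IPV4_HEADER = 20         # assumption: IPv4 header without options
SCTP_COMMON_HEADER = 12  # SCTP common header
DATA_CHUNK_HEADER = 16   # DATA chunk header


def pmtu_payload_limit(pmtu, ip_header=IPV4_HEADER):
    """User-data bytes that fit in one DATA chunk of one unfragmented packet."""
    room = pmtu - ip_header - SCTP_COMMON_HEADER - DATA_CHUNK_HEADER
    return room - room % 4  # chunks are padded to a 4-byte boundary


def effective_chunk_payload(pmtu, maxseg=0):
    """Hypothetical helper: the smaller of the user limit and the PMTU limit."""
    limit = pmtu_payload_limit(pmtu)
    if maxseg <= 0:          # assumption: 0 means the user set no limit
        return limit
    return min(maxseg, limit)


def chunks_needed(message_len, pmtu, maxseg=0):
    """Number of DATA chunks SCTP would split a user message into."""
    return math.ceil(message_len / effective_chunk_payload(pmtu, maxseg))
```

Under these assumptions a value larger than the PMTU-derived limit has no effect on chunk size, which matches the point that the option can only be relied on to make chunks smaller.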
Re: [Taps] MTU / equivalent at the transport layer
On 12/9/2016 1:26 PM, Michael Tuexen wrote:
>
> Not sure what the reassembly limit is... SCTP handles arbitrary sized
> user messages at the receiver side by using partial delivery.
>
> The SCTP_MAXSEG allows a user to limit the size of DATA chunks without
> reducing the pmtu.

Yes, but that size can actually be larger than the PMTU, not just smaller.

> Please note that this only affects DATA chunks, not
> the whole packet.

I was referring to the size the layer above SCTP deals with, not the
layer below.

Joe
Re: [Taps] MTU / equivalent at the transport layer
> On 9 Dec 2016, at 22:12, Joe Touch wrote:
>
> On 12/9/2016 12:28 PM, Michael Tuexen wrote:
>>> ...
>>>> In the API description in https://tools.ietf.org/html/rfc6458 the MTU
>>>> exposed to the application via the API is "the number of bytes
>>>> available in an SCTP packet for chunks." I think this is the best we
>>>> can do at that interface...
>>> AFAICT, that's 1) in my list, e.g., the largest chunk that SCTP can send
>>> without having SCTP coordinate frag/reassembly. That doesn't itself
>> You need to take the DATA or I-DATA chunk header into account...
>> It is "the number of bytes in the packet for chunks", not "the number
>> of bytes for the chunk value for DATA (or I-DATA) chunks".
> Yes, but (see below).
>
>>
>>> indicate whether it's SCTP doing the rest or the network layer.
>> As I said, that is what is exposed to the upper layer via the API.
>> SCTP itself has procedures to detect the PMTU (of each path).
>> It does PMTU discovery by interacting with the IP layer...
>
> Yes, but it's possible that the chunksize the app sees from SCTP is
> related to SCTP's reassembly limit, not IP's and not the PMTU.
>
> At least that's how I read SCTP_MAXSEG.

Not sure what the reassembly limit is... SCTP handles arbitrary sized
user messages at the receiver side by using partial delivery.

The SCTP_MAXSEG allows a user to limit the size of DATA chunks without
reducing the pmtu. Please note that this only affects DATA chunks, not
the whole packet.

So that is why I was not referring to this option. The socket options
I was referring to are:
* SCTP_GET_PEER_ADDR_INFO defined in
  https://tools.ietf.org/html/rfc6458#section-8.2.2
* SCTP_PEER_ADDR_PARAMS defined in
  https://tools.ietf.org/html/rfc6458#section-8.1.12

Best regards
Michael

>
> Joe
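To make the distinction between "bytes in the packet for chunks" and "bytes of chunk value" concrete, here is a small sketch. The function name is hypothetical. The header sizes are the 16-byte DATA chunk header from RFC 4960 and the 20-byte I-DATA chunk header as later published in RFC 8260. It assumes a single chunk per packet.

```python
DATA_CHUNK_HEADER = 16    # DATA chunk header (RFC 4960)
I_DATA_CHUNK_HEADER = 20  # I-DATA chunk header (RFC 8260)


def user_bytes_per_chunk(bytes_for_chunks, interleaving=False):
    """Hypothetical helper: user data that fits in a single DATA or I-DATA chunk.

    bytes_for_chunks is the value the API exposes, i.e. the space in an
    SCTP packet after the IP header and the SCTP common header.
    """
    header = I_DATA_CHUNK_HEADER if interleaving else DATA_CHUNK_HEADER
    usable = bytes_for_chunks - header
    # Chunks are padded to a 4-byte boundary, so round down to a multiple of 4.
    return max(usable - usable % 4, 0)
```

For example, with a PMTU of 1500 over IPv4 without options, 1500 - 20 - 12 = 1468 bytes are available for chunks, of which 1452 can be user data in a DATA chunk and 1448 in an I-DATA chunk.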
Re: [Taps] MTU / equivalent at the transport layer
> On 7 Dec 2016, at 15:54, Michael Welzl wrote:
>
> Hi all,
>
> I have a problem with one particular primitive, or lack of it, in UDP,
> UDP-Lite and SCTP. It's something I just don't get.
>
> Consider this text from draft-fairhurst-taps-transports-usage-udp:
>
> "GET_INTERFACE_MTU: The GET_INTERFACE_MTU function a network-layer
> function that indicates the largest unfragmented IP packet that
> may be sent."
>
> Indeed, this is a network-layer function. It's about the interface, not about
> UDP. Does that mean that, to decide how many bytes fit in the payload of a
> packet, the programmer needs to know if it's IPv4 or IPv6, with or without
> options, and do the calculation?
> If so, isn't it extremely odd that UDP doesn't offer a primitive that
> provides a more useful number: the available space in its payload, in bytes?
>
> I also have the same question for SCTP. For TCP, it's obvious that the
> application shouldn't bother, but not for UDP or SCTP.

In the API description in https://tools.ietf.org/html/rfc6458 the MTU
exposed to the application via the API is "the number of bytes available
in an SCTP packet for chunks." I think this is the best we can do at that
interface...

Best regards
Michael

> At the last meeting, knowing the MTU was mentioned as one of the needs that
> latency-critical protocols have. I understand that - but I didn't include
> this primitive in the last version of the usage draft because it is a
> network-layer primitive... now I don't know how to approach this.
>
> Thoughts? Suggestions?
>
> Cheers,
> Michael
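The calculation the quoted question alludes to can be sketched as follows. This is only an illustration of what an application would have to do by hand today. The function name and the `ip_extra` parameter are hypothetical, and the sketch assumes the application somehow knows the IP version and the size of any IPv4 options or IPv6 extension headers.

```python
UDP_HEADER = 8          # fixed UDP header size
IPV4_BASE_HEADER = 20   # IPv4 header without options
IPV6_BASE_HEADER = 40   # IPv6 base header without extension headers


def max_udp_payload(interface_mtu, ip_version, ip_extra=0):
    """Hypothetical helper: UDP payload bytes that fit in one unfragmented packet.

    ip_extra is the size of IPv4 options or IPv6 extension headers,
    which the caller is assumed to know.
    """
    if ip_version == 4:
        base = IPV4_BASE_HEADER
    elif ip_version == 6:
        base = IPV6_BASE_HEADER
    else:
        raise ValueError("ip_version must be 4 or 6")
    return interface_mtu - base - ip_extra - UDP_HEADER
```

With an interface MTU of 1500 this gives 1472 bytes for IPv4 and 1452 bytes for IPv6, which is exactly the kind of protocol-specific arithmetic the question argues should sit behind a transport-layer primitive.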
Re: [Taps] MTU / equivalent at the transport layer
On 12/9/2016 12:56 AM, Michael Welzl wrote:
> These things can be achieved by only changing the implementation of
> transports to locally provide some more of its internal information to a
> system on top; they don't change anything on the wire...

FWIW, we really need to stop using that phrase ("on the wire").

A protocol is defined by:
- its interface to the layer above
- its interface to the layer below
- the messages (the "on the wire" part)
- time
- its state machine and how it reacts to the above 4 items

A protocol *implementation* can vary but still interoperate with other
protocols only when NONE of the above change.

Joe
Re: [Taps] MTU / equivalent at the transport layer
> On 09 Dec 2016, at 16:18, Joe Touch wrote:
>
> On 12/9/2016 12:09 AM, Michael Welzl wrote:
>>> On 07 Dec 2016, at 20:29, Joe Touch wrote:
>>>
>>> FYI, there are two different "largest messages" at the transport layer:
>>>
>>> 1) the size of the message that CAN be delivered at all
>> True... I wasn't thinking of that, but yes.
>>
>>> 2) the size of the message that can be delivered without network-layer
>>> fragmentation (there are no guarantees about link-layer - see ATM or the
>>> recent discussion on tunnel MTUs on INTAREA)
>>>
>>> MTU generally refers to the *link payload*. At that point, transports
>>> have to account for network headers, network options, transport headers,
>>> and transport options too. See RFC6691.
>>>
>>> MSS refers to the transport message size AFAICT. It is *sometimes* tuned
>>> to MTU-headers but not always.
>>>
>>> E.g., for IPv6, link MTU is required to be at least 1280, but the
>>> src-dst transit MTU is required to be at least 1500. So a transport that
>>> wants to match sizes and reduce fragmentation issues would pick
>>> 1280-IPh-IPo-TCPh-TCPo, but a transport is supposed to be able to trust
>>> that 1500-IPh-IPo-TCPh-TCPo can still get through at least some of the time.
>> So I'm getting the impression that the answer to my question really is that,
>> to figure out 2) (which I was concerned with), an application programmer
>> needs to do the calculation her/himself.
>
> To figure out 2), the transport layer needs to know the unfragmented
> link MTU, the size of all of the network headers (including options),
> and the sizes of its own headers and options.
>
> It also sometimes assumes that the transport can control the "DF" bit
> (for IPv4).

Yes - but that hardly sounds worse to me than requiring the application
programmer to do this protocol-specific calculation by hand...

> However, this all breaks down if the app makes the wrong choice because
> the net can (will, and should) source fragment if it gets a message that
> turns out to be too big for one fragment anyway.
>
>> Not a big deal - and maybe some systems offer a function to give you the
>> size of a message that won't be fragmented.
>
> Remember that - at best - you're optimizing for the next layer down
> only. You can't know whether that net layer message is link fragmented
> (e.g., as in ATM) or tunnel fragmented (as needs to be required or this
> whole MTU concept breaks down).

Sure - but that's something end systems just can't see. It's information
up to and including the IP layer that should be correctly handed over up
the stack, inside the host, with all the caveats this information comes
with.

>> However: this calculation is transport protocol dependent, which we really
>> don't want to have in TAPS.
>
> If you want to fix this, you need to change the API to the net layer to
> provide immediate feedback. When transport hands a segment to network,
> it has to get a "call failed" if the message is too big - and we really
> do need transport layers to be able to pick between "too big for
> non-fragmented net layer" and "too big for the net layer even with frag".
>
> Merely handing info to the transport layer might not be enough, esp.
> when net layer option lengths change.

True if you want to cleanly fix it across the RFC-specified stack, but
that's beyond the concern of TAPS - it becomes a requirement from the
TAPS WG. Does that make sense?

Cheers,
Michael
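As a rough illustration of why this calculation is protocol dependent, the sketch below evaluates the quoted "MTU-IPh-IPo-TCPh-TCPo" style expression for several transports. The function name, the table, and the defaults are hypothetical. The header sizes are the fixed base headers only, and the "sctp" entry assumes the common header plus one DATA chunk header.

```python
# Fixed transport header sizes in bytes (options excluded).
# "sctp" assumes the 12-byte common header plus one 16-byte DATA chunk header.
TRANSPORT_HEADER = {"udp": 8, "udp-lite": 8, "tcp": 20, "sctp": 28}


def max_unfragmented_message(protocol, link_mtu, ip_header=40,
                             ip_options=0, transport_options=0):
    """Hypothetical local function: largest message sent without IP fragmentation.

    Defaults assume an IPv6 base header (40 bytes) and no options.
    """
    size = (link_mtu - ip_header - ip_options
            - TRANSPORT_HEADER[protocol] - transport_options)
    if size <= 0:
        raise ValueError("headers and options leave no room for payload")
    return size
```

For TCP over IPv6 without options, the 1280-byte case yields 1220 bytes. The same call with a 1500-byte MTU and 12 bytes of TCP options yields 1428 bytes. A system on top would only need to name the protocol, which keeps the arithmetic out of the application.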
[Taps] Document Action: 'Services provided by IETF transport protocols and congestion control mechanisms' to Informational RFC (draft-ietf-taps-transports-14.txt)
The IESG has approved the following document:
- 'Services provided by IETF transport protocols and congestion control
  mechanisms'
  (draft-ietf-taps-transports-14.txt) as Informational RFC

This document is the product of the Transport Services Working Group.

The IESG contact persons are Mirja Kühlewind and Spencer Dawkins.

A URL of this Internet Draft is:
https://datatracker.ietf.org/doc/draft-ietf-taps-transports/

Technical Summary

This document describes, surveys, classifies and compares the protocol
mechanisms provided by existing IETF protocols, as background for
determining a common set of transport services. Protocols addressed
include TCP, SCTP, UDP, UDP-Lite, DCCP, ICMP, RTP, FLUTE/ALC, NORM, TLS,
DTLS, and HTTP when used as a pseudotransport.

It captures important analysis needed for the TAPS working group goal of
developing an abstract API enabling applications to make use of modern
transports with the help of TAPS mechanisms, for example to probe and
verify end-to-end protocol transparency. This is a useful first step by
the TAPS working group to proposing future abstractions and mechanisms.

Working Group Summary

All the protocols referenced in this document are products of the IETF.
The goal here is to introduce consistent terminology and pull together a
common view of a number of well-known protocols. The working group
struggled early on in finding the right level of abstraction but was able
to achieve consensus on the approach contained in the doc. Each protocol
section had one or two active authors who are experts in their section
and went through multiple revisions. The result is that about a dozen
contributors have provided text so the engagement was high, compared to
the active mailing list members. A few objections have been raised about
whether the overall effort will be useful but the contents of this draft
have not been controversial.

The working group held a last call that spanned an IETF meeting with a
number of cleanup tasks identified. Some new introductory text and
restructuring of the doc was introduced and a second, online last call
produced no comments. It is the opinion of the shepherd that the document
is ready for publication.

Document Quality

The document looks fine (to the AD).

This is a survey of existing transport protocols, so the usual questions
about implementations, etc. don't apply in this case.

Personnel

The document shepherd is Aaron Falk.
The responsible Area Director is Spencer Dawkins.
Re: [Taps] MTU / equivalent at the transport layer
On 12/9/2016 12:09 AM, Michael Welzl wrote:
>> On 07 Dec 2016, at 20:29, Joe Touch wrote:
>>
>> FYI, there are two different "largest messages" at the transport layer:
>>
>> 1) the size of the message that CAN be delivered at all
> True... I wasn't thinking of that, but yes.
>
>> 2) the size of the message that can be delivered without network-layer
>> fragmentation (there are no guarantees about link-layer - see ATM or the
>> recent discussion on tunnel MTUs on INTAREA)
>>
>> MTU generally refers to the *link payload*. At that point, transports
>> have to account for network headers, network options, transport headers,
>> and transport options too. See RFC6691.
>>
>> MSS refers to the transport message size AFAICT. It is *sometimes* tuned
>> to MTU-headers but not always.
>>
>> E.g., for IPv6, link MTU is required to be at least 1280, but the
>> src-dst transit MTU is required to be at least 1500. So a transport that
>> wants to match sizes and reduce fragmentation issues would pick
>> 1280-IPh-IPo-TCPh-TCPo, but a transport is supposed to be able to trust
>> that 1500-IPh-IPo-TCPh-TCPo can still get through at least some of the time.
> So I'm getting the impression that the answer to my question really is that,
> to figure out 2) (which I was concerned with), an application programmer
> needs to do the calculation her/himself.

To figure out 2), the transport layer needs to know the unfragmented
link MTU, the size of all of the network headers (including options),
and the sizes of its own headers and options.

It also sometimes assumes that the transport can control the "DF" bit
(for IPv4).

However, this all breaks down if the app makes the wrong choice because
the net can (will, and should) source fragment if it gets a message that
turns out to be too big for one fragment anyway.

> Not a big deal - and maybe some systems offer a function to give you the size
> of a message that won't be fragmented.

Remember that - at best - you're optimizing for the next layer down
only. You can't know whether that net layer message is link fragmented
(e.g., as in ATM) or tunnel fragmented (as needs to be required or this
whole MTU concept breaks down).

>
> However: this calculation is transport protocol dependent, which we really
> don't want to have in TAPS.

If you want to fix this, you need to change the API to the net layer to
provide immediate feedback. When transport hands a segment to network,
it has to get a "call failed" if the message is too big - and we really
do need transport layers to be able to pick between "too big for
non-fragmented net layer" and "too big for the net layer even with frag".

Merely handing info to the transport layer might not be enough, esp.
when net layer option lengths change.

Joe
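The kind of immediate feedback described above could look roughly like the following sketch. Everything here is hypothetical: no existing network-layer API is being described, and the names `SendResult` and `net_send` as well as the two limit parameters are invented for illustration. The header length is passed on every call because, as noted, option lengths can change.

```python
from enum import Enum


class SendResult(Enum):
    """Hypothetical outcomes reported by the network layer to the transport."""
    OK = "fits without fragmentation"
    TOO_BIG_UNFRAGMENTED = "too big for non-fragmented net layer"
    TOO_BIG_EVEN_FRAGMENTED = "too big for the net layer even with frag"


def net_send(segment_len, header_len, unfragmented_limit, fragmented_limit):
    """Hypothetical network-layer send call that reports size failures.

    header_len covers the current network header plus options.
    unfragmented_limit is the largest packet sent without fragmentation,
    fragmented_limit the largest packet the layer can carry at all.
    """
    packet_len = header_len + segment_len
    if packet_len <= unfragmented_limit:
        return SendResult.OK
    if packet_len <= fragmented_limit:
        return SendResult.TOO_BIG_UNFRAGMENTED
    return SendResult.TOO_BIG_EVEN_FRAGMENTED
```

With this shape, a 1460-byte segment that fits behind a 40-byte header stops fitting once the header grows to 48 bytes, and the transport learns that at the time of the call rather than from stale cached information.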
Re: [Taps] MTU / equivalent at the transport layer
> On 09 Dec 2016, at 09:46, Gorry Fairhurst wrote:
>
> On 09/12/2016 08:09, Michael Welzl wrote:
>>> On 07 Dec 2016, at 20:29, Joe Touch wrote:
>>>
>>> FYI, there are two different "largest messages" at the transport layer:
>>>
>>> 1) the size of the message that CAN be delivered at all
>> True... I wasn't thinking of that, but yes.
>>
>>> 2) the size of the message that can be delivered without network-layer
>>> fragmentation (there are no guarantees about link-layer - see ATM or the
>>> recent discussion on tunnel MTUs on INTAREA)
>>>
>>> MTU generally refers to the *link payload*. At that point, transports
>>> have to account for network headers, network options, transport headers,
>>> and transport options too. See RFC6691.
>>>
>>> MSS refers to the transport message size AFAICT. It is *sometimes* tuned
>>> to MTU-headers but not always.
>>>
>>> E.g., for IPv6, link MTU is required to be at least 1280, but the
>>> src-dst transit MTU is required to be at least 1500. So a transport that
>>> wants to match sizes and reduce fragmentation issues would pick
>>> 1280-IPh-IPo-TCPh-TCPo, but a transport is supposed to be able to trust
>>> that 1500-IPh-IPo-TCPh-TCPo can still get through at least some of the time.
>> So I'm getting the impression that the answer to my question really is that,
>> to figure out 2) (which I was concerned with), an application programmer
>> needs to do the calculation her/himself.
>> Not a big deal - and maybe some systems offer a function to give you the
>> size of a message that won't be fragmented.
>>
>> However: this calculation is transport protocol dependent, which we really
>> don't want to have in TAPS.
>>
>> I conclude: a TAPS system must implement a local function to provide this
>> information, both 1) and 2) above.
> Referring back to DCCP as an RFC reference which agrees with this.
>
> I agree with (2) for a datagram transport, similar to MPS in DCCP. If you are
> going to allow different transports to be selected by TAPS - it's hard to
> know even if the transport below the TAPS API will finally need to choose to
> add an option to satisfy the service (e.g., timestamp, checksum, whatever)
> and the size of such an option likely varies depending on the finally chosen
> protocol. To me, this suggests it is useful to know how many bytes the app can
> send with reasonable chance of unfragmented delivery.
>
> Apps should be allowed to send more if they wish. If an app is doing datagram
> path MTU discovery, this may result in raising the maximum unfragmented
> datagram size.
>
> I'm OK with being able to retrieve the absolute maximum allowed - that could
> be useful in determining probe sizes for an application doing path MTU
> discovery. In DCCP there is a hard limit, called the current congestion
> control maximum packet size (CCMPS), the largest permitted by the stack using
> the current congestion control method. That's bound to be less than or equal
> to what is permitted for the local Interface MTU.
>
> Gorry

I agree with all that. But combining this with the wish to have message
dependencies, a wish that is in both draft-trammell-post-sockets and
draft-mcquistin-taps-low-latency-services, yet also not supported by any
of the already defined transport protocols, it becomes clear that we must
define some extra functionality for a TAPS system, when we get to charter
item 3.

These things can be achieved by only changing the implementation of
transports to locally provide some more of its internal information to a
system on top; they don't change anything on the wire...

Cheers,
Michael