Re: [Taps] MTU / equivalent at the transport layer

2016-12-07 Thread Gorry Fairhurst
Not quite answering this question, but for DCCP there is clearly also a 
transport-level understanding of the maximum datagram size, as in RFC 4340:


"A DCCP implementation MUST maintain the maximum packet size (MPS) 
allowed for each active DCCP session."

and
"The MPS reported to the application SHOULD be influenced by the size 
expected to be required for DCCP headers and options."


Gorry

On 07/12/2016 14:54, Michael Welzl wrote:

Hi all,

I have a problem with one particular primitive, or lack of it, in UDP, UDP-Lite 
and SCTP. It's something I just don't get.

Consider this text from draft-fairhurst-taps-transports-usage-udp:

"GET_INTERFACE_MTU:  The GET_INTERFACE_MTU function is a network-layer
  function that indicates the largest unfragmented IP packet that
  may be sent."

Indeed, this is a network-layer function. It's about the interface, not about 
UDP. Does that mean that, to decide how many bytes fit in the payload of a 
packet, the programmer needs to know if it's IPv4 or IPv6, with or without 
options, and do the calculation?
If so, isn't it extremely odd that UDP doesn't offer a primitive that provides 
a more useful number: the available space in its payload, in bytes?
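The by-hand calculation being questioned here can be sketched as follows (a minimal illustration, assuming fixed header sizes, i.e., no IPv4 options and no IPv6 extension headers):

```python
# Illustrative only: the per-protocol arithmetic an application would have
# to do by hand when given just the interface MTU. Header sizes assume no
# IPv4 options and no IPv6 extension headers.
IPV4_HEADER = 20  # bytes, without options
IPV6_HEADER = 40  # bytes, without extension headers
UDP_HEADER = 8    # bytes

def max_udp_payload(interface_mtu: int, ipv6: bool) -> int:
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return interface_mtu - ip_header - UDP_HEADER

print(max_udp_payload(1500, ipv6=False))  # 1472
print(max_udp_payload(1500, ipv6=True))   # 1452
```

This is exactly the protocol-dependent knowledge (address family, option/extension-header presence) the question argues an application should not need.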

I also have the same question for SCTP.  For TCP, it's obvious that the 
application shouldn't bother, but not for UDP or SCTP.
At the last meeting, knowing the MTU was mentioned as one of the needs that 
latency-critical protocols have. I understand that - but I didn't include this 
primitive in the last version of the usage draft because it is a network-layer 
primitive... now I don't know how to approach this.

Thoughts? Suggestions?

Cheers,
Michael

___
Taps mailing list
Taps@ietf.org
https://www.ietf.org/mailman/listinfo/taps





Re: [Taps] MTU / equivalent at the transport layer

2016-12-07 Thread Joe Touch
FYI, there are two different "largest messages" at the transport layer:

1) the size of the message that CAN be delivered at all

2) the size of the message that can be delivered without network-layer
fragmentation (there are no guarantees about link-layer - see ATM or the
recent discussion on tunnel MTUs on INTAREA)

MTU generally refers to the *link payload*. At that point, transports
have to account for network headers, network options, transport headers,
and transport options too. See RFC6691.

MSS refers to the transport message size, AFAICT. It is *sometimes* tuned
to MTU minus headers, but not always.

E.g., for IPv6, link MTU is required to be at least 1280, but the
src-dst transit MTU is required to be at least 1500. So a transport that
wants to match sizes and reduce fragmentation issues would pick
1280-IPh-IPo-TCPh-TCPo, but a transport is supposed to be able to trust
that 1500-IPh-IPo-TCPh-TCPo can still get through at least some of the time.
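The size arithmetic above, spelled out (a sketch assuming minimal headers, i.e., IPo = TCPo = 0):

```python
# Sketch of the 1280-IPh-IPo-TCPh-TCPo vs. 1500-IPh-IPo-TCPh-TCPo
# arithmetic, assuming no IPv6 extension headers and no TCP options.
IPV6_HEADER = 40  # bytes
TCP_HEADER = 20   # bytes, without options

# Conservative: fits the IPv6 minimum link MTU of 1280 bytes.
conservative_mss = 1280 - IPV6_HEADER - TCP_HEADER  # 1220
# Optimistic: assumes a 1500-byte path gets through.
optimistic_mss = 1500 - IPV6_HEADER - TCP_HEADER    # 1440

print(conservative_mss, optimistic_mss)
```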

Joe


On 12/7/2016 6:54 AM, Michael Welzl wrote:
> Hi all,
>
> I have a problem with one particular primitive, or lack of it, in UDP, 
> UDP-Lite and SCTP. It's something I just don't get.
>
> Consider this text from draft-fairhurst-taps-transports-usage-udp:
>
> "GET_INTERFACE_MTU:  The GET_INTERFACE_MTU function is a network-layer
>   function that indicates the largest unfragmented IP packet that
>   may be sent."
>
> Indeed, this is a network-layer function. It's about the interface, not about 
> UDP. Does that mean that, to decide how many bytes fit in the payload of a 
> packet, the programmer needs to know if it's IPv4 or IPv6, with or without 
> options, and do the calculation?
> If so, isn't it extremely odd that UDP doesn't offer a primitive that 
> provides a more useful number: the available space in its payload, in bytes?
>
> I also have the same question for SCTP.  For TCP, it's obvious that the 
> application shouldn't bother, but not for UDP or SCTP.
> At the last meeting, knowing the MTU was mentioned as one of the needs that 
> latency-critical protocols have. I understand that - but I didn't include 
> this primitive in the last version of the usage draft because it is a 
> network-layer primitive... now I don't know how to approach this.
>
> Thoughts? Suggestions?
>
> Cheers,
> Michael
>



Re: [Taps] MTU / equivalent at the transport layer

2016-12-09 Thread Michael Welzl

> On 07 Dec 2016, at 20:29, Joe Touch  wrote:
> 
> FYI, there are two different "largest messages" at the transport layer:
> 
> 1) the size of the message that CAN be delivered at all

True... I wasn't thinking of that, but yes.


> 2) the size of the message that can be delivered without network-layer
> fragmentation (there are no guarantees about link-layer - see ATM or the
> recent discussion on tunnel MTUs on INTAREA)
> 
> MTU generally refers to the *link payload*. At that point, transports
> have to account for network headers, network options, transport headers,
> and transport options too. See RFC6691.
> 
> MSS refers to the transport message size AFAICT. It is *sometimes* tuned
> to MTU-headers but not always.
> 
> E.g., for IPv6, link MTU is required to be at least 1280, but the
> src-dst transit MTU is required to be at least 1500. So a transport that
> wants to match sizes and reduce fragmentation issues would pick
> 1280-IPh-IPo-TCPh-TCPo, but a transport is supposed to be able to trust
> that 1500-IPh-IPo-TCPh-TCPo can still get through at least some of the time.

So I'm getting the impression that the answer to my question really is that, to 
figure out 2)  (which I was concerned with), an application programmer needs to 
do the calculation her/himself.
Not a big deal - and maybe some systems offer a function to give you the size 
of a message that won't be fragmented.

However: this calculation is transport protocol dependent, which we really 
don't want to have in TAPS.

I conclude: a TAPS system must implement a local function to provide this 
information, both 1) and 2) above.
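A local function of this kind could look like the following sketch (all names invented here, not from any draft; the example calculation is for plain UDP over IPv4 without options):

```python
# Hypothetical sketch (all names invented) of the two values a TAPS system
# could expose locally, following 1) and 2) above.
from dataclasses import dataclass

@dataclass
class MessageSizeLimits:
    max_message_size: int        # 1) largest message deliverable at all
    max_unfragmented_size: int   # 2) largest message avoiding IP fragmentation

def query_limits_udp_ipv4(interface_mtu: int) -> MessageSizeLimits:
    """Example calculation for UDP over IPv4 without options."""
    ip_header, udp_header = 20, 8
    return MessageSizeLimits(
        # IPv4's 16-bit total-length field caps the whole packet at 65535.
        max_message_size=65535 - ip_header - udp_header,
        max_unfragmented_size=interface_mtu - ip_header - udp_header,
    )

limits = query_limits_udp_ipv4(1500)
print(limits.max_message_size, limits.max_unfragmented_size)  # 65507 1472
```

The point is that this arithmetic lives inside the TAPS system, per transport, rather than in the application.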

Cheers,
Michael



Re: [Taps] MTU / equivalent at the transport layer

2016-12-09 Thread Gorry Fairhurst

On 09/12/2016 08:09, Michael Welzl wrote:

On 07 Dec 2016, at 20:29, Joe Touch  wrote:

FYI, there are two different "largest messages" at the transport layer:

1) the size of the message that CAN be delivered at all

True... I wasn't thinking of that, but yes.



2) the size of the message that can be delivered without network-layer
fragmentation (there are no guarantees about link-layer - see ATM or the
recent discussion on tunnel MTUs on INTAREA)

MTU generally refers to the *link payload*. At that point, transports
have to account for network headers, network options, transport headers,
and transport options too. See RFC6691.

MSS refers to the transport message size AFAICT. It is *sometimes* tuned
to MTU-headers but not always.

E.g., for IPv6, link MTU is required to be at least 1280, but the
src-dst transit MTU is required to be at least 1500. So a transport that
wants to match sizes and reduce fragmentation issues would pick
1280-IPh-IPo-TCPh-TCPo, but a transport is supposed to be able to trust
that 1500-IPh-IPo-TCPh-TCPo can still get through at least some of the time.

So I'm getting the impression that the answer to my question really is that, to 
figure out 2)  (which I was concerned with), an application programmer needs to 
do the calculation her/himself.
Not a big deal - and maybe some systems offer a function to give you the size 
of a message that won't be fragmented.

However: this calculation is transport protocol dependent, which we really 
don't want to have in TAPS.

I conclude: a TAPS system must implement a local function to provide this 
information, both 1) and 2) above.

Referring back to DCCP as an RFC reference that agrees with this.

I agree with (2) for a datagram transport, similar to MPS in DCCP. If 
you are going to allow different transports to be selected by TAPS, 
it's hard to know whether the transport below the TAPS API will 
ultimately need to add an option to satisfy the service (e.g., 
timestamp, checksum, whatever), and the size of such an option likely 
varies depending on the finally chosen protocol. To me, this suggests 
it is useful to know how many bytes the app can send with a reasonable 
chance of unfragmented delivery.


Apps should be allowed to send more if they wish. If an app is doing 
datagram path MTU discovery, this may result in raising the maximum 
unfragmented datagram size.


I'm OK with being able to retrieve the absolute maximum allowed - that 
could be useful in determining probe sizes for an application doing path 
MTU discovery. In DCCP there is a hard limit, called the current 
congestion control maximum packet size (CCMPS): the largest packet size 
permitted by the stack using the current congestion control method. 
That's bound to be less than or equal to what is permitted by the local 
interface MTU.
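The resulting ordering of limits can be sketched as follows (a hypothetical illustration; the function name and values are invented, not a DCCP API):

```python
# Hypothetical sketch (names invented): clamping a PMTUD probe size by a
# hard stack limit such as DCCP's CCMPS, per the description above.
# CCMPS is bound to be <= what the local interface MTU permits, so a
# probe can never usefully exceed either.
def next_probe_size(candidate: int, ccmps: int, interface_limit: int) -> int:
    return min(candidate, ccmps, interface_limit)

print(next_probe_size(1500, ccmps=1400, interface_limit=1500))  # 1400
```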


Gorry

Cheers,
Michael





Re: [Taps] MTU / equivalent at the transport layer

2016-12-09 Thread Michael Welzl

> On 09 Dec 2016, at 09:46, Gorry Fairhurst  wrote:
> 
> On 09/12/2016 08:09, Michael Welzl wrote:
>>> On 07 Dec 2016, at 20:29, Joe Touch  wrote:
>>> 
>>> FYI, there are two different "largest messages" at the transport layer:
>>> 
>>> 1) the size of the message that CAN be delivered at all
>> True... I wasn't thinking of that, but yes.
>> 
>> 
>>> 2) the size of the message that can be delivered without network-layer
>>> fragmentation (there are no guarantees about link-layer - see ATM or the
>>> recent discussion on tunnel MTUs on INTAREA)
>>> 
>>> MTU generally refers to the *link payload*. At that point, transports
>>> have to account for network headers, network options, transport headers,
>>> and transport options too. See RFC6691.
>>> 
>>> MSS refers to the transport message size AFAICT. It is *sometimes* tuned
>>> to MTU-headers but not always.
>>> 
>>> E.g., for IPv6, link MTU is required to be at least 1280, but the
>>> src-dst transit MTU is required to be at least 1500. So a transport that
>>> wants to match sizes and reduce fragmentation issues would pick
>>> 1280-IPh-IPo-TCPh-TCPo, but a transport is supposed to be able to trust
>>> that 1500-IPh-IPo-TCPh-TCPo can still get through at least some of the time.
>> So I'm getting the impression that the answer to my question really is that, 
>> to figure out 2)  (which I was concerned with), an application programmer 
>> needs to do the calculation her/himself.
>> Not a big deal - and maybe some systems offer a function to give you the 
>> size of a message that won't be fragmented.
>> 
>> However: this calculation is transport protocol dependent, which we really 
>> don't want to have in TAPS.
>> 
>> I conclude: a TAPS system must implement a local function to provide this 
>> information, both 1) and 2) above.
> Referring back to DCCP as an RFC reference that agrees with this.
> 
> I agree with (2) for a datagram transport, similar to MPS in DCCP. If you are 
> going to allow different transports to be selected by TAPS, it's hard to 
> know whether the transport below the TAPS API will ultimately need to 
> add an option to satisfy the service (e.g., timestamp, checksum, whatever), 
> and the size of such an option likely varies depending on the finally chosen 
> protocol. To me, this suggests it is useful to know how many bytes the app can 
> send with a reasonable chance of unfragmented delivery.
> 
> Apps should be allowed to send more if they wish. If an app is doing datagram 
> path MTU discovery, this may result in raising the maximum unfragmented 
> datagram size.
> 
> I'm OK with being able to retrieve the absolute maximum allowed - that could 
> be useful in determining probe sizes for an application doing path MTU 
> discovery. In DCCP there is a hard limit, called the current congestion 
> control maximum packet size (CCMPS): the largest packet size permitted by the 
> stack using the current congestion control method. That's bound to be less 
> than or equal to what is permitted by the local interface MTU.
> 
> Gorry

I agree with all that.
But combining this with the wish to have message dependencies (a wish that 
appears in both draft-trammell-post-sockets and 
draft-mcquistin-taps-low-latency-services, yet is not supported by any of the 
already defined transport protocols), it becomes clear that we must define 
some extra functionality for a TAPS system when we get to charter item 3.

These things can be achieved by only changing the implementations of 
transports to locally provide more of their internal information to a system 
on top; they don't change anything on the wire...

Cheers,
Michael



Re: [Taps] MTU / equivalent at the transport layer

2016-12-09 Thread Joe Touch


On 12/9/2016 12:09 AM, Michael Welzl wrote:
>> On 07 Dec 2016, at 20:29, Joe Touch  wrote:
>>
>> FYI, there are two different "largest messages" at the transport layer:
>>
>> 1) the size of the message that CAN be delivered at all
> True... I wasn't thinking of that, but yes.
>
>
>> 2) the size of the message that can be delivered without network-layer
>> fragmentation (there are no guarantees about link-layer - see ATM or the
>> recent discussion on tunnel MTUs on INTAREA)
>>
>> MTU generally refers to the *link payload*. At that point, transports
>> have to account for network headers, network options, transport headers,
>> and transport options too. See RFC6691.
>>
>> MSS refers to the transport message size AFAICT. It is *sometimes* tuned
>> to MTU-headers but not always.
>>
>> E.g., for IPv6, link MTU is required to be at least 1280, but the
>> src-dst transit MTU is required to be at least 1500. So a transport that
>> wants to match sizes and reduce fragmentation issues would pick
>> 1280-IPh-IPo-TCPh-TCPo, but a transport is supposed to be able to trust
>> that 1500-IPh-IPo-TCPh-TCPo can still get through at least some of the time.
> So I'm getting the impression that the answer to my question really is that, 
> to figure out 2)  (which I was concerned with), an application programmer 
> needs to do the calculation her/himself.

To figure out 2), the transport layer needs to know the unfragmented
link MTU, the size of all of the network headers (including options),
and the sizes of its own headers and options.

It's also sometimes assumed that the transport can control the "DF" bit
(for IPv4).

However, this all breaks down if the app makes the wrong choice, because
the net can (will, and should) source fragment if it gets a message that
turns out to be too big for one fragment anyway.

> Not a big deal - and maybe some systems offer a function to give you the size 
> of a message that won't be fragmented.

Remember that - at best - you're optimizing for the next layer down
only. You can't know whether that net layer message is link fragmented
(e.g., as in ATM) or tunnel fragmented (as needs to be required or this
whole MTU concept breaks down).

>
> However: this calculation is transport protocol dependent, which we really 
> don't want to have in TAPS.

If you want to fix this, you need to change the API to the net layer to
provide immediate feedback. When transport hands a segment to network,
it has to get a "call failed" if the message is too big - and we really
do need transport layers to be able to pick between "too big for
non-fragmented net layer" and "too big for the net layer even with frag".

Merely handing info to the transport layer might not be enough, esp.
when net layer option lengths change.
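For one case this immediate "call failed" feedback already exists: on Linux, an oversized UDP send() is rejected synchronously with EMSGSIZE. A sketch (the IP_MTU_DISCOVER/IP_PMTUDISC_DO numeric values are Linux-specific):

```python
# Sketch of the "call failed" behaviour described above, as it already
# exists for UDP on Linux: an oversized send() fails immediately with
# EMSGSIZE instead of being silently fragmented.
import errno
import socket

def send_errno(size):
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.connect(("127.0.0.1", 9))  # discard port; nothing needs to answer
    # Linux-specific constants: request path-MTU discovery, i.e., set DF.
    IP_MTU_DISCOVER, IP_PMTUDISC_DO = 10, 2
    s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    try:
        s.send(b"x" * size)
        return None
    except OSError as e:
        return e.errno
    finally:
        s.close()

# 70000 bytes exceeds any MTU (and the IPv4 datagram maximum).
print(send_errno(70000) == errno.EMSGSIZE)
```

Note this only distinguishes "too big to send at all"; it does not by itself give the transport the finer-grained "too big without fragmentation" signal argued for above.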

Joe



Re: [Taps] MTU / equivalent at the transport layer

2016-12-09 Thread Michael Welzl

> On 09 Dec 2016, at 16:18, Joe Touch  wrote:
> 
> 
> 
> On 12/9/2016 12:09 AM, Michael Welzl wrote:
>>> On 07 Dec 2016, at 20:29, Joe Touch  wrote:
>>> 
>>> FYI, there are two different "largest messages" at the transport layer:
>>> 
>>> 1) the size of the message that CAN be delivered at all
>> True... I wasn't thinking of that, but yes.
>> 
>> 
>>> 2) the size of the message that can be delivered without network-layer
>>> fragmentation (there are no guarantees about link-layer - see ATM or the
>>> recent discussion on tunnel MTUs on INTAREA)
>>> 
>>> MTU generally refers to the *link payload*. At that point, transports
>>> have to account for network headers, network options, transport headers,
>>> and transport options too. See RFC6691.
>>> 
>>> MSS refers to the transport message size AFAICT. It is *sometimes* tuned
>>> to MTU-headers but not always.
>>> 
>>> E.g., for IPv6, link MTU is required to be at least 1280, but the
>>> src-dst transit MTU is required to be at least 1500. So a transport that
>>> wants to match sizes and reduce fragmentation issues would pick
>>> 1280-IPh-IPo-TCPh-TCPo, but a transport is supposed to be able to trust
>>> that 1500-IPh-IPo-TCPh-TCPo can still get through at least some of the time.
>> So I'm getting the impression that the answer to my question really is that, 
>> to figure out 2)  (which I was concerned with), an application programmer 
>> needs to do the calculation her/himself.
> 
> To figure out 2), the transport layer needs to know the unfragmented
> link MTU, the size of all of the network headers (including options),
> and the sizes of its own headers and options.
> 
> It's also sometimes assumed that the transport can control the "DF" bit
> (for IPv4).

Yes - but that hardly sounds worse to me than requiring the application 
programmer to do this protocol-specific calculation by hand...


> However, this all breaks down if the app makes the wrong choice because
> the net can (will, and should) source fragment if it gets a message that
> turns out  to be too big for one fragment anyway.
> 
>> Not a big deal - and maybe some systems offer a function to give you the 
>> size of a message that won't be fragmented.
> 
> Remember that - at best - you're optimizing for the next layer down
> only. You can't know whether that net layer message is link fragmented
> (e.g., as in ATM) or tunnel fragmented (as needs to be required or this
> whole MTU concept breaks down).

Sure - but that's something end systems just can't see. It's information up to 
and including the IP layer that should be correctly handed over up the stack, 
inside the host, with all the caveats this information comes with.


>> However: this calculation is transport protocol dependent, which we really 
>> don't want to have in TAPS.
> 
> If you want to fix this, you need to change the API to the net layer to
> provide immediate feedback. When transport hands a segment to network,
> it has to get a "call failed" if the message is too big - and we really
> do need transport layers to be able to pick between "too big for
> non-fragmented net layer" and "too big for the net layer even with frag".
> 
> Merely handing info to the transport layer might not be enough, esp.
> when net layer option lengths change.

True if you want to cleanly fix it across the RFC-specified stack, but that's 
beyond the concern of TAPS - it becomes a requirement from the TAPS WG. Does 
that make sense?

Cheers,
Michael



Re: [Taps] MTU / equivalent at the transport layer

2016-12-09 Thread Joe Touch


On 12/9/2016 8:12 AM, Michael Welzl wrote:
>> On 09 Dec 2016, at 16:18, Joe Touch  wrote:
>>
>>
>>
>> On 12/9/2016 12:09 AM, Michael Welzl wrote:
 On 07 Dec 2016, at 20:29, Joe Touch  wrote:

 FYI, there are two different "largest messages" at the transport layer:

 1) the size of the message that CAN be delivered at all
>>> True... I wasn't thinking of that, but yes.
>>>
>>>
 2) the size of the message that can be delivered without network-layer
 fragmentation (there are no guarantees about link-layer - see ATM or the
 recent discussion on tunnel MTUs on INTAREA)

 MTU generally refers to the *link payload*. At that point, transports
 have to account for network headers, network options, transport headers,
 and transport options too. See RFC6691.

 MSS refers to the transport message size AFAICT. It is *sometimes* tuned
 to MTU-headers but not always.

 E.g., for IPv6, link MTU is required to be at least 1280, but the
 src-dst transit MTU is required to be at least 1500. So a transport that
 wants to match sizes and reduce fragmentation issues would pick
 1280-IPh-IPo-TCPh-TCPo, but a transport is supposed to be able to trust
 that 1500-IPh-IPo-TCPh-TCPo can still get through at least some of the 
 time.
>>> So I'm getting the impression that the answer to my question really is 
>>> that, to figure out 2)  (which I was concerned with), an application 
>>> programmer needs to do the calculation her/himself.
>> To figure out 2), the transport layer needs to know the unfragmented
>> link MTU, the size of all of the network headers (including options),
>> and the sizes of its own headers and options.
>>
>> It's also sometimes assumed that the transport can control the "DF" bit
>> (for IPv4).
> Yes - but that hardly sounds worse to me than requiring the application 
> programmer to do this protocol-specific calculation by hand...

The app programmer needs to know what the transport can support, the
transport needs to know what net supports, etc.

Pushing the link MTU up the line and expecting all the other layers to
figure out what to do results in unnecessary complexity, never mind
undermining one of the key features of layering.

>
>> However, this all breaks down if the app makes the wrong choice because
>> the net can (will, and should) source fragment if it gets a message that
>> turns out  to be too big for one fragment anyway.
>>
>>> Not a big deal - and maybe some systems offer a function to give you the 
>>> size of a message that won't be fragmented.
>> Remember that - at best - you're optimizing for the next layer down
>> only. You can't know whether that net layer message is link fragmented
>> (e.g., as in ATM) or tunnel fragmented (as needs to be required or this
>> whole MTU concept breaks down).
> Sure - but that's something end systems just can't see. It's information up 
> to and including the IP layer that should be correctly handed over up the 
> stack, inside the host, with all the caveats this information comes with.
Why does that apply at the link layer but not other layers? If transport
can transfer and reassemble 1MB messages, then that's the "MTU" it needs
to tell the app layer. The same is true for net to tell transport, etc.

We've conflated the two between transport and net unnecessarily.

>
>>> However: this calculation is transport protocol dependent, which we really 
>>> don't want to have in TAPS.
>> If you want to fix this, you need to change the API to the net layer to
>> provide immediate feedback. When transport hands a segment to network,
>> it has to get a "call failed" if the message is too big - and we really
>> do need transport layers to be able to pick between "too big for
>> non-fragmented net layer" and "too big for the net layer even with frag".
>>
>> Merely handing info to the transport layer might not be enough, esp.
>> when net layer option lengths change.
> True if you want to cleanly fix it across the RFC-specified stack, but that's 
> beyond the concern of TAPS - it becomes a requirement from the TAPS WG. Does 
> that make sense?

Then this is part of the API requirements that TAPS should be
indicating, no?

Joe



Re: [Taps] MTU / equivalent at the transport layer

2016-12-09 Thread Joe Touch


On 12/9/2016 12:56 AM, Michael Welzl wrote:
> These things can be achieved by only changing the implementation of 
> transports to locally provide some more of its internal information to a 
> system on top; they don't change anything on the wire...
FWIW, we really need to stop using that phrase ("on the wire").

A protocol is defined by:
- its interface to the layer above
- its interface to the layer below
- the messages (the "on the wire" part)
- time
- its state machine and how it reacts to the above four items

A protocol *implementation* can vary but still interoperate with other
protocols only when NONE of the above change.

Joe



Re: [Taps] MTU / equivalent at the transport layer

2016-12-09 Thread Michael Tuexen

> On 7 Dec 2016, at 15:54, Michael Welzl  wrote:
> 
> Hi all,
> 
> I have a problem with one particular primitive, or lack of it, in UDP, 
> UDP-Lite and SCTP. It's something I just don't get.
> 
> Consider this text from draft-fairhurst-taps-transports-usage-udp:
> 
> "GET_INTERFACE_MTU:  The GET_INTERFACE_MTU function is a network-layer
>  function that indicates the largest unfragmented IP packet that
>  may be sent."
> 
> Indeed, this is a network-layer function. It's about the interface, not about 
> UDP. Does that mean that, to decide how many bytes fit in the payload of a 
> packet, the programmer needs to know if it's IPv4 or IPv6, with or without 
> options, and do the calculation?
> If so, isn't it extremely odd that UDP doesn't offer a primitive that 
> provides a more useful number: the available space in its payload, in bytes?
> 
> I also have the same question for SCTP.  For TCP, it's obvious that the 
> application shouldn't bother, but not for UDP or SCTP.
In the API description in https://tools.ietf.org/html/rfc6458 the MTU exposed 
to the application via the API is "the number of bytes available in an SCTP 
packet for chunks." I think this is the best we can do at that interface...
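The relationship between the PMTU and that RFC 6458 value can be sketched as follows (assuming IPv4 without options; note the DATA chunk header must still be subtracted to get the user-data bytes per chunk):

```python
# Sketch: how "bytes available in an SCTP packet for chunks" (RFC 6458)
# relates to the PMTU, assuming IPv4 without options. The DATA chunk
# header then still has to be subtracted to get user-data bytes.
IPV4_HEADER = 20         # bytes, without options
SCTP_COMMON_HEADER = 12  # bytes
DATA_CHUNK_HEADER = 16   # bytes (I-DATA chunks have a larger header)

def bytes_for_chunks(pmtu: int) -> int:
    return pmtu - IPV4_HEADER - SCTP_COMMON_HEADER

def user_data_per_data_chunk(pmtu: int) -> int:
    return bytes_for_chunks(pmtu) - DATA_CHUNK_HEADER

print(bytes_for_chunks(1500), user_data_per_data_chunk(1500))  # 1468 1452
```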

Best regards
Michael
> At the last meeting, knowing the MTU was mentioned as one of the needs that 
> latency-critical protocols have. I understand that - but I didn't include 
> this primitive in the last version of the usage draft because it is a 
> network-layer primitive... now I don't know how to approach this.
> 
> Thoughts? Suggestions?
> 
> Cheers,
> Michael
> 



Re: [Taps] MTU / equivalent at the transport layer

2016-12-09 Thread Joe Touch


On 12/9/2016 12:14 PM, Michael Tuexen wrote:
>> On 7 Dec 2016, at 15:54, Michael Welzl  wrote:
>>
>> Hi all,
>>
>> I have a problem with one particular primitive, or lack of it, in UDP, 
>> UDP-Lite and SCTP. It's something I just don't get.
>>
>> Consider this text from draft-fairhurst-taps-transports-usage-udp:
>>
>> "GET_INTERFACE_MTU:  The GET_INTERFACE_MTU function is a network-layer
>>  function that indicates the largest unfragmented IP packet that
>>  may be sent."
>>
>> Indeed, this is a network-layer function. It's about the interface, not 
>> about UDP. Does that mean that, to decide how many bytes fit in the payload 
>> of a packet, the programmer needs to know if it's IPv4 or IPv6, with or 
>> without options, and do the calculation?
>> If so, isn't it extremely odd that UDP doesn't offer a primitive that 
>> provides a more useful number: the available space in its payload, in bytes?
>>
>> I also have the same question for SCTP.  For TCP, it's obvious that the 
>> application shouldn't bother, but not for UDP or SCTP.
> In the API description in https://tools.ietf.org/html/rfc6458 the MTU exposed 
> to the application
> via the API is "the number of bytes available in an SCTP packet for chunks." 
> I think this is the best
> we can do at that interface...
AFAICT, that's 1) in my list, e.g., the largest chunk that SCTP can send
without having SCTP coordinate frag/reassembly. That doesn't itself
indicate whether it's SCTP doing the rest or the network layer.

Joe



Re: [Taps] MTU / equivalent at the transport layer

2016-12-09 Thread Michael Tuexen
> On 9 Dec 2016, at 21:23, Joe Touch  wrote:
> 
> 
> 
> On 12/9/2016 12:14 PM, Michael Tuexen wrote:
>>> On 7 Dec 2016, at 15:54, Michael Welzl  wrote:
>>> 
>>> Hi all,
>>> 
>>> I have a problem with one particular primitive, or lack of it, in UDP, 
>>> UDP-Lite and SCTP. It's something I just don't get.
>>> 
>>> Consider this text from draft-fairhurst-taps-transports-usage-udp:
>>> 
>>> "GET_INTERFACE_MTU:  The GET_INTERFACE_MTU function is a network-layer
>>> function that indicates the largest unfragmented IP packet that
>>> may be sent."
>>> 
>>> Indeed, this is a network-layer function. It's about the interface, not 
>>> about UDP. Does that mean that, to decide how many bytes fit in the payload 
>>> of a packet, the programmer needs to know if it's IPv4 or IPv6, with or 
>>> without options, and do the calculation?
>>> If so, isn't it extremely odd that UDP doesn't offer a primitive that 
>>> provides a more useful number: the available space in its payload, in bytes?
>>> 
>>> I also have the same question for SCTP.  For TCP, it's obvious that the 
>>> application shouldn't bother, but not for UDP or SCTP.
>> In the API description in https://tools.ietf.org/html/rfc6458 the MTU 
>> exposed to the application
>> via the API is "the number of bytes available in an SCTP packet for chunks." 
>> I think this is the best
>> we can do at that interface...
> AFAICT, that's 1) in my list, e.g., the largest chunk that SCTP can send
> without having SCTP coordinate frag/reassembly. That doesn't itself
You need to take the DATA or I-DATA chunk header into account...
It is "the number of bytes in the packet for chunks", not "the number
of bytes for the chunk value for DATA (or I-DATA) chunks".
> indicate whether it's SCTP doing the rest or the network layer.
As I said, that is what is exposed to the upper layer via the API.
SCTP itself has procedures to detect the PMTU (of each path).
It does PMTU discovery by interacting with the IP layer...

Best regards
Michael
> 
> Joe
> 



Re: [Taps] MTU / equivalent at the transport layer

2016-12-09 Thread Joe Touch


On 12/9/2016 12:28 PM, Michael Tuexen wrote:
>> ...
>>> In the API description in https://tools.ietf.org/html/rfc6458 the MTU 
>>> exposed to the application
>>> via the API is "the number of bytes available in an SCTP packet for 
>>> chunks." I think this is the best
>>> we can do at that interface...
>> AFAICT, that's 1) in my list, e.g., the largest chunk that SCTP can send
>> without having SCTP coordinate frag/reassembly. That doesn't itself
> You need to take the DATA or I-DATA chunk header into account...
> It is "the number of bytes in the packet for chunks", not "the number
> of bytes for the chunk value for DATA (or I-DATA) chunks".
Yes, but (see below).

>
>> indicate whether it's SCTP doing the rest or the network layer.
> As I said, that is what is exposed to the upper layer via the API.
> SCTP itself has procedures to detect the PMTU (of each path).
> It does PMTU discovery by interacting with the IP layer...

Yes, but it's possible that the chunksize the app sees from SCTP is
related to SCTP's reassembly limit, not IP's and not the PMTU.

At least that's how I read SCTP_MAXSEG.

Joe



Re: [Taps] MTU / equivalent at the transport layer

2016-12-09 Thread Michael Tuexen
> On 9 Dec 2016, at 22:12, Joe Touch  wrote:
> 
> 
> 
> On 12/9/2016 12:28 PM, Michael Tuexen wrote:
>>> ...
 In the API description in https://tools.ietf.org/html/rfc6458 the MTU 
 exposed to the application
 via the API is "the number of bytes available in an SCTP packet for 
 chunks." I think this is the best
 we can do at that interface...
>>> AFAICT, that's 1) in my list, e.g., the largest chunk that SCTP can send
>>> without having SCTP coordinate frag/reassembly. That doesn't itself
>> You need to take the DATA or I-DATA chunk header into account...
>> It is "the number of bytes in the packet for chunks", not "the number
>> of bytes for the chunk value for DATA (or I-DATA) chunks".
> Yes, but (see below).
> 
>> 
>>> indicate whether it's SCTP doing the rest or the network layer.
>> As I said, that is what is exposed to the upper layer via the API.
>> SCTP itself has procedures to detect the PMTU (of each path).
>> It does PMTU discovery by interacting with the IP layer...
> 
> Yes, but it's possible that the chunksize the app sees from SCTP is
> related to SCTP's reassembly limit, not IP's and not the PMTU.
> 
> At least that's how I read SCTP_MAXSEG.
Not sure what the reassembly limit is... SCTP handles arbitrarily sized
user messages at the receiver side by using partial delivery.

The SCTP_MAXSEG allows a user to limit the size of DATA chunks without
reducing the pmtu. Please note that this only affects DATA chunks, not
the whole packet. So that is why I was not referring to this option.
The socket options I was referring to are:
* SCTP_GET_PEER_ADDR_INFO defined in 
https://tools.ietf.org/html/rfc6458#section-8.2.2
* SCTP_PEER_ADDR_PARAMS defined in 
https://tools.ietf.org/html/rfc6458#section-8.1.12

Best regards
Michael
> 
> Joe



Re: [Taps] MTU / equivalent at the transport layer

2016-12-09 Thread Joe Touch


On 12/9/2016 1:26 PM, Michael Tuexen wrote:
>
> Not sure what the reassembly limit is... SCTP handles arbitrarily sized
> user messages at the receiver side by using partial delivery.
>
> The SCTP_MAXSEG allows a user to limit the size of DATA chunks without
> reducing the pmtu.
Yes, but that size can actually be larger than the PMTU, not just smaller.

>  Please note that this only affects DATA chunks, not
> the whole packet.
I was referring to the size the layer above SCTP deals with, not the
layer below.

Joe



Re: [Taps] MTU / equivalent at the transport layer

2016-12-09 Thread Michael Tuexen

> On 9 Dec 2016, at 22:30, Joe Touch  wrote:
> 
> 
> 
> On 12/9/2016 1:26 PM, Michael Tuexen wrote:
>> 
>> Not sure what the reassembly limit is... SCTP handles arbitrarily sized
>> user messages at the receiver side by using partial delivery.
>> 
>> The SCTP_MAXSEG allows a user to limit the size of DATA chunks without
>> reducing the pmtu.
> Yes, but that size can actually be larger than the PMTU, not just smaller.
Hmm. https://tools.ietf.org/html/rfc6458#section-8.1.16 states:

   Note that the
   underlying SCTP implementation may fragment into smaller sized chunks
   when the PMTU of the underlying association is smaller than the value
   set by the user.

So this means the user cannot rely on this option to turn off SCTP
fragmentation and have SCTP pass oversized packets down the stack so
that IP does the fragmentation.

That is why I said the user can use this option to ask the SCTP
layer to use a smaller value than the one deduced from the PMTU.
That is something you can do safely.

Best regards
Michael
> 
>> Please note that this only affects DATA chunks, not
>> the whole packet.
> I was referring to the size the layer above SCTP deals with, not the
> layer below.

> 
> Joe
> 
> ___
> Taps mailing list
> Taps@ietf.org
> https://www.ietf.org/mailman/listinfo/taps



Re: [Taps] MTU / equivalent at the transport layer

2016-12-09 Thread Joe Touch


On 12/9/2016 1:38 PM, Michael Tuexen wrote:
>> On 9 Dec 2016, at 22:30, Joe Touch  wrote:
>>
>>
>>
>> On 12/9/2016 1:26 PM, Michael Tuexen wrote:
>>> Not sure what the reassembly limit is... SCTP handles arbitrarily sized
>>> user messages at the receiver side by using partial delivery.
>>>
>>> The SCTP_MAXSEG allows a user to limit the size of DATA chunks without
>>> reducing the pmtu.
>> Yes, but that size can actually be larger than the PMTU, not just smaller.
> Hmm. https://tools.ietf.org/html/rfc6458#section-8.1.16 states:
>
>Note that the
>underlying SCTP implementation may fragment into smaller sized chunks
>when the PMTU of the underlying association is smaller than the value
>set by the user.
>
> So this means the user can not rely on this option to turn off SCTP
> fragmentation and let SCTP pass IP-packets down the stack to let
> the IP do the fragmentation.
>
> That is why I said, the user can use this option to ask the SCTP
> layer to use a smaller value than the one deduced from the PMTU.
> That is something you can do safely.

It seems like this setting is independent of PMTU. It could be larger
than PMTU, in which case SCTP *or* IP could do fragmentation (and I
don't see whether there's a way to know or force that decision).

Yes, you can set it smaller too.

Joe



Re: [Taps] MTU / equivalent at the transport layer

2016-12-12 Thread Michael Welzl
Hi,

Just trying to understand, so we're not talking past each other. Please note 
that I'm not trying to argue in any direction with my comments below, just 
asking for clarification:


> On 09 Dec 2016, at 18:32, Joe Touch  wrote:
> 
> 
> 
> On 12/9/2016 8:12 AM, Michael Welzl wrote:
>>> On 09 Dec 2016, at 16:18, Joe Touch  wrote:
>>> 
>>> 
>>> 
>>> On 12/9/2016 12:09 AM, Michael Welzl wrote:
> On 07 Dec 2016, at 20:29, Joe Touch  wrote:
> 
> FYI, there are two different "largest messages" at the transport layer:
> 
> 1) the size of the message that CAN be delivered at all
 True... I wasn't thinking of that, but yes.
 
 
> 2) the size of the message that can be delivered without network-layer
> fragmentation (there are no guarantees about link-layer - see ATM or the
> recent discussion on tunnel MTUs on INTAREA)
> 
> MTU generally refers to the *link payload*. At that point, transports
> have to account for network headers, network options, transport headers,
> and transport options too. See RFC6691.
> 
> MSS refers to the transport message size AFAICT. It is *sometimes* tuned
> to MTU-headers but not always.
> 
> E.g., for IPv6, link MTU is required to be at least 1280, but the
> src-dst transit MTU is required to be at least 1500. So a transport that
> wants to match sizes and reduce fragmentation issues would pick
> 1280-IPh-IPo-TCPh-TCPo, but a transport is supposed to be able to trust
> that 1500-IPh-IPo-TCPh-TCPo can still get through at least some of the 
> time.
 So I'm getting the impression that the answer to my question really is 
 that, to figure out 2)  (which I was concerned with), an application 
 programmer needs to do the calculation her/himself.
>>> To figure out 2), the transport layer needs to know the unfragmented
>>> link MTU, the size of all of the network headers (including options),
>>> and the sizes of its own headers and options.
>>> 
>>> It's also sometimes assumed that the transport can control the "DF" bit
>>> (for IPv4).
>> Yes - but that hardly sounds worse to me than requiring the application 
>> programmer to do this protocol-specific calculation by hand...
> 
> The app programmer needs to know what the transport can support, the
> transport needs to know what net supports, etc.
> 
> Pushing the link MTU up the line and expecting all the other layers to
> figure out what to do results in unnecessary complexity, never mind
> undermining one of the key features of layering.

Either we just agree here, or you're saying that your 2) above should not be 
exposed? Or something else?


>>> However, this all breaks down if the app makes the wrong choice because
>>> the net can (will, and should) source fragment if it gets a message that
>>> turns out  to be too big for one fragment anyway.
>>> 
 Not a big deal - and maybe some systems offer a function to give you the 
 size of a message that won't be fragmented.
>>> Remember that - at best - you're optimizing for the next layer down
>>> only. You can't know whether that net layer message is link fragmented
>>> (e.g., as in ATM) or tunnel fragmented (as needs to be required or this
>>> whole MTU concept breaks down).
>> Sure - but that's something end systems just can't see. It's information up 
>> to and including the IP layer that should be correctly handed over up the 
>> stack, inside the host, with all the caveats this information comes with.
> Why does that apply at the link layer but not other layers? If transport
> can transfer and reassemble 1MB messages, then that's the "MTU" it needs
> to tell the app layer. The same is true for net to tell transport, etc.
> 
> We've conflated the two between transport and net unnecessarily.

So this sounds like you're saying that your item 2) above should not be exposed 
by the transport layer to the application.


 However: this calculation is transport protocol dependent, which we really 
 don't want to have in TAPS.
>>> If you want to fix this, you need to change the API to the net layer to
>>> provide immediate feedback. When transport hands a segment to network,
>>> it has to get a "call failed" if the message is too big - and we really
>>> do need transport layers to be able to pick between "too big for
>>> non-fragmented net layer" and "too big for the net layer even with frag".
>>> 
>>> Merely handing info to the transport layer might not be enough, esp.
>>> when net layer option lengths change.
>> True if you want to cleanly fix it across the RFC-specified stack, but 
>> that's beyond the concern of TAPS - it becomes a requirement from the TAPS 
>> WG. Does that make sense?
> 
> Then this is part of the API requirements that TAPS should be
> indicating, no?

So what does that mean: that the API should contain a "don't fragment" flag 
from the application?


Cheers,
Michael


Re: [Taps] MTU / equivalent at the transport layer

2016-12-12 Thread Michael Welzl

> On 09 Dec 2016, at 23:13, Joe Touch  wrote:
> 
> 
> 
> On 12/9/2016 1:38 PM, Michael Tuexen wrote:
>>> On 9 Dec 2016, at 22:30, Joe Touch  wrote:
>>> 
>>> 
>>> 
>>> On 12/9/2016 1:26 PM, Michael Tuexen wrote:
 Not sure what the reassembly limit is... SCTP handles arbitrarily sized
 user messages at the receiver side by using partial delivery.
 
 The SCTP_MAXSEG allows a user to limit the size of DATA chunks without
 reducing the pmtu.
>>> Yes, but that size can actually be larger than the PMTU, not just smaller.
>> Hmm. https://tools.ietf.org/html/rfc6458#section-8.1.16 states:
>> 
>>   Note that the
>>   underlying SCTP implementation may fragment into smaller sized chunks
>>   when the PMTU of the underlying association is smaller than the value
>>   set by the user.
>> 
>> So this means the user can not rely on this option to turn off SCTP
>> fragmentation and let SCTP pass IP-packets down the stack to let
>> the IP do the fragmentation.
>> 
>> That is why I said, the user can use this option to ask the SCTP
>> layer to use a smaller value than the one deduced from the PMTU.
>> That is something you can do safely.
> 
> It seems like this setting is independent of PMTU. It could be larger
> than PMTU, in which case SCTP *or* IP could do fragmentation (and I
> don't see whether there's a way to know or force that decision).

Why do you say that?
IIUC, you can get "spinfo_mtu" ( 
https://tools.ietf.org/html/rfc6458#section-8.2.2 ), and then set the maximum 
fragmentation size ( https://tools.ietf.org/html/rfc6458#section-8.1.16 ) to be 
equal to or smaller than that value, in which case you can safely assume no 
fragmentation inside the same host.

Cheers,
Michael



Re: [Taps] MTU / equivalent at the transport layer

2016-12-12 Thread Gorry Fairhurst

On 12/12/2016 09:31, Michael Welzl wrote:

Hi,

Just trying to understand, so we're not talking past each other. Please note 
that I'm not trying to argue in any direction with my comments below, just 
asking for clarification:



On 09 Dec 2016, at 18:32, Joe Touch  wrote:



On 12/9/2016 8:12 AM, Michael Welzl wrote:

On 09 Dec 2016, at 16:18, Joe Touch  wrote:



On 12/9/2016 12:09 AM, Michael Welzl wrote:

On 07 Dec 2016, at 20:29, Joe Touch  wrote:

FYI, there are two different "largest messages" at the transport layer:

1) the size of the message that CAN be delivered at all

True... I wasn't thinking of that, but yes.



2) the size of the message that can be delivered without network-layer
fragmentation (there are no guarantees about link-layer - see ATM or the
recent discussion on tunnel MTUs on INTAREA)

MTU generally refers to the *link payload*. At that point, transports
have to account for network headers, network options, transport headers,
and transport options too. See RFC6691.

MSS refers to the transport message size AFAICT. It is *sometimes* tuned
to MTU-headers but not always.

E.g., for IPv6, link MTU is required to be at least 1280, but the
src-dst transit MTU is required to be at least 1500. So a transport that
wants to match sizes and reduce fragmentation issues would pick
1280-IPh-IPo-TCPh-TCPo, but a transport is supposed to be able to trust
that 1500-IPh-IPo-TCPh-TCPo can still get through at least some of the time.

So I'm getting the impression that the answer to my question really is that, to 
figure out 2)  (which I was concerned with), an application programmer needs to 
do the calculation her/himself.

To figure out 2), the transport layer needs to know the unfragmented
link MTU, the size of all of the network headers (including options),
and the sizes of its own headers and options.

It's also sometimes assumed that the transport can control the "DF" bit
(for IPv4).

Yes - but that hardly sounds worse to me than requiring the application 
programmer to do this protocol-specific calculation by hand...


The app programmer needs to know what the transport can support, the
transport needs to know what net supports, etc.

Pushing the link MTU up the line and expecting all the other layers to
figure out what to do results in unnecessary complexity, never mind
undermining one of the key features of layering.


Either we just agree here, or you're saying that your 2) above should not be 
exposed? Or something else?



However, this all breaks down if the app makes the wrong choice because
the net can (will, and should) source fragment if it gets a message that
turns out  to be too big for one fragment anyway.


Not a big deal - and maybe some systems offer a function to give you the size 
of a message that won't be fragmented.

Remember that - at best - you're optimizing for the next layer down
only. You can't know whether that net layer message is link fragmented
(e.g., as in ATM) or tunnel fragmented (as needs to be required or this
whole MTU concept breaks down).

Sure - but that's something end systems just can't see. It's information up to 
and including the IP layer that should be correctly handed over up the stack, 
inside the host, with all the caveats this information comes with.

Why does that apply at the link layer but not other layers? If transport
can transfer and reassemble 1MB messages, then that's the "MTU" it needs
to tell the app layer. The same is true for net to tell transport, etc.

We've conflated the two between transport and net unnecessarily.


So this sounds like you're saying that your item 2) above should not be exposed 
by the transport layer to the application.



However: this calculation is transport protocol dependent, which we really 
don't want to have in TAPS.

If you want to fix this, you need to change the API to the net layer to
provide immediate feedback. When transport hands a segment to network,
it has to get a "call failed" if the message is too big - and we really
do need transport layers to be able to pick between "too big for
non-fragmented net layer" and "too big for the net layer even with frag".

Merely handing info to the transport layer might not be enough, esp.
when net layer option lengths change.

True if you want to cleanly fix it across the RFC-specified stack, but that's 
beyond the concern of TAPS - it becomes a requirement from the TAPS WG. Does 
that make sense?


Then this is part of the API requirements that TAPS should be
indicating, no?


So what does that mean: that the API should contain a "don't fragment" flag 
from the application?


Definitely.

The use of DF in a datagram protocol is per-datagram decision - 
depending on what the app needs to happen.


gorry



Cheers,
Michael


Re: [Taps] MTU / equivalent at the transport layer

2016-12-12 Thread Joe Touch


On 12/12/2016 1:31 AM, Michael Welzl wrote:
> Hi,
>
> Just trying to understand, so we're not talking past each other. Please note 
> that I'm not trying to argue in any direction with my comments below, just 
> asking for clarification:
Sure...
>
>> On 09 Dec 2016, at 18:32, Joe Touch  wrote:
>>
>>
>>
>> On 12/9/2016 8:12 AM, Michael Welzl wrote:
 On 09 Dec 2016, at 16:18, Joe Touch  wrote:



 On 12/9/2016 12:09 AM, Michael Welzl wrote:
>> On 07 Dec 2016, at 20:29, Joe Touch  wrote:
>>
>> FYI, there are two different "largest messages" at the transport layer:
>>
>> 1) the size of the message that CAN be delivered at all
> True... I wasn't thinking of that, but yes.
>
>
>> 2) the size of the message that can be delivered without network-layer
>> fragmentation (there are no guarantees about link-layer - see ATM or the
>> recent discussion on tunnel MTUs on INTAREA)
>>
>> MTU generally refers to the *link payload*. At that point, transports
>> have to account for network headers, network options, transport headers,
>> and transport options too. See RFC6691.
>>
>> MSS refers to the transport message size AFAICT. It is *sometimes* tuned
>> to MTU-headers but not always.
>>
>> E.g., for IPv6, link MTU is required to be at least 1280, but the
>> src-dst transit MTU is required to be at least 1500. So a transport that
>> wants to match sizes and reduce fragmentation issues would pick
>> 1280-IPh-IPo-TCPh-TCPo, but a transport is supposed to be able to trust
>> that 1500-IPh-IPo-TCPh-TCPo can still get through at least some of the 
>> time.
> So I'm getting the impression that the answer to my question really is 
> that, to figure out 2)  (which I was concerned with), an application 
> programmer needs to do the calculation her/himself.
 To figure out 2), the transport layer needs to know the unfragmented
 link MTU, the size of all of the network headers (including options),
 and the sizes of its own headers and options.

 It's also sometimes assumed that the transport can control the "DF" bit
 (for IPv4).
>>> Yes - but that hardly sounds worse to me than requiring the application 
>>> programmer to do this protocol-specific calculation by hand...
>> The app programmer needs to know what the transport can support, the
>> transport needs to know what net supports, etc.
>>
>> Pushing the link MTU up the line and expecting all the other layers to
>> figure out what to do results in unnecessary complexity, never mind
>> undermining one of the key features of layering.
> Either we just agree here, or you're saying that your 2) above should not be 
> exposed? Or something else?

I'm saying that exposing 2) is a bad idea because it requires extra
information that can vary at other layers.


 However, this all breaks down if the app makes the wrong choice because
 the net can (will, and should) source fragment if it gets a message that
 turns out  to be too big for one fragment anyway.

> Not a big deal - and maybe some systems offer a function to give you the 
> size of a message that won't be fragmented.
 Remember that - at best - you're optimizing for the next layer down
 only. You can't know whether that net layer message is link fragmented
 (e.g., as in ATM) or tunnel fragmented (as needs to be required or this
 whole MTU concept breaks down).
>>> Sure - but that's something end systems just can't see. It's information up 
>>> to and including the IP layer that should be correctly handed over up the 
>>> stack, inside the host, with all the caveats this information comes with.
>> Why does that apply at the link layer but not other layers? If transport
>> can transfer and reassemble 1MB messages, then that's the "MTU" it needs
>> to tell the app layer. The same is true for net to tell transport, etc.
>>
>> We've conflated the two between transport and net unnecessarily.
> So this sounds like you're saying that your item 2) above should not be 
> exposed by the transport layer to the application.
Right - because it's irrelevant to the app. The app needs to know the
"unit of transfer" of the next layer down. If transport frags and
reassembles it to the network layer, then the network layer unit of
transmission is not relevant to the app.


> However: this calculation is transport protocol dependent, which we 
> really don't want to have in TAPS.
 If you want to fix this, you need to change the API to the net layer to
 provide immediate feedback. When transport hands a segment to network,
 it has to get a "call failed" if the message is too big - and we really
 do need transport layers to be able to pick between "too big for
 non-fragmented net layer" and "too big for the net layer even with frag".

 Merely handing info to the transport layer might not be enough, esp.
 when net layer option lengths change.

Re: [Taps] MTU / equivalent at the transport layer

2016-12-12 Thread Joe Touch


On 12/12/2016 3:45 AM, Gorry Fairhurst wrote:
>>
>> So what does that mean: that the API should contain a "don't
>> fragment" flag from the application?
>>
> Definitely.
>
> The use of DF in a datagram protocol is per-datagram decision -
> depending on what the app needs to happen.
>
> gorry

IMO, the app should never need to play with DF. It needs to know what it
thinks the transport can deliver - which might include transport
frag/reassembly and network frag/reassembly.

Joe




Re: [Taps] MTU / equivalent at the transport layer

2016-12-12 Thread Gorry Fairhurst

On 12/12/2016 18:50, Joe Touch wrote:



On 12/12/2016 3:45 AM, Gorry Fairhurst wrote:


So what does that mean: that the API should contain a "don't
fragment" flag from the application?


Definitely.

The use of DF in a datagram protocol is per-datagram decision -
depending on what the app needs to happen.

gorry


IMO, the app should never need to play with DF. It needs to know what it
thinks the transport can deliver - which might include transport
frag/reassembly and network frag/reassembly.

How does the App handle probes for path MTU then in UDP?

Gorry


Joe






Re: [Taps] MTU / equivalent at the transport layer

2016-12-12 Thread Joe Touch


On 12/12/2016 10:58 AM, Gorry Fairhurst wrote:
>>
>> IMO, the app should never need to play with DF. It needs to know what it
>> thinks the transport can deliver - which might include transport
>> frag/reassembly and network frag/reassembly.
> How does the App handle probes for path MTU then in UDP?
>
> Gorry 
I think there needs to be two parts to the API:

- largest transmission size
- native transmission desired (true/false)

If the app says "YES" to native transmission size, then that would
suggest that UDP would do *nothing* and pass that same kind of flag down
to IP, where IP would not only set DF=1, but also not source fragment.

I.e., I don't think it's the app's job to know how to explicitly control
a mechanism two layers down, and DF isn't really what you want anyway.
DF isn't the same as "don't source fragment".

Joe


Re: [Taps] MTU / equivalent at the transport layer

2016-12-12 Thread Gorry (erg)
This is fine - it looks a lot like what I pointed to in the DCCP spec. But 
specifically,  I agree you don't need the DF flag visible - if you have a way 
to convey the info needed to set the flag at the transport (and anything else 
appropriate -as you note). I am all in favour of such appropriate abstraction.

Gorry

> On 12 Dec 2016, at 19:09, Joe Touch  wrote:
> 
>> On 12/12/2016 10:58 AM, Gorry Fairhurst wrote:
>>> 
>>> IMO, the app should never need to play with DF. It needs to know what it 
>>> thinks the transport can deliver - which might include transport 
>>> frag/reassembly and network frag/reassembly. 
>> How does the App handle probes for path MTU then in UDP? 
>> 
>> Gorry
> I think there needs to be two parts to the API:
> 
> - largest transmission size
> - native transmission desired (true/false)
> 
> If the app says "YES" to native transmission size, then that would suggest 
> that UDP would do *nothing* and pass that same kind of flag down to IP, where 
> IP would not only set DF=1, but also not source fragment.
> 
> I.e., I don't think it's the app's job to know how to explicitly control a 
> mechanism two layers down, and DF isn't really what you want anyway. DF isn't 
> the same as "don't source fragment".
> 
> Joe
> ___
> Taps mailing list
> Taps@ietf.org
> https://www.ietf.org/mailman/listinfo/taps



Re: [Taps] MTU / equivalent at the transport layer

2016-12-13 Thread Michael Welzl
Hi,

This direction definitely makes sense to me, too. I see some tension here, 
though - on the one hand, Joe is (as usual) arguing "cleanliness", i.e. keep 
layering right. On the other hand, applications tend to want to know a message 
size that doesn't get fragmented along an IPv4 path (as identified by the 
authors of draft-trammell-post-sockets and 
draft-mcquistin-taps-low-latency-services).
Raising the abstraction level is fine, but I think Joe's suggestion below 
misses something.

In an earlier email, Joe wrote about these two sizes:

***
1) the size of the message that CAN be delivered at all

2) the size of the message that can be delivered without network-layer
fragmentation
***
and stated that 2) should not be exposed.

So, in the proposal below, "largest transmission size" is 1) from above, and 
sending it would fail if it's bigger than 2) above AND "native transmission 
desired" is set to TRUE. So this is how the application would then do its own 
form of PMTUD.

Given that we don't know which protocol we're running over, probing strategies 
that involve common MTU sizes (like using the table in section 7.1 of RFC1191) 
can't work. So it's not the world's most efficient PMTUD that applications will 
be using, to eventually find the value of 2).
A protocol like SCTP is even going to do PMTUD on its own, so it could provide 
a number for 2), which would have less overhead than requiring applications to 
do their own PMTUD.  =>  If we have to "go dirty" anyway, which we already do 
by exposing the binary "native transmission desired", why not offer the value 
of 2) as well?
In other words: how is this boolean better than offering 2) ?

Cheers,
Michael



> On 12 Dec 2016, at 21:53, Gorry (erg)  wrote:
> 
> This is fine - it looks a lot like what I pointed to in the DCCP spec. But 
> specifically,  I agree you don't need the DF flag visible - if you have a way 
> to convey the info needed to set the flag at the transport (and anything else 
> appropriate -as you note). I am all in favour of such appropriate abstraction.
> 
> Gorry
> 
>> On 12 Dec 2016, at 19:09, Joe Touch  wrote:
>> 
>>> On 12/12/2016 10:58 AM, Gorry Fairhurst wrote:
 
 IMO, the app should never need to play with DF. It needs to know what it 
 thinks the transport can deliver - which might include transport 
 frag/reassembly and network frag/reassembly. 
>>> How does the App handle probes for path MTU then in UDP? 
>>> 
>>> Gorry
>> I think there needs to be two parts to the API:
>> 
>> - largest transmission size
>> - native transmission desired (true/false)
>> 
>> If the app says "YES" to native transmission size, then that would suggest 
>> that UDP would do *nothing* and pass that same kind of flag down to IP, 
>> where IP would not only set DF=1, but also not source fragment.
>> 
>> I.e., I don't think it's the app's job to know how to explicitly control a 
>> mechanism two layers down, and DF isn't really what you want anyway. DF 
>> isn't the same as "don't source fragment".
>> 
>> Joe
>> ___
>> Taps mailing list
>> Taps@ietf.org
>> https://www.ietf.org/mailman/listinfo/taps
> 



Re: [Taps] MTU / equivalent at the transport layer

2016-12-13 Thread Gorry Fairhurst

On 13/12/2016 09:13, Michael Welzl wrote:

Hi,

This direction definitely makes sense to me, too. I see some tension here, though - on 
the one hand, Joe is (as usual) arguing "cleanliness", i.e. keep layering 
right. On the other hand, applications tend to want to know a message size that doesn't 
get fragmented along an IPv4 path (as identified by the authors of 
draft-trammell-post-sockets and draft-mcquistin-taps-low-latency-services).
Raising the abstraction level is fine, but I think Joe's suggestion below 
misses something.

In an earlier email, Joe wrote about these two sizes:

***
1) the size of the message that CAN be delivered at all

2) the size of the message that can be delivered without network-layer
fragmentation
***
and stated that 2) should not be exposed.

So, in the proposal below, "largest transmission size" is 1) from above, and sending it 
would fail if it's bigger than 2) above AND "native transmission desired" is set to TRUE. 
So this is how the application would then do its own form of PMTUD.

Given that we don't know which protocol we're running over, probing strategies 
that involve common MTU sizes (like using the table in section 7.1 of RFC1191) 
can't work. So it's not the world's most efficient PMTUD that applications will 
be using, to eventually find the value of 2).
A protocol like SCTP is even going to do PMTUD on its own, so it could provide a number for 2), which 
would have less overhead than requiring applications to do their own PMTUD.  =>   If we have to 
"go dirty" anyway, which we already do by exposing the binary "native transmission 
desired", why not offer the value of 2) as well?
In other words: how is this boolean better than offering 2) ?

Cheers,
Michael




On 12 Dec 2016, at 21:53, Gorry (erg)  wrote:

This is fine - it looks a lot like what I pointed to in the DCCP spec. But 
specifically,  I agree you don't need the DF flag visible - if you have a way 
to convey the info needed to set the flag at the transport (and anything else 
appropriate -as you note). I am all in favour of such appropriate abstraction.

Gorry


On 12 Dec 2016, at 19:09, Joe Touch  wrote:


On 12/12/2016 10:58 AM, Gorry Fairhurst wrote:

IMO, the app should never need to play with DF. It needs to know what it
thinks the transport can deliver - which might include transport
frag/reassembly and network frag/reassembly.

How does the App handle probes for path MTU then in UDP?

Gorry

I think there needs to be two parts to the API:

- largest transmission size
- native transmission desired (true/false)

If the app says "YES" to native transmission size, then that would suggest that 
UDP would do *nothing* and pass that same kind of flag down to IP, where IP would not 
only set DF=1, but also not source fragment.

I.e., I don't think it's the app's job to know how to explicitly control a mechanism two 
layers down, and DF isn't really what you want anyway. DF isn't the same as "don't 
source fragment".

Joe
___
Taps mailing list
Taps@ietf.org
https://www.ietf.org/mailman/listinfo/taps

So I'd like to return to RFCs that have been through part of this 
discussion before,


(1) I think we need a parameter returned to the App that is equivalent 
to Maximum Packet Size, MPS, in DCCP (RFC4340). It is useful to know how 
many bytes the app can send with reasonable chance of unfragmented delivery.
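As a worked illustration of what such an MPS would report: it is just the path (or interface) MTU minus the headers the stack will add. A sketch assuming base IPv4/IPv6 header sizes and UDP's 8-byte header only; a real stack must also account for IPv4 options or IPv6 extension headers:

```python
def udp_mps(pmtu: int, ipv6: bool = False, ip_options: int = 0) -> int:
    """Largest UDP payload fitting in one unfragmented IP packet.

    Illustrative arithmetic: 20 B base IPv4 header or 40 B IPv6 header,
    plus the 8 B UDP header.  ip_options covers IPv4 options or IPv6
    extension headers, if any.
    """
    ip_header = 40 if ipv6 else 20
    return pmtu - ip_header - ip_options - 8
```

For a 1500-byte Ethernet MTU this gives 1472 bytes over IPv4, and 1232 bytes for the IPv6 minimum MTU of 1280 - exactly the calculation that GET_INTERFACE_MTU currently pushes onto the programmer.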


(2) It's helpful for Apps to be able to retrieve the upper size allowed 
with potential fragmentation - that could be useful in determining probe 
sizes for an application. Apps should know the hard limit. In DCCP this 
is called the current congestion control maximum packet size (CCMPS), 
the largest permitted by the stack using the current congestion control 
method. That's bound to be less than or equal to what is permitted for 
the local Interface MTU. This limit lets the App also take into 
consideration other size constraints in the stack below the API.


(3) Apps need to be allowed to fragment datagrams more than MPS - This 
is not expected as the default, the stack needs to be told.


(4) Apps need to be allowed to not allow datagram fragmentation - The 
stack needs to be told. You could do this by using the DF semantics 
(i.e., don't source fragment a DF-marked packet). Thinking more, this 
seems the easiest.


Sorry, if this goes over what I said before, but I think we should first 
explore the approaches that have already been put forward in RFCs 
(albeit these were not RFCs about UDP).


Gorry




Re: [Taps] MTU / equivalent at the transport layer

2016-12-13 Thread Michael Welzl

> On 13 Dec 2016, at 11:05, Gorry Fairhurst  wrote:
> 
> On 13/12/2016 09:13, Michael Welzl wrote:
>> Hi,
>> 
>> This direction definitely makes sense to me, too. I see some tension here, 
>> though - on the one hand, Joe is (as usual) arguing "cleanliness", i.e. keep 
>> layering right. On the other hand, applications tend to want to know a 
>> message size that doesn't get fragmented along an IPv4 path (as identified 
>> by the authors of draft-trammell-post-sockets and 
>> draft-mcquistin-taps-low-latency-services).
>> Raising the abstraction level is fine, but I think Joe's suggestion below 
>> misses something.
>> 
>> In an earlier email, Joe wrote about these two sizes:
>> 
>> ***
>> 1) the size of the message that CAN be delivered at all
>> 
>> 2) the size of the message that can be delivered without network-layer
>> fragmentation
>> ***
>> and stated that 2) should not be exposed.
>> 
>> So, in the proposal below, "largest transmission size" is 1) from above, and 
>> sending it would fail if it's bigger than 2) above AND "native transmission 
>> desired" is set to TRUE. So this is how the application would then do its 
>> own form of PMTUD.
>> 
>> Given that we don't know which protocol we're running over, probing 
>> strategies that involve common MTU sizes (like using the table in section 
>> 7.1 of RFC1191) can't work. So it's not the world's most efficient PMTUD 
>> that applications will be using, to eventually find the value of 2).
>> A protocol like SCTP is even going to do PMTUD on its own, so it could 
>> provide a number for 2), which would have less overhead than requiring 
>> applications to do their own PMTUD.  =>   If we have to "go dirty" anyway, 
>> which we already do by exposing the binary "native transmission desired", 
>> why not offer the value of 2) as well?
>> In other words: how is this boolean better than offering 2) ?
>> 
>> Cheers,
>> Michael
>> 
>> 
>> 
>>> On 12 Dec 2016, at 21:53, Gorry (erg)  wrote:
>>> 
>>> This is fine - it looks a lot like what I pointed to in the DCCP spec. But 
>>> specifically, I agree you don't need the DF flag visible - if you have a 
>>> way to convey the info needed to set the flag at the transport (and 
>>> anything else appropriate - as you note). I am all in favour of such 
>>> appropriate abstraction.
>>> 
>>> Gorry
>>> 
 On 12 Dec 2016, at 19:09, Joe Touch  wrote:
 
> On 12/12/2016 10:58 AM, Gorry Fairhurst wrote:
>> IMO, the app should never need to play with DF. It needs to know what it
>> thinks the transport can deliver - which might include transport
>> frag/reassembly and network frag/reassembly.
> How does the App handle probes for path MTU then in UDP?
> 
> Gorry
 I think there needs to be two parts to the API:
 
 - largest transmission size
 - native transmission desired (true/false)
 
 If the app says "YES" to native transmission size, then that would suggest 
 that UDP would do *nothing* and pass that same kind of flag down to IP, 
 where IP would not only set DF=1, but also not source fragment.
 
 I.e., I don't think it's the app's job to know how to explicitly control a 
 mechanism two layers down, and DF isn't really what you want anyway. DF 
 isn't the same as "don't source fragment".
 
 Joe
> So I'd like to return to RFCs that have been through part of this discussion 
> before,
> 
> (1) I think we need a parameter returned to the App that is equivalent to 
> Maximum Packet Size, MPS, in DCCP (RFC4340). It is useful to know how many 
> bytes the app can send with reasonable chance of unfragmented delivery.

I agree; that seems to be what I ended up proposing above.


> (2) It's helpful for Apps to be able to retrieve the upper size allowed with 
> potential fragmentation - that could be useful in determining probe sizes for 
> an application. Apps should know the hard limit. In DCCP this is called the 
> current congestion control maximum packet size (CCMPS), the largest permitted 
> by the stack using the current congestion control method. That's bound to be 
> less than or equal to what is permitted for the local Interface MTU. This 
> limit lets the App also take into consideration other size constraints in the 
> stack below the API.

Agreed; I think that was Joe's item 1) ("the size of the message that CAN be 
delivered at all").


> (3) Apps need to be allowed to fragment datagrams more than MPS - This is not 
> expected as the default, the stack needs to be told.
> 
> (4) Apps need to be allowed to not allow datagram fragmentation - The stack 
> needs to be told. You could do this by using the DF semantics (i.e., don't 
> source fragment a DF-marked pack

Re: [Taps] MTU / equivalent at the transport layer

2016-12-13 Thread Gorry Fairhurst

On 13/12/2016 12:53, Michael Welzl wrote:

On 13 Dec 2016, at 11:05, Gorry Fairhurst  wrote:

On 13/12/2016 09:13, Michael Welzl wrote:

Hi,

This direction definitely makes sense to me, too. I see some tension here, though - on 
the one hand, Joe is (as usual) arguing "cleanliness", i.e. keep layering 
right. On the other hand, applications tend to want to know a message size that doesn't 
get fragmented along an IPv4 path (as identified by the authors of 
draft-trammell-post-sockets and draft-mcquistin-taps-low-latency-services).
Raising the abstraction level is fine, but I think Joe's suggestion below 
misses something.

In an earlier email, Joe wrote about these two sizes:

***
1) the size of the message that CAN be delivered at all

2) the size of the message that can be delivered without network-layer
fragmentation
***
and stated that 2) should not be exposed.

So, in the proposal below, "largest transmission size" is 1) from above, and sending it 
would fail if it's bigger than 2) above AND "native transmission desired" is set to TRUE. 
So this is how the application would then do its own form of PMTUD.

Given that we don't know which protocol we're running over, probing strategies 
that involve common MTU sizes (like using the table in section 7.1 of RFC1191) 
can't work. So it's not the world's most efficient PMTUD that applications will 
be using, to eventually find the value of 2).
A protocol like SCTP is even going to do PMTUD on its own, so it could provide a number for 2), which 
would have less overhead than requiring applications to do their own PMTUD. => If we have to 
"go dirty" anyway, which we already do by exposing the binary "native transmission 
desired", why not offer the value of 2) as well?
In other words: how is this boolean better than offering 2) ?

Cheers,
Michael




On 12 Dec 2016, at 21:53, Gorry (erg)   wrote:

This is fine - it looks a lot like what I pointed to in the DCCP spec. But 
specifically, I agree you don't need the DF flag visible - if you have a way 
to convey the info needed to set the flag at the transport (and anything else 
appropriate - as you note). I am all in favour of such appropriate abstraction.

Gorry


On 12 Dec 2016, at 19:09, Joe Touch   wrote:


On 12/12/2016 10:58 AM, Gorry Fairhurst wrote:

IMO, the app should never need to play with DF. It needs to know what it
thinks the transport can deliver - which might include transport
frag/reassembly and network frag/reassembly.

How does the App handle probes for path MTU then in UDP?

Gorry

I think there needs to be two parts to the API:

- largest transmission size
- native transmission desired (true/false)

If the app says "YES" to native transmission size, then that would suggest that 
UDP would do *nothing* and pass that same kind of flag down to IP, where IP would not 
only set DF=1, but also not source fragment.

I.e., I don't think it's the app's job to know how to explicitly control a mechanism two 
layers down, and DF isn't really what you want anyway. DF isn't the same as "don't 
source fragment".

Joe

So I'd like to return to RFCs that have been through part of this discussion 
before,

(1) I think we need a parameter returned to the App that is equivalent to 
Maximum Packet Size, MPS, in DCCP (RFC4340). It is useful to know how many 
bytes the app can send with reasonable chance of unfragmented delivery.

I agree; that seems to be what I ended up proposing above.



(2) It's helpful for Apps to be able to retrieve the upper size allowed with 
potential fragmentation - that could be useful in determining probe sizes for 
an application. Apps should know the hard limit. In DCCP this is called the 
current congestion control maximum packet size (CCMPS), the largest permitted 
by the stack using the current congestion control method. That's bound to be 
less than or equal to what is permitted for the local Interface MTU. This limit 
lets the App also take into consideration other size constraints in the stack 
below the API.

Agreed; I think that was Joe's item 1) ("the size of the message that CAN be 
delivered at all").



(3) Apps need to be allowed to fragment datagrams more than MPS - This is not 
expected as the default, the stack needs to be told.

(4) Apps need to be allowed to not allow datagram fragmentation - The stack 
needs to be told. You could do this by using the DF semantics (i.e., don't 
source fragment a DF-marked packet). Thinking more, this seems the easiest.

These two are hard to parse,

Sorry - trying hard.

making me wonder if they mean what was intended. E.g. for (3): applications are always 
allowed to fragment their data as they wish, right?  Did you mean to say "Apps need 
to be allowed to allow to fragment datagrams m

Re: [Taps] MTU / equivalent at the transport layer

2016-12-13 Thread Joe Touch


On 12/13/2016 5:34 AM, Gorry Fairhurst wrote:
> ...
>>>
>>> (1) I think we need a parameter returned to the App that is
>>> equivalent to Maximum Packet Size, MPS, in DCCP (RFC4340). It is
>>> useful to know how many bytes the app can send with reasonable
>>> chance of unfragmented delivery.

All we can know is whether it is unfragmented at the next layer down.

>>
>>> (2) It's helpful for Apps to be able to retrieve the upper size
>>> allowed with potential fragmentation - that could be useful in
>>> determining probe sizes for an application. Apps should know the
>>> hard limit. In DCCP this is called the current congestion control
>>> maximum packet size (CCMPS), the largest permitted by the stack
>>> using the current congestion control method. That's bound to be less
>>> than or equal to what is permitted for the local Interface MTU. This
>>> limit lets the App also take into consideration other size
>>> constraints in the stack below the API.
>>
Again, next layer down only. We're generally talking about existing
transports that try to pass the link MTU up through network and
transport transparently, but that need not be the case. Keep in mind
that the link MTU for ATM AAL5 isn't 48B, it's 9K - i.e., it is the
message that the link will deliver intact, not the native link size.
...
> (3) Apps need to be able to ask the stack to try hard to send
> datagrams larger than the current MPS - 

I disagree.

The app should see two values from transport:
   
A) how big a message can you deliver at all?
B) how big a message can you deliver "natively"?

Any probing happens between those two values.
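The probing between those two values can be sketched as a simple binary search. Here send_probe is a hypothetical stand-in for a transport send with "native transmission desired" set to TRUE, which fails when the message exceeds the (unknown) path MTU; nothing below is a real socket API:

```python
# Sketch of app-level probing between B (native size, assumed deliverable)
# and A (max size).  send_probe is a hypothetical callback, not a real API.
def discover_native_size(native_floor, max_size, send_probe):
    """Binary-search the largest size for which send_probe succeeds."""
    lo, hi = native_floor, max_size
    best = native_floor  # B is assumed deliverable without fragmentation
    while lo <= hi:
        mid = (lo + hi) // 2
        if send_probe(mid):
            best, lo = mid, mid + 1
        else:
            hi = mid - 1
    return best

# Simulated path whose PMTU (unknown to the app) is 1460 bytes:
found = discover_native_size(1280, 9000, lambda size: size <= 1460)
```

With the simulated path above, the search converges on the path's limit after a handful of probes.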

> This is not expected as the default, the stack needs to be told to
> enable this use of source/router fragmentation and send IPv4 datagrams
> with DF=0 (For some IPv4 paths, the PMTU, and hence MPS can be very
> small).

I disagree.

DF=0 is a network flag that should never be exposed to the app. Even if
it is, this wouldn't be the control the app really wants. The app would
want to prevent source fragmentation. DF=0 applies to IPv4 only and only
affects *on path* fragmentation.

But that's not the transport's job.

Consider this:
- transport has max and native message sizes
- network has max and native message sizes
- link has max and native message sizes

Every layer has these. Sometimes they're the same (when a layer doesn't
support frag/reassembly, e.g., UDP, Ethernet), sometimes they're not
(IP, TCP). Sometimes they're unlimited (TCP has no max message size, AFAICT).

> (4) Apps need to be able to ask the stack to send datagrams larger
> than the current MPS, but NOT if this results in source fragmentation. 
Apps can control only transport fragmentation. They can't and shouldn't
see or control network or link fragmentation UNLESS transport lets that
pass through as a transport behavior.

> Such packets need to be sent with DF=1.  - This is not expected as the
> default, the stack needs to be told to enable this -  for UDP it would
> be needed to perform PMTUD. That's I think what has been just called
> "native transmission desired ".

The issue is that this is an interaction between the app and transport.
It has nothing to do with the network or link layers  - unless the
transport wants it to. It's up to the transport to decide whether to try
to "pass through" the native network size. It's up to the network layer
to decide whether to "pass through" the native link size.

E.g., for IPv6, the lowest values to the answers above are:
A) 1500B, including IP header and options
B) 1280B, including IP header and options

That necessarily means that IPv6 over IPv6 cannot truthfully answer (B)
- it HAS to require the lower IPv6 to fragment and reassemble. Otherwise
it would be reporting a value that would make it no longer compliant
with RFC2460.

So we need to be careful about this - there really aren't 4 values here.
There are only two - the max and "native" *as reported by* the next
layer down.
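The IPv6-over-IPv6 case above follows from simple arithmetic, assuming a 40-byte outer header and no extension headers:

```python
# Illustrative arithmetic for the IPv6-over-IPv6 point; 40 B is the
# base IPv6 header, extension headers are ignored.
IPV6_MIN_MTU = 1280   # RFC 2460: every IPv6 link must carry 1280 B
IPV6_HEADER = 40

# Over a minimum-MTU lower IPv6 path, the tunnel can natively carry only:
inner_native = IPV6_MIN_MTU - IPV6_HEADER  # 1240 bytes

# ...which is below the 1280 B the inner layer must itself deliver, so
# the lower IPv6 layer has to fragment and reassemble.
must_fragment = inner_native < IPV6_MIN_MTU
```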

There never has been and never can be a way that an app can solely
manage PMTUD to match network *unless* the transport passes that
information through. There never has been and never can be a way an app
can match to a link layer native MTU (otherwise, we'd be spinning MTUs
down to 48B for ATM).

Joe




Re: [Taps] MTU / equivalent at the transport layer

2016-12-14 Thread Gorry Fairhurst


It seems we can not find a common basis here. See below:

On 14/12/2016 01:23, Joe Touch wrote:



On 12/13/2016 5:34 AM, Gorry Fairhurst wrote:

...


(1) I think we need a parameter returned to the App that is
equivalent to Maximum Packet Size, MPS, in DCCP (RFC4340). It is
useful to know how many bytes the app can send with reasonable
chance of unfragmented delivery.


All we can know is whether it is unfragmented at the next layer down.

I disagree. The stack can tell the App an MPS value based on PMTUD (when 
it implements this, or from its understanding of headers). That's already 
specified for SCTP & DCCP. Sure, the path may change, but at least the 
App can access a recent result.





(2) It's helpful for Apps to be able to retrieve the upper size
allowed with potential fragmentation - that could be useful in
determining probe sizes for an application. Apps should know the
hard limit. In DCCP this is called the current congestion control
maximum packet size (CCMPS), the largest permitted by the stack
using the current congestion control method. That's bound to be less
than or equal to what is permitted for the local Interface MTU. This
limit lets the App also take into consideration other size
constraints in the stack below the API.



Again, next layer down only. We're generally talking about existing
transports that try to pass the link MTU up through network and
transport transparently, but that need not be the case. Keep in mind
that the link MTU for ATM AAL5 isn't 48B, it's 9K - i.e., it is the
message that the link will deliver intact, not the native link size.
...

>
In this case, I think you are wrong, sorry. Apps can be told the largest 
message they can send over a transport. And some transports do in fact 
limit this.


I don't see the relevance of the ATM example. Datagram protocols work at 
the transport layer.



(3) Apps need to be able to ask the stack to try hard to send
datagrams larger than the current MPS -


I disagree.


We don't agree. Apps can send probe messages.


The app should see two values from transport:

A) how big a message can you deliver at all?

- Wasn't that the thing I originally cited as CCMPS?

B) how big a message can you deliver "natively"?

- Wasn't that MPS?


Any probing happens between those two values.


That's true!


This is not expected as the default, the stack needs to be told to
enable this use of source/router fragmentation and send IPv4 datagrams
with DF=0 (For some IPv4 paths, the PMTU, and hence MPS can be very
small).


I disagree.

DF=0 is a network flag that should never be exposed to the app. Even if
it is, this wouldn't be the control the app really wants. The app would
want to prevent source fragmentation. DF=0 applies to IPv4 only and only
affects *on path* fragmentation.

But that's not the transport's job.

I disagree. If MPS < datagram <= CCMPS, the stack needs to know whether 
to source fragment (3), or for IPv4 whether to allow network 
fragmentation (4). Potentially you could discard if neither (3) nor (4) 
is allowed.



Consider this:
- transport has max and native message sizes
- network has max and native message sizes
- link has max and native message sizes

Every layer has these. Sometimes they're the same (when a layer doesn't
support frag/reassembly, e.g., UDP,

>
UDP does support fragmentation.


 Ethernet)

... Which is link layer.


, sometimes they're not
(IP, TCP). Sometimes they're unlimited (TCP has no max message size, AFAICT).

TCP isn't datagram either - and is stream-based - so segments are not 
necessarily packets.



(4) Apps need to be able to ask the stack to send datagrams larger
than the current MPS, but NOT if this results in source fragmentation.

Apps can control only transport fragmentation. They can't and shouldn't
see or control network or link fragmentation UNLESS transport lets that
pass through as a transport behavior.

If we were talking about TCP-like protocols that would be fine, but I'm 
talking about datagram protocols where the PDU being sent is a datagram.



Such packets need to be sent with DF=1.  - This is not expected as the
default, the stack needs to be told to enable this -  for UDP it would
be needed to perform PMTUD. That's I think what has been just called
"native transmission desired ".


The issue is that this is an interaction between the app and transport.
It has nothing to do with the network or link layers  - unless the
transport wants it to. It's up to the transport to decide whether to try
to "pass through" the native network size. It's up to the network layer
to decide whether to "pass through" the native link size.

E.g., for IPv6, the lowest values to the answers above are:
A) 1500B, including IP header and options
B) 1280B, including IP header and options

That necessarily means that IPv6 over IPv6 cannot truthfully answer (B)
- it HAS to require the lower IPv6 to fragment and reassemble. Otherwise
it would be reporting a value that would make it 

Re: [Taps] MTU / equivalent at the transport layer

2016-12-14 Thread Joe Touch
Hi, Gorry,

Let me see if I can explain my viewpoint.

I'll start by noting that there's a difference between "what transports
currently do" and what they "should" do. I agree with you that current
transports do have access to network MTUs and DF control, but don't
always pass all that to the user.

Here's what I think is currently required:

1122 indicates the interface between IP and transport as indicating the
max send and receive transport sizes - but for IPv6 that would be 1500 -
IP headers, not 1280 - IP headers, or even 1500 or 1280.

1122 also indicates that the transport - not the application - gets to
set DF in the SEND call to IP.

However, there does not appear to be any provision to limit source
fragmentation between transport and IP, nor any requirement for UDP to
pass that control to the app.



So 1122 is consistent with my view that:

- the application sees a transport MSS (I'm not sure whether to call
that max or native yet...)

- the transport sees the max network MSS (i.e., allowing source
fragmentation), not the native network MSS

I don't understand why 1122 requires transport control over DF, but UDP
is not required to pass that control to the user.

---

My conclusion is that the app-transport API *should* be required to give
the user control over whether to use the max or native transport MSS,
just as the transport-API should be required to give the transport
control over whether to use the max or native network MSS, and so forth
for the network-link API.

However, none of that implies that the user ever will be able to match
the link MTU. There's always the possibility that one of these layers
decides to never report a true native size, e.g., ATM (as a link layer)
never reports 48 to IP.

This means that apps can't ever really force a probe of much of
anything, too. Only transports can.
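That per-layer pair of sizes, and each layer's choice about passing its native size through, can be sketched as a toy model. The header sizes and the pass_native policy flag are illustrative assumptions, not any real stack interface:

```python
# Toy model: each layer reports a (max, native) pair upward, after
# subtracting its own header, and may decline to pass its true native
# size through (as ATM AAL5 does, reporting ~9K rather than 48 B cells).
from typing import NamedTuple

class Sizes(NamedTuple):
    max: int     # largest message deliverable at all
    native: int  # largest message deliverable without fragmentation

def layer_report(below: Sizes, header: int, pass_native: bool) -> Sizes:
    """What a layer tells the layer above it."""
    max_up = below.max - header
    native_up = (below.native - header) if pass_native else max_up
    return Sizes(max_up, native_up)

# ATM AAL5 link reports the 9180 B it delivers intact, never 48 B:
link = Sizes(max=9180, native=9180)
ip = layer_report(link, header=20, pass_native=True)   # base IPv4 header
udp = layer_report(ip, header=8, pass_native=True)     # UDP header
```

The app then sees only what UDP reports; whether that reflects the link's true native size depends entirely on each intermediate layer's pass_native choice.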

(FWIW, when I say that UDP doesn't support fragmentation, I mean that
UDP doesn't have UDP fragmentation. The fragmentation is at the IP
layer; UDP is ignorant of it).

Joe



Re: [Taps] MTU / equivalent at the transport layer

2016-12-14 Thread Joe Touch
One piece of potentially important information:

PMTUD and PLMTUD are intended to avoid the need for on-path
fragmentation. That's why there is control only over the DF bit, not
source fragmentation.

AFAICT, this points to a problem with IPv6 PMTUD as described
in RFC1981. That doc claims to try to optimize to the native link MTU,
but that doesn't appear to be possible given the way IPv4 and IPv6
interact with transports. There's no signal to IP that says "don't
source fragment".

Joe


On 12/14/2016 11:12 AM, Joe Touch wrote:
> Hi, Gorry,
>
> Let me see if I can explain my viewpoint.
>
> I'll start by noting that there's a difference between "what transports
> currently do" and what they "should" do. I agree with you that current
> transports do have access to network MTUs and DF control, but don't
> always pass all that to the user.
>
> Here's what I think is currently required:
>
> 1122 indicates the interface between IP and transport as indicating the
> max send and receive transport sizes - but for IPv6 that would be 1500 -
> IP headers, not 1280 - IP headers, or even 1500 or 1280.
>
> 1122 also indicates that the transport - not the application - gets to
> set DF in the SEND call to IP.
>
> However, there does not appear to be any provision to limit source
> fragmentation between transport and IP, nor any requirement for UDP to
> pass that control to the app.
>
> 
>
> So 1122 is consistent with my view that:
>
> - the application sees a transport MSS (I'm not sure whether to call
> that max or native yet...)
>
> - the transport sees the max network MSS (i.e., allowing source
> fragmentation), not the native network MSS
>
> I don't understand why 1122 requires transport control over DF, but UDP
> is not required to pass that control to the user.
>
> ---
>
> My conclusion is that the app-transport API *should* be required to give
> the user control over whether to use the max or native transport MSS,
> just as the transport-API should be required to give the transport
> control over whether to use the max or native network MSS, and so forth
> for the network-link API.
>
> However, none of that implies that the user ever will be able to match
> the link MTU. There's always the possibility that one of these layers
> decides to never report a true native size, e.g., ATM (as a link layer)
> never reports 48 to IP.
>
> This means that apps can't ever really force a probe of much of
> anything, too. Only transports can.
>
> (FWIW, when I say that UDP doesn't support fragmentation, I mean that
> UDP doesn't have UDP fragmentation. The fragmentation is at the IP
> layer; UDP is ignorant of it).
>
> Joe
