Re: [ovs-discuss] OVN - MTU path discovery

2018-11-01 Thread Ben Pfaff
On Tue, Oct 30, 2018 at 04:07:42PM +0530, Numan Siddique wrote:
> During the discussion, Ben proposed a new action instead, which would check
> the MTU of the outport for the given port, something like
> check_mtu(router_port_xyzzy). This action would raise an exception if the
> packet size is greater than the MTU of the outport. Please see [1] for the
> chat logs.
> 
> I explored this new action, check_mtu, a bit, and I am not sure we can
> solve the problem reported here with it. When the chassis hosting the
> gateway router port receives the packet from the compute chassis tunnel
> port, it runs the router pipeline, does the NAT, and sends the packet to
> the ingress pipeline of the provider network logical switch (with the
> localnet port). From the localnet patch port the packet enters the provider
> bridge (br-ex) and is pushed out of the physical interface. If the MTU of
> this physical interface is smaller than the packet size, OVS drops the
> packet.
> 
> The issue with the check_mtu action is that OVN doesn't program the
> provider bridges and is unaware of the physical interface connected to
> them. So I am not sure we can use this approach to solve this problem,
> although there may be a use case for the check_mtu action.
> 
> Ben, do you have any comments on this? Did I misunderstand what you were
> trying to say? Please correct me if my understanding is wrong.

I think you're right.
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVN - MTU path discovery

2018-10-30 Thread Numan Siddique
Hi All,

In last week's OVN meeting we discussed this issue. I want to
summarize the discussion here and ask a few questions.

Based on the suggestions I got from Ben a few weeks earlier, I took the
approach of checking the packet length using a new OVS action,
chk_pkt_len_gt(length), which is an unattractive name (my apologies).

The idea is to use this action something like
"actions=chk_pkt_len_gt(1500)->NXM_NX_REG0[0], resubmit(..)"
The new action chk_pkt_len_gt(1500) will set the reg0[0] bit to 1 if the
packet length is greater than 1500, 0 otherwise.

And later in the pipeline have a flow like

match=reg0=0x1/0x1, ... actions=icmp{}
/* To send the ICMP type 3 (Destination Unreachable), code 4
   back to the sender, as per RFC 1191 */
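
A minimal sketch of the semantics described above, assuming the action only sets or clears bit 0 of reg0 and a later flow matches on that bit. This is a hypothetical Python model for illustration, not OVS code; the function names and the verdict strings are made up.

```python
# Hypothetical model of the proposed action and the follow-up match.
# Nothing here is an OVS API; it only illustrates the intended behaviour.

def chk_pkt_len_gt(packet_len: int, limit: int, reg0: int) -> int:
    """Return reg0 with bit 0 set if packet_len > limit, cleared otherwise."""
    if packet_len > limit:
        return reg0 | 0x1
    return reg0 & ~0x1


def matches_reg0_bit0(reg0: int) -> bool:
    """Model of a masked match on bit 0 of reg0 (value 0x1, mask 0x1)."""
    return (reg0 & 0x1) == 0x1


def pipeline(packet_len: int, limit: int = 1500) -> str:
    """Run the length check, then pick the verdict the later flow would take."""
    reg0 = chk_pkt_len_gt(packet_len, limit, reg0=0)
    if matches_reg0_bit0(reg0):
        return "icmp_frag_needed"  # stands in for actions=icmp{...}
    return "forward"
```

With the default limit, pipeline(1501) takes the ICMP branch and pipeline(1500) is forwarded, since the check is strictly "greater than".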

During the discussion, Ben proposed a new action instead, which would check
the MTU of the outport for the given port, something like
check_mtu(router_port_xyzzy). This action would raise an exception if the
packet size is greater than the MTU of the outport. Please see [1] for the
chat logs.

I explored this new action, check_mtu, a bit, and I am not sure we can
solve the problem reported here with it. When the chassis hosting the
gateway router port receives the packet from the compute chassis tunnel
port, it runs the router pipeline, does the NAT, and sends the packet to
the ingress pipeline of the provider network logical switch (with the
localnet port). From the localnet patch port the packet enters the provider
bridge (br-ex) and is pushed out of the physical interface. If the MTU of
this physical interface is smaller than the packet size, OVS drops the
packet.

The issue with the check_mtu action is that OVN doesn't program the
provider bridges and is unaware of the physical interface connected to
them. So I am not sure we can use this approach to solve this problem,
although there may be a use case for the check_mtu action.

Ben, do you have any comments on this? Did I misunderstand what you were
trying to say? Please correct me if my understanding is wrong.

To solve this MTU issue with OVN, I will go ahead with the action
chk_pkt_len_gt. Please let me know if there are any concerns here. If the
approach seems fine, please suggest a better name for the action.

There is also another thing missing. When OVN needs to send the ICMP
type 3, code 4 packet, RFC 1191 says:

***
the router MUST include the MTU of that next-hop network in the
low-order 16 bits of the ICMP header field that is labelled "unused"
in the ICMP specification
***

I don't think we presently have any OVS action to set this MTU value.
Either we need to support setting this field with a new OVS action, or
frame the complete ICMP packet and set the MTU value in ovn-controller
itself (in which case we cannot use the existing OVN action "icmp{..}").
Is it reasonable to add this new action in OVS? Or should ovn-controller
take care of it with a new OVN action? Any thoughts on this, please?
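
To make the missing field concrete, here is a minimal sketch of how the ICMP message could be framed, assuming the RFC 792 layout (type, code, checksum, 32 bits of "unused") with RFC 1191 placing the next-hop MTU in the low-order 16 bits of that word. The function names are hypothetical and this is not ovn-controller code; the enclosing IP header is left out.

```python
import struct


def inet_checksum(data: bytes) -> int:
    """16-bit one's-complement Internet checksum over data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_frag_needed(next_hop_mtu: int, original_datagram: bytes,
                      ihl_bytes: int = 20) -> bytes:
    """Build the ICMP part of a type 3, code 4 message.

    The payload is the original IP header plus the first 8 bytes of its data.
    ihl_bytes is the original header length (assumed 20, i.e. no IP options).
    """
    payload = original_datagram[: ihl_bytes + 8]
    # type=3, code=4, checksum placeholder, high 16 bits unused, low 16 = MTU
    header = struct.pack("!BBHHH", 3, 4, 0, 0, next_hop_mtu)
    csum = inet_checksum(header + payload)
    header = struct.pack("!BBHHH", 3, 4, csum, 0, next_hop_mtu)
    return header + payload
```

A receiver can validate the result by recomputing the checksum over the whole ICMP message, which should come out as zero.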


Thanks

Numan


[1] - https://botbot.me/freenode/openvswitch/2018-10-25/?tz=Asia/Kolkata

(Looks like botbot.me will be shut down soon -
https://lincolnloop.com/blog/saying-goodbye-botbotme/)

 So copied to a temp pastebin as well - http://paste.openstack.org/show/733616/






Re: [ovs-discuss] OVN - MTU path discovery

2018-09-24 Thread Daniel Alvarez Sanchez
Resending this email as I can't see it in [0] for some reason.
[0] https://mail.openvswitch.org/pipermail/ovs-dev/2018-September/




On Fri, Sep 21, 2018 at 2:36 PM Daniel Alvarez Sanchez 
wrote:

> Hi folks,
>
> After talking to Numan and reading log from IRC meeting yesterday,
> looks like there's some confusion around the issue.
>
> jpettit | I should look at the initial bug report again, but is it not
> sufficient to configure a smaller MTU within the VM?
>
> Imagine the case where some host from the external network (MTU 1500)
> sends 1000B UDP packets to the VM (MTU 200). When OVN attempts to deliver
> the packet to the VM it won't fit and the application running there will
> never
> get the packet.
>
> With reference implementation (or if namespaces were used as Han suggests
> that this is what NSX does), the packet would be handled by the IP stack on
> the gateway node. An ICMP need-to-frag would be sent back to the sender
> and - if they're not blocked by some firewall - the IP stack on the sender
> node
> will fragment this and subsequent packets to fit the MTU on the receiver.
>
> Also, generally we don't want to configure small MTUs on the VMs for
> performance as it would also impact on east/west traffic where
> Jumbo frames appear to work.
>
> Thanks a lot for bringing this up on the meeting!
> Daniel

Re: [ovs-discuss] OVN - MTU path discovery

2018-08-13 Thread Miguel Angel Ajo Pelayo
Yeah, later on we found that it was, again, more important than we
thought.

For example, there are still cases not covered by TCP MSS negotiation (or
for UDP/other protocols):

Imagine you have two clouds, both with an internal MTU (let’s imagine
MTUb on cloud B, and MTUa on cloud A), and an external transit
network with a 1500 MTU (MTUc).

MTUa > MTUc, and MTUb > MTUc.

Also, imagine that VMa in cloud A, has a floating IP (DNAT_SNAT NAT),
and VMb in cloud B has also a floating IP.

VMa tries to establish a connection to VMb's FIP and announces
MSSa = MTUa - (IP + TCP overhead); VMb ACKs the TCP SYN request
with MSSb = MTUb - (IP + TCP overhead).

So the agreement will be min(MSSa, MSSb), but… the transit network's MSSc
will always be smaller: MSSc < min(MSSa, MSSb).
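
The arithmetic above can be sketched as follows, assuming IPv4 and TCP headers without options (20 bytes each). The example MTU values other than the 1500-byte transit network are made up for illustration.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def mss_for_mtu(mtu: int) -> int:
    """MSS an endpoint would announce for a given interface MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def negotiated_mss(mtu_a: int, mtu_b: int) -> int:
    """Each side announces its own MSS; the connection uses the smaller one."""
    return min(mss_for_mtu(mtu_a), mss_for_mtu(mtu_b))


def exceeds_transit(mtu_a: int, mtu_b: int, mtu_c: int) -> bool:
    """True if full-sized segments will not fit the transit network."""
    return negotiated_mss(mtu_a, mtu_b) > mss_for_mtu(mtu_c)


# Hypothetical clouds A and B with jumbo-ish internal MTUs, 1500 in transit.
print(negotiated_mss(9000, 8950))         # MSS agreed by the two VMs
print(mss_for_mtu(1500))                  # what the transit network can carry
print(exceeds_transit(9000, 8950, 1500))  # True: MSS negotiation is not enough
```

The endpoints never learn about MTUc during the handshake, which is why something on the path has to signal it.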

In ML2/OVS deployments, those big packets get fragmented at the router
edge, and an ICMP notification is sent to the sender of the packets to
tell it that fragmenting at the source is necessary.
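
As a rough illustration of what fragmenting at the router edge involves, here is a sketch of the IPv4 fragmentation arithmetic, assuming a 20-byte IP header without options. The function name is hypothetical; it only computes fragment boundaries and does not build packets.

```python
def fragment_payload(payload_len: int, mtu: int, ihl: int = 20):
    """Split an IP payload of payload_len bytes for a link with the given MTU.

    Returns a list of (offset_bytes, length, more_fragments) tuples.
    Every fragment except the last carries a multiple of 8 bytes, because
    the IPv4 fragment offset field counts in 8-byte units.
    """
    max_data = (mtu - ihl) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    offset = 0
    while offset < payload_len:
        length = min(max_data, payload_len - offset)
        more = offset + length < payload_len
        fragments.append((offset, length, more))
        offset += length
    return fragments


# An 8950-byte IP datagram (8930 bytes of payload) crossing a 1500 MTU link.
for frag in fragment_payload(8930, 1500):
    print(frag)
```

Each fragment carries its own IP header, so a jumbo packet turns into several full-size packets on the transit network, which is part of why source fragmentation driven by the ICMP message is preferable.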


I guess we can also replicate this with 2 VMs on the same cloud with MSSa >
MSSb, where they try to talk to each other via floating IP.


So, going back to the thing, I guess we need to implement some OpenFlow
extension to match packets by size, redirecting those to a slow path
(ovn-controller) so we can fragment and/or send an ICMP back to the source
for source fragmentation?

Any advice on what the procedure is here (OpenFlow land, kernel-wise, even
in terms of our source code and design, so we could implement this)?


Best regards,
Miguel Ángel.




Re: [ovs-discuss] OVN - MTU path discovery

2018-08-03 Thread Daniel Alvarez Sanchez
Maybe ICMP is not that critical but seems like not having the ICMP 'need to
frag' on UDP communications could break some applications that are aware of
this to reduce the size of the packets? I wonder...

Thanks!
Daniel



Re: [ovs-discuss] OVN - MTU path discovery

2018-08-03 Thread Miguel Angel Ajo Pelayo
We didn’t understand why an MTU mismatch worked in one direction (N/S)
but not in the other (S/N)… and we found that it’s actually working
(at least for TCP, via MSS negotiation); we had a misconfiguration
in one of the physical interfaces.

So, in the case of TCP we are fine. TCP is smart enough to negotiate
properly.

Other protocols, like ICMP with the DF flag, or UDP, would not get the ICMP
that notifies the sender about the MTU mismatch.

I suspect that the most common cases are covered, and that it’s not worth
pursuing what I was asking for, at least not with high priority, but I’d
like to hear opinions.


Best regards,
Miguel Ángel.



Re: [ovs-discuss] OVN - MTU path discovery

2018-08-02 Thread Miguel Angel Ajo Pelayo
I’m going to capture some example traffic and try to figure out which RFCs
talk about that behaviour so we can come up with a consistent solution.
I can document it in the project.

To be honest, when I looked at it, I was expecting that the router would
fragment, and I ended up discovering that we had this path MTU discovery
mechanism in play for IPv4.



Re: [ovs-discuss] OVN - MTU path discovery

2018-08-02 Thread Ben Pfaff
On Thu, Aug 02, 2018 at 01:19:57PM -0700, Ben Pfaff wrote:
> On Wed, Aug 01, 2018 at 10:46:07AM -0400, Miguel Angel Ajo Pelayo wrote:
> > Hi Ben, ICMP is used as a signal from the router to tell the sender
> > “next hop has a lower mtu, please send smaller packets”, we would
> > need at least something in OVS to slow-path the “bigger than X” packets,
> > at that point ovn-controller could take care of constructing the ICMP packet
> > and sending it to the source.
> 
> Yes.
> 
> > But I guess, that we still need the kernel changes to match on
> > those “big packets”.
> 
> Maybe.  If we only need to worry about ICMP, though, we can set up OVN
> so that it always slow-paths ICMP.

Oh, I think maybe I was just being slow.  The ICMP is generated, not
processed.  Never mind.


Re: [ovs-discuss] OVN - MTU path discovery

2018-08-02 Thread Ben Pfaff
On Wed, Aug 01, 2018 at 10:46:07AM -0400, Miguel Angel Ajo Pelayo wrote:
> Hi Ben, ICMP is used as a signal from the router to tell the sender
> “next hop has a lower mtu, please send smaller packets”, we would
> need at least something in OVS to slow-path the “bigger than X” packets,
> at that point ovn-controller could take care of constructing the ICMP packet
> and sending it to the source.

Yes.

> But I guess, that we still need the kernel changes to match on
> those “big packets”.

Maybe.  If we only need to worry about ICMP, though, we can set up OVN
so that it always slow-paths ICMP.

> On 27 July 2018 at 23:35:58, Ben Pfaff (b...@ovn.org) wrote:
> 
> On Thu, Jul 12, 2018 at 04:03:33PM +0200, Miguel Angel Ajo Pelayo wrote:
> > Is there any way to match packet_size > X on a flow?
> >
> > How could we implement this?
> 
> OVS doesn't currently have a way to do that. Adding such a feature
> would require kernel changes.
> 
> You mentioned ICMP at one point. It would be pretty easy to implement
> such a feature specifically for ICMP to logical router IP addresses in
> OVN, because we could just slow-path such traffic to ovn-controller
> (maybe we already do?) and check the packet size there. I don't know
> whether there's value in that.


Re: [ovs-discuss] OVN - MTU path discovery

2018-08-01 Thread Miguel Angel Ajo Pelayo
Hi Ben, ICMP is used as a signal from the router to tell the sender
“next hop has a lower MTU, please send smaller packets”. We would
need at least something in OVS to slow-path the “bigger than X” packets;
at that point ovn-controller could take care of constructing the ICMP packet
and sending it to the source.

But I guess that we still need the kernel changes to match on
those “big packets”.
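
A minimal sketch of the decision such a slow path would have to make for each packet, assuming plain IPv4 router behaviour as in RFC 791 and RFC 1191. The names are hypothetical and this is not ovn-controller code.

```python
from enum import Enum


class Verdict(Enum):
    FORWARD = "forward unchanged"
    FRAGMENT = "fragment at the router"
    ICMP_FRAG_NEEDED = "drop and send ICMP type 3, code 4"


def egress_verdict(ip_total_len: int, next_hop_mtu: int,
                   dont_fragment: bool) -> Verdict:
    """Decide what a router does with a packet heading to the next hop."""
    if ip_total_len <= next_hop_mtu:
        return Verdict.FORWARD
    if dont_fragment:
        # The sender asked for no fragmentation, so it must be told the MTU.
        return Verdict.ICMP_FRAG_NEEDED
    return Verdict.FRAGMENT
```

Only the first comparison needs datapath support for matching on size; the two oversized branches are the ones that could be handled in the slow path.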


On 27 July 2018 at 23:35:58, Ben Pfaff (b...@ovn.org) wrote:

On Thu, Jul 12, 2018 at 04:03:33PM +0200, Miguel Angel Ajo Pelayo wrote:
> Is there any way to match packet_size > X on a flow?
>
> How could we implement this?

OVS doesn't currently have a way to do that. Adding such a feature
would require kernel changes.

You mentioned ICMP at one point. It would be pretty easy to implement
such a feature specifically for ICMP to logical router IP addresses in
OVN, because we could just slow-path such traffic to ovn-controller
(maybe we already do?) and check the packet size there. I don't know
whether there's value in that.


Re: [ovs-discuss] OVN - MTU path discovery

2018-07-27 Thread Ben Pfaff
On Thu, Jul 12, 2018 at 04:03:33PM +0200, Miguel Angel Ajo Pelayo wrote:
> Is there any way to match packet_size > X on a flow?
> 
> How could we implement this?

OVS doesn't currently have a way to do that.  Adding such a feature
would require kernel changes.

You mentioned ICMP at one point.  It would be pretty easy to implement
such a feature specifically for ICMP to logical router IP addresses in
OVN, because we could just slow-path such traffic to ovn-controller
(maybe we already do?) and check the packet size there.  I don't know
whether there's value in that.


Re: [ovs-discuss] OVN - MTU path discovery

2018-07-27 Thread Han Zhou
On Fri, Jul 27, 2018 at 12:49 AM, Miguel Angel Ajo Pelayo <
majop...@redhat.com> wrote:

>
>
>
> On 24 July 2018 at 17:43:51, Han Zhou (zhou...@gmail.com) wrote:
>
>
>
> On Tue, Jul 24, 2018 at 8:26 AM, Miguel Angel Ajo Pelayo <
> majop...@redhat.com> wrote:
>
>>
>>
>>
>> On 24 July 2018 at 17:20:59, Han Zhou (zhou...@gmail.com) wrote:
>>
>>
>>
>> On Thu, Jul 12, 2018 at 7:03 AM, Miguel Angel Ajo Pelayo <
>> majop...@redhat.com> wrote:
>>
>>> I believe we need to emit ICMP / need to frag messages to have proper
>>> support
>>> on different MTUs (on router sides), I wonder how does it work the other
>>> way
>>> around (when external net is 1500, and internal net is 1500-geneve
>>> overhead).
>>>
>>
>> I think this is expected since GW chassis forwards packets without going
>> through IP stack.
>> One solution might be using a network namespace on the GW node as an
>> intermediate hop, so that IP stack on the GW will handle the fragmentation
>> (or reply ICMP when DF is set). Of course this will have some latency
>> added, and also increase complexity of the deployment, so I'd rather tune
>> the MTU properly to avoid the problem. But if east-west performance is more
>> important and HV <-> HV jumbo frame is supported, then probably it worth
>> the namespace trick just to make external work regardless of internal MTU
>> settings. Does this make sense?
>>
>>
>> I believe we should avoid that path at all costs, it’s the way the
>> neutron reference implementation was built and it’s slower. Also it has a
>> lot of complexity.
>>
>>
>> Sometimes the MTU will just be mismatched: the internal network/LS has a
>> bigger MTU to increase performance, but the external network is on the
>> standard 1500. In some cases this could be circumvented by having a
>> leg of the external router with a big MTU just for OVN, but… if we look at
>> how people use OpenStack, for example, that would probably render most
>> deployments incompatible with OVN.
>>
>>
>> For example, customers tend to have several provider networks + external
>> networks, like legacy networks, different providers, etc.
>>
>>
>>
>>
>>> Is there any way to match packet_size > X on a flow?
>>>
>>> How could we implement this?
>>>
>> I didn't find anything for matching packet_size in ovs-fields.7. Even if we
>> could do this in OVN (e.g. through a controller action in the slow path), I
>> wonder whether it is really better than relying on the IP stack. Maybe blp
>> or someone else could shed some light on this :)
>>
>> I think that would be undesirable also.
>>
>>
>> I wonder how it works now when external network is generally on 1500 MTU,
>> while Geneve has a lower mtu.
>>
> Do you mean for example: VM has MTU: 1400, while external network and eth0
> (tunnel physical interface) of HVs and GWs are all 1500 MTU? Why would
> there be a problem in this case? Or did I misunderstand?
>
>
> In that case some handling is also necessary at some point. Imagine you
> have established a TCP connection through a floating IP (DNAT). When the
> packets traverse the router from the external network to the internal
> network, if the router is not handling MTU, a 1500-byte packet will be
> transmitted over the 1400 network, and either Geneve is
> fragmenting/defragmenting (very bad for performance) or, if the packet went
> through a VLAN, it would be dropped on arriving at the final hypervisor.
>
>
> Am I right, or am I missing something? I need to actually try it and look
> at the traffic/packets.
>
In my example above all physical interfaces are with MTU 1500, only the
VM's internal MTU setting is 1400. In this case I don't think there is any
IP fragmentation or dropping happening, because the MSS of the TCP
connection should be adjusted by the hand-shake to fit for the MTU 1400 (or
smaller if the remote endpoint has MTU < 1400).

Or maybe you are talking about some different settings?


Re: [ovs-discuss] OVN - MTU path discovery

2018-07-27 Thread Miguel Angel Ajo Pelayo
On 24 July 2018 at 17:43:51, Han Zhou (zhou...@gmail.com) wrote:



On Tue, Jul 24, 2018 at 8:26 AM, Miguel Angel Ajo Pelayo <
majop...@redhat.com> wrote:

>
>
>
> On 24 July 2018 at 17:20:59, Han Zhou (zhou...@gmail.com) wrote:
>
>
>
> On Thu, Jul 12, 2018 at 7:03 AM, Miguel Angel Ajo Pelayo <
> majop...@redhat.com> wrote:
>
>> I believe we need to emit ICMP / need to frag messages to have proper
>> support
>> on different MTUs (on router sides), I wonder how does it work the other
>> way
>> around (when external net is 1500, and internal net is 1500-geneve
>> overhead).
>>
>
> I think this is expected since GW chassis forwards packets without going
> through IP stack.
> One solution might be using a network namespace on the GW node as an
> intermediate hop, so that IP stack on the GW will handle the fragmentation
> (or reply ICMP when DF is set). Of course this will have some latency
> added, and also increase complexity of the deployment, so I'd rather tune
> the MTU properly to avoid the problem. But if east-west performance is more
> important and HV <-> HV jumbo frame is supported, then probably it worth
> the namespace trick just to make external work regardless of internal MTU
> settings. Does this make sense?
>
>
> I believe we should avoid that path at all costs, it’s the way the neutron
> reference implementation was built and it’s slower. Also it has a lot of
> complexity.
>
>
> Sometimes the MTU will just be mismatched: the internal network/LS has a
> bigger MTU to increase performance, but the external network is on the
> standard 1500. In some cases this could be circumvented by having a
> leg of the external router with a big MTU just for OVN, but… if we look at
> how people use OpenStack, for example, that would probably render most
> deployments incompatible with OVN.
>
>
> For example, customers tend to have several provider networks + external
> networks, like legacy networks, different providers, etc.
>
>
>
>
>> Is there any way to match packet_size > X on a flow?
>>
>> How could we implement this?
>>
> I didn't find anything for matching packet_size in ovs-fields.7. Even we
> could do this in OVN (e.g. through controller action in slowpath), I wonder
> is it really better than relying on IP stack. Maybe blp or someone else
> could shed a light on this :)
>
> I think that would be undesirable also.
>
>
> I wonder how it works now when external network is generally on 1500 MTU,
> while Geneve has a lower mtu.
>
Do you mean for example: VM has MTU: 1400, while external network and eth0
(tunnel physical interface) of HVs and GWs are all 1500 MTU? Why would
there be a problem in this case? Or did I misunderstand?


In that case some handling is also necessary at some point. Imagine you
have established a TCP connection through a floating IP (DNAT). When the
packets traverse the router from the external network to the internal
network, if the router is not handling MTU, a 1500-byte packet will be
transmitted over the 1400 network, and either Geneve fragments and
reassembles it (very bad for performance) or, if the packet went through
VLAN, it is dropped on arrival at the final hypervisor.
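
A rough sketch of the size arithmetic behind this, under stated
assumptions: an IPv4 underlay and the 8-byte Geneve option that OVN adds,
which gives 58 bytes of encapsulation overhead per inner IP packet. The
constant and function names are illustrative only.

```python
# Sketch only: assumes an IPv4 underlay and OVN's 8-byte Geneve option.
OUTER_IPV4 = 20         # outer IPv4 header
OUTER_UDP = 8           # outer UDP header
GENEVE_BASE = 8         # fixed Geneve header
OVN_GENEVE_OPTION = 8   # assumed size of the OVN metadata option
INNER_ETHERNET = 14     # inner Ethernet header carried inside the tunnel

GENEVE_OVERHEAD = (OUTER_IPV4 + OUTER_UDP + GENEVE_BASE
                   + OVN_GENEVE_OPTION + INNER_ETHERNET)  # 58 bytes


def tenant_mtu(underlay_mtu: int) -> int:
    """Largest inner IP packet that fits in one underlay packet."""
    return underlay_mtu - GENEVE_OVERHEAD


def fits_underlay(inner_ip_len: int, underlay_mtu: int) -> bool:
    """True if the encapsulated packet needs no outer fragmentation."""
    return inner_ip_len + GENEVE_OVERHEAD <= underlay_mtu


print(tenant_mtu(1500))           # prints 1442
print(fits_underlay(1400, 1500))  # prints True
print(fits_underlay(1500, 1500))  # prints False
```

Under the same assumptions, tenant_mtu(9000) gives 8942, which matches the
tenant network MTU mentioned at the start of the thread if the underlay
there is 9000 (a guess, it is not stated).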


Am I right, or am I missing something? I need to actually try it and look
at the traffic/packets.





>
>
> Thanks,
> Han
>
>
>>
>>
>> On Wed, Jul 11, 2018 at 1:01 PM Daniel Alvarez Sanchez <
>> dalva...@redhat.com> wrote:
>>
>>> On Wed, Jul 11, 2018 at 12:55 PM Daniel Alvarez Sanchez <
>>> dalva...@redhat.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Miguel Angel Ajo and I have been trying to setup Jumbo frames in
>>>> OpenStack using OVN as a backend.
>>>>
>>>> The external network has an MTU of 1900 while we have created two
>>>> tenant networks (Logical Switches) with an MTU of 8942.
>>>>
>>>
>>> s/1900/1500
>>>
>>>>
>>>> When pinging from one instance in one of the networks to the other
>>>> instance on the other network, the routing takes place locally and
>>>> everything is fine. We can ping with -s 3000 and with tcpdump we verify
>>>> that the packets are not fragmented at all.
>>>>
>>>> However, when trying to reach the external network, we see that the
>>>> packets are not tried to be fragmented and the traffic doesn't go through.
>>>>
>>>> In the ML2/OVS case (reference implementation for OpenStack
>>>> networking), this works as we're seeing the following when attempting to
>>>> reach a network with a lower MTU:
>>>>
>>>
>>> Just to clarify, in the reference implementation (ML2/OVS) the routing
>>> takes place with iptables rules so we assume that it's the kernel
>>> processing those ICMP packets.
>>>
>>>>
>>>> 10:38:03.807695 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: ICMP
>>>> echo request, id 30977, seq 0, length 3008
>>>>
>>>> 10:38:03.807723 IP overcloud-controller-0 > 192.168.20.14: ICMP
>>>> dell-virt-lab-01.mgmt.com unreachable - need to frag (mtu 1500),
>>>> length 556
>>>>
>>>> As you can see, the router (overcloud-controller-0) is responding to
>>>> the instance with an ICMP need to frag

Re: [ovs-discuss] OVN - MTU path discovery

2018-07-24 Thread Han Zhou
On Tue, Jul 24, 2018 at 8:26 AM, Miguel Angel Ajo Pelayo <
majop...@redhat.com> wrote:

>
>
>
> On 24 July 2018 at 17:20:59, Han Zhou (zhou...@gmail.com) wrote:
>
>
>
> On Thu, Jul 12, 2018 at 7:03 AM, Miguel Angel Ajo Pelayo <
> majop...@redhat.com> wrote:
>
>> I believe we need to emit ICMP / need to frag messages to have proper
>> support
>> on different MTUs (on router sides), I wonder how does it work the other
>> way
>> around (when external net is 1500, and internal net is 1500-geneve
>> overhead).
>>
>
> I think this is expected since GW chassis forwards packets without going
> through IP stack.
> One solution might be using a network namespace on the GW node as an
> intermediate hop, so that IP stack on the GW will handle the fragmentation
> (or reply ICMP when DF is set). Of course this will have some latency
> added, and also increase complexity of the deployment, so I'd rather tune
> the MTU properly to avoid the problem. But if east-west performance is more
> important and HV <-> HV jumbo frame is supported, then probably it worth
> the namespace trick just to make external work regardless of internal MTU
> settings. Does this make sense?
>
>
> I believe we should avoid that path at all costs, it’s the way the neutron
> reference implementation was built and it’s slower. Also it has a lot of
> complexity.
>
>
> Sometimes the MTU will be just mismatched the internal network/ls has a
> bigger MTU to increase performance, but the external network is on the
> standard 1500, in some cases such thing could be circumvented by having a
> leg of the external router with big MTU just for ovn, but… if we look at
> how people use openstack for example, that probably render most of the
> deployments incompatible with ovn.
>
>
> For example, customers tend to have several provider networks + external
> networks, like legacy networks, different providers, etc.
>
>
>
>
>> Is there any way to match packet_size > X on a flow?
>>
>> How could we implement this?
>>
> I didn't find anything for matching packet_size in ovs-fields.7. Even we
> could do this in OVN (e.g. through controller action in slowpath), I wonder
> is it really better than relying on IP stack. Maybe blp or someone else
> could shed a light on this :)
>
> I think that would be undesirable also.
>
>
> I wonder how it works now when external network is generally on 1500 MTU,
> while Geneve has a lower mtu.
>
Do you mean for example: VM has MTU: 1400, while external network and eth0
(tunnel physical interface) of HVs and GWs are all 1500 MTU? Why would
there be a problem in this case? Or did I misunderstand?


>
>
> Thanks,
> Han
>
>
>>
>>
>> On Wed, Jul 11, 2018 at 1:01 PM Daniel Alvarez Sanchez <
>> dalva...@redhat.com> wrote:
>>
>>> On Wed, Jul 11, 2018 at 12:55 PM Daniel Alvarez Sanchez <
>>> dalva...@redhat.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Miguel Angel Ajo and I have been trying to setup Jumbo frames in
>>>> OpenStack using OVN as a backend.
>>>>
>>>> The external network has an MTU of 1900 while we have created two
>>>> tenant networks (Logical Switches) with an MTU of 8942.
>>>>
>>>
>>> s/1900/1500
>>>
>>>>
>>>> When pinging from one instance in one of the networks to the other
>>>> instance on the other network, the routing takes place locally and
>>>> everything is fine. We can ping with -s 3000 and with tcpdump we verify
>>>> that the packets are not fragmented at all.
>>>>
>>>> However, when trying to reach the external network, we see that the
>>>> packets are not tried to be fragmented and the traffic doesn't go through.
>>>>
>>>> In the ML2/OVS case (reference implementation for OpenStack
>>>> networking), this works as we're seeing the following when attempting to
>>>> reach a network with a lower MTU:
>>>>
>>>
>>> Just to clarify, in the reference implementation (ML2/OVS) the routing
>>> takes place with iptables rules so we assume that it's the kernel
>>> processing those ICMP packets.
>>>
>>>>
>>>> 10:38:03.807695 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: ICMP
>>>> echo request, id 30977, seq 0, length 3008
>>>>
>>>> 10:38:03.807723 IP overcloud-controller-0 > 192.168.20.14: ICMP
>>>> dell-virt-lab-01.mgmt.com unreachable - need to frag (mtu 1500),
>>>> length 556
>>>>
>>>> As you can see, the router (overcloud-controller-0) is responding to
>>>> the instance with an ICMP need to frag and after this, subsequent packets
>>>> are going fragmented (while replies are not):
>>>>
>>>> 0:38:34.630437 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: ICMP echo
>>>> request, id 31233, seq 0, length 1480
>>>>
>>>> 10:38:34.630458 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: icmp
>>>>
>>>> 10:38:34.630462 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: icmp
>>>>
>>>> 10:38:34.631334 IP dell-virt-lab-01.mgmt.com > 192.168.20.14: ICMP
>>>> echo reply, id 31233, seq 0, length 3008
>>>>
>>>>
>>>>
>>>> Are we missing some configuration or we lack support for this in OVN?
>>>>
>>>> Any pointers are highly appreciated :)
>>>>
>>>>
>>>> Thanks

Re: [ovs-discuss] OVN - MTU path discovery

2018-07-24 Thread Miguel Angel Ajo Pelayo
On 24 July 2018 at 17:20:59, Han Zhou (zhou...@gmail.com) wrote:



On Thu, Jul 12, 2018 at 7:03 AM, Miguel Angel Ajo Pelayo <
majop...@redhat.com> wrote:

> I believe we need to emit ICMP / need to frag messages to have proper
> support
> on different MTUs (on router sides), I wonder how does it work the other
> way
> around (when external net is 1500, and internal net is 1500-geneve
> overhead).
>

I think this is expected since GW chassis forwards packets without going
through IP stack.
One solution might be using a network namespace on the GW node as an
intermediate hop, so that IP stack on the GW will handle the fragmentation
(or reply ICMP when DF is set). Of course this will have some latency
added, and also increase complexity of the deployment, so I'd rather tune
the MTU properly to avoid the problem. But if east-west performance is more
important and HV <-> HV jumbo frame is supported, then probably it worth
the namespace trick just to make external work regardless of internal MTU
settings. Does this make sense?


I believe we should avoid that path at all costs; it’s the way the neutron
reference implementation was built and it’s slower. It also adds a lot of
complexity.


Sometimes the MTU will simply be mismatched: the internal network/LS has a
bigger MTU to increase performance, but the external network is on the
standard 1500. In some cases this could be circumvented by having a leg of
the external router with a big MTU just for OVN, but if we look at how
people use OpenStack, for example, that would probably render most
deployments incompatible with OVN.


For example, customers tend to have several provider networks + external
networks, like legacy networks, different providers, etc.




> Is there any way to match packet_size > X on a flow?
>
> How could we implement this?
>
I didn't find anything for matching packet_size in ovs-fields.7. Even we
could do this in OVN (e.g. through controller action in slowpath), I wonder
is it really better than relying on IP stack. Maybe blp or someone else
could shed a light on this :)

I think that would be undesirable also.


I wonder how it works now, when the external network is generally on a 1500
MTU while Geneve has a lower MTU.



Thanks,
Han


>
>
> On Wed, Jul 11, 2018 at 1:01 PM Daniel Alvarez Sanchez <
> dalva...@redhat.com> wrote:
>
>> On Wed, Jul 11, 2018 at 12:55 PM Daniel Alvarez Sanchez <
>> dalva...@redhat.com> wrote:
>>
>>> Hi all,
>>>
>>> Miguel Angel Ajo and I have been trying to setup Jumbo frames in
>>> OpenStack using OVN as a backend.
>>>
>>> The external network has an MTU of 1900 while we have created two tenant
>>> networks (Logical Switches) with an MTU of 8942.
>>>
>>
>> s/1900/1500
>>
>>>
>>> When pinging from one instance in one of the networks to the other
>>> instance on the other network, the routing takes place locally and
>>> everything is fine. We can ping with -s 3000 and with tcpdump we verify
>>> that the packets are not fragmented at all.
>>>
>>> However, when trying to reach the external network, we see that the
>>> packets are not tried to be fragmented and the traffic doesn't go through.
>>>
>>> In the ML2/OVS case (reference implementation for OpenStack networking),
>>> this works as we're seeing the following when attempting to reach a network
>>> with a lower MTU:
>>>
>>
>> Just to clarify, in the reference implementation (ML2/OVS) the routing
>> takes place with iptables rules so we assume that it's the kernel
>> processing those ICMP packets.
>>
>>>
>>> 10:38:03.807695 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: ICMP echo
>>> request, id 30977, seq 0, length 3008
>>>
>>> 10:38:03.807723 IP overcloud-controller-0 > 192.168.20.14: ICMP
>>> dell-virt-lab-01.mgmt.com unreachable - need to frag (mtu 1500), length
>>> 556
>>>
>>> As you can see, the router (overcloud-controller-0) is responding to the
>>> instance with an ICMP need to frag and after this, subsequent packets are
>>> going fragmented (while replies are not):
>>>
>>> 0:38:34.630437 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: ICMP echo
>>> request, id 31233, seq 0, length 1480
>>>
>>> 10:38:34.630458 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: icmp
>>>
>>> 10:38:34.630462 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: icmp
>>>
>>> 10:38:34.631334 IP dell-virt-lab-01.mgmt.com > 192.168.20.14: ICMP echo
>>> reply, id 31233, seq 0, length 3008
>>>
>>>
>>>
>>> Are we missing some configuration or we lack support for this in OVN?
>>>
>>> Any pointers are highly appreciated :)
>>>
>>>
>>> Thanks a lot.
>>>
>>> Daniel Alvarez
>>>
>>>
>>
>> ___
>> discuss mailing list
>> disc...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>>
>
> ___
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>
>

Re: [ovs-discuss] OVN - MTU path discovery

2018-07-24 Thread Han Zhou
On Thu, Jul 12, 2018 at 7:03 AM, Miguel Angel Ajo Pelayo <
majop...@redhat.com> wrote:

> I believe we need to emit ICMP / need to frag messages to have proper
> support
> on different MTUs (on router sides), I wonder how does it work the other
> way
> around (when external net is 1500, and internal net is 1500-geneve
> overhead).
>

I think this is expected, since the GW chassis forwards packets without
going through the IP stack.
One solution might be to use a network namespace on the GW node as an
intermediate hop, so that the IP stack on the GW handles the fragmentation
(or replies with ICMP when DF is set). Of course this adds some latency
and also increases the complexity of the deployment, so I'd rather tune
the MTU properly to avoid the problem. But if east-west performance is more
important and HV <-> HV jumbo frames are supported, then the namespace
trick is probably worth it just to make external traffic work regardless of
internal MTU settings. Does this make sense?


> Is there any way to match packet_size > X on a flow?
>
> How could we implement this?
>
I didn't find anything for matching packet_size in ovs-fields.7. Even if we
could do this in OVN (e.g. through a controller action in the slow path), I
wonder whether it is really better than relying on the IP stack. Maybe blp
or someone else could shed some light on this :)

Thanks,
Han


>
>
> On Wed, Jul 11, 2018 at 1:01 PM Daniel Alvarez Sanchez <
> dalva...@redhat.com> wrote:
>
>> On Wed, Jul 11, 2018 at 12:55 PM Daniel Alvarez Sanchez <
>> dalva...@redhat.com> wrote:
>>
>>> Hi all,
>>>
>>> Miguel Angel Ajo and I have been trying to setup Jumbo frames in
>>> OpenStack using OVN as a backend.
>>>
>>> The external network has an MTU of 1900 while we have created two tenant
>>> networks (Logical Switches) with an MTU of 8942.
>>>
>>
>> s/1900/1500
>>
>>>
>>> When pinging from one instance in one of the networks to the other
>>> instance on the other network, the routing takes place locally and
>>> everything is fine. We can ping with -s 3000 and with tcpdump we verify
>>> that the packets are not fragmented at all.
>>>
>>> However, when trying to reach the external network, we see that the
>>> packets are not tried to be fragmented and the traffic doesn't go through.
>>>
>>> In the ML2/OVS case (reference implementation for OpenStack networking),
>>> this works as we're seeing the following when attempting to reach a network
>>> with a lower MTU:
>>>
>>
>> Just to clarify, in the reference implementation (ML2/OVS) the routing
>> takes place with iptables rules so we assume that it's the kernel
>> processing those ICMP packets.
>>
>>>
>>> 10:38:03.807695 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: ICMP echo
>>> request, id 30977, seq 0, length 3008
>>>
>>> 10:38:03.807723 IP overcloud-controller-0 > 192.168.20.14: ICMP
>>> dell-virt-lab-01.mgmt.com unreachable - need to frag (mtu 1500), length
>>> 556
>>>
>>> As you can see, the router (overcloud-controller-0) is responding to the
>>> instance with an ICMP need to frag and after this, subsequent packets are
>>> going fragmented (while replies are not):
>>>
>>> 0:38:34.630437 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: ICMP echo
>>> request, id 31233, seq 0, length 1480
>>>
>>> 10:38:34.630458 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: icmp
>>>
>>> 10:38:34.630462 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: icmp
>>>
>>> 10:38:34.631334 IP dell-virt-lab-01.mgmt.com > 192.168.20.14: ICMP echo
>>> reply, id 31233, seq 0, length 3008
>>>
>>>
>>>
>>> Are we missing some configuration or we lack support for this in OVN?
>>>
>>> Any pointers are highly appreciated :)
>>>
>>>
>>> Thanks a lot.
>>>
>>> Daniel Alvarez
>>>
>>>
>>
>> ___
>> discuss mailing list
>> disc...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>>
>
> ___
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>
>


Re: [ovs-discuss] OVN - MTU path discovery

2018-07-12 Thread Miguel Angel Ajo Pelayo
I believe we need to emit ICMP "need to frag" messages to have proper
support for different MTUs (on the router sides). I wonder how it works the
other way around (when the external net is 1500 and the internal net is
1500 minus the Geneve overhead).

Is there any way to match packet_size > X on a flow?

How could we implement this?
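
As a starting point for the discussion, the behaviour a router would need
can be sketched as below. This is not OVN code and not an existing OVS
match or action; the names are hypothetical and it only models the
decision an IP router makes per RFC 1191 (ICMP type 3, code 4 carrying the
next-hop MTU when DF is set).

```python
from enum import Enum
from typing import Optional, Tuple


class Verdict(Enum):
    FORWARD = "forward"
    FRAGMENT = "fragment"
    ICMP_FRAG_NEEDED = "icmp type 3 code 4"


def route_decision(ip_total_len: int, dont_fragment: bool,
                   egress_mtu: int) -> Tuple[Verdict, Optional[int]]:
    """Return the verdict and, for ICMP, the next-hop MTU to report."""
    if ip_total_len <= egress_mtu:
        return Verdict.FORWARD, None
    if dont_fragment:
        # The sender is told the MTU of the next hop and is expected to
        # lower its packet size.
        return Verdict.ICMP_FRAG_NEEDED, egress_mtu
    return Verdict.FRAGMENT, None


# ping -s 3000 gives a 3028-byte IP packet (20 IP + 8 ICMP + 3000 data).
print(route_decision(3028, True, 1500))
```

For the 3028-byte packet with DF set and a 1500-byte egress MTU, the sketch
returns the ICMP verdict with an MTU of 1500, which is the "need to frag
(mtu 1500)" reply seen in the ML2/OVS capture quoted below.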



On Wed, Jul 11, 2018 at 1:01 PM Daniel Alvarez Sanchez 
wrote:

> On Wed, Jul 11, 2018 at 12:55 PM Daniel Alvarez Sanchez <
> dalva...@redhat.com> wrote:
>
>> Hi all,
>>
>> Miguel Angel Ajo and I have been trying to setup Jumbo frames in
>> OpenStack using OVN as a backend.
>>
>> The external network has an MTU of 1900 while we have created two tenant
>> networks (Logical Switches) with an MTU of 8942.
>>
>
> s/1900/1500
>
>>
>> When pinging from one instance in one of the networks to the other
>> instance on the other network, the routing takes place locally and
>> everything is fine. We can ping with -s 3000 and with tcpdump we verify
>> that the packets are not fragmented at all.
>>
>> However, when trying to reach the external network, we see that the
>> packets are not tried to be fragmented and the traffic doesn't go through.
>>
>> In the ML2/OVS case (reference implementation for OpenStack networking),
>> this works as we're seeing the following when attempting to reach a network
>> with a lower MTU:
>>
>
> Just to clarify, in the reference implementation (ML2/OVS) the routing
> takes place with iptables rules so we assume that it's the kernel
> processing those ICMP packets.
>
>>
>> 10:38:03.807695 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: ICMP echo
>> request, id 30977, seq 0, length 3008
>>
>> 10:38:03.807723 IP overcloud-controller-0 > 192.168.20.14: ICMP
>> dell-virt-lab-01.mgmt.com unreachable - need to frag (mtu 1500), length
>> 556
>>
>> As you can see, the router (overcloud-controller-0) is responding to the
>> instance with an ICMP need to frag and after this, subsequent packets are
>> going fragmented (while replies are not):
>>
>> 0:38:34.630437 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: ICMP echo
>> request, id 31233, seq 0, length 1480
>>
>> 10:38:34.630458 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: icmp
>>
>> 10:38:34.630462 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: icmp
>>
>> 10:38:34.631334 IP dell-virt-lab-01.mgmt.com > 192.168.20.14: ICMP echo
>> reply, id 31233, seq 0, length 3008
>>
>>
>>
>> Are we missing some configuration or we lack support for this in OVN?
>>
>> Any pointers are highly appreciated :)
>>
>>
>> Thanks a lot.
>>
>> Daniel Alvarez
>>
>>
>
> ___
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>


Re: [ovs-discuss] OVN - MTU path discovery

2018-07-11 Thread Daniel Alvarez Sanchez
On Wed, Jul 11, 2018 at 12:55 PM Daniel Alvarez Sanchez 
wrote:

> Hi all,
>
> Miguel Angel Ajo and I have been trying to setup Jumbo frames in OpenStack
> using OVN as a backend.
>
> The external network has an MTU of 1900 while we have created two tenant
> networks (Logical Switches) with an MTU of 8942.
>

s/1900/1500

>
> When pinging from one instance in one of the networks to the other
> instance on the other network, the routing takes place locally and
> everything is fine. We can ping with -s 3000 and with tcpdump we verify
> that the packets are not fragmented at all.
>
> However, when trying to reach the external network, we see that the
> packets are not tried to be fragmented and the traffic doesn't go through.
>
> In the ML2/OVS case (reference implementation for OpenStack networking),
> this works as we're seeing the following when attempting to reach a network
> with a lower MTU:
>

Just to clarify, in the reference implementation (ML2/OVS) the routing
takes place with iptables rules so we assume that it's the kernel
processing those ICMP packets.

>
> 10:38:03.807695 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: ICMP echo
> request, id 30977, seq 0, length 3008
>
> 10:38:03.807723 IP overcloud-controller-0 > 192.168.20.14: ICMP
> dell-virt-lab-01.mgmt.com unreachable - need to frag (mtu 1500), length
> 556
>
> As you can see, the router (overcloud-controller-0) is responding to the
> instance with an ICMP need to frag and after this, subsequent packets are
> going fragmented (while replies are not):
>
> 0:38:34.630437 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: ICMP echo
> request, id 31233, seq 0, length 1480
>
> 10:38:34.630458 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: icmp
>
> 10:38:34.630462 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: icmp
>
> 10:38:34.631334 IP dell-virt-lab-01.mgmt.com > 192.168.20.14: ICMP echo
> reply, id 31233, seq 0, length 3008
>
>
>
> Are we missing some configuration or we lack support for this in OVN?
>
> Any pointers are highly appreciated :)
>
>
> Thanks a lot.
>
> Daniel Alvarez
>
>


[ovs-discuss] OVN - MTU path discovery

2018-07-11 Thread Daniel Alvarez Sanchez
Hi all,

Miguel Angel Ajo and I have been trying to set up jumbo frames in OpenStack
using OVN as a backend.

The external network has an MTU of 1900 while we have created two tenant
networks (Logical Switches) with an MTU of 8942.

When pinging from one instance in one of the networks to the other instance
on the other network, the routing takes place locally and everything is
fine. We can ping with -s 3000 and with tcpdump we verify that the packets
are not fragmented at all.

However, when trying to reach the external network, we see that no attempt
is made to fragment the packets and the traffic doesn't go through.

In the ML2/OVS case (reference implementation for OpenStack networking),
this works, and we see the following when attempting to reach a network
with a lower MTU:

10:38:03.807695 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: ICMP echo
request, id 30977, seq 0, length 3008

10:38:03.807723 IP overcloud-controller-0 > 192.168.20.14: ICMP
dell-virt-lab-01.mgmt.com unreachable - need to frag (mtu 1500), length 556

As you can see, the router (overcloud-controller-0) responds to the
instance with an ICMP "need to frag", and after this subsequent packets are
sent fragmented (while replies are not):

10:38:34.630437 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: ICMP echo
request, id 31233, seq 0, length 1480

10:38:34.630458 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: icmp

10:38:34.630462 IP 192.168.20.14 > dell-virt-lab-01.mgmt.com: icmp

10:38:34.631334 IP dell-virt-lab-01.mgmt.com > 192.168.20.14: ICMP echo
reply, id 31233, seq 0, length 3008
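
The three outgoing fragments in the capture above are consistent with
plain IPv4 fragmentation arithmetic. A minimal sketch, assuming a 20-byte
IPv4 header without options (the function name is made up for
illustration):

```python
from typing import List


def fragment_payload_sizes(ip_payload_len: int, mtu: int,
                           ip_header_len: int = 20) -> List[int]:
    """Split an IP payload into per-fragment payload sizes.

    Every fragment except the last must carry a multiple of 8 bytes,
    because the fragment offset field counts 8-byte units.
    """
    max_chunk = (mtu - ip_header_len) // 8 * 8
    sizes = []
    remaining = ip_payload_len
    while remaining > max_chunk:
        sizes.append(max_chunk)
        remaining -= max_chunk
    sizes.append(remaining)
    return sizes


# ping -s 3000: 8-byte ICMP header + 3000 bytes of data = 3008 bytes.
print(fragment_payload_sizes(3008, 1500))  # prints [1480, 1480, 48]
```

The first fragment carries 1480 bytes, which matches the "length 1480" in
the first line of the capture, followed by two more fragments.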



Are we missing some configuration, or do we lack support for this in OVN?

Any pointers are highly appreciated :)


Thanks a lot.

Daniel Alvarez