On 04/25/2017 04:45 AM, Zhou, Danny wrote:

Thanks Pierre, comments inline.

*From:* Pierre Pfister (ppfister) [mailto:ppfis...@cisco.com]
*Sent:* Tuesday, April 25, 2017 4:11 PM
*To:* Ni, Hongjun <hongjun...@intel.com>
*Cc:* Zhou, Danny <danny.z...@intel.com>; Ed Warnicke <hagb...@gmail.com>; Li, Johnson <johnson...@intel.com>; vpp-dev@lists.fd.io
*Subject:* Re: [vpp-dev] Requirement on Load Balancer plugin for VPP

    On 25 Apr 2017, at 09:52, Ni, Hongjun <hongjun...@intel.com> wrote:

    Hi Pierre,

    For the LB distribution case, I think we could assign a node IP to
    each LB box.

    When packets are received from a client, the LB will do both SNAT
    and DNAT, i.e. source IP -> LB's node IP, destination IP -> AS's IP.

    When packets return from the AS, the LB also does both DNAT and
    SNAT, i.e. source IP -> VIP, destination IP -> client's IP.
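
    For concreteness, a hypothetical walk-through of the rewrites (all
    addresses are made-up documentation examples):

        client -> LB :  src 198.51.100.10       dst 203.0.113.1 (VIP)
        LB -> AS     :  src 10.0.0.1 (node IP)  dst 10.1.1.5 (AS)
        AS -> LB     :  src 10.1.1.5            dst 10.0.0.1
        LB -> client :  src 203.0.113.1 (VIP)   dst 198.51.100.10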

Does NSH solve this problem of transparently forwarding the traffic?

I see.

Doing so, you completely hide the client's source address from the application.

You also require per-connection binding at the load balancer. (MagLev does per-connection binding, but in a way that allows for hash collisions, because it is not a big deal if two flows use the same entry in the hash table. This allows for a smaller, fixed-size hash table, which also gives MagLev a performance advantage.)
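
As a purely illustrative sketch of why collisions are harmless in such a table (invented names and sizing, not VPP's actual data structures), in C:

    #include <stdint.h>

    #define LB_BUCKETS 1024   /* fixed size, power of two */

    typedef struct
    {
      uint32_t as_index;      /* AS chosen when this bucket was last refreshed */
    } lb_bucket_t;

    static lb_bucket_t lb_table[LB_BUCKETS];

    /* Toy 5-tuple hash, for illustration only. */
    static uint32_t
    flow_hash (uint32_t saddr, uint32_t daddr, uint16_t sport, uint16_t dport)
    {
      uint32_t h = saddr * 2654435761u;
      h ^= daddr * 2246822519u;
      h ^= (((uint32_t) sport << 16) | dport) * 3266489917u;
      return h & (LB_BUCKETS - 1);
    }

    /* Two colliding flows read the same bucket and simply go to the same
     * AS -- harmless for correctness, which is what lets the table stay
     * small and fixed-size, with O(1) lookups and bounded memory. */
    static uint32_t
    lb_pick_as (uint32_t saddr, uint32_t daddr, uint16_t sport, uint16_t dport)
    {
      return lb_table[flow_hash (saddr, daddr, sport, dport)].as_index;
    }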

In my humble opinion, using SNAT+DNAT is a terribly bad idea, so I would advise you to reconsider and find a way to either:

- Enable any type of packet tunneling protocol in your ASs (IPinIP, L2TP, whatever other protocol), and extend VPP's LB plugin with the one you pick (an AS-side decap sketch follows this list).

- Put some box closer to the ASs (a bump in the wire) to do the decap.

- If your routers support MPLS, you could also use it as encap.
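
As an illustration of the first option, an AS running Linux can typically terminate IPinIP with stock iproute2; a sketch with made-up addresses (10.0.0.1 being the LB's tunnel source, 203.0.113.1 the VIP):

    # On the AS: decap IPinIP coming from the LB...
    ip tunnel add lb0 mode ipip local 10.1.1.5 remote 10.0.0.1
    ip link set lb0 up
    # ...and hold the VIP locally so decapped packets are accepted.
    ip addr add 203.0.113.1/32 dev lo

(Depending on the distribution, reverse-path filtering may also need relaxing on the AS.)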

*[Zhou, Danny]* In a cloud environment where hundreds or thousands of ASs are dynamically deployed in VMs or containers, it is not easy for the orchestrator (which has the global view) to find a close enough box that can be configured automatically to offload the encap/decap work. Most likely it will still be software doing the encap/decap. Secondly, if we are targeting small-packet line-rate performance, adding tunnel headers increases the total packet size, which decreases packet efficiency and causes packet loss. I would consider adding GRE tunnels for LB an abuse of the tunneling protocol, as those protocols were not designed for this case. SNAT + DNAT has its own disadvantages, but it is widely used in software-centric cloud environments orchestrated by OpenStack or Kubernetes.
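
To put a number on the small-packet concern (back-of-envelope, assuming GRE over IPv4, i.e. 24 bytes of outer headers): a 64-byte frame grows to 88 bytes, and counting the 20 bytes of preamble and inter-frame gap per packet, 10GbE line rate drops from 10^10 / (84 * 8) ≈ 14.88 Mpps to 10^10 / (108 * 8) ≈ 11.57 Mpps, i.e. roughly 22% of the achievable packet rate is gone before any encap/decap CPU cost is counted.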

If you really want to use SNAT+DNAT (god forbid), and are willing to suffer (or somehow enjoy suffering), you may try to:

- Use VPP's SNAT plugin on the client-facing interface. The SNAT will just change the clients' source addresses to one of the LB's addresses.

- Extend VPP's LB plugin to support DNAT "encap".

- Extend VPP's LB plugin to support return traffic and stateless SNAT based on the LB flow table (and find a way to make that work on multiple cores...).

The client->AS traffic, in VPP, would do ---> client-facing-iface --> SNAT --> LB(DNAT) --> AS-facing-iface

The AS->client traffic, in VPP, would do ---> AS-facing-iface --> LB(Stateless SNAT) --> SNAT Plugin (doing DNAT-back) --> client-facing-iface
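
To make that last extension concrete, a hypothetical sketch of the stateless source rewrite on the return path (invented names, not actual VPP plugin code); the IPv4 header checksum is patched incrementally per RFC 1624, and the TCP/UDP pseudo-header checksum would need the same treatment (omitted here):

    #include <stdint.h>

    /* Hypothetical per-AS record, filled in when an AS is added to a VIP. */
    typedef struct { uint32_t as_addr; uint32_t vip; } lb_as_map_t;

    /* RFC 1624 incremental checksum update for one rewritten 32-bit field. */
    static uint16_t
    ip_csum_update (uint16_t old_csum, uint32_t old_field, uint32_t new_field)
    {
      uint32_t sum = (uint16_t) ~old_csum;
      sum += (uint16_t) ~(old_field >> 16);
      sum += (uint16_t) ~(old_field & 0xffff);
      sum += new_field >> 16;
      sum += new_field & 0xffff;
      while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
      return (uint16_t) ~sum;
    }

    /* Return path: rewrite the source from the AS's address back to the
     * VIP -- stateless, so no per-packet allocation and no lock. */
    static void
    lb_snat_return (uint32_t *ip_src, uint16_t *ip_csum, const lb_as_map_t *m)
    {
      uint32_t old = *ip_src;
      *ip_src = m->vip;
      *ip_csum = ip_csum_update (*ip_csum, old, m->vip);
    }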

Now the choice is all yours.

But don't say I didn't warn you.

Cheers,

- Pierre



    Thanks,

    Hongjun

    *From:* Pierre Pfister (ppfister) [mailto:ppfis...@cisco.com]
    *Sent:* Tuesday, April 25, 2017 3:12 PM
    *To:* Zhou, Danny <danny.z...@intel.com>
    *Cc:* Ni, Hongjun <hongjun...@intel.com>; Ed Warnicke <hagb...@gmail.com>; Li, Johnson <johnson...@intel.com>; vpp-dev@lists.fd.io
    *Subject:* Re: [vpp-dev] Requirement on Load Balancer plugin for VPP

    Hello all,

    As mentioned by Ed, introducing return traffic would dramatically
    reduce the performance of the solution.

    -> Return traffic typically consists of data packets, whereas
    forward traffic mostly consists of ACKs. So you will need
    significantly more LB boxes if you want to carry all your return
    traffic.

    -> Having to deal with return traffic also means that we need to
    either make sure return traffic goes through the same core, or add
    locks to the structures (for now, everything is lockless and
    per-core), or steer traffic from core to core.
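
    One known way to satisfy the same-core constraint without locks (a
    sketch of the general technique, not something the plugin does
    today) is to make the worker-dispatch hash symmetric, computed on
    the post-DNAT tuple {client, AS} on the forward path so that it
    matches the {AS, client} return path:

        #include <stdint.h>

        /* Commutative in addresses and in ports: both directions of a
         * flow hash identically, so they can be steered to the same core. */
        static uint32_t
        symmetric_flow_hash (uint32_t addr_a, uint32_t addr_b,
                             uint16_t port_a, uint16_t port_b)
        {
          uint32_t h = (addr_a ^ addr_b) * 2654435761u;
          h ^= ((uint32_t) (port_a ^ port_b)) * 2246822519u;
          h ^= h >> 16;
          return h;
        }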

    There is also something that I am not sure I understand. You
    mentioned DNAT in order to steer the traffic to the ASs, but how do
    you make sure the return traffic goes back to the LB? My guess is
    that all the traffic coming out of the ASs is routed toward one LB,
    is that right? How do you make sure the return traffic is evenly
    distributed between LBs?

    It's a pretty interesting requirement that you have, but I am
    quite sure the solution will have to be quite far from MagLev's
    design, and probably less efficient.

    - Pierre

        On 25 Apr 2017, at 05:11, Zhou, Danny <danny.z...@intel.com> wrote:

        Sharing my two cents as well:

        Firstly, introducing GRE or any other tunneling protocol to the
        LB adds performance overhead (for encap and decap) to both the
        load balancer and the network service. Secondly, the mechanism
        on the network service node not only needs to decap the GRE but
        also needs to perform a DNAT operation to change the
        destination IP of the original frame from the LB’s IP to the
        service entity’s IP, which adds complexity to the network
        service.

        Existing well-known load balancers such as Netfilter or Nginx
        do not adopt this tunneling approach; they simply do a
        service-node selection followed by a NAT operation.
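
        For reference, this is roughly what that NAT approach looks
        like in LVS/IPVS (the Netfilter-based balancer; addresses are
        illustrative):

            # VIP 203.0.113.1:80, two real servers, NAT ("masquerading") mode
            ipvsadm -A -t 203.0.113.1:80 -s rr
            ipvsadm -a -t 203.0.113.1:80 -r 10.1.1.5:80 -m
            ipvsadm -a -t 203.0.113.1:80 -r 10.1.1.6:80 -m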

        -Danny

        *From:* vpp-dev-boun...@lists.fd.io [mailto:vpp-dev-boun...@lists.fd.io] *On Behalf Of* Ni, Hongjun
        *Sent:* Tuesday, April 25, 2017 11:05 AM
        *To:* Ed Warnicke <hagb...@gmail.com>
        *Cc:* Li, Johnson <johnson...@intel.com>; vpp-dev@lists.fd.io
        *Subject:* Re: [vpp-dev] Requirement on Load Balancer plugin for VPP

        Hi Ed,

        Thanks for your prompt response.

        This item is required to handle legacy ASs, because some legacy
        ASs do not want to change their underlying forwarding
        infrastructure.

        Besides, some AS IPs are private and invisible outside the AS
        cluster domain, and are not allowed to be exposed to the
        external network.

        Thanks,

        Hongjun

        *From:* Ed Warnicke [mailto:hagb...@gmail.com]
        *Sent:* Tuesday, April 25, 2017 10:44 AM
        *To:* Ni, Hongjun <hongjun...@intel.com>
        *Cc:* vpp-dev@lists.fd.io; Li, Johnson <johnson...@intel.com>
        *Subject:* Re: [vpp-dev] Requirement on Load Balancer plugin for VPP

        Hongjun,

        I can see this point of view, but it radically reduces the
        scalability of the whole system.

        Wouldn't it just make sense to run VPP or some other mechanism
        to decap the GRE on whatever is running the AS, and feed
        whatever we are load balancing to? Forcing return traffic
        through the central load balancer radically reduces scalability
        (which is why Maglev, which inspired what we are doing here,
        doesn't do it that way either).

        Ed

        On Mon, Apr 24, 2017 at 7:18 PM, Ni, Hongjun
        <hongjun...@intel.com> wrote:

            Hey,

            Currently, traffic received for a given VIP (or VIP prefix)
            is tunneled using GRE towards the different ASs, in a way
            that (tries to) ensure that a given session will always be
            tunneled to the same AS.
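
            For reference, that behaviour is configured roughly like
            this with the plugin's CLI (addresses are illustrative):

                lb conf ip4-src-address 10.0.0.1
                lb vip 203.0.113.0/24 encap gre4 new_len 1024
                lb as 203.0.113.0/24 10.1.1.5 10.1.1.6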

            But in real environments, many Application Servers do not
            support GRE.

            So we raise a requirement for LB in VPP:

            (1). When traffic is received for a VIP, the LB needs to do
            load balancing, then DNAT to change the traffic’s
            destination IP from the VIP to the AS’s IP.

            (2). When traffic returns from an AS, the LB will first do
            SNAT to change the traffic’s source IP from the AS’s IP to
            the VIP, then match the load-balancing session, and then
            send the traffic to the client.

            Any comments about this requirement are welcome.

            Thanks a lot,

            Hongjun






--
*Thomas F Herbert*
Fast Data Planes
Office of Technology
*Red Hat*