Raj,

Thanks for the insight.
It looks like it was the buffer size. The rx buffer was increased on the Lustre nodes, and there have been no more dropped packets.
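(For the archives, the change was along these lines; "eth0" is just a placeholder for the 40G interface, and the exact fix may also involve kernel socket buffer sysctls rather than the NIC ring, so treat this as a sketch:)

    # show the current and maximum rx/tx ring sizes supported by the NIC
    ethtool -g eth0
    # raise the rx ring toward the hardware maximum (4096 is an example value)
    ethtool -G eth0 rx 4096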

Brian Andrus




On 12/5/2017 11:12 AM, Raj wrote:
Brian,
I would check the following:
- MTU size must be the same across all the nodes (servers + clients)
- peer_credits and credits must be the same across all the nodes
- /proc/sys/lnet/peers can show whether you are constantly seeing negative credits (see the commands sketched below)
- Buffer overflow counters on the switches, if the switch provides them. If the switch buffers are too small to handle the I/O stream, you may want to reduce credits.
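For example, a quick way to check MTU, credits, and peer state (ksocklnd is the socket LND for ethernet; the parameter paths are illustrative and assume socklnd):

    # per-peer credit usage; persistently negative "min" values indicate contention
    cat /proc/sys/lnet/peers
    # LND module parameters that should match on every server and client
    cat /sys/module/ksocklnd/parameters/credits
    cat /sys/module/ksocklnd/parameters/peer_credits
    # MTU on the LNET interface ("eth0" is a placeholder)
    ip link show eth0 | grep mtu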

-Raj


On Tue, Dec 5, 2017 at 11:56 AM Brian Andrus <toomuc...@gmail.com> wrote:

    Shawn,

    Flow control is configured, and these connections are all on the
    same 40 GbE subnet, directly connected to the same switch.

    I'm a little new to lnet_selftest, but when I run it 1:1 I do see
    the dropped-packet count on the client node go up significantly.
    The node I set up as the server does not drop any packets.
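    (For reference, a minimal 1:1 run looks roughly like this; the
    NIDs are placeholders and the lnet_selftest module has to be
    loaded on both nodes first:)

        export LST_SESSION=$$
        lst new_session drop_test
        lst add_group clients 10.0.0.1@tcp
        lst add_group servers 10.0.0.2@tcp
        lst add_batch bulk_write
        lst add_test --batch bulk_write --from clients --to servers \
            brw write size=1M
        lst run bulk_write
        lst stat clients servers     # watch rates while checking drop counters
        lst end_session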

    Brian Andrus


    On 12/5/2017 9:20 AM, Shawn Hall wrote:
    Hi Brian,

    Do you have flow control configured on all ports that are on the
    network path? Lustre tends to cause packet loss in ways that
    performance-testing tools don't, because of its N-to-1 packet
    flows, so flow control is often necessary. lnet_selftest should
    replicate this behavior.
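    (On the Linux hosts, pause-frame settings can be checked and set
    with ethtool; "eth0" is a placeholder, and the switch side is
    vendor-specific:)

        # show whether rx/tx pause frames are enabled
        ethtool -a eth0
        # enable flow control in both directions
        ethtool -A eth0 rx on tx on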

    Is there a point in the network path where the link bandwidth
    changes (e.g. 40 GbE down to 10 GbE, or 2x40 GbE down to 1x40
    GbE)? That will commonly be the biggest point of loss if flow
    control isn’t doing its job.

    Shawn

    On 12/5/17, 11:49 AM, "lustre-discuss on behalf of
    jongwoo...@naver.com" <lustre-discuss-boun...@lists.lustre.org on
    behalf of jongwoo...@naver.com> wrote:

    Did you check your connection with iperf and iperf3 in TCP mode?
    If so, those tools cannot detect packet drops.

    Also try checking the reliability of your block device backend
    with benchmark tools like vdbench or bonnie++. Sometimes a bad
    block device causes incorrect data transfers.
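    (For example; the address, size, and mount point below are
    placeholders:)

        # network: TCP throughput test, check retransmit counts on both ends
        iperf3 -s                          # on one node
        iperf3 -c 10.0.0.2 -t 60 -P 4      # on the other node
        # storage: exercise the OST backing device directly
        bonnie++ -d /mnt/ost0_test -s 16g -u root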

    -----Original Message-----
    From: "Brian Andrus" <toomuc...@gmail.com>
    To: "lustre-discuss@lists.lustre.org" <lustre-discuss@lists.lustre.org>
    Cc:
    Sent: 2017-12-06 (Wed) 01:38:04
    Subject: [lustre-discuss] lustre causing dropped packets

    All,

    I have a small setup I am testing (1 MGS, 2 OSS) that is
    connected via 40G Ethernet.

    I notice that anything that writes to the Lustre filesystem
    causes dropped packets. Reads do not seem to cause this. I have
    also tested the network (iperf, iperf3, general traffic) with no
    dropped packets.
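    (For context, these are the kinds of counters being watched;
    "eth0" is a placeholder:)

        # per-interface drop/error counters
        ip -s link show eth0
        # NIC/driver-level counters, usually more detailed
        ethtool -S eth0 | grep -i drop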

    Is there something with writes that can cause dropped packets?


    Brian Andrus

    _______________________________________________
    lustre-discuss mailing list
    lustre-discuss@lists.lustre.org
    http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



    _______________________________________________
    lustre-discuss mailing list
    lustre-discuss@lists.lustre.org
    http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
