Wido,

Take the switch out of the path between the nodes and remeasure. ICMP echo
requests are very low-priority traffic for switches and network stacks.

If you really want to know, place a network analyzer between the nodes and
measure the latency from request packet to response packet. The ICMP timing
reported to the "ping application" is not accurate in the sub-millisecond
range and should only be used as a rough estimate.
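
If you don't have an analyzer handy, capturing the on-the-wire timestamps
gets you closer than ping's own numbers. A minimal sketch, assuming the
relevant interface is eth0 (adjust to your NIC):

$ tcpdump -i eth0 -n -tt 'icmp[icmptype] == icmp-echo or icmp[icmptype] == icmp-echoreply'

The gap between the request and reply timestamps is the round trip as the
kernel saw it. Keep in mind these timestamps are still taken in software,
so a hardware tap or analyzer remains more accurate.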

You also may want to install the high-resolution timer patch, sometimes
called HRT, in the kernel, which may give you different results.
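
A quick sketch to check whether high-resolution timers are already enabled
and which clocksource the kernel is using (paths assume a typical Linux
distribution):

$ grep CONFIG_HIGH_RES_TIMERS /boot/config-$(uname -r)
$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource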

ICMP traffic takes a different path than TCP traffic and should not be
considered an indicator of a defect.

I believe the ping application calls the sendto system call (sorry, it's
been a while since I last looked). System calls can take between 0.1 us and
0.2 us each. However, the ping application makes several of these calls and
then waits for a signal from the kernel. Waiting for a signal means the
ping application must wait to be rescheduled before it can report the time,
and rescheduling depends on a lot of other factors in the OS, e.g. timers,
card interrupts, and other tasks with higher priorities. Reporting the time
adds a few more system calls, and as the application loops to post the next
ping request it makes a few more still, each of which may cause a task
switch.
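
You can watch this system-call traffic yourself with strace; a rough
sketch (the exact calls and counts will vary by ping implementation):

$ strace -c ping -c 10 -n <ip>

The -c summary shows how many sendto/recvmsg/poll calls ping really makes
per probe and how much time is spent in each.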

For the above reasons, the ping application is not a good representation of
network performance: there are delays inside the application itself, on top
of the traffic shaping performed at the switch and in the TCP stacks.
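
If what you actually care about is TCP latency (which is what Ceph sees),
measure it directly. A sketch assuming qperf or netperf is installed on
both ends:

$ qperf <ip> tcp_lat
$ netperf -H <ip> -t TCP_RR

Both run a request/response exchange over an established TCP connection,
which is much closer to Ceph's traffic pattern than ICMP echo.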

cheers,
gary


On Fri, Nov 7, 2014 at 4:32 PM, Łukasz Jagiełło <jagiello.luk...@gmail.com>
wrote:

> Hi,
>
> rtt min/avg/max/mdev = 0.070/0.177/0.272/0.049 ms
>
> 04:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+
> Network Connection (rev 01)
>
> at both hosts and Arista 7050S-64 between.
>
> Both hosts were part of an active Ceph cluster.
>
>
> On Thu, Nov 6, 2014 at 5:18 AM, Wido den Hollander <w...@42on.com> wrote:
>
>> Hello,
>>
>> While working at a customer I've run into 10GbE latency that seems
>> high to me.
>>
>> I have access to a couple of Ceph clusters and I ran a simple ping test:
>>
>> $ ping -s 8192 -c 100 -n <ip>
>>
>> Two results I got:
>>
>> rtt min/avg/max/mdev = 0.080/0.131/0.235/0.039 ms
>> rtt min/avg/max/mdev = 0.128/0.168/0.226/0.023 ms
>>
>> Both these environments are running with Intel 82599ES 10Gbit cards in
>> LACP. One with Extreme Networks switches, the other with Arista.
>>
>> Now, on an environment with Cisco Nexus 3000 and Nexus 7000 switches I'm
>> seeing:
>>
>> rtt min/avg/max/mdev = 0.160/0.244/0.298/0.029 ms
>>
>> As you can see, the Cisco Nexus network has higher latency than the
>> other setup.
>>
>> You would say the switches are to blame, but we also tried a direct
>> TwinAx connection, and that didn't help.
>>
>> This setup also uses the Intel 82599ES cards, so the cards don't seem to
>> be the problem.
>>
>> The MTU is set to 9000 on all these networks and cards.
>>
>> I was wondering, others with a Ceph cluster running on 10GbE, could you
>> perform a simple network latency test like this? I'd like to compare the
>> results.
>>
>> --
>> Wido den Hollander
>> 42on B.V.
>> Ceph trainer and consultant
>>
>> Phone: +31 (0)20 700 9902
>> Skype: contact42on
>
>
>
> --
> Łukasz Jagiełło
> lukasz<at>jagiello<dot>org
>