Constantine,

Please mail me offlist if Init7 can be of any help to resolve the case.

--
Fredy Kuenzler
Init7 (Switzerland) Ltd.
St.-Georgen-Strasse 70
CH-8400 Winterthur
Switzerland

http://www.init7.net/


> Am 30.11.2013 um 08:30 schrieb "Constantine A. Murenin" <muren...@gmail.com>:
> 
> Dear NANOG@,
> 
> I'm not exactly sure how else I can get he.net's attention, because I've been 
> experiencing congestion issues between my dedi and Indiana for a couple of 
> months now, all due to he.net's poor transit, as it turns out.  The issue was 
> complicated by the fact that the routes are asymmetric, and it appears as if 
> the traffic loss is going on somewhere where there is none at all.
> 
> I will just provide the data, and people can make their own conclusions, any 
> insights are welcome.
> 
> During all of this, since some late September 2013, all 4 networks involved 
> have been contacted -- hetzner, init7, he.net, indiana; all except for he.net 
> have responded and did troubleshooting.
> 
> After pressing the lack of any kind of response from he.net, all they did was 
> ask for a customer number, and that was back in September.  I have not heard 
> from their NOC@ ever since, with requests left unanswered, sans the "we have 
> received your request" autoreply.
> 
> Interestingly enough, only some of their Europe-to-US routes are blatantly 
> congested and have very obvious packet loss (often making ssh unusable), 
> whereas others appear to be doing just fine (at least, not losing packets and 
> not experiencing jitter, and the increased latency).  E.g. IPv6 routes don't 
> appear affected, for example.  IPv4 addresses in North America that are 
> announced directly from AS6939 (e.g. Linode in Fremont) don't appear 
> affected, either.  But the multi-homed indiana.edu and wiscnet.net are 
> affected.  The single-homed ntp1.yycix.ca is affected, too.  Probably other 
> customers are affected as well.
> 
> Where's the end to this?
> 
> Or is the ongoing 0.5+% traffic loss, and the 140+ms avg latency on a 114ms 
> route, with random spikes and jitter in certain hours of the day (generally 
> around midnight ET), every day for several weeks or even months, an 
> acceptable practice?
> 
> 
> 
> From hetzner.de through he.net:
> 
> 
> Cns# date ; mtr --report{,-wide,-cycles=600} --interval 0.1 --order "SRL 
> BGAWV" -4 ????c????????.indiana.edu ; date
> Fri Nov 29 21:06:17 PST 2013
> HOST: Cns???????                                   Snt   Rcv Loss% Best Gmean 
>   Avg  Wrst StDev
>  1.|-- static.??.???.4.46.clients.your-server.de    600   600  0.0%  0.5   
> 1.0   1.3   4.9   1.1
>  2.|-- hos-tr1.juniper1.rz13.hetzner.de             600   600  0.0%  0.1   
> 0.2   1.9  66.0   7.6
>  3.|-- core21.hetzner.de                            600   600  0.0%  0.2   
> 0.2   0.2   5.8   0.4
>  4.|-- core22.hetzner.de                            600   600  0.0%  0.2   
> 0.2   0.2  19.4   1.2
>  5.|-- core1.hetzner.de                             600   600  0.0%  4.8   
> 4.8   4.8  13.2   0.7
>  6.|-- juniper1.ffm.hetzner.de                      600   600  0.0%  4.8   
> 4.8   4.8  27.4   1.4
>  7.|-- 30gigabitethernet1-3.core1.ams1.he.net       600   600  0.0% 11.2  
> 14.0  14.6  48.7   4.5
>  8.|-- 10gigabitethernet1-4.core1.lon1.he.net       600   600  0.0% 18.2  
> 19.6  19.9  53.9   4.1
>  9.|-- 10gigabitethernet10-4.core1.nyc4.he.net      600   599  0.2% 87.0 
> 116.1 116.7 145.7  12.4
> 10.|-- 100gigabitethernet7-2.core1.chi1.he.net      600   597  0.5% 106.6 
> 135.4 136.1 192.0  13.3
> 11.|-- ???                                          600     0 100.0  0.0   
> 0.0   0.0   0.0   0.0
> 12.|-- et-11-0-0.945.rtr.ictc.indiana.gigapop.net   600   594  1.0% 113.3 
> 139.3 139.7 166.1  11.4
> 13.|-- xe-0-3-0.11.br2.ictc.net.uits.iu.edu         600   596  0.7% 113.2 
> 139.8 140.3 177.3  12.0
> 14.|-- ae-0.0.br2.bldc.net.uits.iu.edu              600   595  0.8% 114.2 
> 140.1 140.6 183.2  11.8
> 15.|-- ae-10.0.cr3.bldc.net.uits.iu.edu             600   597  0.5% 114.3 
> 140.3 140.8 165.0  11.5
> 16.|-- ????c????????.indiana.edu                    600   597  0.5% 114.7 
> 140.7 141.1 161.6  11.4
> Fri Nov 29 21:08:52 PST 2013
> 
> 
> Cns# unbuffer hping --icmp-ts ????c????????.indiana.edu | \
> perl -ne 'if (/icmp_seq=(\d+) rtt=(\d+\.\d)/) {($s, $p) = ($1, $2);} \
> if (/ate=(\d+) Receive=(\d+) Transmit=(\d+)/) {($o, $r, $t) = ($1, $2, $3);} \
> if (/tsrtt=(\d+)/) { \
> print $s, "\t", $p, "\t", $1, " = ", $r - $o, " + ", $o + $1 - $t, "\n"; }'
> 0       143.5   144 = 87 + 57
> 1       125.5   126 = 69 + 57
> 2       143.6   144 = 87 + 57
> 3       157.9   158 = 102 + 56
> 4       122.0   122 = 66 + 56
> 5       141.6   142 = 85 + 57
> 6       132.2   133 = 76 + 57
> 7       146.2   146 = 89 + 57
> 8       145.1   145 = 88 + 57
> 9       119.9   119 = 63 + 56
> 10      132.7   132 = 75 + 57
> 11      140.1   140 = 83 + 57
> 12      151.0   151 = 94 + 57
> 13      152.6   152 = 96 + 56
> 14      129.1   129 = 72 + 57
> 15      128.5   128 = 71 + 57
> ^C
> 
> 
> 
> Single-homed at he.net:
> 
> 
> Cns# date ; mtr --report{,-cycles=600} --interval 0.1 --order "SRL BGAWV" -4 
> ntp1.yycix.ca ; date
> Fri Nov 29 21:16:14 PST 2013
> HOST: Cns???????                    Snt   Rcv Loss%   Best Gmean   Avg Wrst 
> StDev
>  1.|-- static.??.???.4.46.client   600   600  0.0%    0.5   1.0   1.3  10.2   
> 1.2
>  2.|-- hos-tr4.juniper2.rz13.het   600   600  0.0%    0.1   0.2   2.0 153.9   
> 9.8
>  3.|-- core22.hetzner.de           600   600  0.0%    0.2   0.2   0.2  10.6   
> 0.6
>  4.|-- core1.hetzner.de            600   600  0.0%    4.8   4.8   4.8  16.4   
> 0.9
>  5.|-- juniper1.ffm.hetzner.de     600   600  0.0%    4.8   4.8   4.8  36.4   
> 1.5
>  6.|-- 30gigabitethernet1-3.core   600   600  0.0%   11.2  13.5  14.0  36.6   
> 4.3
>  7.|-- 10gigabitethernet1-4.core   600   600  0.0%   18.0  21.5  21.8  43.1   
> 4.0
>  8.|-- 10gigabitethernet10-4.cor   600   597  0.5%   93.2 128.0 128.3 157.5   
> 8.9
>  9.|-- 10gigabitethernet1-2.core   600   596  0.7%  103.1 139.4 139.6 157.5   
> 8.2
> 10.|-- 10gigabitethernet3-1.core   600   597  0.5%  128.2 164.9 165.1 181.9   
> 8.2
> 11.|-- 10gigabitethernet1-1.core   600   593  1.2%  138.7 175.9 176.1 192.6   
> 7.8
> 12.|-- sebo-systems-inc.gigabite   600   597  0.5%  139.0 176.4 176.5 187.5   
> 6.9
> 13.|-- ???                         600     0 100.0    0.0   0.0   0.0   0.0   
> 0.0
> 14.|-- ntp1.yycix.ca               600   597  0.5%  141.0 176.9 177.0 186.9   
> 6.9
> Fri Nov 29 21:18:32 PST 2013
> Cns# traceroute -A ntp1.yycix.ca
> traceroute to ntp1.yycix.ca (192.75.191.6), 64 hops max, 40 byte packets
> 1  static.??.???.4.46.clients.your-server.de (46.4.???.??) [AS24940] 0.664 ms 
>  0.648 ms  0.453 ms
> 2  hos-tr1.juniper1.rz13.hetzner.de (213.239.224.1) [AS24940]  23.985 ms 
> hos-tr2.juniper1.rz13.hetzner.de (213.239.224.33) [AS24940]  0.234 ms 
> hos-tr3.juniper2.rz13.hetzner.de (213.239.224.65) [AS24940]  0.238 ms
> 3  core22.hetzner.de (213.239.245.121) [AS24940]  0.238 ms core21.hetzner.de 
> (213.239.245.81) [AS24940]  0.234 ms  0.236 ms
> 4  core1.hetzner.de (213.239.245.177) [AS24940]  4.811 ms  4.809 ms 
> core22.hetzner.de (213.239.245.162) [AS24940]  0.248 ms
> 5  core1.hetzner.de (213.239.245.177) [AS24940]  4.831 ms 
> juniper1.ffm.hetzner.de (213.239.245.5) [AS24940]  4.842 ms  4.826 ms
> 6  juniper1.ffm.hetzner.de (213.239.245.5) [AS24940]  4.857 ms  4.864 ms 
> 30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150) [AS1200] 11.233 ms
> 7  10gigabitethernet1-4.core1.lon1.he.net (72.52.92.81) [AS6939, AS6939]  
> 19.869 ms 30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150) [AS1200]  
> 18.420 ms  11.255 ms
> 8  10gigabitethernet10-4.core1.nyc4.he.net (72.52.92.241) [AS6939, AS6939]  
> 115.845 ms  101.875 ms 10gigabitethernet1-4.core1.lon1.he.net (72.52.92.81) 
> [AS6939, AS6939]  17.249 ms
> 9  10gigabitethernet10-4.core1.nyc4.he.net (72.52.92.241) [AS6939, AS6939]  
> 138.302 ms 10gigabitethernet1-2.core1.tor1.he.net (184.105.222.18) [AS6939]  
> 120.449 ms  139.730 ms
> 10  10gigabitethernet1-2.core1.tor1.he.net (184.105.222.18) [AS6939] 134.755 
> ms  104.661 ms 10gigabitethernet3-1.core1.ywg1.he.net (184.105.223.221) 
> [AS6939]  167.282 ms
> 11  10gigabitethernet1-1.core1.yyc1.he.net (184.105.223.214) [AS6939] 139.310 
> ms 10gigabitethernet3-1.core1.ywg1.he.net (184.105.223.221) [AS6939]  155.983 
> ms  155.910 ms
> 12  sebo-systems-inc.gigabitethernet2-23.core1.yyc1.he.net (216.218.214.250) 
> [AS6939]  138.703 ms  178.530 ms 10gigabitethernet1-1.core1.yyc1.he.net 
> (184.105.223.214) [AS6939] 172.423 ms
> 13  sebo-systems-inc.gigabitethernet2-23.core1.yyc1.he.net (216.218.214.250) 
> [AS6939]  158 ms * *
> 14  * * ntp1.yycix.ca (192.75.191.6) [AS53339]  181.433 ms
> Cns#
> Cns#
> Cns#
> Cns# unbuffer hping --icmp-ts ntp1.yycix.ca | perl -ne \
> 'if (/icmp_seq=(\d+) rtt=(\d+\.\d)/) {($s, $p) = ($1, $2);} \
> if (/ate=(\d+) Receive=(\d+) Transmit=(\d+)/) {($o, $r, $t) = ($1, $2, $3);} \
> if (/tsrtt=(\d+)/) { \
> print $s, "\t", $p, "\t", $1, " = ", $r - $o, " + ", $o + $1 - $t, "\n"; }'
> 0       165.0   165 = 95 + 70
> 1       156.2   156 = 86 + 70
> 2       178.9   179 = 109 + 70
> 3       181.0   181 = 111 + 70
> 4       178.3   179 = 108 + 71
> 5       163.8   164 = 94 + 70
> 6       175.7   176 = 106 + 70
> 7       173.9   174 = 104 + 70
> 8       172.6   173 = 103 + 70
> 9       163.5   164 = 94 + 70
> 10      181.8   182 = 112 + 70
> 11      161.9   162 = 92 + 70
> 12      183.1   184 = 113 + 71
> 13      174.5   174 = 104 + 70
> 14      181.8   181 = 111 + 70
> 15      181.7   181 = 111 + 70
> ^C
> Cns#
> 
> 
> 
> 
> From indiana.edu to hetzner.de; notice that the mtr by itself gives a false 
> impression of a traffic loss at init7, whereas in reality, it's the reverse 
> path through he.net that's causing the loss, as hping confirms:
> 
> 
> m: {5134} date ; sudo mtr --report{,-cycles=600} --interval 0.1 --order "SRL 
> BGAWV" -4 ?????? ; date
> Sat Nov 30 00:36:27 EST 2013
> HOST: ????c????????.indiana.edu     Snt   Rcv Loss%   Best Gmean   Avg Wrst 
> StDev
>  1.|-- 129.79.???.?                600   600  0.0%    0.4   0.7   0.9  24.7   
> 1.5
>  2.|-- ae-13.0.br2.bldc.net.uits   600   600  0.0%    0.5   0.7   0.9  22.6   
> 1.8
>  3.|-- ae-0.0.br2.ictc.net.uits.   600   600  0.0%    1.4   1.7   1.8  20.2   
> 1.6
>  4.|-- xe-0-1-0.11.rtr.ictc.indi   600   600  0.0%    1.4   2.1   3.8  66.5   
> 8.1
>  5.|-- 64.57.21.13                 600   600  0.0%    6.0   7.2   8.4  72.9   
> 8.0
>  6.|-- xe-2-2-0.0.ny0.tr-cps.int   600   600  0.0%   32.3  33.9  34.4  81.0   
> 6.9
>  7.|-- paix-nyc.init7.net          600   600  0.0%   32.5  35.3  35.5  44.7   
> 3.8
>  8.|-- r1lon1.core.init7.net       600   599  0.2%  100.1 104.7 104.9 146.5   
> 7.5
>  9.|-- r1nue1.core.init7.net       600   599  0.2%  114.6 115.7 115.7 125.4   
> 2.2
> 10.|-- gw-hetzner.init7.net        600   594  1.0%  112.4 141.3 142.4 241.9  
> 18.2
> 11.|-- core12.hetzner.de           600   468 22.0%  112.2 142.7 144.0 203.4  
> 20.3
> 12.|-- core21.hetzner.de           600   202 66.3%  114.4 143.7 145.0 204.1  
> 20.1
> 13.|-- juniper1.rz13.hetzner.de    600   594  1.0%  114.7 141.4 142.1 212.2  
> 14.3
> 14.|-- hos-tr2.ex3k11.rz13.hetzn   600   599  0.2%  113.8 123.9 125.5 218.2  
> 21.8
> 15.|-- static.88-198-??-??.clien   599   592  1.2%  114.6 137.2 137.9 167.6  
> 13.2
> 0.244u 1.766s 1:05.52 3.0%      0+0k 0+1io 0pf+0w
> Sat Nov 30 00:37:32 EST 2013
> 
> m: {5137} sudo script -q /dev/null hping3 --icmp-ts 88.198.??.?? | perl -ne 
> 'if (/icmp_seq=(\d+) rtt=(\d+\.\d)/) {($s, $p) = ($1, $2);} \
> if (/ate=(\d+) Receive=(\d+) Transmit=(\d+)/) {($o, $r, $t) = ($1, $2, $3);} \
> if (/tsrtt=(\d+)/) { \
> print $s, "\t", $p, "\t", $1, " = ", $r - $o, " + ", $o + $1 - $t, "\r\n"; }'
> 0       131.3   131 = 57 + 74
> 1       122.4   122 = 56 + 66
> 2       122.6   123 = 56 + 67
> 3       127.6   128 = 57 + 71
> 4       146.5   147 = 57 + 90
> 5       139.8   140 = 56 + 84
> 6       131.0   131 = 57 + 74
> 7       134.6   135 = 57 + 78
> 8       137.7   138 = 57 + 81
> 9       148.1   148 = 57 + 91
> 10      141.2   142 = 57 + 85
> 11      146.4   146 = 56 + 90
> 12      153.6   154 = 57 + 97
> 13      149.4   150 = 57 + 93
> 14      120.2   121 = 57 + 64
> 15      120.6   120 = 56 + 64
> 16      130.7   131 = 57 + 74
> 17      126.4   126 = 56 + 70
> 18      117.9   118 = 57 + 61
> 19      116.9   117 = 57 + 60
> 20      119.8   119 = 56 + 63
> 21      132.0   132 = 56 + 76
> 22      134.2   134 = 56 + 78
> 23      138.8   139 = 57 + 82
> 
> 
> 
> Note the ICMP timestamp data from hping above.  From this ICMP timestamping 
> data, it is obvious that the congestion is only happening on one path -- the 
> one over he.net, and init7 is in the clear.
> 
> Any further insights are welcome.  But finding out about the ICMP timestamp 
> feature has so far been the most useful thing in troubleshooting this issue; 
> I'm surprised it's a rather unknown method to get to the bottom of these 
> problems.
> 
> However, even after finding out about the cause and the party responsible, 
> the problem is yet to be exhausted.  Any help appreciated.
> 
> Best regards,
> Constantine.
> 

Reply via email to