Re: [LARTC] Random ping jumps

2004-01-08 Thread R. Steve McKown
On Thursday 08 January 2004 01:01 pm, Artūras Šlajus wrote:
> Map is at http://h2o.pieva.net/net.png

Ah, nice.

> > I'm also unclear about the pings that you've tried.  After you've shown
> > the network map, perhaps you can identify the two machines (and
> > interfaces) involved in each of the different ping tests you've
> > performed.
>
> The machine is totally random.

What happens if you ping from the linux box to the linux box's default
gateway?  If the problem shows up neither in this test nor in any test
between machines on your LAN, the problem is probably on your provider's
side: the DSL modem or something 'downstream' from it.  You should consider
doing tests #2 and #3 anyway as support for your position when you call your
ISP to open a trouble ticket.

If the latency problem does show up when pinging from the linux box to the
default gateway, you haven't learned much yet.  Continue testing by removing
variables, attempting to isolate the smallest 'configuration' that exhibits
the problem.  The variables are: computers, hubs/switches, cables, and the
like.  Here are some suggestions for testing:

1. plug the linux router directly into the DSL modem and ping from the router 
to the default gateway.  If the problem goes away, it's something in the 
hardware and cables that were 'bypassed' in this test.  You can continue this 
strategy to test into your network.  Read my security note below.

2. plug in a PC configured like the linux router's eth0:1 interface (with the
proper default gateway) and ping from the PC to the default gateway.  If the
problem goes away, it's probably the linux router (hardware or software).

3. If #1 and #2 don't cause it to go away, be sure you used a different cable 
in tests #1 and #2.  If the problem still doesn't go away, it's an issue for 
your network provider.
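
To compare the tests above objectively, it can help to reduce each ping run to
a list of outliers.  The following is a minimal sketch, not part of the
original advice: it parses saved ping output and flags replies whose round-trip
time is well above the best time seen.  The function names and the factor of 3
are assumptions chosen for illustration; the sample lines are the ones quoted
in this thread.

```python
import re

# Sample replies copied from the ping output quoted in this thread.
SAMPLE = """\
64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=5 ttl=59 time=34.8 ms
64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=6 ttl=59 time=32.6 ms
64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=7 ttl=59 time=33.1 ms
64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=8 ttl=59 time=324 ms
64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=9 ttl=59 time=836 ms
64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=10 ttl=59 time=850 ms
64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=11 ttl=59 time=321 ms
64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=12 ttl=59 time=147 ms
64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=13 ttl=59 time=115 ms
"""

# Matches the sequence number and round-trip time on one reply line.
REPLY_RE = re.compile(r"icmp_seq=(\d+).*?time=([\d.]+) ms")


def parse_rtts(text):
    """Return a list of (icmp_seq, rtt_ms) tuples found in ping output."""
    return [(int(seq), float(ms)) for seq, ms in REPLY_RE.findall(text)]


def find_spikes(rtts, factor=3.0):
    """Return replies slower than `factor` times the fastest reply.

    The fastest reply is used as the baseline because it approximates the
    unloaded path; the factor of 3 is an arbitrary illustrative threshold.
    """
    if not rtts:
        return []
    baseline = min(ms for _, ms in rtts)
    return [(seq, ms) for seq, ms in rtts if ms > factor * baseline]


if __name__ == "__main__":
    for seq, ms in find_spikes(parse_rtts(SAMPLE)):
        print(f"icmp_seq={seq} took {ms} ms")
```

Running the same script on the output of each test (router to gateway, PC to
gateway, and so on) shows at a glance which configurations still produce
spikes.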

* security note *

Running both your LAN and the internet provider subnets on the same ethernet 
network puts you at a much greater security risk.  You should seriously 
consider installing a third network interface into your linux box and moving 
eth0:1's ip info to eth2.  Then plug the DSL modem into eth2 with a 
cross-over cable with no computers attached.

I'm guessing your thirty users are using Windows.  If they have Windows
networking enabled, they are all generating broadcast traffic.  That traffic
will most likely be crossing the DSL modem (since it is bridging).  Aside from
security implications, the local traffic that does get bridged is tying up
your DSL bandwidth.  It seems unlikely that 30 PCs could saturate your 128kbps
uplink, but I'm no expert on Windows networking.  128kbps is not a huge pipe,
so perhaps it's possible.  If so, the solution to your security problem is 
also the solution to the latency variability issue.  If this is the case, 
both tests #2 and #3 will not show the variability, since your local LAN is 
effectively removed from the test.
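
Some rough arithmetic shows why a 128kbps uplink is sensitive to even a little
queued traffic.  The sketch below is an illustration only: it assumes 128kbps
means 128,000 bits per second and uses 1500-byte frames (the usual Ethernet
MTU) as the worst case.  Real broadcast frames are typically smaller, and the
function names are made up for this example.

```python
UPLINK_BPS = 128_000  # assumed meaning of the "128kbps" uplink in this thread


def serialization_delay_ms(frame_bytes, link_bps=UPLINK_BPS):
    """Time in milliseconds needed to clock one frame onto the link."""
    return frame_bytes * 8 / link_bps * 1000.0


def queue_delay_ms(frames_ahead, frame_bytes, link_bps=UPLINK_BPS):
    """Extra wait for a packet that finds `frames_ahead` frames queued."""
    return frames_ahead * serialization_delay_ms(frame_bytes, link_bps)


def per_host_budget_bps(hosts, link_bps=UPLINK_BPS):
    """Average rate each host may send before the link is full."""
    return link_bps / hosts


if __name__ == "__main__":
    print(serialization_delay_ms(1500))   # one full-size frame
    print(queue_delay_ms(8, 1500))        # ping stuck behind 8 such frames
    print(per_host_budget_bps(30))        # share of the uplink per PC
```

Under these assumptions a single full-size frame occupies the uplink for about
94 ms, so a ping that queues behind a handful of them would see delays of
several hundred milliseconds, and each of 30 PCs would need to average only
about 4.3 kbit/s to fill the link.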

Hope this helps,
Steve

> [EMAIL PROTECTED]:~$ traceroute fortas.ktu.lt
> traceroute to fortas.ktu.lt (193.219.160.131), 30 hops max, 38 byte packets
>  1  adsl-213-190-40-129.takas.lt (213.190.40.129)  26.269 ms  23.333 ms  25.156 ms
>  2  fe22-acc0-tai.kns.telecom.lt (212.59.7.233)  63.079 ms  33.146 ms  26.117 ms
>  3  telecom-gw.is.lt (193.219.13.99)  35.978 ms  26.476 ms  103.138 ms
>  4  litnet-gw.is.lt (193.219.13.98)  22.715 ms  24.531 ms  209.984 ms
>  5  cat6506-p2-1.kttc.litnet.lt (193.219.62.125)  52.826 ms  98.040 ms  81.609 ms
>  6  ktu-lan.litnet.lt (193.219.61.252)  38.696 ms  182.582 ms  241.836 ms
>  7  fortas.ktu.lt (193.219.160.131)  215.523 ms  126.815 ms  29.217 ms
>
> [EMAIL PROTECTED]:~$ traceroute cs.mes.lt
> traceroute to cs.mes.lt (193.219.67.253), 30 hops max, 38 byte packets
>  1  adsl-213-190-40-129.takas.lt (213.190.40.129)  748.174 ms  66.331 ms  135.586 ms
>  2  fe22-acc0-tai.kns.telecom.lt (212.59.7.233)  21.645 ms  21.588 ms  24.597 ms
>  3  telecom-gw.is.lt (193.219.13.99)  30.584 ms  31.065 ms  29.612 ms
>  4  litnet-gw.is.lt (193.219.13.98)  24.602 ms  143.212 ms  143.096 ms
>  5  cat6506-p2-1.kttc.litnet.lt (193.219.62.125)  292.196 ms  163.870 ms  84.549 ms
>  6  ktu-lan.litnet.lt (193.219.61.252)  84.982 ms  54.801 ms  69.143 ms
>  7  diz.ktu.lt (193.219.67.253)  33.831 ms  29.877 ms  30.005 ms
>
> 64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=5 ttl=59 time=34.8 ms
> 64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=6 ttl=59 time=32.6 ms
> 64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=7 ttl=59 time=33.1 ms
> 64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=8 ttl=59 time=324 ms
> 64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=9 ttl=59 time=836 ms
> 64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=10 ttl=59 time=850 ms
> 64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=11 ttl=59 time=321 ms
> 64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=12 ttl=59 time=147 ms
> 64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=13 ttl=59 time=115 ms
> 64 bytes from diz.ktu.lt (193.219.67.253): icmp_seq=14 ttl

Re: [LARTC] Random ping jumps

2004-01-08 Thread R. Steve McKown
Can you provide some more detail on your network configuration?  I'm unclear 
if the linux server is your internet router or just another client computer 
on your local LAN, where the test pings to "the internet" are going (i.e. 
nexthop router, etc.), and if/where CIPE tunnels are involved in the 
equation.  Perhaps a small network map would be helpful.

I'm also unclear about the pings that you've tried.  After you've shown the 
network map, perhaps you can identify the two machines (and interfaces) 
involved in each of the different ping tests you've performed.

I had a similar problem recently.  A linux-based router with four interfaces 
serving three local LANs and a T-1 (via the provider's router) to the 
internet.  The router was forwarding traffic between all combinations of 
networks (that were allowed by rule) correctly, except between LANs 1 and 2.  
In this case, pings would vary much as in your case.  Interestingly, it 
turned out to be bad hardware.  Moved the boot media to an identically 
configured machine and the problem went away.  Returned the boot media to the 
original machine and the problem returned.

On Wednesday 07 January 2004 02:26 pm, Artūras Šlajus wrote:
> Hello,
>
> I've got this problem. There is a linux server with a 2.4.24 kernel, and
> when pinging from it to the internet (or from the LAN) the ping time
> randomly jumps up:
>
> 64 bytes from fortas.ktu.lt (193.219.160.131): icmp_seq=387 ttl=59 time=30.0 ms
> 64 bytes from fortas.ktu.lt (193.219.160.131): icmp_seq=388 ttl=59 time=32.6 ms
> 64 bytes from fortas.ktu.lt (193.219.160.131): icmp_seq=389 ttl=59 time=34.9 ms
> 64 bytes from fortas.ktu.lt (193.219.160.131): icmp_seq=390 ttl=59 time=198 ms
> 64 bytes from fortas.ktu.lt (193.219.160.131): icmp_seq=391 ttl=59 time=407 ms
> 64 bytes from fortas.ktu.lt (193.219.160.131): icmp_seq=392 ttl=59 time=407 ms
> 64 bytes from fortas.ktu.lt (193.219.160.131): icmp_seq=393 ttl=59 time=430 ms
> 64 bytes from fortas.ktu.lt (193.219.160.131): icmp_seq=394 ttl=59 time=30.9 ms
> 64 bytes from fortas.ktu.lt (193.219.160.131): icmp_seq=395 ttl=59 time=31.6 ms
>
> The internet line isn't loaded and the server load is fine. QoS isn't used;
> the qdiscs are the defaults. I can't work out what the problem is, or even
> how to debug it. Sysctl config:
> net/ipv4/ip_forward = 1
> net/ipv4/icmp_ignore_bogus_error_responses = 1
> net/ipv4/icmp_echo_ignore_broadcasts = 1
> net/ipv4/tcp_syncookies = 1
> net/ipv4/tcp_timestamps = 0
> net/ipv4/tcp_window_scaling = 0
> net/ipv4/tcp_sack = 0
> net/ipv4/tcp_fin_timeout = 30
> net/ipv4/tcp_keepalive_time = 1800
> net/ipv4/tcp_low_latency = 1
>
> Thanks for any thoughts.
>
>
> ___
> LARTC mailing list / [EMAIL PROTECTED]
> http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://lartc.org/



Re: [LARTC] Multihomed Masquerading, routing and iptables

2004-01-06 Thread R. Steve McKown
On Tuesday 06 January 2004 02:25 am, Gordan Bobic wrote:
> If one of the default routes is removed, everything works OK. However, if
> there are two default routes, packets get misdirected. ChangeLog for 2.4.21
> lists a few conntrack bug fixes, which I suspect to be the cause of this.
> Basically, the non-deterministic default route selection/rotation seems to
> take precedence over maintaining the same interface for serving a
> particular established connection through the firewall.

You are right.  This is because the routing core is often queried more than 
once to set up a usable route cache entry for a given connection/session.  
Have a look at ip_route_connect() in linux/include/net/route.h as an example:

static inline int ip_route_connect(struct rtable **rp, u32 dst, u32 src,
                                   u32 tos, int oif)
{
        int err;
        err = ip_route_output(rp, dst, src, tos, oif);
        if (err || (dst && src))
                return err;
        dst = (*rp)->rt_dst;
        src = (*rp)->rt_src;
        ip_rt_put(*rp);
        *rp = NULL;
        return ip_route_output(rp, dst, src, tos, oif);
}

Consider when this function is called with src==0, which happens for locally 
generated output (SNAT is similar I believe).  The first ip_route_output() 
call returns a pointer to a route cache entry, which includes a src ip in 
(*rp)->rt_src.  The first route cache entry doesn't work for us, because its 
'key' has src==0 and so won't match subsequent traffic.  So a second 
ip_route_output() is called using the new src as part of its key.  The new 
key matches no existing route cache entry, so the default multipath
route is consulted again and a nexthop is determined.  This latter process
does not use src in its processing so there is no guarantee that the nexthop 
returned is the same as that returned by the first query.  Hence, src ip is 
not guaranteed to match outbound interface.
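
The effect of the double lookup can be modelled in a few lines.  This is a toy
sketch, not kernel code: the class and function names are invented, the
addresses come from the documentation ranges, and the nexthop is chosen by
simple round-robin (an assumption; the only property that matters here is that
the choice ignores src).

```python
from itertools import cycle

# Hypothetical nexthops: (interface, address configured on that interface).
NEXTHOPS = [("eth1", "192.0.2.10"), ("eth2", "198.51.100.10")]


class MultipathRoute:
    """Toy stand-in for a multipath default route with a route cache."""

    def __init__(self, nexthops):
        self._picker = cycle(nexthops)  # assumed round-robin selection
        self.cache = {}                 # (dst, src) -> (iface, src)

    def route_output(self, dst, src):
        """Look up (dst, src); on a cache miss, pick a nexthop ignoring src."""
        key = (dst, src)
        if key not in self.cache:
            iface, iface_src = next(self._picker)
            # A caller-supplied src is kept; otherwise use the interface's.
            self.cache[key] = (iface, src or iface_src)
        return self.cache[key]


def route_connect(route, dst, src=None):
    """Mimic the two-step lookup described above."""
    iface, chosen_src = route.route_output(dst, src)
    if src is not None:
        return iface, chosen_src
    # Second lookup, keyed on the src learned from the first one.
    return route.route_output(dst, chosen_src)


if __name__ == "__main__":
    iface, src = route_connect(MultipathRoute(NEXTHOPS), "203.0.113.5")
    print(f"src {src} leaves via {iface}")
```

In this model the first lookup selects eth1 and its address, while the second
lookup misses the cache and selects eth2, so the packet carries eth1's address
out of eth2.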

Julian Anastasov's patches, noted earlier in this thread, provide a solution 
to this problem.  He allows for additional route rules and route tables that 
are matched by the second route query in preference to the default route so 
the src ip and outbound interface can be forced to be consistent.
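
To give an idea of what such rules and tables look like, the sketch below
prints source-based policy routing commands in standard iproute2 syntax.  It
is not Julian's patch and not his documented setup, only an illustration of
the general technique; the interface names, addresses (documentation ranges)
and table numbers are all hypothetical.

```python
from collections import namedtuple

# One entry per provider link; every value here is a made-up example.
Uplink = namedtuple("Uplink", "iface src gateway table")

UPLINKS = [
    Uplink("eth1", "192.0.2.10", "192.0.2.1", 101),
    Uplink("eth2", "198.51.100.10", "198.51.100.1", 102),
]


def policy_routing_commands(uplinks):
    """Return iproute2 commands tying each source address to its own uplink."""
    commands = []
    for up in uplinks:
        # A per-uplink table holding only that uplink's default route.
        commands.append(
            f"ip route add default via {up.gateway} dev {up.iface} table {up.table}"
        )
        # Traffic already carrying this src must use the matching table.
        commands.append(f"ip rule add from {up.src} table {up.table}")
    return commands


if __name__ == "__main__":
    print("\n".join(policy_routing_commands(UPLINKS)))
```

With rules like these in place, a lookup that already has a source address is
answered from the table for that address rather than from the multipath
default route.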

I'm still pretty new to all this, so I hope Julian or someone else can correct 
any errors I have made.  The example above is in the non-NAT case of locally 
generated traffic, but I believe it's representative of what happens in the 
SNAT case as well.

> I'm compiling a new clean 2.4.24 with the jumbo routes patch at the moment,
> which will hopefully fix things. I'm hoping to try it out tonight. And BTW,
> the latest RH9 kernel released yesterday (2.4.20-28.9 IIRC), is still
> broken as far as routing is concerned.

I haven't looked at RedHat's route patch; it'd be killer if they solved this 
without requiring the additional route rules and tables setups as required by 
Julian's patches.  Let us know the outcome, would you?

The reason for this behavior makes sense from a code perspective, but not IMO 
from a route administration perspective.  I have a patch in its infancy that 
attempts to address this problem without requiring extra route administration 
(rules and tables).  It works in the non-nat case, but there is still much 
more testing to go before it's worth publishing.  If it survives the next few 
weeks of testing, I'd be happy to pass it on to anyone else who might be 
interested in playing with it.

Best Regards,
Steve
