David Whelan wrote:

> >> Does anyone know where  can find high-level "what happens when" 
> >> documentation for ping? I mean clear information about what exactly 
> >> happens from the moment you enter a ping command to the returned message.
> >>     
> >
> > I don't know any good high-level doc. Most books about TCP/IP will cover 
> > it, but you will have to read quite much before you have the picture.
> >
> > I could try a brief description: 
> >
> >   
> Thanks very much for your excellent description Enrique.
> On a new installation of an old version of Debian, I have a problem 
> reaching any Internet address and, trying to work through the problem on 
> my own, I thought that such a high-level doc for ping would be useful in 
> troubleshooting it; but I couldn't find anything suitable either. 

Ah, that's why you needed it.

>I 
> wanted something that concentrates on the user's box rather than getting 
> into the whole network theory; i.e., what happens when you enter the 
> ping command: the chain of processes, what they do, and how to find out 
> if they were successful, and what goes out and what comes into the box. 

OK, ping sends ICMP echo request messages on the wire to the specified IP 
address, requesting responses.
If you get any responses, the connectivity is there. If you don't, there are 
several possible explanations. First, perhaps your network card is not 
configured. 

Try the command "ifconfig"

This will show some data about each of the interfaces.  Some interfaces are 
artificial and software-only. You probably know the name of the real 
interface(s). 

The artificial interface "lo"  (loopback) must also be running. 

Each description has a line in mostly uppercase, like "UP LOOPBACK RUNNING  
MTU:16436  Metric:1"
(This was my loopback interface).

The real interface should have "UP BROADCAST RUNNING MULTICAST  MTU:1500  
Metric:1".
You need UP, BROADCAST, and RUNNING, plus a nonzero MTU. 
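
If you want to script that check, here is a minimal sketch in plain sh. The 
helper name has_flags is made up for this example, and the flags line is the 
sample from above pasted in, not read from a live ifconfig.

```shell
# Hypothetical helper: succeeds only if every flag listed after the first
# argument appears as a whole word in the given flags line.
has_flags() {
    line=$1
    shift
    for flag in "$@"; do
        case " $line " in
            *" $flag "*) ;;        # flag present, keep checking
            *) return 1 ;;         # flag missing
        esac
    done
    return 0
}

# Sample flags line copied from the ifconfig output of a real interface.
flags='UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1'

if has_flags "$flags" UP BROADCAST RUNNING; then
    echo "interface looks ready"
else
    echo "interface is missing a required flag"
fi
```

For the loopback interface you would ask for UP, LOOPBACK and RUNNING instead.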

The interfaces must have an IP address: inet addr:10.0.0.2  Bcast:10.0.0.255  
Mask:255.255.255.0
The loopback must have IP 127.0.0.1.  Beware of non-routable addresses. 
169.254.*.* are zero-conf (link-local) addresses, but I don't know how to make 
them work if they don't already. Addresses 192.168.*.*, 172.16.*.* through 
172.31.*.*, and 10.*.*.* are private and non-routable, but that is OK if you 
have a router that does NAT (network address translation).
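
To tell at a glance which kind of address an interface got, something like the 
following sketch may help. The function name classify_ipv4 and the labels it 
prints are my own invention. It only looks at the leading octets (127.*, 
169.254.*, and the private ranges 10.*, 172.16-31.* and 192.168.*) and does not 
check that the argument is a well-formed address.

```shell
# Hypothetical helper: print a rough category for a dotted-quad IPv4 address.
classify_ipv4() {
    case $1 in
        127.*)
            echo loopback ;;
        169.254.*)
            echo link-local ;;      # zero-conf address, no DHCP answer seen
        10.*|192.168.*)
            echo private ;;         # fine behind a NAT router
        172.1[6-9].*|172.2[0-9].*|172.3[01].*)
            echo private ;;         # 172.16.0.0 through 172.31.255.255
        *)
            echo other ;;           # everything else, including public addresses
    esac
}

for addr in 127.0.0.1 10.0.0.2 169.254.7.9 201.11.113.2; do
    echo "$addr: $(classify_ipv4 "$addr")"
done
```

A "link-local" result on the real interface usually means the box never got an 
address from DHCP, which is worth knowing before looking any further.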

Your computer must have a default gateway configured.

   # route -n
   Kernel IP routing table
   Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
   10.0.0.0        0.0.0.0         255.255.255.0   U     0      0        0 ath0
   0.0.0.0         10.0.0.138      0.0.0.0         UG    0      0        0 ath0

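Do not forget the "-n" option! Without it, route tries to translate the 
addresses into names, and that can hang if DNS is not working.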
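
The gateway can also be pulled out of that table with awk. This is only a 
sketch: default_gateway is a name I made up, and it assumes the column layout 
of "route -n" shown above (destination first, gateway second, flags fourth). 
The sample table is pasted in so the example runs without touching the network.

```shell
# Hypothetical helper: read "route -n" output on stdin and print the gateway
# of the first default route (destination 0.0.0.0 with the G flag set).
default_gateway() {
    awk '$1 == "0.0.0.0" && $4 ~ /G/ { print $2; exit }'
}

# Sample routing table copied from the output above.
route_output='Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.0.0.0        0.0.0.0         255.255.255.0   U     0      0        0 ath0
0.0.0.0         10.0.0.138      0.0.0.0         UG    0      0        0 ath0'

gw=$(printf '%s\n' "$route_output" | default_gateway)
echo "default gateway: ${gw:-none}"
```

On a live system you would run "route -n | default_gateway" instead. If it 
prints nothing, there is no default route, and nothing outside your own subnet 
can be reached.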

Next, perhaps your computer does not have a DNS server configured. You do not 
need a DNS server to use ping, but then you must use the -n option and 
numeric addresses; otherwise ping will try to translate addresses back into 
names and get stuck waiting for an answer.

Try "cat /etc/resolv.conf"
It should say "nameserver 201.11.113.2" or similar.  You could now check with 
ping, first the gateway 10.0.0.138:

   # ping -n 10.0.0.138

and then the DNS server:

   # ping -n 201.11.113.2

Remember "-n"!!! Otherwise ping tries to do DNS lookups.
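
If resolv.conf lists more than one server, the addresses can be extracted 
with a small helper so that each one can be pinged in turn. This is a sketch: 
the name nameservers is made up, and the sample file content below 
(including "example.lan") is invented for illustration.

```shell
# Hypothetical helper: print the address from every "nameserver" line.
# Reads the files given as arguments, or stdin if there are none.
nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$@"
}

# Invented sample content standing in for /etc/resolv.conf.
nameservers <<'EOF'
# comment lines and other keywords are ignored
search example.lan
nameserver 201.11.113.2
nameserver 10.0.0.138
EOF
```

On the real box, "nameservers /etc/resolv.conf" lists the configured servers, 
and each address can then be tested with "ping -n" as above.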

If resolv.conf says 127.0.0.*, then there should be a DNS server on your own 
computer. To check that, 

   # netstat -atue | grep domain
   tcp        0      0 *:domain                *:*                     LISTEN      root       15271
   udp        0      0 *:domain                *:*                                 root       15272


If it is there, you have to determine what program is running the server. 
Supposing you don't have the program lsof installed, try 

   # ls -l /proc/[1-9]*/fd | grep -B 10 '1527[12]'

   /proc/4/fd:
   total 0

   /proc/5025/fd:
   total 6
   lr-x------ 1 root root 64 2007-06-06 15:44 0 -> /dev/null
   l-wx------ 1 root root 64 2007-06-06 15:44 1 -> /dev/null
   l-wx------ 1 root root 64 2007-06-06 15:44 2 -> /dev/null
   lrwx------ 1 root root 64 2007-06-06 15:44 3 -> socket:[15300]
   lrwx------ 1 root root 64 2007-06-06 15:44 4 -> socket:[15271]
   lrwx------ 1 root root 64 2007-06-06 15:44 5 -> socket:[15272]

This shows that the sockets are held by process 5025. Then 

   $ ps -p 5025
     PID TTY          TIME CMD
    5025 ?        00:00:00 pdnsd
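
The grep -B 10 trick prints a lot of context that you then have to read by 
eye. A sketch of a more direct way, using awk to remember which /proc/PID/fd 
directory is being listed and to print the PID when the socket inode turns up. 
The name socket_owner is made up, and the sample listing is a shortened copy 
of the output above.

```shell
# Hypothetical helper: read "ls -l /proc/[1-9]*/fd" output on stdin and print
# the PID of every process holding the socket with the given inode number.
socket_owner() {
    awk -v inode="$1" '
        # Directory header such as "/proc/5025/fd:" - remember the PID.
        /^\/proc\/[0-9]+\/fd:$/ { split($0, parts, "/"); pid = parts[3] }
        # Symlink whose target is the socket we are looking for.
        $NF == ("socket:[" inode "]") { print pid }
    '
}

socket_owner 15271 <<'EOF'
/proc/4/fd:
total 0

/proc/5025/fd:
total 6
lr-x------ 1 root root 64 2007-06-06 15:44 0 -> /dev/null
lrwx------ 1 root root 64 2007-06-06 15:44 4 -> socket:[15271]
lrwx------ 1 root root 64 2007-06-06 15:44 5 -> socket:[15272]
EOF
```

On a live system, feed it the real listing: 
"ls -l /proc/[1-9]*/fd | socket_owner 15271", run as root so that other users' 
processes are readable.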

So the server is called pdnsd. Whatever it is, check its configuration. In 
this case, pdnsd wants to be notified whenever a network interface is 
configured, and the result can be checked with this command:

   $ pdnsd-ctl status
       ... (snip long output)
        label: resolvconf
        ip: 10.0.0.138
        server assumed available: yes
        ...
Ha!  That is the actual DNS server. Try "ping -n 10.0.0.138".

> Because ping requires so many other processes to work, this might be a 
> useful way to troubleshoot a connectivity problem like mine. 

Actually, the nice thing about ping is that it requires fewer things to run, 
so it is easier to draw conclusions if it does not work; and if the problem is 
not really the network, ping is likely to work.

If all the tests above come through with positive results, the problem is more 
likely in the network itself. 
Rather than "ping", try "traceroute".

Traceroute sends probing packets similar to "ping", but it plays another trick 
with the time-to-live field: The first probe is sent with ttl=1, the next with 
ttl=2, etc. By default, traceroute on Linux sends UDP datagrams rather than 
ICMP echo requests (the -I option makes it use echo requests instead). When a 
probe's TTL counts down to zero, the router where this happens is supposed to 
discard it and send back a notification, an ICMP "time exceeded" message. In 
this way, traceroute triggers responses from each router along the path to the 
target.  That allows you to see where the problem arises. 
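
If traceroute is not installed, the TTL trick can be imitated by hand with 
ping. The sketch below only prints the commands to run, one per TTL value, so 
it is safe to try anywhere. It assumes the Linux (iputils) ping, where "-t" 
sets the TTL and "-c 1" sends a single probe; on other systems those options 
may mean something else. The name ttl_probe_commands is made up.

```shell
# Hypothetical helper: print one ping command per TTL value from 1 up to
# the given maximum. Nothing is sent; the commands are only echoed.
ttl_probe_commands() {
    target=$1
    max=$2
    ttl=1
    while [ "$ttl" -le "$max" ]; do
        echo "ping -n -c 1 -t $ttl $target"
        ttl=$((ttl + 1))
    done
}

# Probes for the first three hops towards the DNS server used above.
ttl_probe_commands 201.11.113.2 3
```

Running the printed commands one by one should produce a "time exceeded" 
answer from each router that cooperates, until the TTL is large enough to reach 
the target. The first TTL value that gets no answer at all is where to start 
looking.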

A last reason why the connectivity is not there may be firewalls. Notice that 
nowadays many computers are configured to not respond to ping or traceroute, 
and most firewalls stop traceroute probes.

> Would this 
> approach be useful to other people do you think? If it would, and I 
> could get some technical help with it, I would be prepared to produce 
> such a document. Would this be useful?

Certainly.

There are a couple of network troubleshooting guides on the net that you could 
try. The problem is that network configuration is very much a moving target, 
so many old guides are useless.  The techniques I have used above are even 
older, but, ironically, they are so basic that they do not go out of fashion.  
What these commands can pinpoint is whether the connectivity is there, whether 
the network card is configured somehow, etc. If not, they do not say why not. 
All modern distributions have sophisticated program suites that pass bits of 
information around, notifying each other of plug-and-play cards coming up and 
going away, roaming wireless connections, etc. Their components are generally 
poorly documented, described in a way that requires you to already know 
(almost) everything to understand them.  For each piece there may be two or 
three obsolete alternatives, and it is almost impossible to figure out which 
ones to read more about.

Anyone who tracks down how a caching DNS server like pdnsd gets its 
notifications when a roaming wireless interface gets connected, and describes 
it in intelligible language, will be doing the community a big favor - until 
the whole thing changes again in three months. Actually, I think the moment is 
a good one. Precisely the plug-and-play nature of modern computers is what has 
caused most of the dramatic changes, but I tend to think it must be settling a 
bit now. Describe it, and I give it four and a half months :)

But that brings up the question: Why an old version of Debian on a new 
installation?

Perhaps the most useful thing would be a table of end-user configuration 
tools. What is the name of the conf tool du jour? Which ones are obsolete, and 
when did they become so? Which tools are incompatible or require tinkering to 
reconcile?

Regards
