Re: [tcpdump-workers] RPC responde code question (print-nfs.c)

Guy Harris Wed, 05 Sep 2007 11:01:50 -0700

Alex Still wrote:

We're using tcpdump to try to diagnose an NFS performance problem.
We're seeing a lot of these :
[..]
16:01:19.890794 IP nfs_server.nfs > client_46.1205729087: reply ERR 1448
16:01:19.890831 IP nfs_server.nfs > client_46.3664003007: reply ERR 1448
[..]


That seems to be an RPC error,


It seems to be one, but it isn't necessarily one.

The code to print that printed "reply OK {length}" if the field in the packet that, *IF* it contains the beginning of an ONC RPC reply, would be the reply code is 0, and "reply ERR {length}" if it's not zero.
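To make that concrete, here is a rough Python sketch of that kind of check. It is not the tcpdump source: the function name is made up for illustration, and it assumes the RFC 5531 reply layout (xid, msg_type, reply_stat, each a 4-byte big-endian word) with any TCP record mark already skipped.

```python
import struct

MSG_ACCEPTED = 0  # reply_stat value defined by RFC 5531


def old_style_label(rpc_bytes: bytes, length: int) -> str:
    """Hypothetical helper: label a payload OK/ERR from its third 32-bit word."""
    # Assumed layout: xid, msg_type, reply_stat (4 bytes each, big-endian).
    _xid, _msg_type, reply_stat = struct.unpack_from(">III", rpc_bytes, 0)
    status = "OK" if reply_stat == MSG_ACCEPTED else "ERR"
    return f"reply {status} {length}"


# A genuine accepted reply header: xid=0x1234, msg_type=1 (REPLY), reply_stat=0.
real_reply = struct.pack(">III", 0x1234, 1, 0)

# Bytes from the middle of a READ reply: file data, not an RPC header.
mid_stream = b"some file contents that happen to be here"

print(old_style_label(real_reply, 100))
print(old_style_label(mid_stream, 1448))
```

The second call reports ERR only because whatever bytes sit at that offset are not zero; nothing about the data is actually an error.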

However, not all packets from an ONC RPC server (such as an NFS server) necessarily contain the beginning of an ONC RPC reply, as a reply (or request) can require more than one link-layer packet - for example, an NFS READ reply that returns more bytes than fit in one Ethernet packet will, when sent over Ethernet, take more than one Ethernet packet.

If the request or reply is coming over UDP, that's handled by IP fragmentation; the first packet of the reply will be the first fragment, i.e. it will have a fragment offset of 0, and can be identified by looking at the IP header. tcpdump only attempts to dissect the first fragment as anything other than IP (for one thing, subsequent fragments don't have UDP or TCP port numbers, so, unless it keeps track of the fragments, it has no idea what port number any fragment other than the first one was sent from or to), so it won't try to dissect the subsequent fragments as if they contained the beginning of an ONC RPC request or reply.
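As an illustration of the first-fragment test, here is a small Python sketch that reads the fragment offset out of an IPv4 header. The function names and the `make_header` helper are hypothetical, and this is not how tcpdump itself is written; only the header layout (bytes 6-7 holding three flag bits and a 13-bit offset in 8-byte units) is taken from the IPv4 specification.

```python
import struct


def ipv4_fragment_info(ip_header: bytes):
    """Return (fragment offset in bytes, more-fragments flag) for an IPv4 header."""
    # Bytes 6-7: reserved bit, DF, MF, then a 13-bit offset in 8-byte units.
    (flags_and_offset,) = struct.unpack_from(">H", ip_header, 6)
    more_fragments = bool(flags_and_offset & 0x2000)
    offset_bytes = (flags_and_offset & 0x1FFF) * 8
    return offset_bytes, more_fragments


def should_dissect_transport(ip_header: bytes) -> bool:
    """Only the fragment at offset 0 carries the UDP or TCP header."""
    offset_bytes, _more = ipv4_fragment_info(ip_header)
    return offset_bytes == 0


def make_header(offset_units: int, more_fragments: bool) -> bytes:
    """Hypothetical helper: a minimal 20-byte IPv4 header for demonstration."""
    header = bytearray(20)
    header[0] = 0x45  # version 4, header length 5 words
    flags = 0x2000 if more_fragments else 0
    struct.pack_into(">H", header, 6, flags | offset_units)
    return bytes(header)


first = make_header(0, True)     # first fragment of a large UDP reply
later = make_header(185, False)  # last fragment, 185 * 8 = 1480 bytes in

print(should_dissect_transport(first), should_dissect_transport(later))
```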

If, however, it's coming over TCP, that's handled by the RPC implementation, as TCP merely provides a byte stream, and has no idea what the message boundaries are for the protocols that run atop it.
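For reference, ONC RPC over TCP marks message boundaries with the record marking scheme from RFC 5531: each fragment is preceded by a 4-byte big-endian word whose top bit flags the last fragment of a message and whose low 31 bits give the fragment length. A minimal Python sketch of splitting an already-reassembled byte stream on those marks follows; the function name is an assumption for illustration, and it presumes the stream starts exactly at a record mark.

```python
import struct


def split_rpc_records(stream: bytes):
    """Return the complete RPC messages found in a TCP byte stream."""
    messages = []
    current = b""
    pos = 0
    while pos + 4 <= len(stream):
        (mark,) = struct.unpack_from(">I", stream, pos)
        last_fragment = bool(mark & 0x80000000)
        frag_len = mark & 0x7FFFFFFF
        if pos + 4 + frag_len > len(stream):
            break  # the rest of this fragment has not arrived yet
        current += stream[pos + 4 : pos + 4 + frag_len]
        pos += 4 + frag_len
        if last_fragment:
            messages.append(current)
            current = b""
    return messages


def record(payload: bytes, last: bool) -> bytes:
    """Hypothetical helper: prepend a record mark to a payload."""
    mark = len(payload) | (0x80000000 if last else 0)
    return struct.pack(">I", mark) + payload


stream = record(b"abc", True) + record(b"de", False) + record(b"f", True)
print(split_rpc_records(stream))
```

Note that this only works on a contiguous stream; a tool that looks at each TCP segment in isolation cannot tell where the next record mark falls.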

Therefore, the current tcpdump code to dissect RPC requests and replies doesn't identify TCP segments that are in the middle of an ONC RPC request or reply, and tries to dissect them as if they were at the beginning of the request or reply.

This can cause packets to be mis-dissected; the apparent RPC errors you're seeing are probably examples of that.

That part of the code
has changed in tcpdump-3.9.7, which we installed, and now gives :

16:01:19.890794 IP nfs_server.nfs > client_46.1205729087: reply Unknown rpc
response code=3128327487 1448
16:01:19.890831 IP nfs_server.nfs > client_46.3664003007: reply Unknown rpc
response code=910276799 1448

It now prints the value of the field that, if the packet really *is* an RPC error, would be the error code (the code is derived from NetBSD's version of tcpdump). If the packet *isn't* an RPC error, the value at the location that's interpreted as a reply code has a very good chance of being bogus, as was the case here.
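A quick way to see that those numbers are just payload bytes is to turn them back into their wire form. The Python sketch below assumes the field in question is the RFC 5531 reply_stat, for which only the values 0 and 1 are defined; the function names are invented for this example.

```python
import struct

# reply_stat values defined by RFC 5531; anything else is not a valid reply.
DEFINED_REPLY_STAT = {0: "MSG_ACCEPTED", 1: "MSG_DENIED"}


def describe_reply_stat(value: int) -> str:
    """Name a reply_stat value, or report it as unknown."""
    return DEFINED_REPLY_STAT.get(value, f"Unknown rpc response code={value}")


def as_wire_bytes(value: int) -> bytes:
    """The four big-endian bytes that would decode to this value."""
    return struct.pack(">I", value)


for code in (3128327487, 910276799):
    print(describe_reply_stat(code), as_wire_bytes(code).hex())
```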

I don't understand what's happening here. Is it something wrong with our NFS
setup,

No.

or I'm misunderstanding the tcpdump output ?


Yes.

Wireshark/TShark should do a better job of this, for two reasons:

1) it does reassembly of ONC RPC requests and replies over TCP, as well as handling TCP segments with multiple requests or replies;

2) it does some heuristics to check whether a packet *is* an ONC RPC request or reply.

1) would probably be a major undertaking to do in tcpdump, but 2) could be done:

for requests, check whether the RPC version field (not the NFS version) has the value 2 (neither Sun nor anybody else has released any version of ONC RPC other than 2), and check whether the program number in the request is the program number for NFS (note that Sun also runs another RPC protocol on port 2049 - a protocol to allow access to POSIX ACLs over NFS - which will be misdissected if interpreted as NFS);

for replies, check whether the transaction ID of the reply matches the transaction ID of a request we've already seen (tcpdump already does that matching for replies without errors, as it needs to know what type of request the reply is for).
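The two checks above could look roughly like this in Python. This is only a sketch of the idea, not proposed tcpdump code: the function names are hypothetical, the field layout is the RFC 5531 one (xid, msg_type, then rpcvers and prog for a call), and the extra msg_type test is an addition of this sketch rather than part of the description above.

```python
import struct

RPC_VERSION = 2       # the only ONC RPC version ever released
NFS_PROGRAM = 100003  # ONC RPC program number assigned to NFS
CALL, REPLY = 0, 1    # msg_type values from RFC 5531


def looks_like_nfs_call(rpc_bytes: bytes) -> bool:
    """Heuristic: does this payload start with a plausible NFS request?"""
    if len(rpc_bytes) < 16:
        return False
    _xid, msg_type, rpcvers, prog = struct.unpack_from(">IIII", rpc_bytes, 0)
    return msg_type == CALL and rpcvers == RPC_VERSION and prog == NFS_PROGRAM


def looks_like_reply(rpc_bytes: bytes, pending_xids: set) -> bool:
    """Heuristic: does this payload answer a request we have already seen?"""
    if len(rpc_bytes) < 8:
        return False
    xid, msg_type = struct.unpack_from(">II", rpc_bytes, 0)
    return msg_type == REPLY and xid in pending_xids


# xid, CALL, rpcvers=2, prog=NFS, NFS version 3, procedure 6
call = struct.pack(">IIIIII", 0xABCD, CALL, 2, NFS_PROGRAM, 3, 6)
reply = struct.pack(">III", 0xABCD, REPLY, 0)
pending = {0xABCD} if looks_like_nfs_call(call) else set()

print(looks_like_reply(reply, pending))
print(looks_like_reply(b"data from the middle of a READ", pending))
```

A segment from the middle of a long reply fails the transaction ID match, so it would be left undissected instead of being reported as an error.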


I'll see whether that could be done.
-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.
