On 07/29/11 09:49, Guy Harris wrote:
On Jul 27, 2011, at 3:02 AM, Darren Reed wrote:
With Solaris, the interfaces available from the driver and protocol stack
prohibit access to actual packets at the link layer. I don't know if this is or
will be possible with Linux, but if the link layer header for IPoIB on Linux is
12 bytes, then no, the data before the IP header that is exposed by Infiniband
on Linux is not the link layer header. Furthermore, the comments that I've
received suggest that this type of access to network packets is not possible
with Infiniband.
For ARP packets, the influence of Infiniband is simply on the size of the
address placed in the ARP packets.
The address used in ARP packets for Infiniband is the same across all
implementations of IPoIB.
So whilst the pre-IP header is different on Solaris and Linux for Infiniband
packets, the Infiniband address placed in the ARP packets is an Infiniband
address and is not dependent on the implementation of IPoIB.
Thus mapping ARPHRD_INFINIBAND to 32 will be fine for both Linux and Solaris.
Presumably "for Solaris" means that, for libpcap on Solaris 11, you have a
choice of using BPF (which returns DLT_ values), PF_PACKET sockets (which return ARPHRD_
values), and DLPI (which returns DL_ values)? If it doesn't support using PF_PACKET
sockets for capturing, libpcap-on-Solaris has no reason to care about ARPHRD_anything.
The ARPHRD_INFINIBAND value (32) is seen by tcpdump when decoding ARP
headers in IPoIB traffic on the relevant interfaces, as can be seen in
this patch:
diff -uN tcpdump-4.1.1/print-arp.c tcpdump-4.1.1.new/print-arp.c
--- tcpdump-4.1.1/print-arp.c 2010-03-11 17:56:44.000000000 -0800
+++ tcpdump-4.1.1.new/print-arp.c 2011-07-14 09:01:08.965396346 -0700
@@ -62,6 +62,7 @@
u_char ar_hln; /* length of hardware address */
u_char ar_pln; /* length of protocol address */
u_short ar_op; /* one of: */
+#define ARPHRD_INFINIBAND 32 /* Infiniband RFC 4391 */
#define ARPOP_REQUEST 1 /* request to resolve address */
#define ARPOP_REPLY 2 /* response to previous request */
#define ARPOP_REVREQUEST 3 /* request protocol address given hardware */
@@ -118,6 +119,7 @@
{ ARPHRD_STRIP, "Strip" },
{ ARPHRD_IEEE1394, "IEEE 1394" },
{ ARPHRD_ATM2225, "ATM" },
+ { ARPHRD_INFINIBAND, "Infiniband" },
{ 0, NULL }
};
Here the symbol "ARPHRD_INFINIBAND" is defined only for use when printing
out ARP packets. Now that I think about it, the above patch isn't
really the best, but it should give you an idea of what the problem
is here. Without the above patch, tcpdump reports an address of an
unknown type in the ARP messages. That message can be confusing and
is avoidable.
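To make the effect of the table entry concrete, here is a minimal standalone sketch of a value-to-name lookup with a fallback for unknown hardware types. It is not tcpdump's actual code: the struct, the helper name arphrd_to_name() and the fallback string are assumptions made for illustration. Only the value 32 for ARPHRD_INFINIBAND comes from the patch above.

```c
#include <stddef.h>
#include <stdio.h>

/* ARP hardware type numbers; 32 is the value used in the patch above. */
#define ARPHRD_ETHER       1
#define ARPHRD_INFINIBAND 32

/* Hypothetical table entry type, for illustration only. */
struct hrd_name {
    unsigned int value;
    const char *name;
};

/* Table terminated by a NULL name, in the style of the patched table. */
static const struct hrd_name arphrd_names[] = {
    { ARPHRD_ETHER,      "Ethernet"   },
    { ARPHRD_INFINIBAND, "Infiniband" },
    { 0, NULL }
};

/*
 * Hypothetical helper: return the name for a hardware type, or format
 * a fallback string into buf when the type is not in the table.
 */
static const char *
arphrd_to_name(unsigned int hrd, char *buf, size_t buflen)
{
    for (const struct hrd_name *p = arphrd_names; p->name != NULL; p++) {
        if (p->value == hrd)
            return p->name;
    }
    snprintf(buf, buflen, "Unknown Hardware (%u)", hrd);
    return buf;
}
```

Without the ARPHRD_INFINIBAND row, a lookup of 32 would take the fallback branch, which is the confusing output described above.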
Yes, on Solaris, DL_IB is defined for use with DLPI and Infiniband.
For the DLT values, I'm going to use the names DLT_IPOIB and
LINKTYPE_SOLARIS_IPOIB for Solaris 11. If a pair of numbers can be assigned in
the next 24 or so hours, I'll use those, otherwise it'll be DLT_USER15 for
both. If I understand correctly, the design is such that libpcap on Linux would
then map DLT_IPOIB to LINKTYPE_LINUX_IPOIB.
No. There are APIs in libpcap that are expected to return DLT_ values for
savefiles, and savefiles have LINKTYPE_ values in them (because there are
some cases where different BSDs use different numerical values for the same
DLT_ definitions - and, in at least some of those cases, BSD #1 uses a given
numerical value for DLT_xxx and BSD #2 uses that numerical value for DLT_yyy
and a different numerical value for DLT_xxx - so we need a single LINKTYPE_xxx
numerical value to correspond to all of the different numerical values of
DLT_xxx), so there would have to be different DLT_s for LINKTYPE_SOLARIS_IPOIB
and LINKTYPE_LINUX_IPOIB.
Right, I'm with you on that.
So Linux would, presumably, when opening an Infiniband interface, map
ARPHRD_INFINIBAND (32) to DLT_LINUX_IPOIB, just as Solaris BPF would just
return DLT_SOLARIS_IPOIB and, if there's DLPI access to those interfaces,
libpcap on Solaris's DLPI code would map DL_IPOIB or whatever to
DLT_SOLARIS_IPOIB (if they have the same link-type header format). libpcap on
*all* platforms, and WinPcap on Windows, would map LINKTYPE_SOLARIS_IPOIB in a
capture file to DLT_SOLARIS_IPOIB and would map LINKTYPE_LINUX_IPOIB in a
capture file to DLT_LINUX_IPOIB to be returned by pcap_datalink().
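As a rough sketch of the one-to-one pairing described above, the mapping could be held in a small table that is searched in either direction. Everything here is illustrative: the four numeric values are made-up placeholders (no numbers had been assigned at the time of this thread), and the table and function names are hypothetical, not libpcap's.

```c
#include <stddef.h>

/*
 * Placeholder numbers only. They are NOT assigned values; they exist
 * so the sketch compiles and the two namespaces are visibly distinct.
 */
enum {
    DLT_SOLARIS_IPOIB      = 1001,
    DLT_LINUX_IPOIB        = 1002,
    LINKTYPE_SOLARIS_IPOIB = 2001,
    LINKTYPE_LINUX_IPOIB   = 2002
};

/* One row per header format: each DLT_ pairs with exactly one LINKTYPE_. */
struct linktype_pair {
    int dlt;
    int linktype;
};

static const struct linktype_pair ipoib_map[] = {
    { DLT_SOLARIS_IPOIB, LINKTYPE_SOLARIS_IPOIB },
    { DLT_LINUX_IPOIB,   LINKTYPE_LINUX_IPOIB   },
};

#define IPOIB_MAP_LEN (sizeof ipoib_map / sizeof ipoib_map[0])

/* Used when writing a savefile: DLT_ -> LINKTYPE_, or -1 if unknown. */
static int
example_dlt_to_linktype(int dlt)
{
    for (size_t i = 0; i < IPOIB_MAP_LEN; i++) {
        if (ipoib_map[i].dlt == dlt)
            return ipoib_map[i].linktype;
    }
    return -1;
}

/* Used when reading a savefile: LINKTYPE_ -> DLT_, or -1 if unknown. */
static int
example_linktype_to_dlt(int linktype)
{
    for (size_t i = 0; i < IPOIB_MAP_LEN; i++) {
        if (ipoib_map[i].linktype == linktype)
            return ipoib_map[i].dlt;
    }
    return -1;
}
```

Because the Solaris and Linux headers differ, they get separate rows, so a capture written on one platform is read back with the matching DLT_ on any other.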
DLPI's DL_IB and BPF's DLT_SOLARIS_IPOIB on Solaris result in the same
header that is found before the IP header being received by applications.
With this email, I've attached a capture from an IB adapter on Solaris.
The patch above is required to make "tcpdump -v" output sensible for ARP
messages inside IPoIB, for example:
08:11:08.610137 ARP, Infiniband (len 20), IPv4 (len 4), Request who-has 192.168.37.12 (00:ff:ff:ff:ff:10:40:1b:00:00:00:00:00:00:00:00:ff:ff:ff:ff) tell 192.168.37.1, length 56
08:11:08.610327 ARP, Infiniband (len 20), IPv4 (len 4), Reply 192.168.37.12 is-at 80:00:00:51:fe:80:00:00:00:00:00:00:00:21:28:00:01:a1:1d:45, length 56
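For anyone reading the 20-byte addresses in that output, the sketch below splits one into its parts. It assumes the layout I understand RFC 4391 to give for the IPoIB link-layer address: one reserved/flags octet, a 24-bit queue pair number, then a 16-byte GID. The struct and function names are made up for the example and are not part of tcpdump.

```c
#include <stdint.h>
#include <string.h>

#define IPOIB_HWADDR_LEN 20

/* Hypothetical decoded form of a 20-byte IPoIB hardware address. */
struct ipoib_hwaddr {
    uint8_t  flags;    /* first octet (reserved/flags) */
    uint32_t qpn;      /* 24-bit queue pair number */
    uint8_t  gid[16];  /* 128-bit port GID */
};

/* Split the raw 20 bytes from the ARP packet into their three fields. */
static void
ipoib_hwaddr_parse(const uint8_t raw[IPOIB_HWADDR_LEN],
                   struct ipoib_hwaddr *out)
{
    out->flags = raw[0];
    out->qpn = ((uint32_t)raw[1] << 16) |
               ((uint32_t)raw[2] << 8)  |
                (uint32_t)raw[3];
    memcpy(out->gid, raw + 4, sizeof out->gid);
}
```

Under that assumption, the address in the Reply line above has a first octet of 0x80, a QPN of 0x000051, and a GID beginning fe:80 (link-local prefix) and ending 1d:45.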
Darren