On Fri, 8 Oct 2010, Attilio Rao wrote:
GENERAL FRAMEWORK ARCHITECTURE
Netdump is composed, right now, by an userland "server" and a kernel
"client". The former is run on the target machine (where the dump will
phisically happen) and it is responsible for receiving the packets
containing coredumps frame and for correctly writing them on-disk. The
latter is part of the kernel installed on the source machine (where the
dump is initiated) and is responsible for building correctly UDP packets
containing the coredump frames, pushing through the network interface and
routing them appropriately.
Hi Attilio:
Network dumps would be a great addition to the FreeBSD debugging suite! A few
casual comments and questions, I need to spend some more time pondering the
implications of the current netdump design later in the week.
(1) Did you consider using tftp as the network dump protocol, rather than a
custom protocol? It's also a simple UDP-based, ACKed file transfer protocol,
with the advantage that it's widely supported by exiting tftp daemons, packet
sniffers, security devices, etc. This wouldn't require using a stock tftpd,
although that would certainly be a benefit as well. The post-processing of
crashdumps that seems to happen in the netdump server could, presumably,
happen "offline" as easily...?
(2) Have you thought about adding a checksum to the dump format, since packets
are going over the wire on UDP, etc? Even an MD5 sum wouldn't be too hard to
arrange: all the code is in the kernel already, requires relatively little
state storage, and is designed for streamed data.
(3) As the bounds checking/etc behavior in dumping grows more complex, it
seems a shame to replicate it in architecture-specific code. Could we pull
the mediaoffset/mediasize checking into common code? Also, a reserved
size/offset of "0" meaning "no limit" sounds slightly worrying; it might be
slightly more conservative to add a flags field to dumperinfo and have a flag
indicating "size is no object" for netdump, rather than overloading the
existing fields.
Some specific patch comments:
+ * XXX: This should be split into machdep and non-machdep parts
What MD parts are in the file?
The sysctl parts of the patch have a number of issues:
- Please use sysctl_handle_string rather than manual calls and string bounds
checking with SYSCTL_IN/SYSCTL_OUT. In general, type-specific sysctl
handlers are named sysctl_handle_<type>, which I recommend here as well.
- Please use stack-local, fixed-length string buffers as the
source/destination of copyin/copyout on ifnet names, rather than the fields
in the structure itself. We support interface renaming, so the field can
change, and sysctl has the potential to block for extended periods
mid-copyin/copyout.
- Because we support ifnet renaming, "none" is probably not a good reserved
name. Could you use the empty string or something similar, or just a flag
to indicate that netdump is disabled?
- "ifp" rather than "ifn" and "nic" is the preferred variable name for ifnet
pointers throughout the kernel.
- ifnets can, and are, removeable at runtime. We have a refcount, but that's
not really what you want. I suggest simply looking up the ifnet by name
when it's needed during a dump or for sysctl output, rather than maintaining
a long-term pointer, which can become stale. Especially with the
auto-detect code.
- Since sysctl_nic is only used once, I'd prefer it were inlined into a
use-specific handler function, avoiding the arg1/arg2 complications that
make this code a bit harder to read. sysctl_ip is useful because it's used
more than once, though. However, it also very much wants you to call
sysctl_handle_string!
+sysctl_force_crash(SYSCTL_HANDLER_ARGS)
Does this belong in the netdump code? We already have some of these options
in debug.kdb.*, but perhaps others should be added there as well.
+ /*
+ * get and fill a header mbuf, then chain data as an extended
+ * mbuf.
+ */
+ MGETHDR(m, M_DONTWAIT, MT_DATA);
The idea of calling into the mbuf allocator in this context is just freaky,
and may have some truly awful side effects. I suppose this is the cost of
trying to combine code paths in the network device driver rather than have an
independent path in the netdump case, but it's quite unfortunate and will
significantly reduce the robustness of netdumps in the face of, for example,
mbuf starvation.
+ if (ntohs(ah->ar_hrd) != ARPHRD_ETHER &&
+ ntohs(ah->ar_hrd) != ARPHRD_IEEE802 &&
+ ntohs(ah->ar_hrd) != ARPHRD_ARCNET &&
+ ntohs(ah->ar_hrd) != ARPHRD_IEEE1394) {
+ NETDDEBUG("nd_handle_arp: unknown hardware address fmt "
+ "0x%2D)\n", (unsigned char *)&ah->ar_hrd, "");
+ return;
+ }
Are you sure you don't want to just check for ETHER here?
+ /* XXX: Probably should check if we're the recipient MAC address */
+ /* Done ethernet processing. Strip off the ethernet header */
Yes, quite probably. What if you panic in promiscuous mode?
+ /*
+ * The first write (at offset 0) is the kernel dump header. Flag it
+ * for the server to treat specially. XXX: This doesn't strip out the
+ * footer KDH, although it shouldn't hurt anything.
+ */
The footer allows us to confirm that the tail end of a dump matches the
beginning; while probably not strictly necessary in this scenario, it's not a
bad idea given the lack of a checksum.
+ /* Default the nic to the first available interface */
+ if ((ifn = TAILQ_FIRST(&ifnet)) != NULL) do {
+ if ((ifn->if_flags & IFF_UP) == 0)
+ continue;
More locking needed.
I'm not sure I like the idea of defaulting to net dumping out the first
interface -- in the future, we'll want to compile netdump into the kernel in
GENERIC so that it's always available. But we won't want to default to doing
the above for a number of reasons. This means the defaults need to be
sensible and quite conservative (i.e., disabled and non-autoconfigured).
+ /* Default the client to the first IP on nd_nic */
+ if ((ifa = TAILQ_FIRST(&nd_nic->if_addrhead)) != NULL) do {
+ if (ifa->ifa_addr->sa_family != AF_INET) {
+ continue;
+ }
+ nd_client = ((struct sockaddr_in *)ifa->ifa_addr)->sin_addr;
+ } while ((ifa = TAILQ_NEXT(ifa, ifa_link)) != NULL &&
+ nd_client.s_addr == INADDR_ANY);
More locking needed here too. I'm guess this code is forward-ported from an
older FreeBSD version without locking here, so you will want to check for
other changes in the replicated code paths, such as input / arp / etc
handling.
And is this really a good idea? Addresses change, and I think you aren't
registering a callback to pick up the change in the event you're using an
auto-configured address. In some ways, in the same way you might want to look
up the ifnet to use at dump-time, you might simultaneously want to
auto-configure an address then, if auto-configuration is selected (i.e.,
netdump is enabled and no IP is configured).
- void *if_pspare[7];
+ struct netdump_methods *if_ndumpfuncs; /* netdump virtual methods */
+ void *if_pspare[6];
int if_ispare[4];
In HEAD, at least, you should add your field and not use the spare. The spare
should only be used on a merge to a KBI-stable branch (such as 8.x).
I need to ponder your changes to ifnet and to the drivers some more; I
recognize the benefits of maximal code alignment, but I worry a lot about
these code paths having side effects (not least, calls into memory allocators,
which in turn can enter VM, etc). I wonder if, given that, we need to teach
the mbuf allocator to be much more conservative if netdumping is active...
There are style bugs throughout.
Robert
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"