Re: [PATCH] Netdump for review and testing -- preliminary version
On Fri, 15 Oct 2010, Robert N. M. Watson wrote: On 15 Oct 2010, at 20:39, Garrett Cooper wrote: But there are already some cases that aren't properly handled today in the ddb area dealing with dumping that aren't handled properly. Take for instance the following two scenarios: 1. Call doadump twice from the debugger. 2. Call doadump, exit the debugger, reenter the debugger, and call doadump again. Both of these scenarios hang reliably for me. I'm not saying that we should regress things further, but I'm just noting that there are most likely a chunk of edgecases that aren't being handled properly when doing dumps that could be handled better / fixed. Even thinking about calling doadump even once from within the debugger is an error. I was asleep when the similar error for panic was committed, and this error has propagated. Debuggers should use a trampoline to call the "any" function, not the least so that they can be used to debug the "any" function without the extra complications to make themself reentrant. I think gdb has always used a trampoline for this outside of the kernel. Not sure what it does within the kernel, but it would have even larger problems than in userland finding a place for the trampoline. In the kernel, there is the additional problem of keeping control while the "any" function is run. Other CPUs must be kept stopped and interrupts must be kept masked, except when the "any" function really needs other CPUs or unmasked interrupts. Single stepping also needs this and doesn't have it (other CPUs and interrupt handlers can run and execute any number of instructions while you are trying to execute a single one). All ddb "commands" that change the system state are really non-ddb commands that should use an external function via a trampoline. Panicing and dumping are just the largest ones, so they are the most impossible to do correctly as commands and the most in need of ddb to debug them. Right: one of the points I've made to Attilio is that we need to move to a more principled model as to what sorts of things we allow in various kernel environments. The early boot is a special environment -- so is the debugger, but the debugger on panic is not the same as the debugger when you can continue. Likewise, the crash dumping code is special, but also not the same as the debugger. Right now, exceptional behaviour to limit hangs/etc is done inconsistently. We need to develop a set of principles that tell us what is permitted in what contexts, and then use that to drive design decisions, normalizing what's there already. ENONUNIXEDITOR. Format not recovered. panic() from within a debugger (or a fast interrupt handler, or a fast interrupt handler that has trappeded to the debugger by request...) is, although an error, not too bad since panic() must be prepared to work starting from the "any" state anyway, and as you mention it doesn'tneed to be able to return (except for RESTARTABLE_PANICS, which makes things impossibly difficult). Continuing from a debugger is feasible mainly because in the usual case the system state is not changed (except for time-dependent things). If you use it to modify memory or i/o or run one of its unsafe commands then you have to be careful. This is not dissimilar to what we do with locking already, BTW: we define a set of kernel environments (fast interrupt handlers, non-sleepable threads, sleepable thread holding non-sleepable locks, etc), and based on those principles prevent significant sources of instability that might otherwise arise in a complex, concurrent kernel. We need to apply the same sort of approach to handling kernel debugging and crashing. Locking has imposed considerable discipline, which if followed by panic() would should how wrong most of the things done by panic() are -- it will hit locks, but shouldn't even be calling functions that have locks, since such functions expect their locks to work. The rules for fast interrupt handlers are simple and mostly not followed. They are that a fast interrupt handler may not access any state not specially locked by its subsystem. This means that they may not call any other subsystem or any upper layer except the null set of ones documented to be safe to call. In practice, this means not calling the "any" function, but it is necessary for atomic ops, bus space accesses, and a couple of scheduling functions to be safe enough. BTW, my view is that except in very exceptional cases, it should not be possible to continue after generating a dump. Dumps often cause disk controllers to get reset, which may leave outstanding I/O in nasty situations. Unless the dump device and model is known not to interfere with operation, we should set state indicating that the system is non-continuable once a dump has occurred. It might be safe if the system reinitialized everything. Too hard for just dumping, but it is needed after resume anyway. So the following could reason
Re: [PATCH] Netdump for review and testing -- preliminary version
On 15 Oct 2010, at 20:39, Garrett Cooper wrote: >But there are already some cases that aren't properly handled > today in the ddb area dealing with dumping that aren't handled > properly. Take for instance the following two scenarios: > 1. Call doadump twice from the debugger. > 2. Call doadump, exit the debugger, reenter the debugger, and call > doadump again. >Both of these scenarios hang reliably for me. >I'm not saying that we should regress things further, but I'm just > noting that there are most likely a chunk of edgecases that aren't > being handled properly when doing dumps that could be handled better / > fixed. Right: one of the points I've made to Attilio is that we need to move to a more principled model as to what sorts of things we allow in various kernel environments. The early boot is a special environment -- so is the debugger, but the debugger on panic is not the same as the debugger when you can continue. Likewise, the crash dumping code is special, but also not the same as the debugger. Right now, exceptional behaviour to limit hangs/etc is done inconsistently. We need to develop a set of principles that tell us what is permitted in what contexts, and then use that to drive design decisions, normalizing what's there already. This is not dissimilar to what we do with locking already, BTW: we define a set of kernel environments (fast interrupt handlers, non-sleepable threads, sleepable thread holding non-sleepable locks, etc), and based on those principles prevent significant sources of instability that might otherwise arise in a complex, concurrent kernel. We need to apply the same sort of approach to handling kernel debugging and crashing. BTW, my view is that except in very exceptional cases, it should not be possible to continue after generating a dump. Dumps often cause disk controllers to get reset, which may leave outstanding I/O in nasty situations. Unless the dump device and model is known not to interfere with operation, we should set state indicating that the system is non-continuable once a dump has occurred. Robert ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: [PATCH] Netdump for review and testing -- preliminary version
On Fri, Oct 15, 2010 at 12:25 PM, Robert Watson wrote: > On Thu, 14 Oct 2010, Attilio Rao wrote: > >>> No, what I'm saying is: UMA needs to not call its drain handlers, and >>> ideally not call into VM to fill slabs, from the dumping context. That's >>> easy to implement and will cause the dump to fail rather than causing the >>> system to hang. >> >> My point is, however, still the same: that should not happen just for the >> netdump specific case but for all the dumping/KDB/panic cases (I know it is >> unlikely current code !netdump calls into UMA but it is not an established >> pre-requisite and may still happen that some added code does). I still see >> this as a weakness on the infrastructure, independently from netdump. I can >> see that your point is that it is vital to netdump correct behaviour though, >> so I'd wonder if it worths fixing it now or later. > > Quite a bit of our kernel and dumping infrastructure special cases debugging > and dumping behavior to avoid sources of non-robustness. For example, > serial drivers avoid locking, and for disk dumps we bypass GEOM to avoid the > memory allocation, freeing, and threading that it depends on. > > The goal here is to be robust when handling dumps: hanging is worse than not > dumping, since you won't get the dump either way, and if you don't reboot > then the system requires manual intervention to recover. Example of things > that are critical to avoid include: > > - The dumping thread tripping over locks held by the panicked thread, or by > another now-suspended thread, leading to deadlock against a suspended > thread. > > - Corrupting dumps by increasing concurrency in the panic case. We ran into > a > case a year or two ago where changing VM state during the dump on amd64 > caused file system corruption as the dump code assumed that the space > required for a dump didn't change while dumping took place. > > Any code dependency we add in the panic / KDB / dump path is one more risk > that we don't successfully dump and reboot, so we need to minimize that > code. But there are already some cases that aren't properly handled today in the ddb area dealing with dumping that aren't handled properly. Take for instance the following two scenarios: 1. Call doadump twice from the debugger. 2. Call doadump, exit the debugger, reenter the debugger, and call doadump again. Both of these scenarios hang reliably for me. I'm not saying that we should regress things further, but I'm just noting that there are most likely a chunk of edgecases that aren't being handled properly when doing dumps that could be handled better / fixed. Thanks, -Garrett ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: [PATCH] Netdump for review and testing -- preliminary version
On Thu, 14 Oct 2010, Attilio Rao wrote: No, what I'm saying is: UMA needs to not call its drain handlers, and ideally not call into VM to fill slabs, from the dumping context. That's easy to implement and will cause the dump to fail rather than causing the system to hang. My point is, however, still the same: that should not happen just for the netdump specific case but for all the dumping/KDB/panic cases (I know it is unlikely current code !netdump calls into UMA but it is not an established pre-requisite and may still happen that some added code does). I still see this as a weakness on the infrastructure, independently from netdump. I can see that your point is that it is vital to netdump correct behaviour though, so I'd wonder if it worths fixing it now or later. Quite a bit of our kernel and dumping infrastructure special cases debugging and dumping behavior to avoid sources of non-robustness. For example, serial drivers avoid locking, and for disk dumps we bypass GEOM to avoid the memory allocation, freeing, and threading that it depends on. The goal here is to be robust when handling dumps: hanging is worse than not dumping, since you won't get the dump either way, and if you don't reboot then the system requires manual intervention to recover. Example of things that are critical to avoid include: - The dumping thread tripping over locks held by the panicked thread, or by another now-suspended thread, leading to deadlock against a suspended thread. - Corrupting dumps by increasing concurrency in the panic case. We ran into a case a year or two ago where changing VM state during the dump on amd64 caused file system corruption as the dump code assumed that the space required for a dump didn't change while dumping took place. Any code dependency we add in the panic / KDB / dump path is one more risk that we don't successfully dump and reboot, so we need to minimize that code. Robert ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: [PATCH] Netdump for review and testing -- preliminary version
2010/10/14 Robert N. M. Watson : > > On 14 Oct 2010, at 15:10, Attilio Rao wrote: > >>> My concern is less about occasional lost dumps that destabilising the >>> dumping process: calls into the memory allocator can currently trigger a >>> lot of interesting behaviours, such as further calls back into the VM >>> system, which can then trigger calls into other subsystems. What I'm >>> suggesting is that if we want the mbuf allocator to be useful in this >>> context, we need to teach it about things not to do in the dumping / crash >>> / ... context, which probably means helping uma out a bit in that regard. >>> And a watchdog to make sure the dump is making progress. >> >> I think that this would be way too complicated just to cope with panic >> within the VM/UMA (not sure what other subsystems you are referring >> to, wrt supposed to call). Besides, if we have a panic in the VM I'm >> sure that normal dumps could also be affected. >> When dealing with netdump, I'm not trying to fix all the bugs related >> to our dumping infrastructure because, as long as we already >> discussed, we know there are quite a few of them, but trying at least >> to follow the same fragile-ness than what we have today. >> And again, while I think the "watchdog" idea is good, I think it still >> applies to normal dumps too, it is not specific to netdump. > > No, what I'm saying is: UMA needs to not call its drain handlers, and ideally > not call into VM to fill slabs, from the dumping context. That's easy to > implement and will cause the dump to fail rather than causing the system to > hang. Ok. My point is, however, still the same: that should not happen just for the netdump specific case but for all the dumping/KDB/panic cases (I know it is unlikely current code !netdump calls into UMA but it is not an established pre-requisite and may still happen that some added code does). I still see this as a weakness on the infrastructure, independently from netdump. I can see that your point is that it is vital to netdump correct behaviour though, so I'd wonder if it worths fixing it now or later. More people's comment would be appreciated. Thanks, Attilio -- Peace can only be achieved by understanding - A. Einstein ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: [PATCH] Netdump for review and testing -- preliminary version
On 14 Oct 2010, at 15:10, Attilio Rao wrote: >> My concern is less about occasional lost dumps that destabilising the >> dumping process: calls into the memory allocator can currently trigger a lot >> of interesting behaviours, such as further calls back into the VM system, >> which can then trigger calls into other subsystems. What I'm suggesting is >> that if we want the mbuf allocator to be useful in this context, we need to >> teach it about things not to do in the dumping / crash / ... context, which >> probably means helping uma out a bit in that regard. And a watchdog to make >> sure the dump is making progress. > > I think that this would be way too complicated just to cope with panic > within the VM/UMA (not sure what other subsystems you are referring > to, wrt supposed to call). Besides, if we have a panic in the VM I'm > sure that normal dumps could also be affected. > When dealing with netdump, I'm not trying to fix all the bugs related > to our dumping infrastructure because, as long as we already > discussed, we know there are quite a few of them, but trying at least > to follow the same fragile-ness than what we have today. > And again, while I think the "watchdog" idea is good, I think it still > applies to normal dumps too, it is not specific to netdump. No, what I'm saying is: UMA needs to not call its drain handlers, and ideally not call into VM to fill slabs, from the dumping context. That's easy to implement and will cause the dump to fail rather than causing the system to hang. Robert___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: [PATCH] Netdump for review and testing -- preliminary version
2010/10/14 Robert N. M. Watson : > > On 13 Oct 2010, at 18:46, Ryan Stone wrote: > >> On Fri, Oct 8, 2010 at 9:15 PM, Robert Watson wrote: >>> + /* >>> + * get and fill a header mbuf, then chain data as an >>> extended >>> + * mbuf. >>> + */ >>> + MGETHDR(m, M_DONTWAIT, MT_DATA); >>> >>> The idea of calling into the mbuf allocator in this context is just freaky, >>> and may have some truly awful side effects. I suppose this is the cost of >>> trying to combine code paths in the network device driver rather than have >>> an independent path in the netdump case, but it's quite unfortunate and will >>> significantly reduce the robustness of netdumps in the face of, for example, >>> mbuf starvation. >> >> Changing this will require very invasive changes to the network >> drivers. I know that the Intel drivers allocate their own mbufs for >> their receive rings and I imagine that all other drivers have to do >> something similar. Plus the drivers are responsible for freeing mbufs >> after they have been transmitted. It seems to me that the cost of >> making significant changes to the network drivers to support an >> alternate lifecycle for netdump mbufs far outweighs the cost of losing >> a couple of kernel dumps in extreme circumstances. > > My concern is less about occasional lost dumps that destabilising the dumping > process: calls into the memory allocator can currently trigger a lot of > interesting behaviours, such as further calls back into the VM system, which > can then trigger calls into other subsystems. What I'm suggesting is that if > we want the mbuf allocator to be useful in this context, we need to teach it > about things not to do in the dumping / crash / ... context, which probably > means helping uma out a bit in that regard. And a watchdog to make sure the > dump is making progress. I think that this would be way too complicated just to cope with panic within the VM/UMA (not sure what other subsystems you are referring to, wrt supposed to call). Besides, if we have a panic in the VM I'm sure that normal dumps could also be affected. When dealing with netdump, I'm not trying to fix all the bugs related to our dumping infrastructure because, as long as we already discussed, we know there are quite a few of them, but trying at least to follow the same fragile-ness than what we have today. And again, while I think the "watchdog" idea is good, I think it still applies to normal dumps too, it is not specific to netdump. Thanks, Attilio -- Peace can only be achieved by understanding - A. Einstein ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: [PATCH] Netdump for review and testing -- preliminary version
On 13 Oct 2010, at 18:46, Ryan Stone wrote: > On Fri, Oct 8, 2010 at 9:15 PM, Robert Watson wrote: >> + /* >> +* get and fill a header mbuf, then chain data as an >> extended >> +* mbuf. >> +*/ >> + MGETHDR(m, M_DONTWAIT, MT_DATA); >> >> The idea of calling into the mbuf allocator in this context is just freaky, >> and may have some truly awful side effects. I suppose this is the cost of >> trying to combine code paths in the network device driver rather than have >> an independent path in the netdump case, but it's quite unfortunate and will >> significantly reduce the robustness of netdumps in the face of, for example, >> mbuf starvation. > > Changing this will require very invasive changes to the network > drivers. I know that the Intel drivers allocate their own mbufs for > their receive rings and I imagine that all other drivers have to do > something similar. Plus the drivers are responsible for freeing mbufs > after they have been transmitted. It seems to me that the cost of > making significant changes to the network drivers to support an > alternate lifecycle for netdump mbufs far outweighs the cost of losing > a couple of kernel dumps in extreme circumstances. My concern is less about occasional lost dumps that destabilising the dumping process: calls into the memory allocator can currently trigger a lot of interesting behaviours, such as further calls back into the VM system, which can then trigger calls into other subsystems. What I'm suggesting is that if we want the mbuf allocator to be useful in this context, we need to teach it about things not to do in the dumping / crash / ... context, which probably means helping uma out a bit in that regard. And a watchdog to make sure the dump is making progress. Robert___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: [PATCH] Netdump for review and testing -- preliminary version
On Fri, Oct 8, 2010 at 9:15 PM, Robert Watson wrote: > + /* > +* get and fill a header mbuf, then chain data as an > extended > +* mbuf. > +*/ > + MGETHDR(m, M_DONTWAIT, MT_DATA); > > The idea of calling into the mbuf allocator in this context is just freaky, > and may have some truly awful side effects. I suppose this is the cost of > trying to combine code paths in the network device driver rather than have > an independent path in the netdump case, but it's quite unfortunate and will > significantly reduce the robustness of netdumps in the face of, for example, > mbuf starvation. Changing this will require very invasive changes to the network drivers. I know that the Intel drivers allocate their own mbufs for their receive rings and I imagine that all other drivers have to do something similar. Plus the drivers are responsible for freeing mbufs after they have been transmitted. It seems to me that the cost of making significant changes to the network drivers to support an alternate lifecycle for netdump mbufs far outweighs the cost of losing a couple of kernel dumps in extreme circumstances. ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: [PATCH] Netdump for review and testing -- preliminary version
On Wed, Oct 13, 2010 at 01:04:54PM +0200, Attilio Rao wrote: > 2010/10/9 Robert Watson : > > (1) Did you consider using tftp as the network dump protocol, rather than a > > custom protocol? ??It's also a simple UDP-based, ACKed file transfer > > protocol, with the advantage that it's widely supported by exiting tftp > > daemons, packet sniffers, security devices, etc. ??This wouldn't require > > using a stock tftpd, although that would certainly be a benefit as well. > > ??The post-processing of crashdumps that seems to happen in the netdump > > server could, presumably, happen "offline" as easily...? > > I don't understand the "offline" question but, really, I don't know > why tftp wasn't considered in the first place. > Let me do some small search and see how much we could benefit from it. I think Robert means having something other than the netdump server doing the post-processing - e.g., a cron job could look in the directory where a tftp server saved a core file and run a script to extract the desired information. TFTP has a number of advantages; I'm not sure if the original author of the netdump code considered it and rejected it for some reason. I think the original implementation used a broadcast packet to locate the server though, so a custom server was required in any case. That said, I don't think it matters too much if we go ahead with the current version and switch to TFTP later on. > > (2) Have you thought about adding a checksum to the dump format, since > > packets are going over the wire on UDP, etc? ??Even an MD5 sum wouldn't be > > too hard to arrange: all the code is in the kernel already, requires > > relatively little state storage, and is designed for streamed data. > > Someone else already brought this point and I agree, we could use a > checksum here. Both a whole dump and per-packet checksum could be valuable. If we want the former I think this should be done in the dump infrastructure, so that disk dumps get it too. The latter conflicts somewhat with a desire to switch to TFTP though (other than a standard UDP packet checksum). > > The sysctl parts of the patch have a number of issues: > > Sysctl are still not overhauled just because I'm not sure we want to > keep them. That is one of the points I raised in the main e-mail and > I'd really would like to hear opinions about if we should keep them > rather than having a setup userland process, etc. > I'm sorry about this, but please keep in mind that the file still > needs a lot of cleanup so some comments are a bit out of date and > other parts may not be still perfectly overhauled. Attilio suggested adding a userland tool to configure netdump in lieu of the current sysctls. I don't have a strong opinion either way; the only real advantage to the sysctl approach that I see is that the use of the mirrored loader tunables is more obvious. > > The idea of calling into the mbuf allocator in this context is just freaky, > > and may have some truly awful side effects. ??I suppose this is the cost of > > trying to combine code paths in the network device driver rather than have > > an independent path in the netdump case, but it's quite unfortunate and will > > significantly reduce the robustness of netdumps in the face of, for example, > > mbuf starvation. > > I'm not sure in which other way we could fix that actually. Maybe a > pre-allocated pool of mbufs? Despite the freakiness, I can only offer the observation that it has worked very well for us in practise. > > + ?? ?? ?? if (ntohs(ah->ar_hrd) != ARPHRD_ETHER && > > + ?? ?? ?? ?? ?? ntohs(ah->ar_hrd) != ARPHRD_IEEE802 && > > ... > > > > Are you sure you don't want to just check for ETHER here? > > I'd really like to hear Ryan's or Ed's idea here. Interesting, I can't find this in our local version. It looks like this check was copied from if_ether.c::arpintr(). > > + ?? ?? ?? /* XXX: Probably should check if we're the recipient MAC address > > */ > > + ?? ?? ?? /* Done ethernet processing. Strip off the ethernet header */ > > > > Yes, quite probably. ??What if you panic in promiscuous mode? Well, the packet would get dropped at the next layer up, in nd_handle_ip. -Ed ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: [PATCH] Netdump for review and testing -- preliminary version
2010/10/9 Robert Watson : > On Fri, 8 Oct 2010, Attilio Rao wrote: > >>> GENERAL FRAMEWORK ARCHITECTURE >>> >>> Netdump is composed, right now, by an userland "server" and a kernel >>> "client". The former is run on the target machine (where the dump will >>> phisically happen) and it is responsible for receiving the packets >>> containing coredumps frame and for correctly writing them on-disk. The >>> latter is part of the kernel installed on the source machine (where the dump >>> is initiated) and is responsible for building correctly UDP packets >>> containing the coredump frames, pushing through the network interface and >>> routing them appropriately. > > Hi Attilio: > > Network dumps would be a great addition to the FreeBSD debugging suite! A > few casual comments and questions, I need to spend some more time pondering > the implications of the current netdump design later in the week. > > (1) Did you consider using tftp as the network dump protocol, rather than a > custom protocol? It's also a simple UDP-based, ACKed file transfer > protocol, with the advantage that it's widely supported by exiting tftp > daemons, packet sniffers, security devices, etc. This wouldn't require > using a stock tftpd, although that would certainly be a benefit as well. > The post-processing of crashdumps that seems to happen in the netdump > server could, presumably, happen "offline" as easily...? I don't understand the "offline" question but, really, I don't know why tftp wasn't considered in the first place. Let me do some small search and see how much we could benefit from it. > (2) Have you thought about adding a checksum to the dump format, since > packets are going over the wire on UDP, etc? Even an MD5 sum wouldn't be > too hard to arrange: all the code is in the kernel already, requires > relatively little state storage, and is designed for streamed data. Someone else already brought this point and I agree, we could use a checksum here. > (3) As the bounds checking/etc behavior in dumping grows more complex, it > seems a shame to replicate it in architecture-specific code. Could we pull > the mediaoffset/mediasize checking into common code? Also, a reserved > size/offset of "0" meaning "no limit" sounds slightly worrying; it might be > slightly more conservative to add a flags field to dumperinfo and have a > flag indicating "size is no object" for netdump, rather than overloading the > existing fields. I don't agree with you here. The real problem is that dumpsys is highly disk-specific (I've commented about it somewhere, maybe the e-mail or maybe the commit logs). What we'd need is to have something like netdumpsys() which tries to use network, but it is not short to make and the mediasize+mediaoffset concept really rappresents an higher bound which can't be 0. I think it is a reasonable compromise so far but it is subjected to further improvements for sure. > Some specific patch comments: > > + * XXX: This should be split into machdep and non-machdep parts > > What MD parts are in the file? This is just a stale comment. > The sysctl parts of the patch have a number of issues: Sysctl are still not overhauled just because I'm not sure we want to keep them. That is one of the points I raised in the main e-mail and I'd really would like to hear opinions about if we should keep them rather than having a setup userland process, etc. I'm sorry about this, but please keep in mind that the file still needs a lot of cleanup so some comments are a bit out of date and other parts may not be still perfectly overhauled. > +sysctl_force_crash(SYSCTL_HANDLER_ARGS) > > Does this belong in the netdump code? We already have some of these options > in debug.kdb.*, but perhaps others should be added there as well. For this specific case, I'd really axe it out rather. > + /* > + * get and fill a header mbuf, then chain data as an > extended > + * mbuf. > + */ > + MGETHDR(m, M_DONTWAIT, MT_DATA); > > The idea of calling into the mbuf allocator in this context is just freaky, > and may have some truly awful side effects. I suppose this is the cost of > trying to combine code paths in the network device driver rather than have > an independent path in the netdump case, but it's quite unfortunate and will > significantly reduce the robustness of netdumps in the face of, for example, > mbuf starvation. I'm not sure in which other way we could fix that actually. Maybe a pre-allocated pool of mbufs? > + if (ntohs(ah->ar_hrd) != ARPHRD_ETHER && > + ntohs(ah->ar_hrd) != ARPHRD_IEEE802 && > + ntohs(ah->ar_hrd) != ARPHRD_ARCNET && > + ntohs(ah->ar_hrd) != ARPHRD_IEEE1394) { > + NETDDEBUG("nd_handle_arp: unknown hardware address fmt " > + "0x%2D)\n", (unsigned char *)&ah->ar_hrd, ""); > + return; > + } > > Are you sure you don't want to just check for ETHER her
Re: [PATCH] Netdump for review and testing -- preliminary version
On Sat, Oct 09, 2010 at 02:15:39AM +0100, Robert Watson wrote: > Network dumps would be a great addition to the FreeBSD debugging suite! > [...] It seems that at EuroBSDCon there was a discussion of Contiki[1] and the uIPv6 stack[2] that it contains, and I think something like this could be a great future enhancement for netdump and similar consumers. Ideally I'd like to see us get to having a clean, simple API for transmitting and receiving packets and a basic stack that could be shared by netdump, as well as a network-enabled ddb, gdb backend, and perhaps console. Using the uIPv6 stack for this would get us v6 support (as the name suggests), as well as a supported / maintained stack vs. what is in netdump now. It's also BSD-licensed. This could be a good future project, although I think the current netdump implementation is worth bringing in soon (taking into account the good feedback received to date, of course). [1] http://www.sics.se/contiki/about-contiki.html [2] http://www.sics.se/contiki/contiki-6lowpan-uipv6-faq.html - Ed ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: [PATCH] Netdump for review and testing -- preliminary version
On Fri, 8 Oct 2010, Attilio Rao wrote: GENERAL FRAMEWORK ARCHITECTURE Netdump is composed, right now, by an userland "server" and a kernel "client". The former is run on the target machine (where the dump will phisically happen) and it is responsible for receiving the packets containing coredumps frame and for correctly writing them on-disk. The latter is part of the kernel installed on the source machine (where the dump is initiated) and is responsible for building correctly UDP packets containing the coredump frames, pushing through the network interface and routing them appropriately. Hi Attilio: Network dumps would be a great addition to the FreeBSD debugging suite! A few casual comments and questions, I need to spend some more time pondering the implications of the current netdump design later in the week. (1) Did you consider using tftp as the network dump protocol, rather than a custom protocol? It's also a simple UDP-based, ACKed file transfer protocol, with the advantage that it's widely supported by exiting tftp daemons, packet sniffers, security devices, etc. This wouldn't require using a stock tftpd, although that would certainly be a benefit as well. The post-processing of crashdumps that seems to happen in the netdump server could, presumably, happen "offline" as easily...? (2) Have you thought about adding a checksum to the dump format, since packets are going over the wire on UDP, etc? Even an MD5 sum wouldn't be too hard to arrange: all the code is in the kernel already, requires relatively little state storage, and is designed for streamed data. (3) As the bounds checking/etc behavior in dumping grows more complex, it seems a shame to replicate it in architecture-specific code. Could we pull the mediaoffset/mediasize checking into common code? Also, a reserved size/offset of "0" meaning "no limit" sounds slightly worrying; it might be slightly more conservative to add a flags field to dumperinfo and have a flag indicating "size is no object" for netdump, rather than overloading the existing fields. Some specific patch comments: + * XXX: This should be split into machdep and non-machdep parts What MD parts are in the file? The sysctl parts of the patch have a number of issues: - Please use sysctl_handle_string rather than manual calls and string bounds checking with SYSCTL_IN/SYSCTL_OUT. In general, type-specific sysctl handlers are named sysctl_handle_, which I recommend here as well. - Please use stack-local, fixed-length string buffers as the source/destination of copyin/copyout on ifnet names, rather than the fields in the structure itself. We support interface renaming, so the field can change, and sysctl has the potential to block for extended periods mid-copyin/copyout. - Because we support ifnet renaming, "none" is probably not a good reserved name. Could you use the empty string or something similar, or just a flag to indicate that netdump is disabled? - "ifp" rather than "ifn" and "nic" is the preferred variable name for ifnet pointers throughout the kernel. - ifnets can, and are, removeable at runtime. We have a refcount, but that's not really what you want. I suggest simply looking up the ifnet by name when it's needed during a dump or for sysctl output, rather than maintaining a long-term pointer, which can become stale. Especially with the auto-detect code. - Since sysctl_nic is only used once, I'd prefer it were inlined into a use-specific handler function, avoiding the arg1/arg2 complications that make this code a bit harder to read. sysctl_ip is useful because it's used more than once, though. However, it also very much wants you to call sysctl_handle_string! +sysctl_force_crash(SYSCTL_HANDLER_ARGS) Does this belong in the netdump code? We already have some of these options in debug.kdb.*, but perhaps others should be added there as well. + /* +* get and fill a header mbuf, then chain data as an extended +* mbuf. +*/ + MGETHDR(m, M_DONTWAIT, MT_DATA); The idea of calling into the mbuf allocator in this context is just freaky, and may have some truly awful side effects. I suppose this is the cost of trying to combine code paths in the network device driver rather than have an independent path in the netdump case, but it's quite unfortunate and will significantly reduce the robustness of netdumps in the face of, for example, mbuf starvation. + if (ntohs(ah->ar_hrd) != ARPHRD_ETHER && + ntohs(ah->ar_hrd) != ARPHRD_IEEE802 && + ntohs(ah->ar_hrd) != ARPHRD_ARCNET && + ntohs(ah->ar_hrd) != ARPHRD_IEEE1394) { + NETDDEBUG("nd_handle_arp: unknown hardware address fmt " + "0x%2D)\n", (unsigned char *)&ah->ar_hrd, ""); + return; + } Are you sure you don't want to just check for ETHER here? + /* XXX: Prob
Re: [PATCH] Netdump for review and testing -- preliminary version
2010/9/28 Attilio Rao : > In the last weeks I worked for porting the netdump infrastructure to > FreeBSD-CURRENT on the behalf of Sandvine Incorporated. > Netdump is a framework that aims for handling kernel coredumps over > the TCP/IP suite in order to dump to a separate machine than the > running one. That may be used on an interesting number of cases > involving disk-less workstations, disk driver debugging or embedded > devices. > > GENERAL FRAMEWORK ARCHITECTURE > > Netdump is composed, right now, by an userland "server" and a kernel > "client". The former is run on the target machine (where the dump will > phisically happen) and it is responsible for receiving the packets > containing coredumps frame and for correctly writing them on-disk. > The latter is part of the kernel installed on the source machine > (where the dump is initiated) and is responsible for building > correctly UDP packets containing the coredump frames, pushing through > the network interface and routing them appropriately. > > While the server may appear as, pretty much, a simple userland deamon > dealing with UDP packets, the client imposes interesting problems as > long as its activity is linked to handling kernel core dumping. More > precisely, as long as the client is part of the dumping mechanism and > the kernel may be in general panic conditions, netdump must ensure > "robustness" of operations. That is partially achieved by reworking > (and someway replicating) locally some UDP, ARP and IP operations, > hardsetting some values (like the default gateway, the destination and > the client address) and reducing further interactions with the kernel > just to the network interface acitivities. > More specifically, it implements a very basic UDP / IPv4 / ARP stack > separate from the standard stack (since that may not be in a > consistent state). > It can dump to a server on the same LAN or via a router (correctly > specifying the connection gateway). > In order to receive packet on critical conditions, netdump polls the > interface. Every network driver can implement hooks to be used by > netdump independently by DEVICE_POLLING option, even if it is > probabilly a good idea to share some code among them. The reference > set of hooks is contained into "struct netdump_methods". > And if_lem/if_em driver modifies may be set as reference for netdump > hooks implementation. > > In order to work into an "up and running" system (meant as with all > the devices in place) the netdump handler hooks as a pre-sync handler > (differently from other dumping routines). It however suffers some > problems typical of other dumping mechanism. For example, on DDB > entering unlocked version of polling handler is used, in order to > reduce the risk of deadlocks during inspections*. That reflects, among > the netdump methods, the existence of 2 versions of polling hooks, > where the "unlocked" is meant as reducing locking as much as possible. > > PATCH AND FURTHER WORK > > The patch is not totally complete and it is not intended to be > committed in SVN yet. What I'm looking for now is more testing and > review (in particular in terms of architecture) coverage by community. > The server should be in realtively "committable" state, though, but I > encourage its stress-testing. A manpage is provided along that should > be very easy to understand how to use it. > > Things that can be further improved, as it is now, in the client, are: > - Deciding if hardcoding of the kernel parameter is done properly. I > personally don't like the sysctl usage and I would prefer an userland > small utility used to testing and maybe add some tunables for enabling > netdump early in the boot. You may have several opinions on this > though. > - VIMAGE and IPv6 support. > - More drivers support. Right now only if_em (and if_lem) are > converted to use netdump and can be used as a draft for other device > drivers. if_ixgb should came along in the final, committing, version > too. In general I think that all drivers supporting device polling > could easilly support also netdump > - Ideally dumpsys() in FreeBSD is too much disk-activity oriented. It > should be made, instead, more neutral and more flexible to cope better > with different interfaces. It is a quite a bit of work, however, and > beyond the scope of netdump introduction (even if it could be > beneficial for it) > > Netdump has been developed on a FreeBSD project branch located here: > svn://svn.freebsd.org/base/projects/sv/ > > which could also forsee further informations about every single > change. However, for your convenience, also a patch has been made > public which is located here (against freebsd-curr...@213246): > http://www.freebsd.org/~attilio/Sandvine/STABLE_8/netdump/netdump_alpha_0.diff This followup is made in order to signal the code has been further refined and some bugfixes (reported mostly by pluknet@ testing) have been fixed. More support for interfaces has been added (igb, ixgb, ixgbe). Y
Re: [PATCH] Netdump for review and testing -- preliminary version
The 1st error is caught :). Prior actions was triggering netdump, leaving db> and entering again, then trigger netdump once more. Dumping 1146 MB: 1131 1115 1099 1083 1067 1051 1035 1019 1003 987 971 955 939 923 907 891 875 859 843 827 811 795. . . . . . . . . . . ** DUMP FAILED (ERROR 60) ** Failed to dump the actual raw datas Then I was able to leave ddb back to the working system. P.S. What about to start netdumpsrv from rc.d ? It's very simple. %%% #!/bin/sh # # $FreeBSD$ # # PROVIDE: netdumpsrv # REQUIRE: NETWORKING # KEYWORD: shutdown . /etc/rc.subr name="netdumpsrv" rcvar=`set_rcvar` command="/usr/sbin/${name}" pidfile="/var/run/${name}.pid" load_rc_config $name run_rc_command "$1" %%% -- wbr, pluknet ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: [PATCH] Netdump for review and testing -- preliminary version
> From: Ryan Stone > Date: Wed, Sep 29, 2010 at 11:32 AM > Subject: Re: [PATCH] Netdump for review and testing -- preliminary version > To: Sergey Kandaurov > > >> db> netdump >> >> --- >> netdump in progress. searching for server.. . . . . . . . . . . . >> Failed to contact netdump server > > This can be fixed by putting: > > nd_server_port = NETDUMP_PORT; > > At the beginning of netdump_trigger. Hi. Thank you, that helped fixing it. 2nd and further netdumps work now. Some more numbers below [hosts are on diff. switches connected with STP on same 1Gb MTU1500 LAN] packets errs bytespackets errs bytes colls [avg:] 3488 04109874 4002 0 704800 0 [at the end:] 3484 04076816 3989 0 680710 0 3510 04086756 4066 0 712280 0 3552 04136112 4107 0 714790 0 3459 04059238 3995 0 681966 0 3499 04077516 4037 0 709734 0 3423 04004157 3936 0 678816 0 13898 0 19118779 14018 02062568 0 22206 0 31680314 21451 02959042 0 22410 0 32086046 21710 02854724 0 22231 0 31747238 21489 02899306 0 input (bce0) output packets errs bytespackets errs bytes colls 18490 0 26180288 17757 02729082 0 336 0 24674 90 0 622960 0 6 0 1148 1 0178 0 1687470 packets captured 2274879 packets received by filter 65676 packets dropped by kernel It shows 30Mbps in average with up to 224 Mbps at the end. The traffic picture is 1472 bytes UDP payload, and 4 bytes ack. It took slightly less then 5 mins to netdump 1146 MB. -- wbr, pluknet ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Fwd: [PATCH] Netdump for review and testing -- preliminary version
-- Forwarded message -- Replying to the list this time... From: Ryan Stone Date: Wed, Sep 29, 2010 at 11:32 AM Subject: Re: [PATCH] Netdump for review and testing -- preliminary version To: Sergey Kandaurov > db> netdump > > --- > netdump in progress. searching for server.. . . . . . . . . . . . > Failed to contact netdump server This can be fixed by putting: nd_server_port = NETDUMP_PORT; At the beginning of netdump_trigger. ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: [PATCH] Netdump for review and testing -- preliminary version
2010/9/29 Sergey Kandaurov : > [just don't know what namely need to test, so] > > All made according to your instructions. > The only way I could trigger netdump was > to run its ddb command by hand. Neither > debug.kdb.enter nor debug.kdb.panic don't do it. You probabilly need to use KDB_UNATTENDED. > Some numbers and output (bit verbose > but again don't know what need to show here) > Entered through debug.kdb.panic, thus Panicstring. > db> netdump > > --- > netdump in progress. searching for server.. dumping to xx.xxx.xx.xx > (00:1a:64:d3:4f:20) > --- > Physical memory: 2026 MB > Dumping 1154 MB: 1139 1123 1107 1091 1075 1059 1043 1027 1011 995 979 > 963 947 931 915 899 883 867 851 835 819 803 787 771 755 739 723 707 > 691 675 659 643 627 611 595 579 563 547 531 515 499 483 467 451 435 > 419 403 387 371 355 339 323 307 291 275 259 243 227 211 195 179 163 > 147 131 115 99 83 67 51 35 19 3 > Dump complete > > netdump finished. > cancelling normal dump > > # ls -hl /home/fooserv/crash/ > total 1182930 > -rw-r--r-- 1 root wheel 404B Sep 29 12:05 info.fooserv.0 > -rw-r--r-- 1 root wheel 1.1G Sep 29 12:05 vmcore.fooserv.0 > > Dump from fooserv [xx.xxx.xx.xx] > Architecture: amd64 > Architecture version: 2 > Dump length: 1210687488B (1154 MB) > blocksize: 8192 > Dumptime: Wed Sep 29 11:55:30 2010 > Hostname: fooserv > Versionstring: FreeBSD 9.0-CURRENT #51: Wed Sep 29 07:18:24 UTC 2010 > r...@fooserv:/usr/obj/usr/src/sys/GENERIC > Panicstring: kdb_sysctl_panic > Header parity check: Pass > Dump complete > > I tried to netdump a bit later again, but it couldn't find server, which > runs though, and client traffic was seen on server-side (w/o reply). > db> netdump > > --- > netdump in progress. searching for server.. . . . . . . . . . . . > Failed to contact netdump server Could you compile the kernel with NETDUMP_CLIENT_DEBUG and retry? (Warning: the log may be huge, so you may be needing to log with serial line) I will try to reproduce locally, if anything. Thanks, Attilio > > As for netdumpserv, > it misses #include before __FBSDID(); > I happily tested it on 7.3-amd64. Yes, there are also other small style(9) nits that needs to be addressed in the netdumpserver as well. Thanks, Attilio -- Peace can only be achieved by understanding - A. Einstein ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: [PATCH] Netdump for review and testing -- preliminary version
on 28/09/2010 20:39 Attilio Rao said the following: > In order to work into an "up and running" system (meant as with all > the devices in place) the netdump handler hooks as a pre-sync handler > (differently from other dumping routines). It however suffers some I actually like this idea. I think that regular dump should also be done at that time. Shouldn't we try to dump memory as early as possible, so that as little of it as possible is modified? (Not that I like sync-on-panic option at all) > problems typical of other dumping mechanism. For example, on DDB > entering unlocked version of polling handler is used, in order to > reduce the risk of deadlocks during inspections*. That reflects, among > the netdump methods, the existence of 2 versions of polling hooks, > where the "unlocked" is meant as reducing locking as much as possible. -- Andriy Gapon ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: [PATCH] Netdump for review and testing -- preliminary version
[just don't know what namely need to test, so] All made according to your instructions. The only way I could trigger netdump was to run its ddb command by hand. Neither debug.kdb.enter nor debug.kdb.panic don't do it. Some numbers and output (bit verbose but again don't know what need to show here) Entered through debug.kdb.panic, thus Panicstring. db> netdump --- netdump in progress. searching for server.. dumping to xx.xxx.xx.xx (00:1a:64:d3:4f:20) --- Physical memory: 2026 MB Dumping 1154 MB: 1139 1123 1107 1091 1075 1059 1043 1027 1011 995 979 963 947 931 915 899 883 867 851 835 819 803 787 771 755 739 723 707 691 675 659 643 627 611 595 579 563 547 531 515 499 483 467 451 435 419 403 387 371 355 339 323 307 291 275 259 243 227 211 195 179 163 147 131 115 99 83 67 51 35 19 3 Dump complete netdump finished. cancelling normal dump # ls -hl /home/fooserv/crash/ total 1182930 -rw-r--r-- 1 root wheel 404B Sep 29 12:05 info.fooserv.0 -rw-r--r-- 1 root wheel 1.1G Sep 29 12:05 vmcore.fooserv.0 Dump from fooserv [xx.xxx.xx.xx] Architecture: amd64 Architecture version: 2 Dump length: 1210687488B (1154 MB) blocksize: 8192 Dumptime: Wed Sep 29 11:55:30 2010 Hostname: fooserv Versionstring: FreeBSD 9.0-CURRENT #51: Wed Sep 29 07:18:24 UTC 2010 r...@fooserv:/usr/obj/usr/src/sys/GENERIC Panicstring: kdb_sysctl_panic Header parity check: Pass Dump complete I tried to netdump a bit later again, but it couldn't find server, which runs though, and client traffic was seen on server-side (w/o reply). db> netdump --- netdump in progress. searching for server.. . . . . . . . . . . . Failed to contact netdump server As for netdumpserv, it misses #include before __FBSDID(); I happily tested it on 7.3-amd64. -- wbr, pluknet ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
[PATCH] Netdump for review and testing -- preliminary version
In the last weeks I worked for porting the netdump infrastructure to FreeBSD-CURRENT on the behalf of Sandvine Incorporated. Netdump is a framework that aims for handling kernel coredumps over the TCP/IP suite in order to dump to a separate machine than the running one. That may be used on an interesting number of cases involving disk-less workstations, disk driver debugging or embedded devices. GENERAL FRAMEWORK ARCHITECTURE Netdump is composed, right now, by an userland "server" and a kernel "client". The former is run on the target machine (where the dump will phisically happen) and it is responsible for receiving the packets containing coredumps frame and for correctly writing them on-disk. The latter is part of the kernel installed on the source machine (where the dump is initiated) and is responsible for building correctly UDP packets containing the coredump frames, pushing through the network interface and routing them appropriately. While the server may appear as, pretty much, a simple userland deamon dealing with UDP packets, the client imposes interesting problems as long as its activity is linked to handling kernel core dumping. More precisely, as long as the client is part of the dumping mechanism and the kernel may be in general panic conditions, netdump must ensure "robustness" of operations. That is partially achieved by reworking (and someway replicating) locally some UDP, ARP and IP operations, hardsetting some values (like the default gateway, the destination and the client address) and reducing further interactions with the kernel just to the network interface acitivities. More specifically, it implements a very basic UDP / IPv4 / ARP stack separate from the standard stack (since that may not be in a consistent state). It can dump to a server on the same LAN or via a router (correctly specifying the connection gateway). In order to receive packet on critical conditions, netdump polls the interface. Every network driver can implement hooks to be used by netdump independently by DEVICE_POLLING option, even if it is probabilly a good idea to share some code among them. The reference set of hooks is contained into "struct netdump_methods". And if_lem/if_em driver modifies may be set as reference for netdump hooks implementation. In order to work into an "up and running" system (meant as with all the devices in place) the netdump handler hooks as a pre-sync handler (differently from other dumping routines). It however suffers some problems typical of other dumping mechanism. For example, on DDB entering unlocked version of polling handler is used, in order to reduce the risk of deadlocks during inspections*. That reflects, among the netdump methods, the existence of 2 versions of polling hooks, where the "unlocked" is meant as reducing locking as much as possible. PATCH AND FURTHER WORK The patch is not totally complete and it is not intended to be committed in SVN yet. What I'm looking for now is more testing and review (in particular in terms of architecture) coverage by community. The server should be in realtively "committable" state, though, but I encourage its stress-testing. A manpage is provided along that should be very easy to understand how to use it. Things that can be further improved, as it is now, in the client, are: - Deciding if hardcoding of the kernel parameter is done properly. I personally don't like the sysctl usage and I would prefer an userland small utility used to testing and maybe add some tunables for enabling netdump early in the boot. You may have several opinions on this though. - VIMAGE and IPv6 support. - More drivers support. Right now only if_em (and if_lem) are converted to use netdump and can be used as a draft for other device drivers. if_ixgb should came along in the final, committing, version too. In general I think that all drivers supporting device polling could easilly support also netdump - Ideally dumpsys() in FreeBSD is too much disk-activity oriented. It should be made, instead, more neutral and more flexible to cope better with different interfaces. It is a quite a bit of work, however, and beyond the scope of netdump introduction (even if it could be beneficial for it) Netdump has been developed on a FreeBSD project branch located here: svn://svn.freebsd.org/base/projects/sv/ which could also forsee further informations about every single change. However, for your convenience, also a patch has been made public which is located here (against freebsd-curr...@213246): http://www.freebsd.org/~attilio/Sandvine/STABLE_8/netdump/netdump_alpha_0.diff In order to enable the client it is enough to add at your kernel config: optionsNETDUMP_CLIENT NETDUMP_CLIENT_DEBUG can be specified too in order to have further debuggin traces but I'd encourage to use them just for developing Netdump itself. TYPICAL USAGE This is a set of the available sysctls for netdump: # sysctl -d net.dump net.dump: netdump net.dump.enable: enable network dump n