Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-17 Thread Bruce Evans

On Fri, 15 Oct 2010, Robert N. M. Watson wrote:


On 15 Oct 2010, at 20:39, Garrett Cooper wrote:


   But there are already some cases that aren't properly handled
today in the ddb area dealing with dumping that aren't handled
properly. Take for instance the following two scenarios:
1. Call doadump twice from the debugger.
2. Call doadump, exit the debugger, reenter the debugger, and call
doadump again.
   Both of these scenarios hang reliably for me.
   I'm not saying that we should regress things further, but I'm just
noting that there are most likely a chunk of edgecases that aren't
being handled properly when doing dumps that could be handled better /
fixed.


Even thinking about calling doadump even once from within the debugger is
an error.  I was asleep when the similar error for panic was committed,
and this error has propagated.  Debuggers should use a trampoline to
call the "any" function, not the least so that they can be used to debug
the "any" function without the extra complications to make themself
reentrant.  I think gdb has always used a trampoline for this outside of
the kernel.  Not sure what it does within the kernel, but it would have
even larger problems than in userland finding a place for the trampoline.
In the kernel, there is the additional problem of keeping control while
the "any" function is run.  Other CPUs must be kept stopped and interrupts
must be kept masked, except when the "any" function really needs other CPUs
or unmasked interrupts.  Single stepping also needs this and doesn't have
it (other CPUs and interrupt handlers can run and execute any number of
instructions while you are trying to execute a single one).  All ddb
"commands" that change the system state are really non-ddb commands that
should use an external function via a trampoline.  Panicing and dumping
are just the largest ones, so they are the most impossible to do correctly
as commands and the most in need of ddb to debug them.


Right: one of the points I've made to Attilio is that we need to move to a more 
principled model as to what sorts of things we allow in various kernel 
environments. The early boot is a special environment -- so is the debugger, 
but the debugger on panic is not the same as the debugger when you can 
continue. Likewise, the crash dumping code is special, but also not the same as 
the debugger. Right now, exceptional behaviour to limit hangs/etc is done 
inconsistently. We need to develop a set of principles that tell us what is 
permitted in what contexts, and then use that to drive design decisions, 
normalizing what's there already.


ENONUNIXEDITOR.  Format not recovered.

panic() from within a debugger (or a fast interrupt handler, or a fast
interrupt handler that has trappeded to the debugger by request...) is,
although an error, not too bad since panic() must be prepared to work
starting from the "any" state anyway, and as you mention it doesn'tneed
to be able to return (except for RESTARTABLE_PANICS, which makes things
impossibly difficult).  Continuing from a debugger is feasible mainly
because in the usual case the system state is not changed (except for
time-dependent things).  If you use it to modify memory or i/o or run
one of its unsafe commands then you have to be careful.


This is not dissimilar to what we do with locking already, BTW: we define a set 
of kernel environments (fast interrupt handlers, non-sleepable threads, 
sleepable thread holding non-sleepable locks, etc), and based on those 
principles prevent significant sources of instability that might otherwise 
arise in a complex, concurrent kernel. We need to apply the same sort of 
approach to handling kernel debugging and crashing.


Locking has imposed considerable discipline, which if followed by panic()
would should how wrong most of the things done by panic() are -- it will
hit locks, but shouldn't even be calling functions that have locks, since
such functions expect their locks to work.

The rules for fast interrupt handlers are simple and mostly not followed.
They are that a fast interrupt handler may not access any state not
specially locked by its subsystem.  This means that they may not call
any other subsystem or any upper layer except the null set of ones
documented to be safe to call.  In practice, this means not calling the
"any" function, but it is necessary for atomic ops, bus space accesses,
and a couple of scheduling functions to be safe enough.


BTW, my view is that except in very exceptional cases, it should not be 
possible to continue after generating a dump. Dumps often cause disk 
controllers to get reset, which may leave outstanding I/O in nasty situations. 
Unless the dump device and model is known not to interfere with operation, we 
should set state indicating that the system is non-continuable once a dump has 
occurred.


It might be safe if the system reinitialized everything.  Too hard for just
dumping, but it is needed after resume anyway.  So the following could
reason

Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-15 Thread Robert N. M. Watson

On 15 Oct 2010, at 20:39, Garrett Cooper wrote:

>But there are already some cases that aren't properly handled
> today in the ddb area dealing with dumping that aren't handled
> properly. Take for instance the following two scenarios:
> 1. Call doadump twice from the debugger.
> 2. Call doadump, exit the debugger, reenter the debugger, and call
> doadump again.
>Both of these scenarios hang reliably for me.
>I'm not saying that we should regress things further, but I'm just
> noting that there are most likely a chunk of edgecases that aren't
> being handled properly when doing dumps that could be handled better /
> fixed.

Right: one of the points I've made to Attilio is that we need to move to a more 
principled model as to what sorts of things we allow in various kernel 
environments. The early boot is a special environment -- so is the debugger, 
but the debugger on panic is not the same as the debugger when you can 
continue. Likewise, the crash dumping code is special, but also not the same as 
the debugger. Right now, exceptional behaviour to limit hangs/etc is done 
inconsistently. We need to develop a set of principles that tell us what is 
permitted in what contexts, and then use that to drive design decisions, 
normalizing what's there already.

This is not dissimilar to what we do with locking already, BTW: we define a set 
of kernel environments (fast interrupt handlers, non-sleepable threads, 
sleepable thread holding non-sleepable locks, etc), and based on those 
principles prevent significant sources of instability that might otherwise 
arise in a complex, concurrent kernel. We need to apply the same sort of 
approach to handling kernel debugging and crashing.

BTW, my view is that except in very exceptional cases, it should not be 
possible to continue after generating a dump. Dumps often cause disk 
controllers to get reset, which may leave outstanding I/O in nasty situations. 
Unless the dump device and model is known not to interfere with operation, we 
should set state indicating that the system is non-continuable once a dump has 
occurred.

Robert

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-15 Thread Garrett Cooper
On Fri, Oct 15, 2010 at 12:25 PM, Robert Watson  wrote:
> On Thu, 14 Oct 2010, Attilio Rao wrote:
>
>>> No, what I'm saying is: UMA needs to not call its drain handlers, and
>>> ideally not call into VM to fill slabs, from the dumping context. That's
>>> easy to implement and will cause the dump to fail rather than causing the
>>> system to hang.
>>
>> My point is, however, still the same: that should not happen just for the
>> netdump specific case but for all the dumping/KDB/panic cases (I know it is
>> unlikely current code !netdump calls into UMA but it is not an established
>> pre-requisite and may still happen that some added code does). I still see
>> this as a weakness on the infrastructure, independently from netdump. I can
>> see that your point is that it is vital to netdump correct behaviour though,
>> so I'd wonder if it worths fixing it now or later.
>
> Quite a bit of our kernel and dumping infrastructure special cases debugging
> and dumping behavior to avoid sources of non-robustness.  For example,
> serial drivers avoid locking, and for disk dumps we bypass GEOM to avoid the
> memory allocation, freeing, and threading that it depends on.
>
> The goal here is to be robust when handling dumps: hanging is worse than not
> dumping, since you won't get the dump either way, and if you don't reboot
> then the system requires manual intervention to recover.  Example of things
> that are critical to avoid include:
>
> - The dumping thread tripping over locks held by the panicked thread, or by
>  another now-suspended thread, leading to deadlock against a suspended
>  thread.
>
> - Corrupting dumps by increasing concurrency in the panic case.  We ran into
> a
>  case a year or two ago where changing VM state during the dump on amd64
>  caused file system corruption as the dump code assumed that the space
>  required for a dump didn't change while dumping took place.
>
> Any code dependency we add in the panic / KDB / dump path is one more risk
> that we don't successfully dump and reboot, so we need to minimize that
> code.

But there are already some cases that aren't properly handled
today in the ddb area dealing with dumping that aren't handled
properly. Take for instance the following two scenarios:
1. Call doadump twice from the debugger.
2. Call doadump, exit the debugger, reenter the debugger, and call
doadump again.
Both of these scenarios hang reliably for me.
I'm not saying that we should regress things further, but I'm just
noting that there are most likely a chunk of edgecases that aren't
being handled properly when doing dumps that could be handled better /
fixed.
Thanks,
-Garrett
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-15 Thread Robert Watson

On Thu, 14 Oct 2010, Attilio Rao wrote:

No, what I'm saying is: UMA needs to not call its drain handlers, and 
ideally not call into VM to fill slabs, from the dumping context. That's 
easy to implement and will cause the dump to fail rather than causing the 
system to hang.


My point is, however, still the same: that should not happen just for the 
netdump specific case but for all the dumping/KDB/panic cases (I know it is 
unlikely current code !netdump calls into UMA but it is not an established 
pre-requisite and may still happen that some added code does). I still see 
this as a weakness on the infrastructure, independently from netdump. I can 
see that your point is that it is vital to netdump correct behaviour though, 
so I'd wonder if it worths fixing it now or later.


Quite a bit of our kernel and dumping infrastructure special cases debugging 
and dumping behavior to avoid sources of non-robustness.  For example, serial 
drivers avoid locking, and for disk dumps we bypass GEOM to avoid the memory 
allocation, freeing, and threading that it depends on.


The goal here is to be robust when handling dumps: hanging is worse than not 
dumping, since you won't get the dump either way, and if you don't reboot then 
the system requires manual intervention to recover.  Example of things that 
are critical to avoid include:


- The dumping thread tripping over locks held by the panicked thread, or by
  another now-suspended thread, leading to deadlock against a suspended
  thread.

- Corrupting dumps by increasing concurrency in the panic case.  We ran into a
  case a year or two ago where changing VM state during the dump on amd64
  caused file system corruption as the dump code assumed that the space
  required for a dump didn't change while dumping took place.

Any code dependency we add in the panic / KDB / dump path is one more risk 
that we don't successfully dump and reboot, so we need to minimize that code.


Robert
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-14 Thread Attilio Rao
2010/10/14 Robert N. M. Watson :
>
> On 14 Oct 2010, at 15:10, Attilio Rao wrote:
>
>>> My concern is less about occasional lost dumps that destabilising the 
>>> dumping process: calls into the memory allocator can currently trigger a 
>>> lot of interesting behaviours, such as further calls back into the VM 
>>> system, which can then trigger calls into other subsystems. What I'm 
>>> suggesting is that if we want the mbuf allocator to be useful in this 
>>> context, we need to teach it about things not to do in the dumping / crash 
>>> / ... context, which probably means helping uma out a bit in that regard. 
>>> And a watchdog to make sure the dump is making progress.
>>
>> I think that this would be way too complicated just to cope with panic
>> within the VM/UMA (not sure what other subsystems you are referring
>> to, wrt supposed to call). Besides, if we have a panic in the VM I'm
>> sure that normal dumps could also be affected.
>> When dealing with netdump, I'm not trying to fix all the bugs related
>> to our dumping infrastructure because, as long as we already
>> discussed, we know there are quite a few of them, but trying at least
>> to follow the same fragile-ness than what we have today.
>> And again, while I think the "watchdog" idea is good, I think it still
>> applies to normal dumps too, it is not specific to netdump.
>
> No, what I'm saying is: UMA needs to not call its drain handlers, and ideally 
> not call into VM to fill slabs, from the dumping context. That's easy to 
> implement and will cause the dump to fail rather than causing the system to 
> hang.

Ok.
My point is, however, still the same: that should not happen just for
the netdump specific case but for all the dumping/KDB/panic cases (I
know it is unlikely current code !netdump calls into UMA but it is not
an established pre-requisite and may still happen that some added code
does).
I still see this as a weakness on the infrastructure, independently
from netdump. I can see that your point is that it is vital to netdump
correct behaviour though, so I'd wonder if it worths fixing it now or
later.

More people's comment would be appreciated.

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-14 Thread Robert N. M. Watson

On 14 Oct 2010, at 15:10, Attilio Rao wrote:

>> My concern is less about occasional lost dumps that destabilising the 
>> dumping process: calls into the memory allocator can currently trigger a lot 
>> of interesting behaviours, such as further calls back into the VM system, 
>> which can then trigger calls into other subsystems. What I'm suggesting is 
>> that if we want the mbuf allocator to be useful in this context, we need to 
>> teach it about things not to do in the dumping / crash / ... context, which 
>> probably means helping uma out a bit in that regard. And a watchdog to make 
>> sure the dump is making progress.
> 
> I think that this would be way too complicated just to cope with panic
> within the VM/UMA (not sure what other subsystems you are referring
> to, wrt supposed to call). Besides, if we have a panic in the VM I'm
> sure that normal dumps could also be affected.
> When dealing with netdump, I'm not trying to fix all the bugs related
> to our dumping infrastructure because, as long as we already
> discussed, we know there are quite a few of them, but trying at least
> to follow the same fragile-ness than what we have today.
> And again, while I think the "watchdog" idea is good, I think it still
> applies to normal dumps too, it is not specific to netdump.

No, what I'm saying is: UMA needs to not call its drain handlers, and ideally 
not call into VM to fill slabs, from the dumping context. That's easy to 
implement and will cause the dump to fail rather than causing the system to 
hang.

Robert___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-14 Thread Attilio Rao
2010/10/14 Robert N. M. Watson :
>
> On 13 Oct 2010, at 18:46, Ryan Stone wrote:
>
>> On Fri, Oct 8, 2010 at 9:15 PM, Robert Watson  wrote:
>>> +               /*
>>> +                * get and fill a header mbuf, then chain data as an
>>> extended
>>> +                * mbuf.
>>> +                */
>>> +               MGETHDR(m, M_DONTWAIT, MT_DATA);
>>>
>>> The idea of calling into the mbuf allocator in this context is just freaky,
>>> and may have some truly awful side effects.  I suppose this is the cost of
>>> trying to combine code paths in the network device driver rather than have
>>> an independent path in the netdump case, but it's quite unfortunate and will
>>> significantly reduce the robustness of netdumps in the face of, for example,
>>> mbuf starvation.
>>
>> Changing this will require very invasive changes to the network
>> drivers.  I know that the Intel drivers allocate their own mbufs for
>> their receive rings and I imagine that all other drivers have to do
>> something similar.  Plus the drivers are responsible for freeing mbufs
>> after they have been transmitted.  It seems to me that the cost of
>> making significant changes to the network drivers to support an
>> alternate lifecycle for netdump mbufs far outweighs the cost of losing
>> a couple of kernel dumps in extreme circumstances.
>
> My concern is less about occasional lost dumps that destabilising the dumping 
> process: calls into the memory allocator can currently trigger a lot of 
> interesting behaviours, such as further calls back into the VM system, which 
> can then trigger calls into other subsystems. What I'm suggesting is that if 
> we want the mbuf allocator to be useful in this context, we need to teach it 
> about things not to do in the dumping / crash / ... context, which probably 
> means helping uma out a bit in that regard. And a watchdog to make sure the 
> dump is making progress.

I think that this would be way too complicated just to cope with panic
within the VM/UMA (not sure what other subsystems you are referring
to, wrt supposed to call). Besides, if we have a panic in the VM I'm
sure that normal dumps could also be affected.
When dealing with netdump, I'm not trying to fix all the bugs related
to our dumping infrastructure because, as long as we already
discussed, we know there are quite a few of them, but trying at least
to follow the same fragile-ness than what we have today.
And again, while I think the "watchdog" idea is good, I think it still
applies to normal dumps too, it is not specific to netdump.

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-14 Thread Robert N. M. Watson

On 13 Oct 2010, at 18:46, Ryan Stone wrote:

> On Fri, Oct 8, 2010 at 9:15 PM, Robert Watson  wrote:
>> +   /*
>> +* get and fill a header mbuf, then chain data as an
>> extended
>> +* mbuf.
>> +*/
>> +   MGETHDR(m, M_DONTWAIT, MT_DATA);
>> 
>> The idea of calling into the mbuf allocator in this context is just freaky,
>> and may have some truly awful side effects.  I suppose this is the cost of
>> trying to combine code paths in the network device driver rather than have
>> an independent path in the netdump case, but it's quite unfortunate and will
>> significantly reduce the robustness of netdumps in the face of, for example,
>> mbuf starvation.
> 
> Changing this will require very invasive changes to the network
> drivers.  I know that the Intel drivers allocate their own mbufs for
> their receive rings and I imagine that all other drivers have to do
> something similar.  Plus the drivers are responsible for freeing mbufs
> after they have been transmitted.  It seems to me that the cost of
> making significant changes to the network drivers to support an
> alternate lifecycle for netdump mbufs far outweighs the cost of losing
> a couple of kernel dumps in extreme circumstances.

My concern is less about occasional lost dumps that destabilising the dumping 
process: calls into the memory allocator can currently trigger a lot of 
interesting behaviours, such as further calls back into the VM system, which 
can then trigger calls into other subsystems. What I'm suggesting is that if we 
want the mbuf allocator to be useful in this context, we need to teach it about 
things not to do in the dumping / crash / ... context, which probably means 
helping uma out a bit in that regard. And a watchdog to make sure the dump is 
making progress.

Robert___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-13 Thread Ryan Stone
On Fri, Oct 8, 2010 at 9:15 PM, Robert Watson  wrote:
> +   /*
> +* get and fill a header mbuf, then chain data as an
> extended
> +* mbuf.
> +*/
> +   MGETHDR(m, M_DONTWAIT, MT_DATA);
>
> The idea of calling into the mbuf allocator in this context is just freaky,
> and may have some truly awful side effects.  I suppose this is the cost of
> trying to combine code paths in the network device driver rather than have
> an independent path in the netdump case, but it's quite unfortunate and will
> significantly reduce the robustness of netdumps in the face of, for example,
> mbuf starvation.

Changing this will require very invasive changes to the network
drivers.  I know that the Intel drivers allocate their own mbufs for
their receive rings and I imagine that all other drivers have to do
something similar.  Plus the drivers are responsible for freeing mbufs
after they have been transmitted.  It seems to me that the cost of
making significant changes to the network drivers to support an
alternate lifecycle for netdump mbufs far outweighs the cost of losing
a couple of kernel dumps in extreme circumstances.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-13 Thread Ed Maste
On Wed, Oct 13, 2010 at 01:04:54PM +0200, Attilio Rao wrote:

> 2010/10/9 Robert Watson :

> > (1) Did you consider using tftp as the network dump protocol, rather than a
> > custom protocol? ??It's also a simple UDP-based, ACKed file transfer
> > protocol, with the advantage that it's widely supported by exiting tftp
> > daemons, packet sniffers, security devices, etc. ??This wouldn't require
> > using a stock tftpd, although that would certainly be a benefit as well.
> > ??The post-processing of crashdumps that seems to happen in the netdump
> > server could, presumably, happen "offline" as easily...?
> 
> I don't understand the "offline" question but, really, I don't know
> why tftp wasn't considered in the first place.
> Let me do some small search and see how much we could benefit from it.

I think Robert means having something other than the netdump server
doing the post-processing - e.g., a cron job could look in the directory
where a tftp server saved a core file and run a script to extract the
desired information.

TFTP has a number of advantages; I'm not sure if the original author of
the netdump code considered it and rejected it for some reason.  I think
the original implementation used a broadcast packet to locate the server
though, so a custom server was required in any case.

That said, I don't think it matters too much if we go ahead with the
current version and switch to TFTP later on.

> > (2) Have you thought about adding a checksum to the dump format, since
> > packets are going over the wire on UDP, etc? ??Even an MD5 sum wouldn't be
> > too hard to arrange: all the code is in the kernel already, requires
> > relatively little state storage, and is designed for streamed data.
> 
> Someone else already brought this point and I agree, we could use a
> checksum here.

Both a whole dump and per-packet checksum could be valuable.  If we want
the former I think this should be done in the dump infrastructure, so
that disk dumps get it too.  The latter conflicts somewhat with a desire
to switch to TFTP though (other than a standard UDP packet checksum).

> > The sysctl parts of the patch have a number of issues:
> 
> Sysctl are still not overhauled just because I'm not sure we want to
> keep them. That is one of the points I raised in the main e-mail and
> I'd really would like to hear opinions about if we should keep them
> rather than having a setup userland process, etc.
> I'm sorry about this, but please keep in mind that the file still
> needs a lot of cleanup so some comments are a bit out of date and
> other parts may not be still perfectly overhauled.

Attilio suggested adding a userland tool to configure netdump in lieu of
the current sysctls.  I don't have a strong opinion either way; the only
real advantage to the sysctl approach that I see is that the use of the
mirrored loader tunables is more obvious.

> > The idea of calling into the mbuf allocator in this context is just freaky,
> > and may have some truly awful side effects. ??I suppose this is the cost of
> > trying to combine code paths in the network device driver rather than have
> > an independent path in the netdump case, but it's quite unfortunate and will
> > significantly reduce the robustness of netdumps in the face of, for example,
> > mbuf starvation.
> 
> I'm not sure in which other way we could fix that actually. Maybe a
> pre-allocated pool of mbufs?

Despite the freakiness, I can only offer the observation that it has
worked very well for us in practise.

> > + ?? ?? ?? if (ntohs(ah->ar_hrd) != ARPHRD_ETHER &&
> > + ?? ?? ?? ?? ?? ntohs(ah->ar_hrd) != ARPHRD_IEEE802 &&
> > ...
> >
> > Are you sure you don't want to just check for ETHER here?
> 
> I'd really like to hear Ryan's or Ed's idea here.

Interesting, I can't find this in our local version.  It looks like this
check was copied from if_ether.c::arpintr().

> > + ?? ?? ?? /* XXX: Probably should check if we're the recipient MAC address 
> > */
> > + ?? ?? ?? /* Done ethernet processing. Strip off the ethernet header */
> >
> > Yes, quite probably. ??What if you panic in promiscuous mode?

Well, the packet would get dropped at the next layer up, in nd_handle_ip.

-Ed
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-13 Thread Attilio Rao
2010/10/9 Robert Watson :
> On Fri, 8 Oct 2010, Attilio Rao wrote:
>
>>> GENERAL FRAMEWORK ARCHITECTURE
>>>
>>> Netdump is composed, right now, by an userland "server" and a kernel
>>> "client". The former is run on the target machine (where the dump will
>>> phisically happen) and it is responsible for receiving  the packets
>>> containing coredumps frame and for correctly writing them on-disk. The
>>> latter is part of the kernel installed on the source machine (where the dump
>>> is initiated) and is responsible for building correctly UDP packets
>>> containing the coredump frames, pushing through the network interface and
>>> routing them appropriately.
>
> Hi Attilio:
>
> Network dumps would be a great addition to the FreeBSD debugging suite!  A
> few casual comments and questions, I need to spend some more time pondering
> the implications of the current netdump design later in the week.
>
> (1) Did you consider using tftp as the network dump protocol, rather than a
> custom protocol?  It's also a simple UDP-based, ACKed file transfer
> protocol, with the advantage that it's widely supported by exiting tftp
> daemons, packet sniffers, security devices, etc.  This wouldn't require
> using a stock tftpd, although that would certainly be a benefit as well.
>  The post-processing of crashdumps that seems to happen in the netdump
> server could, presumably, happen "offline" as easily...?

I don't understand the "offline" question but, really, I don't know
why tftp wasn't considered in the first place.
Let me do some small search and see how much we could benefit from it.

> (2) Have you thought about adding a checksum to the dump format, since
> packets are going over the wire on UDP, etc?  Even an MD5 sum wouldn't be
> too hard to arrange: all the code is in the kernel already, requires
> relatively little state storage, and is designed for streamed data.

Someone else already brought this point and I agree, we could use a
checksum here.

> (3) As the bounds checking/etc behavior in dumping grows more complex, it
> seems a shame to replicate it in architecture-specific code.  Could we pull
> the mediaoffset/mediasize checking into common code?  Also, a reserved
> size/offset of "0" meaning "no limit" sounds slightly worrying; it might be
> slightly more conservative to add a flags field to dumperinfo and have a
> flag indicating "size is no object" for netdump, rather than overloading the
> existing fields.

I don't agree with you here.
The real problem is that dumpsys is highly disk-specific (I've
commented about it somewhere, maybe the e-mail or maybe the commit
logs).
What we'd need is to have something like netdumpsys() which tries to
use network, but it is not short to make and the mediasize+mediaoffset
concept really rappresents an higher bound which can't be 0. I think
it is a reasonable compromise so far but it is subjected to further
improvements for sure.

> Some specific patch comments:
>
> + * XXX: This should be split into machdep and non-machdep parts
>
> What MD parts are in the file?

This is just a stale comment.

> The sysctl parts of the patch have a number of issues:

Sysctl are still not overhauled just because I'm not sure we want to
keep them. That is one of the points I raised in the main e-mail and
I'd really would like to hear opinions about if we should keep them
rather than having a setup userland process, etc.
I'm sorry about this, but please keep in mind that the file still
needs a lot of cleanup so some comments are a bit out of date and
other parts may not be still perfectly overhauled.

> +sysctl_force_crash(SYSCTL_HANDLER_ARGS)
>
> Does this belong in the netdump code?  We already have some of these options
> in debug.kdb.*, but perhaps others should be added there as well.

For this specific case, I'd really axe it out rather.

> +               /*
> +                * get and fill a header mbuf, then chain data as an
> extended
> +                * mbuf.
> +                */
> +               MGETHDR(m, M_DONTWAIT, MT_DATA);
>
> The idea of calling into the mbuf allocator in this context is just freaky,
> and may have some truly awful side effects.  I suppose this is the cost of
> trying to combine code paths in the network device driver rather than have
> an independent path in the netdump case, but it's quite unfortunate and will
> significantly reduce the robustness of netdumps in the face of, for example,
> mbuf starvation.

I'm not sure in which other way we could fix that actually. Maybe a
pre-allocated pool of mbufs?

> +       if (ntohs(ah->ar_hrd) != ARPHRD_ETHER &&
> +           ntohs(ah->ar_hrd) != ARPHRD_IEEE802 &&
> +           ntohs(ah->ar_hrd) != ARPHRD_ARCNET &&
> +           ntohs(ah->ar_hrd) != ARPHRD_IEEE1394) {
> +               NETDDEBUG("nd_handle_arp: unknown hardware address fmt "
> +                   "0x%2D)\n", (unsigned char *)&ah->ar_hrd, "");
> +               return;
> +       }
>
> Are you sure you don't want to just check for ETHER her

Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-10 Thread Ed Maste
On Sat, Oct 09, 2010 at 02:15:39AM +0100, Robert Watson wrote:

> Network dumps would be a great addition to the FreeBSD debugging suite!
> [...]

It seems that at EuroBSDCon there was a discussion of Contiki[1] and
the uIPv6 stack[2] that it contains, and I think something like this
could be a great future enhancement for netdump and similar consumers.

Ideally I'd like to see us get to having a clean, simple API for
transmitting and receiving packets and a basic stack that could be
shared by netdump, as well as a network-enabled ddb, gdb backend, and
perhaps console.  Using the uIPv6 stack for this would get us v6 support
(as the name suggests), as well as a supported / maintained stack vs.
what is in netdump now.  It's also BSD-licensed.

This could be a good future project, although I think the current
netdump implementation is worth bringing in soon (taking into account
the good feedback received to date, of course).

[1] http://www.sics.se/contiki/about-contiki.html
[2] http://www.sics.se/contiki/contiki-6lowpan-uipv6-faq.html

- Ed
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-08 Thread Robert Watson

On Fri, 8 Oct 2010, Attilio Rao wrote:


GENERAL FRAMEWORK ARCHITECTURE

Netdump is composed, right now, by an userland "server" and a kernel 
"client". The former is run on the target machine (where the dump will 
phisically happen) and it is responsible for receiving  the packets 
containing coredumps frame and for correctly writing them on-disk. The 
latter is part of the kernel installed on the source machine (where the 
dump is initiated) and is responsible for building correctly UDP packets 
containing the coredump frames, pushing through the network interface and 
routing them appropriately.


Hi Attilio:

Network dumps would be a great addition to the FreeBSD debugging suite!  A few 
casual comments and questions, I need to spend some more time pondering the 
implications of the current netdump design later in the week.


(1) Did you consider using tftp as the network dump protocol, rather than a 
custom protocol?  It's also a simple UDP-based, ACKed file transfer protocol, 
with the advantage that it's widely supported by exiting tftp daemons, packet 
sniffers, security devices, etc.  This wouldn't require using a stock tftpd, 
although that would certainly be a benefit as well.  The post-processing of 
crashdumps that seems to happen in the netdump server could, presumably, 
happen "offline" as easily...?


(2) Have you thought about adding a checksum to the dump format, since packets 
are going over the wire on UDP, etc?  Even an MD5 sum wouldn't be too hard to 
arrange: all the code is in the kernel already, requires relatively little 
state storage, and is designed for streamed data.


(3) As the bounds checking/etc behavior in dumping grows more complex, it 
seems a shame to replicate it in architecture-specific code.  Could we pull 
the mediaoffset/mediasize checking into common code?  Also, a reserved 
size/offset of "0" meaning "no limit" sounds slightly worrying; it might be 
slightly more conservative to add a flags field to dumperinfo and have a flag 
indicating "size is no object" for netdump, rather than overloading the 
existing fields.


Some specific patch comments:

+ * XXX: This should be split into machdep and non-machdep parts

What MD parts are in the file?

The sysctl parts of the patch have a number of issues:

- Please use sysctl_handle_string rather than manual calls and string bounds
  checking with SYSCTL_IN/SYSCTL_OUT.  In general, type-specific sysctl
  handlers are named sysctl_handle_, which I recommend here as well.

- Please use stack-local, fixed-length string buffers as the
  source/destination of copyin/copyout on ifnet names, rather than the fields
  in the structure itself.  We support interface renaming, so the field can
  change, and sysctl has the potential to block for extended periods
  mid-copyin/copyout.

- Because we support ifnet renaming, "none" is probably not a good reserved
  name.  Could you use the empty string or something similar, or just a flag
  to indicate that netdump is disabled?

- "ifp" rather than "ifn" and "nic" is the preferred variable name for ifnet
  pointers throughout the kernel.

- ifnets can, and are, removeable at runtime.  We have a refcount, but that's
  not really what you want.  I suggest simply looking up the ifnet by name
  when it's needed during a dump or for sysctl output, rather than maintaining
  a long-term pointer, which can become stale.  Especially with the
  auto-detect code.

- Since sysctl_nic is only used once, I'd prefer it were inlined into a
  use-specific handler function, avoiding the arg1/arg2 complications that
  make this code a bit harder to read.  sysctl_ip is useful because it's used
  more than once, though.  However, it also very much wants you to call
  sysctl_handle_string!

+sysctl_force_crash(SYSCTL_HANDLER_ARGS)

Does this belong in the netdump code?  We already have some of these options 
in debug.kdb.*, but perhaps others should be added there as well.


+   /*
+* get and fill a header mbuf, then chain data as an extended
+* mbuf.
+*/
+   MGETHDR(m, M_DONTWAIT, MT_DATA);

The idea of calling into the mbuf allocator in this context is just freaky, 
and may have some truly awful side effects.  I suppose this is the cost of 
trying to combine code paths in the network device driver rather than have an 
independent path in the netdump case, but it's quite unfortunate and will 
significantly reduce the robustness of netdumps in the face of, for example, 
mbuf starvation.


+   if (ntohs(ah->ar_hrd) != ARPHRD_ETHER &&
+   ntohs(ah->ar_hrd) != ARPHRD_IEEE802 &&
+   ntohs(ah->ar_hrd) != ARPHRD_ARCNET &&
+   ntohs(ah->ar_hrd) != ARPHRD_IEEE1394) {
+   NETDDEBUG("nd_handle_arp: unknown hardware address fmt "
+   "0x%2D)\n", (unsigned char *)&ah->ar_hrd, "");
+   return;
+   }

Are you sure you don't want to just check for ETHER here?

+   /* XXX: Prob

Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-08 Thread Attilio Rao
2010/9/28 Attilio Rao :
> In the last weeks I worked for porting the netdump infrastructure to
> FreeBSD-CURRENT on the behalf of Sandvine Incorporated.
> Netdump is a framework that aims for handling kernel coredumps over
> the TCP/IP suite in order to dump to a separate machine than the
> running one. That may be used on an interesting number of cases
> involving disk-less workstations, disk driver debugging or embedded
> devices.
>
> GENERAL FRAMEWORK ARCHITECTURE
>
> Netdump is composed, right now, by an userland "server" and a kernel
> "client". The former is run on the target machine (where the dump will
> phisically happen) and it is responsible for receiving  the packets
> containing coredumps frame and for correctly writing them on-disk.
> The latter is part of the kernel installed on the source machine
> (where the dump is initiated) and is responsible for building
> correctly UDP packets containing the coredump frames, pushing through
> the network interface and routing them appropriately.
>
> While the server may appear as, pretty much, a simple userland deamon
> dealing with UDP packets, the client imposes interesting problems as
> long as its activity is linked to handling kernel core dumping. More
> precisely, as long as the client is part of the dumping mechanism and
> the kernel may be in general panic conditions, netdump must ensure
> "robustness" of operations. That is partially achieved by reworking
> (and someway replicating) locally some UDP, ARP and IP operations,
> hardsetting some values (like the default gateway, the destination and
> the client address) and reducing further interactions with the kernel
> just to the network interface acitivities.
> More specifically, it implements a very basic UDP / IPv4 / ARP stack
> separate from the standard stack (since that may not be in a
> consistent state).
> It can dump to a server on the same LAN or via a router (correctly
> specifying the connection gateway).
> In order to receive packet on critical conditions, netdump polls the
> interface. Every network driver can implement hooks to be used by
> netdump independently by DEVICE_POLLING option, even if it is
> probabilly a good idea to share some code among them. The reference
> set of hooks is contained into "struct netdump_methods".
> And if_lem/if_em driver modifies may be set as reference for netdump
> hooks implementation.
>
> In order to work into an "up and running" system (meant as with all
> the devices in place) the netdump handler hooks as a pre-sync handler
> (differently from other dumping routines). It however suffers some
> problems typical of other dumping mechanism. For example, on DDB
> entering unlocked version of polling handler is used, in order to
> reduce the risk of deadlocks during inspections*. That reflects, among
> the netdump methods, the existence of 2 versions of polling hooks,
> where the "unlocked" is meant as reducing locking as much as possible.
>
> PATCH AND FURTHER WORK
>
> The patch is not totally complete and it is not intended to be
> committed in SVN yet. What I'm looking for now is more testing and
> review (in particular in terms of architecture) coverage by community.
> The server should be in realtively "committable" state, though, but I
> encourage its stress-testing. A manpage is provided along that should
> be very easy to understand how to use it.
>
> Things that can be further improved, as it is now, in the client, are:
> - Deciding if hardcoding of the kernel parameter is done properly. I
> personally don't like the sysctl usage and I would prefer an userland
> small utility used to testing and maybe add some tunables for enabling
> netdump early in the boot. You may have several opinions on this
> though.
> - VIMAGE and IPv6 support.
> - More drivers support. Right now only if_em (and if_lem) are
> converted to use netdump and can be used as a draft for other device
> drivers. if_ixgb should came along in the final, committing, version
> too. In general I think that all drivers supporting device polling
> could easilly support also netdump
> - Ideally dumpsys() in FreeBSD is too much disk-activity oriented. It
> should be made, instead, more neutral and more flexible to cope better
> with different interfaces. It is a quite a bit of work, however, and
> beyond the scope of netdump introduction (even if it could be
> beneficial for it)
>
> Netdump has been developed on a FreeBSD project branch located here:
> svn://svn.freebsd.org/base/projects/sv/
>
> which could also forsee further informations about every single
> change. However, for your convenience, also a patch has been made
> public which is located here (against freebsd-curr...@213246):
> http://www.freebsd.org/~attilio/Sandvine/STABLE_8/netdump/netdump_alpha_0.diff


This followup is made in order to signal the code has been further
refined and some bugfixes (reported mostly by pluknet@ testing) have
been fixed. More support for interfaces has been added (igb, ixgb,
ixgbe).
Y

Re: [PATCH] Netdump for review and testing -- preliminary version

2010-09-29 Thread Sergey Kandaurov
The 1st error is caught :).

Prior actions was triggering netdump, leaving db>
and entering again, then trigger netdump once more.

Dumping 1146 MB: 1131 1115 1099 1083 1067 1051 1035 1019 1003 987 971
955 939 923 907 891 875 859 843 827 811 795. . . . . . . . . . .
** DUMP FAILED (ERROR 60) **
Failed to dump the actual raw datas

Then I was able to leave ddb back to the working system.

P.S.
What about to start netdumpsrv from rc.d ? It's very simple.

%%%
#!/bin/sh
#
# $FreeBSD$
#

# PROVIDE: netdumpsrv
# REQUIRE: NETWORKING
# KEYWORD: shutdown

. /etc/rc.subr

name="netdumpsrv"
rcvar=`set_rcvar`
command="/usr/sbin/${name}"
pidfile="/var/run/${name}.pid"

load_rc_config $name
run_rc_command "$1"
%%%

-- 
wbr,
pluknet
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-09-29 Thread Sergey Kandaurov
> From: Ryan Stone 
> Date: Wed, Sep 29, 2010 at 11:32 AM
> Subject: Re: [PATCH] Netdump for review and testing -- preliminary version
> To: Sergey Kandaurov 
>
>
>> db> netdump
>>
>> ---
>> netdump in progress. searching for server.. . . . . . . . . . . .
>> Failed to contact netdump server
>
> This can be fixed by putting:
>
> nd_server_port = NETDUMP_PORT;
>
> At the beginning of netdump_trigger.

Hi.

Thank you, that helped fixing it.
2nd and further netdumps work now.


Some more numbers below
[hosts are on diff. switches connected
with STP on same 1Gb MTU1500 LAN]

   packets  errs  bytespackets  errs  bytes colls
[avg:]
  3488 04109874   4002 0 704800 0
[at the end:]
  3484 04076816   3989 0 680710 0
  3510 04086756   4066 0 712280 0
  3552 04136112   4107 0 714790 0
  3459 04059238   3995 0 681966 0
  3499 04077516   4037 0 709734 0
  3423 04004157   3936 0 678816 0
 13898 0   19118779  14018 02062568 0
 22206 0   31680314  21451 02959042 0
 22410 0   32086046  21710 02854724 0
 22231 0   31747238  21489 02899306 0
input (bce0)   output
   packets  errs  bytespackets  errs  bytes colls
 18490 0   26180288  17757 02729082 0
   336 0  24674 90 0 622960 0
 6 0   1148  1 0178 0

1687470 packets captured
2274879 packets received by filter
65676 packets dropped by kernel

It shows 30Mbps in average with up to 224 Mbps at the end.
The traffic picture is 1472 bytes UDP payload, and 4 bytes ack.
It took slightly less then 5 mins to netdump 1146 MB.

-- 
wbr,
pluknet
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Fwd: [PATCH] Netdump for review and testing -- preliminary version

2010-09-29 Thread Ryan Stone
-- Forwarded message --
Replying to the list this time...

From: Ryan Stone 
Date: Wed, Sep 29, 2010 at 11:32 AM
Subject: Re: [PATCH] Netdump for review and testing -- preliminary version
To: Sergey Kandaurov 


> db> netdump
>
> ---
> netdump in progress. searching for server.. . . . . . . . . . . .
> Failed to contact netdump server

This can be fixed by putting:

nd_server_port = NETDUMP_PORT;

At the beginning of netdump_trigger.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-09-29 Thread Attilio Rao
2010/9/29 Sergey Kandaurov :
> [just don't know what namely need to test, so]
>
> All made according to your instructions.
> The only way I could trigger netdump was
> to run its ddb command by hand. Neither
> debug.kdb.enter nor debug.kdb.panic don't do it.

You probabilly need to use KDB_UNATTENDED.

> Some numbers and output (bit verbose
> but again don't know what need to show here)
> Entered through debug.kdb.panic, thus Panicstring.
> db> netdump
>
> ---
> netdump in progress. searching for server.. dumping to xx.xxx.xx.xx
> (00:1a:64:d3:4f:20)
> ---
> Physical memory: 2026 MB
> Dumping 1154 MB: 1139 1123 1107 1091 1075 1059 1043 1027 1011 995 979
> 963 947 931 915 899 883 867 851 835 819 803 787 771 755 739 723 707
> 691 675 659 643 627 611 595 579 563 547 531 515 499 483 467 451 435
> 419 403 387 371 355 339 323 307 291 275 259 243 227 211 195 179 163
> 147 131 115 99 83 67 51 35 19 3
> Dump complete
>
> netdump finished.
> cancelling normal dump
>
> # ls -hl /home/fooserv/crash/
> total 1182930
> -rw-r--r--  1 root  wheel   404B Sep 29 12:05 info.fooserv.0
> -rw-r--r--  1 root  wheel   1.1G Sep 29 12:05 vmcore.fooserv.0
>
> Dump from fooserv [xx.xxx.xx.xx]
>  Architecture: amd64
>  Architecture version: 2
>  Dump length: 1210687488B (1154 MB)
>  blocksize: 8192
>  Dumptime: Wed Sep 29 11:55:30 2010
>  Hostname: fooserv
>  Versionstring: FreeBSD 9.0-CURRENT #51: Wed Sep 29 07:18:24 UTC 2010
>    r...@fooserv:/usr/obj/usr/src/sys/GENERIC
>  Panicstring: kdb_sysctl_panic
>  Header parity check: Pass
> Dump complete
>
> I tried to netdump a bit later again, but it couldn't find server, which
> runs though, and client traffic was seen on server-side (w/o reply).
> db> netdump
>
> ---
> netdump in progress. searching for server.. . . . . . . . . . . .
> Failed to contact netdump server

Could you compile the kernel with NETDUMP_CLIENT_DEBUG and retry?
(Warning: the log may be huge, so you may be needing to log with serial line)

I will try to reproduce locally, if anything.

Thanks,
Attilio


>
> As for netdumpserv,
> it misses #include  before __FBSDID();
> I happily tested it on 7.3-amd64.

Yes, there are also other small style(9) nits that needs to be
addressed in the netdumpserver as well.

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-09-29 Thread Andriy Gapon
on 28/09/2010 20:39 Attilio Rao said the following:
> In order to work into an "up and running" system (meant as with all
> the devices in place) the netdump handler hooks as a pre-sync handler
> (differently from other dumping routines). It however suffers some

I actually like this idea.  I think that regular dump should also be done at 
that
time.
Shouldn't we try to dump memory as early as possible, so that as little of it as
possible is modified?
(Not that I like sync-on-panic option at all)

> problems typical of other dumping mechanism. For example, on DDB
> entering unlocked version of polling handler is used, in order to
> reduce the risk of deadlocks during inspections*. That reflects, among
> the netdump methods, the existence of 2 versions of polling hooks,
> where the "unlocked" is meant as reducing locking as much as possible.


-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [PATCH] Netdump for review and testing -- preliminary version

2010-09-29 Thread Sergey Kandaurov
[just don't know what namely need to test, so]

All made according to your instructions.
The only way I could trigger netdump was
to run its ddb command by hand. Neither
debug.kdb.enter nor debug.kdb.panic don't do it.

Some numbers and output (bit verbose
but again don't know what need to show here)
Entered through debug.kdb.panic, thus Panicstring.
db> netdump

---
netdump in progress. searching for server.. dumping to xx.xxx.xx.xx
(00:1a:64:d3:4f:20)
---
Physical memory: 2026 MB
Dumping 1154 MB: 1139 1123 1107 1091 1075 1059 1043 1027 1011 995 979
963 947 931 915 899 883 867 851 835 819 803 787 771 755 739 723 707
691 675 659 643 627 611 595 579 563 547 531 515 499 483 467 451 435
419 403 387 371 355 339 323 307 291 275 259 243 227 211 195 179 163
147 131 115 99 83 67 51 35 19 3
Dump complete

netdump finished.
cancelling normal dump

# ls -hl /home/fooserv/crash/
total 1182930
-rw-r--r--  1 root  wheel   404B Sep 29 12:05 info.fooserv.0
-rw-r--r--  1 root  wheel   1.1G Sep 29 12:05 vmcore.fooserv.0

Dump from fooserv [xx.xxx.xx.xx]
  Architecture: amd64
  Architecture version: 2
  Dump length: 1210687488B (1154 MB)
  blocksize: 8192
  Dumptime: Wed Sep 29 11:55:30 2010
  Hostname: fooserv
  Versionstring: FreeBSD 9.0-CURRENT #51: Wed Sep 29 07:18:24 UTC 2010
r...@fooserv:/usr/obj/usr/src/sys/GENERIC
  Panicstring: kdb_sysctl_panic
  Header parity check: Pass
Dump complete

I tried to netdump a bit later again, but it couldn't find server, which
runs though, and client traffic was seen on server-side (w/o reply).
db> netdump

---
netdump in progress. searching for server.. . . . . . . . . . . .
Failed to contact netdump server


As for netdumpserv,
it misses #include  before __FBSDID();
I happily tested it on 7.3-amd64.

-- 
wbr,
pluknet
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


[PATCH] Netdump for review and testing -- preliminary version

2010-09-28 Thread Attilio Rao
In the last weeks I worked for porting the netdump infrastructure to
FreeBSD-CURRENT on the behalf of Sandvine Incorporated.
Netdump is a framework that aims for handling kernel coredumps over
the TCP/IP suite in order to dump to a separate machine than the
running one. That may be used on an interesting number of cases
involving disk-less workstations, disk driver debugging or embedded
devices.

GENERAL FRAMEWORK ARCHITECTURE

Netdump is composed, right now, by an userland "server" and a kernel
"client". The former is run on the target machine (where the dump will
phisically happen) and it is responsible for receiving  the packets
containing coredumps frame and for correctly writing them on-disk.
The latter is part of the kernel installed on the source machine
(where the dump is initiated) and is responsible for building
correctly UDP packets containing the coredump frames, pushing through
the network interface and routing them appropriately.

While the server may appear as, pretty much, a simple userland deamon
dealing with UDP packets, the client imposes interesting problems as
long as its activity is linked to handling kernel core dumping. More
precisely, as long as the client is part of the dumping mechanism and
the kernel may be in general panic conditions, netdump must ensure
"robustness" of operations. That is partially achieved by reworking
(and someway replicating) locally some UDP, ARP and IP operations,
hardsetting some values (like the default gateway, the destination and
the client address) and reducing further interactions with the kernel
just to the network interface acitivities.
More specifically, it implements a very basic UDP / IPv4 / ARP stack
separate from the standard stack (since that may not be in a
consistent state).
It can dump to a server on the same LAN or via a router (correctly
specifying the connection gateway).
In order to receive packet on critical conditions, netdump polls the
interface. Every network driver can implement hooks to be used by
netdump independently by DEVICE_POLLING option, even if it is
probabilly a good idea to share some code among them. The reference
set of hooks is contained into "struct netdump_methods".
And if_lem/if_em driver modifies may be set as reference for netdump
hooks implementation.

In order to work into an "up and running" system (meant as with all
the devices in place) the netdump handler hooks as a pre-sync handler
(differently from other dumping routines). It however suffers some
problems typical of other dumping mechanism. For example, on DDB
entering unlocked version of polling handler is used, in order to
reduce the risk of deadlocks during inspections*. That reflects, among
the netdump methods, the existence of 2 versions of polling hooks,
where the "unlocked" is meant as reducing locking as much as possible.

PATCH AND FURTHER WORK

The patch is not totally complete and it is not intended to be
committed in SVN yet. What I'm looking for now is more testing and
review (in particular in terms of architecture) coverage by community.
The server should be in realtively "committable" state, though, but I
encourage its stress-testing. A manpage is provided along that should
be very easy to understand how to use it.

Things that can be further improved, as it is now, in the client, are:
- Deciding if hardcoding of the kernel parameter is done properly. I
personally don't like the sysctl usage and I would prefer an userland
small utility used to testing and maybe add some tunables for enabling
netdump early in the boot. You may have several opinions on this
though.
- VIMAGE and IPv6 support.
- More drivers support. Right now only if_em (and if_lem) are
converted to use netdump and can be used as a draft for other device
drivers. if_ixgb should came along in the final, committing, version
too. In general I think that all drivers supporting device polling
could easilly support also netdump
- Ideally dumpsys() in FreeBSD is too much disk-activity oriented. It
should be made, instead, more neutral and more flexible to cope better
with different interfaces. It is a quite a bit of work, however, and
beyond the scope of netdump introduction (even if it could be
beneficial for it)

Netdump has been developed on a FreeBSD project branch located here:
svn://svn.freebsd.org/base/projects/sv/

which could also forsee further informations about every single
change. However, for your convenience, also a patch has been made
public which is located here (against freebsd-curr...@213246):
http://www.freebsd.org/~attilio/Sandvine/STABLE_8/netdump/netdump_alpha_0.diff

In order to enable the client it is enough to add at your kernel config:
optionsNETDUMP_CLIENT

NETDUMP_CLIENT_DEBUG can be specified too in order to have further
debuggin traces but I'd encourage to use them just for developing
Netdump itself.

TYPICAL USAGE

This is a set of the available sysctls for netdump:
# sysctl -d net.dump
net.dump: netdump
net.dump.enable: enable network dump
n