Hi Jakub,
I think that the issue here is that we are not discussing the desired new
shape of kconsole in a very systematic way.
So let's assume there is no kernel console, no physical kernel output
(let's assume all printfs in the kernel are no-ops), no kernel input and
output drivers. Let's assume that we have something that we would like
to call a production-built microkernel.
Now, from this baseline let's define our set of requirements for a
kernel debugging and observability tool. These requirements will tell us
in a systematic way what the implementation should actually look like.
I'll try to define my requirements at the end of my email.
What I see as another problem of the current kconsole is that some of
the applications need to be aware of this purely debugging feature, such
as trace, bdsh, console or compositor.
OK. As I have already suggested, we might remove the SYS_KLOG syscall,
thus keeping the kernel and uspace logging completely isolated.
We might also replace SYS_DEBUG_ACTIVATE_CONSOLE with a different
syscall that won't have any explicit connection to the kconsole. I
suggest introducing a SYS_PANIC syscall that will just cause a kernel
panic (thus forcing the contents of the kconsole to be shown).
Note that there will always be a need to notify the kernel about some
special run-time condition from user space -- just think about the final
part of the system-wide shutdown sequence where you need to use some
SYS_QUESCE call before the user space platform driver physically reboots
the machine.
Do we know under what circumstances klog can lose output?
If the kernel generates the output faster than the user space
counterpart is able to retrieve it (e.g. due to preemption), then the
kernel will overwrite parts of the cyclic buffer. Not only will part of
the original data be lost, but the result can be completely garbled.
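To make the failure mode concrete, here is a minimal sketch of such a
cyclic buffer. It is not the actual klog code; klog_t, klog_put() and
klog_fetch() are names I made up for illustration, and the sketch is
single-threaded on purpose. The producer never waits, so a slow consumer
can only detect after the fact how many bytes it has missed.

```c
#include <stddef.h>
#include <stdint.h>

#define KLOG_SIZE 8  /* deliberately tiny so that overflow is easy to trigger */

/* Hypothetical layout, not the real HelenOS klog structure. */
typedef struct {
    char data[KLOG_SIZE];
    uint64_t written;  /* total number of bytes ever produced */
} klog_t;

/* Producer: never blocks, silently overwrites the oldest bytes. */
static void klog_put(klog_t *log, const char *str)
{
    for (; *str != '\0'; str++) {
        log->data[log->written % KLOG_SIZE] = *str;
        log->written++;
    }
}

/*
 * Consumer: copies what was produced since *pos into out (at most
 * out_size - 1 bytes, NUL-terminated; out_size must be at least 1).
 * Returns the number of bytes that were overwritten before the
 * consumer managed to fetch them.
 */
static uint64_t klog_fetch(const klog_t *log, uint64_t *pos, char *out,
    size_t out_size)
{
    uint64_t lost = 0;

    if (log->written - *pos > KLOG_SIZE) {
        /* The producer has lapped us; skip the overwritten part. */
        lost = log->written - *pos - KLOG_SIZE;
        *pos += lost;
    }

    size_t n = 0;
    while (*pos < log->written && n + 1 < out_size) {
        out[n++] = log->data[*pos % KLOG_SIZE];
        (*pos)++;
    }
    out[n] = '\0';

    return lost;
}
```

In a real system the producer may also overwrite the buffer while the
consumer is in the middle of the copy loop, which is where the garbled
output comes from. The sketch does not try to model that race.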
Is it likely to happen?
Depending on the overall system load it can be very likely.
As of now, I seem to be having the opposite problem, i.e.
redundant output.
This might be only a different manifestation of the same problem -- lack
of explicit synchronization.
Unfortunately yes. How often did we make use of this possibility?
I don't keep records, but usually when you need it, it is the only means
left.
This capability can be partially substituted by the
debugging facilities of the simulators we use.
You usually need it for debugging on real hardware where there is no
external debugging aid.
I also thought about a similar possibility, but the kconsole is actually
useful as an interpreter. It allows you to directly execute kernel
functions in the context of a kernel thread, have them run on other
processors, read/write PIO registers and do other stuff for which it
would not be practical or wise to have a debugging syscall or write a
specialized user program.
I think it is simply inconsistent to say that kernel drivers are bad,
but a kernel-based command-line interpreter is OK.
We should either say that some kernel drivers and the kernel
command-line interpreter are fine (for debugging purposes only) or we
should try to eliminate both.
You can always have the command-line interpreter in user space and pass
the commands to the kernel already pre-parsed. This should allow you to
run arbitrary kernel functions just fine.
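A rough sketch of what I mean by "pre-parsed" follows. Everything in it
is hypothetical (debug_request_t, debug_parse(), sys_debug() and the two
dummy functions are invented for this email). The user space part turns
a command line into a fixed-size request; the kernel part only validates
the function index and the argument count and then dispatches. For
brevity both sides share one table here; in reality the user space would
have to learn the function names some other way.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEBUG_MAX_ARGS 4

/* The only thing that crosses the user/kernel boundary. */
typedef struct {
    uint32_t func;                   /* index into the kernel table */
    uint32_t argc;                   /* number of valid entries in argv */
    uintptr_t argv[DEBUG_MAX_ARGS];  /* already converted to integers */
} debug_request_t;

typedef struct {
    const char *name;  /* used only by the user space parser below */
    uint32_t argc;     /* number of arguments the function expects */
    uintptr_t (*fn)(const uintptr_t *argv);
} debug_entry_t;

/* Dummy stand-ins for "public kernel functions". */
static uintptr_t dbg_add(const uintptr_t *argv) { return argv[0] + argv[1]; }
static uintptr_t dbg_zero(const uintptr_t *argv) { (void) argv; return 0; }

static const debug_entry_t debug_table[] = {
    { "add", 2, dbg_add },
    { "zero", 0, dbg_zero },
};

#define DEBUG_TABLE_LEN (sizeof(debug_table) / sizeof(debug_table[0]))

/* Kernel side: no string handling at all. Returns 0 or -1. */
static int sys_debug(const debug_request_t *req, uintptr_t *result)
{
    if (req->func >= DEBUG_TABLE_LEN)
        return -1;
    if (req->argc != debug_table[req->func].argc)
        return -1;

    *result = debug_table[req->func].fn(req->argv);
    return 0;
}

/* User space side: all the parsing happens here. Returns 0 or -1. */
static int debug_parse(const char *line, debug_request_t *req)
{
    char buf[128];
    if (strlen(line) >= sizeof(buf))
        return -1;
    strcpy(buf, line);

    char *tok = strtok(buf, " ");
    if (tok == NULL)
        return -1;

    size_t i;
    for (i = 0; i < DEBUG_TABLE_LEN; i++) {
        if (strcmp(tok, debug_table[i].name) == 0)
            break;
    }
    if (i == DEBUG_TABLE_LEN)
        return -1;

    req->func = (uint32_t) i;
    req->argc = 0;
    while ((tok = strtok(NULL, " ")) != NULL) {
        if (req->argc == DEBUG_MAX_ARGS)
            return -1;
        /* Base 0 accepts decimal, octal and 0x-prefixed hex. */
        req->argv[req->argc++] = (uintptr_t) strtoull(tok, NULL, 0);
    }

    return 0;
}
```

The point of the split is that the kernel part stays a bounds check plus
an indirect call, while tokenizing, name lookup and number conversion
can be as elaborate as we like without touching the kernel.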
I have always thought about the kconsole step as a precondition for
removing all or most of the kernel drivers (timers, interrupt
controllers etc.) in favour of purely uspace ones.
User space processor activation is a good idea, but I simply cannot
imagine how the kernel should operate without a basic timer and
interrupt driver.
It would need to bootstrap in a cooperative multitasking mode (which is
really hard to imagine given the nature of our IPC and future kernel
mechanisms such as RCU) and it would be completely vulnerable to a
denial of service due to any problem in the uspace timer and interrupt
controller drivers.
OK, here come my requirements for a minimalistic and non-intrusive
kernel debugging facility:
* Provide a kernel cyclic buffer for storing kernel printouts. The
contents of this buffer should be available to:
- An external debugging aid (monitor, hypervisor, emulator, etc.)
thanks to a well-known location of the buffer in physical memory.
- The user space via non-blocking, non-synchronized update
notifications and the possibility to map the buffer by a privileged
task.
* If there is an optional platform output driver compiled in the kernel
(e.g. serial output, hypervisor console output, platform framebuffer
output), the kernel printouts should be mirrored to this output
driver.
- The mirroring should stop as soon as a user space driver maps or
accesses the resources of the output device.
- The mirroring should resume in case of a kernel panic (possibly
requested by user space via the SYS_PANIC syscall). For debugging
purposes, the kernel might be modified to not actually cause a
panic as part of the processing of the syscall.
* Provide kernel run-time information (e.g. SLAB cache statistics,
scheduler statistics, physical memory zones, TLB content, etc.) in
the form of sysinfo entries available to privileged tasks.
* For debugging purposes, optionally provide a SYS_DEBUG syscall that
would allow a privileged task to:
- Initiate I/O and memory access from a kernel thread.
- Run a kernel test or benchmark.
- Call an arbitrary public kernel function.
- Print a stack trace of a thread.
- Print an arbitrary string to the kernel buffer.
- Provide means to read the contents of the cyclic buffer from user
space in a synchronized manner (i.e. acknowledge the fetches).
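To illustrate the last item, here is one possible shape of the
acknowledged mode. This is only a sketch under my own assumptions:
kbuf_t and the kbuf_*() functions are invented names, and I assume that
in the synchronized mode the producer simply stops storing data once it
would overwrite bytes that have not been acknowledged yet. Whether the
kernel should then wait, drop the rest or do something else is left
open.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KBUF_SIZE 8  /* tiny on purpose */

/* Hypothetical structure, not an existing HelenOS type. */
typedef struct {
    char data[KBUF_SIZE];
    uint64_t written;  /* total bytes produced */
    uint64_t acked;    /* total bytes acknowledged by the consumer */
    bool sync;         /* true: never overwrite unacknowledged bytes */
} kbuf_t;

/* Producer. Returns the number of bytes actually stored. */
static size_t kbuf_put(kbuf_t *buf, const char *str)
{
    size_t stored = 0;

    for (; *str != '\0'; str++) {
        /* In synchronized mode a full buffer stops the producer. */
        if (buf->sync && buf->written - buf->acked >= KBUF_SIZE)
            break;

        buf->data[buf->written % KBUF_SIZE] = *str;
        buf->written++;
        stored++;
    }

    return stored;
}

/* Consumer acknowledges everything up to the absolute position upto. */
static void kbuf_ack(kbuf_t *buf, uint64_t upto)
{
    if (upto > buf->written)
        upto = buf->written;
    if (upto > buf->acked)
        buf->acked = upto;
}

/* Switch between the non-synchronized and the synchronized mode. */
static void kbuf_set_sync(kbuf_t *buf, bool sync)
{
    /* Bytes that are already overwritten cannot be fetched anymore. */
    if (sync && buf->written - buf->acked > KBUF_SIZE)
        buf->acked = buf->written - KBUF_SIZE;

    buf->sync = sync;
}
```

With sync set to false this degenerates to the plain overwriting cyclic
buffer, so the mandatory non-synchronized path does not depend on any
cooperation from user space.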
Summary:
* The kernel cyclic buffer and its non-synchronized availability to
user space is mandatory.
* The kernel output drivers are optional and non-intrusive; there are
no kernel input drivers.
* Once the user space takes over, the kernel activates the kernel
output drivers only in case of a kernel panic.
* The kernel panic can be forced from user space by SYS_PANIC. The
user space cannot make any assumptions about the state of the system
after issuing this syscall; however, for debugging purposes only, the
kernel might skip the panic.
* Most current "output-only" kconsole commands should be replaced by
sysinfo data.
* The rest of the "interactive" kconsole commands should be available
via a dedicated (and optional) SYS_DEBUG syscall. However, the
kernel should not contain a command-line parser; the arguments to
SYS_DEBUG should be already pre-parsed in user space.
* The optional SYS_DEBUG syscall can be used to switch the cyclic
buffer notifications to a synchronous fashion with acknowledgements
and back.
* As any output to the kernel cyclic buffer from user space is purely
optional (via the SYS_DEBUG syscall), the kernel cyclic buffer is
not used for user space logging (but it can be used for on-demand
debugging printouts from user space, if the SYS_DEBUG is enabled).
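The mirroring rule from the summary (mirror until user space takes over,
mirror again on panic) boils down to a very small piece of state. A
sketch, with all names (kout_t, kout_claim(), kout_panic(), kputs() and
the fake driver) being hypothetical and the cyclic buffer part left out:

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*kout_write_t)(const char *str);

typedef struct {
    kout_write_t write;  /* NULL if no output driver is compiled in */
    bool uspace_owned;   /* a user space driver has claimed the device */
    bool panicked;       /* kernel panic, possibly via SYS_PANIC */
} kout_t;

/* Stand-in for a serial or framebuffer driver; it only counts calls. */
static unsigned int mirrored;

static void fake_serial_write(const char *str)
{
    (void) str;
    mirrored++;
}

static bool kout_should_mirror(const kout_t *out)
{
    if (out->write == NULL)
        return false;

    /* A panic overrides the user space ownership of the device. */
    return out->panicked || !out->uspace_owned;
}

/* Called when a user space driver maps or accesses the device. */
static void kout_claim(kout_t *out)
{
    out->uspace_owned = true;
}

/* Called from the panic path. */
static void kout_panic(kout_t *out)
{
    out->panicked = true;
}

/* Kernel printout: the append to the cyclic buffer is omitted here. */
static void kputs(kout_t *out, const char *str)
{
    if (kout_should_mirror(out))
        out->write(str);
}
```

Note that nothing in this logic needs to know about any user space
application, which is exactly the decoupling I am after.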
M.D.