Re: Self-documenting code (ha!)

Russell King - ARM Linux Admin Thu, 27 Apr 2000 04:50:07 -0700
Zach Welch writes:
> my comments are inline with Russel's reply

This is pretty standard for unix-orientated people ;)

> I believe that I understand ARM fairly well. I certainly know more than
> nothing about it. In addition, I believe trying to maintain a small size of
> the source base by reducing the limiting the overall number of comments is
> self-defeating.

Err, no one's talking about removing comments.

> I'm not disagreeing with the seperation of code, which I think is _very_
> well accomplished. My concern is with bringing it together in the mind of
> the programmer.

However, adding a comment into the code every two or so code lines is
just not on - it adds too much clutter for the people writing the code,
and then not only does the code have to be maintained, but also the
comments.  This usually ends up with the code being maintained, and the
comments not.  The comments eventually becomes wrong, and then there is
no point in having the comments there because they are now misleading.

The above is a view held not just by me, but by the vast majority of
kernel developers.

> I used the page table example because it had just come up,
> which turns out not to have been a clear case. The proc_info trick comes
> closer to making my point.

I'll agree that there could be some more commenting about the linker
trick used there, but that trick is used through out the kernel code
now, so "its nothing new".  However, I agree that a well placed comment
there would not be unreasonable.

> In the Linux coding style guide, it talks about using <10 locals. Aren't
> registers the same thing? And yet, the code in head-armv.s doesn't
> (sufficiently, IMO) describe how these are being flung around.

The ARM only has 16 registers.  3 of these have a well-defined meaning.
For a typical assembler function, all the other 13 are generally not used
unless it's some large complex function.  Take the code in head-armv.S:

__entry:                        4 registers used
__ret:                          1 registers used
__mmap_switched:                9 registers used but commented sucinctly
create_page_tables:             7 registers used but commented
__error:                        3 registers used
__lookup_processor_type:        6 registers used
__lookup_architecture_type:     8 registers used

So generally we're talking about <10 "local variables" in almost every
case, which is within the human minds comprehension.

(You've also gone back to the "how" which the coding style explicitly
 mentions that you should not do, and I agree.  Comments should only
 describe "what", no matter what the context of the comment.)

> The same arguement applies here - I can't keep the contents of all
> the registers straight in my mind as they are carried from one block
> to another.

You don't have to.  Any that are transferred between the various "functions"
is documented in the comments before the function, thus:

/*
 * Read processor ID register (CP#15, CR0), and determine
 * processor type.
 *
 * Returns:
 *      r5, r6, r7 corrupted
 *      r8  = page table flags
 *      r9  = processor ID
 *      r10 = pointer to processor structure
 */

Ok, so the following bit of code does what the comment says.  Without seeing
the code, it indicates that r5, r6 and r7 will not be preserved, and that is
expected to be in r8, r9 and r10.

Notice that you do not need to know how this information is obtained - in
fact that may well change to some other method, but the above principle
will still hold true because that is the purpose of the code.

When you know the purpose of the code, and decide that you need to know
in great detail how it works, you're really only looking at (in this
example that you originally picked) 18 lines of code (not withstanding your
and my comments above about __proc_info).

That's 18 lines with 6 registers used.  That should be well within the scope
of the human mind to understand.

> Yes, it should. But in many cases, it isn't. When trying to get a high level
> view of the code, those are the most important. But to see how the pieces
> actually fit together, you really need that next level of detail (that I'm
> preaching for).

See comment above.

> By way of another example, right before __switch_data
> label, the add pc, r10, #12 instruction (which jumps to initialize the
> processor dependent mm code) is similarly mindboggling until you figure out
> that the proc_info structure actually has an opcode built into it (which is
> a pretty big deviation from the traditional seperation of data and code).

Ok looking at this one in depth then.  Ok, what does r10 hold?  Well,
__lookup_processor_type appears to have something to do with it - its
tested shortly after to see if its an invalid processor.  Ok, lets look
at that code.

 *      r10 = pointer to processor structure

Ah, r10 contains that.  Ok, so

        add pc, r10, #12

is jumping into the processor structure.  This must mean that the structure
contains opcodes itself.  Ok, looking in include/asm-arm/procinfo.h, it
re-inforces this:

 * NOTE! The following structure is defined by assembly
 * language, NOT C code.  For more information, check:
 *  arch/arm/mm/proc-*.S and arch/arm/kernel/head-armv.S

So, what do arch/arm/mm/proc-*.S contain?

__sa1110_proc_info:
...
                b       __sa110_setup

Ah, ok, an instruction within the structure.

Oh, btw, the above is very common in most hand-written ARM assembler.

> Presently, neither of these tricks are mentioned at all, and it is these
> kinds of things that _need_ to be documented!!!

Why do they need to be documented when they are almost self explanitory.

> I do know what I am doing (Really!), I am familiar with ARM, and I _want_ to
> delve deeply into how the kernel is working. I should NOT be required to go
> any further than the distribution to do it.

Err, no.  You should get the high level view about what its doing.  Then,
refer to the manuals to find out the information about the area that the
code is concerned with, and then go back to the distribution.  You are not
expected to maintain code without the reference documentation, and any
attempt to do otherwise is just plain wrong.  That is true of safety-crit
systems as well, incidentally.

> All the magic should be
> documented. Relying on someone else to understand the magic can have tragic
> consequences - if they're hit by a bolt of lightning, all that information
> goes up in smoke.

I believe that Philip Blundell and Nico Pitre have a good understanding of
how the bits of code work.  Maybe you ought to ask them for their opinions?

> I don't think this example is valid - we're all plumbers here. Some more
> experienced than others, but plumbers nonetheless. And anyone who's had to
> work on their own plumbing will tell you that having a _detailed_ blueprint
> is essential, particularly if you can't immediately see the parts you'll be
> working on or don't even know that those parts even existed. Also, I must
> re-emphasize that I DO NOT want every T and bend commented, and it should
> not be necessary to do so.

It is valid.  Most houses do not have these detailed blueprints.  In fact,
I'd guess that when a plumber turns up on your door that he has very little
or no knowledge about how your house is plumbed.  He has to do some
measure of investigation without "comments", use his experiance from other
jobs he's done previously, and use the knowledge that he's been taught at
plumber school.

> Argh, I really didn't want to get involved in this type of debate (wholly
> opinionated and unresolvable to anyone's complete satisfaction), but I
> honestly believe that the quality of the comments could be improved markedly
> from where they currently are. This stems directly from the fact that I (an
> experienced programmer who was already fairly well familiar with ARM) would
> _really_ have appreciated better comments having been there, and that I'd
> like to see future programmers get off easier.

I believe that you're the one trying to air their opinions here.  I've been
trying to state the facts.

If you are talking about having a comment added to describe "what" a
specific bit of code is doing, then that is reasonable.  Just don't go
overboard and try to get "how" when that's not really what you're after
in the first place. ;)
   _____
  |_____| ------------------------------------------------- ---+---+-
  |   |        Russell King       [EMAIL PROTECTED]      --- ---
  | | | |            http://www.arm.linux.org.uk/            /  /  |
  | +-+-+                                                     --- -+-
  /   |               THE developer of ARM Linux              |+| /|\
 /  | | |                                                     ---  |
    +-+-+ -------------------------------------------------  /\\\  |

unsubscribe: body of `unsubscribe linux-arm' to [EMAIL PROTECTED]
++        Please use [EMAIL PROTECTED] for           ++
++                        kernel-related discussions.                      ++
Re: Self-documenting code (ha!)

Reply via email to