On Mon, 31 May 2021 14:48:42 -0400 Douglas McIlroy <douglas.mcil...@dartmouth.edu> wrote:
> > Now that I think of it, the current system is makewhatis/apropos. I
> > often get a ton of noise entries, usually perl modules, but maybe
> > there's a way around that.
>
> apropos whatever | grep -v '(3p'

That's what we have, but is it the best we can do? I bet we agree the
answer to that question is No.

Try for example

    apropos color

On my system, with xlib and some perl stuff installed, that yields 178
entries. I can winnow it down to 38 with:

    man -k color | grep -v ^'[xX]' | grep -v ::

including ppmquant (for X) and ctail (for dot), and 3 items related to
what I want, among them dircolors. That's a 2% signal/noise ratio.
Ironically, most of the apropos output is not apropos to the input.

The problem, I assert, is lack of context. apropos has no way to know
I'm interested in colors for filenames in ls.

The first order of business IMO is to navigate large documents by
indexed keyword, something the info reader does tolerably well.

More sophisticated -- and requiring no new input -- would be an ability
to "zoom out" of the context of one manpage to show related pages that
reference the term. If I'm reading the ls(1) page and don't find what I
want, what's "in the neighborhood"? Well, dpkg tells us that ls(1) is
part of GNU coreutils. AFAIK, the man system offers no way to ask "what
coreutils manpages reference color?" A further outer ring of
association can also be derived from the packaging system, namely
packages that depend on the package, or that it depends on, or that are
recommended, subject to the constraint (or not) that they're installed.

Another basis for "zoom out" could be the kind of work that made Google
rich: citation counts. If ls(1) references certain documents or
environment variables, what other documents reference those same
documents/variables? If many do, that's information. It's not rocket
surgery, either; it's basically what cscope has been doing for 30 years
for function calls.

ISTM that we rely too heavily on general tools like regular
expressions, and don't exploit information already present in our
systems. We're training ourselves to create "google-able" terms -- like
go-lang for the Go language -- because general purpose search engines
lack context specifiers. We also don't leverage the documentation
writer's expertise and *time*: any effort to add index terms to
documentation is nothing next to the thousands or millions of times
that page will be read and searched. That is why I want to provide
authors with macros for index terms: to let them express their
expertise for the benefit of all.

I don't see how the value of a subject index can be doubted, given that
every large body of information is indexed, be it the Encyclopedia
Britannica or your local library. Nor is technical feasibility a high
bar. The real obstacle, as ever, is people. Where there's a will
there's a way. But: is there a will?

--jkl
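
P.S. To make the "what coreutils manpages reference color" question
concrete, here is a rough sketch of how one could ask it today with
existing tools. It assumes a Debian-style system (dpkg plus man-db's
"man -w"); the packaging commands will differ elsewhere, so treat it as
an illustration rather than a finished tool:

    # Which package owns the ls(1) page?
    pkg=$(dpkg -S "$(man -w ls)" | cut -d: -f1)

    # Which man pages does that package ship, and which of them
    # mention "color"?  (zgrep reads both .gz and plain pages.)
    dpkg -L "$pkg" | grep '/man/' | xargs zgrep -li color

The "outer ring" falls out of the same loop: on apt systems,
"apt-cache depends" and "apt-cache rdepends" can supply the
neighboring packages whose man pages get searched the same way.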