Re: RFC: index core pods with X<>

Ivan Tubert-Brohman Tue, 26 Jul 2005 13:37:16 -0700

Nicholas Clark wrote:

First, a definition. By "scope", I mean the part of the document that is deemed relevant to an index entry, and that may be extracted and shown in isolation by a processing or display tool. For example, perldoc -f considers the scope of a function to end at the beginning of the next =item, or at the end of the enclosing =over.
* There's no limitation as to the number of times that a given entry can appear in a document or collection of documents. That is, it is not an error to have X<whatever> appear twice in the same file.
This means that if two or more adjacent paragraphs are needed to make sense,
it's no problem under the scope rules - just mark both.

Good point. And perhaps some of the programs that use the data could choose to treat two consecutive paragraphs as one entry. Although in some cases it's simpler to just choose a slightly wider scope, as it might be better to have too much context than not enough.
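As a rough illustration of treating consecutive paragraphs as one entry, here is a minimal Python sketch. It assumes paragraph-level scope and simple, non-nested X<...> codes. The name index_scopes and the regular expression are hypothetical and not part of any existing POD tool.

```python
import re

# Matches a simple X<entry> index code; nested angle brackets are not handled.
INDEX_CODE = re.compile(r"X<([^<>]+)>")

def index_scopes(pod_text):
    """Map each index entry to a list of scopes.

    A scope is the text of a paragraph carrying the entry. Adjacent
    paragraphs that carry the same entry are merged into one scope.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", pod_text) if p.strip()]
    scopes = {}      # entry -> list of scope strings
    last_seen = {}   # entry -> index of the last paragraph that carried it
    for i, para in enumerate(paragraphs):
        clean = INDEX_CODE.sub("", para).strip()
        # dict.fromkeys drops duplicate entries within one paragraph.
        for entry in dict.fromkeys(INDEX_CODE.findall(para)):
            if last_seen.get(entry) == i - 1:
                # Previous paragraph had the same entry: extend that scope.
                scopes[entry][-1] += "\n\n" + clean
            else:
                scopes.setdefault(entry, []).append(clean)
            last_seen[entry] = i
    return scopes
```

With this approach an author marks both paragraphs with the same X<...> code, and the tool decides whether to show them together. A tool that prefers a wider scope could skip the merge step and return the enclosing section instead.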

Currently it is used in only *one* place in the perl documentation: pod/perlfunc.pod uses it for the "-X" filetest operators.

* It should be considered case-insensitive.


This would lead to some ambiguities:

    -b  File is a block special file.
    -B  File is a "binary" file (opposite of -T).
    -c  File is a character special file.
    -C  Same for inode change time (Unix, may differ for other platforms)
    -s  File has nonzero size (returns size in bytes).
    -S  File is a socket.
    -t  Filehandle is opened to a tty.
    -T  File is an ASCII text file (heuristic guess).

I'm not sure if this is really a problem
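To see which entries would actually collide, a small Python sketch can group them by their case-folded form. The function name case_collisions is hypothetical, and the sample list reuses the filetest operators above plus the generic entry "operator".

```python
from collections import defaultdict

def case_collisions(entries):
    """Group entries that become identical once case is ignored."""
    groups = defaultdict(set)
    for entry in entries:
        groups[entry.casefold()].add(entry)
    # Keep only the folded keys that more than one distinct spelling maps to.
    return {key: sorted(forms) for key, forms in groups.items() if len(forms) > 1}

entries = ["-b", "-B", "-c", "-C", "-s", "-S", "-t", "-T", "operator"]
collisions = case_collisions(entries)
```

Here every filetest pair ends up under a single folded key, while "operator" does not appear in the result because it has only one spelling.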

Or maybe I was too quick to dismiss case-sensitivity. Maybe it should be up to the processing programs to decide whether they want to be case-sensitive or not (and maybe the user can control it with command-line arguments and such). But if we go for case-sensitivity, I'd add a style rule that says:

* all entries should be written in lowercase, unless uppercase is necessary due to case sensitivity. For example, for generic keywords like "operator", use X<operator>, not X<Operator>.
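If the choice is left to the processing program, the lookup could take a flag. The following Python sketch assumes the index is a plain mapping from entry text to a list of locations. The function name lookup, the case_sensitive parameter, and the location strings are all hypothetical.

```python
def lookup(index, term, case_sensitive=False):
    """Return the locations recorded for `term`.

    `index` maps entry text, as written inside X<...>, to a list of locations.
    """
    if case_sensitive:
        return list(index.get(term, []))
    wanted = term.casefold()
    hits = []
    for entry, locations in index.items():
        if entry.casefold() == wanted:
            hits.extend(locations)
    return hits

# Hypothetical index contents, for illustration only.
index = {
    "-s": ["perlfunc: nonzero size"],
    "-S": ["perlfunc: socket"],
    "operator": ["perlop"],
}
```

A case-insensitive search for "-s" returns both filetest entries, so the user sees the ambiguity instead of losing one of them. A case-sensitive search for "Operator" finds nothing, which is why the lowercase style rule matters in that mode.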

Perl comes with over 100 files in the pod/ directory, totaling over 100,000 lines of POD. Obviously, indexing all of it by hand is a very large task, so the question arises as to who will do it. If people agree that this is a good idea and are willing to apply the patches, I could lead the project, and hope to attract volunteers. In the worst case (no one else is willing to help), I believe that even if I can't index *all* of the pods, a partial index is better than no index at all. I would start with the documents that I consider more important, such as perlop, perlsub, perlre, perlobj, etc. Documents such as perldelta* and the faqs probably don't need indexing that much.
Potentially the faqs do. The deltas might benefit from it, particularly if
searching for a topic on a recent set of documentation brings up an important
bug fix in a specific version of perl, and the user realises that their perl
is older than this.

I agree that it can be useful, it's just a lower priority IMO. The faqs because they can already be searched with perldoc -q (at least the questions), and perldelta* because it's a bit "esoteric" (not what a beginner would be looking at when trying to figure out the purpose of an operator or function, for example ;-)

The proposal seems very well thought through, and I'd be very happy to see
you start on this soon. I'd hope that you'd soon attract volunteers to help,
but as you rightly say, only time will tell.

Thanks, I'm glad you like it. I've already started. Any volunteers? ;-) I'll post later in perlmonks to see if there are any potential volunteers out there.


Ivan
