Adam Kennedy wrote:
* has_pod_index: The POD contains at least one X<> keyword that helps
POD indexers. Whether only one is useful is open for debate, because
at least the license (X<gpl>), your CPAN ID under authors (X<tels>),
and some generic keyword for what your module is about (X<foo>) can
probably be added even for the most minimal module.
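A rough sketch of what such a check might look like, written in Python purely for illustration. The function names and the regex are assumptions made for this example and are not taken from any existing tool; it also assumes index entries contain no nested formatting codes.

```python
import re

# Matches X<keyword> and the doubled-bracket form X<< keyword >>.
# Assumption: entries contain no nested formatting codes such as C<> or E<>.
_INDEX_ENTRY = re.compile(r"X(<+)\s*([^<>]*?)\s*>+")


def pod_index_entries(pod_text):
    """Return the non-empty X<> keywords found in a POD string."""
    return [m.group(2) for m in _INDEX_ENTRY.finditer(pod_text) if m.group(2)]


def has_pod_index(pod_text):
    """True if the POD carries at least one non-empty X<> entry."""
    return len(pod_index_entries(pod_text)) > 0


# A minimal module's POD, using the three keywords suggested above.
sample_pod = (
    "=head1 NAME\n\nFoo - does foo\nX<foo>\n\n"
    "=head1 AUTHORS\n\nTels\nX<tels>\n\n"
    "=head1 LICENSE\n\nGPL\nX<gpl>\n"
)
print(has_pod_index(sample_pod), pod_index_entries(sample_pod))
```

Note that a lowercase x<tels> would not count here, since POD formatting codes are uppercase.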
Can you give an example of how this has any practical impact on
anything?
Here is the main page for the project.
http://pod-indexing.annocpan.org/wiki/index.cgi
They talk only about the Perl core doc at this point, probably because
adding keywords there is already enough work. AFAIK the core docs are
now covered, so individual modules would be next.
Yep, a Google-like search engine could save the effort of manually
tagging with keywords, but I think this idea is more practical and
will improve perldoc greatly.
I hate to say it, but this indexing thing has seemed to be ass-backwards
to me from the beginning.
Instead of having one person combine a Pod Parser and Plucene indexer or
some other simple process, they expect the 3500 authors to ADD extra
content to all their POD?
Well, indexing all of CPAN was never in my original goals. My goal is to
make the core documentation more usable, and I haven't seen any
automated search engine that does that.
For example, let's say you want to find the definition of "scalar".
Sure, you can use grep and find that there are 77 documents where
"scalar" appears a total of 738 times. But which is the good one? (And
which section of the document?) You can try to come up with some clever
ranking algorithm, but it is not trivial (and it's not so easy to define
things like PageRank[tm] in this case). I'd rather have a human indexer
label the place, or just a handful of places, that have the most
relevant information for that keyword.
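To make the contrast concrete, here is a minimal sketch (in Python, for illustration only) of a lookup built from human-placed X<> entries. The two miniature documents are made up for the example and do not reproduce the real core docs; build_index is a hypothetical helper, and the sketch assumes simple X<keyword> entries without nested formatting codes.

```python
import re
from collections import defaultdict

_HEADING = re.compile(r"^=head\d\s+(.*)$")
_INDEX_ENTRY = re.compile(r"X<([^<>]+)>")


def build_index(docs):
    """Map each X<> keyword to the (document, section) pairs that carry it."""
    index = defaultdict(list)
    for name, pod_text in docs.items():
        section = None  # nearest preceding =headN, if any
        for line in pod_text.splitlines():
            heading = _HEADING.match(line)
            if heading:
                # Drop index entries that sit on the heading line itself.
                section = _INDEX_ENTRY.sub("", heading.group(1)).strip()
            for entry in _INDEX_ENTRY.finditer(line):
                key = entry.group(1).strip().lower()
                place = (name, section)
                if place not in index[key]:
                    index[key].append(place)
    return dict(index)


# Made-up miniature documents, for illustration only.
docs = {
    "perldata": "=head2 Scalar values\nX<scalar>\n\nA scalar holds one value.\n",
    "perlfunc": "=head2 Functions\n\nscalar EXPR forces scalar context.\n",
}
index = build_index(docs)
print(index.get("scalar"))
```

A plain text search for "scalar" hits both documents several times, while the index returns only the one place an indexer chose to label.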
Cheers,
Ivan