Re: CPANTS: has_signature, has_pod_index

Ivan Tubert-Brohman Mon, 07 Nov 2005 05:41:21 -0800

Adam Kennedy wrote:

* has_pod_index: The POD contains at least one X<> keyword that helps
POD indexers. Whether only one is useful is open for debate, because
at least the license (X<gpl>), your CPAN ID under authors (X<tels>),
and some generic keyword for what your module is about (X<foo>) can
probably be added even for the most minimal module.
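
To make the metric concrete, here is a minimal sketch of a check for "at least one X<> keyword". It is an illustration only, not the actual CPANTS implementation: the function names are hypothetical, the input is assumed to be POD text already extracted from the module, and the pattern handles only the single-bracket X<...> form.

```python
import re

# Assumption: only the simple X<keyword> form is recognised; the
# multi-bracket form (X<< keyword >>) is ignored in this sketch.
INDEX_ENTRY = re.compile(r"(?<!\w)X<([^<>]+)>")


def pod_index_entries(pod_text):
    """Return the keywords found in X<...> index entries, in document order."""
    return [kw.strip() for kw in INDEX_ENTRY.findall(pod_text) if kw.strip()]


def has_pod_index(pod_text):
    """True if the POD contains at least one non-empty X<...> entry."""
    return bool(pod_index_entries(pod_text))


# Invented sample POD using the keywords mentioned in the post.
sample = """=head1 LICENSE
X<gpl>

Released under the GPL.

=head1 AUTHOR
X<tels>

Tels
"""

print(pod_index_entries(sample))
print(has_pod_index(sample))
```

A module with no X<> entries at all would fail this check, which is the case the metric is meant to flag.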
Can you give an example of how this has any practical impact on
anything?
Here is the main page for the project.
    http://pod-indexing.annocpan.org/wiki/index.cgi
They talk only about the Perl core doc at this point, probably because adding keywords there is already enough work. AFAIK the core docs are now covered, so individual modules would be next.
Yep, a google-like search engine could save the effort of manually tagging with keywords, but I think this idea is more practical and will improve perldoc greatly.
I hate to say it, but this indexing thing has seemed to be ass-backwards to me from the beginning.
Instead of having one person combine a Pod Parser and Plucene indexer or some other simple process, they expect the 3500 authors to ADD extra content to all their POD?


Well, indexing all of CPAN was never in my original goals. My goal is to
make the core documentation more usable, and I haven't seen any
automated search engine that does that.

For example, let's say you want to find the definition of "scalar".
Sure, you can use grep and find that there are 77 documents where
"scalar" appears a total of 738 times. But which is the good one? (And
which section of the document?) You can try to come up with some clever
ranking algorithm, but it is not trivial (and it's not so easy to define
things like PageRank[tm] in this case). I'd rather have a human indexer
label the place, or just a handful of places, that have the most
relevant information for that keyword.
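
A rough sketch of the difference, under stated assumptions: the two documents below are invented stand-ins rather than real perldoc text, the function names are hypothetical, and only single-bracket X<...> entries are recognised. Counting raw mentions returns a number with no ranking, while the hand-placed index entry points at one document and one section.

```python
import re
from collections import defaultdict

HEADING = re.compile(r"^=head\d\s+(.*)$")
INDEX_ENTRY = re.compile(r"(?<!\w)X<([^<>]+)>")


def count_mentions(docs, word):
    """Grep-style total: every case-insensitive occurrence in every document."""
    pattern = re.compile(re.escape(word), re.IGNORECASE)
    return sum(len(pattern.findall(text)) for text in docs.values())


def build_index(docs):
    """Map each X<keyword> to the (document, section heading) where it was placed."""
    index = defaultdict(list)
    for name, text in docs.items():
        heading = None
        for line in text.splitlines():
            match = HEADING.match(line)
            if match:
                heading = match.group(1).strip()
                continue
            for keyword in INDEX_ENTRY.findall(line):
                index[keyword.strip().lower()].append((name, heading))
    return dict(index)


# Invented sample documents (hypothetical file names and content).
docs = {
    "guide_data.pod": "=head2 Scalar values\nX<scalar>\n\nA scalar holds a single value.\n",
    "guide_funcs.pod": "=head2 scalar\n\nForces scalar context. See scalar values.\n",
}

print(count_mentions(docs, "scalar"))
print(build_index(docs).get("scalar"))
```

The index lookup is only as good as the human who placed the entries, which is the trade-off being argued for here.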

Cheers,
Ivan
