Adam Kennedy wrote:

* has_pod_index: The POD contains at least one X<> keyword that helps
POD indexers. Whether only one is useful is open for debate, because
at least the license (X<gpl>), your CPAN ID under authors (X<tels>),
and some generic keyword for what your module is about (X<foo>) can
probably be added even for the most minimal module.


Can you give an example of how this has any practical impact on
anything?



Here is the main page for the project.
    http://pod-indexing.annocpan.org/wiki/index.cgi

They talk only about the Perl core doc at this point, probably because adding keywords there is already enough work. AFAIK the core docs are now covered, so individual modules would be next.

Yep, a Google-like search engine could save the effort of manually tagging with keywords, but I think this idea is more practical and will improve perldoc greatly.


I hate to say it, but this indexing thing has seemed to be ass-backwards to me from the beginning.

Instead of having one person combine a Pod Parser and Plucene indexer or some other simple process, they expect the 3500 authors to ADD extra content to all their POD?

Well, indexing all of CPAN was never in my original goals. My goal is to make the core documentation more usable, and I haven't seen any automated search engine that does that.

For example, let's say you want to find the definition of "scalar". Sure, you can use grep and find that there are 77 documents where "scalar" appears a total of 738 times. But which is the good one? (And which section of the document?) You can try to come up with some clever ranking algorithm, but it is not trivial (and it's not so easy to define things like PageRank[tm] in this case). I'd rather have a human indexer label the place, or just a handful of places, that have the most relevant information for that keyword.
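To make the idea concrete, here is a rough sketch (in Python, only for illustration) of how an index could be built from X<> entries once a human has placed them. The file names, the sample POD text, and the helper names `index_entries`, `has_pod_index`, and `build_index` are all made up for this example. It only handles the simple X<keyword> form, not nested or X<< ... >> variants, and it is not how any existing tool works.

```python
import re
from collections import defaultdict

# Matches simple X<keyword> entries only (assumption: no nested or X<< >> forms).
X_ENTRY = re.compile(r"X<([^<>]+)>")
# Matches POD heading commands such as "=head1 NAME" or "=head2 Scalar values".
HEADING = re.compile(r"^=head\d\s+(.*)$")


def index_entries(pod_text):
    """Yield (keyword, section) pairs for each X<> entry in one POD document."""
    section = None
    for line in pod_text.splitlines():
        heading = HEADING.match(line)
        if heading:
            # Remember the current section, without any X<> entries in its title.
            section = X_ENTRY.sub("", heading.group(1)).strip()
        for match in X_ENTRY.finditer(line):
            yield match.group(1).strip().lower(), section


def has_pod_index(pod_text):
    """True if the document contains at least one X<> entry."""
    return any(True for _ in index_entries(pod_text))


def build_index(docs):
    """Map each keyword to the (document, section) places a human marked."""
    index = defaultdict(list)
    for name, text in docs.items():
        for keyword, section in index_entries(text):
            index[keyword].append((name, section))
    return dict(index)


# Hypothetical sample documents: both mention "scalar", only one is indexed.
SAMPLE_DOCS = {
    "example_a.pod": (
        "=head1 DESCRIPTION\n\n"
        "=head2 Scalar values\nX<scalar>\n\n"
        "A scalar holds a single value.\n"
    ),
    "example_b.pod": (
        "=head1 DESCRIPTION\n\n"
        "The word scalar shows up here, and scalar again, with no index entry.\n"
    ),
}

index = build_index(SAMPLE_DOCS)
print(index.get("scalar"))
```

A grep for "scalar" would hit both sample documents, while the index lookup returns only the one place that was labeled as the relevant definition, together with its section.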

Cheers,
Ivan
