Hello,

I have been using Nutch for a few days now, and it seems to be working
great. One thing that I do need is the ability to index HTML meta tags from
pages. I'm using Nutch to search some articles, and the HTML pages contain meta
tags such as "author". From searching the mailing list, I saw that
there were a few requests made last year for this, but that there was no
built-in functionality. Is this accurate?

A few people suggested writing plug-ins, while others claimed that you
could modify certain files to do the job. Is there a simple way to do this,
or do I have no choice but to write a plug-in for it?

I read http://wiki.apache.org/nutch/WritingPluginExample-0%2e9 but it seems
somewhat overwhelming at this point. Any suggestions would be helpful.

Thanks.

Cheers
-- 
View this message in context: 
http://www.nabble.com/Indexing-HTML-meta-tags-tp21438171p21438171.html
Sent from the Nutch - User mailing list archive at Nabble.com.
