Thanks for the reply. I will start a new thread about writing plug-ins since
it is technically a new topic.

If any of the other people here have suggestions, please add them. I read
somewhere that we can copy the existing index-more plugin and add a few lines
so that it reads meta tags and indexes them. Any ideas about that?
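
To make the "reads meta tags" part concrete, here is a rough sketch of the
extraction step only. It uses just the JDK and none of Nutch's plugin classes,
so it is not what index-more does; MetaTagExtractor and extract are names I
made up for illustration. I am assuming a real plugin would do something
similar on the parsed page and then add each name/content pair as an index
field. The regular expressions are a simplification and will miss unquoted
attribute values or a '>' inside a value.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: pulls <meta name=... content=...> pairs out of raw HTML.
class MetaTagExtractor {
    // Matches a whole <meta ...> tag, case-insensitively.
    private static final Pattern META_TAG =
        Pattern.compile("<meta\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // Matches one attribute with a single- or double-quoted value.
    private static final Pattern ATTRIBUTE =
        Pattern.compile("([\\w-]+)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')");

    // Returns a map from lower-cased meta name (e.g. "author") to its content value.
    static Map<String, String> extract(String html) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher tag = META_TAG.matcher(html);
        while (tag.find()) {
            String name = null;
            String content = null;
            Matcher attr = ATTRIBUTE.matcher(tag.group());
            while (attr.find()) {
                String key = attr.group(1).toLowerCase(Locale.ROOT);
                String value = attr.group(2) != null ? attr.group(2) : attr.group(3);
                if (key.equals("name")) {
                    name = value;
                } else if (key.equals("content")) {
                    content = value;
                }
            }
            // Skip tags such as <meta http-equiv=...> that lack a name/content pair.
            if (name != null && content != null) {
                result.put(name.toLowerCase(Locale.ROOT), content);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<html><head>"
            + "<meta name=\"author\" content=\"Jane Doe\">"
            + "<meta content='nutch' name=\"Keywords\">"
            + "</head><body></body></html>";
        System.out.println(extract(html)); // prints {author=Jane Doe, keywords=nutch}
    }
}
```

Each entry in the returned map would then correspond to one field to index,
for example an "author" field holding "Jane Doe".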

Cheers,



Doğacan Güney-3 wrote:
> 
> On Tue, Jan 13, 2009 at 5:38 PM, ahammad <[email protected]> wrote:
>>
>> Hello,
>>
>> I have been using Nutch for a few days now, and it seems to be working
>> great. One thing that I do need is the ability to index HTML meta tags
>> from pages. I'm using Nutch to search some articles, so there are tags
>> like "author" in the HTML pages. From searching the mailing list, I saw
>> that there were a few requests made last year for this, but that there
>> was no built-in functionality. Is this accurate?
>>
>> A few people suggested writing plug-ins, while some others claimed that
>> you could modify certain files to do the job. Is there a simple way to do
>> this, or do I have no choice but to write a plug-in for it?
>>
> 
> No, unfortunately you will have to write a plug-in for it. I have
> something in mind that will make extracting data from HTML pages easier,
> but that's for post-1.0.
> 
>> I read http://wiki.apache.org/nutch/WritingPluginExample-0%2e9 but it seems
>> somewhat overwhelming at this point. Any suggestions would be helpful.
>>
>> Thanks.
>>
>> Cheers
>> --
>> View this message in context:
>> http://www.nabble.com/Indexing-HTML-meta-tags-tp21438171p21438171.html
>> Sent from the Nutch - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> -- 
> Doğacan Güney
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Indexing-HTML-meta-tags-tp21438171p21441215.html
Sent from the Nutch - User mailing list archive at Nabble.com.
