On 28 May 2009 at 12:22, Gaurav Kumar wrote:
Hi everyone,
I am doing a project using Lucene where I need to index HTML files. I
am using Tika to parse the HTML files, but I need to index them
according to their tags, meaning that the text inside each HTML tag
(like <p> or <a>) should be stored in a separate field. Can I do that?
If yes, how? Also, can I assign different weights to the tokens in
different fields? If yes, how?

You might want to explain what you are trying to achieve with this. I
suspect you might want to use payloads rather than indexing the tokens
in multiple fields.
karl
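
For readers following the thread, here is a minimal sketch of the per-tag
field idea from the question. It is not Tika or Lucene code: the class
`TagFields`, the method `textByTag`, the `WEIGHTS` table and its values are
all hypothetical, and the naive scanner only stands in for the SAX events a
real HTML parser would deliver. It collects the text directly inside each
tag into one string per tag name and looks up a weight for that name.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

class TagFields {

    // Hypothetical per-field weights; the values are illustrative only.
    static final Map<String, Float> WEIGHTS = new LinkedHashMap<>();
    static {
        WEIGHTS.put("p", 1.0f);
        WEIGHTS.put("a", 2.0f);
    }

    /** Returns the weight for a tag name, defaulting to 1.0 for unknown tags. */
    static float weightFor(String tag) {
        return WEIGHTS.getOrDefault(tag, 1.0f);
    }

    /**
     * Groups text by its innermost enclosing tag.
     * Naive stand-in for a real HTML parser: assumes well-nested tags,
     * no comments, no scripts, and no '<' or '>' inside attribute values.
     */
    static Map<String, String> textByTag(String html) {
        Map<String, StringBuilder> buffers = new LinkedHashMap<>();
        Deque<String> open = new ArrayDeque<>(); // stack of currently open tags
        int i = 0;
        while (i < html.length()) {
            int lt = html.indexOf('<', i);
            int end = (lt < 0) ? html.length() : lt;
            String text = html.substring(i, end).trim();
            if (!text.isEmpty() && !open.isEmpty()) {
                StringBuilder sb =
                    buffers.computeIfAbsent(open.peek(), k -> new StringBuilder());
                if (sb.length() > 0) sb.append(' ');
                sb.append(text);
            }
            if (lt < 0) break;
            int gt = html.indexOf('>', lt);
            if (gt < 0) break;
            String tag = html.substring(lt + 1, gt).trim();
            if (tag.startsWith("/")) {
                if (!open.isEmpty()) open.pop();      // closing tag
            } else if (!tag.endsWith("/")) {
                // opening tag: keep only the name, drop attributes
                open.push(tag.split("\\s+")[0].toLowerCase());
            }
            i = gt + 1;
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, StringBuilder> e : buffers.entrySet()) {
            fields.put(e.getKey(), e.getValue().toString());
        }
        return fields;
    }

    public static void main(String[] args) {
        String html = "<p>Lucene indexes <a href=\"x\">this link</a> quickly</p>";
        for (Map.Entry<String, String> e : textByTag(html).entrySet()) {
            // In a real indexer, each entry would become one document field,
            // with the weight applied as that field's boost.
            System.out.println(e.getKey() + " (weight " + weightFor(e.getKey())
                + "): " + e.getValue());
        }
    }
}
```

In the Lucene releases of that period, each map entry could plausibly become
one `Field` on the document, with the weight passed to `Field.setBoost(float)`.
That gives one weight per field; the payload approach suggested above instead
attaches the weight to individual tokens within a single field.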