you will need to develop parser and indexer. but remember that in current implementation content is not stored in lucene index,
indexed - yes nut not stored. Best Regards Alexander Aristov 2009/5/28 Gaurav Kumar <gaurav.bond.it...@gmail.com> > Hi everyone, > > I am doing a project using Lucene where i need to index HTML files. I am > using Tika to parse HTML files. But i need to index files according to > their > tags which means that every text present in different HTML tag (like <p> > <a>) should be stored in different fields. Can i do that. If yes how? Also > can i assign different weightage to the tokens present in different fields. > If yes how? >