Payload!!

2010/10/14 Christoph Hermann <herm...@informatik.uni-freiburg.de>
> Hi,
>
> is there a way to store additional metadata with fields?
>
> My problem is as follows:
> I'm extracting extended HTML with Tika. This extended HTML contains
> references to pages, x,y values of the text, etc. I want to be able to
> retrieve those values when text is found while searching.
>
> So when creating the Document, I'm storing a Field for every part of the
> text content of the document I'm currently indexing (let's call it
> "content").
>
> Example:
> I have the following content:
> <html><body>
> <span page="1" x="1" y="1">This is a very</span>
> <span page="1" x="1" y="2">interesting text.</span>
> <span page="2" x="1" y="1">This is boring text</span>
> </body></html>
>
> So I would store the following:
>
> doc.add(new Field("content", "This is a very", Field.Store.YES,
>     Field.Index.ANALYZED));
> doc.add(new Field("content", "interesting text", Field.Store.YES,
>     Field.Index.ANALYZED));
> doc.add(new Field("content", "This is boring text", Field.Store.YES,
>     Field.Index.ANALYZED));
>
> Is there any way to include the page,x,y values in there?
> I'd like to display the page when retrieving the results.
>
> I thought about storing the same field twice and adding the page,x,y
> values at the beginning of the Field, and then extracting those values
> when retrieving the field, but maybe there's a better way?
>
> regards
> Christoph Hermann
>
> --
> Christoph Hermann
> Institut für Informatik
> Tel: +49 761-203-8171 Fax: +49 761-203-8162
> e-mail: herm...@informatik.uni-freiburg.de
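
For comparison, the prefix idea from the end of the question could be sketched roughly as below. This is only an illustration in plain Java with no Lucene calls: the class name PositionedText, the "page,x,y|text" layout and the '|' separator are all assumptions made up for this sketch, not anything Lucene defines.

```java
// Hypothetical helper (not part of Lucene): carries page/x/y alongside a text
// fragment by packing them into a prefix of the stored string.
final class PositionedText {
    // Assumed separator between the numeric prefix and the text. Only the first
    // occurrence is treated as the separator, so the text itself may contain '|'.
    private static final char SEP = '|';

    final int page;
    final int x;
    final int y;
    final String text;

    PositionedText(int page, int x, int y, String text) {
        this.page = page;
        this.x = x;
        this.y = y;
        this.text = text;
    }

    // Builds the value to store, in the assumed layout "page,x,y|text".
    String encode() {
        return page + "," + x + "," + y + SEP + text;
    }

    // Parses a stored value back into its parts; rejects values without a valid prefix.
    static PositionedText decode(String stored) {
        int sep = stored.indexOf(SEP);
        if (sep < 0) {
            throw new IllegalArgumentException("missing position prefix: " + stored);
        }
        String[] parts = stored.substring(0, sep).split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected page,x,y but got: " + stored);
        }
        return new PositionedText(
                Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2]),
                stored.substring(sep + 1));
    }

    public static void main(String[] args) {
        // Round trip for the second span of the example content.
        String stored = new PositionedText(1, 1, 2, "interesting text.").encode();
        PositionedText back = decode(stored);
        System.out.println(stored);
        System.out.println("page=" + back.page + " x=" + back.x + " y=" + back.y
                + " text=" + back.text);
    }
}
```

If the prefixed string were also indexed, the numbers would end up as searchable terms, which is presumably why the question talks about storing the field twice: one copy indexed for search and one stored-only copy carrying the prefix. A payload avoids the second field by attaching the bytes to the indexed terms instead.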