Hi,

Is there a way to store additional metadata with fields?

My problem is as follows:
I'm extracting extended HTML with Tika. This extended HTML contains references
to pages, the x and y coordinates of the text, etc. I want to be able to
retrieve those values for the text that matched a search.

So when creating the Document, I'm storing a Field for every part of the text
content of the document I'm currently indexing (let's call the field "content").

Example:
I have the following content:
<html><body>
<span page="1" x="1" y="1">This is a very</span>
<span page="1" x="1" y="2">interesting text.</span>
<span page="2" x="1" y="1">This is boring text</span>
</body></html>

So I would store the following:

// One "content" field per <span>: stored so it can be retrieved,
// analyzed so it can be searched.
doc.add(new Field("content", "This is a very",
    Field.Store.YES, Field.Index.ANALYZED));
doc.add(new Field("content", "interesting text.",
    Field.Store.YES, Field.Index.ANALYZED));
doc.add(new Field("content", "This is boring text",
    Field.Store.YES, Field.Index.ANALYZED));

Is there any way to include the page, x and y values in there?
I'd like to display the page when retrieving the results.

I thought about storing the same field twice, with the page, x and y values
added at the beginning of the second copy, and then extracting those values
when retrieving the field, but maybe there's a better way?
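
To make that workaround concrete, here is a minimal sketch of the prefix idea
in plain Java. The class PositionedText and the "page|x|y|text" layout are
made up for illustration; nothing in it is a Lucene API. It only shows how the
values could be packed into one string and pulled out again.

```java
// Hypothetical helper (not part of Lucene): packs page/x/y in front of the
// text so that a single stored field value carries both.
final class PositionedText {
    final int page;
    final int x;
    final int y;
    final String text;

    PositionedText(int page, int x, int y, String text) {
        this.page = page;
        this.x = x;
        this.y = y;
        this.text = text;
    }

    // Produces the assumed layout "page|x|y|text".
    String encode() {
        return page + "|" + x + "|" + y + "|" + text;
    }

    // Reverses encode(). The split limit of 4 keeps any '|' that occurs
    // inside the text itself in the last part.
    static PositionedText decode(String stored) {
        String[] parts = stored.split("\\|", 4);
        if (parts.length != 4) {
            throw new IllegalArgumentException("not page|x|y|text: " + stored);
        }
        return new PositionedText(
            Integer.parseInt(parts[0]),
            Integer.parseInt(parts[1]),
            Integer.parseInt(parts[2]),
            parts[3]);
    }
}
```

The encoded string would go into the second, stored copy of the field, and
decode() would be called on the value read back from a hit. It works, but it
doubles the stored text, which is why I'm asking for something cleaner.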

Regards,
Christoph Hermann

-- 
Christoph Hermann
Institut für Informatik
Tel: +49 761-203-8171 Fax: +49 761-203-8162
e-mail: herm...@informatik.uni-freiburg.de
