Nadav Har'El wrote:
On Thu, Jan 18, 2007, Michael Busch wrote about "Re: Payloads":
As you pointed out, it is still possible to have per-doc payloads: you need an analyzer which adds just one Token with a payload to a specific field for each doc. I understand that this code would be quite ugly on the app side. A more elegant solution might be LUCENE-580. With that patch you are able to add pre-analyzed fields (i.e. TokenStreams) to a Document without having to use an analyzer. You could use a TokenStream

Thanks, this sounds like a good idea.

In fact, I could live with something even simpler: I want to be able
to create a Field with a single token (with its payload). If I need more
than one of these tokens with payloads, I can just add several fields with
the same name (this should work, although the description of LUCENE-580
suggests that it might have a bug in this area).

I'll add a comment about this use-case to LUCENE-580.
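
Just to sketch what I mean (assuming the pre-analyzed Field(String, TokenStream) constructor proposed by LUCENE-580 and the Token.setPayload()/Payload classes from the payloads work; package names and exact signatures may differ between Lucene versions), such a single-token field could look roughly like this:

import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Payload;

/** A TokenStream that returns exactly one Token carrying a per-document payload. */
class SingleTokenStream extends TokenStream {
  private Token token;

  SingleTokenStream(String term, byte[] payloadData) {
    token = new Token(term, 0, term.length());
    token.setPayload(new Payload(payloadData));
  }

  public Token next() {
    Token t = token;
    token = null;   // hand out the single token once, then signal end of stream
    return t;
  }
}

Adding several payload-carrying tokens under the same field name would then just be (the field name and payload bytes are made up for illustration):

Document doc = new Document();
doc.add(new Field("docdata", new SingleTokenStream("markerA", new byte[] { 42 })));
doc.add(new Field("docdata", new SingleTokenStream("markerB", new byte[] { 7 })));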

Yes, for your use case it would indeed make sense to just add a single Token to a field. But there are other use cases that would benefit from 580, e.g. using UIMA as a parser. UIMA does not work per field; it materializes the tokens of all fields in a CAS. So the indexer can't call the parser per field; the parsing has to be done before indexing. It would therefore make sense to do the parsing first and then add TokenStreams for the different fields to the Document, where each stream only iterates through the CAS. This is of course also possible by adding multiple Field instances containing single Tokens to a Document, but performance would suffer: each Token would be wrapped in a Field object and then held in a list in the Document.

So I think being able to add TokenStreams to a Document makes sense.
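
A rough sketch of what I have in mind (only an illustration: the ListTokenStream helper, the field names, and the idea of first copying the CAS annotations into per-field Token lists are assumptions, not part of 580 itself):

import java.util.Iterator;
import java.util.List;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class PreParsedIndexingSketch {

  /** Replays already-created Tokens, e.g. ones extracted from a UIMA CAS. */
  static class ListTokenStream extends TokenStream {
    private final Iterator tokens;

    ListTokenStream(List tokens) {
      this.tokens = tokens.iterator();
    }

    public Token next() {
      return tokens.hasNext() ? (Token) tokens.next() : null;
    }
  }

  /** Parsing happens once, before indexing; each field only gets a lightweight stream over the shared result. */
  static Document buildDocument(List titleTokens, List bodyTokens) {
    Document doc = new Document();
    doc.add(new Field("title", new ListTokenStream(titleTokens)));
    doc.add(new Field("body", new ListTokenStream(bodyTokens)));
    return doc;
  }
}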


