Re: Merging two tokenized fields

liat oren Fri, 27 Feb 2009 13:02:56 -0800

Thanks for your answer. I will store both texts (I have my own object IDs
that I use to identify the documents) and will index the text after the
merge.
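
To make the plan concrete, here is a minimal sketch of the merge step only. It uses plain Java maps as a stand-in for the stored field of each index, keyed by the application's own document id; the class name, method name and sample ids are hypothetical, and no Lucene calls are shown. The merged text would then be analyzed once when the new index is built.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in: each map plays the role of one index's stored field,
// keyed by the application's own document id (not the Lucene doc id).
class StoredFieldMerge {

    // Joins the stored text of both sources per id, separated by a space.
    // Ids present in only one source keep that source's text unchanged.
    static Map<String, String> mergeById(Map<String, String> first,
                                         Map<String, String> second) {
        Map<String, String> merged = new LinkedHashMap<>(first);
        for (Map.Entry<String, String> e : second.entrySet()) {
            merged.merge(e.getKey(), e.getValue(), (a, b) -> a + " " + b);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> indexA = new LinkedHashMap<>();
        indexA.put("doc-1", "quick brown fox");
        indexA.put("doc-2", "lazy dog");
        Map<String, String> indexB = new LinkedHashMap<>();
        indexB.put("doc-1", "jumps over");
        indexB.put("doc-3", "only in second index");

        // Each merged value would be indexed as a single tokenized field.
        mergeById(indexA, indexB).forEach((id, text) ->
                System.out.println(id + " -> " + text));
    }
}
```

Keying on my own ids rather than the Lucene doc id means the merge does not depend on the order in which documents were added to either index.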


Thank you,
Liat

2009/2/26 Erick Erickson <erickerick...@gmail.com>

> Reconstructing a field from an index is
> 1> slow
> 2> lossy (what about stemmed words? stopwords? )
>
> UNLESS you have stored the data (Field.Store.YES/COMPRESS),
> in which case you can just get the field from each index and put it
> in the new one. Tokenization has little to do with this although you
> could get a similar effect with untokenized fields but why would you
> want to?
>
> I assume you have a way to uniquely identify the documents that you
> want to combine, relying on the Lucene doc ID is fragile....
>
> If possible, your best bet would be to reconstruct the new index
> from the source you used to create the original indexes.
>
> Maybe a higher-level problem statement would help generate
> more suggestions.
>
> Best
> Erick
>
> On Thu, Feb 26, 2009 at 7:07 AM, liat oren <oren.l...@gmail.com> wrote:
>
> > Hi,
> >
> > I have two indexes, each has a tokenized field and I would like to combine
> > them both into one field in a new index.
> > How can it be done?
> > (Is it a good approach or is it better to hold them as untokenized text and
> > only when I create the new index, then to tokenize it?)
> >
> > Many thanks,
> > Liat
> >
>
