The lack of response to this question makes me think that either there
is no good answer, or maybe the question was too obscure. So I'll give
it one more go with some more detail ...
My main goal is to implement autocompletion with a mix of words and
short phrases, where the words are drawn from the text of largish
documents, and the phrases are author names and document titles.
I think the best way to accomplish this is to build a single field
that combines data from these other "source" fields (as usual with
copyField), but with some of the fields treated as keywords (i.e., with
their values inserted as single tokens), and others tokenized. I
believe this would be possible at the Lucene level by calling
Document.add() with multiple fields having the same name: some
marked as tokenized and others not. I think the tokenized fields would
all have to share the same analyzer, but that's OK for my case.
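To make the intent concrete, here is a plain-Java sketch of what the combined
suggest field's token stream would look like; it is not real Lucene code (the
actual version would use Document.add() with per-field FieldType settings),
and the field values and whitespace tokenization are just illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Sketch: keyword-style sources (author, title) each contribute one
// token, while full-text sources are tokenized, so all of them can
// feed one combined "suggest" field.
public class SuggestFieldSketch {
    // Keyword-style source: the whole value becomes a single token,
    // as KeywordAnalyzer would produce.
    static List<String> keywordTokens(String value) {
        List<String> tokens = new ArrayList<>();
        tokens.add(value.toLowerCase(Locale.ROOT));
        return tokens;
    }

    // Full-text source: lowercase and split on whitespace, roughly
    // what a simple tokenizing analyzer would do.
    static List<String> textTokens(String value) {
        List<String> tokens = new ArrayList<>();
        for (String t : value.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> suggest = new ArrayList<>();
        suggest.addAll(keywordTokens("William Shakespeare"));  // author -> one token
        suggest.addAll(keywordTokens("The Winter's Tale"));    // title  -> one token
        suggest.addAll(textTokens("Exit, pursued by a bear")); // text   -> many tokens
        System.out.println(suggest);
    }
}
```

A suggester built over this field could then complete both single words
from the text and whole author/title phrases.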
I can't see how this could be made to happen in Solr without a lot of
custom coding, though. It seems as if the conversion from Solr fields to
Lucene fields is not an easy thing to influence. If anyone has an idea
how to achieve the subgoal, or perhaps a different way of getting at the
main goal, I'd love to hear about it.
So far my only other idea is to write some kind of custom analyzer that
treats short texts as keywords and tokenizes longer ones, which is
probably what I'll look at if nothing else comes up.
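For what it's worth, the length-cutoff idea can be sketched in plain Java;
the 30-character threshold and the whitespace tokenization are arbitrary
placeholders, and a real implementation would live inside a Lucene
Analyzer/Tokenizer rather than a static method:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Sketch of a length-based analysis switch: values at or under a cutoff
// are emitted as a single keyword token; longer values are tokenized.
public class LengthSwitchSketch {
    static final int KEYWORD_MAX_LENGTH = 30; // placeholder threshold

    static List<String> analyze(String value) {
        List<String> tokens = new ArrayList<>();
        String lower = value.toLowerCase(Locale.ROOT);
        if (lower.length() <= KEYWORD_MAX_LENGTH) {
            tokens.add(lower); // short value: treat as one keyword
        } else {
            for (String t : lower.split("\\s+")) { // long value: tokenize
                if (!t.isEmpty()) tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Michael Sokolov"));
        System.out.println(analyze("a largish document whose text runs well past the cutoff"));
    }
}
```

One caveat: a real Lucene Tokenizer only receives a Reader, so the length
check would have to happen after buffering the input (KeywordTokenizer
already buffers the whole value, so it could serve as a starting point).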
Thanks
Mike
On 4/9/2014 4:16 PM, Michael Sokolov wrote:
I think I would like to do something like copyField from a bunch of
fields into a single field, but with different analysis for each
source, and I'm pretty sure that's not a thing. Is there some
alternate way to accomplish my goal?
Which is to have a suggester that suggests words from my full text
field and complete phrases drawn from my author and title fields all
at the same time. So if I could index author and title using
KeywordAnalyzer, and the full text tokenized, that would be the bee's knees.
-Mike