On 12/11/2011 1:54 PM, Brian Lamb wrote:
By nature of my schema, I have several multivalued fields. Each one I
populate with a separate entity. Is there a better way to do it? For
example, could I pull in all the singular data in one sitting and then come
back in later and populate with the multivalued items?

An alternate approach in some cases would be to do a GROUP_CONCAT and then
populate the multivalued column with some transformation. Is that possible?

Lastly, is it possible to use copyField to copy three regular fields into
one multiValued field and have all the data show up?

The best way to proceed may depend on whether you actually need the field to be multivalued (returning an array in search results), or if you simply need to be able to search on all the values. For me, it's the latter - the field isn't stored.

I use the GROUP_CONCAT method (hidden in a database view, so Solr doesn't need to know about it) to put multiple values into a field, separated by semicolons. I then use the following single-valued fieldType to split those up and make all the values searchable. The tokenizer splits on semicolons followed by zero or more spaces, and the pattern filter strips leading and trailing punctuation from each token. The ICU filter is basically a better implementation of the ASCII folding filter and the lowercase filter, in a single pass. The others are fairly self-explanatory:

<!-- lowercases, tokenizes by semicolons -->
<fieldType name="lcsemi" class="solr.TextField" sortMissingLast="true" positionIncrementGap="0" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="; *"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$"
            replacement="$2"
            allowempty="false"
    />
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>
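
For illustration, a field using that type might be declared roughly like this in schema.xml. The field name "keywords" is made up for the example; the important parts are that the field is indexed but not stored, and that it is not declared multiValued, since the splitting happens in the analyzer:

```xml
<!-- "keywords" is a hypothetical field name; substitute your own -->
<!-- indexed but not stored: the values are searchable but never returned in results -->
<field name="keywords" type="lcsemi" indexed="true" stored="false"/>
```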

If you actually do need the field to be multivalued, then you'll need to do a dataimport transformation, as mentioned by Gora, who also replied.
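
As a rough sketch of one way to do that (not necessarily what Gora had in mind), the DataImportHandler's RegexTransformer can split a concatenated column into separate values with its splitBy attribute. Everything named here is an assumption for the example: the view item_view, the columns id, title and tag_list, the target field tags, and the connection details. The tags field would also need multiValued="true" in schema.xml.

```xml
<!-- data-config.xml sketch; all table, column, and field names are hypothetical -->
<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb" user="solr" password="secret"/>
  <document>
    <!-- item_view is assumed to build tag_list with GROUP_CONCAT, separated by semicolons -->
    <entity name="item" transformer="RegexTransformer"
            query="SELECT id, title, tag_list FROM item_view">
      <field column="id" name="id"/>
      <field column="title" name="title"/>
      <!-- splitBy is a regex; each piece of tag_list becomes one value of tags -->
      <field column="tags" sourceColName="tag_list" splitBy=";"/>
    </entity>
  </document>
</dataConfig>
```

With this approach the values come back as an array in search results if the field is stored, at the cost of Solr needing to know about the delimiter.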

Thanks,
Shawn
