On 7/29/2010 12:18 PM, Chris Hostetter wrote:
it also depends on what you want to get *out* if this is a stored field
... using an analyzer like this will deal with letting you facet on the
individual terms, but the stored value returned with each document will
still be a single semicolon-separated string.
modifying your indexing code (either DIH, or whatever) will actually
result in an array of distinct values being returned for the documents.
Very true. In my case, the added field is not stored, just indexed. I
would only store it for debugging purposes, so it would be OK to keep
the semicolons even then.
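
For anyone who does need separate stored values, the split has to happen before the document reaches Solr. A minimal sketch of that step in Python, assuming the source column arrives as one semicolon-separated string (the function name split_values is made up for illustration and is not part of DIH or Solr):

```python
def split_values(joined, sep=";"):
    """Split a delimiter-joined string into a list of distinct values.

    Hypothetical helper: it trims whitespace, drops empty pieces, and
    keeps the first occurrence of each value in its original order.
    """
    if not joined:
        return []
    values = []
    for part in joined.split(sep):
        value = part.strip()
        if value and value not in values:
            values.append(value)
    return values
```

The resulting list would then be sent as the values of a multivalued field, so each value comes back separately when the field is stored.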
My initial approach was to grab the values (which are in another table)
with a DIH subentity and store them in a multivalued field, but that
reduced indexing speed to a crawl. That's because instead of one query for
the entire import, it was making an individual subquery for every
document returned by the main query. When I switched to a left join
instead, I couldn't see any performance difference compared to the
import without the extra values, and it's still one query.
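
To illustrate the difference in query count, here is a rough sketch using Python's sqlite3 module rather than DIH. The table names (docs, tags), the sample rows, and both function names are invented for the example, and GROUP_CONCAT is the SQLite aggregate; other databases spell it differently.

```python
import sqlite3

# Hypothetical schema: a main table plus a child table holding the values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (doc_id INTEGER, tag TEXT);
    INSERT INTO docs VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO tags VALUES (1, 'red'), (1, 'blue'), (2, 'green');
""")


def import_with_subqueries(conn):
    """One main query plus one subquery per document (subentity style)."""
    queries = 1
    main_rows = conn.execute("SELECT id, title FROM docs ORDER BY id").fetchall()
    rows = []
    for doc_id, title in main_rows:
        cursor = conn.execute(
            "SELECT tag FROM tags WHERE doc_id = ? ORDER BY tag", (doc_id,)
        )
        queries += 1
        rows.append((doc_id, title, [tag for (tag,) in cursor]))
    return rows, queries


def import_with_left_join(conn):
    """A single query; the child values arrive joined with semicolons."""
    sql = """
        SELECT d.id, d.title, GROUP_CONCAT(t.tag, ';')
        FROM docs d
        LEFT JOIN tags t ON t.doc_id = d.id
        GROUP BY d.id, d.title
        ORDER BY d.id
    """
    rows = []
    for doc_id, title, joined in conn.execute(sql):
        # GROUP_CONCAT yields NULL (None) when a document has no child rows.
        tags = sorted(joined.split(";")) if joined else []
        rows.append((doc_id, title, tags))
    return rows, 1
```

Both functions return the same documents, but the first one issues a query for every row of the main result, which is what hurts on a large import.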
If I were to revamp my build system to import in chunks, I could
probably use the DIH option that caches the queries, but I really don't
want to do that.