We have Toke's scenario in a couple of places too. We are capable of writing
a URP that does it, but it feels dirty, and replacing config with code seems
like something worth avoiding.

Having top-level support for some kind of analysis, in a URP or somewhere
else, would also help in other situations where you want a value analyzed
before it goes into a non-TextField.
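
For illustration, here is a rough sketch in plain Java of the kind of
normalisation that currently has to live in custom code (a URP or client-side
pre-processing) before values go into a non-TextField. The class and method
names are made up, and the trim/lowercase/strip-diacritics steps are only a
stand-in for a real analysis chain; none of this is Solr or Lucene API.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical stand-in for an analysis chain; not a Solr or Lucene class. */
public class FacetValueNormalizer {

    /** Trim, collapse inner whitespace, lowercase, and strip combining marks. */
    public static String normalize(String raw) {
        String collapsed = raw.trim().replaceAll("\\s+", " ");
        String lower = collapsed.toLowerCase(Locale.ROOT);
        // NFD splits accented letters into base letter + combining mark.
        String decomposed = Normalizer.normalize(lower, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    /** Normalize each raw value, dropping empties and duplicates, keeping order. */
    public static List<String> normalizeAll(List<String> rawValues) {
        Set<String> seen = new LinkedHashSet<>();
        for (String raw : rawValues) {
            String value = normalize(raw);
            if (!value.isEmpty()) {
                seen.add(value);
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList(
                "  Hans  Christian ANDERSEN ", "hans christian andersen", "Caf\u00e9");
        // Values that would be sent to a separate string field used for faceting.
        System.out.println(normalizeAll(raw));
    }
}
```

The point is that every one of these steps already exists as a configurable
filter; having to re-implement them in code is the part that feels wrong.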

Ryan

On Thu, Nov 24, 2016 at 00:53 Toke Eskildsen <t...@statsbiblioteket.dk> wrote:

> On Wed, 2016-11-23 at 13:23 +0000, David Smiley wrote:
> > This is supported at the Lucene level via SortedSetDocValues.  Solr
> > doesn't yet support this for its TextField
> > -- https://issues.apache.org/jira/browse/SOLR-8362
> >  however you could work around this with an URP or copyField
>
> copyField does not help here as that copies the raw values. We need the
> normalised values for display when we do faceting.
>
> >  or perhaps subclassing TextField so that you can tokenize the text a
> > second time to generate a list of SortedSetDocValuesField.  Probably
> > the least painful option is to use another field.
>
> So to facet on the normalised (analyzed really) values on a Text field
> in a post-FieldCache Solr, I would need to write an URP or some other
> custom code. I can manage that or just do the normalisation as part of
> the pre-processing.
>
> The question is whether my scenario (using analyzers for facet terms) is
> widespread. If so, I find this increase in implementation requirements
> problematic.
>
>
> I don't care for FieldCache as such - SOLR-8362 would be a better
> solution for the scenario I describe. Or maybe an URP that makes it
> easy to provide a list of analyzers? I am simply looking for a way
> that a random end-user can easily do faceting on analyzed terms,
> leveraging all the nice built-in filters in Solr.
>
> - Toke Eskildsen, State and University Library, Denmark
>
