Re: Multiple Word Facets

Ahmet Arslan Tue, 26 Oct 2010 19:32:43 -0700

Facets are generated from indexed terms.

Depending on your need/use-case:


You can use an additional, separate string field (which is not tokenized) for 
facets and populate it via copyField. Search on the tokenized field; facet on 
the non-tokenized field.
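
A minimal schema.xml sketch of that setup, assuming your searchable field is 
called "title" (as in your facet output) and that the "string" type from the 
example schema (solr.StrField) is still defined. The name "title_facet" is 
made up for illustration:

```xml
<!-- Tokenized field, used for searching -->
<field name="title" type="text" indexed="true" stored="true"/>

<!-- Untokenized copy, used only for faceting ("title_facet" is a hypothetical name) -->
<field name="title_facet" type="string" indexed="true" stored="false"/>

<!-- Copy the raw title value into the facet field at index time -->
<copyField source="title" dest="title_facet"/>
```

You would then query with something like 
q=title:highway&facet=true&facet.field=title_facet. Each facet value is the 
whole original title string, so this fits best when the field holds short 
phrases. If "title" is multiValued, the facet field needs multiValued="true" 
as well. You need to reindex after changing the schema.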

Or

You can add solr.ShingleFilterFactory to your index analyzer to form 
multi-word terms.
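
A rough sketch of what that could look like. The type name "text_shingle" and 
the choice of maxShingleSize="3" are only assumptions for illustration; adjust 
them to your data:

```xml
<!-- "text_shingle" is a hypothetical type name -->
<fieldType name="text_shingle" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Emit 2- and 3-word shingles in addition to the single words -->
    <filter class="solr.ShingleFilterFactory"
            maxShingleSize="3" outputUnigrams="true"/>
  </analyzer>
</fieldType>
```

Note that this indexes every adjacent word combination up to the maximum 
size, so for "Eastern Federal Lands" the facet list would include "Eastern 
Federal", "Federal Lands" and "Eastern Federal Lands" alongside the single 
words. Expect a larger index and a noisier facet list than with the 
copyField approach.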

--- On Wed, 10/27/10, Adam Estrada <estrada.a...@gmail.com> wrote:

> From: Adam Estrada <estrada.a...@gmail.com>
> Subject: Multiple Word Facets
> To: solr-user@lucene.apache.org
> Date: Wednesday, October 27, 2010, 4:43 AM
> All,
> I am new to Solr faceting and stuck on how to get multiple-word
> facets returned from a standard Solr query. See below for what is
> currently being returned.
> 
> <lst name="facet_counts">
> <lst name="facet_queries"/>
> <lst name="facet_fields">
> <lst name="title">
> <int name="Federal">89</int>
> <int name="EFLHD">87</int>
> <int name="Eastern">87</int>
> <int name="Lands">87</int>
> <int name="Highways">84</int>
> <int name="FHWA">60</int>
> <int name="Transportation">32</int>
> <int name="GIS">22</int>
> <int name="Planning">19</int>
> <int name="Asset">15</int>
> <int name="Environment">15</int>
> <int name="Management">14</int>
> <int name="Realty">12</int>
> <int name="Highway">11</int>
> <int name="HEP">10</int>
> <int name="Program">9</int>
> <int name="HEPGIS">7</int>
> <int name="Resources">7</int>
> <int name="Roads">7</int>
> <int name="EEI">6</int>
> <int name="Environmental">6</int>
> <int name="Right">6</int>
> <int name="Way">6</int>
> ...etc...
> 
> There are many terms in there that are 2 or 3 word phrases. For
> example, Eastern Federal Lands Highway Division all gets broken down
> into the individual words that make up the total group of words. I've
> seen quite a few websites that do what it is I am trying to do here so
> any suggestions at this point would be great. See my schema below
> (copied from the example schema).
> 
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="false"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="stopwords.txt"
>             enablePositionIncrements="true"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1"
>             catenateWords="0" catenateNumbers="0"
>             catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> 
> Similar for type="query". Please advise on how to group or cluster
> document terms so that they can be used as facets.
> 
> Many thanks in advance,
> Adam Estrada
>
