Yes. I am using 1.3. When is 1.4 due for release?
Yonik Seeley-2 wrote:
>
> Are you using Solr 1.3?
> You might want to try the latest 1.4 test build - faceting has changed a
> lot.
>
> -Yonik
> http://www.lucidimagination.com
>
> On Thu, Jun 4, 2009 at 12:01 PM, Yao Ge <yao...@gmail.com> wrote:
>>
>> I am indexing a database with over 1 million rows. Two of the fields
>> contain unstructured text, but the size of each field is limited
>> (256 characters).
>>
>> I came up with the idea of visualizing the text fields as a text cloud
>> by turning the two text fields into facets. The font weight and size of
>> each facet value (word) are derived from the facet counts. I used a
>> simpler field type so that there is no stemming of these facet values:
>>
>> <fieldType name="word" class="solr.TextField" positionIncrementGap="100">
>>   <analyzer>
>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>             ignoreCase="true" expand="false"/>
>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>             words="stopwords.txt"/>
>>     <filter class="solr.WordDelimiterFilterFactory"
>>             generateWordParts="0" generateNumberParts="0" catenateWords="1"
>>             catenateNumbers="1" catenateAll="0"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>   </analyzer>
>> </fieldType>
>>
>> The facet query is considerably slower than other facets on structured
>> database fields (with highly repeated values). What I found interesting
>> is that even after I constrained the search results to just a few
>> hundred hits using other facets, these text facets are still very slow.
>>
>> I understand that text fields are not good candidates for faceting, as
>> they can contain a very large number of unique values. However, why is
>> it still slow after my matching documents are reduced to hundreds?
>> Is it because the whole filter is cached (regardless of the matching
>> docs) and my filter cache is not large enough to fit the whole list?
>>
>> The following is my filterCache setting:
>>
>> <filterCache class="solr.LRUCache" size="5120" initialSize="512"
>>              autowarmCount="128"/>
>>
>> Lastly, what I really want is to give users a chance to visualize and
>> filter on the top relevant words in the free-text fields. Are there
>> alternatives to the facet field approach? Term vectors? I can do
>> client-side processing based on the top N (say 100) hits for this, but
>> it is my last option.
>> --
>> View this message in context:
>> http://www.nabble.com/Faceting-on-text-fields-tp23872891p23872891.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>

--
View this message in context: http://www.nabble.com/Faceting-on-text-fields-tp23872891p23876051.html
Sent from the Solr - User mailing list archive at Nabble.com.