Hi Tomás,

Thank you very much for your suggestion. I took another crack at it using your recommendation and it worked perfectly. The only thing I had to change was

  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory" />
  </analyzer>

to

  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory" />
  </analyzer>

The first did not produce any results but the second worked beautifully.

Thanks!

Brian Lamb

2011/5/31 Tomás Fernández Löbbe <tomasflo...@gmail.com>

> ...or also use the LowerCaseTokenizerFactory at query time for consistency,
> but not the edge ngram filter.
>
> 2011/5/31 Tomás Fernández Löbbe <tomasflo...@gmail.com>
>
> > Hi Brian, I don't know if I understand what you are trying to achieve. You
> > want the term query "abcdefg" to have an idf of 1 instead of 7? I think using
> > the KeywordTokenizerFactory at query time should work. It would be
> > something like:
> >
> > <fieldType name="edgengram" class="solr.TextField" positionIncrementGap="1000">
> >   <analyzer type="index">
> >     <tokenizer class="solr.LowerCaseTokenizerFactory" />
> >     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" side="front" />
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.KeywordTokenizerFactory" />
> >   </analyzer>
> > </fieldType>
> >
> > This way, at query time "abcdefg" won't be turned into "a ab abc abcd abcde
> > abcdef abcdefg". At index time it will.
> >
> > Regards,
> > Tomás
> >
> >
> > On Tue, May 31, 2011 at 1:07 PM, Brian Lamb <brian.l...@journalexperts.com> wrote:
> >
> >> <fieldType name="edgengram" class="solr.TextField" positionIncrementGap="1000">
> >>   <analyzer>
> >>     <tokenizer class="solr.LowerCaseTokenizerFactory" />
> >>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" side="front" />
> >>   </analyzer>
> >> </fieldType>
> >>
> >> I believe I used that link when I initially set up the field and it worked
> >> great (and I'm still using it in other places). In this particular example,
> >> however, it does not appear to be practical for me.
> >> I mentioned that I have a
> >> similarity class that returns 1 for the idf and in the case of an edgengram,
> >> it returns 1 * length of the search string.
> >>
> >> Thanks,
> >>
> >> Brian Lamb
> >>
> >> On Tue, May 31, 2011 at 11:34 AM, bmdakshinamur...@gmail.com <
> >> bmdakshinamur...@gmail.com> wrote:
> >>
> >> > Can you specify the analyzer you are using for your queries?
> >> >
> >> > Maybe you could use a KeywordAnalyzer for your queries so you don't end up
> >> > matching parts of your query.
> >> >
> >> > http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
> >> > This should help you.
> >> >
> >> > On Tue, May 31, 2011 at 8:24 PM, Brian Lamb
> >> > <brian.l...@journalexperts.com> wrote:
> >> >
> >> > > In this particular case, I will be doing a solr search based on user
> >> > > preferences. So I will not be depending on the user to type "abcdefg". That
> >> > > will be automatically generated based on user selections.
> >> > >
> >> > > The contents of the field do not contain spaces and since I am creating the
> >> > > search parameters, case isn't important either.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Brian Lamb
> >> > >
> >> > > On Tue, May 31, 2011 at 9:44 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> >> > >
> >> > > > That'll work for your case, although be aware that string types aren't
> >> > > > analyzed at all,
> >> > > > so case matters, as do spaces etc.....
> >> > > >
> >> > > > What is the use-case here? If you explain it a bit there might be
> >> > > > better answers....
> >> > > >
> >> > > > Best
> >> > > > Erick
> >> > > >
> >> > > > On Fri, May 27, 2011 at 9:17 AM, Brian Lamb
> >> > > > <brian.l...@journalexperts.com> wrote:
> >> > > > > For this, I ended up just changing it to string and using "abcdefg*" to
> >> > > > > match. That seems to work so far.
> >> > > > >
> >> > > > > Thanks,
> >> > > > >
> >> > > > > Brian Lamb
> >> > > > >
> >> > > > > On Wed, May 25, 2011 at 4:53 PM, Brian Lamb
> >> > > > > <brian.l...@journalexperts.com> wrote:
> >> > > > >
> >> > > > >> Hi all,
> >> > > > >>
> >> > > > >> I'm running into some confusion with the way edgengram works. I have the
> >> > > > >> field set up as:
> >> > > > >>
> >> > > > >> <fieldType name="edgengram" class="solr.TextField" positionIncrementGap="1000">
> >> > > > >>   <analyzer>
> >> > > > >>     <tokenizer class="solr.LowerCaseTokenizerFactory" />
> >> > > > >>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="100" side="front" />
> >> > > > >>   </analyzer>
> >> > > > >> </fieldType>
> >> > > > >>
> >> > > > >> I've also set up my own similarity class that returns 1 as the idf score.
> >> > > > >> What I've found this does is if I match a string "abcdefg" against a field
> >> > > > >> containing "abcdefghijklmnop", then the idf will score that as a 7:
> >> > > > >>
> >> > > > >> 7.0 = idf(myfield: a=51 ab=23 abc=2 abcd=2 abcde=2 abcdef=2 abcdefg=2)
> >> > > > >>
> >> > > > >> I get why that's happening, but is there a way to avoid that? Do I need to
> >> > > > >> do a new field type to achieve the desired effect?
> >> > > > >>
> >> > > > >> Thanks,
> >> > > > >>
> >> > > > >> Brian Lamb
> >> > > > >>
> >> >
> >> > --
> >> > Thanks and Regards,
> >> > DakshinaMurthy BM
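
For anyone reading this thread later, here is a rough Python sketch of why the idf came out as 7 with the original field type and as 1 once the edge ngram filter was dropped from the query analyzer. It is not Solr code: `lowercase_tokenize`, `edge_ngrams`, `analyze` and `summed_idf` are hypothetical helpers that only approximate what LowerCaseTokenizerFactory and EdgeNGramFilterFactory (side="front", minGramSize=1, maxGramSize=25) do, and the constant idf of 1 per term stands in for the custom similarity class described above.

```python
import re


def lowercase_tokenize(text):
    """Approximation of LowerCaseTokenizerFactory: split on non-letters, lowercase."""
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


def edge_ngrams(token, min_gram=1, max_gram=25):
    """Front-side edge n-grams of one token, e.g. "abc" -> ["a", "ab", "abc"]."""
    longest = min(len(token), max_gram)
    return [token[:size] for size in range(min_gram, longest + 1)]


def analyze(text, use_edge_ngrams):
    """Run the (approximated) analyzer chain and return the resulting terms."""
    terms = []
    for token in lowercase_tokenize(text):
        terms.extend(edge_ngrams(token) if use_edge_ngrams else [token])
    return terms


def summed_idf(query_terms, idf_per_term=1.0):
    """Assumed custom similarity: a constant idf per term, summed over query terms."""
    return idf_per_term * len(query_terms)


# Original field type: the same edge ngram chain runs at index AND query time.
old_query_terms = analyze("abcdefg", use_edge_ngrams=True)

# Revised field type: edge ngrams at index time only, plain lowercasing at query time.
new_query_terms = analyze("abcdefg", use_edge_ngrams=False)
indexed_terms = analyze("abcdefghijklmnop", use_edge_ngrams=True)

print(old_query_terms, summed_idf(old_query_terms))
print(new_query_terms, summed_idf(new_query_terms))
print("abcdefg" in indexed_terms)
```

With the edge ngram filter applied at query time, "abcdefg" expands to seven terms, which lines up with the seven entries in the `idf(myfield: a=51 ab=23 ...)` explain line. With only the tokenizer at query time, the single term "abcdefg" still matches because the index already contains that prefix of "abcdefghijklmnop".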