Re: Implementing phrase autopop up

2009-11-25 Thread Shalin Shekhar Mangar
On Tue, Nov 24, 2009 at 11:58 PM, darniz rnizamud...@edmunds.com wrote:



 I created a field the same way the Lucid blog describes.

 <field name="autocomp" type="edgytext" indexed="true" stored="true"
        omitNorms="true" omitTermFreqAndPositions="true"/>

 with the following field type configuration:

 <fieldType name="edgytext" class="solr.TextField"
            positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
             maxGramSize="25"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
 </fieldType>

 Now when I query, I get the correct phrases. For example, a search for
 autocomp:how to returns all the correct phrases, such as

 How to find a car
 How to find a mechanic
 How to choose the right insurance company

 etc... which is good.

 Now I have two questions.
 1) Is it necessary to put the query in quotes? My gut feeling is yes, since
 if I don't use quotes I get phrases beginning with How followed by some
 other words, like How can, etc.


Yes, since we want to do phrase searches on n-grams.
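
To make the quoting point concrete, here is a minimal Python sketch that builds a request URL with the typed text wrapped in quotes. The /select handler path and the autocomplete_url helper are assumptions for illustration (only the host, port, and the autocomp field come from this thread). As far as I understand the standard query parser, an unquoted autocomp:how to is split on whitespace, so only "how" is searched in autocomp and "to" goes to the default field, which would explain hits such as "How can".

```python
from urllib.parse import urlencode

# Assumed handler URL, modelled on the host and port used elsewhere in this thread.
SOLR_SELECT = "http://localhost:8080/solr/select"


def autocomplete_url(user_input, field="autocomp"):
    """Build a select URL that sends the typed text as one quoted phrase."""
    # Escape backslashes and double quotes so the input cannot end the phrase early.
    escaped = user_input.replace("\\", "\\\\").replace('"', '\\"')
    query = '{}:"{}"'.format(field, escaped)
    # urlencode percent-encodes the colon and quotes and turns spaces into '+'.
    return SOLR_SELECT + "?" + urlencode({"q": query})


print(autocomplete_url("how to"))
```

Building the query string with urlencode also avoids hand-written URLs with a missing or mangled parameter separator.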



 2) If I search for a single word, for example choose, I get nothing.
 I was expecting to see a result, considering the word choose appears in the
 phrase
 How to choose the right insurance company

 I will look further at the documentation, but do you have any advice?


EdgeNGramFilterFactory creates n-grams from the starting or the ending edge,
so you can't match words in the middle of a phrase. Try using
NGramFilterFactory instead.
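
The difference can be seen in a small Python sketch. It is a simplified model of the two filters, not Solr code: edge_ngrams and all_ngrams are hypothetical helpers, and the gram sizes mirror the minGramSize=1 / maxGramSize=25 settings from the config quoted above.

```python
def edge_ngrams(text, min_gram=1, max_gram=25):
    """Front-edge n-grams: prefixes of the whole token, min_gram to max_gram chars."""
    longest = min(max_gram, len(text))
    return [text[:n] for n in range(min_gram, longest + 1)]


def all_ngrams(text, min_gram=1, max_gram=25):
    """Every substring whose length is between min_gram and max_gram."""
    grams = set()
    for size in range(min_gram, min(max_gram, len(text)) + 1):
        for start in range(len(text) - size + 1):
            grams.add(text[start:start + size])
    return grams


# KeywordTokenizer keeps the whole title as one token; LowerCaseFilter lowercases it.
title = "How to choose the right insurance company".lower()

edge = set(edge_ngrams(title))
full = all_ngrams(title)

print("how to" in edge)  # True: the query is a prefix of the title
print("choose" in edge)  # False: "choose" is not a prefix of the title
print("choose" in full)  # True: plain n-grams also cover the middle of the phrase
```

In this model the plain n-gram set is much larger than the edge n-gram set, so expect a bigger index if you switch filters. Also, only the first 25 characters of a title are covered by prefixes here, so a longer typed prefix would not match if the real filter behaves the same way.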

-- 
Regards,
Shalin Shekhar Mangar.


Re: Implementing phrase autopop up

2009-11-24 Thread darniz

Thanks for your input.
You made a valid point: if we use a field of type text for
autocomplete, it won't work because the value goes through a tokenizer.
So it looks like for my use case I need a separate field that uses n-grams
and is filled by copying the title into it. Here is what I did.

I created a field the same way the Lucid blog describes.

<field name="autocomp" type="edgytext" indexed="true" stored="true"
       omitNorms="true" omitTermFreqAndPositions="true"/>

with the following field type configuration:

<fieldType name="edgytext" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
            maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Now when I query, I get the correct phrases. For example, a search for
autocomp:how to returns all the correct phrases, such as

How to find a car
How to find a mechanic
How to choose the right insurance company

etc., which is good.

Now I have two questions.
1) Is it necessary to put the query in quotes? My gut feeling is yes, since
if I don't use quotes I get phrases beginning with How followed by some
other words, like How can, etc.

2) If I search for a single word, for example choose, I get nothing.
I was expecting to see a result, considering the word choose appears in the
phrase
How to choose the right insurance company

I will look further at the documentation, but do you have any advice?

darniz










-- 
View this message in context: 
http://old.nabble.com/Implementing-phrase-autopop-up-tp26490419p26499912.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Implementing phrase autopop up

2009-11-24 Thread darniz

Can anybody tell me whether it is possible to display a phrase when a word
within that phrase is matched?

darniz





Implementing phrase autopop up

2009-11-23 Thread darniz

Hello all,
Let me first explain the task I am trying to do.
I have articles with titles, for example:
<doc>
  <str name="title">Car Insurance for Teenage Drivers</str>
</doc>
<doc>
  <str name="title">A Total Loss?</str>
</doc>
If a user begins to type car insu, I want the autopop to show the
entire phrase.
There are two ways to implement this.
The first is to use the TermsComponent, and the other is to use a field whose
field type uses the solr.EdgeNGramFilterFactory filter.

I started with the TermsComponent: I declared a terms request
handler and ran the following query

http://localhost:8080/solr/terms?terms.fl=title&terms.prefix=car
The issue is that it does not return the entire phrase; it gives me back
results like car, caravan, carbon. Now, I know that using terms.prefix will
only give me results where the sentence starts with car. On top of this, I
also want a title to show up in the autopop when a word like car appears
somewhere in the middle of it, much like Google, where the word does not
necessarily have to be at the beginning but can appear anywhere in the
title.
The question is whether the TermsComponent is a good candidate, or whether I
should use a custom field, say autoPopupText, whose field type is configured
with all the filters including EdgeNGramFilterFactory, copy the title to the
autoPopupText field, and use that to power the autopopup.

The other consideration is that EdgeNGramFilterFactory works at index
time: when you index documents you need to know which fields you want
to copy to the autoPopupText field, whereas with the TermsComponent
you can decide at query time which fields to fetch
autocomplete suggestions from.

Any idea which is best, and why the TermsComponent is not giving me the
entire phrase, as I mentioned earlier?
FYI,
my title field is of type text.
Thanks
darniz




Re: Implementing phrase autopop up

2009-11-23 Thread Shalin Shekhar Mangar

You are using a tokenized field type with TermsComponent, so each word
in your phrase gets indexed as a separate token. You should use a
non-tokenized type (such as a string type) with TermsComponent. However,
this will only let you search by prefix and not by words in the middle of
the phrase.
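
As a rough illustration of the above, the Python sketch below models what a prefix lookup sees for a tokenized field versus a string field. It is a simplification, not the TermsComponent itself: terms_with_prefix is a hypothetical helper, and the "tokenized" analysis is reduced to whitespace splitting plus lowercasing (a real text type would typically do more, such as stop word removal or stemming).

```python
def terms_with_prefix(indexed_terms, prefix):
    """Return the indexed terms that start with prefix, in sorted order."""
    return sorted(t for t in indexed_terms if t.startswith(prefix))


titles = ["Car Insurance for Teenage Drivers", "A Total Loss?"]

# Tokenized field (simplified): each word becomes its own lowercased term.
tokenized_terms = {w.strip("?").lower() for t in titles for w in t.split()}

# String field: the whole title is indexed as a single term, case preserved.
string_terms = set(titles)

print(terms_with_prefix(tokenized_terms, "car"))     # ['car']
print(terms_with_prefix(string_terms, "Car"))        # ['Car Insurance for Teenage Drivers']
print(terms_with_prefix(string_terms, "Insurance"))  # []
```

The last line shows the limitation mentioned above: with one term per title, a prefix lookup cannot reach a word in the middle of the phrase. The string field in this model is also case sensitive, so "car" would not match "Car Insurance for Teenage Drivers".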

Your best bet here would be to use EdgeNGramFilterFactory. If your index is
very large, you can consider doing a prefix search on shingles too.
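
For the shingle idea, here is a minimal Python sketch, assuming shingles of up to three words. The shingles and prefix_matches helpers are hypothetical and only model the concept of a prefix search over word n-grams; in Solr the shingles would come from a shingle filter in the analyzer chain.

```python
def shingles(text, max_size=3):
    """Lowercased word n-grams (shingles) of 1 to max_size words."""
    words = text.lower().split()
    return {
        " ".join(words[i:i + size])
        for size in range(1, max_size + 1)
        for i in range(len(words) - size + 1)
    }


def prefix_matches(terms, prefix):
    """Return the shingles that start with the typed prefix, in sorted order."""
    return sorted(t for t in terms if t.startswith(prefix.lower()))


terms = shingles("Car Insurance for Teenage Drivers")

print(prefix_matches(terms, "car insu"))  # ['car insurance', 'car insurance for']
print(prefix_matches(terms, "teen"))      # ['teenage', 'teenage drivers']
```

Because every word starts at least one shingle, a prefix search can begin at any word boundary in the title, without indexing every character n-gram. The matched term is a shingle rather than the full title, so the complete phrase would still have to come from the stored title of the matching document.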

-- 
Regards,
Shalin Shekhar Mangar.