Re: distinct on my result

2010-03-11 Thread stocki

okay.
we have a lot of products and i just imported the name of each product into a
core.
i put an EdgeNGram filter on it and my autoCOMPLETION works.

but i want an auto-suggestion:

example:

autoCompletion -- I: harry -- O: harry potter ...
but when the input is -- I: potter -- O: / (nothing)

so what i want is to get harry potter ... when i type potter
into my search field!

any idea?

i think the solution is a mix of TermsComponent and EdgeNGram, or not?

i am getting a little desperate, and there is too much information about
it in this forum =(



-- 
View this message in context: 
http://old.nabble.com/distinct-on-my-result-tp27849951p27864088.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: distinct on my result

2010-03-11 Thread stocki

hey,

okay, i'll show you my settings ;)
i use an extra core with the standard request handler.


SCHEMA.XML
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="name" type="text" indexed="true" stored="true" required="true"/>
<field name="suggest" type="autocomplete" indexed="true" stored="true" multiValued="true"/>
<copyField source="name" dest="suggest"/>

so i copy my names to the field suggest and use the EdgeNGramFilter and some
others 

<fieldType name="autocomplete" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" maxGramSize="100" minGramSize="1"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2" protected="protwords.txt"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" maxGramSize="100" minGramSize="1"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2" protected="protwords.txt"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldType>


so with this config i get the results above ...

maybe i have too many filters ;) ?!







Re: distinct on my result

2010-03-11 Thread gwk

Hi,

Try replacing KeywordTokenizerFactory with a WhitespaceTokenizerFactory 
so it'll create separate terms per word. After a reindex it should work.
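
To see why the tokenizer matters, here is a rough Python sketch (not Solr code) of what the index analyzer roughly does with a product name. A keyword tokenizer keeps the whole name as one token, so the edge n-grams only cover prefixes of the full string; a whitespace tokenizer produces one token per word, so each word gets its own prefixes. The function names are made up for illustration, and the other filters in the chain are ignored.

```python
def edge_ngrams(token, min_size=1, max_size=100):
    """Return the leading-edge n-grams (prefixes) of a single token."""
    upper = min(len(token), max_size)
    return [token[:n] for n in range(min_size, upper + 1)]


def index_terms(name, tokenizer):
    """Lowercase the name, tokenize it, and expand each token to edge n-grams."""
    text = name.lower()
    if tokenizer == "keyword":
        tokens = [text]          # the whole input stays one token
    elif tokenizer == "whitespace":
        tokens = text.split()    # one token per word
    else:
        raise ValueError("unknown tokenizer: " + tokenizer)
    terms = set()
    for token in tokens:
        terms.update(edge_ngrams(token))
    return terms


keyword_terms = index_terms("Harry Potter", "keyword")
whitespace_terms = index_terms("Harry Potter", "whitespace")

print("pot" in keyword_terms)       # False: "pot" is not a prefix of "harry potter"
print("pot" in whitespace_terms)    # True: "pot" is a prefix of the token "potter"
print("harry p" in keyword_terms)   # True: prefix of the full name
```

Note the trade-off in this sketch: with per-word tokens, a prefix that spans two words such as "harry p" is no longer a single indexed term.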


Regards,

gwk


distinct on my result

2010-03-10 Thread stocki

hello.

i implemented my suggest function with the EdgeNGramFilter.
now when i get my results, they are not distinct. often the same name
appears twice or more.

is it possible for solr to give me only distinct results?

 "response":{"numFound":172,"start":0,"docs":[
   {"name":"Halloween"},
   {"name":"Hallo Taxi"},
   {"name":"Halloween"},
   {"name":"Hallstatt"},
   {"name":"Hallo Mary"},
   {"name":"Halloween"},
   {"name":"Halloween"},
   {"name":"Halloween"},
   {"name":"Halleluja"},
   {"name":"Halloween"}]}

so how can i get rid of the duplicate Halloween entries in solr?
i don't want to remove them on the client side.

thx






Re: distinct on my result

2010-03-10 Thread gwk

Hi,

I ran into the same issue, and what I did (at 
http://www.mysecondhome.co.uk/) was to create a separate core just for 
autosuggest. It is fully rebuilt once an hour and contains the distinct 
values of the items I want to search for, including a count for each so I 
can display the approximate number of results in the suggest dropdown. 
This might not be a good solution when your data is updated frequently, 
but for us it has worked very well so far. Maybe you can also use 
clustering so you won't have to create a separate core, but I think 
my solution performs better (although I haven't tested it, so I could be 
horribly, horribly wrong).
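
A minimal Python sketch of the idea, assuming the names have already been fetched from the main index: collapse them to distinct values and keep a count for each, so every suggestion is stored exactly once. The sample names come from the response posted in this thread; the `id` and `count` field names are assumptions, not part of any real schema here.

```python
from collections import Counter


def build_suggest_docs(names):
    """Collapse a list of names into one document per distinct name, with its count."""
    counts = Counter(names)
    # most_common() sorts by count, highest first
    return [
        {"id": name, "name": name, "count": count}
        for name, count in counts.most_common()
    ]


names = [
    "Halloween", "Hallo Taxi", "Halloween", "Hallstatt", "Hallo Mary",
    "Halloween", "Halloween", "Halloween", "Halleluja", "Halloween",
]

suggest_docs = build_suggest_docs(names)
for doc in suggest_docs:
    print(doc["name"], doc["count"])
```

Each resulting dictionary would then be added as one document to the separate autosuggest core, so a query against that core cannot return the same name twice.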


Regards,

gwk





Re: distinct on my result

2010-03-10 Thread stocki

hey.

okay. thx

my suggestions run in another core ;)

do you do the distinct (deduplication) during the import with DIH?







Re: distinct on my result

2010-03-10 Thread gwk

Hi,

The autosuggest core is filled by a simple script (written in PHP) which 
requests facet values for all the possible strings one can search for and 
adds them one by one as documents. Our case has some special issues 
because we search in multiple languages (typing España will 
suggest Spain, and the other way around, when on the Spanish site). We 
have about 97500 documents yielding approximately 12500 distinct 
documents in our autosuggest core, and the autosuggest-update script 
takes about 5 minutes to do a full re-index (all this is done on a 
separate server and replicated, so the indexing has no impact on the 
performance of the site).
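
The PHP script itself is not shown here. As a rough Python sketch of the same step, the function below takes a facet response that has already been parsed from JSON and turns each facet value into one document for the autosuggest core. It assumes Solr's flat facet layout in JSON (value, count, value, count, ...); the field name `city`, the document fields, and the sample data are all hypothetical.

```python
def facet_to_docs(response, field):
    """Turn the flat [value, count, value, count, ...] facet list into documents."""
    flat = response["facet_counts"]["facet_fields"][field]
    values = flat[0::2]   # even positions hold the facet values
    counts = flat[1::2]   # odd positions hold the matching counts
    return [
        {"id": f"{field}:{value}", "name": value, "count": count}
        for value, count in zip(values, counts)
        if count > 0      # skip values that match no documents
    ]


# Made-up sample of a parsed facet response, for illustration only.
sample_response = {
    "facet_counts": {
        "facet_fields": {
            "city": ["Madrid", 120, "Malaga", 45, "Marbella", 0],
        }
    }
}

facet_docs = facet_to_docs(sample_response, "city")
for doc in facet_docs:
    print(doc)
```

Because facet values are distinct by construction, the documents built this way are already deduplicated before they are added to the autosuggest core.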


Regards,

gwk
