Re: Search Suggestion Filtering

2014-01-21 Thread Areek Zillur

Regarding LUCENE-5350, the context is the filter, i.e. the context is
prepended to every entry (suggestion) that belongs to that context. So when
a user looks up the entry foo in the context of bar, the actual lookup is
bar(ctx_seperator)foo. This filters out entries that match foo in another
context. For details on the use cases, see the description of the JIRA
issue.
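
A minimal sketch of the context-as-prefix idea described above, in plain Python rather than Lucene. The separator value, the function names, and the sorted-list storage are assumptions made for illustration; they are not the LUCENE-5350 implementation.

```python
from bisect import bisect_left

# Hypothetical separator; the actual one used by LUCENE-5350 is not given in this thread.
CTX_SEP = "\x1f"

def build_entries(suggestions):
    """Turn (context, text) pairs into a sorted list of context-prefixed keys."""
    return sorted(f"{ctx}{CTX_SEP}{text}" for ctx, text in suggestions)

def lookup(entries, context, prefix):
    """Return the suggestions stored under `context` that start with `prefix`."""
    key = f"{context}{CTX_SEP}{prefix}"
    start = bisect_left(entries, key)
    results = []
    for entry in entries[start:]:
        # Matching keys are contiguous in sorted order, so stop at the first miss.
        if not entry.startswith(key):
            break
        # Strip the context and separator before returning the suggestion.
        results.append(entry.split(CTX_SEP, 1)[1])
    return results

entries = build_entries([
    ("bar", "foo"),
    ("bar", "food"),
    ("baz", "foo"),
    ("bar", "fig"),
])
print(lookup(entries, "bar", "foo"))
```

Because the context is part of the key, the "foo" stored under "baz" is never reached when looking up in the context of "bar".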

Regarding SOLR-5378, the final documentation is unfortunately unavailable
as of now (will try to fix that ASAP). For now, you can look at the
description of SOLR-5378 (outdated response format) along with the
description of SOLR-5529 (updated response format). I have linked the
related JIRAs in SOLR-5378.

Hope that helps,
Areek



Re: Search Suggestion Filtering

2014-01-20 Thread Alessandro Benedetti

Hi guys, following this thread I have some questions:

1) Regarding LUCENE-5350, what is the context mentioned there? Is the
context a filter query?

2) Regarding https://issues.apache.org/jira/browse/SOLR-5378, do we have
the final documentation available?

Cheers



Re: Search Suggestion Filtering

2014-01-16 Thread Jorge Luis Betancourt González

In a custom application of ours, we use a separate core (under Solr 3.6.1) to
store the queries entered by users and then provide the autocomplete feature.
In our case we need to filter out some phrases that we don't want suggested
to users. I built a custom UpdateRequestProcessor to implement this logic,
and we define the blocking patterns in an external source (DB, files, etc.).
For the suggestions per se, we use the
https://github.com/cominvent/autocomplete configuration as a base, described in
www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/, which is
pretty usable as it comes. Personally, I found this approach far more flexible
than the original suggester component, but it involves storing the users'
queries in a separate core.
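
The filtering step can be sketched roughly as follows. This is plain Python for illustration, not the Java UpdateRequestProcessor mentioned above; the patterns and function names are hypothetical, and in the setup described the patterns would be loaded from an external source.

```python
import re

# Hypothetical blocking patterns; the real ones would come from a DB, files, etc.
BLOCKING_PATTERNS = [r"\bpassword\b", r"^\s*$", r"\d{6,}"]

def compile_patterns(patterns):
    """Compile the patterns once so each query check is cheap."""
    return [re.compile(p, re.IGNORECASE) for p in patterns]

def should_store(query, compiled):
    """Return True if the user query may be stored as a future suggestion."""
    return not any(p.search(query) for p in compiled)

compiled = compile_patterns(BLOCKING_PATTERNS)
queries = ["solr suggester", "my password reset", "   ", "order 12345678"]
kept = [q for q in queries if should_store(q, compiled)]
print(kept)
```

Only queries that match none of the patterns would be sent on to the core that backs the autocomplete.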

Greetings,


Re: Search Suggestion Filtering

2014-01-16 Thread Hamish Campbell

Thank you Jorge. We looked at phrase suggestions from previous user
queries, but they're not so useful in our case. However, I have a follow-up
question about similar functionality that I'll post shortly.

The list might like to know that I've come up with a quick and exceedingly
dirty <strike>hack</strike> solution that works for our limited case.

You have been warned!

Note that we're using django-haystack to actually interact with Solr:

1. Set nonFuzzyPrefix of the Suggester to 4.
2. At index time, the haystack index builds suggestion terms by
extracting the relevant terms and prefixing each with a 4-character
(alphabetic) reference for the target instance.
3. At search time, the user's query is split, and the terms are prefixed and
concatenated. The new query is sent to Solr, and the references are stripped
from the results before they are returned to the front end.
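
The steps above can be sketched roughly like this. It is a standalone illustration, not the actual django-haystack code; the way a tenant id is turned into a 4-letter reference and all function names are assumptions.

```python
import re

PREFIX_LEN = 4  # same length as the nonFuzzyPrefix setting in step 1

def tenant_ref(tenant_id):
    """Hypothetical mapping from a numeric tenant id to a 4-letter reference."""
    if not 0 <= tenant_id < 26 ** PREFIX_LEN:
        raise ValueError("tenant id does not fit in the prefix")
    letters = []
    n = tenant_id
    for _ in range(PREFIX_LEN):
        n, rem = divmod(n, 26)
        letters.append(chr(ord("a") + rem))
    return "".join(reversed(letters))

def index_terms(tenant_id, text):
    """Step 2: extract terms and prefix each with the tenant reference."""
    ref = tenant_ref(tenant_id)
    return [ref + term for term in re.findall(r"\w+", text.lower())]

def build_query(tenant_id, user_query):
    """Step 3a: split the user's query, prefix each term, and re-join."""
    ref = tenant_ref(tenant_id)
    return " ".join(ref + term for term in user_query.lower().split())

def clean_suggestion(tenant_id, suggestion):
    """Step 3b: strip the tenant reference from each returned word."""
    ref = tenant_ref(tenant_id)
    return " ".join(
        word[PREFIX_LEN:] if word.startswith(ref) else word
        for word in suggestion.split()
    )

print(index_terms(1, "Road Map"))
print(build_query(1, "road ma"))
print(clean_suggestion(1, "aaabroad aaabmap"))
```

Since the first four characters are excluded from fuzzy matching, a suggestion for one tenant should not match a term that carries another tenant's reference.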

I'm not proud of it, but it works. =D




Search Suggestion Filtering

2014-01-15 Thread Hamish Campbell

Hi all,

I'm looking into options for filtering the search suggestions dictionary.

We're using Solr 4.6.0 with the Suggester component and
fst.FuzzyLookupFactory on a field-based dictionary, and we're indexing
records for a multi-tenanted SaaS platform. SearchHandler records are always
filtered by the particular client warehouse (e.g. by domain); however, we
need a way to apply a similar filter to the spell check dictionary to prevent
leaking terms between clients. In other words: when client A searches for a
document title, they should not receive spelling suggestions for client B's
document titles.

This has been asked a couple of times, on the mailing list and on
StackOverflow. Some of the suggested approaches:

1. Use dynamic fields to create dictionaries per-warehouse (mentioned here:
http://lucene.472066.n3.nabble.com/Filtering-down-terms-in-suggest-tt4069627.html)

That might be a reasonable option for us (we already considered a similar
approach), but at what point does this stop scaling efficiently? How many
dynamic fields are too many?

2. Run a query to populate the suggestion list (also mentioned in that
thread)

If I understand this correctly, this would give us a lot of flexibility and
power: for example, it could give a more nuanced result set, using the user's
permissions to expose private documents in their spelling suggestions.

I expect this would be a slow query, but our total document count is
currently relatively small (on the order of 10^3 objects) and I imagine you
could create a specific word index with the appropriate fields to keep this
in check. Is this a feasible approach, and if so, how do you build a
dynamic suggestion list?

3. Other options:

It seems like this is a common problem, and we could throw some
resources at building an extension to provide some limited suggestion
dictionary filtering. Is anyone already doing something similar, has anyone
found a clever hack around this, or can anyone suggest a starting point?
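
For option 2 (running a query to populate the suggestion list), one commonly used approach is a faceted query restricted by a filter query. The sketch below only builds the request parameters. The field names `warehouse` and `title_words` and the host URL are hypothetical, and the thread does not confirm that this is the approach meant in the linked discussion.

```python
from urllib.parse import urlencode

def suggestion_query_params(prefix, warehouse, limit=10):
    """Build Solr parameters that list indexed terms starting with `prefix`,
    restricted to documents of a single client warehouse."""
    return {
        "q": "*:*",
        "fq": f'warehouse:"{warehouse}"',  # hypothetical tenant field
        "rows": 0,                         # only the facet counts are needed
        "facet": "true",
        "facet.field": "title_words",      # hypothetical tokenized title field
        "facet.prefix": prefix.lower(),
        "facet.mincount": 1,
        "facet.limit": limit,
        "wt": "json",
    }

params = suggestion_query_params("roa", "client-a")
# Hypothetical host and handler path.
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

Because the filter query is applied before faceting, terms that occur only in another client's documents drop out through `facet.mincount=1`. The warehouse value would need escaping if it can contain quotes.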

Thanks everyone!

-- 
Hamish Campbell
Koordinates Ltd http://koordinates.com/?_bzhc=esig
PH   +64 9 966 0433
FAX +64 9 966 0045

Re: Search Suggestion Filtering

2014-01-15 Thread Tomás Fernández Löbbe

I think your use case is the one described in LUCENE-5350; you may want
to take a look at the patch and comments there.

Tomás



Re: Search Suggestion Filtering

2014-01-15 Thread Hamish Campbell

Thanks Tomás, I'll take a look.

Still interested to hear from anyone about using queries to populate the
list - I'm willing to give up a bit of performance for the flexibility it
would provide.



Re: Search Suggestion Filtering

2014-01-15 Thread Areek Zillur

Hey Hamish,

You might want to check out LUCENE-5402. I added support for
index-time pruning for suggesters that consume from the index itself.
I plan to add this support to file-based suggesters as well.
Using this functionality from Solr requires more changes. I am planning
to support it in the new SuggesterComponent (SOLR-5378) in Solr.

Hope that helps!

Areek


