Re: Solr with many indexes

2011-08-02 Thread Vikram Kumar
We have a multi-tenant Solr deployment with a core for each user.

Because of the limits we are hitting with the number of cores and with
lazy-loading (and the associated warm-up times), we are looking into
consolidating several users into one core, with queries restricted by a
user-id field.

My question is about autosuggest.

1. Are there ways we can limit the autosuggest to only documents with
matching ids?

2. What other Solr operations like this need further consideration when
merging multiple indices and limiting by a field?

-- Vikram



Re: Solr with many indexes

2011-08-02 Thread Otis Gospodnetic
Hello,


From: Vikram Kumar vikrambku...@gmail.com

We have a multi-tenant Solr deployment with a core for each user.

Because of the limits we are hitting with the number of cores and with
lazy-loading (and the associated warm-up times), we are looking into
consolidating several users into one core, with queries restricted by a
user-id field.

My question is about autosuggest.

1. Are there ways we can limit the autosuggest to only documents with
matching ids?


Not sure about Solr's Suggester, but yes this and more is doable with 
Sematext's Autocomplete: http://sematext.com/products/autocomplete/index.html
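
One approach that is sometimes used for per-user suggestions, offered here
only as a sketch and not as something proposed in this thread, is to facet on
a term field with facet.prefix while restricting the document set with fq.
The field names user_id and suggest_terms are hypothetical, and lowercasing
the prefix assumes the field is lowercased at index time.

```python
from urllib.parse import urlencode


def suggest_params(prefix, user_id, field="suggest_terms", limit=10):
    """Build query parameters for facet-based prefix suggestions for one user.

    `user_id` and `suggest_terms` are hypothetical field names.
    """
    # Quote the id so special characters cannot change the filter's meaning.
    escaped = user_id.replace("\\", "\\\\").replace('"', '\\"')
    params = [
        ("q", "*:*"),                      # match everything ...
        ("rows", "0"),                     # ... but return no documents
        ("fq", f'user_id:"{escaped}"'),    # only this user's documents are counted
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix.lower()),  # assumes lowercased indexed terms
        ("facet.limit", str(limit)),
        ("facet.mincount", "1"),           # drop terms the user does not have
    ]
    return urlencode(params)


query_string = suggest_params("So", "42")
```

Because the facet counts are computed over the filtered document set, terms
that occur only in other users' documents have a count of zero and are
dropped by facet.mincount=1.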

2. What other Solr operations like this need further consideration when
merging multiple indices and limiting by a field?


Spellchecking is the first thing that comes to mind.  Not sure what else...

Otis




 


Re: Solr with many indexes

2011-01-22 Thread Erick Erickson
See below.

On Wed, Jan 19, 2011 at 7:26 PM, Joscha Feth jos...@feth.com wrote:

 Hello Erick,

 Thanks for your answer!

 But I question why you *require* many different indexes. [...] including
 isolating one user's data from all others, [...]


 Yes, that's exactly what I am after - I need to make sure that indexes don't
 mix, as every user shall only be able to query his own data (index).


Well, this can also be handled by simply appending the equivalent of
+user:theuser to each query. This solution does have some interesting side
effects, though. In particular, if you autosuggest based on the combined
documents, users will see terms that are NOT in documents they own.
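
A minimal sketch of that restriction, assuming the application builds the
request on the server side and the user never controls the filter. It uses
Solr's fq (filter query) parameter rather than literally appending
+user:theuser to q; that substitution is an assumption of this example, and
the helper name is hypothetical.

```python
from urllib.parse import urlencode


def tenant_query_params(user_query, user_id):
    """Build a query string that only matches documents owned by `user_id`."""
    # Quote the id so whitespace or query syntax in it cannot alter the filter.
    escaped = user_id.replace("\\", "\\\\").replace('"', '\\"')
    params = [
        ("q", user_query),            # what the user typed
        ("fq", f'user:"{escaped}"'),  # restriction added by the application
    ]
    return urlencode(params)


query_string = tenant_query_params("solr cores", "theuser")
```

Keeping the restriction in a separate parameter means the user's own query
text cannot remove it, and Solr can cache the filter independently of q.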



 And even using lots of cores can be made to work if you don't pre-warm
 newly-opened cores, assuming that the response time when using cold
 searchers is adequate.
 

 Could you explain that further or point me to some documentation? Are you
 talking about http://wiki.apache.org/solr/CoreAdmin#UNLOAD? If yes, LOAD
 does not seem to be implemented yet. Or does this only have something to do
 with http://wiki.apache.org/solr/SolrCaching#autowarmCount? Roughly what
 delay per X documents are we talking about if autowarming is disabled?
 Is there more documentation about this setting?


It's the autowarming settings (autowarmCount on the caches). When you open a
core, the first few queries that run on it will pay some penalty for filling
caches etc. If your cores are small enough, this penalty may not be
noticeable to your users, in which case you can just not bother autowarming
(see firstSearcher, newSearcher). You might also be able to get away with
very small caches; it mostly depends on your usage patterns. If your pattern
is that a user signs on, makes one search and signs off, there may not be
much good in having large caches. On the other hand, if users sign on and
search continually for hours, their experience may be enhanced by having
significant caches. It all depends.
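
The trade-off between cache size and usage pattern can be illustrated with a
toy model. This is not Solr's cache implementation: LRUCache and hit_rate are
hypothetical names, and the "search work" is a stand-in function.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache that counts hits and misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, compute):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.entries[key]
        self.misses += 1                   # cold entry: pay the full cost
        value = compute(key)
        if self.capacity > 0:
            self.entries[key] = value
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
        return value


def hit_rate(queries, capacity):
    """Fraction of queries answered from the cache, starting cold."""
    cache = LRUCache(capacity)
    for query in queries:
        cache.get(query, lambda key: key.upper())  # stand-in for real search work
    return cache.hits / len(queries)


one_shot = ["invoice"]
long_session = ["invoice", "report"] * 3

one_shot_rate = hit_rate(one_shot, 512)       # a large cache never gets used
session_rate = hit_rate(long_session, 512)    # repeated queries become hits
tiny_cache_rate = hit_rate(long_session, 1)   # entries are evicted before reuse
```

In this model the single-search session gets no benefit from any cache size,
while the long session benefits only when the cache is large enough to hold
the queries it repeats.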

Hope that helps
Erick


 Kind regards,
 Joscha



Solr with many indexes

2011-01-19 Thread Joscha Feth
Hello Solrs,

I am looking into using Solr, but my intended usage would require having
many different indexes which are not connected (e.g. some form of index
tenancy with one or multiple indexes per user).
I understand that creating independent indexes in Solr happens by creating
Solr cores via CoreAdmin.
I came across this document: http://wiki.apache.org/solr/LotsOfCores which
basically tells me that having many indexes is not an intended use for Solr.
Is this also true for SolrCloud (http://wiki.apache.org/solr/SolrCloud)?
If yes, roughly what upper limit on the number of indexes are we talking
about here? Tens? Hundreds? Thousands?
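
For reference, creating a core through CoreAdmin is an HTTP request with
action=CREATE. The sketch below only builds the URL; the base URL, core name
and instance directory are hypothetical values, and the helper name is not
part of Solr.

```python
from urllib.parse import urlencode


def create_core_url(base_url, core_name, instance_dir):
    """Build a CoreAdmin CREATE request URL for one per-user core."""
    params = urlencode(
        {
            "action": "CREATE",
            "name": core_name,            # name the core is addressed by
            "instanceDir": instance_dir,  # directory holding conf/ and data/
        }
    )
    return f"{base_url}/admin/cores?{params}"


url = create_core_url("http://localhost:8983/solr", "user_42", "user_42")
```

With one core per user, a request like this would be issued once for every
new user, which is where the question about upper limits comes from.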

Thank you very much!
Regards,
Joscha Feth


Re: Solr with many indexes

2011-01-19 Thread Erick Erickson
Solr will handle lots of cores, but that page is talking about lots.
Thousands.

But I question why you *require* many different indexes. It's perfectly
reasonable to store different fields in different documents in the *same*
index, unlike a table in an RDBMS.

There are good reasons to have separate cores, including isolating one
user's data from all others, so I'm not saying that you should necessarily
put everything into a single core.

And even using lots of cores can be made to work if you don't pre-warm
newly-opened cores, assuming that the response time when using cold
searchers is adequate.

Best
Erick




Re: Solr with many indexes

2011-01-19 Thread Joscha Feth
Hello Erick,

Thanks for your answer!

 But I question why you *require* many different indexes. [...] including
 isolating one user's data from all others, [...]


Yes, that's exactly what I am after - I need to make sure that indexes don't
mix, as every user shall only be able to query his own data (index).

 And even using lots of cores can be made to work if you don't pre-warm
 newly-opened cores, assuming that the response time when using cold
 searchers is adequate.


Could you explain that further or point me to some documentation? Are you
talking about http://wiki.apache.org/solr/CoreAdmin#UNLOAD? If yes, LOAD
does not seem to be implemented yet. Or does this only have something to do
with http://wiki.apache.org/solr/SolrCaching#autowarmCount? Roughly what
delay per X documents are we talking about if autowarming is disabled?
Is there more documentation about this setting?

Kind regards,
Joscha