Re: Solr with many indexes
We have a multi-tenant Solr deployment with one core per user. Because of the limits we are hitting on the number of cores and on lazy loading (with its associated warm-up times), we are looking into consolidating several users into one core, with queries restricted by a user-id field. My question is about autosuggest.

1. Are there ways to limit autosuggest to only the documents with a matching user id?
2. What other Solr operations like this need further consideration when merging multiple indexes and limiting by a field?

-- Vikram

On Sat, Jan 22, 2011 at 4:02 PM, Erick Erickson erickerick...@gmail.com wrote:

> See below.
>
> On Wed, Jan 19, 2011 at 7:26 PM, Joscha Feth jos...@feth.com wrote:
>
> > Hello Erick,
> >
> > Thanks for your answer!
> >
> > > But I question why you *require* many different indexes. [...] including isolating one user's data from all others, [...]
> >
> > Yes, that's exactly what I am after - I need to make sure that indexes don't mix, as every user shall only be able to query his own data (index).
>
> Well, this can also be handled by simply appending the equivalent of +user:theuser to each query. This solution does have some interesting side effects, though. In particular, if you autosuggest based on the combined documents, users will see terms that are NOT in documents they own.
>
> > > And even using lots of cores can be made to work if you don't pre-warm newly-opened cores, assuming that the response time when using cold searchers is adequate.
> >
> > Could you explain that further or point me to some documentation? Are you talking about: http://wiki.apache.org/solr/CoreAdmin#UNLOAD? If yes, LOAD does not seem to be implemented yet. Or does this have something to do with http://wiki.apache.org/solr/SolrCaching#autowarmCount only? Roughly how much delay per X documents are we talking about if autowarming is disabled? Is there more documentation about this setting?
>
> It's the autowarmCount setting. When you open a core, the first few queries that run on it pay some penalty for filling caches etc. If your cores are small enough, this penalty may not be noticeable to your users, in which case you can just not bother autowarming (see firstSearcher, newSearcher). You might also be able to get away with very small caches; it mostly depends on your usage patterns. If your pattern is that a user signs on, makes one search, and signs off, there may not be much good in having large caches. On the other hand, if users sign on and search continually for hours, their experience may be enhanced by having significant caches. It all depends.
>
> Hope that helps
> Erick
>
> > Kind regards,
> > Joscha

--
- Vikram
Re: Solr with many indexes
Hello,

From: Vikram Kumar vikrambku...@gmail.com

> We have a multi-tenant Solr deployment with a core for each user. Due to the limitations we are facing with the number of cores and lazy loading (and the associated warm-up times), we are researching consolidating several users into one core, with queries limited by a user-id field. My question is about autosuggest.
>
> 1. Are there ways we can limit the autosuggest to only documents with matching ids?

Not sure about Solr's Suggester, but yes, this and more is doable with Sematext's Autocomplete: http://sematext.com/products/autocomplete/index.html

> 2. What other Solr operations like these need further consideration when merging multiple indices and limiting by a field?

Spellchecking is the first thing that comes to mind. Not sure what else...

Otis
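One possible way to keep suggestions per user inside a shared core, not proposed anywhere in this thread, is to build them from facet counts rather than from a dictionary built over the whole index: facet values are computed only over the documents that pass the filter query. The sketch below only assembles the request parameters. The field names `suggest_text` and `user`, and the assumption that the suggestion field is lowercased at index time, are hypothetical; `fq`, `facet.field`, `facet.prefix`, `facet.mincount` and `facet.limit` are standard Solr request parameters.

```python
def build_suggest_params(prefix, user_id,
                         text_field="suggest_text",  # hypothetical field name
                         user_field="user"):         # hypothetical field name
    """Build Solr parameters for a facet-based, per-user autosuggest request.

    Assumes user ids are plain alphanumeric strings (no query-syntax
    escaping is done here) and that text_field is lowercased at index time.
    """
    return {
        "q": "*:*",                          # match everything ...
        "fq": f"{user_field}:{user_id}",     # ... then keep only this user's documents
        "rows": 0,                           # no documents needed, only facet counts
        "facet": "true",
        "facet.field": text_field,
        "facet.prefix": prefix.lower(),      # terms starting with what the user typed
        "facet.mincount": 1,                 # drop terms absent from this user's documents
        "facet.limit": 10,                   # at most ten suggestions
    }


params = build_suggest_params("Sol", "theuser")
```

Because the facet counts are taken after the filter is applied, `facet.mincount=1` is what keeps terms that occur only in other users' documents out of the list. Spellchecking, which Otis mentions, would need its own per-user handling.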
Re: Solr with many indexes
See below.

On Wed, Jan 19, 2011 at 7:26 PM, Joscha Feth jos...@feth.com wrote:

> Hello Erick,
>
> Thanks for your answer!
>
> > But I question why you *require* many different indexes. [...] including isolating one user's data from all others, [...]
>
> Yes, that's exactly what I am after - I need to make sure that indexes don't mix, as every user shall only be able to query his own data (index).

Well, this can also be handled by simply appending the equivalent of +user:theuser to each query. This solution does have some interesting side effects, though. In particular, if you autosuggest based on the combined documents, users will see terms that are NOT in documents they own.

> > And even using lots of cores can be made to work if you don't pre-warm newly-opened cores, assuming that the response time when using cold searchers is adequate.
>
> Could you explain that further or point me to some documentation? Are you talking about: http://wiki.apache.org/solr/CoreAdmin#UNLOAD? If yes, LOAD does not seem to be implemented yet. Or does this have something to do with http://wiki.apache.org/solr/SolrCaching#autowarmCount only? Roughly how much delay per X documents are we talking about if autowarming is disabled? Is there more documentation about this setting?

It's the autowarmCount setting. When you open a core, the first few queries that run on it pay some penalty for filling caches etc. If your cores are small enough, this penalty may not be noticeable to your users, in which case you can just not bother autowarming (see firstSearcher, newSearcher). You might also be able to get away with very small caches; it mostly depends on your usage patterns. If your pattern is that a user signs on, makes one search, and signs off, there may not be much good in having large caches. On the other hand, if users sign on and search continually for hours, their experience may be enhanced by having significant caches. It all depends.

Hope that helps
Erick

> Kind regards,
> Joscha
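As a rough illustration of "appending the equivalent of +user:theuser to each query", the sketch below adds the restriction as a filter query (`fq`), which is one way to express it without touching the user's own query string. The host, the `/solr/select` path and the field name `user` are assumptions for the example, not details given in the thread.

```python
from urllib.parse import urlencode

SOLR_SELECT_URL = "http://localhost:8983/solr/select"  # assumed host and handler path
USER_FIELD = "user"                                    # assumed name of the owner field


def escape_term(value):
    """Backslash-escape characters that are special in Lucene/Solr query syntax."""
    special = set('+-&|!(){}[]^"~*?:\\/ ')
    return "".join("\\" + ch if ch in special else ch for ch in value)


def build_user_query(user_query, user_id):
    """Return a select URL whose results are restricted to one user's documents."""
    params = [
        ("q", user_query),                                # whatever the user typed
        ("fq", f"{USER_FIELD}:{escape_term(user_id)}"),   # server-side restriction
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)


url = build_user_query("solr", "theuser")
```

The restriction has to be added by the application on the server side; if clients could send their own `fq`, they could also leave it out. As Erick notes, this filters search results only, so features computed over the whole index (such as autosuggest) still see every user's terms.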
Solr with many indexes
Hello Solrs,

I am looking into using Solr, but my intended usage would require having many different indexes which are not connected (e.g. some form of index tenancy, with one or multiple indexes per user).

I understand that independent indexes in Solr are created as Solr cores via CoreAdmin. I came across this document: http://wiki.apache.org/solr/LotsOfCores which basically tells me that having many indexes is not an intended use for Solr. Is this also true for SolrCloud (http://wiki.apache.org/solr/SolrCloud)? If yes, roughly what upper limit on the number of indexes are we talking about? Tens? Hundreds? Thousands?

Thank you very much!

Regards,
Joscha Feth
Re: Solr with many indexes
Solr will handle lots of cores, but that page is talking about *lots*: thousands.

But I question why you *require* many different indexes. It's perfectly reasonable to store different fields in different documents in the *same* index, unlike a table in an RDBMS.

There are good reasons to have separate cores, including isolating one user's data from all others, so I'm not saying that you should necessarily put everything into a single core.

And even using lots of cores can be made to work if you don't pre-warm newly-opened cores, assuming that the response time when using cold searchers is adequate.

Best
Erick

On Wed, Jan 19, 2011 at 7:41 AM, Joscha Feth jos...@feth.com wrote:

> Hello Solrs,
>
> I am looking into using Solr, but my intended usage would require having many different indexes which are not connected (e.g. some form of index tenancy, with one or multiple indexes per user).
>
> I understand that independent indexes in Solr are created as Solr cores via CoreAdmin. I came across this document: http://wiki.apache.org/solr/LotsOfCores which basically tells me that having many indexes is not an intended use for Solr. Is this also true for SolrCloud (http://wiki.apache.org/solr/SolrCloud)? If yes, roughly what upper limit on the number of indexes are we talking about? Tens? Hundreds? Thousands?
>
> Thank you very much!
>
> Regards,
> Joscha Feth
Re: Solr with many indexes
Hello Erick,

Thanks for your answer!

> But I question why you *require* many different indexes. [...] including isolating one user's data from all others, [...]

Yes, that's exactly what I am after - I need to make sure that indexes don't mix, as every user shall only be able to query his own data (index).

> And even using lots of cores can be made to work if you don't pre-warm newly-opened cores, assuming that the response time when using cold searchers is adequate.

Could you explain that further or point me to some documentation? Are you talking about: http://wiki.apache.org/solr/CoreAdmin#UNLOAD? If yes, LOAD does not seem to be implemented yet. Or does this have something to do with http://wiki.apache.org/solr/SolrCaching#autowarmCount only? Roughly how much delay per X documents are we talking about if autowarming is disabled? Is there more documentation about this setting?

Kind regards,
Joscha
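For reference, the CoreAdmin handler discussed here is driven by plain HTTP requests with an `action` parameter. The sketch below builds the URLs for the `UNLOAD` action Joscha links to and for `CREATE`, which registers a core at runtime. The host, the `/solr/admin/cores` path and the core name `user42` are assumptions for the example; whether unloading and re-creating cores per user is fast enough would have to be measured.

```python
from urllib.parse import urlencode

CORE_ADMIN_URL = "http://localhost:8983/solr/admin/cores"  # assumed host and admin path


def create_core_url(name, instance_dir):
    """URL asking CoreAdmin to register a core from an existing instance directory."""
    query = urlencode({"action": "CREATE", "name": name, "instanceDir": instance_dir})
    return CORE_ADMIN_URL + "?" + query


def unload_core_url(name):
    """URL asking CoreAdmin to remove a core from the running Solr instance."""
    query = urlencode({"action": "UNLOAD", "core": name})
    return CORE_ADMIN_URL + "?" + query


# "user42" is a hypothetical per-user core name.
create_url = create_core_url("user42", "user42")
unload_url = unload_core_url("user42")
```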