Thanks a lot, everybody, for the responses. I am going to do some
practical/empirical testing and will report back.
matt

--- On Wed, 1/27/10, Tom Hill <solr-l...@worldware.com> wrote:

From: Tom Hill <solr-l...@worldware.com>
Subject: Re: Multiple Cores Vs. Single Core for the following use case
To: solr-user@lucene.apache.org
Date: Wednesday, January 27, 2010, 2:47 PM

Hi -

I'd probably go with a single core on this one, just for ease of operations.

But here are some thoughts:

One advantage I can see to multiple cores, though, would be better idf
calculations. With individual cores, each user only sees the idf for his own
documents. With a single core, the idf will be across all documents. In
theory, better relevance.
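
To make the idf point concrete, here is a minimal sketch. It assumes the classic Lucene similarity formula, idf = 1 + ln(numDocs / (docFreq + 1)); the document counts are made-up numbers for illustration only, not measurements from any real index.

```python
import math


def classic_idf(doc_freq: int, num_docs: int) -> float:
    """idf in the style of Lucene's classic similarity: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical user whose own documents mention "paris" very often:
# 500 of this user's 1,000 documents contain the term.
user_core_idf = classic_idf(doc_freq=500, num_docs=1_000)

# Hypothetical shared core: 1,000 of 1,000,000 documents contain the term.
shared_core_idf = classic_idf(doc_freq=1_000, num_docs=1_000_000)

print(f"per-user core idf: {user_core_idf:.2f}")
print(f"shared core idf:   {shared_core_idf:.2f}")
```

In the shared core the term looks rare and is weighted heavily, even though it is common within this one user's documents. That is the relevance difference described above.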

Multi-core will use more RAM to start with, though, and I would expect it to
use more disk (a term dictionary per core). Filters would add to the memory
footprint of the multiple core setup.

However, if you only end up sorting/faceting on some of the cores, your
memory use with multiple cores may actually be less. With multiple cores,
each field cache only covers one user's docs. With single core, you have one
field cache entry per doc in the whole corpus. Depending on usage patterns,
index sizes, etc, this could be a significant amount of memory.
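
A rough back-of-the-envelope sketch of that field cache argument follows. It assumes about 4 bytes per document for each sorted field (one int or ordinal per doc) and ignores the term values themselves; the user and document counts are hypothetical.

```python
BYTES_PER_DOC = 4  # assumption: one int/ordinal per document per sorted field


def field_cache_bytes(num_docs: int, num_fields: int = 1) -> int:
    """Very rough field cache size: one fixed-size entry per document per field."""
    return num_docs * num_fields * BYTES_PER_DOC


# Hypothetical deployment: 10,000 users with 10,000 documents each.
users = 10_000
docs_per_user = 10_000

# Single core: the cache covers every document in the corpus.
single_core_bytes = field_cache_bytes(users * docs_per_user)

# Multiple cores: suppose only 50 users ever sort or facet.
active_users = 50
multi_core_bytes = active_users * field_cache_bytes(docs_per_user)

print(f"single core: {single_core_bytes / 1e6:.0f} MB")
print(f"multi core:  {multi_core_bytes / 1e6:.0f} MB")
```

With these made-up numbers, the single core pays for all 100 million documents as soon as anyone sorts, while the per-user cores pay only for the users who actually sort.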

Tom


On Wed, Jan 27, 2010 at 11:38 AM, Amit Nithian <anith...@gmail.com> wrote:

> It sounds to me like multiple cores won't scale. Wouldn't you have to
> create a separate configuration for each core? And does the ranking function
> change per user?
>
> I would imagine that the filter method would work better. The caching is
> there and, as mentioned earlier, would be fast for multiple searches. If you
> have searches for the same user, then add that to your warming queries list
> so that on server startup, the cache will be warm for certain users that you
> know tend to do a lot of searches. This can be known empirically or by log
> mining.
>
> I haven't used multiple cores, but I suspect that having that many
> configuration files parsed and loaded in memory can't be good for memory
> usage compared with filter caching.
>
> Just my 2 cents
> Amit
>
> On Wed, Jan 27, 2010 at 8:58 AM, Matthieu Labour
> <matthieu_lab...@yahoo.com> wrote:
>
> > Thanks Didier for your response.
> > And in your opinion, this should be as fast as if I called getCore(userId)
> > -- provided that the core is already open -- and then searched for "Paris"?
> > matt
> >
> > --- On Wed, 1/27/10, didier deshommes <dfdes...@gmail.com> wrote:
> >
> > From: didier deshommes <dfdes...@gmail.com>
> > Subject: Re: Multiple Cores Vs. Single Core for the following use case
> > To: solr-user@lucene.apache.org
> > Date: Wednesday, January 27, 2010, 10:52 AM
> >
> > On Wed, Jan 27, 2010 at 9:48 AM, Matthieu Labour
> > <matthieu_lab...@yahoo.com> wrote:
> > > What I am trying to understand is the search/filter algorithm. If I have
> > > 1 core with all documents and I search for "Paris" for userId="123", is
> > > Lucene going to first search for all Paris documents and then apply a filter
> > > on the userId? If this is the case, then I am better off having a specific
> > > index for user "123" because this will be faster.
> >
> > If you want to apply the filter to userid first, use filter queries
> > (http://wiki.apache.org/solr/CommonQueryParameters#fq). This will
> > filter by userid first then search for "Paris".
> >
> > didier
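
A minimal sketch of such a request is shown here. It assumes a Solr instance at http://localhost:8983/solr and a field named userId as in the question; both the host and the helper name build_select_url are illustrative, not taken from any particular setup.

```python
from urllib.parse import urlencode


def build_select_url(base_url: str, text: str, user_id: str) -> str:
    """Build a Solr select URL with the user restriction in fq rather than in q."""
    params = [
        ("q", text),                    # the scored full-text query
        ("fq", f"userId:{user_id}"),    # filter query: restricts results without affecting scores
    ]
    return f"{base_url}/select?{urlencode(params)}"


url = build_select_url("http://localhost:8983/solr", "Paris", "123")
print(url)
# http://localhost:8983/solr/select?q=Paris&fq=userId%3A123
```

Keeping the user restriction in fq instead of adding it to q lets Solr cache the per-user document set in its filter cache and reuse it across that user's searches.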
> >
> > >
> > > --- On Wed, 1/27/10, Marc Sturlese <marc.sturl...@gmail.com> wrote:
> > >
> > > From: Marc Sturlese <marc.sturl...@gmail.com>
> > > Subject: Re: Multiple Cores Vs. Single Core for the following use case
> > > To: solr-user@lucene.apache.org
> > > Date: Wednesday, January 27, 2010, 2:22 AM
> > >
> > >
> > > In case you are going to use a core per user, take a look at this patch:
> > > http://wiki.apache.org/solr/LotsOfCores
> > >
> > > Trey-13 wrote:
> > >>
> > >> Hi Matt,
> > >>
> > >> In most cases you are going to be better off going with the userid method
> > >> unless you have a very small number of users and a very large number of
> > >> docs/user. The userid method will likely be much easier to manage, as you
> > >> won't have to spin up a new core every time you add a new user. I would
> > >> start here and see if the performance is good enough for your requirements
> > >> before you start worrying about it not being efficient.
> > >>
> > >> That being said, I really don't have any idea what your data looks like.
> > >> How many users do you have? How many documents per user? Are any
> > >> documents shared by multiple users?
> > >>
> > >> -Trey
> > >>
> > >>
> > >>
> > >> On Tue, Jan 26, 2010 at 7:27 PM, Matthieu Labour
> > >> <matthieu_lab...@yahoo.com> wrote:
> > >>
> > >>> Hi
> > >>>
> > >>> Shall I set up multiple cores or a single core for the following use case:
> > >>>
> > >>> I have X number of users.
> > >>>
> > >>> When I do a search, I always know for which user I am doing a search.
> > >>>
> > >>> Shall I set up X cores, 1 for each user? Or shall I set up 1 core and
> > >>> add a userId field to each document?
> > >>>
> > >>> If I choose the 1 core solution then I am concerned with performance.
> > >>> Let's say I search for "New York" ... If Lucene returns all "New York"
> > >>> matches for all users and then filters based on the userId, then this
> > >>> is going to be less efficient than if I have sharded per user and send
> > >>> the request for "New York" to the user's core.
> > >>>
> > >>> Thank you for your help
> > >>>
> > >>> matt
> > >>>
> > >>
> > >>
> > >



      
