Hi Mark, Darren

Thanks very much for your help. I will try one collection per customer, then.
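
For reference, a minimal sketch of how the per-customer CoreAdmin CREATE calls could be generated. The host, port, and the `action`, `name` and `collection` parameters come from the URL quoted further down in this thread. The `customer<N>` naming scheme, and reusing that name for both the core and the collection, are assumptions made only for illustration.

```python
from urllib.parse import urlencode

# Host and port as used elsewhere in this thread.
SOLR_BASE = "http://localhost:8983/solr"


def create_core_url(customer_id, base=SOLR_BASE):
    """Build a CoreAdmin CREATE URL that places one customer's core
    in a collection of its own.

    The "customer<N>" name is a hypothetical naming scheme; any unique
    name per customer would do.
    """
    name = f"customer{customer_id}"
    params = {"action": "CREATE", "name": name, "collection": name}
    return f"{base}/admin/cores?{urlencode(params)}"


if __name__ == "__main__":
    # Print the URLs for a few example customers; nothing is sent to Solr.
    for cid in (1, 2, 3):
        print(create_core_url(cid))
```

The script only prints the URLs, so they can be reviewed before being issued against a running node.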

Regards,
Yandong


2012/5/22 Mark Miller <markrmil...@gmail.com>

> I think the key is this: you want to think of a SolrCore on a single-node
> Solr installation as a collection on a multi-node SolrCloud installation.
>
> So if you would use multiple SolrCores with a standard Solr setup, you should
> be using multiple collections in SolrCloud. If you were going to try to do
> everything in one SolrCore, that would be like putting everything in one
> collection in SolrCloud. I don't think it generally makes sense to try to
> work at the SolrCore level when working with SolrCloud. This will be made
> clearer once we add a simple collections API.
>
> So I think your choice should be similar to using a single node: do you
> want to put everything in one 'collection' and use a filter to separate
> customers (with all its caveats and limitations), or do you want to use a
> collection per customer? You can always start up more clusters if you reach
> any limits.
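
The two options described above can be sketched as query URLs. This is only an illustration: the `customer_id` field and the `customer<N>` collection names are hypothetical and do not appear in this thread, while `collection1`, the host and port, and the `q` and `wt` parameters are taken from the messages quoted below.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # host/port as used in this thread


def shared_collection_query(customer_id, q="*:*", collection="collection1"):
    """Option 1: one collection for everyone, narrowed by a filter query.

    Assumes every document carries a (hypothetical) customer_id field.
    """
    params = {"q": q, "fq": f"customer_id:{customer_id}", "wt": "xml"}
    return f"{SOLR_BASE}/{collection}/select?{urlencode(params)}"


def per_customer_query(customer_id, q="*:*"):
    """Option 2: one collection per customer, so no filter is needed.

    Assumes collections follow a (hypothetical) "customer<N>" naming scheme.
    """
    params = {"q": q, "wt": "xml"}
    return f"{SOLR_BASE}/customer{customer_id}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(shared_collection_query(42))
    print(per_customer_query(42))
```

With the first option, separation depends on every query carrying the right filter; with the second, it follows from which collection the request is sent to.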
>
>
>
> On May 22, 2012, at 10:08 AM, Darren Govoni wrote:
>
> > I'm curious what the SolrCloud experts say, but my suggestion is to try
> > not to over-engineer the search architecture on SolrCloud. For example,
> > what is the benefit of managing which cores are indexed and searched?
> > Having to know those details, in my mind, works against the automation in
> > SolrCloud, but maybe there's a good reason you want to do it this way.
> >
> > ------- Original Message -------
> > On 5/22/2012 07:35 AM Yandong Yao wrote:
> >
> > Hi Darren,
> >
> > Thanks very much for your reply.
> >
> > The reason I want to control core indexing/searching is that I want to
> > use one core to store one customer's data (all customers share the same
> > config): for example, customer 1 uses coreForCustomer1 and customer 2
> > uses coreForCustomer2.
> >
> > Is there any better way than using a different core for each customer?
> >
> > Another way may be to use a different collection for each customer, but
> > I am not sure how many collections SolrCloud can support. Which way is
> > better in terms of flexibility/scalability? (Suppose there are tens of
> > thousands of customers.)
> >
> > Regards,
> > Yandong
> >
> > 2012/5/22 Darren Govoni <dar...@ontrenet.com>
> >
> > > Why do you want to control what gets indexed into a core and then
> > > know which core to search? That's the kind of "knowing" that SolrCloud
> > > takes care of. SolrCloud handles the distribution of documents across
> > > shards and retrieves them regardless of which node is searched.
> > > That is the point of "cloud": you don't know the details of where
> > > exactly documents are being managed (i.e. they are cloudy). It can
> > > change and re-balance from time to time. SolrCloud performs the
> > > distributed search for you, so when you search a node/core
> > > with no documents, all the results from the "cloud" are retrieved
> > > regardless. This is considered "A Good Thing".
> > >
> > > It requires a change in thinking about indexing and searching....
> > >
> > > On Tue, 2012-05-22 at 08:43 +0800, Yandong Yao wrote:
> > > > Hi Guys,
> > > >
> > > > I use the following commands to start SolrCloud, according to the
> > > > SolrCloud wiki:
> > > >
> > > > yydzero:example bjcoe$ java -Dbootstrap_confdir=./solr/conf -Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar start.jar
> > > > yydzero:example2 bjcoe$ java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
> > > >
> > > > Then I created several cores using the CoreAdmin API
> > > > (http://localhost:8983/solr/admin/cores?action=CREATE&name=<coreName>&collection=collection1),
> > > > and clusterstate.json shows the following topology:
> > > >
> > > >
> > > > collection1:
> > > >     -- shard1:
> > > >           -- collection1
> > > >           -- CoreForCustomer1
> > > >           -- CoreForCustomer3
> > > >           -- CoreForCustomer5
> > > >     -- shard2:
> > > >           -- collection1
> > > >           -- CoreForCustomer2
> > > >           -- CoreForCustomer4
> > > >
> > > >
> > > > 1) Index:
> > > >
> > > > I use the following command to index the mem.xml file in the
> > > > exampledocs directory:
> > > >
> > > > yydzero:exampledocs bjcoe$ java -Durl=http://localhost:8983/solr/coreForCustomer3/update -jar post.jar mem.xml
> > > > SimplePostTool: version 1.4
> > > > SimplePostTool: POSTing files to http://localhost:8983/solr/coreForCustomer3/update..
> > > > SimplePostTool: POSTing file mem.xml
> > > > SimplePostTool: COMMITting Solr index changes.
> > > >
> > > > Now the Solr Admin UI shows that 'coreForCustomer1', 'coreForCustomer3'
> > > > and 'coreForCustomer5' each have 3 documents (mem.xml has 3 documents)
> > > > and the other 2 cores have 0 documents.
> > > >
> > > > *Question 1*:  Is this expected behavior? How do I index documents
> > > > into a specific core?
> > > >
> > > > *Question 2*:  If SolrCloud doesn't support this yet, how could I
> > > > extend it to support this feature (indexing documents to a particular
> > > > core)? Where should I start: the hashing algorithm?
> > > >
> > > > *Question 3*:  Why are the documents also indexed into
> > > > 'coreForCustomer1' and 'coreForCustomer5'?  The default number of
> > > > replicas for a document is 1, right?
> > > >
> > > > Then I tried to index some documents to 'coreForCustomer2':
> > > >
> > > > $ java -Durl=http://localhost:8983/solr/coreForCustomer2/update -jar post.jar ipod_video.xml
> > > >
> > > > However, 'coreForCustomer2' still has 0 documents, and the documents
> > > > in ipod_video.xml are indexed to the cores for customers 1/3/5.
> > > >
> > > > *Question 4*:  Why does this happen?
> > > >
> > > > 2) Search: I use
> > > > "http://localhost:8983/solr/coreForCustomer2/select?q=*%3A*&wt=xml"
> > > > to search against 'coreForCustomer2', but it returns all documents in
> > > > the whole collection even though this core has no documents at all.
> > > >
> > > > Then I use
> > > > "http://localhost:8983/solr/coreForCustomer2/select?q=*%3A*&wt=xml&shards=localhost:8983/solr/coreForCustomer2",
> > > > and it returns 0 documents.
> > > >
> > > > *Question 5*:  So if we want to search against a particular core, we
> > > > need to use the 'shards' parameter and use the SolrCore name as the
> > > > parameter value, right?
> > > >
> > > >
> > > > Thanks very much in advance!
> > > >
> > > > Regards,
> > > > Yandong
> > >
> >
>
> - Mark Miller
> lucidimagination.com
