On 9/17/2018 7:04 AM, KARTHICKRM wrote:
Dear SOLR Team,

We are beginners to Apache SOLR, We need following clarifications from you.

Much of what I'm going to say is a mirror of what you were already told by Jan.  All of Jan's responses are good.

1.      In SOLRCloud, How can we install more than one Shared on Single PC?

One Solr instance can run multiple indexes.  Except for one specific scenario that I hope you don't run into, you should NOT run multiple Solr instances per server.  There should only be one.  If your query rate is very low, then you can get good performance from multiple shards per node, but with a high query rate, you'll only want one shard per node.

2.      How many maximum number of shared can be added under on SOLRCloud?

There is no hard limit.  But if you create enough of them (more than a few hundred), you can end up with severe scalability problems related to SolrCloud's interaction with ZooKeeper.

3.      In my application there is no need of ACID properties, other than
this can I use SOLR as a Complete Database?

Solr is NOT a database.  Its capabilities and the optimizations it contains are all geared towards search.  If you try to use it as a database, you're going to be disappointed with it.

4.      In Which OS we can feel the better performance, Windows Server OS /
Linux?

From those two choices, I would strongly recommend Linux. If you have an open source operating system that you prefer to Linux, go with that.

5.      If a SOLR Core contains 2 Billion indexes, what is the recommended
RAM size and Java heap space for better performance?

I hope you mean 2 billion documents here, not 2 billion indexes.  Even though technically speaking there's nothing preventing SolrCloud from handling that many indexes, you'll run into scalability problems long before you reach that many.

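If you do mean documents ... don't put that many documents in one core.  A single Lucene index (which is what a Solr core is) cannot hold more than a little over 2.1 billion documents, because document numbers are a Java int.  That number includes deleted documents, which means there's a good possibility of going beyond the actual limit if you try to have 2 billion documents that haven't been deleted.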

6.      I have 20 fields per document, how many maximum number of documents
can be inserted / retrieved in a single request?

There's no limit to the number that can be retrieved.  But because the entire response must be built in memory, you can run your Solr install out of heap memory by trying to build a large response.  Streaming expressions can be used for really large results to avoid the memory issues.

As for the number of documents that can be inserted by a single request ... Solr defaults to a maximum POST body size of 2 megabytes.  This can be increased through an option in solrconfig.xml.  Unless your documents are huge, this is usually enough to send several thousand at once, which should be plenty.
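To illustrate sending documents in batches that stay under a size budget, here is a rough Python sketch.  It assumes the 2 megabyte figure above as the budget and assumes the documents are sent as a JSON array; the function name batch_by_size is made up for this example and is not part of Solr or any client library.

```python
import json
from typing import Iterable, Iterator, List


def _encoded_size(doc: dict) -> int:
    """Size in bytes of one document when serialized as compact JSON."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


def batch_by_size(docs: Iterable[dict],
                  max_bytes: int = 2 * 1024 * 1024) -> Iterator[List[dict]]:
    """Group documents into batches whose JSON array stays within max_bytes.

    A single document larger than max_bytes is still yielded, alone in its
    own batch, so the caller can decide what to do with it.
    """
    batch: List[dict] = []
    size = 2  # the enclosing "[" and "]"
    for doc in docs:
        doc_size = _encoded_size(doc) + 1  # +1 for the separating comma
        if batch and size + doc_size > max_bytes:
            yield batch
            batch, size = [], 2
        batch.append(doc)
        size += doc_size
    if batch:
        yield batch
```

Each batch would then be serialized with the same compact separators and sent as one update request.  The size estimate is deliberately a byte or so pessimistic, so a batch never exceeds the budget unless a single document does.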

7.       If I have Billions of indexes, If the "start" parameter is 10th
Million index and "end" parameter is  start+100th index, for this case any
performance issue will be raised ?

Let's say that you send a request with these parameters, and the index has three shards:

start=10000000&rows=100

Every shard in the index is going to return its top ten million plus 100 results to the coordinating node.  With three shards, that's roughly thirty million individual results.  The coordinating node will combine those results, sort them, and then request full documents for the 100 specific rows that were requested.  This takes a lot of time and a lot of memory.
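The arithmetic for that request can be written out as a small Python sketch.  The function name and the dictionary keys are invented for this illustration; only the start, rows, and shard count come from the example above.

```python
def deep_paging_cost(start: int, rows: int, num_shards: int) -> dict:
    """Count the entries involved in one distributed start/rows request."""
    # Each shard cannot know which of its results will make the final page,
    # so it has to return its own top (start + rows) entries.
    per_shard = start + rows
    return {
        "per_shard": per_shard,
        # The coordinating node merges and sorts everything it receives.
        "merged_by_coordinator": per_shard * num_shards,
        # Only the requested page is fetched as full documents.
        "full_documents_fetched": rows,
    }


cost = deep_paging_cost(start=10_000_000, rows=100, num_shards=3)
print(cost)
```

The cost grows with the start value and with the number of shards, while the number of documents actually delivered stays at 100.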

For deep paging, use cursorMark.  For large result sets, use streaming expressions.  I have used cursorMark ... its only disadvantage is that you can't jump straight to page 10000, you must go through all of the earlier pages too.  But page 10000 will be just as fast as page 1.  I have never used streaming expressions.
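A cursorMark loop looks roughly like the Python sketch below.  It relies on the documented behavior: the first request sends cursorMark=*, the sort must include the uniqueKey field, each response carries a nextCursorMark, and you are done when Solr hands back the same cursor you sent.  The helper names, the "id" uniqueKey, and the URL in the comment are assumptions for illustration, not anything specific to your setup.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator


def http_fetch(select_url: str, params: dict) -> dict:
    """Send one query to a Solr /select handler and parse the JSON reply."""
    url = select_url + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def iter_all_docs(fetch: Callable[[dict], dict], query: str = "*:*",
                  rows: int = 100, unique_key: str = "id") -> Iterator[dict]:
    """Yield every matching document, one page of `rows` at a time."""
    cursor = "*"  # the starting cursor value
    while True:
        data = fetch({
            "q": query,
            "rows": rows,
            # cursorMark requires a sort that includes the uniqueKey field.
            "sort": f"{unique_key} asc",
            "cursorMark": cursor,
            "wt": "json",
        })
        yield from data["response"]["docs"]
        next_cursor = data["nextCursorMark"]
        if next_cursor == cursor:
            break  # the cursor did not advance, so there are no more results
        cursor = next_cursor


# Hypothetical usage against a local collection named "mycollection":
#   from functools import partial
#   fetch = partial(http_fetch, "http://localhost:8983/solr/mycollection/select")
#   for doc in iter_all_docs(fetch, rows=500):
#       ...
```

Passing the fetch function in as a parameter keeps the paging logic separate from the HTTP call, so the loop can be exercised without a running Solr.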

8.      Which .net client is best for SOLR?

No idea.  The only client produced by this project is the Java client.  All other clients are third-party, including .NET clients.

9.      Is there any limitation for single field, I mean about the size for
blob data?

There are technically no limitations here.  But if your data is big enough, it begins to cause scalability problems.  It takes time to read data off the disk, for the CPU to process it, etc.

In conclusion, I have much the same thing to say as Jan said.  It sounds to me like you're not after a search engine, and that Solr might not be the right product for what you're trying to accomplish.  I'll say this again: Solr is NOT a database.

Thanks,
Shawn
