On 9/17/2018 7:04 AM, KARTHICKRM wrote:
Dear SOLR Team,
We are beginners to Apache SOLR, We need following clarifications from you.
Much of what I'm going to say is a mirror of what you were already told
by Jan. All of Jan's responses are good.
1. In SOLRCloud, How can we install more than one Shared on Single PC?
One Solr instance can run multiple indexes. Except for one specific
scenario that I hope you don't run into, you should NOT run multiple
Solr instances per server. There should only be one. If your query
rate is very low, then you can get good performance from multiple shards
per node, but with a high query rate, you'll only want one shard per node.
2. How many maximum number of shared can be added under on SOLRCloud?
There is no hard limit. But if you create enough of them (more than a
few hundred), you can end up with severe scalability problems related to
SolrCloud's interaction with ZooKeeper.
3. In my application there is no need of ACID properties, other than
this can I use SOLR as a Complete Database?
Solr is NOT a database. Its capabilities and the optimizations it
contains are all geared towards search. If you try to use it as a
database, you're going to be disappointed with it.
4. In Which OS we can feel the better performance, Windows Server OS /
Linux?
From those two choices, I would strongly recommend Linux. If you have
an open source operating system that you prefer to Linux, go with that.
5. If a SOLR Core contains 2 Billion indexes, what is the recommended
RAM size and Java heap space for better performance?
I hope you mean 2 billion documents here, not 2 billion indexes. Even
though technically speaking there's nothing preventing SolrCloud from
handling that many indexes, you'll run into scalability problems long
before you reach that many.
If you do mean documents ... don't put that many documents in one core.
A single Lucene index, which is what a Solr core is, cannot hold more
than roughly 2.147 billion documents, and that number includes deleted
documents. That means there's a good possibility of going beyond the
actual limit if you try to have 2 billion documents that haven't been
deleted.
6. I have 20 fields per document, how many maximum number of documents
can be inserted / retrieved in a single request?
There's no limit to the number that can be retrieved. But because the
entire response must be built in memory, you can run your Solr install
out of heap memory by trying to build a large response. Streaming
expressions can be used for really large results to avoid the memory issues.
As for the number of documents that can be inserted by a single request
... Solr defaults to a maximum POST body size of 2 megabytes. This can
be increased through an option in solrconfig.xml. Unless your documents
are huge, this is usually enough to send several thousand at once, which
should be plenty.
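
As a rough illustration of sending documents in batches that stay under
a body size limit, here is a minimal Python sketch. The helper name
batch_by_size and the 2 MB default are assumptions made for this
example, not anything Solr provides, and the size estimate assumes the
batch is sent as compact JSON.

```python
import json


def batch_by_size(docs, max_bytes=2 * 1024 * 1024):
    """Yield lists of docs whose compact JSON encoding fits in max_bytes.

    Hypothetical helper for illustration only. A single document larger
    than max_bytes is still yielded, alone, so the caller can decide
    what to do with it.
    """
    batch = []
    size = 2  # the enclosing "[" and "]" of the JSON array
    for doc in docs:
        encoded = len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))
        # One extra byte for the comma that separates array elements.
        extra = encoded + (1 if batch else 0)
        if batch and size + extra > max_bytes:
            yield batch
            batch, size = [], 2
            extra = encoded
        batch.append(doc)
        size += extra
    if batch:
        yield batch
```

Each yielded batch could then be posted to the update handler as one
request. A few thousand small documents per batch is usually plenty.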
7. If I have Billions of indexes, If the "start" parameter is 10th
Million index and "end" parameter is start+100th index, for this case any
performance issue will be raised ?
Let's say that you send a request with these parameters, and the index
has three shards:
start=10000000&rows=100
Every shard in the index is going to return ten million plus 100
results to the coordinating node. That's a little over thirty million
individual results. The coordinating node will combine those results,
sort them, and then request full documents for the 100 specific rows
that were requested. This takes a lot of time and a lot of memory.
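
The arithmetic can be written out as a small Python sketch. The
function name is made up for this example, and the result is an upper
bound that assumes every shard has at least start+rows matching
documents.

```python
def results_gathered_by_coordinator(start, rows, shards):
    """Upper bound on the entries the coordinating node must merge.

    Each shard returns its top (start + rows) entries, because any of
    them could end up in the requested window after merging.
    """
    return shards * (start + rows)


print(results_gathered_by_coordinator(10_000_000, 100, 3))  # 30000300
```

The cost grows with the start value, not with the number of rows
actually wanted, which is why deep pages get slow.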
For deep paging, use cursorMark. For large result sets, use streaming
expressions. I have used cursorMark ... its only disadvantage is that
you can't jump straight to page 10000, you must go through all of the
earlier pages too. But page 10000 will be just as fast as page 1. I
have never used streaming expressions.
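
A cursorMark loop looks roughly like the Python sketch below. The
fetch_page argument is a hypothetical callable that sends the given
parameters to the collection's select handler and returns the parsed
JSON response. The sort on "id" assumes that "id" is the uniqueKey
field, since a cursorMark sort must include the uniqueKey.

```python
def iterate_with_cursor(fetch_page, query="*:*", sort="id asc", rows=100):
    """Yield every matching document, one page at a time, via cursorMark.

    fetch_page is a hypothetical callable: it takes a dict of request
    parameters and returns the parsed JSON response as a dict.
    """
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        # No "start" parameter: it cannot be combined with cursorMark.
        params = {"q": query, "sort": sort, "rows": rows, "cursorMark": cursor}
        response = fetch_page(params)
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        # Solr signals the end by returning the same cursor it was given.
        if next_cursor == cursor:
            break
        cursor = next_cursor
```

Because each request only carries the position where the previous page
ended, every page costs about the same no matter how deep it is.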
8. Which .net client is best for SOLR?
No idea. The only client produced by this project is the Java client.
All other clients are third-party, including .NET clients.
9. Is there any limitation for single field, I mean about the size for
blob data?
There are technically no limitations here. But if your data is big
enough, it begins to cause scalability problems. It takes time to read
data off the disk, for the CPU to process it, etc.
In conclusion, I have much the same thing to say as Jan said. It sounds
to me like you're not after a search engine, and that Solr might not be
the right product for what you're trying to accomplish. I'll say this
again: Solr is NOT a database.
Thanks,
Shawn