Hi Shawn,

Thank you very much for your analysis. I currently don’t have multiple machines 
to play with, so I will try the "one Solr instance and one ZK instance on a 
single server" setup you suggested.

Thanks again,

Chuming



On Nov 4, 2018, at 7:56 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 11/4/2018 8:38 AM, Chuming Chen wrote:
>> I have shared a tar ball with you (apa...@elyograg.org) from Google Drive. 
>> The tar ball includes the logs directories of 4 nodes, solrconfig.xml, 
>> solr.in.sh, and a screenshot of the top command. The log files cover about 
>> 1 day, but I restarted the SolrCloud cluster several times during that period.
> 
> Runtime represented in the GC log for node1 is 23 minutes. Not anywhere near 
> a full day.
> 
> Runtime represented in the GC log for node2 is just under 16 minutes.
> 
> Runtime represented in the GC log for node3 is 434 milliseconds.
> 
> Runtime represented in the GC log for node4 is 501 milliseconds.
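> 
> In case you want to check those numbers yourself: with the stock GC logging 
> flags, each event in the log is stamped with seconds since JVM start, so the 
> stamp on the last event is roughly the runtime the log covers.  A rough 
> sketch (the solr_gc.log name and the timestamp format are the Solr/Java 8 
> defaults; adjust to your setup):
> 
>     # print the uptime stamp (seconds since JVM start) of the last GC event;
>     # assumes the default -XX:+PrintGCTimeStamps log format
>     grep -oE '[0-9]+\.[0-9]+: \[' solr_gc.log | tail -1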
> 
> This is not enough to even make a guess, much less a reasoned recommendation 
> about the heap size you will actually need.  There must be enough runtime 
> that there have been significant garbage collections so we can get a sense 
> about how much memory the application actually needs.
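> 
> Once a representative log exists, the heap is set in solr.in.sh.  A minimal 
> sketch (the 8g value is purely an illustrative placeholder, not a 
> recommendation):
> 
>     # in solr.in.sh -- 8g is an assumed placeholder, not tuned advice
>     SOLR_HEAP="8g"
>     # or, equivalently:
>     # SOLR_JAVA_MEM="-Xms8g -Xmx8g"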
> 
>> I want to make it clear: I don’t have 4 physical machines. I have one 
>> 48-core server, and all 4 Solr nodes are running on that same physical 
>> machine. Each node has 1 shard and 1 replica. I also have a ZooKeeper 
>> ensemble running on the same machine on 3 different ports.
> 
> Why?  You get absolutely no redundancy that way.  One Solr instance and one 
> ZK instance would be more efficient on a single server.  The increase in 
> efficiency probably wouldn't be significant, but it WOULD be more efficient.  
> You really can't get a sense about how separate servers will behave if all 
> the software is running on a single server.
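> 
> For what it's worth, a minimal sketch of that layout (the ports, paths, and 
> heap value are assumptions for illustration):
> 
>     # one standalone ZooKeeper instance
>     /opt/zookeeper/bin/zkServer.sh start
>     # one Solr node in cloud mode, pointed at that ZK
>     bin/solr start -c -z localhost:2181 -p 8983 -m 8g
> 
> Your collections can still have multiple shards and replicas; they would 
> simply all be hosted by that single instance.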
> 
>> I am curious to know what Solr is doing when the CPU usage is at or above 
>> 100%, because for some queries I think even just looping through all the 
>> documents without using any index might be faster.
> 
> I have no way to answer this question.  Solr will be doing whatever you asked 
> it to do.
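> 
> If you want to see for yourself, take a thread dump while the CPU is pegged.  
> A rough sketch with standard JDK tools (<solr-pid> is whatever ps reports 
> for the node):
> 
>     # per-thread CPU inside the Solr JVM; note the hottest thread ids
>     top -H -p <solr-pid>
>     # dump all stacks; a thread's nid= field is its OS thread id in hex
>     jstack -l <solr-pid> > threads.txt
>     printf '%x\n' <hot-thread-id>
> 
> The Thread Dump screen in the Solr admin UI shows the same stacks.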
> 
> The screenshot of the top output shows that all four of the nodes there are 
> using about 3GB of memory each (RES minus SHR), which is consistent with the 
> very short runtimes noted in the GC logs.  The VIRT column reveals that each 
> node has about 100GB of index data, so about 400GB of index data in total.  
> Not much can be determined when the runtime is so small.
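> 
> (In case it's useful, those columns can be pulled straight out of top in 
> batch mode; the field positions below assume an unmodified procps top 
> layout, and <pid> is a placeholder:)
> 
>     # print VIRT, RES, SHR for one Solr pid; subtract SHR from RES by eye
>     top -b -n 1 -p <pid> | awk 'NR>7 {print "VIRT="$5, "RES="$6, "SHR="$7}'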
> 
> Thanks,
> Shawn
> 
