Hi Shawn,

Thank you very much for your analysis. I currently don't have multiple machines to play with. I will try the "one Solr instance and one ZK instance would be more efficient on a single server" setup you suggested.
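For reference, a minimal sketch of the single-node setup Shawn describes, assuming a standalone ZooKeeper unpacked at /opt/zookeeper and Solr at /opt/solr (the paths and the 8g heap are placeholders; tune the heap from real GC logs):

```shell
# Start one standalone ZooKeeper instance (default clientPort 2181)
/opt/zookeeper/bin/zkServer.sh start

# Start one Solr instance in SolrCloud mode, pointed at that ZooKeeper.
# -c = cloud mode, -z = zkHost string, -m = JVM heap size
/opt/solr/bin/solr start -c -z localhost:2181 -m 8g
```

All collections, shards, and replicas then live in the one Solr process, which avoids four JVMs competing for the same CPUs and page cache.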
Thanks again,
Chuming

On Nov 4, 2018, at 7:56 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 11/4/2018 8:38 AM, Chuming Chen wrote:
>> I have shared a tar ball with you (apa...@elyograg.org) from Google Drive.
>> The tar ball includes the logs directories of 4 nodes, solrconfig.xml,
>> solr.in.sh, and a screenshot of TOP output. The log files cover about 1
>> day's logs. However, I restarted the Solr cloud several times during that
>> period.
>
> Runtime represented in the GC log for node1 is 23 minutes.  Not anywhere
> near a full day.
>
> Runtime represented in the GC log for node2 is just under 16 minutes.
>
> Runtime represented in the GC log for node3 is 434 milliseconds.
>
> Runtime represented in the GC log for node4 is 501 milliseconds.
>
> This is not enough to even make a guess, much less a reasoned recommendation
> about the heap size you will actually need.  There must be enough runtime
> that there have been significant garbage collections, so we can get a sense
> of how much memory the application actually needs.
>
>> I want to make it clear: I don't have 4 physical machines. I have a 48-core
>> server. All 4 Solr nodes are running on the same physical machine. Each
>> node has 1 shard and 1 replica. I also have a ZooKeeper ensemble running on
>> the same machine on 3 different ports.
>
> Why?  You get absolutely no redundancy that way.  One Solr instance and one
> ZK instance would be more efficient on a single server.  The increase in
> efficiency probably wouldn't be significant, but it WOULD be more efficient.
> You really can't get a sense of how separate servers will behave if all
> the software is running on a single server.
>
>> I am curious to know what Solr is doing when the CPU usage is 100% or more
>> than 100%. Because for some queries, I think even just looping through all
>> the documents without using any index might be faster.
>
> I have no way to answer this question.  Solr will be doing whatever you
> asked it to do.
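The per-node runtimes Shawn quotes come from the span of timestamps in each GC log. A hypothetical way to compute that span yourself, assuming Java 8-style GC logs with `-XX:+PrintGCDateStamps` (lines beginning with ISO-8601 date stamps) and GNU `date` (the function name is made up for illustration):

```shell
# gc_log_span LOGFILE -> prints the seconds between the first and last
# ISO-8601 date stamp in the GC log (the runtime the log represents).
gc_log_span() {
  local log=$1 first last
  # Match stamps like 2018-11-04T19:56:01.123+0000 at the start of a line
  first=$(grep -oE '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]+[+-][0-9]{4}' "$log" | head -n 1)
  last=$(grep -oE '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]+[+-][0-9]{4}' "$log" | tail -n 1)
  # GNU date converts the stamps to epoch seconds
  echo $(( $(date -d "$last" +%s) - $(date -d "$first" +%s) ))
}
```

A span of a day or more with several full garbage collections is what you would want before drawing conclusions about heap size.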
> The screenshot of the top output shows that all four of the nodes there are
> using about 3GB of memory each (RES minus SHR), which is consistent
> with the very short runtimes noted by the GC logs.  The VIRT column reveals
> that each node has about 100GB of index data, so about 400GB of total index
> data.  Not much can be determined when the runtime is so small.
>
> Thanks,
> Shawn
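Shawn's RES-minus-SHR arithmetic can be sketched as a small Linux helper. This is a hypothetical illustration (the function name and the use of `/proc/PID/statm` are assumptions, not anything from the thread), but it computes the same figure top shows:

```shell
# mem_res_minus_shr PID -> prints the process's resident memory minus its
# shared pages, in KiB -- the "RES minus SHR" figure from top, on Linux.
mem_res_minus_shr() {
  local pid=$1
  local size res shr rest
  # /proc/PID/statm fields: size resident shared text lib data dt (in pages)
  read -r size res shr rest < "/proc/$pid/statm"
  local page_kb=$(( $(getconf PAGESIZE) / 1024 ))
  echo $(( (res - shr) * page_kb ))
}

# Example: report the figure for every running java process (Solr runs in java)
for pid in $(pgrep java || true); do
  echo "PID $pid: RES-SHR = $(mem_res_minus_shr "$pid") KiB"
done
```

Subtracting SHR matters here because shared pages (largely memory-mapped index files in the page cache) would otherwise be counted against every Solr node at once.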