Hi All,

Just a quick update on this. We were able to *track down the issue using jvisualvm*, which ships with Java. We enabled monitoring through JMX, and the CPU profiling showed (as attached in one of my previous emails) *highlighting taking the most processing time.* Oddly, the time was being spent in highlighting -> merge, which is invoked when *mergecontiguous=true* is enabled. I'm still surprised that setting this single property to false resolved the issue, and we happily went live last week.
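In case it helps anyone hitting the same thing, here is a minimal sketch of the change as it would look on the request side. It assumes the property is passed as the Solr request parameter `hl.mergeContiguous`. The host, the core name `mycore`, the field `content`, and the helper `build_select_url` are hypothetical placeholders, not our actual setup. The snippet only builds the URL and does not contact Solr.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, highlight_field):
    """Build a Solr /select URL with highlighting on and fragment merging off."""
    params = {
        "q": query,
        "hl": "true",                   # enable highlighting
        "hl.fl": highlight_field,       # field to highlight (hypothetical name)
        "hl.mergeContiguous": "false",  # the setting that resolved the CPU issue for us
    }
    return base_url + "/select?" + urlencode(params)


# Hypothetical host and core name, for illustration only.
url = build_select_url("http://localhost:8983/solr/mycore", "title:solr", "content")
print(url)
```

The same parameter can also be set as a default on the request handler in solrconfig.xml, so that individual clients do not have to send it.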
Later, tracing through the code for this particular property, I found that it was causing endless recursion. Please share any other thoughts you may have.

Thanks,
Atita

On Fri, Jul 28, 2017 at 7:18 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 7/27/2017 1:30 AM, Atita Arora wrote:
> > What OS is Solr running on? I'm only asking because some additional
> > information I'm after has different gathering methods depending on OS.
> > Other questions:
> >
> > /*OpenJDK 64-Bit Server VM (25.141-b16) for linux-amd64 JRE
> > (1.8.0_141-b16), built on Jul 20 2017 21:47:59 by "mockbuild" with gcc
> > 4.4.7 20120313 (Red Hat 4.4.7-18)*/
> > /*Memory: 4k page, physical 264477520k(92198808k free), swap 0k(0k
> > free)*/
>
> Linux is the easiest to get good information from. Run the "top"
> program in a commandline session. Press shift-M to sort by memory size,
> and grab a screenshot. Share that screenshot with a file sharing site
> and give us the URL.
>
> > Is there only one Solr process per machine, or more than one?
> > /*On an average yes, one solr process per machine, however, we do
> > have a machine (where this log is taken) has two solr processes
> > (master and slave)*/
>
> Running a master and a slave on one machine does nothing for
> redundancy. They need to be on separate machines for that to really
> help. As for multiple processes per machine, you can have many indexes
> in one Solr instance -- you don't need more than one in most cases.
>
> > How many total documents are managed by one machine?
> > */About 220945 per machine (and double for this machine as it has
> > instance of master as well as other slave)/*
> >
> > How big is all the index data managed by one machine?
> > */The index is about 4G./*
>
> If less than a quarter of a million documents results in a 4GB index,
> those documents must be ENORMOUS, or else there is something strange
> going on.
>
> > What is the max heap on each Solr process?
> > */Max heap is 25G for each Solr Process. (Xms 25g Xmx 25g)/*
> >
> > The reason of choosing RAMDirectory was that it was used in the
> > similar manner while the production Solr was on Version 4.3.2, so no
> > particular reason but just replicated how it was working, never
> > thought this may give troubles.
>
> Set up the slaves just like the masters, with
> NRTCachingDirectoryFactory. For a couple hundred thousand docs, you
> probably only need a 2GB heap, possibly even less.
>
> > I had included a pastebin of GC snapshot (the complete log was too big
> > to be included in the pastebin, so pasted a sampler)
>
> I asked for the full log because that's what I need to look deeper. A
> sampler won't be enough. There are file sharing websites for sharing
> larger content, and if you compress the file before uploading it, you
> should be able to achieve a fairly impressive compression ratio.
> Dropbox is generally a good choice for sharing fairly large content.
> Dropbox also works for image data, like the "top" screenshot I asked for
> above.
>
> > Another thing is as we observed the CPU cycles yesterday in high load
> > condition we observed that the Highlighter component was taking
> > longest, is there anything in particular we forgot to include that
> > highlighting doesn't gives a performance hit.
> > Attached is the snapshot taken from jvisualvm.
>
> Attachments rarely make it through the mailing list. Yours didn't, so I
> cannot see that snapshot.
>
> I do not know anything about highlighting, so I cannot comment on how
> much CPU it takes. I've never used the feature.
>
> My best idea about why your CPU is so high is problems with garbage
> collection. To look into that, I need to have the full GC log. The
> rest of the information I've asked for will help focus my efforts.
>
> Thanks,
> Shawn