Also, any suggestions on debugging? What should I look for and how? Thanks
On Thu, Jan 23, 2014 at 10:01 AM, Software Dev <static.void....@gmail.com> wrote:

> Thanks for suggestions. After reading that document I feel even more
> confused though, because I always thought that hard commits should be
> less frequent than soft commits.
>
> Is there any way to configure autoCommit, softCommit values on a per
> request basis? The majority of the time we have a small flow of updates
> coming in and we would like to see them ASAP. However, we occasionally
> need to do some bulk indexing (once a week or less) and the need to see
> those updates right away isn't as critical.
>
> I would say 95% of the time we are in "Index-Light Query-Light/Heavy"
> mode and the other 5% is "Index-Heavy Query-Light/Heavy" mode.
>
> Thanks
>
> On Wed, Jan 22, 2014 at 5:33 PM, Erick Erickson
> <erickerick...@gmail.com> wrote:
>
>> When you're doing hard commits, is it with openSearcher = true or
>> false? It should probably be false...
>>
>> Here's a rundown of the soft/hard commit consequences:
>>
>> http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>>
>> I suspect (but, of course, can't prove) that you're over-committing
>> and hitting segment merges without meaning to...
>>
>> FWIW,
>> Erick
>>
>> On Wed, Jan 22, 2014 at 1:46 PM, Software Dev <static.void....@gmail.com>
>> wrote:
>> > A suggestion would be to hard commit much less often, i.e. every 10
>> > minutes, and see if there is a change.
>> >
>> > - Will try this
>> >
>> > How much system RAM ? JVM Heap ? Enough space in RAM for system disk
>> > cache ?
>> >
>> > - We have 18G of RAM, 12 dedicated to Solr, but as of right now the
>> > total index size is only 5GB
>> >
>> > What is the size of your documents ? A few KB, MB, ... ?
>> >
>> > - Under 1MB
>> >
>> > Ah, and what about network IO ? Could that be a limiting factor ?
>> >
>> > - Again, total index size is only 5GB so I don't know if this would
>> > be a problem
>> >
>> > On Wed, Jan 22, 2014 at 12:26 AM, Andre Bois-Crettez
>> > <andre.b...@kelkoo.com> wrote:
>> >
>> >> The 1 node having more load should be the leader (because of the
>> >> extra work of receiving and distributing updates), but in my
>> >> experience that shows up as only a bit more CPU usage, and no
>> >> difference in disk IO.
>> >>
>> >> A suggestion would be to hard commit much less often, i.e. every 10
>> >> minutes, and see if there is a change.
>> >> How much system RAM ? JVM Heap ? Enough space in RAM for system disk
>> >> cache ?
>> >> What is the size of your documents ? A few KB, MB, ... ?
>> >> Ah, and what about network IO ? Could that be a limiting factor ?
>> >>
>> >> André
>> >>
>> >> On 2014-01-21 23:40, Software Dev wrote:
>> >>
>> >>> Any other suggestions?
>> >>>
>> >>> On Mon, Jan 20, 2014 at 2:49 PM, Software Dev
>> >>> <static.void....@gmail.com> wrote:
>> >>>
>> >>>> 4.6.0
>> >>>>
>> >>>> On Mon, Jan 20, 2014 at 2:47 PM, Mark Miller
>> >>>> <markrmil...@gmail.com> wrote:
>> >>>>
>> >>>>> What version are you running?
>> >>>>>
>> >>>>> - Mark
>> >>>>>
>> >>>>> On Jan 20, 2014, at 5:43 PM, Software Dev
>> >>>>> <static.void....@gmail.com> wrote:
>> >>>>>
>> >>>>>> We also noticed that disk IO shoots up to 100% on 1 of the
>> >>>>>> nodes. Do all updates get sent to one machine or something?
>> >>>>>>
>> >>>>>> On Mon, Jan 20, 2014 at 2:42 PM, Software Dev
>> >>>>>> <static.void....@gmail.com> wrote:
>> >>>>>>
>> >>>>>>> We have a soft commit every 5 seconds and a hard commit every
>> >>>>>>> 30. As far as docs/second, I would guess around 200/sec, which
>> >>>>>>> doesn't seem that high.
>> >>>>>>>
>> >>>>>>> On Mon, Jan 20, 2014 at 2:26 PM, Erick Erickson
>> >>>>>>> <erickerick...@gmail.com> wrote:
>> >>>>>>>
>> >>>>>>>> Questions: How often do you commit your updates? What is your
>> >>>>>>>> indexing rate in docs/second?
>> >>>>>>>>
>> >>>>>>>> In a SolrCloud setup, you should be using a CloudSolrServer.
>> >>>>>>>> If the server is having trouble keeping up with updates,
>> >>>>>>>> switching to CUSS probably wouldn't help.
>> >>>>>>>>
>> >>>>>>>> So I suspect there's something not optimal about your setup
>> >>>>>>>> that's the culprit.
>> >>>>>>>>
>> >>>>>>>> Best,
>> >>>>>>>> Erick
>> >>>>>>>>
>> >>>>>>>> On Mon, Jan 20, 2014 at 4:00 PM, Software Dev
>> >>>>>>>> <static.void....@gmail.com> wrote:
>> >>>>>>>>
>> >>>>>>>>> We are testing our shiny new Solr Cloud architecture but we
>> >>>>>>>>> are experiencing some issues when doing bulk indexing.
>> >>>>>>>>>
>> >>>>>>>>> We have 5 solr cloud machines running and 3 indexing machines
>> >>>>>>>>> (separate from the cloud servers). The indexing machines pull
>> >>>>>>>>> off ids from a queue, then they index and ship over a document
>> >>>>>>>>> via a CloudSolrServer. It appears that the indexers are too
>> >>>>>>>>> fast, because the load (particularly disk io) on the solr
>> >>>>>>>>> cloud machines spikes through the roof, making the entire
>> >>>>>>>>> cluster unusable. It's kind of odd because the total index
>> >>>>>>>>> size is not even large, i.e., < 10GB. Are there any
>> >>>>>>>>> optimizations/enhancements I could try to help alleviate
>> >>>>>>>>> these problems?
>> >>>>>>>>>
>> >>>>>>>>> I should note that for the above collection we only have 1
>> >>>>>>>>> shard that's replicated across all machines, so all machines
>> >>>>>>>>> have the full index.
>> >>>>>>>>>
>> >>>>>>>>> Would we benefit from switching to a
>> >>>>>>>>> ConcurrentUpdateSolrServer where all updates get sent to 1
>> >>>>>>>>> machine and 1 machine only? We could then remove this machine
>> >>>>>>>>> from the cluster that handles user requests.
>> >>>>>>>>>
>> >>>>>>>>> Thanks for any input.
>> >>>
>> >>> --
>> >>> André Bois-Crettez
>> >>>
>> >>> Software Architect
>> >>> Search Developer
>> >>> http://www.kelkoo.com/
>> >>
>> >> Kelkoo SAS
>> >> Simplified joint-stock company (Société par Actions Simplifiée)
>> >> Share capital: € 4,168,964.30
>> >> Registered office: 8, rue du Sentier 75002 Paris
>> >> 425 093 069 RCS Paris
>> >>
>> >> This message and its attachments are confidential and intended
>> >> solely for their addressees. If you are not the intended recipient
>> >> of this message, please destroy it and notify the sender.
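
On the per-request question raised earlier in the thread: one option that does not require touching the autoCommit / autoSoftCommit settings in solrconfig.xml is Solr's commitWithin update parameter, which asks Solr to commit the documents of that one request within the given number of milliseconds. Below is a minimal Python sketch that only builds such a request and sends nothing. The host, the collection name "collection1", the helper name build_update_request and the two time windows are assumptions for illustration, not values taken from this thread.

```python
import json
from urllib.parse import urlencode


def build_update_request(base_url, collection, docs, commit_within_ms=None):
    """Build (url, headers, body) for a Solr JSON update request.

    Hypothetical helper: when commit_within_ms is given it is passed as the
    commitWithin request parameter, so the commit window is chosen per request.
    """
    params = {}
    if commit_within_ms is not None:
        params["commitWithin"] = int(commit_within_ms)
    url = "{}/{}/update".format(base_url.rstrip("/"), collection)
    if params:
        url += "?" + urlencode(params)
    headers = {"Content-Type": "application/json"}
    body = json.dumps(docs).encode("utf-8")
    return url, headers, body


# Assumed values: a local Solr node and a collection named "collection1".
SOLR = "http://localhost:8983/solr"

# Everyday trickle of updates: ask for visibility within about 5 seconds.
light_url, _, _ = build_update_request(
    SOLR, "collection1", [{"id": "doc-1"}], commit_within_ms=5000)

# Weekly bulk load: visibility can wait, so allow up to 10 minutes.
bulk_url, _, _ = build_update_request(
    SOLR, "collection1", [{"id": "doc-2"}], commit_within_ms=600000)

print(light_url)
print(bulk_url)
```

With something like this, the bulk indexers and the regular indexers could share one solrconfig.xml and differ only in the window they request. SolrJ clients appear to offer a comparable commitWithin argument when adding documents; check the javadoc for the version in use before relying on it.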