Re: anyone have any clues about this exception
If you are updating all the time, don't forceMerge at all, unless you want to put the overhead of big merges at a known time. Otherwise, leave it alone.

wunder

On Oct 12, 2012, at 3:56 PM, Erick Erickson wrote:
Re: anyone have any clues about this exception
Right. If I've multiplied right, you're essentially replacing your entire index every day given the rate you're adding documents.

Have a look at MergePolicy; here are a couple of references:
http://juanggrande.wordpress.com/2011/02/07/merge-policy-internals/
https://lucene.apache.org/core/old_versioned_docs/versions/3_2_0/api/core/org/apache/lucene/index/MergePolicy.html
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

But unless you're having problems with performance, I'd consider just optimizing once a day at off-peak hours.

FWIW,
Erick

On Fri, Oct 12, 2012 at 5:35 PM, Petersen, Robert wrote:
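Erick's "replacing your entire index every day" estimate is easy to check against the figures quoted in this thread (an index of about 15 million docs, 50-300 adds per second around the clock); a quick sketch:

```python
# Back-of-the-envelope check of Erick's remark, using the numbers
# Robi gives elsewhere in this thread.
SECONDS_PER_DAY = 24 * 60 * 60
INDEX_SIZE = 15_000_000  # approximate doc count from Robi's message

for rate in (50, 300):  # low and high end of the reported update rate
    per_day = rate * SECONDS_PER_DAY
    print(f"{rate}/sec -> {per_day:,} docs/day "
          f"({per_day / INDEX_SIZE:.1f}x the whole index)")
# 50/sec -> 4,320,000 docs/day (0.3x the whole index)
# 300/sec -> 25,920,000 docs/day (1.7x the whole index)
```

At the high end, the whole index is indeed rewritten (and then some) every day, which is why the merge policy, not optimize, is the thing to tune.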
RE: anyone have any clues about this exception
Hi Erick,

After reading the discussion you guys were having about renaming optimize to forceMerge, I realized I was guilty of over-optimizing like you were worried about! We have about 15 million docs indexed now and we spin about 50-300 adds per second 24/7, most of them being updates to existing documents whose data has changed since the last time they were indexed (which we keep track of in a DB table). There are some new documents being added in the mix, and some deletes as well.

I understand now how the merge policy caps the number of segments. I used to think they would grow unbounded and thus optimize was required. How does the large number of updates of existing documents affect the need to optimize, by causing a large number of deletes with a 're-add'? And so I suppose that means the index size tends to grow with the deleted docs hanging around in the background, as it were.

So in our situation, what frequency of optimize would you recommend? We're on 3.6.1, btw...

Thanks,
Robi

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Thursday, October 11, 2012 5:29 AM
To: solr-user@lucene.apache.org
Subject: Re: anyone have any clues about this exception
Re: anyone have any clues about this exception
Well, you'll actually be able to optimize; it's just called forceMerge.

But the point is that optimize seems like something that _of course_ you want to do, when in reality it's not something you usually should do at all. Optimize does two things:
1> merges all the segments into one (usually)
2> removes all of the info associated with deleted documents.

Of the two, point <2> is the one that really counts, and that's done whenever segment merging is done anyway. So unless you have a very large number of deletes (or updates of the same document), optimize buys you very little. You can tell this by the difference between numDocs and maxDoc in the admin page.

So what happens if you just don't bother to optimize? Take a look at merge policy to help control how merging happens, perhaps as an alternative.

Best,
Erick

On Wed, Oct 10, 2012 at 3:04 PM, Petersen, Robert wrote:
RE: anyone have any clues about this exception
You could be right. Going back in the logs, I noticed it used to happen less frequently and always towards the end of an optimize operation. It is probably my indexer timing out waiting for updates to occur during optimizes. The errors grew recently due to my upping the indexer threadcount to 22 threads, so there are a lot more timeouts occurring now. Also, our index has grown to double the old size, so the optimize operation has started taking a lot longer, also contributing to what I'm seeing. I have just changed my optimize frequency from three times a day to once a day after reading the following:

Here they are talking about completely deprecating the optimize command in the next version of Solr...
https://issues.apache.org/jira/browse/SOLR-3141

-----Original Message-----
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
Sent: Wednesday, October 10, 2012 11:10 AM
To: solr-user@lucene.apache.org
Subject: Re: anyone have any clues about this exception
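Erick's earlier suggestion to compare numDocs and maxDoc is the quick way to see whether the deleted docs from all those updates are actually piling up. A minimal sketch of scripting that check (the base URL is an assumption; the Luke handler reports the same numbers as the admin page):

```python
import json
import urllib.request

def deleted_doc_ratio(num_docs: int, max_doc: int) -> float:
    """Fraction of the index occupied by deleted docs not yet merged away."""
    return (max_doc - num_docs) / max_doc if max_doc else 0.0

def fetch_index_stats(base_url: str):
    """Read numDocs/maxDoc from Solr's Luke request handler."""
    url = base_url + "/admin/luke?numTerms=0&wt=json"
    with urllib.request.urlopen(url) as resp:
        index = json.load(resp)["index"]
    return index["numDocs"], index["maxDoc"]

# Usage against a local master (URL is an assumption):
#   nd, md = fetch_index_stats("http://localhost:8983/solr")
#   print(f"deleted-doc overhead: {deleted_doc_ratio(nd, md):.1%}")
```

If the ratio stays modest between normal merges, the thread's advice holds: the merge policy is already reclaiming the deletes and a scheduled forceMerge buys little.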
Re: anyone have any clues about this exception
Something timed out, the other end closed the connection, this end tried to write to the closed pipe and died, and then something tried to catch that exception, write its own, and died even worse? Just making it up really, but it sounds good (plus a 3-year Java tech-support hunch).

If it happens often enough, see if you can run WireShark on that machine's network interface and catch the whole network conversation in action. Often there are enough clues there by looking at TCP packets and/or the stuff transmitted. WireShark is a power tool, so it takes a little while the first time, but the learning will pay for itself over and over again.

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)

On Wed, Oct 10, 2012 at 11:31 PM, Petersen, Robert wrote:
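For what it's worth, the IllegalStateException reported in this thread is what the Servlet API mandates when sendError is called on a response that has already been committed (its first bytes flushed to the client). A toy model of that rule, in Python rather than Tomcat's actual code:

```python
class ToyResponse:
    """Toy model (not Tomcat's code) of the servlet-response rule:
    sendError() is illegal once the response has been committed,
    i.e. once any output has been flushed to the client."""

    def __init__(self):
        self.committed = False

    def write(self, data: bytes) -> None:
        self.committed = True  # flushing any output commits the response

    def send_error(self, status: int) -> None:
        if self.committed:
            # Roughly what ResponseFacade.sendError() throws when
            # SolrDispatchFilter tries to report a failure too late.
            raise RuntimeError("response already committed")
        self.committed = True
```

That fits Alexandre's theory: the request died partway through, SolrDispatchFilter tried to sendError on the half-written response, and got this exception instead.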
anyone have any clues about this exception
Tomcat localhost log (not the catalina log) for my Solr 3.6.1 (master) instance contains lots of these exceptions, but Solr itself seems to be doing fine... any ideas? I'm not seeing these exceptions being logged on my slave servers, btw, just the master where we do our indexing.

Oct 9, 2012 5:34:11 PM org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Servlet.service() for servlet default threw exception
java.lang.IllegalStateException
        at org.apache.catalina.connector.ResponseFacade.sendError(ResponseFacade.java:407)
        at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:389)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:291)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at com.googlecode.psiprobe.Tomcat60AgentValve.invoke(Tomcat60AgentValve.java:30)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
        at java.lang.Thread.run(Unknown Source)
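Alexandre's closed-pipe theory, offered later in the thread, is easy to reproduce in miniature: a client that times out and hangs up while the server is still busy leaves the server writing into a dead connection. A self-contained sketch with plain sockets standing in for the indexer and Tomcat:

```python
import socket

# The "indexer": gives up after a short timeout and closes its socket,
# the way a client does when an optimize keeps the server busy too long.
server_side, client_side = socket.socketpair()
client_side.settimeout(0.1)
try:
    client_side.recv(1024)      # server never answers in time...
except socket.timeout:
    client_side.close()         # ...so the impatient client hangs up

# The "server": finally tries to send its response after the long merge.
error = None
try:
    for _ in range(64):         # loop past anything the kernel buffers
        server_side.sendall(b"HTTP/1.1 200 OK\r\n" + b"x" * 65536)
except OSError as exc:          # BrokenPipeError / ConnectionResetError
    error = exc
server_side.close()
print(type(error).__name__)     # the write to the closed pipe fails
```

The failed write surfaces inside the servlet container, and the container's attempt to report it is what shows up as the IllegalStateException in the trace above.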