[ https://issues.apache.org/jira/browse/SOLR-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761677#action_12761677 ]
Artem Russakovskii commented on SOLR-1482:
------------------------------------------

Also, just saw this on the first slave:
{quote}
INFO: Closing searc...@3efceb09 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Oct 2, 2009 11:43:27 AM org.apache.solr.handler.SnapPuller doCommit
INFO: Force open index writer to make sure older index files get deleted
Oct 2, 2009 11:43:35 AM org.apache.solr.update.SolrIndexWriter finalize
SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
{quote}

> Solr master and slave freeze after query
> ----------------------------------------
>
> Key: SOLR-1482
> URL: https://issues.apache.org/jira/browse/SOLR-1482
> Project: Solr
> Issue Type: Bug
> Affects Versions: 1.4
> Environment: Nightly 9/28/09.
> 14 individual instances per server, using JNDI.
> replicateAfter commit, 5 min interval polling.
> All caches are currently commented out, on both slave and master.
> Lots of ongoing commits - large chunks of data, each accompanied by a commit.
> This is to guarantee that anything we think is now in Solr remains there in
> case the server crashes.
> Reporter: Artem Russakovskii
> Priority: Critical
> Attachments: catalina.out, catalina2.out
>
>
> We're having issues with the deployment of 2 master-slave setups.
> One of the master-slave setups is OK (so far) but on the other both the
> master and the slave keep freezing, but only after I send a query to them.
> And by freezing I mean indefinite hanging, with almost no output to log, no
> errors, nothing. It's as if there's some sort of a deadlock. The hanging
> servers need to be killed with -9, otherwise they keep hanging.
> The query I send queries all instances at the same time using the ?shards=
> syntax.
> On the slave, the logs just stop - nothing shows up anymore after the query
> is issued. On the master, they're a bit more descriptive. This information
> seeps through very-very slowly, as you can see from the timestamps:
> {quote}
> SEVERE: java.lang.OutOfMemoryError: PermGen space
> Oct 1, 2009 2:16:00 PM org.apache.solr.common.SolrException log
> SEVERE: java.lang.OutOfMemoryError: PermGen space
> Oct 1, 2009 2:19:37 PM org.apache.catalina.connector.CoyoteAdapter service
> SEVERE: An exception or error occurred in the container during the request
> processing
> java.lang.OutOfMemoryError: PermGen space
> Oct 1, 2009 2:19:37 PM org.apache.coyote.http11.Http11Processor process
> SEVERE: Error processing request
> java.lang.OutOfMemoryError: PermGen space
> Oct 1, 2009 2:19:39 PM org.apache.catalina.connector.CoyoteAdapter service
> SEVERE: An exception or error occurred in the container during the request
> processing
> java.lang.OutOfMemoryError: PermGen space
> Exception in thread "ContainerBackException in thread "pool-29-threadOct 1,
> 2009 2:21:47 PM org.apache.catalina.connector.CoyoteAdapter service
> SEVERE: An exception or error occurred in the container during the request
> processing
> java.lang.OutOfMemoryError: PermGen space
> Oct 1, 2009 2:21:47 PM
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler process
> SEVERE: Error reading request, ignored
> java.lang.OutOfMemoryError: PermGen space
> Oct 1, 2009 2:21:47 PM
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler process
> SEVERE: Error reading request, ignored
> java.lang.OutOfMemoryError: PermGen space
> -22" java.lang.OutOfMemoryError: PermGen space
> Oct 1, 2009 2:21:47 PM
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler process
> SEVERE: Error reading request, ignored
> java.lang.OutOfMemoryError: PermGen space
> Exception in thread "http-8080-42" Oct 1, 2009 2:21:47 PM
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler process
> SEVERE: Error reading request, ignored
> java.lang.OutOfMemoryError: PermGen space
> Oct 1, 2009 2:21:47 PM
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler process
> SEVERE: Error reading request, ignored
> java.lang.OutOfMemoryError: PermGen space
> Oct 1, 2009 2:21:47 PM
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler process
> SEVERE: Error reading request, ignored
> java.lang.OutOfMemoryError: PermGen space
> Exception in thread "http-8080-26" Exception in thread "http-8080-32"
> Exception in thread "http-8080-25" Exception in thread "http-8080-22"
> Exception in thread "http-8080-15" Exception in thread "http-8080-45"
> Exception in thread "http-8080-13" Exception in thread "http-8080-48"
> Exception in thread "http-8080-7" Exception in thread "http-8080-38"
> Exception in thread "http-8080-39" Exception in thread "http-8080-28"
> Exception in thread "http-8080-1" Exception in thread "http-8080-2" Exception
> in thread "http-8080-12" Exception in thread "http-8080-44" Exception in
> thread "http-8080-47" Exception in thread "http-8080-29" Exception in thread
> "http-8080-33" Exception in thread "http-8080-27" Exception in thread
> "http-8080-36" Exception in thread "http-8080-113" Exception in thread
> "http-8080-112" Exception in thread "http-8080-37" Exception in thread
> "http-8080-18" java.lang.OutOfMemoryError: PermGen space
> java.lang.OutOfMemoryError: PermGen space
> java.lang.OutOfMemoryError: PermGen space
> java.lang.OutOfMemoryError: PermGen space
> java.lang.OutOfMemoryError: PermGen space
> Exception in thread "http-8080-34" java.lang.OutOfMemoryError: PermGen space
> java.lang.OutOfMemoryError: PermGen space
> Exception in thread "http-8080-103"
> {quote}
> So the problem seems to be related to PermGen space. I found
> http://www.nabble.com/Number-of-webapps-td22198080.html and tried
> -XX:MaxPermSize=256m, but it didn't fix the problem.
> The current CATALINA_OPTS looks like this:
> {quote}
> export CATALINA_OPTS="-XX:MaxPermSize=256m -Xmx6500m -XX:+UseConcMarkSweepGC"
> {quote}
> Is the only solution at this point going multicore, as Noble suggested (is
> Noble your first name? I always assumed it was Paul and Noble was part of the
> nickname)? Will multicore get rid of the problem, before we spend time
> looking at it? For multicore, will the existing data dirs be compatible or
> would a complete reindex be needed?
> I'm willing to provide any information to you guys, just not sure what at the
> moment. I'm also open to communicating outside of JIRA, at artem [_aT_] plaxo
> {dot} com.
> Thanks.
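
For anyone going through the attached catalina.out files, here is a rough sketch of how the PermGen errors and the time span they cover could be pulled out of such a log. It is only an illustration: the function name and the sample text are made up for this example and are not part of Solr or Tomcat, and it assumes the java.util.logging timestamp format seen in the excerpt above and an English locale for month names.

```python
import re
from datetime import datetime

# Matches java.util.logging timestamps such as "Oct 1, 2009 2:16:00 PM".
TIMESTAMP = re.compile(r"[A-Z][a-z]{2} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M")
OOM_MARKER = "java.lang.OutOfMemoryError: PermGen space"


def summarize_permgen_errors(log_text):
    """Return (oom_count, first_timestamp, last_timestamp) for a log excerpt.

    Hypothetical helper: counts PermGen OutOfMemoryError mentions and reports
    the earliest and latest timestamps found anywhere in the excerpt.
    """
    oom_count = log_text.count(OOM_MARKER)
    # "%b" parsing depends on the locale; English month names are assumed.
    stamps = [
        datetime.strptime(match.group(0), "%b %d, %Y %I:%M:%S %p")
        for match in TIMESTAMP.finditer(log_text)
    ]
    if not stamps:
        return oom_count, None, None
    return oom_count, min(stamps), max(stamps)


# Small sample copied from the log excerpt quoted in the issue.
SAMPLE = """\
SEVERE: java.lang.OutOfMemoryError: PermGen space
Oct 1, 2009 2:16:00 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.OutOfMemoryError: PermGen space
Oct 1, 2009 2:19:37 PM org.apache.catalina.connector.CoyoteAdapter service
SEVERE: An exception or error occurred in the container during the request processing
java.lang.OutOfMemoryError: PermGen space
"""

count, first, last = summarize_permgen_errors(SAMPLE)
print("PermGen errors:", count)
print("Time between first and last timestamp:", last - first)
```

Running this over a whole catalina.out (read with `open(path).read()`) would show how many PermGen errors were logged and how long the server took to emit them, which is what the slow timestamps in the excerpt hint at.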