[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

Markus Jelsma (JIRA) Wed, 25 Jul 2012 03:17:39 -0700

    [ https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422135#comment-13422135 ]


Markus Jelsma commented on SOLR-1781:
-------------------------------------

Hi,

I'll restart one node with two cores.

{code}
# cat cores/openindex_b/data/index.properties
#index properties
#Wed Jul 25 09:58:26 UTC 2012
index=index.20120725095644707
{code}

{code}
# du -h cores/
4.0K    cores/lib
46M     cores/openindex_b/data/index.20120725095644707
404K    cores/openindex_b/data/tlog
46M     cores/openindex_b/data
46M     cores/openindex_b
98M     cores/openindex_a/data/index.20120725095843731
124K    cores/openindex_a/data/tlog
98M     cores/openindex_a/data
98M     cores/openindex_a
144M    cores/
{code}

2012-07-25 10:01:09,176 WARN [solr.core.SolrCore] - [main] - : New index directory detected: old=null new=/opt/solr/cores/openindex_b/data/index.20120725095644707
...
2012-07-25 10:01:17,303 WARN [solr.core.SolrCore] - [main] - : New index directory detected: old=null new=/opt/solr/cores/openindex_a/data/index.20120725095843731
...
2012-07-25 10:01:55,016 WARN [solr.core.SolrCore] - [RecoveryThread] - : New index directory detected: old=/opt/solr/cores/openindex_b/data/index.20120725095644707 new=/opt/solr/cores/openindex_b/data/index.20120725100120496
...
2012-07-25 10:03:35,236 WARN [solr.core.SolrCore] - [RecoveryThread] - : New index directory detected: old=/opt/solr/cores/openindex_a/data/index.20120725100220706 new=/opt/solr/cores/openindex_a/data/index.20120725100321897
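To follow these directory switches in a longer log, something like the Python sketch below could pull the old/new pairs out of the "New index directory detected" warnings. The function name index_transitions is hypothetical, and the pattern is derived only from the message text quoted in this comment, not from Solr's source, so treat it as an assumption about the log format.

```python
import re

# Pattern based on the WARN message quoted in this comment (an assumption
# about the log format). \s+ is used between words so that lines wrapped
# by a mail client still match.
NEW_INDEX_DIR = re.compile(
    r"New\s+index\s+directory\s+detected:\s+old=(\S+)\s+new=(\S+)"
)


def index_transitions(log_text):
    """Return a list of (old, new) index directory pairs found in log_text.

    The literal string "null" logged for a missing old directory is
    converted to None.
    """
    pairs = []
    for old, new in NEW_INDEX_DIR.findall(log_text):
        pairs.append((None if old == "null" else old, new))
    return pairs
```

Calling index_transitions(open("solr.log").read()) (the file name is only an example) gives one pair per warning, which makes it easier to see which directory a core was on before each switch.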


{code}
# du -h cores/
4.0K    cores/lib
46M     cores/openindex_b/data/index.20120725095644707
404K    cores/openindex_b/data/tlog
46M     cores/openindex_b/data/index.20120725100120496
91M     cores/openindex_b/data
91M     cores/openindex_b
98M     cores/openindex_a/data/index.20120725100321897
98M     cores/openindex_a/data/index.20120725100220706
124K    cores/openindex_a/data/tlog
196M    cores/openindex_a/data
196M    cores/openindex_a
287M    cores/
{code}

A few minutes later we still have multiple index directories. No updates have 
been sent to the cluster during this whole scenario. Each time another 
directory appears it comes with a lot of I/O; on these RAM-limited machines 
the node is almost thrashing because of the additional directory. It does not 
create another directory on every restart, only sometimes. I restarted the same 
machine again and now I have three dirs for each core.

I'll turn on INFO logging for the node and restart it again without deleting 
the surplus dirs. The master and slave versions are still the same.

{code}
# du -h cores/
4.0K    cores/lib
46M     cores/openindex_b/data/index.20120725100813961
42M     cores/openindex_b/data/index.20120725101349376
46M     cores/openindex_b/data/index.20120725095644707
46M     cores/openindex_b/data/index.20120725101231289
404K    cores/openindex_b/data/tlog
46M     cores/openindex_b/data/index.20120725100120496
223M    cores/openindex_b/data
223M    cores/openindex_b
98M     cores/openindex_a/data/index.20120725101252920
98M     cores/openindex_a/data/index.20120725100220706
124K    cores/openindex_a/data/tlog
196M    cores/openindex_a/data
196M    cores/openindex_a
418M    cores/
{code}

Maybe it cannot find the current index directory on startup (in my case).
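As a way to check this by hand, here is a minimal Python sketch that compares what index.properties names as the current index with what is actually on disk. The helper names active_index_dir and surplus_index_dirs are hypothetical, the parsing only handles the simple index=... line shown in this comment, and the fallback to a plain "index" directory when no properties file exists is an assumption. It only lists directories and deletes nothing.

```python
from pathlib import Path


def active_index_dir(data_dir):
    """Return the directory name given by the index= line in index.properties.

    Assumption: if the file or the key is missing, the plain "index"
    directory is taken to be the active one.
    """
    props = Path(data_dir) / "index.properties"
    if props.is_file():
        for line in props.read_text().splitlines():
            line = line.strip()
            # Comment lines start with '#', so they never match "index=".
            if line.startswith("index="):
                return line.split("=", 1)[1].strip()
    return "index"


def surplus_index_dirs(data_dir):
    """Return sorted names of index directories that are not the active one."""
    active = active_index_dir(data_dir)
    surplus = []
    for entry in Path(data_dir).iterdir():
        is_index = entry.name == "index" or entry.name.startswith("index.")
        if entry.is_dir() and is_index and entry.name != active:
            surplus.append(entry.name)
    return sorted(surplus)
```

For example, surplus_index_dirs("cores/openindex_b/data") would return every index.* directory other than the one named in that core's index.properties.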


2012-07-25 10:13:36,125 WARN [solr.core.SolrCore] - [main] - : New index directory detected: old=null new=/opt/solr/cores/openindex_b/data/index.20120725101231289
2012-07-25 10:13:45,840 WARN [solr.core.SolrCore] - [main] - : New index directory detected: old=null new=/opt/solr/cores/openindex_a/data/index.20120725101252920
2012-07-25 10:15:41,393 WARN [solr.core.SolrCore] - [RecoveryThread] - : New index directory detected: old=/opt/solr/cores/openindex_b/data/index.20120725101231289 new=/opt/solr/cores/openindex_b/data/index.20120725101349376
2012-07-25 10:15:46,895 WARN [solr.cloud.RecoveryStrategy] - [main-EventThread] - : Stopping recovery for core openindex_b zkNodeName=nl2.index.openindex.io:8080_solr_openindex_b
2012-07-25 10:15:46,952 WARN [solr.core.SolrCore] - [RecoveryThread] - : [openindex_a] Error opening new searcher. exceeded limit of maxWarmingSearchers=1, try again later.
2012-07-25 10:15:47,298 ERROR [solr.cloud.RecoveryStrategy] - [RecoveryThread] - : Error while trying to recover.
org.apache.solr.common.SolrException: Error opening new searcher. exceeded limit of maxWarmingSearchers=1, try again later.
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1365)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1157)
        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:560)
        at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:316)
        at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:210)
2012-07-25 10:15:47,299 ERROR [solr.cloud.RecoveryStrategy] - [RecoveryThread] - : Recovery failed - trying again...

This is crazy :)

btw: this is today's build.
                
> Replication index directories not always cleaned up
> ---------------------------------------------------
>
>                 Key: SOLR-1781
>                 URL: https://issues.apache.org/jira/browse/SOLR-1781
>             Project: Solr
>          Issue Type: Bug
>          Components: replication (java), SolrCloud
>    Affects Versions: 1.4
>         Environment: Windows Server 2003 R2, Java 6b18
>            Reporter: Terje Sten Bjerkseth
>            Assignee: Mark Miller
>             Fix For: 4.0, 5.0
>
>         Attachments: 
> 0001-Replication-does-not-always-clean-up-old-directories.patch, 
> SOLR-1781.patch, SOLR-1781.patch
>
>
> We had the same problem as someone described in 
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201001.mbox/%3c222a518d-ddf5-4fc8-a02a-74d4f232b...@snooth.com%3e.
>  A partial copy of that message:
> We're using the new replication and it's working pretty well. There's  
> one detail I'd like to get some more information about.
> As the replication works, it creates versions of the index in the data  
> directory. Originally we had index/, but now there are dated versions  
> such as index.20100127044500/, which are the replicated versions.
> Each copy is sized in the vicinity of 65G. With our current hard drive  
> it's fine to have two around, but 3 gets a little dicey. Sometimes  
> we're finding that the replication doesn't always clean up after  
> itself. I would like to understand this better, or to not have this  
> happen. It could be a configuration issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
