Re: Cannot finish recovery due to always met ReplicationHandler SnapPull failed: Unable to download xxx.fdt completely

2014-08-07 Thread forest_soup
I have opened a JIRA issue for it:
https://issues.apache.org/jira/browse/SOLR-6333



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Cannot-finish-recovery-due-to-always-met-ReplicationHandler-SnapPull-failed-Unable-to-download-xxx-fy-tp4151611p4151631.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Cannot finish recovery due to always met ReplicationHandler SnapPull failed: Unable to download xxx.fdt completely

2014-08-07 Thread forest_soup
Thanks.
My environment is two VMs with good network conditions, so I am not sure why this happens.
We are trying to reproduce it. The PeerSync failure log is:
2014-07-25 6:30:48 AM WARN SnapPuller Error in fetching packets
java.io.EOFException
    at org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:154)
    at org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:146)
    at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchPackets(SnapPuller.java:1211)
    at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1174)
    at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:771)
    at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:421)
    at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:322)
    at org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:155)
    at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:437)
    at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:247)






Re: Cannot finish recovery due to always met ReplicationHandler SnapPull failed: Unable to download xxx.fdt completely

2014-08-07 Thread Shalin Shekhar Mangar
Why does PeerSync take so much time? Are these two nodes in different data
centers or are they connected by a slow link?


On Thu, Aug 7, 2014 at 12:41 PM, forest_soup  wrote:

> I have two Solr nodes (solr1 and solr2) in a SolrCloud cluster.
> After an issue occurred, solr2 went into the recovering state. PeerSync
> could not finish within about 15 minutes, so it fell back to a snap pull.
> But during the snap pull it always hits the error below. Meanwhile,
> update requests are still being sent to both the recovering node (solr2)
> and the healthy node (solr1), and the index on the recovering node is
> deleted and rebuilt again and again, so recovery takes a long time to finish.
>
> Is this a bug, or is it by design in Solr?
> And could anyone help me speed up the recovery?
>
> Thanks!
>
> 2014-07-17 5:12:50 PM ERROR ReplicationHandler SnapPull failed :org.apache.solr.common.SolrException: Unable to download _vdq.fdt completely. Downloaded 0!=182945
> SnapPull failed :org.apache.solr.common.SolrException: Unable to download _vdq.fdt completely. Downloaded 0!=182945
>     at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.cleanup(SnapPuller.java:1305)
>     at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1185)
>     at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:771)
>     at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:421)
>     at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:322)
>     at org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:155)
>     at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:437)
>     at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:247)
>
>
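> We have the following settings in solrconfig.xml:
>   <autoCommit>
>     <maxDocs>1000</maxDocs>
>     <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>     <openSearcher>true</openSearcher>
>   </autoCommit>
>
>   <autoSoftCommit>
>     <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
>   </autoSoftCommit>
>
> and the setting with value 8 is left at its default.
>
> My solrconfig.xml is attached.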
> 
>
>
>
>



-- 
Regards,
Shalin Shekhar Mangar.


Cannot finish recovery due to always met ReplicationHandler SnapPull failed: Unable to download xxx.fdt completely

2014-08-07 Thread forest_soup
I have two Solr nodes (solr1 and solr2) in a SolrCloud cluster.
After an issue occurred, solr2 went into the recovering state. PeerSync
could not finish within about 15 minutes, so it fell back to a snap pull.
But during the snap pull it always hits the error below. Meanwhile,
update requests are still being sent to both the recovering node (solr2)
and the healthy node (solr1), and the index on the recovering node is
deleted and rebuilt again and again, so recovery takes a long time to finish.

Is this a bug, or is it by design in Solr?
And could anyone help me speed up the recovery?

Thanks! 

2014-07-17 5:12:50 PM ERROR ReplicationHandler SnapPull failed :org.apache.solr.common.SolrException: Unable to download _vdq.fdt completely. Downloaded 0!=182945
SnapPull failed :org.apache.solr.common.SolrException: Unable to download _vdq.fdt completely. Downloaded 0!=182945
    at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.cleanup(SnapPuller.java:1305)
    at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1185)
    at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:771)
    at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:421)
    at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:322)
    at org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:155)
    at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:437)
    at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:247)
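
To check whether every attempt fails at zero bytes or partway through a file, the failure lines can be pulled out of the log with a small script. This is only a minimal sketch in Python: it assumes the message has exactly the wording shown in the log above and sits on a single line, and the names IncompleteDownload and parse_incomplete_download are hypothetical helpers, not part of Solr.

```python
import re
from typing import NamedTuple, Optional


class IncompleteDownload(NamedTuple):
    """One 'Unable to download ... completely' failure taken from a log line."""
    file_name: str
    downloaded: int  # bytes received before the fetch gave up
    expected: int    # bytes the recovering node expected to receive


# Pattern modelled on the message text quoted in this thread,
# not derived from the Solr source code (assumption).
_PATTERN = re.compile(
    r"Unable to download (\S+) completely\. Downloaded (\d+)!=(\d+)"
)


def parse_incomplete_download(line: str) -> Optional[IncompleteDownload]:
    """Return the parsed failure, or None if the line does not match."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    return IncompleteDownload(
        file_name=match.group(1),
        downloaded=int(match.group(2)),
        expected=int(match.group(3)),
    )


if __name__ == "__main__":
    sample = (
        "SnapPull failed :org.apache.solr.common.SolrException: "
        "Unable to download _vdq.fdt completely. Downloaded 0!=182945"
    )
    print(parse_incomplete_download(sample))
```

Running this over the whole log and comparing the downloaded and expected values per file would show whether the transfer never starts (as in the line above, where 0 bytes arrived) or is cut off midway.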


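We have the following settings in solrconfig.xml:
  <autoCommit>
    <maxDocs>1000</maxDocs>
    <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
    <openSearcher>true</openSearcher>
  </autoCommit>

  <autoSoftCommit>
    <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
  </autoSoftCommit>

and the setting with value 8 is left at its default.

My solrconfig.xml is attached.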
  


