[jira] Commented: (SOLR-1144) replication hang

2009-05-08 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707405#action_12707405
 ] 

Yonik Seeley commented on SOLR-1144:


bq. ReplicationHandler does not cause the hang on the master.

The slave is waiting forever, but it *could* be due to a bug on either the 
master or the slave, and it could be due to the replication handler.  It could 
also be another Solr bug somewhere, or it could be a Tomcat bug.

What is apparent is that since there is no replication stack trace on the 
master, it thinks it finished the file send (either that or got an exception), 
but the slave is still expecting more for some reason.  Perhaps if we used 
non-persistent connections for replication, the master would close the 
connection when it thought it had sent everything?


 replication hang
 

 Key: SOLR-1144
 URL: https://issues.apache.org/jira/browse/SOLR-1144
 Project: Solr
  Issue Type: Bug
Reporter: Yonik Seeley
 Fix For: 1.4


 It seems that replication can sometimes hang.
 http://www.lucidimagination.com/search/document/403305a3fda18599

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1144) replication hang

2009-05-08 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707421#action_12707421
 ] 

Noble Paul commented on SOLR-1144:
--

The master closes the connection once everything is written. When the download 
of a file is complete, the slave also closes the stream. The fact that the slave 
continued to wait means the file has not been downloaded completely.
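
To make the "not downloaded completely" case concrete, here is a minimal sketch of a length-checked read loop. It is not the actual SnapPuller code; the class and method names are hypothetical, and it assumes the receiver knows the expected file length up front. If the sender closes the stream early, read() returns -1 and the shortfall is reported. If the sender neither sends more data nor closes, the read() call is where the receiver would sit waiting.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration only; not taken from Solr's replication code.
class LengthCheckedDownload {

    /**
     * Reads exactly expectedLength bytes from the stream.
     * Throws EOFException if the stream ends before that many bytes arrive.
     */
    static byte[] readFully(InputStream in, int expectedLength) throws IOException {
        byte[] buf = new byte[expectedLength];
        int received = 0;
        while (received < expectedLength) {
            // On a socket stream this call blocks while the peer is silent but still connected.
            int n = in.read(buf, received, expectedLength - received);
            if (n == -1) {
                throw new EOFException("stream ended after " + received
                        + " of " + expectedLength + " bytes");
            }
            received += n;
        }
        return buf;
    }

    public static void main(String[] args) throws IOException {
        byte[] file = {1, 2, 3, 4, 5};

        // Sender delivered everything that was promised.
        System.out.println("complete: " + readFully(new ByteArrayInputStream(file), 5).length + " bytes");

        // Sender stopped early: the short stream is detected instead of silently accepted.
        try {
            readFully(new ByteArrayInputStream(file), 8);
        } catch (EOFException e) {
            System.out.println("incomplete: " + e.getMessage());
        }
    }
}
```

An in-memory stream can only show the "closed early" outcome; the "waits forever" outcome needs a live socket whose peer stays connected without sending.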




[jira] Commented: (SOLR-1144) replication hang

2009-05-08 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707425#action_12707425
 ] 

Yonik Seeley commented on SOLR-1144:


bq. The master closes the connection if everything is written. 

Hmmm, that doesn't jibe with the slave hanging on a read though... seems like 
the only way read() should block is if there is no more data to read currently 
and the socket is still open.
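
The read() behavior described here can be checked with a small loopback experiment. This is only a sketch with hypothetical names, using plain java.net sockets rather than Solr's or Tomcat's HTTP stack. A local sender writes one byte and then either closes the socket or holds it open. SO_TIMEOUT is set on the reader solely so that the blocked case becomes observable; without it, the second read() would wait for as long as the sender keeps the socket open.

```java
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical illustration only; not taken from Solr or Tomcat.
class BlockingReadSketch {

    /**
     * Connects to a local sender that writes one byte, then either closes the
     * socket or holds it open. Returns "EOF" if the next read() sees end of
     * stream, "TIMEOUT" if it blocked until SO_TIMEOUT fired, "DATA" otherwise.
     */
    static String readAfterFirstByte(boolean senderCloses, int timeoutMs) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread sender = new Thread(() -> {
                try (Socket s = server.accept()) {
                    s.getOutputStream().write(42);
                    s.getOutputStream().flush();
                    if (!senderCloses) {
                        // Keep the socket open without sending anything more.
                        Thread.sleep(timeoutMs * 3L);
                    }
                } catch (Exception ignored) {
                    // The sketch does not care about sender-side errors.
                }
            });
            sender.setDaemon(true);
            sender.start();

            try (Socket client = new Socket(loopback, server.getLocalPort())) {
                client.setSoTimeout(timeoutMs); // bounds every blocking read()
                InputStream in = client.getInputStream();
                in.read(); // consume the single byte the sender wrote
                try {
                    return in.read() == -1 ? "EOF" : "DATA";
                } catch (SocketTimeoutException e) {
                    return "TIMEOUT";
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("sender closes:     " + readAfterFirstByte(true, 300));
        System.out.println("sender holds open: " + readAfterFirstByte(false, 300));
    }
}
```

If the sender really does close the connection, the reader should see end of stream rather than block, so a reader stuck in read() suggests the socket is still open from the reader's point of view.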




[jira] Commented: (SOLR-1144) replication hang

2009-05-07 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706868#action_12706868
 ] 

Noble Paul commented on SOLR-1144:
--

ReplicationHandler does not cause the hang on the master. On the slave, the 
SnapPuller was waiting forever, which I hope has been fixed by SOLR-1096.




[jira] Commented: (SOLR-1144) replication hang

2009-05-05 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706199#action_12706199
 ] 

Yonik Seeley commented on SOLR-1144:


Hmmm, I had trouble finding SOLR-1096 before.
But it looks like it was used mainly for adding a timeout.  There's still an 
underlying bug somewhere, right?




[jira] Commented: (SOLR-1144) replication hang

2009-05-05 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706302#action_12706302
 ] 

Noble Paul commented on SOLR-1144:
--

The stack trace is at http://markmail.org/message/ecr6m4rf4iy2d652

I suspect the following two threads are blocked:

{code}
'NioBlockingSelector.BlockPoller-2' Id=10, RUNNABLE on lock=, total cpu time=5580.ms user time=2120.ms
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:305)
'NioBlockingSelector.BlockPoller-1' Id=9, RUNNABLE on lock=, total cpu time=333280.ms user time=107520.ms
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:305)
{code}






[jira] Commented: (SOLR-1144) replication hang

2009-05-04 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705891#action_12705891
 ] 

Noble Paul commented on SOLR-1144:
--

Isn't this the same as SOLR-1096?
