[jira] [Commented] (CASSANDRA-2644) Make bootstrap retry

2011-05-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041155#comment-13041155
 ] 

Hudson commented on CASSANDRA-2644:
---

Integrated in Cassandra-0.8 #146 (See 
[https://builds.apache.org/hudson/job/Cassandra-0.8/146/])


> Make bootstrap retry
> 
>
> Key: CASSANDRA-2644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2644
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 0.8.0 beta 2
> Reporter: Chris Goffinet
> Assignee: Chris Goffinet
> Fix For: 0.8.1
>
> Attachments: 
> 0001-Make-ExpiringMap-have-objects-with-specific-timeouts.patch, 
> 0002-Make-bootstrap-retry-and-increment-timeout-for-every.patch
>
>
> We ran into a situation where we had rpc_timeout set to 1 second, and the 
> node needing to compute the token took over a second (1.6 seconds). The 
> bootstrapping node hangs forever without getting a token because the expiring 
> map removes it before the reply comes back.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-2644) Make bootstrap retry

2011-05-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041150#comment-13041150
 ] 

Hudson commented on CASSANDRA-2644:
---

Integrated in Cassandra #912 (See 
[https://builds.apache.org/hudson/job/Cassandra/912/])




[jira] [Commented] (CASSANDRA-2644) Make bootstrap retry

2011-05-30 Thread Chris Goffinet (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041045#comment-13041045
 ] 

Chris Goffinet commented on CASSANDRA-2644:
---

Made changes to the 02 patch and committed to 0.8.1. Thanks!



[jira] [Commented] (CASSANDRA-2644) Make bootstrap retry

2011-05-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035098#comment-13035098
 ] 

Jonathan Ellis commented on CASSANDRA-2644:
---

> But there are still cases that retries will recover from... flapping/down 
> nodes

Fair enough, but increasing the timeout is still unwarranted.  Let's just make 
it wait for max(DEFAULT_TIMEOUT, BOOTSTRAP_TIMEOUT) with B_T equal to, say, 30s.

Committed patch 01 to 0.8.1 branch, btw.
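
The suggestion above amounts to taking the larger of the two timeouts. A minimal Java sketch of that rule follows. Only the max(...) expression and the 30s figure come from the comment; the class, constant, and method names are hypothetical and are not taken from the Cassandra source.

```java
import java.util.concurrent.TimeUnit;

final class BootstrapTimeouts {
    // Hypothetical constant: 30s is the value floated in the comment above.
    static final long BOOTSTRAP_TIMEOUT_MS = TimeUnit.SECONDS.toMillis(30);

    /**
     * Timeout to use for the bootstrap token request.
     * defaultTimeoutMs stands in for the configured default (e.g. rpc_timeout).
     * The result is never shorter than BOOTSTRAP_TIMEOUT_MS.
     */
    static long tokenRequestTimeoutMs(long defaultTimeoutMs) {
        return Math.max(defaultTimeoutMs, BOOTSTRAP_TIMEOUT_MS);
    }
}
```

With the 1 second rpc_timeout from the report, `BootstrapTimeouts.tokenRequestTimeoutMs(1000)` would give the bootstrap request 30 seconds, while a default already above 30 seconds is left unchanged.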



[jira] [Commented] (CASSANDRA-2644) Make bootstrap retry

2011-05-13 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032871#comment-13032871
 ] 

Sylvain Lebresne commented on CASSANDRA-2644:
-

Not fully related to the discussion here, but streaming is another part of 
bootstrap, so I'll mention that CASSANDRA-2433 introduces a mechanism to 
handle unrecoverable failures during streaming. Streaming already retries on 
errors, but 1) it retries indefinitely, whereas CASSANDRA-2433 introduces a 
maximum retry count, and 2) it doesn't detect the other end being dead. 
Anyway, I'm just referencing the ticket so that if this ticket becomes "make 
bootstrap handle failures better", we don't duplicate efforts.



[jira] [Commented] (CASSANDRA-2644) Make bootstrap retry

2011-05-12 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032823#comment-13032823
 ] 

Stu Hood commented on CASSANDRA-2644:
-

Good point.



[jira] [Commented] (CASSANDRA-2644) Make bootstrap retry

2011-05-12 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032741#comment-13032741
 ] 

Jonathan Ellis commented on CASSANDRA-2644:
---

I think the retry logic is a distraction here. If it doesn't work the first 
time because of anything other than "we didn't wait long enough" (i.e. it 
errored out) it's not likely to magically unbreak for the second.

Suggest just giving it a long retry to begin with.



[jira] [Commented] (CASSANDRA-2644) Make bootstrap retry

2011-05-12 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032734#comment-13032734
 ] 

Stu Hood commented on CASSANDRA-2644:
-

+1
Thanks Chris!



[jira] [Commented] (CASSANDRA-2644) Make bootstrap retry

2011-05-12 Thread Chris Goffinet (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032715#comment-13032715
 ] 

Chris Goffinet commented on CASSANDRA-2644:
---

I have a patch for this that I'll be adding within the next day. It makes 
ExpiringMap support custom timeouts per object and makes bootstrap getToken 
retry, exponentially increasing the timeout until the retry limit is reached.
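
A rough Java sketch of the two ideas described above: a map whose entries each carry their own timeout, and a retry loop that doubles the timeout after every failed attempt. This is not the attached patch. `PerEntryExpiringMap`, `BootstrapRetry`, and `retryWithBackoff` are hypothetical names, the doubling factor is an assumption, and the current time is passed in explicitly to keep the example deterministic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongFunction;

/** Hypothetical map in which every entry has its own expiry deadline. */
final class PerEntryExpiringMap<K, V> {
    private static final class Entry<V> {
        final V value;
        final long deadlineMs;
        Entry(V value, long deadlineMs) { this.value = value; this.deadlineMs = deadlineMs; }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();

    /** Store a value that expires timeoutMs after nowMs. */
    void put(K key, V value, long timeoutMs, long nowMs) {
        entries.put(key, new Entry<>(value, nowMs + timeoutMs));
    }

    /** Return the value if it is still live at nowMs; otherwise drop it and return null. */
    V get(K key, long nowMs) {
        Entry<V> e = entries.get(key);
        if (e == null) return null;
        if (nowMs >= e.deadlineMs) {
            entries.remove(key, e); // remove only if the entry was not replaced meanwhile
            return null;
        }
        return e.value;
    }
}

final class BootstrapRetry {
    /**
     * Run attempt with a timeout that doubles after each failure.
     * attempt receives the timeout in ms and returns null to signal a timeout.
     */
    static <T> T retryWithBackoff(LongFunction<T> attempt, long initialTimeoutMs, int maxRetries) {
        long timeoutMs = initialTimeoutMs;
        for (int i = 0; i < maxRetries; i++) {
            T result = attempt.apply(timeoutMs);
            if (result != null) return result;
            timeoutMs *= 2; // assumed growth factor for the "exponential" increase
        }
        throw new IllegalStateException("no reply after " + maxRetries + " attempts");
    }
}
```

Applied to the reported case, a reply that takes 1.6 seconds would miss a first attempt with a 1 second timeout and succeed on the second attempt with 2 seconds, provided the callback entry was stored with the matching per-entry timeout.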

> Make bootstrap retry
> 
>
> Key: CASSANDRA-2644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2644
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 0.8.0 beta 2
> Reporter: Chris Goffinet
> Assignee: Chris Goffinet
>
> We ran into a situation where we had rpc_timeout set to 1 second, and the 
> node needing to compute the token took over a second (1.6 seconds). The 
> bootstrapping node hangs forever without getting a token.
