[ 
https://issues.apache.org/jira/browse/CASSANDRA-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Soumava Ghosh updated CASSANDRA-5830:
-------------------------------------

    Description: 
The following code segment (StorageProxy.java:361) causes the issue.

start is the start time of the Paxos round, so it is always less than the
current value of System.nanoTime(). The difference start - System.nanoTime()
is therefore always negative and always less than the timeout, so the loop
never exits on timeout.

{code:title=StorageProxy.java|borderStyle=solid}
private static UUID beginAndRepairPaxos(long start, ByteBuffer key, CFMetaData metadata,
                                        List<InetAddress> liveEndpoints, int requiredParticipants,
                                        ConsistencyLevel consistencyForPaxos)
    throws WriteTimeoutException
    {
        long timeout = TimeUnit.MILLISECONDS.toNanos(DatabaseDescriptor.getCasContentionTimeout());

        PrepareCallback summary = null;
        while (start - System.nanoTime() < timeout)
        {
            long ballotMillis = summary == null
                              ? System.currentTimeMillis()
                              : Math.max(System.currentTimeMillis(), 1 + UUIDGen.unixTimestamp(summary.inProgressCommit.ballot));
            UUID ballot = UUIDGen.getTimeUUID(ballotMillis);
{code}

Here, Paxos gets stuck when PREPARE returns 'true' together with an
inProgressCommit. StorageProxy.java:beginAndRepairPaxos() then tries to issue
a PROPOSE and COMMIT for that inProgressCommit. If it then repeatedly
receives 'false' as a PREPARE_RESPONSE, the timeout never fires and the loop
keeps retrying until a PREPARE_RESPONSE is 'true'.
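
To make the arithmetic concrete, here is a minimal standalone sketch that
compares the condition as reported with one plausible correction that measures
elapsed time instead. The class name, the method names, and the 1000 ms timeout
are hypothetical and chosen only for illustration; this is not taken from the
Cassandra source and is not necessarily the fix that was committed.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class; not part of Cassandra.
final class ContentionTimeoutSketch
{
    // Condition as written in the report: start - now is negative once any
    // time has passed, so it stays below every positive timeout.
    static boolean reportedCondition(long start, long now, long timeoutNanos)
    {
        return start - now < timeoutNanos;
    }

    // Assumed correction: compare the elapsed time (now - start) with the timeout.
    static boolean elapsedCondition(long start, long now, long timeoutNanos)
    {
        return now - start < timeoutNanos;
    }

    public static void main(String[] args)
    {
        // Assumed 1000 ms contention timeout, converted to nanoseconds as in the report.
        long timeout = TimeUnit.MILLISECONDS.toNanos(1000);
        long start = 0L;
        // Simulated clock reading five seconds after start.
        long now = start + TimeUnit.SECONDS.toNanos(5);

        // Still true after 5 s, so a loop guarded by it would keep running.
        System.out.println("reported: " + reportedCondition(start, now, timeout));
        // False after 5 s, so a loop guarded by it would stop.
        System.out.println("elapsed:  " + elapsedCondition(start, now, timeout));
    }
}
```

With the simulated values, the reported condition still holds five seconds
after start, while the elapsed-time form turns false as soon as more than the
timeout has passed.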

  was:
Following is the code segment (StorageProxy.java:328) which causes the issue: 

Start is the start time of the paxos, is always less than the current system 
time, and therefore the negative difference is always less than the timeout. 

{code:title=StorageProxy.java|borderStyle=solid}
private static UUID beginAndRepairPaxos(long start, ByteBuffer key, CFMetaData 
metadata, List<InetAddress> liveEndpoints, int requiredParticipants, 
ConsistencyLevel consistencyForPaxos)
    throws WriteTimeoutException
    {
        long timeout = 
TimeUnit.MILLISECONDS.toNanos(DatabaseDescriptor.getCasContentionTimeout());

        PrepareCallback summary = null;
        while (start - System.nanoTime() < timeout)
        {
            long ballotMillis = summary == null
                              ? System.currentTimeMillis()
                              : Math.max(System.currentTimeMillis(), 1 + 
UUIDGen.unixTimestamp(summary.inProgressCommit.ballot));
            UUID ballot = UUIDGen.getTimeUUID(ballotMillis);
{code}

Here, the paxos gets stuck when PREPARE returns 'true' but with 
inProgressCommit. The code in StorageProxy.java:beginAndRepairPaxos() then 
tries to issue a PROPOSE and COMMIT for the inProgressCommit, and if it 
repeatedly receives 'false' as a PREPARE_RESPONSE it gets stuck in an endless 
loop until PREPARE_RESPONSE is true. 

    
> Paxos loops endlessly due to faulty condition check
> ---------------------------------------------------
>
>                 Key: CASSANDRA-5830
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5830
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 2.0 beta 2
>            Reporter: Soumava Ghosh
