Vinayakumar B created HDFS-10902:
------------------------------------

             Summary: QJM should not consider stale/failed txn available in any 
one of JNs.
                 Key: HDFS-10902
                 URL: https://issues.apache.org/jira/browse/HDFS-10902
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: qjm
            Reporter: Vinayakumar B
            Assignee: Vinayakumar B
            Priority: Critical


In one of our clusters we faced an issue where a NameNode restart failed due to
a stale/failed txn that was available in one JN but not in the others.

The scenario is:
1. Full cluster restart.
2. The startLogSegment txn (195222) was synced in only one JN and failed on the
others because they were shutting down. On those JNs only the editlog file was
created and the txn was not synced, so after restart the file was marked as
empty.
3. Cluster restarted. During failover, this new log segment missed recovery
because the JN holding it was slow in responding to that call.
4. Recovery on the other JNs was successful, as they had no in-progress files.
5. editlog.openForWrite() detected that txn 195222 was already available, and
failed the failover.

The same steps repeated until the stale editlog in that JN was manually deleted.

Since QJM is a quorum of JNs, a txn is considered successful only if it is
written to at least a minimum quorum of JNs; otherwise it is failed.
The same rule should be applied when selecting streams for reading:
stale/failed txns available in fewer JNs than a quorum should not be considered
for reading.
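
To illustrate the proposed rule, here is a minimal sketch in Java. It is not
QJM code: the class QuorumTxnFilter and its methods are hypothetical names, and
it assumes each JN reports the last txid it can serve and holds every txn up to
that txid without gaps. Given those reports, it returns the highest txid that a
majority of JNs can serve.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hadoop.
final class QuorumTxnFilter {

    /** Smallest number of JNs that forms a majority of the configured set. */
    static int majority(int numJournalNodes) {
        return numJournalNodes / 2 + 1;
    }

    /**
     * Returns the highest txid that at least a majority of JNs can serve.
     *
     * @param lastTxIdPerJn   last txid reported by each JN that responded
     * @param numJournalNodes total number of configured JNs
     */
    static long highestQuorumTxId(List<Long> lastTxIdPerJn, int numJournalNodes) {
        int quorum = majority(numJournalNodes);
        if (lastTxIdPerJn.size() < quorum) {
            throw new IllegalStateException(
                "Only " + lastTxIdPerJn.size() + " JNs responded, need " + quorum);
        }
        // Sort descending: the element at index (quorum - 1) is the largest
        // txid that at least `quorum` JNs have reached.
        List<Long> sorted = new ArrayList<>(lastTxIdPerJn);
        Collections.sort(sorted, Collections.reverseOrder());
        return sorted.get(quorum - 1);
    }
}
```

For the scenario above, assuming three JNs where one holds txn 195222 and the
other two end at 195221 (the value 195221 is an assumption for illustration),
highestQuorumTxId(Arrays.asList(195222L, 195221L, 195221L), 3) returns 195221,
so the stale txn 195222 would not be offered for reading.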

HDFS-10519 does similar work to consider 'durable' txns based on
'committedTxnId'. But updating 'committedTxnId' on every flush with one more
RPC seems to be problematic for performance.



