[ 
https://issues.apache.org/jira/browse/HDFS-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-17645:
-----------------------------------
    Description: 
[~boky01] found a bug where an edit log can abort a Standby NameNode when it 
transitions to the active state.

 

I was able to reproduce it in a unit test. Here it is: 
[https://github.com/jojochuang/hadoop/commit/f051f0359e2ea63bbf492921fbffb87f1f1bcf56]

 

Roughly speaking:
{quote}Create a file with a long name.

Update the file name limit configuration (reduce 
dfs.namenode.fs-limits.max-component-length from 255 to 5) on the Standby 
NameNode.

Restart Standby NameNode.

Failover to Standby NameNode.
{quote}
 

Looking at the relevant code, the intent was that once a file is created and 
its edit log entry is persisted, edit log replay after a NameNode restart 
should accept that entry even if the file name length limit has since been 
changed. I think there is a bug in the Standby NameNode, and the fix is to call 
FSNamespace.setImageLoaded(true) inside transitionToActive() to indicate that 
it is replaying an accepted edit log, not requesting a new transaction.
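
The intended behavior can be illustrated with a minimal, self-contained sketch. It does not use any Hadoop classes: EditReplaySketch, NameLimitChecker, verify() and replay() are hypothetical names introduced only for illustration, and the boolean "replaying" argument is an assumed stand-in for whatever state the NameNode uses to tell edit log replay apart from a new client request.

```java
import java.util.ArrayList;
import java.util.List;

public class EditReplaySketch {

    /** Hypothetical stand-in for the path component length check. */
    static final class NameLimitChecker {
        private final int maxComponentLength;

        NameLimitChecker(int maxComponentLength) {
            this.maxComponentLength = maxComponentLength;
        }

        /**
         * Rejects an over-long name only for a new request. During replay the
         * name is accepted, because the edit was already validated and
         * persisted under the limit in force at creation time.
         */
        void verify(String name, boolean replaying) {
            if (name.length() > maxComponentLength && !replaying) {
                throw new IllegalArgumentException(
                    "name length " + name.length()
                        + " exceeds limit " + maxComponentLength);
            }
        }
    }

    /** Replays persisted file names and returns the names that were applied. */
    static List<String> replay(List<String> editLog, NameLimitChecker checker) {
        List<String> namespace = new ArrayList<>();
        for (String name : editLog) {
            checker.verify(name, true); // replay: the limit must not abort
            namespace.add(name);
        }
        return namespace;
    }

    public static void main(String[] args) {
        // Step 1: create a long file name while the limit is 255.
        List<String> editLog = new ArrayList<>();
        String longName = "a-rather-long-file-name";
        new NameLimitChecker(255).verify(longName, false);
        editLog.add(longName);

        // Steps 2-4: the limit is reduced to 5, then the persisted edit is replayed.
        NameLimitChecker reduced = new NameLimitChecker(5);
        System.out.println("replayed entries: " + replay(editLog, reduced).size());

        // A new request with the same name is still rejected under the new limit.
        try {
            reduced.verify(longName, false);
        } catch (IllegalArgumentException e) {
            System.out.println("new request rejected: " + e.getMessage());
        }
    }
}
```

In this model the reported bug corresponds to replay being checked as if it were a new request, which would make the replay of the persisted entry throw.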

 

  was:
[~boky01] found a bug where an edit log can abort a Standby NameNode when it 
transitions to the active state.

 

I was able to reproduce it in a unit test. Roughly speaking:
{quote}Create a file with a long name.

Update the file name limit configuration (reduce 
dfs.namenode.fs-limits.max-component-length from 255 to 5) on the Standby 
NameNode.

Restart Standby NameNode.

Failover to Standby NameNode.
{quote}
 

Looking at the relevant code, the intent was that once a file is created and 
its edit log entry is persisted, edit log replay after a NameNode restart 
should accept that entry even if the file name length limit has since been 
changed. I think there is a bug in the Standby NameNode, and the fix is to call 
FSNamespace.setImageLoaded(true) inside transitionToActive() to indicate that 
it is replaying an accepted edit log, not requesting a new transaction.

 


> Standby NameNode may fail to replay OP_ADD during transition to active
> ----------------------------------------------------------------------
>
>                 Key: HDFS-17645
>                 URL: https://issues.apache.org/jira/browse/HDFS-17645
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Wei-Chiu Chuang
>            Priority: Major
>
> [~boky01] found a bug where an edit log can abort a Standby NameNode when it 
> transitions to the active state.
>  
> I was able to reproduce it in a unit test. Here it is: 
> [https://github.com/jojochuang/hadoop/commit/f051f0359e2ea63bbf492921fbffb87f1f1bcf56]
>  
> Roughly speaking:
> {quote}Create a file with a long name.
> Update the file name limit configuration (reduce 
> dfs.namenode.fs-limits.max-component-length from 255 to 5) on the Standby 
> NameNode.
> Restart Standby NameNode.
> Failover to Standby NameNode.
> {quote}
>  
> Looking at the relevant code, the intent was that once a file is created and 
> its edit log entry is persisted, edit log replay after a NameNode restart 
> should accept that entry even if the file name length limit has since been 
> changed. I think there is a bug in the Standby NameNode, and the fix is to 
> call FSNamespace.setImageLoaded(true) inside transitionToActive() to indicate 
> that it is replaying an accepted edit log, not requesting a new transaction.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
