[jira] [Commented] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not
[ https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821040#comment-16821040 ] He Xiaoqiao commented on HDFS-14437: [~angerszhuuu], thanks for your correction, I will check it later.
> Exception happened when rollEditLog expects empty
> EditsDoubleBuffer.bufCurrent but not
> -
>
> Key: HDFS-14437
> URL: https://issues.apache.org/jira/browse/HDFS-14437
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: ha, namenode, qjm
> Reporter: angerszhu
> Priority: Major
>
> For the problem mentioned in https://issues.apache.org/jira/browse/HDFS-10943,
> I have sorted through the write and flush process of the EditLog and some
> important functions. I found that in the FSEditLog class, the close() function
> runs the following sequence:
>
> {code:java}
> waitForSyncToFinish();
> endCurrentLogSegment(true);{code}
> Since we have acquired the object lock in close(), when waitForSyncToFinish()
> returns it means all logSync work has finished and all data in bufReady has
> been flushed out. And since the current thread holds the lock on this object,
> no other thread can acquire it while endCurrentLogSegment() runs, so no new
> edit log records can be written into bufCurrent.
> But if we don't call waitForSyncToFinish() before endCurrentLogSegment(),
> some auto-scheduled logSync() flush may still be in progress, since that
> flush deliberately runs without synchronization, as the Javadoc of logSync()
> explains:
>
> {code:java}
> /**
> * Sync all modifications done by this thread.
> *
> * The internal concurrency design of this class is as follows:
> * - Log items are written synchronized into an in-memory buffer,
> * and each assigned a transaction ID.
> * - When a thread (client) would like to sync all of its edits, logSync()
> * uses a ThreadLocal transaction ID to determine what edit number must
> * be synced to. 
> * - The isSyncRunning volatile boolean tracks whether a sync is currently > * under progress. > * > * The data is double-buffered within each edit log implementation so that > * in-memory writing can occur in parallel with the on-disk writing. > * > * Each sync occurs in three steps: > * 1. synchronized, it swaps the double buffer and sets the isSyncRunning > * flag. > * 2. unsynchronized, it flushes the data to storage > * 3. synchronized, it resets the flag and notifies anyone waiting on the > * sync. > * > * The lack of synchronization on step 2 allows other threads to continue > * to write into the memory buffer while the sync is in progress. > * Because this step is unsynchronized, actions that need to avoid > * concurrency with sync() should be synchronized and also call > * waitForSyncToFinish() before assuming they are running alone. > */ > public void logSync() { > long syncStart = 0; > // Fetch the transactionId of this thread. > long mytxid = myTransactionId.get().txid; > > boolean sync = false; > try { > EditLogOutputStream logStream = null; > synchronized (this) { > try { > printStatistics(false); > // if somebody is already syncing, then wait > while (mytxid > synctxid && isSyncRunning) { > try { > wait(1000); > } catch (InterruptedException ie) { > } > } > // > // If this transaction was already flushed, then nothing to do > // > if (mytxid <= synctxid) { > numTransactionsBatchedInSync++; > if (metrics != null) { > // Metrics is non-null only when used inside name node > metrics.incrTransactionsBatchedInSync(); > } > return; > } > > // now, this thread will do the sync > syncStart = txid; > isSyncRunning = true; > sync = true; > // swap buffers > try { > if (journalSet.isEmpty()) { > throw new IOException("No journals available to flush"); > } > editLogStream.setReadyToFlush(); > } catch (IOException e) { > final String msg = > "Could not sync enough journals to persistent storage " + > "due to " + e.getMessage() + ". 
" + > "Unsynced transactions: " + (txid - synctxid); > LOG.fatal(msg, new Exception()); > synchronized(journalSetLock) { > IOUtils.cleanup(LOG, journalSet); > } > terminate(1, msg); > } > } finally { > // Prevent RuntimeException from blocking other log edit write > doneWithAutoSyncScheduling(); > } > //editLogStream may become null, > //so store a
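The three-step sync described in the Javadoc above (synchronized swap, unsynchronized flush, synchronized reset) can be sketched with a toy double buffer. This is a simplified illustration only: the names (bufCurrent, bufReady, setReadyToFlush) mirror HDFS's EditsDoubleBuffer, but this is not the real class.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/**
 * Toy model of the double-buffer pattern from the logSync() Javadoc.
 * Writers append to bufCurrent under the lock; the flusher drains bufReady
 * outside the lock, so writing and flushing can proceed in parallel.
 */
class ToyDoubleBuffer {
    private Queue<String> bufCurrent = new ArrayDeque<>(); // writers append here
    private Queue<String> bufReady = new ArrayDeque<>();   // flusher drains this

    // Writers log edits while holding the lock.
    synchronized void writeOp(String op) { bufCurrent.add(op); }

    // Step 1 (synchronized): swap the buffers so the flush can proceed
    // while new edits keep arriving in the fresh bufCurrent.
    synchronized void setReadyToFlush() {
        Queue<String> tmp = bufReady;
        bufReady = bufCurrent;
        bufCurrent = tmp;
    }

    // Step 2 (deliberately NOT synchronized, as in the real code): flush
    // bufReady to storage; concurrent writers may fill bufCurrent meanwhile.
    int flushTo(Queue<String> storage) {
        int n = bufReady.size();
        storage.addAll(bufReady);
        bufReady.clear();
        return n;
    }

    synchronized boolean isCurrentEmpty() { return bufCurrent.isEmpty(); }
}
```

The key consequence for this issue: after setReadyToFlush(), a write that lands during the unsynchronized flush window leaves bufCurrent non-empty even though the flush completed.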
[jira] [Commented] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not
[ https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820974#comment-16820974 ] He Xiaoqiao commented on HDFS-14437: Thanks [~starphin].
{quote}But the table you presented will hardly occur, for rollEditLog and FSEditLog#logEdit are almost always called with FSNamesystem.fsLock held, such as in mkdir and setAcl. I'm still working on it to find that racing case.{quote}
It is actually complex to reproduce, especially in a unit test. I think we could construct it the right way once we are sure the analysis is the truth, for instance with multiple threads calling logSync; just my own opinion, maybe there is a more graceful way. As mentioned above, I think it is not necessary to consider #fsLock: the problem is not related to #fsLock, and we should test FSEditLog independently.
[jira] [Commented] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not
[ https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820953#comment-16820953 ] He Xiaoqiao commented on HDFS-14437: [~angerszhuuu], [~starphin], please help to double check. [~angerszhuuu], would you like to submit a patch to fix this issue? If you need any help, please ping me.
[jira] [Comment Edited] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not
[ https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820942#comment-16820942 ] He Xiaoqiao edited comment on HDFS-14437 at 4/18/19 10:30 AM: -- Thanks [~angerszhuuu], [~starphin], it looks like the root cause is getting clearer. Some minor comments: [~starphin], IIUC, every FSEditLog#logSync call is outside FsLock and only FSEditLog#rollEditlog holds FsLock, so {{FSEditLog}} is not related to {{FsLock}} overall; thus {{FSEditLog}} uses {{synchronized}} to control concurrency. In my opinion, the following code segment in FSEditLog#logSync, which runs outside {{synchronized}}, is the core reason:
{code:java}
try {
  if (logStream != null) {
    logStream.flush();
  }
}
{code}
Consider this scenario:
||Time||Thread1 (rollEditLog)||Thread2 (logSync invoked by some NN write op)||
|t1| |*enter synchronized*|
|t2| |check syncing|
|t3| |set #syncStart and #isSyncRunning|
|t4| |swap buffers|
|t5| |do auto sync scheduling|
|t6| |*exit synchronized*|
|t7|*enter synchronized*| |
|t8|endCurrentLogSegment#logEdit| |
|t9|endCurrentLogSegment#logSyncAll (then the double buffer will be empty)| |
|t10|checkArgument about txid passes| |
|t11| |logStream.flush() (then the double buffer will fill with edit log records, maybe many records here)|
|t12|journalSet.finalizeLogSegment| |
|t13|try to close JournalAndStream but fail since the double buffer is not empty| |
|t14|exception and terminate| |
I tried to reproduce this issue with a unit test: first stop rollEditLog at {{journalSet.finalizeLogSegment}} in #endCurrentLogSegment, then #logEdit something and #logSync, and after that resume #rollEditLog; the expected exception appears. If this is the truth, I am confused about two things: 1. Why are there only a few reports about this issue? IMO the probability may be high. 2. I agree that invoking #waitForSyncToFinish at the beginning of #rollEditLog could resolve this issue. Please correct me if I missed something. Thanks again. 
was (Author: hexiaoqiao): Thanks [~angerszhuuu], [~starphin], it looks like the root cause is getting clearer. Some minor comments: [~starphin], IIUC, every FSEditLog#logSync call is outside FsLock and only FSEditLog#rollEditlog holds FsLock, so {{FSEditLog}} is not related to {{FsLock}} overall; thus {{FSEditLog}} uses {{synchronized}} to control concurrency. In my opinion, the following code segment in FSEditLog#logSync, which runs outside {{synchronized}}, is the core reason:
{code:java}
try {
  if (logStream != null) {
    logStream.flush();
  }
}
{code}
Consider this scenario:
||Time||Thread1 (rollEditLog)||Thread2 (logSync invoked by some NN write op)||
|t1| |*enter synchronized*|
|t2| |check syncing|
|t3| |set #syncStart and #isSyncRunning|
|t4| |swap buffers|
|t5| |do auto sync scheduling|
|t6| |*exit synchronized*|
|t7|*enter synchronized*| |
|t8|endCurrentLogSegment#logEdit| |
|t9|endCurrentLogSegment#logSyncAll (then the double buffer will be empty)| |
|t10|checkArgument about txid passes| |
|t11| |logStream.flush() (then the double buffer will fill with edit log records, maybe many records here)|
|t12|journalSet.finalizeLogSegment| |
|t13|try to close JournalAndStream but fail since the double buffer is not empty| |
|t14|exception and terminate| |
I tried to reproduce this issue with a unit test: first stop rollEditLog at {{journalSet.finalizeLogSegment}} in #endCurrentLogSegment, then #logEdit something and #logSync, and after that resume #rollEditLog; the expected exception appears. If this is the truth, I am confused about two things: 1. Why are there only a few reports about this issue? IMO the probability may be high. 2. How to fix it? Only invoking #waitForSyncToFinish at the beginning of #rollEditLog would not resolve it. Please correct me if I missed something. Thanks again. 
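The t1–t14 timeline in this comment can be condensed into a deterministic toy replay: if an edit lands in bufCurrent after the roll thread's final swap-and-flush (logSyncAll) but before the segment is finalized, the "bufCurrent must be empty" precondition fails. This is a schematic sketch with illustrative names only, not the real HDFS classes, and it deliberately glosses over exactly which code path refills bufCurrent during the unsynchronized flush window.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Deterministic replay of the race timeline; all names are illustrative. */
class RollRaceDemo {
    static Deque<String> bufCurrent = new ArrayDeque<>();
    static Deque<String> bufReady = new ArrayDeque<>();
    static Deque<String> journal = new ArrayDeque<>();

    // A writer appends an edit record to the in-memory buffer.
    static void logEdit(String op) { bufCurrent.add(op); }

    // logSyncAll in miniature: swap, then flush everything to the journal.
    static void swapAndFlush() {
        bufReady.addAll(bufCurrent);
        bufCurrent.clear();
        journal.addAll(bufReady);
        bufReady.clear();
    }

    // Stand-in for the precondition that throws in HDFS-14437.
    static void finalizeSegment() {
        if (!bufCurrent.isEmpty()) {
            throw new IllegalStateException(
                "expects empty EditsDoubleBuffer.bufCurrent but not");
        }
    }

    /** Returns true when the precondition violation is observed. */
    static boolean replay() {
        logEdit("OP_END_LOG_SEGMENT");
        swapAndFlush();       // t9: buffer drained, txid check passes (t10)
        logEdit("OP_MKDIR");  // an edit sneaks in before finalize (t11 window)
        try {
            finalizeSegment(); // t12-t13
            return false;
        } catch (IllegalStateException expected) {
            return true;       // t14: the reported exception path
        }
    }
}
```

Running the replay always trips the check, which is why serializing the roll against in-flight syncs (e.g. via waitForSyncToFinish) is the direction discussed in this thread.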
[jira] [Commented] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not
[ https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820942#comment-16820942 ] He Xiaoqiao commented on HDFS-14437: Thanks [~angerszhuuu], [~starphin], it looks like the root cause is getting clearer. Some minor comments: [~starphin], IIUC, every FSEditLog#logSync call is outside FsLock and only FSEditLog#rollEditlog holds FsLock, so {{FSEditLog}} is not related to {{FsLock}} overall; thus {{FSEditLog}} uses {{synchronized}} to control concurrency. In my opinion, the following code segment in FSEditLog#logSync, which runs outside {{synchronized}}, is the core reason:
{code:java}
try {
  if (logStream != null) {
    logStream.flush();
  }
}
{code}
Consider this scenario:
||Time||Thread1 (rollEditLog)||Thread2 (logSync invoked by some NN write op)||
|t1| |*enter synchronized*|
|t2| |check syncing|
|t3| |set #syncStart and #isSyncRunning|
|t4| |swap buffers|
|t5| |do auto sync scheduling|
|t6| |*exit synchronized*|
|t7|*enter synchronized*| |
|t8|endCurrentLogSegment#logEdit| |
|t9|endCurrentLogSegment#logSyncAll (then the double buffer will be empty)| |
|t10|checkArgument about txid passes| |
|t11| |logStream.flush() (then the double buffer will fill with edit log records, maybe many records here)|
|t12|journalSet.finalizeLogSegment| |
|t13|try to close JournalAndStream but fail since the double buffer is not empty| |
|t14|exception and terminate| |
I tried to reproduce this issue with a unit test: first stop rollEditLog at {{journalSet.finalizeLogSegment}} in #endCurrentLogSegment, then #logEdit something and #logSync, and after that resume #rollEditLog; the expected exception appears. If this is the truth, I am confused about two things: 1. Why are there only a few reports about this issue? IMO the probability may be high. 2. How to fix it? Only invoking #waitForSyncToFinish at the beginning of #rollEditLog would not resolve it. Please correct me if I missed something. Thanks again. 
[jira] [Commented] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not
[ https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820750#comment-16820750 ] He Xiaoqiao commented on HDFS-14437: [~angerszhuuu], tracking the code logic again: do you mean the following run sequence causes this issue?
||Time||Thread1 (rollEditLog)||Thread2 (some write op)||
|t1|logEdit|-|
|t2|logSyncAll|-|
|t3|-|logSync|
|t4|finalize|-|
|t5|exception and terminate| |
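The waitForSyncToFinish() guard mentioned throughout this thread follows the standard synchronized wait-loop pattern that the logSync() Javadoc describes: hold the lock and wait until no sync is in flight. The sketch below is simplified from that description and is not a verbatim copy of FSEditLog.

```java
/**
 * Sketch of a "wait until no sync is running" guard. Callers that must not
 * race with the unsynchronized flush (step 2 of logSync) hold the monitor
 * AND wait for the in-flight sync to complete before proceeding.
 */
class SyncGuard {
    private boolean isSyncRunning = false;

    // Entered by the syncing thread before the unsynchronized flush.
    synchronized void startSync() { isSyncRunning = true; }

    // Entered after the flush: clear the flag and wake all waiters.
    synchronized void finishSync() {
        isSyncRunning = false;
        notifyAll(); // wake anyone blocked in waitForSyncToFinish()
    }

    // Loop-on-condition wait: spurious wakeups and the bounded timeout are
    // both handled by re-checking isSyncRunning each iteration.
    synchronized void waitForSyncToFinish() {
        while (isSyncRunning) {
            try {
                wait(1000);
            } catch (InterruptedException ie) {
                // retry; the real code also swallows and re-checks
            }
        }
    }

    synchronized boolean syncing() { return isSyncRunning; }
}
```

A caller such as a roll operation would invoke waitForSyncToFinish() while holding the same monitor before assuming it runs alone, which is exactly the contract quoted from the Javadoc in this thread.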
[jira] [Commented] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not
[ https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820742#comment-16820742 ] He Xiaoqiao commented on HDFS-14437: [~angerszhuuu], thanks for the ping. It is an interesting dig. I think it would be fine to attach some documents here. I am confused by one detail of #endCurrentLogSegment: when we invoke #endCurrentLogSegment, it first syncs all pending log entries, then finalizes the log segment, and all of this runs inside synchronized. IIUC, that should guarantee the edit double buffer is empty before finalizing. Looking forward to the truth. Thanks again.
[jira] [Commented] (HDFS-14430) RBF: Fix MockNamenode bug about mocking RPC getListing and mkdir
[ https://issues.apache.org/jira/browse/HDFS-14430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819724#comment-16819724 ] He Xiaoqiao commented on HDFS-14430: [~elgoiri], [~ayushtkn], thanks for pointing that out; it makes sense to me to fix it in HDFS-14117. I will watch that issue and close this one later. Thanks.
> RBF: Fix MockNamenode bug about mocking RPC getListing and mkdir
>
> Key: HDFS-14430
> URL: https://issues.apache.org/jira/browse/HDFS-14430
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: rbf
> Reporter: He Xiaoqiao
> Assignee: He Xiaoqiao
> Priority: Major
> Attachments: HDFS-14430-HDFS-13891.001.patch
>
> Some unexpected results occur when invoking the mocked #getListing and #mkdirs
> in the current MockNamenode implementation:
> * the mock mkdirs does not check whether the parent directory exists;
> * the mock getListing does not list some child dirs/files.
> This may cause unexpected results and some unit test failures.
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14430) RBF: Fix MockNamenode bug about mocking RPC getListing and mkdir
[ https://issues.apache.org/jira/browse/HDFS-14430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819107#comment-16819107 ] He Xiaoqiao commented on HDFS-14430: Thanks [~ayushtkn] for your quick response. I just checked the interfaces using MockNamenode locally, but some results are not what I expected. For instance, using MockNamenode: 1. mkdir '/user', '/user/hive/warehouse', '/user/hadoop/test'; 2. getListing of '/user' then returns null. I expect the correct result to be {'/user/hive', '/user/hadoop'}. > RBF: Fix MockNamenode bug about mocking RPC getListing and mkdir > > > Key: HDFS-14430 > URL: https://issues.apache.org/jira/browse/HDFS-14430 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: He Xiaoqiao >Assignee: He Xiaoqiao >Priority: Major > Attachments: HDFS-14430-HDFS-13891.001.patch > > > Some unexpected result when invoke mocking #getListing and #mkdirs in current > MockNamenode implement. > * for mock mkdirs, we do not check if parent directory exists. > * for mock getListing, some child dirs/files are not listing. > It may be cause some unexpected result and cause some unit test fail.
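The listing semantics expected in the comment above (parents created implicitly by mkdir, getListing returning only the immediate children of a path) can be sketched with a plain sorted set. This is a hypothetical self-contained illustration of those semantics, not the actual MockNamenode code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch of the expected mkdir/getListing behaviour:
// mkdirs creates all missing ancestors, and getListing returns only the
// immediate children of a path ('/user' -> {'/user/hadoop', '/user/hive'}).
class MockDirTree {
    private final TreeSet<String> dirs = new TreeSet<>();

    void mkdirs(String path) {
        // Create every ancestor so that listing a parent works.
        StringBuilder cur = new StringBuilder();
        for (String p : path.split("/")) {
            if (p.isEmpty()) continue;
            cur.append('/').append(p);
            dirs.add(cur.toString());
        }
    }

    List<String> getListing(String path) {
        String prefix = path.equals("/") ? "/" : path + "/";
        List<String> children = new ArrayList<>();
        // tailSet gives the sorted run of paths at or after the prefix.
        for (String d : dirs.tailSet(prefix)) {
            if (!d.startsWith(prefix)) break;
            // Keep only immediate children, not grandchildren.
            if (d.indexOf('/', prefix.length()) < 0) {
                children.add(d);
            }
        }
        return children;
    }
}
```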
[jira] [Updated] (HDFS-14430) RBF: Fix MockNamenode bug about mocking RPC getListing and mkdir
[ https://issues.apache.org/jira/browse/HDFS-14430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14430: --- Attachment: HDFS-14430-HDFS-13891.001.patch Status: Patch Available (was: Open) Upload v001, which fixes #mkdirs and #getListing. cc [~elgoiri], please take a look. * I am not sure whether there were other considerations behind not checking the parent path. * This change will cause some unit tests to fail. If I am missing something, please correct me. Thanks. > RBF: Fix MockNamenode bug about mocking RPC getListing and mkdir > > > Key: HDFS-14430 > URL: https://issues.apache.org/jira/browse/HDFS-14430 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: He Xiaoqiao >Assignee: He Xiaoqiao >Priority: Major > Attachments: HDFS-14430-HDFS-13891.001.patch > > > Some unexpected result when invoke mocking #getListing and #mkdirs in current > MockNamenode implement. > * for mock mkdirs, we do not check if parent directory exists. > * for mock getListing, some child dirs/files are not listing. > It may be cause some unexpected result and cause some unit test fail.
[jira] [Created] (HDFS-14430) RBF: Fix MockNamenode bug about mocking RPC getListing and mkdir
He Xiaoqiao created HDFS-14430: -- Summary: RBF: Fix MockNamenode bug about mocking RPC getListing and mkdir Key: HDFS-14430 URL: https://issues.apache.org/jira/browse/HDFS-14430 Project: Hadoop HDFS Issue Type: Sub-task Components: rbf Reporter: He Xiaoqiao Assignee: He Xiaoqiao Some unexpected results occur when invoking the mocked #getListing and #mkdirs in the current MockNamenode implementation. * for the mocked mkdirs, we do not check whether the parent directory exists. * for the mocked getListing, some child dirs/files are not listed. This may cause unexpected results and make some unit tests fail.
[jira] [Commented] (HDFS-14421) HDFS block two replicas exist in one DataNode
[ https://issues.apache.org/jira/browse/HDFS-14421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16818000#comment-16818000 ] He Xiaoqiao commented on HDFS-14421: [~yuanbo], thanks for your comments. I am still confused about why the same block ends up with more than one replica on one DataNode instance, even if the VERSION files of different disks belonging to that DataNode instance were changed. Looking forward to more information from your digging. > HDFS block two replicas exist in one DataNode > - > > Key: HDFS-14421 > URL: https://issues.apache.org/jira/browse/HDFS-14421 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Yuanbo Liu >Priority: Major > Attachments: 326942161.log > > > We're using Hadoop-2.7.0. > There is a file in the cluster and it's replication factor is 2. Those two > replicas exist in one Datande. the fsck info is here: > {color:#707070}BP-499819267-xx.xxx.131.201-1452072365222:blk_1400651575_326942161 > len=484045 repl=2 > [DatanodeInfoWithStorage[xx.xxx.80.205:50010,DS-d321be27-cbd4-4edd-81ad-29b3d021ee82,DISK], > > DatanodeInfoWithStorage[xx.xx.80.205:50010,DS-d321be27-cbd4-4edd-81ad-29b3d021ee82,DISK]].{color} > and this is the exception from xx.xx.80.205 > {color:#707070}org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: > Replica not found for > BP-499819267-xx.xxx.131.201-1452072365222:blk_1400651575_326942161{color} > It's confusing that why NameNode doesn't update block map after exception. > What's the reason of two replicas existing in one Datande? > Hope to get your comments. Thanks in advance. > > >
[jira] [Commented] (HDFS-14421) HDFS block two replicas exist in one DataNode
[ https://issues.apache.org/jira/browse/HDFS-14421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815307#comment-16815307 ] He Xiaoqiao commented on HDFS-14421: [~yuanbo] I don't think we can find the root cause relying only on the DataNode's log. Is there any NameNode log about blk_1400651575? > HDFS block two replicas exist in one DataNode > - > > Key: HDFS-14421 > URL: https://issues.apache.org/jira/browse/HDFS-14421 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Yuanbo Liu >Priority: Major > Attachments: 326942161.log > > > We're using Hadoop-2.7.0. > There is a file in the cluster and it's replication factor is 2. Those two > replicas exist in one Datande. the fsck info is here: > {color:#707070}BP-499819267-xx.xxx.131.201-1452072365222:blk_1400651575_326942161 > len=484045 repl=2 > [DatanodeInfoWithStorage[xx.xxx.80.205:50010,DS-d321be27-cbd4-4edd-81ad-29b3d021ee82,DISK], > > DatanodeInfoWithStorage[xx.xx.80.205:50010,DS-d321be27-cbd4-4edd-81ad-29b3d021ee82,DISK]].{color} > and this is the exception from xx.xx.80.205 > {color:#707070}org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: > Replica not found for > BP-499819267-xx.xxx.131.201-1452072365222:blk_1400651575_326942161{color} > It's confusing that why NameNode doesn't update block map after exception. > What's the reason of two replicas existing in one Datande? > Hope to get your comments. Thanks in advance. > > >
[jira] [Commented] (HDFS-14421) HDFS block two replicas exist in one DataNode
[ https://issues.apache.org/jira/browse/HDFS-14421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815123#comment-16815123 ] He Xiaoqiao commented on HDFS-14421: Thanks [~yuanbo] for your report; it's a very interesting issue. Could you share more information about block blk_1400651575? It would be very helpful to collect all logs about blk_1400651575 from the NameNode and DataNode. > HDFS block two replicas exist in one DataNode > - > > Key: HDFS-14421 > URL: https://issues.apache.org/jira/browse/HDFS-14421 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Yuanbo Liu >Priority: Major > > We're using Hadoop-2.7.0. > There is a file which replication factor is 2. Those two replicas exist in > one Datande. the fsck info is here: > {color:#707070}BP-499819267-xx.xxx.131.201-1452072365222:blk_1400651575_326942161 > len=484045 repl=2 > [DatanodeInfoWithStorage[xx.xxx.80.205:50010,DS-d321be27-cbd4-4edd-81ad-29b3d021ee82,DISK], > > DatanodeInfoWithStorage[xx.xx.80.205:50010,DS-d321be27-cbd4-4edd-81ad-29b3d021ee82,DISK]].{color} > and this is the exception from xx.xx.80.205 > {color:#707070}org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: > Replica not found for > BP-499819267-xx.xxx.131.201-1452072365222:blk_1400651575_326942161{color} > It's confusing that why NameNode doesn't update block map after exception. > What's the reason of two replicas exist in one Datande? > Hope to get anyone's comments. Thanks in advance. > > >
[jira] [Commented] (HDFS-13248) RBF: Namenode need to choose block location for the client
[ https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815088#comment-16815088 ] He Xiaoqiao commented on HDFS-13248: Thanks [~elgoiri], I just sent mail to hdfs-dev and common-dev to invite more folks to get involved and give more suggestions and votes. Thanks again. > RBF: Namenode need to choose block location for the client > -- > > Key: HDFS-13248 > URL: https://issues.apache.org/jira/browse/HDFS-13248 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Weiwei Wu >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch, > HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch, > HDFS-13248.005.patch, HDFS-Router-Data-Locality.odt, RBF Data Locality > Design.pdf, clientMachine-call-path.jpeg, debug-info-1.jpeg, debug-info-2.jpeg > > > When execute a put operation via router, the NameNode will choose block > location for the router, not for the real client. This will affect the file's > locality. > I think on both NameNode and Router, we should add a new addBlock method, or > add a parameter for the current addBlock method, to pass the real client > information.
[jira] [Commented] (HDFS-13248) RBF: Namenode need to choose block location for the client
[ https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813539#comment-16813539 ] He Xiaoqiao commented on HDFS-13248: Thanks [~elgoiri] and [~ayushtkn]. {quote}Fake client hostname: I don't quite get this one.{quote} It is possible to construct a fake client hostname no matter which solution we choose between changing {{ClientProtocol}} and the {{RPC Framework}}, since the ClientProtocol interface and the Protobuf definitions are completely open to the client, and a client could pass an arbitrary hostname as its own. However, I believe it is under control, and generally there is no additional security risk even if the hostname is fake; I just want to point out the leak. Overall, I have no clear preference between changing {{ClientProtocol}} and the {{RPC}} layer now. I believe there may be less work if we adopt the solution of changing the {{RPC/IPC}} layer. My suggestion is that we vote on one solution and get this feature into native HDFS as soon as possible. We still have chances to optimize it later. This is just my personal opinion. :) > RBF: Namenode need to choose block location for the client > -- > > Key: HDFS-13248 > URL: https://issues.apache.org/jira/browse/HDFS-13248 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Weiwei Wu >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch, > HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch, > HDFS-13248.005.patch, HDFS-Router-Data-Locality.odt, RBF Data Locality > Design.pdf, clientMachine-call-path.jpeg, debug-info-1.jpeg, debug-info-2.jpeg > > > When execute a put operation via router, the NameNode will choose block > location for the router, not for the real client. This will affect the file's > locality. > I think on both NameNode and Router, we should add a new addBlock method, or > add a parameter for the current addBlock method, to pass the real client > information. 
[jira] [Commented] (HDFS-13248) RBF: Namenode need to choose block location for the client
[ https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812173#comment-16812173 ] He Xiaoqiao commented on HDFS-13248: Thanks [~elgoiri], {quote}My main concern with modifying ClientProtocol is that it requires the client itself to change. The change is backwards compatible but for it to work you need the client to be up to date. From our experience, this is pretty challenging.{quote} The document does not depict the scope of the change in detail. Actually, we only need to `modify the Namenode and the Router`, rather than requiring client changes, if we push the approach of modifying {{ClientProtocol}}: (1) all clients keep using the current {{ClientProtocol}} interface; (2) when the Router receives an RPC request, it gets the client hostname first, then switches to invoking the additional method, which includes the {{clientMachine}} parameter, against the Namenode; (3) when the RPC request reaches the Namenode, it uses {{clientMachine}} if it is not null, or gets the client hostname via {{Server.getRemoteAddress}} if {{clientMachine}} is null. In one word, we need to modify the Namenode and the Router, but not the Client. {quote}BTW, we could do right away the one that RouterRpcServer#getBlockLocations() reorders the destinations.{quote} I agree to reordering at the Router as well, but I think it's not necessary if we can handle this case under this ticket, since the reorder operation may reduce the QPS of the Router. Please correct me if anything is wrong. FYI. 
> RBF: Namenode need to choose block location for the client > -- > > Key: HDFS-13248 > URL: https://issues.apache.org/jira/browse/HDFS-13248 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Weiwei Wu >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch, > HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch, > HDFS-13248.005.patch, HDFS-Router-Data-Locality.odt, RBF Data Locality > Design.pdf, clientMachine-call-path.jpeg, debug-info-1.jpeg, debug-info-2.jpeg > > > When execute a put operation via router, the NameNode will choose block > location for the router, not for the real client. This will affect the file's > locality. > I think on both NameNode and Router, we should add a new addBlock method, or > add a parameter for the current addBlock method, to pass the real client > information. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
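The fallback described in step (3) of the comment above boils down to a small helper on the Namenode side. A hypothetical sketch of that resolution logic (in the real code the fallback value would come from Hadoop's Server.getRemoteAddress()):

```java
// Hypothetical helper for step (3): trust an explicit clientMachine
// forwarded by the Router, and fall back to the RPC remote address for
// clients that talk to the Namenode directly (keeping current behaviour).
class ClientMachineResolver {
    static String resolve(String clientMachine, String rpcRemoteAddress) {
        if (clientMachine != null && !clientMachine.isEmpty()) {
            return clientMachine; // request came through the Router
        }
        return rpcRemoteAddress;  // direct client, unchanged semantics
    }
}
```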
[jira] [Commented] (HDFS-13248) RBF: Namenode need to choose block location for the client
[ https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811521#comment-16811521 ] He Xiaoqiao commented on HDFS-13248: [~elgoiri],[~ayushtkn] Thanks for the document. I drafted another summary of approaches to solve RBF data locality; further comments and suggestions are welcome. [^RBF Data Locality Design.pdf] [~ayushtkn] If you have time, let's cooperate on the design document. > RBF: Namenode need to choose block location for the client > -- > > Key: HDFS-13248 > URL: https://issues.apache.org/jira/browse/HDFS-13248 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Weiwei Wu >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch, > HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch, > HDFS-13248.005.patch, HDFS-Router-Data-Locality.odt, RBF Data Locality > Design.pdf, clientMachine-call-path.jpeg, debug-info-1.jpeg, debug-info-2.jpeg > > > When execute a put operation via router, the NameNode will choose block > location for the router, not for the real client. This will affect the file's > locality. > I think on both NameNode and Router, we should add a new addBlock method, or > add a parameter for the current addBlock method, to pass the real client > information.
[jira] [Updated] (HDFS-13248) RBF: Namenode need to choose block location for the client
[ https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-13248: --- Attachment: RBF Data Locality Design.pdf > RBF: Namenode need to choose block location for the client > -- > > Key: HDFS-13248 > URL: https://issues.apache.org/jira/browse/HDFS-13248 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Weiwei Wu >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch, > HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch, > HDFS-13248.005.patch, HDFS-Router-Data-Locality.odt, RBF Data Locality > Design.pdf, clientMachine-call-path.jpeg, debug-info-1.jpeg, debug-info-2.jpeg > > > When execute a put operation via router, the NameNode will choose block > location for the router, not for the real client. This will affect the file's > locality. > I think on both NameNode and Router, we should add a new addBlock method, or > add a parameter for the current addBlock method, to pass the real client > information. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13248) RBF: Namenode need to choose block location for the client
[ https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809478#comment-16809478 ] He Xiaoqiao commented on HDFS-13248: Thanks [~elgoiri], [~ayushtkn]. To [~elgoiri], {quote}we need a full design docs.{quote} Correct, a design doc will be attached later. I am not very familiar with the `modifying the RPC protocol` solution; it would be very helpful if anyone would like to explain it. To [~ayushtkn], {quote}Saw this getAdditionalDatanode() having client name parameter in(need to dig in more) Wouldn't that work for us?{quote} The #clientName parameter of {{getAdditionalDatanode}} is just a name tag and does not include any hostname/IP unless we change it:
{code:java}
this.clientName = "DFSClient_" + dfsClientConf.getTaskId() + "_" +
    ThreadLocalRandom.current().nextInt() + "_" +
    Thread.currentThread().getId();
{code}
{quote}Do we need to do with scenarios like HBASE-22103?{quote} If we extend the protocol, we need to stay compatible with all current interfaces. Please correct me if anything is wrong. > RBF: Namenode need to choose block location for the client > -- > > Key: HDFS-13248 > URL: https://issues.apache.org/jira/browse/HDFS-13248 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Weiwei Wu >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch, > HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch, > HDFS-13248.005.patch, clientMachine-call-path.jpeg, debug-info-1.jpeg, > debug-info-2.jpeg > > > When execute a put operation via router, the NameNode will choose block > location for the router, not for the real client. This will affect the file's > locality. > I think on both NameNode and Router, we should add a new addBlock method, or > add a parameter for the current addBlock method, to pass the real client > information. 
[jira] [Commented] (HDFS-13248) RBF: Namenode need to choose block location for the client
[ https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808863#comment-16808863 ] He Xiaoqiao commented on HDFS-13248: Overall, we need to extend three methods in #ClientProtocol: addBlock/getAdditionalDatanode/getBlockLocations. 1. To avoid compatibility issues, we could just add new methods as above with only one additional parameter, #clientHostname, and keep all current interfaces. 2. The new interfaces are generally just for the Router; of course they can be invoked by clients directly, but I think the risk is under control: (1) the final target of RBF is to disable direct access from DFSClient to the Namenode, routing everything through the Router. (2) Even if not disabled, I think a DFSClient using a fake #clientHostname will not weaken data security. Any more suggestions are welcome. Based on the above, I have implemented a quick-and-dirty prototype and run it in my test env for weeks. Anyway, it is necessary to vote and get suggestions through the mailing list. I would like to push this forward if there are no more suggestions here after this week. > RBF: Namenode need to choose block location for the client > -- > > Key: HDFS-13248 > URL: https://issues.apache.org/jira/browse/HDFS-13248 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Weiwei Wu >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch, > HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch, > HDFS-13248.005.patch, clientMachine-call-path.jpeg, debug-info-1.jpeg, > debug-info-2.jpeg > > > When execute a put operation via router, the NameNode will choose block > location for the router, not for the real client. This will affect the file's > locality. > I think on both NameNode and Router, we should add a new addBlock method, or > add a parameter for the current addBlock method, to pass the real client > information. 
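The compatibility idea in point 1 of the comment above — add an overload carrying the client hostname while keeping the old signature working — can be sketched with a hypothetical interface. The names here are illustrative only, not the real ClientProtocol:

```java
// Hypothetical sketch of extending an RPC-style interface compatibly:
// the new overload carries clientHostname, and the old signature
// delegates to it with null so existing callers are untouched.
interface BlockLocator {
    // Existing method: old clients keep calling this.
    default String addBlock(String src) {
        return addBlock(src, null);
    }

    // New method: the Router passes the real client hostname.
    String addBlock(String src, String clientHostname);
}

// Toy implementation showing how the server side would see both paths.
class LocalBlockLocator implements BlockLocator {
    @Override
    public String addBlock(String src, String clientHostname) {
        // Choose locality based on the effective client machine.
        return src + "@" + (clientHostname != null ? clientHostname : "direct");
    }
}
```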
[jira] [Commented] (HDFS-13248) RBF: Namenode need to choose block location for the client
[ https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806803#comment-16806803 ] He Xiaoqiao commented on HDFS-13248: {quote}In any case, the proper approach as discussed earlier would be to be able to let know the Namenode which is the actual client. This may require some change/addition to the protocol.{quote} +1 for changing the protocol; it may have a high cost, but I think it is a safer and more graceful solution than using {{favoredNodesList}}, since that would change the original meaning and we would have to solve some corner cases, just as [~ayushtkn] mentioned above. > RBF: Namenode need to choose block location for the client > -- > > Key: HDFS-13248 > URL: https://issues.apache.org/jira/browse/HDFS-13248 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Weiwei Wu >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch, > HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch, > HDFS-13248.005.patch, clientMachine-call-path.jpeg, debug-info-1.jpeg, > debug-info-2.jpeg > > > When execute a put operation via router, the NameNode will choose block > location for the router, not for the real client. This will affect the file's > locality. > I think on both NameNode and Router, we should add a new addBlock method, or > add a parameter for the current addBlock method, to pass the real client > information.
[jira] [Updated] (HDFS-14385) RBF: Optimize MiniRouterDFSCluster with optional light weight MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-14385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14385: --- Attachment: HDFS-14385-HDFS-13891.001.patch Status: Patch Available (was: Open) Sorry for the late update. Uploaded a quick-and-dirty patch [^HDFS-14385-HDFS-13891.001.patch]; I added the parameter #isMockNamenode as a condition in {{MiniRouterDFSCluster}} to determine whether to start {{MiniDFSCluster}} or {{MockNamenode}}. Will update the related unit tests after the first reviews. > RBF: Optimize MiniRouterDFSCluster with optional light weight MiniDFSCluster > > > Key: HDFS-14385 > URL: https://issues.apache.org/jira/browse/HDFS-14385 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: He Xiaoqiao >Assignee: He Xiaoqiao >Priority: Major > Attachments: HDFS-14385-HDFS-13891.001.patch > > > MiniRouterDFSCluster mimic federated HDFS cluster with routers to support RBF > test, In MiniRouterDFSCluster, it starts MiniDFSCluster with complete roles > of HDFS which have significant time cost. As HDFS-14351 discussed, it is > better to provide mock MiniDFSCluster/Namenodes as one option to support some > test case and reduce time cost.
[jira] [Commented] (HDFS-14316) RBF: Support unavailable subclusters for mount points with multiple destinations
[ https://issues.apache.org/jira/browse/HDFS-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800723#comment-16800723 ] He Xiaoqiao commented on HDFS-14316: {quote}In HDFS-14316-HDFS-13891.010.patch I added a separate unit test (TestRouterFaultTolerant) which uses the MockNamenode (He Xiaoqiao you may want to take a look). {quote} Sorry for the late response; I will take time out these days to work on updating MiniRouterDFSCluster to use MockNamenode. Thanks [~elgoiri] for calling me in here. > RBF: Support unavailable subclusters for mount points with multiple > destinations > > > Key: HDFS-14316 > URL: https://issues.apache.org/jira/browse/HDFS-14316 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-14316-HDFS-13891.000.patch, > HDFS-14316-HDFS-13891.001.patch, HDFS-14316-HDFS-13891.002.patch, > HDFS-14316-HDFS-13891.003.patch, HDFS-14316-HDFS-13891.004.patch, > HDFS-14316-HDFS-13891.005.patch, HDFS-14316-HDFS-13891.006.patch, > HDFS-14316-HDFS-13891.007.patch, HDFS-14316-HDFS-13891.008.patch, > HDFS-14316-HDFS-13891.009.patch, HDFS-14316-HDFS-13891.010.patch, > HDFS-14316-HDFS-13891.011.patch, HDFS-14316-HDFS-13891.012.patch, > HDFS-14316-HDFS-13891.013.patch > > > Currently mount points with multiple destinations (e.g., HASH_ALL) fail > writes when the destination subcluster is down. We need an option to allow > writing in other subclusters when one is down.
[jira] [Commented] (HDFS-14385) RBF: Optimize MiniRouterDFSCluster with optional light weight MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-14385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800718#comment-16800718 ] He Xiaoqiao commented on HDFS-14385: Thanks [~elgoiri] for the additional comments; I will work on this JIRA in the next few days. > RBF: Optimize MiniRouterDFSCluster with optional light weight MiniDFSCluster > > > Key: HDFS-14385 > URL: https://issues.apache.org/jira/browse/HDFS-14385 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: He Xiaoqiao >Assignee: He Xiaoqiao >Priority: Major > > MiniRouterDFSCluster mimic federated HDFS cluster with routers to support RBF > test, In MiniRouterDFSCluster, it starts MiniDFSCluster with complete roles > of HDFS which have significant time cost. As HDFS-14351 discussed, it is > better to provide mock MiniDFSCluster/Namenodes as one option to support some > test case and reduce time cost.
[jira] [Created] (HDFS-14385) RBF: Optimize MiniRouterDFSCluster with optional light weight MiniDFSCluster
He Xiaoqiao created HDFS-14385: -- Summary: RBF: Optimize MiniRouterDFSCluster with optional light weight MiniDFSCluster Key: HDFS-14385 URL: https://issues.apache.org/jira/browse/HDFS-14385 Project: Hadoop HDFS Issue Type: Sub-task Components: rbf Reporter: He Xiaoqiao Assignee: He Xiaoqiao MiniRouterDFSCluster mimics a federated HDFS cluster with routers to support RBF tests. In MiniRouterDFSCluster, a MiniDFSCluster is started with the complete set of HDFS roles, which has a significant time cost. As discussed in HDFS-14351, it is better to provide mock MiniDFSCluster/Namenodes as an option, to support some test cases while reducing the time cost.
[jira] [Commented] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16797013#comment-16797013 ] He Xiaoqiao commented on HDFS-14351: Thanks [~elgoiri] and [~ayushtkn] for the reviews and commit. {quote}As you have the mini cluster with the mock nodes, do you mind opening the JIRA? {quote} NP, I will create a new JIRA to track making a lightweight MiniDFSCluster an option in MiniRouterDFSCluster for different test cases. > RBF: Optimize configuration item resolving for monitor namenode > --- > > Key: HDFS-14351 > URL: https://issues.apache.org/jira/browse/HDFS-14351 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: He Xiaoqiao >Assignee: He Xiaoqiao >Priority: Major > Fix For: HDFS-13891 > > Attachments: HDFS-14351-HDFS-13891.001.patch, > HDFS-14351-HDFS-13891.002.patch, HDFS-14351-HDFS-13891.003.patch, > HDFS-14351-HDFS-13891.004.patch, HDFS-14351-HDFS-13891.005.patch, > HDFS-14351-HDFS-13891.006.patch, HDFS-14351.001.patch, HDFS-14351.002.patch > > > We invoke {{configuration.get}} to resolve configuration item > `dfs.federation.router.monitor.namenode` at `Router.java`, then split the > value by comma to get nsid and nnid, it may confused users since this is not > compatible with blank space but other common parameters could do. The > following segment show example that resolve fails. > {code:java} > > dfs.federation.router.monitor.namenode > nameservice1.nn1, nameservice1.nn2 > > The identifier of the namenodes to monitor and heartbeat. > > > {code}
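The whitespace problem quoted in the description above comes from splitting the raw value on commas without trimming, so " nameservice1.nn2" fails to resolve. Hadoop's Configuration.getTrimmedStrings exists for exactly this; the same logic can be sketched in plain Java (a hypothetical parser, not the actual Router code):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of tolerant parsing for a value like
// "nameservice1.nn1, nameservice1.nn2": split on commas, trim each
// token, and drop empties, mirroring Configuration.getTrimmedStrings.
class MonitorNamenodeParser {
    /** Returns a list of {nsId, nnId} pairs. */
    static List<String[]> parse(String value) {
        List<String[]> result = new ArrayList<>();
        for (String token : value.split(",")) {
            String id = token.trim();          // tolerate blank space
            if (id.isEmpty()) continue;        // tolerate trailing commas
            int dot = id.indexOf('.');
            if (dot < 0) continue;             // expected "<nsId>.<nnId>"
            result.add(new String[] { id.substring(0, dot), id.substring(dot + 1) });
        }
        return result;
    }
}
```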
[jira] [Commented] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796792#comment-16796792 ] He Xiaoqiao commented on HDFS-14351: Pulled [^HDFS-14351-HDFS-13891.006.patch] and the tests pass locally; the time cost is the same as [~elgoiri]'s. +1 for [^HDFS-14351-HDFS-13891.006.patch]. I built a quick-and-dirty MiniRouterDFSCluster with MockNamenode replacing MiniDFSCluster and compared the setup time cost: MiniRouterDFSCluster with MockNamenode takes 2.8s vs 10.2s for the native MiniRouterDFSCluster. {quote}I think we should create a new JIRA to have a light weight MiniDFSCluster equivalent with these MockNamenodes. {quote} +1, it is not necessary for some tests to set up all roles of the cluster. As mentioned above, maybe we should build a MockMiniDFSCluster based on these MockNamenodes. > RBF: Optimize configuration item resolving for monitor namenode > --- > > Key: HDFS-14351 > URL: https://issues.apache.org/jira/browse/HDFS-14351 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: He Xiaoqiao >Assignee: He Xiaoqiao >Priority: Major > Attachments: HDFS-14351-HDFS-13891.001.patch, > HDFS-14351-HDFS-13891.002.patch, HDFS-14351-HDFS-13891.003.patch, > HDFS-14351-HDFS-13891.004.patch, HDFS-14351-HDFS-13891.005.patch, > HDFS-14351-HDFS-13891.006.patch, HDFS-14351.001.patch, HDFS-14351.002.patch > > > We invoke {{configuration.get}} to resolve configuration item > `dfs.federation.router.monitor.namenode` at `Router.java`, then split the > value by comma to get nsid and nnid, it may confused users since this is not > compatible with blank space but other common parameters could do. The > following segment show example that resolve fails. > {code:java} > > dfs.federation.router.monitor.namenode > nameservice1.nn1, nameservice1.nn2 > > The identifier of the namenodes to monitor and heartbeat. 
> > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14374) Expose total number of delegation tokens in AbstractDelegationTokenSecretManager
[ https://issues.apache.org/jira/browse/HDFS-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794154#comment-16794154 ] He Xiaoqiao commented on HDFS-14374: [~crh] Thanks for working on this. Just one suggestion: we could expose the number as a metric; that would make it more convenient to monitor. FYI.
> Expose total number of delegation tokens in AbstractDelegationTokenSecretManager
>
> Key: HDFS-14374
> URL: https://issues.apache.org/jira/browse/HDFS-14374
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: CR Hota
> Assignee: CR Hota
> Priority: Major
> Attachments: HDFS-14374.001.patch, HDFS-14374.002.patch
>
> AbstractDelegationTokenSecretManager should expose the total number of active delegation tokens, so that specific implementations can track it for observability.
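In Hadoop's metrics2 framework such a count is typically published by annotating a getter with {{@Metric}} so the metrics system can poll it as a gauge. A self-contained toy sketch of the idea (class and method names are illustrative, not the actual AbstractDelegationTokenSecretManager API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy stand-in for a token store; the real secret manager keeps active
// tokens in an internal map keyed by token identifier.
public class TokenCountGauge {
    private final Map<String, Long> currentTokens = new ConcurrentHashMap<>();

    public void storeToken(String id, long expiryMillis) {
        currentTokens.put(id, expiryMillis);
    }

    public void removeToken(String id) {
        currentTokens.remove(id);
    }

    // The value a metrics system would poll; in the real implementation this
    // getter would carry a metrics2 @Metric annotation.
    public int getCurrentTokensCount() {
        return currentTokens.size();
    }
}
```

Exposing the map size as a polled gauge (rather than incrementing a separate counter) keeps the metric trivially consistent with the actual token store.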
[jira] [Commented] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793280#comment-16793280 ] He Xiaoqiao commented on HDFS-14351: +1 for [^HDFS-14351-HDFS-13891.005.patch]. Thanks, good work on the unit test. [~elgoiri], you could assign the task to yourself at any time if necessary.
[jira] [Commented] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16792641#comment-16792641 ] He Xiaoqiao commented on HDFS-14351: Fixed the checkstyle warning in [^HDFS-14351-HDFS-13891.004.patch].
[jira] [Updated] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14351: --- Attachment: HDFS-14351-HDFS-13891.004.patch
[jira] [Commented] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16792563#comment-16792563 ] He Xiaoqiao commented on HDFS-14351: [~elgoiri] Thanks. Uploaded [^HDFS-14351-HDFS-13891.003.patch] to test the monitor-namenode configuration only.
[jira] [Updated] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14351: --- Attachment: HDFS-14351-HDFS-13891.003.patch
[jira] [Commented] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791494#comment-16791494 ] He Xiaoqiao commented on HDFS-14351: Attached [^HDFS-14351-HDFS-13891.002.patch] following the comments. [~elgoiri], I am not very familiar with #MiniRouterDFSCluster; please help review, and correct me if something is wrong. Thanks again.
[jira] [Updated] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14351: --- Attachment: HDFS-14351-HDFS-13891.002.patch
[jira] [Commented] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791344#comment-16791344 ] He Xiaoqiao commented on HDFS-14351: I will add a new test for the RouterNamenodeMonitoring configuration later. Thanks [~elgoiri].
[jira] [Commented] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790332#comment-16790332 ] He Xiaoqiao commented on HDFS-14351: Thanks [~elgoiri] for reviewing. [^HDFS-14351-HDFS-13891.001.patch] rebases onto branch HDFS-13891 and only makes a tiny change to the configuration settings in #TestRouterNamenodeMonitoring. Do we need another unit test for this configuration item, given that the nsid and nnid are already verified in #TestRouterNamenodeMonitoring?
[jira] [Updated] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14351: --- Attachment: HDFS-14351-HDFS-13891.001.patch
[jira] [Updated] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14351: --- Attachment: HDFS-14351.002.patch
[jira] [Updated] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14351: --- Attachment: (was: HDFS-14351.002.patch)
[jira] [Commented] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789470#comment-16789470 ] He Xiaoqiao commented on HDFS-14351: Attached another patch, [^HDFS-14351.002.patch], based on branch trunk.
[jira] [Updated] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14351: --- Attachment: HDFS-14351.002.patch
[jira] [Updated] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14351: --- Summary: RBF: Optimize configuration item resolving for monitor namenode (was: Optimize configuration item resolving for monitor namenode)
[jira] [Created] (HDFS-14351) Optimize configuration item resolving for monitor namenode
He Xiaoqiao created HDFS-14351: -- Summary: Optimize configuration item resolving for monitor namenode
Key: HDFS-14351
URL: https://issues.apache.org/jira/browse/HDFS-14351
Project: Hadoop HDFS
Issue Type: Sub-task
Components: rbf
Reporter: He Xiaoqiao
Assignee: He Xiaoqiao
[jira] [Updated] (HDFS-14351) RBF: Optimize configuration item resolving for monitor namenode
[ https://issues.apache.org/jira/browse/HDFS-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14351: --- Attachment: HDFS-14351.001.patch Status: Patch Available (was: Open) I attached a quick-and-dirty demonstration patch without a unit test. Please correct me if something is wrong.
[jira] [Commented] (HDFS-13532) RBF: Adding security
[ https://issues.apache.org/jira/browse/HDFS-13532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787509#comment-16787509 ] He Xiaoqiao commented on HDFS-13532: [~elgoiri], [~crh]: thanks for the very helpful suggestions. The migration steps are very clear and helpful! I have to estimate the full cost of migrating to RBF completely, since our default filesystem everywhere is viewfs://nameservice/ (including hivemeta and user applications) at very large scale, so switching to hdfs://nameservice will be costly. I will share information in time; thanks for your help again.
> RBF: Adding security
>
> Key: HDFS-13532
> URL: https://issues.apache.org/jira/browse/HDFS-13532
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Íñigo Goiri
> Assignee: CR Hota
> Priority: Major
> Attachments: RBF _ Security delegation token thoughts.pdf, RBF _ Security delegation token thoughts_updated.pdf, RBF _ Security delegation token thoughts_updated_2.pdf, RBF-DelegationToken-Approach1b.pdf, RBF_ Security delegation token thoughts_updated_3.pdf, Security_for_Router-based Federation_design_doc.pdf
>
> HDFS Router based federation should support security. This includes authentication and delegation tokens.
[jira] [Commented] (HDFS-13248) RBF: Namenode need to choose block location for the client
[ https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786417#comment-16786417 ] He Xiaoqiao commented on HDFS-13248: Thanks [~elgoiri]. [^HDFS-13248.005.patch] seems a tricky but valid solution. However, I don't think this approach can be applied to #getBlockLocations, whose interface definition, shown below, looks very different from #addBlock.
{code:java}
LocatedBlocks getBlockLocations(String src, long offset, long length) throws IOException;
{code}
Looking forward to a more graceful approach; I would like to join in and push this issue forward.
> RBF: Namenode need to choose block location for the client
>
> Key: HDFS-13248
> URL: https://issues.apache.org/jira/browse/HDFS-13248
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Weiwei Wu
> Assignee: Íñigo Goiri
> Priority: Major
> Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch, HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch, HDFS-13248.005.patch, clientMachine-call-path.jpeg, debug-info-1.jpeg, debug-info-2.jpeg
>
> When executing a put operation via the Router, the NameNode chooses block locations for the Router, not for the real client. This affects the file's locality.
> I think on both the NameNode and the Router, we should add a new addBlock method, or add a parameter to the current addBlock method, to pass the real client information.
[jira] [Commented] (HDFS-13532) RBF: Adding security
[ https://issues.apache.org/jira/browse/HDFS-13532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786382#comment-16786382 ] He Xiaoqiao commented on HDFS-13532: [~crh], [~elgoiri], [~brahmareddy], I really appreciate your feedback. In a word, I am concerned about two points: (1) how to gradually (canary) upgrade HDFS to support RBF with the security feature, and (2) the performance cost of using ZKDelegationTokenSecretManagerImpl. With my colleague's help, (2) is now clear: more than 5K QPS is OK for me. The gradual upgrade I do not completely understand yet. First of all, my ideal plan for upgrading to RBF smoothly: (1) HDFS is built on Federation + ViewFS now; (2) it is better for me to roll the client upgrade gradually rather than switch to RBF all at once. [~elgoiri] and [~crh] both mentioned a solution with a 'Router nameservice' along the following steps: (1) update the YARN (RM/NM) configuration to include the new router nameservice; (2) roll clients over to support RBF; (3) update the YARN (RM/NM) configuration to include only the router nameservice config. IIUC, this solution will not solve the delegation token issue: a client obtains a DT from the router only after step (2) and submits the job normally, but the executor will then fail when requesting the NameNode, because the DT check fails — for some compute engines (for instance MR) the client and NM configurations are merged, so the executor still requests the NameNode directly without a proper DT. To [~crh]: {quote}jobs try to access something like hdfs://router-nameservice/mydata, rm will use the same filesystem i.e. hdfs://router-nameservice to renew tokens {quote} I think this requires enhancing the compute engine, which may be more costly. {quote}Routers not having security feature was a big hindrance in adopting it for any secure use case irrespective of scale. {quote} The security feature is also very important to me; I am trying my best to find a solution that can transition to RBF smoothly. Thanks [~crh] and [~elgoiri] again.
[jira] [Commented] (HDFS-13248) RBF: Namenode need to choose block location for the client
[ https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785674#comment-16785674 ] He Xiaoqiao commented on HDFS-13248: [~elgoiri] I think this is also an issue for read operations: since the NameNode gets the Router's hostname/IP rather than the client information, it cannot sort block locations correctly as expected, right?
[jira] [Commented] (HDFS-13532) RBF: Adding security
[ https://issues.apache.org/jira/browse/HDFS-13532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784726#comment-16784726 ] He Xiaoqiao commented on HDFS-13532: Thanks [~brahmareddy] and [~elgoiri] for your detailed comments. To [~elgoiri], {quote}If the job is submitted against the Router, then the job can only access data through RBF. However, I think this is OK; as I mentioned before you could still have jobs that query the NameNodes directly.{quote} IIUC, the client/job submitter and the executors have to switch to RBF at the same time; otherwise the delegation token check will not pass, since tokens issued by the NameNode and by the Router do not match. On the other side, most compute engines running on YARN rely on the RM to renew tokens. So in one word, it looks like there is no graceful solution to support a rolling upgrade, for instance upgrading clients to RBF first, then YARN (RM/NM)? {quote}For the RM itself, you can transition it from using RBF or not whenever you want. {quote} As mentioned above, I am confused about whether the RM uses RBF or not; more explanation would be greatly appreciated. To [~brahmareddy] {quote}Did you try it..? do you've failed logs..?{quote} I am sorry that I have no time to test this case now; I will offer more information once I cover this scenario. Thanks again. > RBF: Adding security > > > Key: HDFS-13532 > URL: https://issues.apache.org/jira/browse/HDFS-13532 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Íñigo Goiri >Assignee: CR Hota >Priority: Major > Attachments: RBF _ Security delegation token thoughts.pdf, RBF _ > Security delegation token thoughts_updated.pdf, RBF _ Security delegation > token thoughts_updated_2.pdf, RBF-DelegationToken-Approach1b.pdf, RBF_ > Security delegation token thoughts_updated_3.pdf, Security_for_Router-based > Federation_design_doc.pdf > > > HDFS Router based federation should support security. This includes > authentication and delegation tokens. 
[jira] [Commented] (HDFS-13532) RBF: Adding security
[ https://issues.apache.org/jira/browse/HDFS-13532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784327#comment-16784327 ] He Xiaoqiao commented on HDFS-13532: Thanks for the great work here; I have been following RBF recently, and sorry for the belated questions. I found that branch HDFS-13891 implements Approach #1, and it works well in my test env. Some questions about Approach #1: 1. Any suggestions or guidance for upgrading gracefully? Approach #1 is based on two points: (1) the router takes over delegation token management from the namenodes entirely, and (2) the namenode only handles delegation token requests from the router. Right? IIUC, there may be no graceful, gradual way to upgrade clients. Consider a job submitted to YARN from a client that has been upgraded to support RBF, so all its delegation tokens are issued by the router; if YARN is not yet upgraded, all executors will fail to authenticate to the namenode since the delegation tokens do not match. Of course the same issue arises if YARN is upgraded before the clients. 2. Are there any performance test results for ZooKeeper managing massive numbers of delegation tokens? I am not very familiar with ZooKeeper, and I wonder whether there is an obvious performance difference between ZooKeeper and the in-memory store in the namenode before RBF. If there is no evaluation yet, I would like to test it later. 3. Does the znode count impact delegation token request performance in ZooKeeper? Delegation token request ops are very high for a large cluster: for instance, with 1000K jobs every day and the default maximum delegation token lifetime of 7 days, the worst case backlogs 7000K znodes in total. Is that a risk for even larger clusters? 4. Any plan to support different approaches and let the user choose? Please correct me if anything is wrong. Thanks again. 
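The znode estimate in point 3 can be made explicit. A back-of-envelope sketch; the one-token-per-job rate and the daily job count are the comment's assumptions, not measured values:

```java
// Worst-case znode backlog for delegation tokens stored in ZooKeeper.
class ZnodeEstimate {
    static long worstCaseZnodes(long tokensPerDay, int tokenLifetimeDays) {
        // Worst case: every token lives out its full lifetime before
        // its znode is purged, so lifetimeDays days of tokens accumulate.
        return tokensPerDay * tokenLifetimeDays;
    }

    public static void main(String[] args) {
        // ~1000K jobs/day, default max token lifetime of 7 days.
        System.out.println(worstCaseZnodes(1_000_000L, 7)); // 7000000
    }
}
```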
[jira] [Created] (HDFS-14332) NetworkTopology#getWeightUsingNetworkLocation return unexpected result
He Xiaoqiao created HDFS-14332: -- Summary: NetworkTopology#getWeightUsingNetworkLocation return unexpected result Key: HDFS-14332 URL: https://issues.apache.org/jira/browse/HDFS-14332 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: He Xiaoqiao Assignee: He Xiaoqiao Consider the following scenario: 1. There are 4 slaves with a topology like: Rack: /IDC/RACK1 hostname1 hostname2 Rack: /IDC/RACK2 hostname3 hostname4 2. A reader on hostname1 calculates the weight between the reader and [hostname1, hostname3, hostname4] via #getWeight, and the corresponding values are [0,4,4]. 3. A reader on a client that is not in the topology, in the same IDC but in none of the topology's racks, calculates the weight between the reader and [hostname1, hostname3, hostname4] via #getWeightUsingNetworkLocation, and the corresponding values are [2,2,2]. 4. Other readers get similar results. The weight result for case #3 is obviously not the expected value; the truth is [4,4,4]. This issue may cause the reader not to follow the intended order: local -> local rack -> remote rack. After digging into the implementation, the root cause is that #getWeightUsingNetworkLocation only calculates the distance between racks rather than hosts. I think we should add the constant 2 to correct the weight of #getWeightUsingNetworkLocation.
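The mismatch above can be reproduced with a small sketch. This is illustrative, not the Hadoop source: the two methods below only model the weight scales (0 = local node, 2 = same rack, 4 = off rack for #getWeight; rack-only comparison for #getWeightUsingNetworkLocation), and the reader rack /IDC/RACK9 is an assumed placeholder for a client outside the topology.

```java
class WeightSketch {
    // Host-level weight, as NetworkTopology#getWeight computes it.
    static int getWeight(String readerHost, String readerRack,
                         String nodeHost, String nodeRack) {
        if (readerHost.equals(nodeHost)) return 0; // local node
        return readerRack.equals(nodeRack) ? 2 : 4; // local rack : off rack
    }

    // Rack-only comparison, as in getWeightUsingNetworkLocation before the fix.
    static int getWeightUsingNetworkLocation(String readerRack, String nodeRack) {
        return readerRack.equals(nodeRack) ? 0 : 2;
    }

    public static void main(String[] args) {
        // Reader on hostname1 (/IDC/RACK1) vs hostname3 (/IDC/RACK2): host scale says 4.
        System.out.println(getWeight("hostname1", "/IDC/RACK1", "hostname3", "/IDC/RACK2"));
        // Reader outside the topology, same IDC, unknown rack: rack scale says 2,
        // but [4,4,4] is the truth on the host scale.
        System.out.println(getWeightUsingNetworkLocation("/IDC/RACK9", "/IDC/RACK2"));
        // Proposed correction: add the constant 2 so rack distance maps onto
        // the host scale (same rack -> 2, off rack -> 4).
        System.out.println(getWeightUsingNetworkLocation("/IDC/RACK9", "/IDC/RACK2") + 2);
    }
}
```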
[jira] [Updated] (HDFS-14332) NetworkTopology#getWeightUsingNetworkLocation return unexpected result
[ https://issues.apache.org/jira/browse/HDFS-14332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14332: --- Attachment: HDFS-14332.001.patch Status: Patch Available (was: Open)
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781343#comment-16781343 ] He Xiaoqiao commented on HDFS-14305: [~xkrogen],[~csun] After checking all dev branches containing HDFS-6440 (branch-3.0, branch-3.1, branch-3.2), the fix can be cherry-picked directly, so no new patches are needed in my opinion. FYI. > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: He Xiaoqiao >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14305.001.patch, HDFS-14305.002.patch, > HDFS-14305.003.patch, HDFS-14305.004.patch, HDFS-14305.005.patch, > HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then uses this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > where {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNodes could have overlapping ranges > of serial numbers. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in the configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. > Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be moved to a range that belongs to > a different NameNode, thus increasing the chance of collision again. > When a collision happens, DataNodes could overwrite an existing key, which > will cause clients to fail with an {{InvalidToken}} error.
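The overlapping ranges in the issue description can be verified directly. This worked example uses the JIRA's own simplification (MAX = 100, 2 NameNodes) and the rotation formula quoted above; it relies on the fact that Java's `%` keeps the sign of the dividend, so a negative initial serialNo shifts nn1's range below its intended start.

```java
class SerialRangeDemo {
    // The rotation formula from BlockTokenSecretManager, parameterized for the demo.
    static int rotate(int serialNo, int nnIndex, int numNNs, int max) {
        int intRange = max / numNNs;           // 50 in the MAX=100, 2-NN example
        int nnRangeStart = intRange * nnIndex; // nn1 -> 0, nn2 -> 50
        return (serialNo % intRange) + nnRangeStart;
    }

    public static void main(String[] args) {
        int max = 100;
        // nn1 (index 0): the initial serialNo may be any int, including negative,
        // so nn1's range is [-49, 49].
        System.out.println(rotate(-99, 0, 2, max)); // -49
        System.out.println(rotate(99, 0, 2, max));  // 49
        // nn2 (index 1): [-49, 49] shifted by 50 -> [1, 99].
        System.out.println(rotate(-99, 1, 2, max)); // 1
        System.out.println(rotate(99, 1, 2, max));  // 99
        // The interval [1, 49] is claimed by both NameNodes: key collisions
        // become possible, and DataNodes may overwrite a live key.
    }
}
```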
[jira] [Commented] (HDFS-14314) fullBlockReportLeaseId should be reset after registering to NN
[ https://issues.apache.org/jira/browse/HDFS-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781326#comment-16781326 ] He Xiaoqiao commented on HDFS-14314: Thanks [~starphin], +1, [^HDFS-14314-trunk.005.patch] LGTM. I ran the failed unit tests (excluding TestWebHdfsTimeouts and TestJournalNodeSync) locally and all passed. I also agree that the test failures (TestWebHdfsTimeouts, TestJournalNodeSync) are not related. Pending one more review. > fullBlockReportLeaseId should be reset after registering to NN > -- > > Key: HDFS-14314 > URL: https://issues.apache.org/jira/browse/HDFS-14314 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.8.4 > Environment: > > >Reporter: star >Priority: Critical > Fix For: 2.8.4 > > Attachments: HDFS-14314-trunk.001.patch, HDFS-14314-trunk.001.patch, > HDFS-14314-trunk.002.patch, HDFS-14314-trunk.003.patch, > HDFS-14314-trunk.004.patch, HDFS-14314-trunk.005.patch, HDFS-14314.0.patch, > HDFS-14314.2.patch, HDFS-14314.patch > > > Since HDFS-7923, to rate-limit DN block reports, the DN asks the active NN for a full > block report lease id before sending a full block report. The DN then sends the > full block report together with the lease id. If the lease id is invalid, the NN > will reject the full block report and log "not in the pending set". > Consider the case where the DN is doing a full block report while the NN is restarted. > It can happen that the DN later sends a full block report with a lease id, > acquired from the previous NN instance, which is invalid to the new NN instance. > Though the DN recognizes the new NN instance via heartbeat and reregisters itself, > it does not reset the lease id from the previous instance. > The issue may cause DNs to temporarily go dead, making it unsafe to > restart the NN, especially in Hadoop clusters with a large number of DNs. 
We take it from method > offerService of class BPServiceActor. We eliminate some code to focus on > current issue. fullBlockReportLeaseId is a local variable to hold lease id > from NN. Exceptions will occur at blockReport call when NN restarting, which > will be caught by catch block in while loop. Thus fullBlockReportLeaseId will > not be set to 0. After NN restarted, DN will send full block report which > will be rejected by the new NN instance. DN will never send full block report > until the next full block report schedule, about an hour later. > Solution is simple, just reset fullBlockReportLeaseId to 0 after any > exception or after registering to NN. Thus it will ask for a valid > fullBlockReportLeaseId from new NN instance. > {code:java} > private void offerService() throws Exception { > long fullBlockReportLeaseId = 0; > // > // Now loop for a long time > // > while (shouldRun()) { > try { > final long startTime = scheduler.monotonicNow(); > // > // Every so often, send heartbeat or block-report > // > final boolean sendHeartbeat = scheduler.isHeartbeatDue(startTime); > HeartbeatResponse resp = null; > if (sendHeartbeat) { > > boolean requestBlockReportLease = (fullBlockReportLeaseId == 0) && > scheduler.isBlockReportDue(startTime); > scheduler.scheduleNextHeartbeat(); > if (!dn.areHeartbeatsDisabledForTests()) { > resp = sendHeartBeat(requestBlockReportLease); > assert resp != null; > if (resp.getFullBlockReportLeaseId() != 0) { > if (fullBlockReportLeaseId != 0) { > LOG.warn(nnAddr + " sent back a full block report lease " + > "ID of 0x" + > Long.toHexString(resp.getFullBlockReportLeaseId()) + > ", but we already have a lease ID of 0x" + > Long.toHexString(fullBlockReportLeaseId) + ". 
" + > "Overwriting old lease ID."); > } > fullBlockReportLeaseId = resp.getFullBlockReportLeaseId(); > } > > } > } > > > if ((fullBlockReportLeaseId != 0) || forceFullBr) { > //Exception occurred here when NN restarting > cmds = blockReport(fullBlockReportLeaseId); > fullBlockReportLeaseId = 0; > } > > } catch(RemoteException re) { > > } // while (shouldRun()) > } // offerService{code} >
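The fix the reporter proposes can be condensed into a runnable sketch. This is not the Hadoop source: the simulated NameNode and the lease values are invented for illustration; only the idea (clear `fullBlockReportLeaseId` to 0 whenever `blockReport()` fails or succeeds, so the next heartbeat requests a fresh lease instead of retrying a stale one until the next FBR schedule) follows the quoted code.

```java
class LeaseResetSketch {
    // Simulated NameNode: rejects any lease it did not itself issue,
    // as a restarted NN does ("not in the pending set").
    static void blockReport(long leaseId, long nnValidLease) {
        if (leaseId != nnValidLease) {
            throw new IllegalStateException("not in the pending set");
        }
    }

    // The fix in miniature: the lease is cleared on success (it is single-use)
    // AND on failure, so a stale lease from the previous NN instance never lingers.
    static long reportAndReset(long fullBlockReportLeaseId, long nnValidLease) {
        try {
            blockReport(fullBlockReportLeaseId, nnValidLease);
        } catch (IllegalStateException e) {
            // Without this reset, the DN keeps the stale id and never asks
            // for a new lease until the next scheduled full block report.
        }
        return 0;
    }

    public static void main(String[] args) {
        long staleLease = 0xABCL; // acquired before the NN restart (invented value)
        long newNnLease = 0xDEFL; // lease the new NN instance would issue (invented)
        long after = reportAndReset(staleLease, newNnLease);
        // leaseId == 0 makes the next heartbeat set requestBlockReportLease = true.
        System.out.println(after == 0); // true
    }
}
```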
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781274#comment-16781274 ] He Xiaoqiao commented on HDFS-14305: Thanks [~arpitagarwal], {quote}Is this a compatible change and can it be applied safely during rolling upgrade without breaking anything?{quote} I believe this fix will not introduce any incompatibility, per [~xkrogen]'s and [~csun]'s descriptions. {quote}This does make me wonder if we should push this back to all branches containing HDFS-6440. {quote} +1 for backporting this fix to other branches. I will prepare patches soon.
[jira] [Commented] (HDFS-14201) Ability to disallow safemode NN to become active
[ https://issues.apache.org/jira/browse/HDFS-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780169#comment-16780169 ] He Xiaoqiao commented on HDFS-14201: +1 LGTM, Thanks [~surmountian]. > Ability to disallow safemode NN to become active > > > Key: HDFS-14201 > URL: https://issues.apache.org/jira/browse/HDFS-14201 > Project: Hadoop HDFS > Issue Type: Improvement > Components: auto-failover >Affects Versions: 3.1.1, 2.9.2 >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Attachments: HDFS-14201.001.patch, HDFS-14201.002.patch, > HDFS-14201.003.patch, HDFS-14201.004.patch > > > Currently with HA, a Namenode in safemode can possibly be selected as active; > for availability of both reads and writes, though, Namenodes not in safemode are better > choices to become active. > It can take tens of minutes for a cold-started Namenode to get out of > safemode, especially when there are large numbers of files and blocks in HDFS. > That means if a Namenode in safemode becomes active, the cluster will not be > fully functioning for quite a while, even when there is some > Namenode not in safemode. > The proposal here is to add an option to allow a Namenode to report itself as > UNHEALTHY to ZKFC if it's in safemode, so as to only allow fully functioning > Namenodes to become active, improving the general availability of the cluster.
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780050#comment-16780050 ] He Xiaoqiao commented on HDFS-14305: Thanks [~xkrogen]. +1, LGTM. Is it necessary to note in the release note that the 64-namenode limit applies only to a single namespace? I think this information may be useful for Federation. FYI.
[jira] [Commented] (HDFS-14201) Ability to disallow safemode NN to become active
[ https://issues.apache.org/jira/browse/HDFS-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779063#comment-16779063 ] He Xiaoqiao commented on HDFS-14201: Thanks [~surmountian], [^HDFS-14201.004.patch] looks fine to me. Just a small worry about TestHASafeMode#testTransitionToActiveWhenSafeMode: creating a new MiniDFSCluster may cause a local path conflict and an editlog write failure. This problem already appeared on Jenkins when applying [^HDFS-14201.002.patch], refer to https://builds.apache.org/job/PreCommit-HDFS-Build/26315/testReport/, and I am sorry to point it out so late. FYI.
[jira] [Commented] (HDFS-14314) fullBlockReportLeaseId should be reset after registering to NN
[ https://issues.apache.org/jira/browse/HDFS-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779013#comment-16779013 ] He Xiaoqiao commented on HDFS-14314: Thanks [~starphin] for your contribution; it seems to be getting close to the truth. Please fix the code style following https://builds.apache.org/job/PreCommit-HDFS-Build/26332/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt and check whether the failed unit tests (https://builds.apache.org/job/PreCommit-HDFS-Build/26332/testReport/) are related to your patch; running them locally may be a good choice. FYI. BTW, all the links I just mentioned are from the Jenkins result shown in the last comment.
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779006#comment-16779006 ] He Xiaoqiao commented on HDFS-14305: [^HDFS-14305.006.patch] fixes the code style. I ran the failed tests TestBPOfferService#testTrySendErrorReportWhenNNThrowsIOException and TestEditLogTailer#testRollEditLogIOExceptionForRemoteNN locally and they passed; please help to double check. The other failed unit test, TestJournalNodeSync, I believe is not related to this patch. FYI.
[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14305: --- Attachment: HDFS-14305.006.patch
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
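The overlap described in HDFS-14305 above is easy to reproduce outside Hadoop. The following is a minimal standalone sketch (the class and method names are mine, not the actual BlockTokenSecretManager code) of the quoted rotation formula with MAX = 100 and two NameNodes. Because Java's `%` operator keeps the sign of the dividend, nn1 (index 0) can land anywhere in [-49, 49] while nn2 (index 1) lands in [1, 99], so the two ranges share [1, 49].

```java
import java.util.stream.IntStream;

public class SerialRangeSketch {

    /** The rotation formula quoted in the issue description. */
    public static int rotate(int serialNo, int max, int numNNs, int nnIndex) {
        int intRange = max / numNNs;           // per-NN range width
        int nnRangeStart = intRange * nnIndex; // start offset for this NN
        return (serialNo % intRange) + nnRangeStart;
    }

    public static void main(String[] args) {
        // Sweep seed serial numbers (including negatives, as a random int
        // seed may be) and report the range each NameNode can land in.
        for (int nn = 0; nn < 2; nn++) {
            final int idx = nn;
            int lo = IntStream.rangeClosed(-1000, 1000)
                    .map(s -> rotate(s, 100, 2, idx)).min().getAsInt();
            int hi = IntStream.rangeClosed(-1000, 1000)
                    .map(s -> rotate(s, 100, 2, idx)).max().getAsInt();
            System.out.println("nn" + (nn + 1) + " -> [" + lo + ", " + hi + "]");
        }
    }
}
```

Running this prints `nn1 -> [-49, 49]` and `nn2 -> [1, 99]`, matching the ranges in the description.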
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778835#comment-16778835 ]

He Xiaoqiao commented on HDFS-14305:

Thanks [~vagarychen], [~csun], [~xkrogen] for your comments. Updated and uploaded new patch [^HDFS-14305.005.patch]; pending Jenkins.
[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Xiaoqiao updated HDFS-14305:
-------------------------------
    Attachment: HDFS-14305.005.patch
[jira] [Commented] (HDFS-14314) fullBlockReportLeaseId should be reset after registering to NN
[ https://issues.apache.org/jira/browse/HDFS-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778046#comment-16778046 ]

He Xiaoqiao commented on HDFS-14314:

Hi [~starphin],
1. The build failed because of missing imports (org.apache.hadoop.hdfs.server.protocol.StorageBlockReport and org.apache.hadoop.hdfs.server.protocol.BlockReportContext); please correct it.
2. Please delete the useless blank lines, as in the following and in other places like it:
{quote}
@@ -188,6 +203,24 @@ public HeartbeatResponse answer(InvocationOnMock invocation) throws Throwable { } + private class HeartbeatRegisterAnswer implements Answer {
{quote}
3. I suggest adding a timeout parameter to {{testRefreshLeaseId}}, such as {{@Test(timeout = 3)}}. FYI.

> fullBlockReportLeaseId should be reset after registering to NN
> --------------------------------------------------------------
>
>              Key: HDFS-14314
>              URL: https://issues.apache.org/jira/browse/HDFS-14314
>          Project: Hadoop HDFS
>       Issue Type: Bug
>       Components: datanode
> Affects Versions: 2.8.4
>         Reporter: star
>         Priority: Critical
>          Fix For: 2.8.4
>      Attachments: HDFS-14314-trunk.001.patch, HDFS-14314-trunk.001.patch, HDFS-14314.0.patch, HDFS-14314.2.patch, HDFS-14314.patch
>
> Since HDFS-7923, to rate-limit DN block reports, a DN asks the active NN for a full block report lease id before sending a full block report, and then sends the report together with the lease id. If the lease id is invalid, the NN rejects the full block report and logs "not in the pending set".
> Consider the case where a DN is doing full block reporting while the NN is restarted. The DN will later send a full block report with a lease id acquired from the previous NN instance, which is invalid to the new NN instance. Although the DN recognizes the new NN instance by heartbeat and re-registers itself, it does not reset the lease id from the previous instance.
> The issue may cause DNs to temporarily go dead, making it unsafe to restart the NN, especially in a Hadoop cluster with a large number of DNs. HDFS-12914 reported the issue without any clues as to why it occurred, and it remained unsolved.
> To make it clear, look at the code below, taken from the method offerService of class BPServiceActor. Some code is elided to focus on the current issue. fullBlockReportLeaseId is a local variable that holds the lease id from the NN. An exception occurs at the blockReport call when the NN is restarting, which is caught by the catch block in the while loop; thus fullBlockReportLeaseId is not set to 0. After the NN has restarted, the DN sends a full block report which is rejected by the new NN instance. The DN will not send another full block report until the next scheduled one, about an hour later.
> The solution is simple: just reset fullBlockReportLeaseId to 0 after any exception or after registering to the NN. It will then ask for a valid fullBlockReportLeaseId from the new NN instance.
> {code:java}
> private void offerService() throws Exception {
>   long fullBlockReportLeaseId = 0;
>   //
>   // Now loop for a long time
>   //
>   while (shouldRun()) {
>     try {
>       final long startTime = scheduler.monotonicNow();
>       //
>       // Every so often, send heartbeat or block-report
>       //
>       final boolean sendHeartbeat = scheduler.isHeartbeatDue(startTime);
>       HeartbeatResponse resp = null;
>       if (sendHeartbeat) {
>         boolean requestBlockReportLease = (fullBlockReportLeaseId == 0) &&
>             scheduler.isBlockReportDue(startTime);
>         scheduler.scheduleNextHeartbeat();
>         if (!dn.areHeartbeatsDisabledForTests()) {
>           resp = sendHeartBeat(requestBlockReportLease);
>           assert resp != null;
>           if (resp.getFullBlockReportLeaseId() != 0) {
>             if (fullBlockReportLeaseId != 0) {
>               LOG.warn(nnAddr + " sent back a full block report lease " +
>                   "ID of 0x" +
>                   Long.toHexString(resp.getFullBlockReportLeaseId()) +
>                   ", but we already have a lease ID of 0x" +
>                   Long.toHexString(fullBlockReportLeaseId) + ". " +
>                   "Overwriting old lease ID.");
>             }
>             fullBlockReportLeaseId = resp.getFullBlockReportLeaseId();
>           }
>         }
>       }
>       if ((fullBlockReportLeaseId != 0) || forceFullBr) {
>         // Exception occurred here when NN restarting
>         cmds = blockReport(fullBlockReportLeaseId);
>         fullBlockReportLeaseId = 0;
>       }
>     } catch (RemoteException re) {
>       ...
>     }
>   } // while (shouldRun())
> } // offerService
> {code}
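The fix described in HDFS-14314 above (reset the cached lease id on re-registration or after an exception) can be sketched as a tiny state machine. Names here are hypothetical; the real logic lives in BPServiceActor.offerService:

```java
/**
 * Simplified sketch of the proposed lease-id handling. Not the actual
 * BPServiceActor code: method names are invented for illustration.
 */
public class LeaseResetSketch {
    private long fullBlockReportLeaseId = 0;

    /** The NN granted us a lease in a heartbeat response. */
    public void onLeaseGranted(long leaseId) {
        fullBlockReportLeaseId = leaseId;
    }

    /**
     * The fix: forget any lease from the previous NN instance when the
     * DN re-registers (the same reset applies after a blockReport RPC
     * throws), so the next heartbeat requests a fresh lease.
     */
    public void onReRegister() {
        fullBlockReportLeaseId = 0;
    }

    /**
     * Mirrors the heartbeat logic quoted above: only ask for a new lease
     * when we do not believe we already hold one and a report is due.
     */
    public boolean shouldRequestLease(boolean blockReportDue) {
        return fullBlockReportLeaseId == 0 && blockReportDue;
    }
}
```

Without the reset, a DN holding a stale lease id never sets requestBlockReportLease in its heartbeats, so it keeps sending reports that the new NN instance rejects until the next scheduled full block report, about an hour later.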
[jira] [Commented] (HDFS-14314) fullBlockReportLeaseId should be reset after registering to NN
[ https://issues.apache.org/jira/browse/HDFS-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1672#comment-1672 ]

He Xiaoqiao commented on HDFS-14314:

To [~jojochuang]: please help to add [~starphin] as a contributor. To [~starphin]: after that, please click `Submit Patch` and re-upload the patch; it will then auto-trigger Jenkins to run the unit tests.
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777652#comment-16777652 ]

He Xiaoqiao commented on HDFS-14305:

Hi [~csun], [~xkrogen], [~jojochuang], I have updated [^HDFS-14305.004.patch] following the review comments. Please take another look when you have time. Thanks.
[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Xiaoqiao updated HDFS-14305:
-------------------------------
    Attachment: HDFS-14305.004.patch
[jira] [Commented] (HDFS-14314) fullBlockReportLeaseId should be reset after registering to NN
[ https://issues.apache.org/jira/browse/HDFS-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777640#comment-16777640 ]

He Xiaoqiao commented on HDFS-14314:

[~starphin], thanks for your contribution. Some minor comments:
a. Please follow the community code style: indent by 4 spaces, delete the extra blank lines, and add the requisite comments for new methods. I think the formatting problems are only in the unit test.
b. Please name your patch following -..patch; refer to: https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute#HowToContribute-Namingyourpatch.
+1 after the update.
[jira] [Commented] (HDFS-14201) Ability to disallow safemode NN to become active
[ https://issues.apache.org/jira/browse/HDFS-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777624#comment-16777624 ]

He Xiaoqiao commented on HDFS-14201:

Thanks [~surmountian] for pushing this issue forward.
{quote}
I think combining the logic in HDFS-14201.002.patch and HDFS-14201.003.patch could be an option. The same configuration item would be controlling these logic to be on/off.
{quote}
+1, it makes sense to me. Thanks again.

> Ability to disallow safemode NN to become active
> ------------------------------------------------
>
>              Key: HDFS-14201
>              URL: https://issues.apache.org/jira/browse/HDFS-14201
>          Project: Hadoop HDFS
>       Issue Type: Improvement
>       Components: auto-failover
> Affects Versions: 3.1.1, 2.9.2
>         Reporter: Xiao Liang
>         Assignee: Xiao Liang
>         Priority: Major
>      Attachments: HDFS-14201.001.patch, HDFS-14201.002.patch, HDFS-14201.003.patch
>
> Currently with HA, a Namenode in safemode can possibly be selected as active, although for availability of both reads and writes, Namenodes not in safemode are better choices.
> It can take tens of minutes for a cold-started Namenode to get out of safemode, especially when there are a large number of files and blocks in HDFS. That means if a Namenode in safemode becomes active, the cluster will not be fully functioning for quite a while, even though it could be if a Namenode not in safemode were chosen instead.
> The proposal here is to add an option to allow a Namenode to report itself as UNHEALTHY to ZKFC if it is in safemode, so as to only allow fully functioning Namenodes to become active, improving the general availability of the cluster.
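The HDFS-14201 proposal above amounts to one extra condition in the health status that ZKFC polls. A minimal sketch, with a hypothetical class and option name (the real change would hook into the NameNode's handling of the HAServiceProtocol monitorHealth() call):

```java
/**
 * Sketch of the proposed safemode health check. Hypothetical names, not
 * the actual NameNode code: treatSafemodeAsUnhealthy stands in for the
 * new configuration option the issue proposes.
 */
public class SafemodeHealthSketch {
    private final boolean treatSafemodeAsUnhealthy; // proposed config option
    private final boolean inSafemode;               // current NN state

    public SafemodeHealthSketch(boolean treatSafemodeAsUnhealthy,
                                boolean inSafemode) {
        this.treatSafemodeAsUnhealthy = treatSafemodeAsUnhealthy;
        this.inSafemode = inSafemode;
    }

    /**
     * What the health check would report to ZKFC: unhealthy while in
     * safemode (when the option is on), so ZKFC will not elect this NN
     * active until it has left safemode.
     */
    public boolean isHealthy() {
        return !(treatSafemodeAsUnhealthy && inSafemode);
    }
}
```

With the option off, behavior is unchanged, which keeps the change backward compatible for clusters that prefer the current election behavior.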
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777091#comment-16777091 ]

He Xiaoqiao commented on HDFS-14305:

Thanks [~xkrogen]. I will update the code style and add some comments later.
{quote}I think 10 bits for the mask seems a little high to me; I agree with Chao that I can't think of a situation where you would need more than 32 or 64, and fewer bits for the per-NN key space mean a higher chance of collision on a NameNode restart.{quote}
Considering that an Integer has 32 bits in total, 22 bits are enough for rolling the serial number; on the other hand, the more bits used for the mask, the more NameNodes it can cover while avoiding collisions. So I chose 10 bits. Of course, I am fine with any number of mask bits between 3 and 10. Thanks again.
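The bit split being debated above (10 mask bits vs. 22 rolling bits) can be illustrated with a standalone sketch. The layout below is an assumption for illustration, not the actual HDFS-14305 patch: the high bits of the serial number carry the NameNode index and only the low 22 bits roll, so serial numbers from different NameNodes can never collide.

```java
public class MaskedSerialNo {
    // Assumed bit layout for illustration: 10 high bits identify the NN,
    // the remaining 22 low bits roll with the key-update counter.
    public static final int NUM_MASK_BITS = 10;
    public static final int LOW_BITS = 32 - NUM_MASK_BITS; // 22
    public static final int LOW_MASK = (1 << LOW_BITS) - 1;

    /**
     * Serial number for the given NN: the high bits are fixed per NN and
     * the low bits roll, so ranges of distinct NN indices are disjoint.
     * (For nnIndex < 512 the result also stays non-negative.)
     */
    public static int next(int nnIndex, int counter) {
        return (nnIndex << LOW_BITS) | (counter & LOW_MASK);
    }
}
```

Unlike the division-based formula in the description, recovering the owning NameNode is a shift (`serialNo >>> LOW_BITS`), and no choice of initial counter, negative or not, can push one NameNode's serial numbers into another's range.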
[jira] [Commented] (HDFS-14201) Ability to disallow safemode NN to become active
[ https://issues.apache.org/jira/browse/HDFS-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776513#comment-16776513 ]

He Xiaoqiao commented on HDFS-14201:

Thanks [~surmountian] for the quick response. [^HDFS-14201.003.patch] LGTM for auto-failover using ZKFC. I am also wondering whether it can cover the case of a manual transition without ZKFC while the namenode is still in safemode. FYI.
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776511#comment-16776511 ]

He Xiaoqiao commented on HDFS-14305:

Fixed a bug with the bit shift and added a new unit test in [^HDFS-14305.003.patch]. Triggering Jenkins again.
[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Xiaoqiao updated HDFS-14305:
-------------------------------
    Attachment: HDFS-14305.003.patch
[jira] [Commented] (HDFS-14314) fullBlockReportLeaseId should be reset after registering to NN
[ https://issues.apache.org/jira/browse/HDFS-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776238#comment-16776238 ] He Xiaoqiao commented on HDFS-14314: [~starphin], I do not think the failed tests (TestJournalNodeSync) are related to this patch, since they have been failing for some time. It may be better to add a new unit test. FYI.
> fullBlockReportLeaseId should be reset after registering to NN
> --
>
> Key: HDFS-14314
> URL: https://issues.apache.org/jira/browse/HDFS-14314
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 2.8.4
> Environment:
> Reporter: star
> Priority: Critical
> Fix For: 2.8.4
>
> Attachments: HDFS-14314.0.patch, HDFS-14314.patch
>
> Since HDFS-7923, to rate-limit DN block reports, a DN will ask the active NN for a full block report lease id before sending a full block report. The DN then sends the full block report together with the lease id. If the lease id is invalid, the NN will reject the full block report and log "not in the pending set".
> Consider the case where a DN is doing full block reporting while the NN is restarted. It can happen that the DN later sends a full block report with a lease id, acquired from the previous NN instance, which is invalid to the new NN instance. Though the DN recognizes the new NN instance via heartbeat and reregisters itself, it does not reset the lease id from the previous instance.
> The issue may cause DNs to temporarily go dead, making it unsafe to restart the NN, especially in Hadoop clusters with a large number of DNs. HDFS-12914 reported the issue without any clues as to why it occurred, and it remains unsolved.
> To make it clear, look at the code below, taken from the method offerService of class BPServiceActor (some code is eliminated to focus on the current issue). fullBlockReportLeaseId is a local variable holding the lease id from the NN. Exceptions will occur at the blockReport call when the NN is restarting, and will be caught by the catch block in the while loop. Thus fullBlockReportLeaseId will not be set to 0.
> After the NN restarts, the DN will send a full block report, which will be rejected by the new NN instance. The DN will then not send another full block report until the next scheduled one, about an hour later.
> The solution is simple: just reset fullBlockReportLeaseId to 0 after any exception or after registering to the NN. The DN will then ask the new NN instance for a valid fullBlockReportLeaseId.
> {code:java}
> private void offerService() throws Exception {
>   long fullBlockReportLeaseId = 0;
>   //
>   // Now loop for a long time
>   //
>   while (shouldRun()) {
>     try {
>       final long startTime = scheduler.monotonicNow();
>       //
>       // Every so often, send heartbeat or block-report
>       //
>       final boolean sendHeartbeat = scheduler.isHeartbeatDue(startTime);
>       HeartbeatResponse resp = null;
>       if (sendHeartbeat) {
>         boolean requestBlockReportLease = (fullBlockReportLeaseId == 0) &&
>             scheduler.isBlockReportDue(startTime);
>         scheduler.scheduleNextHeartbeat();
>         if (!dn.areHeartbeatsDisabledForTests()) {
>           resp = sendHeartBeat(requestBlockReportLease);
>           assert resp != null;
>           if (resp.getFullBlockReportLeaseId() != 0) {
>             if (fullBlockReportLeaseId != 0) {
>               LOG.warn(nnAddr + " sent back a full block report lease " +
>                   "ID of 0x" +
>                   Long.toHexString(resp.getFullBlockReportLeaseId()) +
>                   ", but we already have a lease ID of 0x" +
>                   Long.toHexString(fullBlockReportLeaseId) + ". " +
>                   "Overwriting old lease ID.");
>             }
>             fullBlockReportLeaseId = resp.getFullBlockReportLeaseId();
>           }
>         }
>       }
>       if ((fullBlockReportLeaseId != 0) || forceFullBr) {
>         // Exception occurred here when NN restarting
>         cmds = blockReport(fullBlockReportLeaseId);
>         fullBlockReportLeaseId = 0;
>       }
>     } catch (RemoteException re) {
>       ...
>     }
>   } // while (shouldRun())
> } // offerService{code}
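The one-line fix proposed above (drop the cached lease id whenever the block report RPC fails) can be illustrated with a small self-contained simulation. This is a toy, not BPServiceActor code; the class and method names are invented for illustration:

```java
public class LeaseResetDemo {
    // Cached lease id, as if acquired from the previous NN instance.
    static long fullBlockReportLeaseId = 0xABCDL;

    // Stand-in for the blockReport RPC that fails while the NN restarts.
    static void blockReport() throws Exception {
        throw new Exception("NN restarted");
    }

    public static void main(String[] args) {
        try {
            blockReport();
        } catch (Exception e) {
            // The proposed fix: any failure invalidates the cached lease id,
            // so the next heartbeat requests a fresh one from the new NN.
            fullBlockReportLeaseId = 0;
        }
        System.out.println(fullBlockReportLeaseId); // 0
    }
}
```

With the stale id cleared, the next heartbeat has `requestBlockReportLease == true` and obtains a lease valid for the new NN instance, instead of waiting for the next block report schedule an hour later.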
[jira] [Commented] (HDFS-14201) Ability to disallow safemode NN to become active
[ https://issues.apache.org/jira/browse/HDFS-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776235#comment-16776235 ] He Xiaoqiao commented on HDFS-14201: [^HDFS-14201.002.patch] adds a new configuration element for the safemode check when transitioning to active.
> Ability to disallow safemode NN to become active
>
> Key: HDFS-14201
> URL: https://issues.apache.org/jira/browse/HDFS-14201
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: auto-failover
> Affects Versions: 3.1.1, 2.9.2
> Reporter: Xiao Liang
> Assignee: Xiao Liang
> Priority: Major
> Attachments: HDFS-14201.001.patch, HDFS-14201.002.patch
>
> Currently with HA, a Namenode in safemode can possibly be selected as active, although for availability of both reads and writes, Namenodes not in safemode are better choices.
> It can take tens of minutes for a cold-started Namenode to get out of safemode, especially when there are a large number of files and blocks in HDFS. That means if a Namenode in safemode becomes active, the cluster will not be fully functioning for quite a while, even though it could be if a Namenode not in safemode became active instead.
> The proposal here is to add an option to allow a Namenode to report itself as UNHEALTHY to ZKFC if it is in safemode, so as to only allow a fully functioning Namenode to become active, improving the general availability of the cluster.
[jira] [Updated] (HDFS-14201) Ability to disallow safemode NN to become active
[ https://issues.apache.org/jira/browse/HDFS-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14201: --- Attachment: HDFS-14201.002.patch
[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14305: --- Attachment: HDFS-14305.002.patch Status: Patch Available (was: Open) [~csun], [~xkrogen], [^HDFS-14305.002.patch] uses 10 bits to identify the index of a NameNode within the namespace and the remaining 22 bits as an auto-incremented counter. This can cover up to 1024 NameNodes in one namespace and fixes the serial number overlap in {{BlockTokenSecretManager}} that the previous implementation (without HDFS-6440) did not have. Please help to review at your convenience.
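The bit-partitioning scheme described in patch 002 (10 bits for the NameNode index, 22 auto-incremented counter bits) can be sketched roughly as follows. This is an illustrative toy with invented names, not the patch code; the real patch must additionally keep the result non-negative for the highest indices:

```java
public class SerialNumberPartition {
    // Assumed layout from the comment above: the low 22 bits hold an
    // auto-incremented counter, the higher bits hold the NameNode index.
    static final int COUNTER_BITS = 22;
    static final int COUNTER_MASK = (1 << COUNTER_BITS) - 1; // 0x3FFFFF

    static int nextSerialNo(int nnIndex, int current) {
        int counter = (current + 1) & COUNTER_MASK;   // wraps within 22 bits
        return (nnIndex << COUNTER_BITS) | counter;   // disjoint range per index
    }

    public static void main(String[] args) {
        // Two NameNodes can never produce the same serial number,
        // because the index bits never collide:
        System.out.println(Integer.toHexString(nextSerialNo(1, 0))); // 400001
        System.out.println(Integer.toHexString(nextSerialNo(2, 0))); // 800001
    }
}
```

Unlike the modulo-based partition, the counter wrapping stays inside the NameNode's own 22-bit slice, so rotation can never drift into another NameNode's range.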
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776199#comment-16776199 ] He Xiaoqiao commented on HDFS-14305: Thanks [~csun], [~xkrogen] for your quick response. {quote}One potential issue with the patch 001 is that when keys are updated (which will call setSerialNo), it could go to a range that belongs to a different NameNode{quote} To [~csun]: with patch 001, I think serial numbers will not overlap between different NameNodes as long as the number of NameNodes in the namespace is fixed. But overlap can appear when adding or removing NameNodes (e.g. observers), which requires re-configuring and restarting all NameNodes in the namespace. I think that is also what you mean, right? {quote}Instead of 1 bit, we can either pre-allocate a fixed number of bits (e.g., 5), or calculate the number of bits needed from the total number of configured namenodes.{quote} I agree with pre-allocating a fixed number of bits for different NameNodes. [~xkrogen], [~csun], any more suggestions?
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774944#comment-16774944 ] He Xiaoqiao commented on HDFS-14305: [~csun], [~jojochuang], I attached a quick-and-dirty demonstration patch without a unit test: [^HDFS-14305.001.patch]. Please correct me if there is something wrong.
[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14305: --- Attachment: HDFS-14305.001.patch
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774934#comment-16774934 ] He Xiaoqiao commented on HDFS-14305: Hi [~csun], I think this issue is triggered only after HDFS-6440. Before that, it worked well in an HA cluster with 2 NameNodes (based on branch-2.7). Checking the {{serialNo}} scope shows the following, with no overlap between the 2 NameNodes: {quote}nnIndex=0: [0, 2147483647] nnIndex=1: [-2147483648, -1]{quote} HDFS-6440 replaced {{nnIndex}} with {{intRange}} + {{nnRangeStart}} and only distributed positive integers to the different NameNodes; but when serialNo is initialized it can be a negative integer, since it invokes {{new SecureRandom().nextInt()}}, which causes serialNo overlap between different NameNodes in the same namespace. In one word, the root cause is {{SecureRandom().nextInt()}}. I propose to use only positive integers as the serialNo of BlockTokenSecretManager to avoid this issue. FYI.
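The overlap described in this thread follows directly from the rotation formula when the random seed is negative. A toy reproduction, pretending {{Integer.MAX_VALUE}} is 100 with 2 NameNodes as in the example (invented class name, not HDFS code):

```java
public class SerialOverlapDemo {
    // Pretend Integer.MAX_VALUE is 100 and there are 2 NameNodes
    // (nnIndex 0 and 1), as in the issue description.
    static final int INT_RANGE = 100 / 2; // 50

    // The rotation formula from the description; serialNo may be negative
    // because it is seeded with new SecureRandom().nextInt().
    static int rotate(int serialNo, int nnIndex) {
        return (serialNo % INT_RANGE) + INT_RANGE * nnIndex;
    }

    public static void main(String[] args) {
        // serialNo % 50 lies in (-50, 50), so:
        //   nn1 (index 0) -> [-49, 49]
        //   nn2 (index 1) -> [1, 99]
        System.out.println("nn1 low: " + rotate(-99, 0)); // -49
        System.out.println("nn2 low: " + rotate(-99, 1)); // 1, inside nn1's range
    }
}
```

Restricting serialNo to non-negative values, as proposed above, makes `serialNo % INT_RANGE` land in [0, 50), so the per-index ranges become disjoint.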
[jira] [Commented] (HDFS-14309) name node fail over failed with ssh fence failed because of jsch login failed with key check
[ https://issues.apache.org/jira/browse/HDFS-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774825#comment-16774825 ] He Xiaoqiao commented on HDFS-14309: Hi [~iamgd67], thanks for reporting this issue. HADOOP-14100 has fixed it and was also merged into branch-2.7; I think you could backport HADOOP-14100 to your own branch. FYI.
> name node fail over failed with ssh fence failed because of jsch login failed with key check
>
> Key: HDFS-14309
> URL: https://issues.apache.org/jira/browse/HDFS-14309
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: auto-failover
> Affects Versions: 2.7.3
> Environment: linux CentOS release 6.8 (Final) kernel 2.6.32-642.6.1.el6.x86_64
> Reporter: qiang Liu
> Priority: Major
> Attachments: HDFS-14309-branch-2.7.3.001.patch
> Original Estimate: 1m
> Remaining Estimate: 1m
>
> NameNode failover failed because the SSH fence failed: the jsch login failed the key check. The logged error is "Algorithm negotiation fail". Updating jsch to 0.1.54 works OK.
[jira] [Commented] (HDFS-14201) Ability to disallow safemode NN to become active
[ https://issues.apache.org/jira/browse/HDFS-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774799#comment-16774799 ] He Xiaoqiao commented on HDFS-14201: Hi [~surmountian], [^HDFS-14201.001.patch] is my improvement for our online production environment; with it I have turned off the ability to #transitionToActive while the NameNode is still in safemode. It has worked well for over half a year. FYI.
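The core idea in this thread (a NameNode in safemode reports itself as UNHEALTHY to ZKFC so it is never elected active) can be sketched as a minimal health-check stub. All names here are invented for illustration; the real change would live in the NameNode's HAServiceProtocol health handling:

```java
public class SafemodeHealthCheck {
    // Hypothetical new config switch guarding the behavior, so the
    // default failover semantics are unchanged unless it is enabled.
    static boolean rejectSafemodeActive = true;

    static String monitorHealth(boolean inSafeMode) {
        if (rejectSafemodeActive && inSafeMode) {
            return "UNHEALTHY"; // ZKFC will not elect this NN as active
        }
        return "HEALTHY";
    }

    public static void main(String[] args) {
        System.out.println(monitorHealth(true));  // UNHEALTHY
        System.out.println(monitorHealth(false)); // HEALTHY
    }
}
```

Making the behavior opt-in matters: in a cluster where every NameNode cold-starts in safemode at once, unconditionally reporting UNHEALTHY would leave no electable candidate.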
[jira] [Updated] (HDFS-14201) Ability to disallow safemode NN to become active
[ https://issues.apache.org/jira/browse/HDFS-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14201: --- Attachment: HDFS-14201.001.patch
[jira] [Resolved] (HDFS-14186) blockreport storm slow down namenode restart seriously in large cluster
[ https://issues.apache.org/jira/browse/HDFS-14186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao resolved HDFS-14186. Resolution: Won't Fix Thanks all for your help; this issue has been resolved with HADOOP-12173 + HDFS-9198, and it works very well, as expected, in our production environment. Thanks again.
> blockreport storm slow down namenode restart seriously in large cluster
> ---
>
> Key: HDFS-14186
> URL: https://issues.apache.org/jira/browse/HDFS-14186
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 2.7.1
> Reporter: He Xiaoqiao
> Assignee: He Xiaoqiao
> Priority: Major
> Attachments: HDFS-14186.001.patch
>
> In the current implementation, the datanode sends a block report immediately after registering to the namenode on restart, and the resulting blockreport storm puts the namenode under high load. One result is that some received RPCs have to be skipped because their queue time has timed out. If a datanode's heartbeat RPCs are continually skipped for a long time (default heartbeatExpireInterval=630s), it will be marked DEAD; the datanode then has to re-register and send its block report again, aggravating the blockreport storm, trapping the system in a vicious circle, and slowing down namenode startup seriously (more than one hour, or even more), especially in a large (several thousands of datanodes) and busy cluster. Although there has been much work to optimize namenode startup, the issue still exists.
> I propose to postpone the dead-datanode check until the namenode has finished startup.
> Any comments and suggestions are welcome.
[jira] [Commented] (HDFS-14186) blockreport storm slow down namenode restart seriously in large cluster
[ https://issues.apache.org/jira/browse/HDFS-14186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752055#comment-16752055 ] He Xiaoqiao commented on HDFS-14186: The issue of seriously time-consuming NameNode restarts has been solved by the combination of HADOOP-12173 + HDFS-9198. The root cause is as follows:
A. NetworkTopology#toString is a hot spot (only for hadoop-2.7.1).
B. Serial BR processing affects performance during restart.
Point A causes processing of the registerDatanode RPC to take a long time; the worst case looks like:
{quote}2019-01-21 18:08:06,303 DEBUG org.apache.hadoop.ipc.Server: Served: registerDatanode queueTime= 66079 procesingTime= 3266{quote}
And the CallQueue is always full, so some DataNodes have to retry until they register successfully. The stack trace looks like:
{quote}"IPC Server handler 40 on 8040" #149 daemon prio=5 os_prio=0 tid=0x7f7ff571c800 nid=0x2a9dd runnable [0x7f19b10ce000]
java.lang.Thread.State: RUNNABLE
at org.apache.hadoop.net.NetworkTopology$InnerNode.getLeaf(NetworkTopology.java:340)
at org.apache.hadoop.net.NetworkTopology$InnerNode.getLeaf(NetworkTopology.java:340)
at org.apache.hadoop.net.NetworkTopology.toString(NetworkTopology.java:831)
at org.apache.hadoop.net.NetworkTopology.add(NetworkTopology.java:403)
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:1029)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4741)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1487)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:97)
at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:33709)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:847)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:790)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2458){quote}
Point B is easy to understand and is quickly fixed by HDFS-9198, after which BR RPCs no longer occupy the CallQueue for a long time.
On a test environment built with Dynamometer, with 40K nodes and 1.5B inodes+blocks, a NameNode restart can finish in under 1.5 hours. However, regarding the other issue mentioned above, there are still about 10 minutes during which the service RPC CallQueue load does not decrease after the NameNode leaves safemode, since the block reports have not yet been processed completely.
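The NetworkTopology#toString hot spot in point A comes from building an expensive topology dump on every registerDatanode call. The usual remedy, and as far as I can tell the spirit of the fix, is to guard the expensive string construction behind a log-level check. A self-contained sketch with invented names (this is not the Hadoop code):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class GuardedDebugLog {
    static final Logger LOG = Logger.getLogger("NetworkTopology");
    static int expensiveCalls = 0; // counts how often the dump is built

    // Stand-in for NetworkTopology#toString walking every leaf of the tree.
    static String expensiveToString() {
        expensiveCalls++;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) sb.append("node-").append(i).append('\n');
        return sb.toString();
    }

    // The fix pattern: build the topology dump only when debug logging
    // is actually enabled, instead of on every registration.
    static void add(String node) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Topology after adding " + node + ":\n" + expensiveToString());
        }
    }

    public static void main(String[] args) {
        add("dn-1");
        // With the default (INFO) log level the dump is never built:
        System.out.println(expensiveCalls); // 0
    }
}
```

Under the default INFO level, registration no longer pays for the full-tree traversal, which is exactly the cost that showed up under the registerDatanode frames in the stack trace above.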
[jira] [Commented] (HDFS-14186) blockreport storm slow down namenode restart seriously in large cluster
[ https://issues.apache.org/jira/browse/HDFS-14186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750718#comment-16750718 ] He Xiaoqiao commented on HDFS-14186: Thanks [~daryn], [~kihwal] for your help. This issue is based on 2.7.1, which has no async BR processing. I am currently testing the HDFS-9198 patch and will report the results when testing finishes.
[jira] [Updated] (HDFS-14186) blockreport storm slow down namenode restart seriously in large cluster
[ https://issues.apache.org/jira/browse/HDFS-14186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-14186: --- Affects Version/s: 2.7.1
[jira] [Commented] (HDFS-14211) [Consistent Observer Reads] Allow for configurable "always msync" mode
[ https://issues.apache.org/jira/browse/HDFS-14211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745811#comment-16745811 ] He Xiaoqiao commented on HDFS-14211: [~xkrogen] Thanks for pushing this important phase forward. As the description mentions, a client will call #msync before every single read operation. Does that mean *every* read operation will lead to a request to the Observer forever, or only those involving 'Third-party Communication'? I think there should be a graceful strategy to decide how many requests go to the ANN and how many to the Observer. An RPC to the Observer brings extra latency that depends on how long it takes to catch up to the state id, and IIUC it may reduce throughput when there are many write operations that have not yet reached the ANN threshold (sorry, no benchmark numbers; this is based only on the design docs and descriptions), FYI. > [Consistent Observer Reads] Allow for configurable "always msync" mode > -- > > Key: HDFS-14211 > URL: https://issues.apache.org/jira/browse/HDFS-14211 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Erik Krogen >Priority: Major > > To allow for reads to be serviced from an ObserverNode (see HDFS-12943) in a consistent way, an {{msync}} API was introduced (HDFS-13688) to allow a client to fetch the latest transaction ID from the Active NN, thereby ensuring that subsequent reads from the ObserverNode will be up-to-date with the current state of the Active. > Using this properly, however, requires application-side changes: for example, a NodeManager should call {{msync}} before localizing the resources for a client, since it received notification of the existence of those resources via communication that is out-of-band to HDFS and thus could potentially attempt to localize them prior to the availability of those resources on the ObserverNode.
> Until such application-side changes can be made, which will be a longer-term > effort, we need to provide a mechanism for unchanged clients to utilize the > ObserverNode without exposing such a client to inconsistencies. This is > essentially phase 3 of the roadmap outlined in the [design > document|https://issues.apache.org/jira/secure/attachment/12915990/ConsistentReadsFromStandbyNode.pdf] > for HDFS-12943. > The design document proposes some heuristics based on understanding of how > common applications (e.g. MR) use HDFS for resources. As an initial pass, we > can simply have a flag which tells a client to call {{msync}} before _every > single_ read operation. This may seem counterintuitive, as it turns every > read operation into two RPCs: {{msync}} to the Active followed by an actual > read operation to the Observer. However, the {{msync}} operation is extremely > lightweight, as it does not acquire the {{FSNamesystemLock}}, and in > experiments we have found that this approach can easily scale to well over > 100,000 {{msync}} operations per second on the Active (while still servicing > approx. 10,000 write op/s). Combined with the fast-path edit log tailing for > standby/observer nodes (HDFS-13150), this "always msync" approach should > introduce only a few ms of extra latency to each read call. > Below are some experimental results collected from experiments that convert > a normal RPC workload into one in which all read operations are turned into > an {{msync}}. The baseline is a workload of 1.5k write op/s and 25k read op/s. > ||Rate Multiplier|2|4|6|8|| > ||RPC Queue Avg Time (ms)|14|53|110|125|| > ||RPC Queue NumOps Avg (k)|51|102|147|177|| > ||RPC Queue NumOps Max (k)|148|269|306|312|| > _(numbers are approximate and should be viewed primarily for their trends)_ > Results are promising up to between 4x and 6x of the baseline workload, which > is approx. 100-150k read op/s.
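The "always msync" flow described in this email (one cheap msync to the Active, then the actual read to the Observer, gated on the Observer having caught up) can be sketched as a toy model. All class and method names below are stand-ins for illustration; they are not the real Hadoop client API.

```python
# Toy model of "always msync": every read becomes two RPCs.
class ActiveNN:
    """Stand-in for the Active NameNode."""
    def __init__(self):
        self.txid = 0
    def write(self):
        self.txid += 1            # each write bumps the transaction ID
    def msync(self):
        return self.txid          # lightweight: returns the latest txid

class ObserverNN:
    """Stand-in for an Observer NameNode fed by edit-log tailing."""
    def __init__(self):
        self.applied_txid = 0
    def read(self, min_txid):
        # Serve the read only if at least as fresh as the client requires.
        assert self.applied_txid >= min_txid, "observer lagging"
        return "data"

class AlwaysMsyncClient:
    """Client with the proposed flag enabled: msync before every read."""
    def __init__(self, active, observer):
        self.active, self.observer = active, observer
    def read(self):
        txid = self.active.msync()       # RPC 1: no FSNamesystemLock
        return self.observer.read(txid)  # RPC 2: the actual read

active, observer = ActiveNN(), ObserverNN()
client = AlwaysMsyncClient(active, observer)
active.write()
observer.applied_txid = active.txid  # edit tailing catches the observer up
result = client.read()
```

The model shows why the approach is safe for unchanged applications: the read can never observe state older than the Active's state at msync time, at the cost of one extra (cheap) RPC per read.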
[jira] [Commented] (HDFS-14186) blockreport storm slow down namenode restart seriously in large cluster
[ https://issues.apache.org/jira/browse/HDFS-14186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743589#comment-16743589 ] He Xiaoqiao commented on HDFS-14186: [~elgoiri], thanks for your comments. {quote}In any case, I think that if the DN is not registered, it cannot be marked as DEAD.{quote} What I wanted to say was that the lifeline can cover the case where a datanode has already registered at startup, but there is another problem: a datanode may fail to register at all because the namenode is overrun during startup, especially in a large cluster. {quote}Ideally a stack trace of the thread that is holding the other requests.{quote} I will post a stack trace soon. On another note, I think the namenode's massive volume of block report processing logs should be telling enough. Thanks again.
[jira] [Commented] (HDFS-13688) Introduce msync API call
[ https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742910#comment-16742910 ] He Xiaoqiao commented on HDFS-13688: Thanks all for the great work here. I am confused about the new #msync API: from the design docs, it introduces an RPC call, msync, to ensure consistent reads. IIUC, applications on top of HDFS have to adopt the new API when the 'Consistent Read' feature is enabled. This involves complex work, since more and more engines run on HDFS; I believe it would be a gigantic project for all compute engines to adapt to this change. So my question is: is there any plan to confine the data-consistency check to the DFSClient only? If I missed something or understood incorrectly, please correct me. Thanks again. > Introduce msync API call > > > Key: HDFS-13688 > URL: https://issues.apache.org/jira/browse/HDFS-13688 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Fix For: HDFS-12943, 3.3.0 > > Attachments: HDFS-13688-HDFS-12943.001.patch, > HDFS-13688-HDFS-12943.002.patch, HDFS-13688-HDFS-12943.002.patch, > HDFS-13688-HDFS-12943.003.patch, HDFS-13688-HDFS-12943.004.patch, > HDFS-13688-HDFS-12943.005.patch, HDFS-13688-HDFS-12943.WIP.002.patch, > HDFS-13688-HDFS-12943.WIP.patch > > > As mentioned in the design doc in HDFS-12943, to ensure consistent reads, we > need to introduce an RPC call {{msync}}. Specifically, a client can issue an > msync call to an Observer node along with a transactionID. The msync will only > return when the Observer's transactionID has caught up to the given ID. This > JIRA is to add this API.
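The semantics described in the issue above (msync carries a transaction ID and only returns once the Observer has caught up to it) can be sketched as a toy model. The class and method names are illustrative, not Hadoop's real API; a real implementation would block the RPC rather than return a boolean.

```python
# Toy model of the msync contract from HDFS-13688: the call "completes"
# only once the observer's applied txid has caught up to the client's.
class Observer:
    def __init__(self):
        self.applied_txid = 0

    def apply_edits(self, up_to_txid):
        # Edit-log tailing advances the observer's applied state.
        self.applied_txid = max(self.applied_txid, up_to_txid)

    def msync_ready(self, client_txid):
        # True once a read at client_txid would be consistent; a real
        # msync RPC would park until this condition holds.
        return self.applied_txid >= client_txid

obs = Observer()
obs.apply_edits(90)
assert not obs.msync_ready(100)  # observer still behind the client
obs.apply_edits(120)
assert obs.msync_ready(100)      # caught up: safe to serve the read
```

This also illustrates why confining the check to the DFSClient is attractive: the wait-until-caught-up logic is entirely between client and Observer, invisible to applications above the client.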
[jira] [Commented] (HDFS-14186) blockreport storm slow down namenode restart seriously in large cluster
[ https://issues.apache.org/jira/browse/HDFS-14186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742816#comment-16742816 ] He Xiaoqiao commented on HDFS-14186: [~elgoiri] Thanks for your comments, and sorry for missing something. I just took a quick look at trunk again: IIUC, #sendLifeline is lock-free, and a different port can be set for the lifeline server, so a DataNode may avoid being marked DEAD during NameNode startup. However, the lifeline feature only takes effect after the DataNode has registered successfully. As described above, if the service port is overrun and some DataNodes cannot register successfully, the lifeline cannot fix that, FYI. Please correct me if anything here is wrong.
[jira] [Commented] (HDFS-14186) blockreport storm slow down namenode restart seriously in large cluster
[ https://issues.apache.org/jira/browse/HDFS-14186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742740#comment-16742740 ] He Xiaoqiao commented on HDFS-14186: Some more information: when I set {{dfs.namenode.safemode.min.datanodes}} to the number of slaves, it works as expected, and no datanode is marked DEAD or forced to re-register and resend its block report.
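The workaround in the comment above can be sketched as an hdfs-site.xml fragment. The value 15000 is a placeholder taken from the cluster size mentioned elsewhere in this thread; it must be replaced with the actual datanode count, and setting it too high would keep the namenode in safe mode indefinitely.

```xml
<!-- Keep the NameNode in safe mode until this many DataNodes have
     registered, so none are marked DEAD during the startup storm.
     15000 is a placeholder: use the cluster's real DataNode count. -->
<property>
  <name>dfs.namenode.safemode.min.datanodes</name>
  <value>15000</value>
</property>
```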
[jira] [Commented] (HDFS-14186) blockreport storm slow down namenode restart seriously in large cluster
[ https://issues.apache.org/jira/browse/HDFS-14186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742738#comment-16742738 ] He Xiaoqiao commented on HDFS-14186: Thanks for the further discussion. I would like to answer some of the doubts raised above. To [~kihwal]: {quote}one thing to note is that the rpc processing time can be misleading in this case.{quote} From a namenode sample log: {quote}2019-01-14 22:32:35,383 INFO BlockStateChange: BLOCK* processReport: from storage DS-dd5c0397-3fcd-43fb-a71b-1eef6a2307f1 node DatanodeRegistration(datanodeip:50010, datanodeUuid=$datanodeuuid, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-57;cid=$clusterud;nsid=$nsid;c=0), blocks: 11847, hasStaleStorage: true, processing time: 15 msecs{quote} The processing time comes from the namenode log rather than from the RPC processing-time metrics, and I believe it is accurate. {quote}In 2.7 days, we ended up configuring datanodes breaking up block reports unconditionally and that helped NN startup performance.{quote} There are 30K~40K blocks per datanode, taking less than 60ms on average to process each block report, across more than 15K slaves overall. I do not split block reports per storage, since the number of blocks per datanode is not large enough; with an average of 30K~40K per datanode, splitting is unnecessary. The configuration item 'dfs.blockreport.split.threshold' appears to work well on 2.7.1 based on tracing the code; please correct me if I am missing something. {quote}we can have NN check whether all storage reports are received from all registered nodes.{quote} That is a good suggestion. However, it is hard to collect all storages of the whole cluster at namenode startup, and if only registered nodes are used, the issue may not be resolved completely, since some unregistered datanodes may keep trying to report and the load on the namenode is never relieved. To [~elgoiri]: {quote}This is caused by namenode getting overwhelmed. Besides, the lifeline rpc will use the same service rpc port whose queue is constantly overrun in this case. For the lifeline server, one can set a different port so it should have a different RPC queue altogether, right?{quote} Thanks for [~kihwal]'s detailed explanation. On another note, the lifeline has little visible effect during namenode startup because of the namenode's global lock: processing a block report holds the write lock, so all register/report RPCs have to queue and be processed one by one. In a word, the NameNode has no remaining time or resources to process the block report storm even if the RPCs can be enqueued.
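A quick back-of-envelope check on the figures quoted in the comment above (~15,000 datanodes, ~60 ms average per block report, processed serially under the global write lock) shows why heartbeats starve long enough to hit the 630s expiry:

```python
# Serial processing of one full wave of block reports under the
# namenode's global write lock, using the numbers from this thread.
datanodes = 15_000
ms_per_report = 60
one_wave_s = datanodes * ms_per_report / 1000
print(one_wave_s, one_wave_s / 60)  # → 900.0 seconds, 15.0 minutes
```

A single wave already exceeds the 630s heartbeat-expiry window, so nodes get marked DEAD, re-register, and resend reports, multiplying the number of waves and stretching startup well past an hour.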
[jira] [Updated] (HDFS-13473) DataNode update BlockKeys using mode PULL rather than PUSH from NameNode
[ https://issues.apache.org/jira/browse/HDFS-13473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-13473: --- Attachment: HDFS-13473-trunk.007.patch > DataNode update BlockKeys using mode PULL rather than PUSH from NameNode > > > Key: HDFS-13473 > URL: https://issues.apache.org/jira/browse/HDFS-13473 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: He Xiaoqiao >Assignee: He Xiaoqiao >Priority: Major > Attachments: HDFS-13473-trunk.001.patch, HDFS-13473-trunk.002.patch, > HDFS-13473-trunk.003.patch, HDFS-13473-trunk.004.patch, > HDFS-13473-trunk.005.patch, HDFS-13473-trunk.006.patch, > HDFS-13473-trunk.007.patch > > > Currently, updating block keys on the DataNode is passive behavior: it > depends on whether the NameNode returns a #KeyUpdateCommand in the heartbeat > response. There are several problems with this block key synchronization mode: > a. The NameNode cannot tell whether the block keys reached the DataNode successfully; > b. A DataNode that hits an exception while receiving or processing a heartbeat > response containing a BlockKeyCommand is likewise left unaware, as mentioned > in HDFS-13441 and HDFS-12749. > So I propose changing block keys pushed from the NameNode into block keys > pulled by the DataNode.
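The push-vs-pull distinction in the issue description can be sketched as a toy model: in pull mode, the DataNode drives the refresh, so a lost or mishandled heartbeat response cannot leave it with stale keys forever. Class and method names below are illustrative stand-ins, not Hadoop's real API.

```python
# Toy model of pull-based block key synchronization (HDFS-13473 proposal).
class NameNode:
    def __init__(self):
        self.block_keys = {"current": "key-v1"}
    def get_block_keys(self):
        # PULL endpoint: the DataNode asks for the current keys.
        return dict(self.block_keys)

class DataNode:
    def __init__(self, nn):
        self.nn = nn
        self.block_keys = {}
    def refresh_keys(self):
        # The DN drives the update; if one refresh fails, it simply
        # retries later, instead of silently missing a pushed command.
        self.block_keys = self.nn.get_block_keys()

nn = NameNode()
dn = DataNode(nn)
dn.refresh_keys()
assert dn.block_keys["current"] == "key-v1"
nn.block_keys["current"] = "key-v2"  # key roll on the NameNode
dn.refresh_keys()                    # DN pulls the new keys itself
```

In the push model, by contrast, the NameNode piggybacks a KeyUpdateCommand on a heartbeat response and never learns whether it was applied, which is exactly problems (a) and (b) in the description.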
[jira] [Commented] (HDFS-13473) DataNode update BlockKeys using mode PULL rather than PUSH from NameNode
[ https://issues.apache.org/jira/browse/HDFS-13473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16741743#comment-16741743 ] He Xiaoqiao commented on HDFS-13473: v007 fixes checkstyle. I checked the failed unit test and it passes on my local machine, so I think it is not related to this patch.
[jira] [Commented] (HDFS-13473) DataNode update BlockKeys using mode PULL rather than PUSH from NameNode
[ https://issues.apache.org/jira/browse/HDFS-13473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16741584#comment-16741584 ] He Xiaoqiao commented on HDFS-13473: [^HDFS-13473-trunk.006.patch] rebases onto branch trunk and triggers jenkins again.
[jira] [Updated] (HDFS-13473) DataNode update BlockKeys using mode PULL rather than PUSH from NameNode
[ https://issues.apache.org/jira/browse/HDFS-13473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-13473: --- Attachment: HDFS-13473-trunk.006.patch