[jira] [Commented] (HDFS-12142) Files may be closed before streamer is done

2017-07-14 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16087565#comment-16087565 ] Kihwal Lee commented on HDFS-12142: --- The following appears after the file is successfully closed. It

[jira] [Commented] (HDFS-12120) Use new block for pre-RollingUpgrade files' append requests

2017-07-14 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16087393#comment-16087393 ] Kihwal Lee commented on HDFS-12120: --- It sounds fine conceptually - The variable length block feature is

[jira] [Commented] (HDFS-5042) Completed files lost after power failure

2017-07-12 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16084148#comment-16084148 ] Kihwal Lee commented on HDFS-5042: -- bq. Here is the addendum patch to move fsync() out of lock. It should

[jira] [Commented] (HDFS-5042) Completed files lost after power failure

2017-07-11 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082748#comment-16082748 ] Kihwal Lee commented on HDFS-5042: -- We are seeing significant performance degradation in 2.8 with this

[jira] [Commented] (HDFS-11915) Sync rbw dir on the first hsync() to avoid file lost on power failure

2017-07-10 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080579#comment-16080579 ] Kihwal Lee commented on HDFS-11915: --- The approach looks good. > Sync rbw dir on the first hsync() to

[jira] [Updated] (HDFS-11915) Sync rbw dir on the first hsync() to avoid file lost on power failure

2017-07-10 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11915: -- Description: As discussed in HDFS-5042, there is a chance to lose blocks on power failure if rbw file

[jira] [Moved] (HDFS-12056) NPE while getting list of live nodes via jmx

2017-06-28 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee moved HADOOP-14607 to HDFS-12056: Affects Version/s: (was: 2.7.3) 2.7.3

[jira] [Commented] (HDFS-12008) Improve the available-space block placement policy

2017-06-26 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063658#comment-16063658 ] Kihwal Lee commented on HDFS-12008: --- Test failures are not caused by the patch. {{TestFsDatasetImpl}}

[jira] [Updated] (HDFS-12040) TestFsDatasetImpl.testCleanShutdownOfVolume fails

2017-06-26 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-12040: -- Summary: TestFsDatasetImpl.testCleanShutdownOfVolume fails (was:

[jira] [Commented] (HDFS-12040) TestFsDatasetImpl.testCleanShutdownOfVolume fails in branch-2

2017-06-26 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063528#comment-16063528 ] Kihwal Lee commented on HDFS-12040: --- I just ran it on trunk and it failed. {noformat} Running

[jira] [Resolved] (HDFS-11589) Unable to remove dead node after datanode decommission

2017-06-26 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HDFS-11589. --- Resolution: Duplicate > Unable to remove dead node after datanode decommission >

[jira] [Commented] (HDFS-12004) Namenode UI continues to list DNs that have been removed from include and exclude

2017-06-26 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063279#comment-16063279 ] Kihwal Lee commented on HDFS-12004: --- I thought it was fixed in HDFS-8950. > Namenode UI continues to

[jira] [Updated] (HDFS-12008) Improve the available-space block placement policy

2017-06-26 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-12008: -- Attachment: HDFS-12008.v2.trunk.patch > Improve the available-space block placement policy >

[jira] [Commented] (HDFS-12008) Improve the available-space block placement policy

2017-06-23 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061277#comment-16061277 ] Kihwal Lee commented on HDFS-12008: --- branch-2 precommit failure. HADOOP-14146 is still causing problems.

[jira] [Updated] (HDFS-12008) Improve the available-space block placement policy

2017-06-23 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-12008: -- Attachment: HDFS-12008.v2.branch-2.patch HDFS-12008.v2.trunk.patch Attaching new

[jira] [Commented] (HDFS-12008) Improve the available-space block placement policy

2017-06-23 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061156#comment-16061156 ] Kihwal Lee commented on HDFS-12008: --- I changed the number of nodes from 20 (4 racks) to 100 (10 racks)

[jira] [Commented] (HDFS-12008) Improve the available-space block placement policy

2017-06-22 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060135#comment-16060135 ] Kihwal Lee commented on HDFS-12008: --- 2.8 actually seems fine. If I spread nodes evenly, I get the values

[jira] [Commented] (HDFS-12008) Improve the available-space block placement policy

2017-06-22 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060031#comment-16060031 ] Kihwal Lee commented on HDFS-12008: --- Hmm. Trunk is doing what's expected. 1) choose a random node. So

[jira] [Commented] (HDFS-12008) Improve the available-space block placement policy

2017-06-22 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060016#comment-16060016 ] Kihwal Lee commented on HDFS-12008: --- When you set the conf to balance the space all the time (1.0f), the

[jira] [Commented] (HDFS-12008) Improve the available-space block placement policy

2017-06-22 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059989#comment-16059989 ] Kihwal Lee commented on HDFS-12008: --- This doesn't look right...

[jira] [Updated] (HDFS-12008) Improve the available-space block placement policy

2017-06-21 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-12008: -- Status: Patch Available (was: Open) > Improve the available-space block placement policy >

[jira] [Updated] (HDFS-12008) Improve the available-space block placement policy

2017-06-21 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-12008: -- Attachment: HDFS-12008.patch > Improve the available-space block placement policy >

[jira] [Created] (HDFS-12008) Improve the available-space block placement policy

2017-06-21 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-12008: - Summary: Improve the available-space block placement policy Key: HDFS-12008 URL: https://issues.apache.org/jira/browse/HDFS-12008 Project: Hadoop HDFS Issue Type:

[jira] [Assigned] (HDFS-12008) Improve the available-space block placement policy

2017-06-21 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee reassigned HDFS-12008: - Assignee: Kihwal Lee > Improve the available-space block placement policy >

[jira] [Updated] (HDFS-11960) Successfully closed files can stay under-replicated.

2017-06-20 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11960: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) >

[jira] [Updated] (HDFS-11960) Successfully closed files can stay under-replicated.

2017-06-20 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11960: -- Fix Version/s: 2.8.2 3.0.0-alpha4 2.9.0 > Successfully closed

[jira] [Commented] (HDFS-11960) Successfully closed files can stay under-replicated.

2017-06-20 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16056111#comment-16056111 ] Kihwal Lee commented on HDFS-11960: --- HDFS-9754 was needed for the test. Cherry-picked HDFS-9754 and this

[jira] [Updated] (HDFS-9754) Avoid unnecessary getBlockCollection calls in BlockManager

2017-06-20 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-9754: - Fix Version/s: 2.8.2 > Avoid unnecessary getBlockCollection calls in BlockManager >

[jira] [Commented] (HDFS-9754) Avoid unnecessary getBlockCollection calls in BlockManager

2017-06-20 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16056107#comment-16056107 ] Kihwal Lee commented on HDFS-9754: -- Cherry-picked to branch-2.8. > Avoid unnecessary getBlockCollection

[jira] [Commented] (HDFS-11960) Successfully closed files can stay under-replicated.

2017-06-20 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055843#comment-16055843 ] Kihwal Lee commented on HDFS-11960: --- Thanks for the review, Daryn. I've committed this to trunk and

[jira] [Updated] (HDFS-11960) Successfully closed files can stay under-replicated.

2017-06-19 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11960: -- Attachment: HDFS-11960-v2.branch-2.txt The branch-2 patch is identical except for the name change from

[jira] [Updated] (HDFS-11960) Successfully closed files can stay under-replicated.

2017-06-19 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11960: -- Attachment: HDFS-11960-v2.trunk.txt Added unit test. > Successfully closed files can stay

[jira] [Commented] (HDFS-11742) Improve balancer usability after HDFS-8818

2017-06-14 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049905#comment-16049905 ] Kihwal Lee commented on HDFS-11742: --- bq. But may be we want to understand more closely at the root cause

[jira] [Commented] (HDFS-11960) Successfully closed files can stay under-replicated.

2017-06-09 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045012#comment-16045012 ] Kihwal Lee commented on HDFS-11960: --- All failed tests were fine when reran. {noformat}

[jira] [Updated] (HDFS-11960) Successfully closed files can stay under-replicated.

2017-06-09 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11960: -- Attachment: HDFS-11960.patch > Successfully closed files can stay under-replicated. >

[jira] [Updated] (HDFS-11960) Successfully closed files can stay under-replicated.

2017-06-09 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11960: -- Status: Patch Available (was: Open) > Successfully closed files can stay under-replicated. >

[jira] [Commented] (HDFS-11960) Successfully closed files can stay under-replicated.

2017-06-09 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16044775#comment-16044775 ] Kihwal Lee commented on HDFS-11960: --- The simplest fix will be not letting {{addBlock()}} remove a

[jira] [Commented] (HDFS-11960) Successfully closed files can stay under-replicated.

2017-06-09 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16044755#comment-16044755 ] Kihwal Lee commented on HDFS-11960: --- Details of the step 6). {{processIncrementalBlockReport()}} calls

[jira] [Updated] (HDFS-11960) Successfully closed files can stay under-replicated.

2017-06-09 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11960: -- Summary: Successfully closed files can stay under-replicated. (was: Successfully closed file can stay

[jira] [Assigned] (HDFS-11960) Successfully closed file can stay under-replicated.

2017-06-09 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee reassigned HDFS-11960: - Assignee: Kihwal Lee > Successfully closed file can stay under-replicated. >

[jira] [Moved] (HDFS-11960) Successfully closed file can stay under-replicated.

2017-06-09 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee moved HADOOP-14514 to HDFS-11960: Target Version/s: 2.8.2 (was: 2.8.2) Key: HDFS-11960

[jira] [Commented] (HDFS-11945) Internal lease recovery may not be retried for a long time

2017-06-09 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16044490#comment-16044490 ] Kihwal Lee commented on HDFS-11945: --- Thanks, [~liuml07]! > Internal lease recovery may not be retried

[jira] [Commented] (HDFS-11941) Move dfsadmin triggerBlockReport and metaSave to debugadmin

2017-06-08 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043165#comment-16043165 ] Kihwal Lee commented on HDFS-11941: --- {{recoverLease}} is not necessarily an admin command. It can be

[jira] [Updated] (HDFS-11945) Internal lease recovery may not be retried for a long time

2017-06-08 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11945: -- Attachment: HDFS-11945.trunk.v2.patch HDFS-11945.branch-2.v2.patch Attaching updated

[jira] [Commented] (HDFS-11945) Internal lease recovery may not be retried for a long time

2017-06-07 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16041535#comment-16041535 ] Kihwal Lee commented on HDFS-11945: --- The failed tests all pass when I run them. {noformat}

[jira] [Updated] (HDFS-11945) Internal lease recovery may not be retried for a long time

2017-06-07 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11945: -- Attachment: HDFS-11945.trunk.patch > Internal lease recovery may not be retried for a long time >

[jira] [Updated] (HDFS-11945) Internal lease recovery may not be retried for a long time

2017-06-07 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11945: -- Status: Patch Available (was: Open) > Internal lease recovery may not be retried for a long time >

[jira] [Updated] (HDFS-11945) Internal lease recovery may not be retried for a long time

2017-06-07 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11945: -- Attachment: (was: HDFS-11945.trunk.patch) > Internal lease recovery may not be retried for a long

[jira] [Updated] (HDFS-11945) Internal lease recovery may not be retried for a long time

2017-06-07 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11945: -- Status: Open (was: Patch Available) > Internal lease recovery may not be retried for a long time >

[jira] [Updated] (HDFS-11945) Internal lease recovery may not be retried for a long time

2017-06-07 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11945: -- Status: Patch Available (was: Open) > Internal lease recovery may not be retried for a long time >

[jira] [Updated] (HDFS-11945) Internal lease recovery may not be retried for a long time

2017-06-07 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11945: -- Attachment: HDFS-11945.trunk.patch > Internal lease recovery may not be retried for a long time >

[jira] [Comment Edited] (HDFS-11945) Internal lease recovery may not be retried for a long time

2017-06-07 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16040927#comment-16040927 ] Kihwal Lee edited comment on HDFS-11945 at 6/7/17 3:18 PM: --- We could change the

[jira] [Assigned] (HDFS-11945) Internal lease recovery may not be retried for a long time

2017-06-07 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee reassigned HDFS-11945: - Assignee: Kihwal Lee > Internal lease recovery may not be retried for a long time >

[jira] [Commented] (HDFS-11945) Internal lease recovery may not be retried for a long time

2017-06-07 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16040927#comment-16040927 ] Kihwal Lee commented on HDFS-11945: --- We could change the namenode lease holder ID every hour. Normally

[jira] [Created] (HDFS-11945) Internal lease recovery may not be retried for a long time

2017-06-07 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-11945: - Summary: Internal lease recovery may not be retried for a long time Key: HDFS-11945 URL: https://issues.apache.org/jira/browse/HDFS-11945 Project: Hadoop HDFS

[jira] [Commented] (HDFS-11797) BlockManager#createLocatedBlocks() can throw ArrayIndexOutofBoundsException when corrupt replicas are inconsistent

2017-06-06 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039647#comment-16039647 ] Kihwal Lee commented on HDFS-11797: --- bq. The AIOOBE error was logged without stacktrace in NameNode for

[jira] [Updated] (HDFS-10816) TestComputeInvalidateWork#testDatanodeReRegistration fails due to race between test and replication monitor

2017-06-05 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-10816: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.2

[jira] [Commented] (HDFS-10816) TestComputeInvalidateWork#testDatanodeReRegistration fails due to race between test and replication monitor

2017-06-05 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16037486#comment-16037486 ] Kihwal Lee commented on HDFS-10816: --- +1 > TestComputeInvalidateWork#testDatanodeReRegistration fails

[jira] [Commented] (HDFS-11742) Improve balancer usability after HDFS-8818

2017-06-05 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16037072#comment-16037072 ] Kihwal Lee commented on HDFS-11742: ---

[jira] [Updated] (HDFS-11742) Improve balancer usability after HDFS-8818

2017-06-05 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11742: -- Attachment: balancer_fix.png > Improve balancer usability after HDFS-8818 >

[jira] [Updated] (HDFS-10816) TestComputeInvalidateWork#testDatanodeReRegistration fails due to race between test and replication monitor

2017-06-05 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-10816: -- Attachment: HDFS-10816.002.patch Reattaching the trunk patch. >

[jira] [Updated] (HDFS-11856) Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline updates

2017-06-01 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11856: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.7.4 Status: Resolved

[jira] [Commented] (HDFS-11856) Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline updates

2017-06-01 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033758#comment-16033758 ] Kihwal Lee commented on HDFS-11856: --- +1 for the branch-2.7 patch. I've just committed it. > Ability to

[jira] [Updated] (HDFS-5042) Completed files lost after power failure

2017-06-01 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-5042: - Fix Version/s: 2.8.2 2.7.4 > Completed files lost after power failure >

[jira] [Commented] (HDFS-5042) Completed files lost after power failure

2017-06-01 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033367#comment-16033367 ] Kihwal Lee commented on HDFS-5042: -- The patches look good. Now the fix is in branch-2.8 and branch-2.7.

[jira] [Commented] (HDFS-11907) NameNodeResourceChecker should avoid calling df.getAvailable too frequently

2017-06-01 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033177#comment-16033177 ] Kihwal Lee commented on HDFS-11907: --- The {{statfs()}} call is not expensive, but I agree that the

[jira] [Commented] (HDFS-11856) Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline updates

2017-06-01 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033111#comment-16033111 ] Kihwal Lee commented on HDFS-11856: --- branch-2/branch-2.8 precommit failure is being handled in

[jira] [Assigned] (HDFS-11729) Improve NNStorageRetentionManager failure handling.

2017-06-01 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee reassigned HDFS-11729: - Assignee: Weiwei Yang > Improve NNStorageRetentionManager failure handling. >

[jira] [Commented] (HDFS-10816) TestComputeInvalidateWork#testDatanodeReRegistration fails due to race between test and replication monitor

2017-06-01 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033058#comment-16033058 ] Kihwal Lee commented on HDFS-10816: --- The patch needs to be revised. {{BlockManagerTestUtil}} no longer

[jira] [Issue Comment Deleted] (HDFS-10816) TestComputeInvalidateWork#testDatanodeReRegistration fails due to race between test and replication monitor

2017-06-01 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-10816: -- Comment: was deleted (was: +1) > TestComputeInvalidateWork#testDatanodeReRegistration fails due to

[jira] [Commented] (HDFS-10816) TestComputeInvalidateWork#testDatanodeReRegistration fails due to race between test and replication monitor

2017-06-01 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033048#comment-16033048 ] Kihwal Lee commented on HDFS-10816: --- +1 > TestComputeInvalidateWork#testDatanodeReRegistration fails

[jira] [Commented] (HDFS-5042) Completed files lost after power failure

2017-05-31 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16031646#comment-16031646 ] Kihwal Lee commented on HDFS-5042: -- In the 2.8 patch, {code} + * @param fileToSync the file to fsync +

[jira] [Updated] (HDFS-11856) Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline updates

2017-05-31 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11856: -- Fix Version/s: 2.8.2 2.9.0 > Ability to re-add Upgrading Nodes (remote) to pipeline

[jira] [Commented] (HDFS-11856) Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline updates

2017-05-31 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16031579#comment-16031579 ] Kihwal Lee commented on HDFS-11856: --- +1 for the branch-2 patch. Need more time to look at the

[jira] [Updated] (HDFS-5042) Completed files lost after power failure

2017-05-31 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-5042: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-alpha4 2.9.0

[jira] [Commented] (HDFS-5042) Completed files lost after power failure

2017-05-31 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16031376#comment-16031376 ] Kihwal Lee commented on HDFS-5042: -- +1 the trunk patch looks good. Also the branch-2 patch looks fine. >

[jira] [Updated] (HDFS-5042) utuvjndvrcdtgddbdvj

2017-05-30 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-5042: - Summary: utuvjndvrcdtgddbdvj (was: Completed files lost after power failure) > utuvjndvrcdtgddbdvj >

[jira] [Commented] (HDFS-5042) Completed files lost after power failure

2017-05-30 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16029789#comment-16029789 ] Kihwal Lee commented on HDFS-5042: -- +1 on the syncing rbw on the first hsync(). But let's focus on the

[jira] [Commented] (HDFS-5042) Completed files lost after power failure

2017-05-26 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026424#comment-16026424 ] Kihwal Lee commented on HDFS-5042: -- bq. +1 the latest patch looks good. Hold on.

[jira] [Commented] (HDFS-5042) Completed files lost after power failure

2017-05-26 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026416#comment-16026416 ] Kihwal Lee commented on HDFS-5042: -- +1 the latest patch looks good. > Completed files lost after power

[jira] [Updated] (HDFS-11856) Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline updates

2017-05-26 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11856: -- Attachment: HDFS-11856-02.branch-2.patch Reattaching the branch-2 patch with a different name to kick

[jira] [Updated] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data

2017-05-25 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11817: -- Attachment: HDFS-11817.v2.branch-2.8.patch Attaching what's committed to 2.8 as reference. The

[jira] [Updated] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data

2017-05-25 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11817: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.2

[jira] [Commented] (HDFS-11856) Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline updates

2017-05-25 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025137#comment-16025137 ] Kihwal Lee commented on HDFS-11856: --- Please provide patch for branch-2 and others. > Ability to re-add

[jira] [Commented] (HDFS-11856) Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline updates

2017-05-25 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025132#comment-16025132 ] Kihwal Lee commented on HDFS-11856: --- +1 the patch looks good. > Ability to re-add Upgrading Nodes

[jira] [Resolved] (HDFS-11852) Under-repicated block never completes because of failure in commitBlockSynchronization()

2017-05-25 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HDFS-11852. --- Resolution: Duplicate > Under-repicated block never completes because of failure in >

[jira] [Commented] (HDFS-11852) Under-repicated block never completes because of failure in commitBlockSynchronization()

2017-05-25 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024860#comment-16024860 ] Kihwal Lee commented on HDFS-11852: --- This issue happens because decommissioning replica is not

[jira] [Commented] (HDFS-5042) Completed files lost after power failure

2017-05-24 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16023352#comment-16023352 ] Kihwal Lee commented on HDFS-5042: -- Does it make sense to have {{FileIoProvider#sync()}} to call the new

[jira] [Commented] (HDFS-5042) Completed files lost after power failure

2017-05-24 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16023307#comment-16023307 ] Kihwal Lee commented on HDFS-5042: -- bq. Am I missing something here? Sorry, did not look closely. >

[jira] [Commented] (HDFS-5042) Completed files lost after power failure

2017-05-24 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16023167#comment-16023167 ] Kihwal Lee commented on HDFS-5042: -- {{fsync(File fileToSync)}} will leak file descriptors and other

[jira] [Updated] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data

2017-05-22 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11817: -- Attachment: HDFS-11817.v2.trunk.patch HDFS-11817.v2.branch-2.patch > A faulty node can

[jira] [Commented] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data

2017-05-22 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020252#comment-16020252 ] Kihwal Lee commented on HDFS-11817: --- In trunk, there already is a logic to weed out null StorageInfo

[jira] [Commented] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data

2017-05-22 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019886#comment-16019886 ] Kihwal Lee commented on HDFS-11817: --- Two test failures are real. {noformat}

[jira] [Updated] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data

2017-05-19 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11817: -- Status: Patch Available (was: Open) > A faulty node can cause a lease leak and NPE on accessing data >

[jira] [Updated] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data

2017-05-19 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11817: -- Attachment: HDFS-11817.branch-2.patch I started the patch for branch-2.8 and branch-2. The trunk

[jira] [Commented] (HDFS-5042) Completed files lost after power failure

2017-05-19 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017461#comment-16017461 ] Kihwal Lee commented on HDFS-5042: -- bq. It seems like Lucene has figured out how to fsync a directory in

[jira] [Commented] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data

2017-05-18 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016420#comment-16016420 ] Kihwal Lee commented on HDFS-11817: --- Hi, [~raviprak]. We've seen what you described above. It is not

[jira] [Comment Edited] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data

2017-05-18 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014703#comment-16014703 ] Kihwal Lee edited comment on HDFS-11817 at 5/18/17 8:24 PM: *Details of the

[jira] [Updated] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data

2017-05-18 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11817: -- Attachment: hdfs-11817_supplement.txt Attaching supplemental information including stack traces. > A

[jira] [Commented] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data

2017-05-17 Thread Kihwal Lee (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014712#comment-16014712 ] Kihwal Lee commented on HDFS-11817: --- *Summary:* The observation of the incident resulted in discovery
