[jira] [Updated] (HDFS-9599) TestDecommissioningStatus.testDecommissionStatus occasionally fails

2016-04-04 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9599:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
   Fix Version/s: 2.8.0
Target Version/s: 2.8.0
  Status: Resolved  (was: Patch Available)

Committed. Thanks, [~linyiqun] and [~jojochuang].

> TestDecommissioningStatus.testDecommissionStatus occasionally fails
> ---
>
> Key: HDFS-9599
> URL: https://issues.apache.org/jira/browse/HDFS-9599
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>Assignee: Lin Yiqun
> Fix For: 2.8.0
>
> Attachments: HDFS-9599.001.patch, HDFS-9599.002.patch
>
>
> From the test results of a recent Jenkins nightly: 
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2663/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestDecommissioningStatus/testDecommissionStatus/
> The test failed because the number of under-replicated blocks is 4 instead 
> of 3.
> Looking at the log, there is a stray block, which might have caused the 
> failure:
> {noformat}
> 2015-12-23 00:42:05,820 [Block report processor] INFO  BlockStateChange 
> (BlockManager.java:processReport(2131)) - BLOCK* processReport: 
> blk_1073741825_1001 on node 127.0.0.1:57382 size 16384 does not belong to any 
> file
> {noformat}
> The block size 16384 suggests this is left over from the sibling test case 
> testDecommissionStatusAfterDNRestart. This can happen because the same 
> MiniDFSCluster is reused between tests.
> The test implementation should do a better job of isolating tests.
> Another failure case is when the load factor comes into play and a block 
> cannot find sufficient datanodes on which to place replicas. In this test, the 
> runtime should not consider the load factor:
> {noformat}
> conf.setBoolean(DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY, 
> false);
> {noformat}
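The isolation problem described above is the classic shared-fixture pitfall: state written by one test (here, a block created by testDecommissionStatusAfterDNRestart in the reused cluster) leaks into the next test's assertions. A minimal, self-contained sketch of the pattern and its fix; the class and method names below are illustrative, not the actual Hadoop test code:

```java
import java.util.ArrayList;
import java.util.List;

public class SharedFixtureDemo {
  // Shared "cluster" state reused across tests, standing in for a reused MiniDFSCluster.
  static List<String> blocks = new ArrayList<>();

  static void testA() {
    blocks.add("blk_from_testA"); // leftover state that leaks into later tests
  }

  static void testB() {
    // Without per-test cleanup, testB would see testA's leftover block
    // and its count assertion would be off by one.
    blocks.clear(); // cleanup restores isolation
    blocks.add("blk_from_testB");
    if (blocks.size() != 1) {
      throw new AssertionError("expected 1 block, got " + blocks.size());
    }
  }

  public static void main(String[] args) {
    testA();
    testB();
    System.out.println("isolated: " + blocks.size() + " block");
  }
}
```

In JUnit the same effect is achieved by doing the cleanup in an @After method (or by building a fresh cluster per test), so no test depends on its siblings' leftovers.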



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9599) TestDecommissioningStatus.testDecommissionStatus occasionally fails

2016-04-04 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15224729#comment-15224729
 ] 

Masatake Iwasaki commented on HDFS-9599:


+1






[jira] [Updated] (HDFS-3743) QJM: improve formatting behavior for JNs

2016-04-01 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-3743:
---
Assignee: (was: Masatake Iwasaki)

> QJM: improve formatting behavior for JNs
> 
>
> Key: HDFS-3743
> URL: https://issues.apache.org/jira/browse/HDFS-3743
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: QuorumJournalManager (HDFS-3077)
>Reporter: Todd Lipcon
>
> Currently, the JournalNodes automatically format themselves when a new writer 
> takes over, if they don't have any data for that namespace. However, this has 
> a few problems:
> 1) if the administrator accidentally points a new NN at the wrong quorum (e.g. 
> one corresponding to another cluster), it will auto-format a directory on those 
> nodes. This doesn't cause any data loss, but it would be better to bail out with 
> an error indicating that they need to be formatted.
> 2) if a journal node crashes and needs to be reformatted, it should be able 
> to re-join the cluster and start storing new segments without having to fail 
> over to a new NN.
> 3) if 2/3 JNs get accidentally reformatted (e.g. the mount point becomes 
> unmounted), and the user starts the NN, it should fail to start, because it may 
> end up missing edits. If it auto-formats in this case, the user might have a 
> silent "rollback" of the most recent edits.





[jira] [Assigned] (HDFS-3743) QJM: improve formatting behavior for JNs

2016-04-01 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki reassigned HDFS-3743:
--

Assignee: Masatake Iwasaki






[jira] [Commented] (HDFS-10178) Permanent write failures can happen if pipeline recoveries occur for the first packet

2016-04-01 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222588#comment-15222588
 ] 

Masatake Iwasaki commented on HDFS-10178:
-

{code}
// For testing. Delay sending packet downstream
if (DataNodeFaultInjector.get().stopSendingPacketDownstream()) {
  try {
    Thread.sleep(6);
  } catch (InterruptedException ie) {
    throw new IOException("Interrupted while sleeping. Bailing out.");
  }
}
{code}

Should the test logic be encapsulated in the DataNodeFaultInjector's method? For example:

{code}
DataNodeFaultInjector dnFaultInjector = new DataNodeFaultInjector() {
  int tries = 1;
  @Override
  public void stopSendingPacketDownstream() throws IOException {
if (tries > 0) {
  tries--;
  try {
Thread.sleep(6);
  } catch (InterruptedException ie) {
throw new IOException("Interrupted while sleeping. Bailing out.");
  }
}
  }
};
{code}


> Permanent write failures can happen if pipeline recoveries occur for the 
> first packet
> -
>
> Key: HDFS-10178
> URL: https://issues.apache.org/jira/browse/HDFS-10178
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-10178.patch, HDFS-10178.v2.patch, 
> HDFS-10178.v3.patch, HDFS-10178.v4.patch
>
>
> We have observed that write fails permanently if the first packet doesn't go 
> through properly and pipeline recovery happens. If the packet header is sent 
> out, but the data portion of the packet does not reach one or more datanodes 
> in time, the pipeline recovery will be done against the 0-byte partial block. 
>  
> If additional datanodes are added, the block is transferred to the new nodes. 
>  After the transfer, each node will have a meta file containing the header 
> and 0-length data block file. The pipeline recovery seems to work correctly 
> up to this point, but the write fails when the actual data packet is resent. 





[jira] [Resolved] (HDFS-10230) HDFS Native Client build failed

2016-03-30 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki resolved HDFS-10230.
-
Resolution: Duplicate

I'm going to close this as a duplicate of HADOOP-12692. Please reopen if this 
turns out to be an independent issue.

> HDFS Native Client build failed
> ---
>
> Key: HDFS-10230
> URL: https://issues.apache.org/jira/browse/HDFS-10230
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Reporter: John Zhuge
>Priority: Blocker
>
> HDFS Native Client build failed: 
> https://builds.apache.org/job/Hadoop-trunk-Commit/9514/console
> {code}
> [INFO] --- maven-enforcer-plugin:1.3.1:enforce (depcheck) @ 
> hadoop-hdfs-native-client ---
> Downloading: 
> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-hdfs/3.0.0-SNAPSHOT/hadoop-hdfs-3.0.0-20160328.214654-6500.pom
> 4/21 KB
> 8/21 KB   
> 8/21 KB   
> 12/21 KB   
> 14/21 KB   
> 18/21 KB   
> 21/21 KB   
>
> Downloaded: 
> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-hdfs/3.0.0-SNAPSHOT/hadoop-hdfs-3.0.0-20160328.214654-6500.pom
>  (21 KB at 193.9 KB/sec)
> [WARNING] 
> Dependency convergence error for org.apache.hadoop:hadoop-hdfs:3.0.0-SNAPSHOT 
> paths to dependency are:
> +-org.apache.hadoop:hadoop-hdfs-native-client:3.0.0-SNAPSHOT
>   +-org.apache.hadoop:hadoop-hdfs:3.0.0-SNAPSHOT
> and
> +-org.apache.hadoop:hadoop-hdfs-native-client:3.0.0-SNAPSHOT
>   +-org.apache.hadoop:hadoop-hdfs:3.0.0-20160328.214654-6500
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.DependencyConvergence 
> failed with message:
> Failed while enforcing releasability the error(s) are [
> Dependency convergence error for org.apache.hadoop:hadoop-hdfs:3.0.0-SNAPSHOT 
> paths to dependency are:
> +-org.apache.hadoop:hadoop-hdfs-native-client:3.0.0-SNAPSHOT
>   +-org.apache.hadoop:hadoop-hdfs:3.0.0-SNAPSHOT
> and
> +-org.apache.hadoop:hadoop-hdfs-native-client:3.0.0-SNAPSHOT
>   +-org.apache.hadoop:hadoop-hdfs:3.0.0-20160328.214654-6500
> {code}





[jira] [Commented] (HDFS-10230) HDFS Native Client build failed

2016-03-29 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216173#comment-15216173
 ] 

Masatake Iwasaki commented on HDFS-10230:
-

The cause seems to be a slowdown of the build machine. Can you reproduce this 
in your environment?






[jira] [Updated] (HDFS-9954) Test RPC timeout fix of HADOOP-12672 against HDFS

2016-03-15 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9954:
---
Resolution: Invalid
Status: Resolved  (was: Patch Available)

It turned out that creating an HDFS issue and attaching a patch does not invoke 
the HDFS tests; test-patch selects tests based on the contents of the patch.

> Test RPC timeout fix of HADOOP-12672 against HDFS
> -
>
> Key: HDFS-9954
> URL: https://issues.apache.org/jira/browse/HDFS-9954
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>  Labels: test
> Attachments: HDFS-9954.006.patch
>
>






[jira] [Updated] (HDFS-9928) Make HDFS commands guide up to date

2016-03-15 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9928:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
   Fix Version/s: 2.9.0
Target Version/s: 2.9.0
  Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2. Thanks, [~jojochuang]!

> Make HDFS commands guide up to date
> ---
>
> Key: HDFS-9928
> URL: https://issues.apache.org/jira/browse/HDFS-9928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.9.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>  Labels: documentation, supportability
> Fix For: 2.9.0
>
> Attachments: HDFS-9928-branch-2.002.patch, HDFS-9928-trunk.003.patch, 
> HDFS-9928.001.patch
>
>
> A few HDFS subcommands and options are missing in the documentation.
> # envvars: display computed Hadoop environment variables
> I also noticed (in HDFS-9927) that a few OIV options are missing, and I'll be 
> looking for other missing options as well.
> Filing this JIRA to fix them all.





[jira] [Commented] (HDFS-9928) Make HDFS commands guide up to date

2016-03-15 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195148#comment-15195148
 ] 

Masatake Iwasaki commented on HDFS-9928:


+1






[jira] [Updated] (HDFS-9954) Test RPC timeout fix of HADOOP-12672 against HDFS

2016-03-14 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9954:
---
Status: Patch Available  (was: Open)







[jira] [Updated] (HDFS-9954) Test RPC timeout fix of HADOOP-12672 against HDFS

2016-03-14 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9954:
---
Attachment: HDFS-9954.006.patch

Attaching the 006 patch of HADOOP-12672 to test it against HDFS.

>
>






[jira] [Created] (HDFS-9954) Test RPC timeout fix of HADOOP-12672 against HDFS

2016-03-14 Thread Masatake Iwasaki (JIRA)
Masatake Iwasaki created HDFS-9954:
--

 Summary: Test RPC timeout fix of HADOOP-12672 against HDFS
 Key: HDFS-9954
 URL: https://issues.apache.org/jira/browse/HDFS-9954
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki








[jira] [Commented] (HDFS-9928) Make HDFS commands guide up to date

2016-03-13 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192392#comment-15192392
 ] 

Masatake Iwasaki commented on HDFS-9928:


Thanks for the update, [~jojochuang].  The patch looks good to me.

Can you upload a patch for trunk too? I think omitting the {{hdfs namenode 
-rollingUpgrade}} part will do it. (We don't need that part for trunk since the 
{{downgrade}} option was removed in trunk.)






[jira] [Commented] (HDFS-9942) Add an HTrace span when refreshing the groups for a username

2016-03-10 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15190489#comment-15190489
 ] 

Masatake Iwasaki commented on HDFS-9942:


+1

BTW, this looks like the use case of NullTracer being discussed in HTRACE-275.


> Add an HTrace span when refreshing the groups for a username
> 
>
> Key: HDFS-9942
> URL: https://issues.apache.org/jira/browse/HDFS-9942
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-9942.001.patch
>
>
> We should add an HTrace span when refreshing the groups for a username.  This 
> can be an expensive operation in some cases, and it's good to know if it 
> delayed a request.
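The wrap-the-call-in-a-span pattern the issue describes is straightforward; below is a library-agnostic, self-contained sketch of it (the {{inSpan}} helper, {{recorded}} list, and {{refreshGroups}} stand-in are all invented for illustration and are not the HTrace API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class SpanDemo {
  // Recorded span descriptions; a stand-in for a tracer's collected spans.
  static List<String> recorded = new ArrayList<>();

  // Run body inside a named "span": the span is recorded even if body throws,
  // so a slow or failing call still shows up in the trace.
  static <T> T inSpan(String name, Supplier<T> body) {
    long start = System.nanoTime();
    try {
      return body.get();
    } finally {
      long elapsedNanos = System.nanoTime() - start;
      recorded.add(name + " (" + elapsedNanos + " ns)");
    }
  }

  // Stand-in for the potentially expensive group lookup.
  static List<String> refreshGroups(String user) {
    return List.of(user + "-group");
  }

  public static void main(String[] args) {
    List<String> groups = inSpan("refreshGroups", () -> refreshGroups("alice"));
    System.out.println(groups.get(0) + ", spans recorded: " + recorded.size());
  }
}
```

The value is exactly what the description says: when a request is slow, the recorded span makes it visible whether the group refresh was the cause.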





[jira] [Commented] (HDFS-9928) Make HDFS commands guide up to date

2016-03-10 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15189599#comment-15189599
 ] 

Masatake Iwasaki commented on HDFS-9928:


How about targeting 2.9.0 here by just reverting the {{-beforeShutdown}} 
part?






[jira] [Commented] (HDFS-9928) Make HDFS commands guide up to date

2016-03-10 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15189595#comment-15189595
 ] 

Masatake Iwasaki commented on HDFS-9928:


{noformat}
+  hdfs storagepolicies
+  [-listPolicies]
+  [-setStoragePolicy -path <path> -policy <policy>]
+  [-getStoragePolicy -path <path>]
+  [-unsetStoragePolicy -path <path>]
+  [-help <command-name>]
{noformat}

{{-unsetStoragePolicy}} was added by HDFS-9534 in 2.9.0.







[jira] [Commented] (HDFS-9928) Make HDFS commands guide up to date

2016-03-10 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15189573#comment-15189573
 ] 

Masatake Iwasaki commented on HDFS-9928:


{noformat}
 hdfs dfsadmin [GENERIC_OPTIONS]
   [-report [-live] [-dead] [-decommissioning]]
   [-safemode enter | leave | get | wait | forceExit]
-  [-saveNamespace]
+  [-saveNamespace [-beforeShutdown]]
{noformat}

Adding {{-beforeShutdown}} (by HDFS-6353) was a trunk-only change. We should 
keep this as is if the target is 2.8.0.

{noformat}
-| `-reconfig` \<datanode\|...\> \<host:ipc_port\> \<start\|status\> | Start reconfiguration or get the status of an ongoing reconfiguration. The second parameter specifies the node type. Currently, only reloading DataNode's configuration is supported. |
+| `-reconfig` \<datanode\|...\> \<host:ipc_port\> \<start\|status\|properties\> | Starts reconfiguration or gets the status of an ongoing reconfiguration, or gets a list of reconfigurable properties. The second parameter specifies the node type. |
{noformat}

Similar to above, fix version of HDFS-9094 is 2.9.0.







[jira] [Commented] (HDFS-9928) Make HDFS commands guide up to date

2016-03-10 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15189527#comment-15189527
 ] 

Masatake Iwasaki commented on HDFS-9928:


{noformat}
-Usage: `hdfs cacheadmin -addDirective -path <path> -pool <pool-name> [-force] [-replication <replication>] [-ttl <time-to-live>]`
+Usage:
+
+hdfs dfsadmin
+  [-addDirective -path <path> -pool <pool-name> [-force] [-replication <replication>] [-ttl <time-to-live>]]
+  [-modifyDirective -id <id> [-path <path>] [-force] [-replication <replication>] [-pool <pool-name>] [-ttl <time-to-live>]]
+  [-listDirectives [-stats] [-path <path>] [-pool <pool>] [-id <id>]
+  [-removeDirective <id>]
+  [-removeDirectives -path <path>]
+  [-addPool <name> [-owner <owner>] [-group <group>] [-mode <mode>] [-limit <limit>] [-maxTtl <maxTtl>]
+  [-modifyPool <name> [-owner <owner>] [-group <group>] [-mode <mode>] [-limit <limit>] [-maxTtl <maxTtl>]]
+  [-removePool <name>]
+  [-listPools [-stats] [<name>]]
+  [-help <command-name>]
{noformat}

* {{hdfs dfsadmin}} should be {{hdfs cacheadmin}}.
* This usage implies that multiple subcommands can be combined in a single 
invocation, but that is not the case.







[jira] [Commented] (HDFS-9928) Make HDFS commands guide up to date

2016-03-10 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15189512#comment-15189512
 ] 

Masatake Iwasaki commented on HDFS-9928:


{noformat}
264 | `-h`\|`--help` | Display the tool usage and help information and 
exit. |
{noformat}
There are duplicate entries of {{-h}} for oiv_legacy.







[jira] [Commented] (HDFS-9895) Remove all cached configuration from DataNode

2016-03-10 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15189449#comment-15189449
 ] 

Masatake Iwasaki commented on HDFS-9895:


The current implementation does not need this because the reconfiguration task (in 
ReconfigurableBase) does not swap the Configuration instance. Are you planning to 
change the reconfiguration logic?

> Remove all cached configuration from DataNode
> -
>
> Key: HDFS-9895
> URL: https://issues.apache.org/jira/browse/HDFS-9895
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9895.000.patch
>
>
> Since DataNode inherits from ReconfigurableBase, whose Configured base class 
> maintains the configuration, all cached configurations in DataNode should be 
> removed for brevity and consistency.





[jira] [Updated] (HDFS-9884) Use doxia macro to generate in-page TOC of HDFS site documentation

2016-03-09 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9884:
---
Status: Patch Available  (was: Open)

> Use doxia macro to generate in-page TOC of HDFS site documentation
> --
>
> Key: HDFS-9884
> URL: https://issues.apache.org/jira/browse/HDFS-9884
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.7.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9884.001.patch
>
>
> Since maven-site-plugin 3.5 was released, we can use the toc macro in Markdown.





[jira] [Updated] (HDFS-9884) Use doxia macro to generate in-page TOC of HDFS site documentation

2016-03-09 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9884:
---
Attachment: HDFS-9884.001.patch

Since HADOOP-12470 came in, we can use the toc macro in the HDFS docs.
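For reference, the doxia toc macro is written in a Markdown source as a macro comment like the following (the depth bounds here are just an example, not necessarily what the patch uses):

{noformat}
<!-- MACRO{toc|fromDepth=0|toDepth=3} -->
{noformat}

maven-site-plugin 3.5+ expands the comment into an in-page table of contents at build time, so the generated HTML needs no hand-maintained TOC.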






[jira] [Updated] (HDFS-9865) TestBlockReplacement fails intermittently in trunk

2016-03-07 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9865:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
   Fix Version/s: 2.7.3
  2.8.0
Target Version/s: 2.7.3
  Status: Resolved  (was: Patch Available)

+1. Committed to branch-2.7 and above. Thanks, [~linyiqun].

> TestBlockReplacement fails intermittently in trunk
> --
>
> Key: HDFS-9865
> URL: https://issues.apache.org/jira/browse/HDFS-9865
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Fix For: 2.8.0, 2.7.3
>
> Attachments: HDFS-9865.001.patch, HDFS-9865.002.patch
>
>
> I found that the test case {{TestBlockReplacement}} sometimes fails during 
> testing. Looking at the unit test log, I always find this output:
> {code}
> org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement
> testDeletedBlockWhenAddBlockIsInEdit(org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement)
>   Time elapsed: 8.764 sec  <<< FAILURE!
> java.lang.AssertionError: The block should be only on 1 datanode  
> expected:<1> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement.testDeletedBlockWhenAddBlockIsInEdit(TestBlockReplacement.java:436)
> {code}
> Finally I found the reason: the block is not completely deleted in 
> testDeletedBlockWhenAddBlockIsInEdit, which leaves the datanode count 
> incorrect. Also, the time spent waiting for FsDatasetAsyncDiskService to 
> delete the block is not an accurate value. 
> {code}
> LOG.info("replaceBlock:  " + replaceBlock(block,
>   (DatanodeInfo)sourceDnDesc, (DatanodeInfo)sourceDnDesc,
>   (DatanodeInfo)destDnDesc));
> // Waiting for the FsDatasetAsyncDsikService to delete the block
> Thread.sleep(3000);
> {code}
> When I adjusted this wait to 1 second, the test always failed, and the 
> 3-second wait is not an accurate value either. We should rework this logic, 
> for example by waiting for the block to be deleted the way testDecommission 
> waits for replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9476) TestDFSUpgradeFromImage#testUpgradeFromRel1BBWImage occasionally fail

2016-03-06 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182645#comment-15182645
 ] 

Masatake Iwasaki commented on HDFS-9476:


Sorry, this was wrong: "Cannot obtain block length for LocatedBlock..." is the 
retrying case.

> TestDFSUpgradeFromImage#testUpgradeFromRel1BBWImage occasionally fail
> -
>
> Key: HDFS-9476
> URL: https://issues.apache.org/jira/browse/HDFS-9476
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Akira AJISAKA
> Attachments: HDFS-9476.01.patch
>
>
> This test occasionally fails. For example, the most recent failure is:
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2587/
> Error Message
> {noformat}
> Cannot obtain block length for 
> LocatedBlock{BP-1371507683-67.195.81.153-1448798439809:blk_7162739548153522810_1020;
>  getBlockSize()=1024; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:33080,DS-c5eaf2b4-2ee6-419d-a8a0-44a5df5ef9a1,DISK]]}
> {noformat}
> Stacktrace
> {noformat}
> java.io.IOException: Cannot obtain block length for 
> LocatedBlock{BP-1371507683-67.195.81.153-1448798439809:blk_7162739548153522810_1020;
>  getBlockSize()=1024; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:33080,DS-c5eaf2b4-2ee6-419d-a8a0-44a5df5ef9a1,DISK]]}
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:399)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:343)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:275)
>   at org.apache.hadoop.hdfs.DFSInputStream.(DFSInputStream.java:265)
>   at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1046)
>   at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1011)
>   at 
> org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.dfsOpenFileWithRetries(TestDFSUpgradeFromImage.java:177)
>   at 
> org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.verifyDir(TestDFSUpgradeFromImage.java:213)
>   at 
> org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.verifyFileSystem(TestDFSUpgradeFromImage.java:228)
>   at 
> org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.upgradeAndVerify(TestDFSUpgradeFromImage.java:600)
>   at 
> org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.testUpgradeFromRel1BBWImage(TestDFSUpgradeFromImage.java:622)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9476) TestDFSUpgradeFromImage#testUpgradeFromRel1BBWImage occasionally fail

2016-03-06 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182641#comment-15182641
 ] 

Masatake Iwasaki commented on HDFS-9476:


The message "Cannot obtain block length for LocatedBlock..." implies that 
{{DFSInputStream#readBlockLength}} failed. 
{{TestDFSUpgradeFromImage#dfsOpenFileWithRetries}} will not retry in that case. 
Just increasing retries will not work.
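In other words, the retry helper re-attempts only when the failure looks like the case it knows to be transient; any other {{IOException}} propagates immediately, so raising the retry count cannot help. A self-contained Java sketch of that fail-fast-unless-matched pattern (the marker string and all names here are illustrative, not taken from {{TestDFSUpgradeFromImage}}):

```java
import java.io.IOException;

public class RetryOnMessage {
  /** An action that may fail with an IOException. */
  interface IOAction<T> {
    T run() throws IOException;
  }

  /**
   * Retries the action only when the exception message contains the given
   * marker. Any other IOException is rethrown at once, so increasing
   * maxAttempts has no effect on unmatched failures.
   */
  static <T> T callWithRetries(IOAction<T> action, String retryableMarker,
                               int maxAttempts) throws IOException {
    IOException last = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return action.run();
      } catch (IOException e) {
        String msg = e.getMessage();
        if (msg == null || !msg.contains(retryableMarker)) {
          throw e;  // not the known-transient case: fail fast
        }
        last = e;   // transient: loop and try again
      }
    }
    throw last;
  }

  public static void main(String[] args) throws IOException {
    int result = callWithRetries(() -> 7, "connection reset", 3);
    System.out.println(result);  // prints "7"
  }
}
```

Under this structure, an unmatched message such as "Cannot obtain block length" fails on the first attempt no matter how large the retry budget is.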

> TestDFSUpgradeFromImage#testUpgradeFromRel1BBWImage occasionally fail
> -
>
> Key: HDFS-9476
> URL: https://issues.apache.org/jira/browse/HDFS-9476
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Akira AJISAKA
> Attachments: HDFS-9476.01.patch
>
>
> This test occasionally fails. For example, the most recent failure is:
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2587/
> Error Message
> {noformat}
> Cannot obtain block length for 
> LocatedBlock{BP-1371507683-67.195.81.153-1448798439809:blk_7162739548153522810_1020;
>  getBlockSize()=1024; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:33080,DS-c5eaf2b4-2ee6-419d-a8a0-44a5df5ef9a1,DISK]]}
> {noformat}
> Stacktrace
> {noformat}
> java.io.IOException: Cannot obtain block length for 
> LocatedBlock{BP-1371507683-67.195.81.153-1448798439809:blk_7162739548153522810_1020;
>  getBlockSize()=1024; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:33080,DS-c5eaf2b4-2ee6-419d-a8a0-44a5df5ef9a1,DISK]]}
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:399)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:343)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:275)
>   at org.apache.hadoop.hdfs.DFSInputStream.(DFSInputStream.java:265)
>   at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1046)
>   at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1011)
>   at 
> org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.dfsOpenFileWithRetries(TestDFSUpgradeFromImage.java:177)
>   at 
> org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.verifyDir(TestDFSUpgradeFromImage.java:213)
>   at 
> org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.verifyFileSystem(TestDFSUpgradeFromImage.java:228)
>   at 
> org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.upgradeAndVerify(TestDFSUpgradeFromImage.java:600)
>   at 
> org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.testUpgradeFromRel1BBWImage(TestDFSUpgradeFromImage.java:622)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9865) TestBlockReplacement fails intermittently in trunk

2016-03-06 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182627#comment-15182627
 ] 

Masatake Iwasaki commented on HDFS-9865:


Thanks for working on this, [~linyiqun].

{code}
415   int tries = 0;
416   while (tries++ < 20) {
{code}

{{for (int tries = 0; tries < 20; tries++)}} would be better, to limit the 
scope of {{tries}}.


{code}
418 // Triggering the incremental block report to report the 
deleted block
419 // to namenode
420 cluster.getDataNodes().get(0).triggerBlockReport(
421 new 
BlockReportOptions.Factory().setIncremental(true).build());
{code}

Can you replace {{triggerBlockReport}} with 
{{DataNodeTestUtils#triggerDeletionReport}}? Though {{triggerBlockReport}} was 
in the original code, sending just an IBR is sufficient and faster.
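Both suggestions above boil down to replacing the fixed {{Thread.sleep(3000)}} with a bounded polling loop whose counter is scoped to the loop. A plain-Java sketch of that pattern, with no Hadoop dependencies (the {{BooleanSupplier}} stands in for checking the datanode count; all names here are illustrative):

```java
import java.util.function.BooleanSupplier;

public class PollUntil {
  /**
   * Polls the condition at a fixed interval until it holds or the attempt
   * budget runs out; returns whether the condition ever held. The counter
   * is scoped to the loop, as suggested in the review.
   */
  static boolean pollUntil(BooleanSupplier condition, int maxTries,
                           long intervalMillis) {
    for (int tries = 0; tries < maxTries; tries++) {
      if (condition.getAsBoolean()) {
        return true;
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();  // preserve interrupt status
        return false;
      }
    }
    return condition.getAsBoolean();
  }

  public static void main(String[] args) {
    long deadline = System.currentTimeMillis() + 30;
    // Becomes true after ~30 ms, well inside 20 tries x 10 ms.
    System.out.println(pollUntil(
        () -> System.currentTimeMillis() >= deadline, 20, 10));
  }
}
```

Compared with a fixed sleep, this returns as soon as the condition holds and fails deterministically after the budget is exhausted, instead of guessing a "good" sleep length.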


> TestBlockReplacement fails intermittently in trunk
> --
>
> Key: HDFS-9865
> URL: https://issues.apache.org/jira/browse/HDFS-9865
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: HDFS-9865.001.patch
>
>
> I found that the testcase {{TestBlockReplacement}} sometimes fails. Looking 
> at the unit test log, I always find these messages:
> {code}
> org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement
> testDeletedBlockWhenAddBlockIsInEdit(org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement)
>   Time elapsed: 8.764 sec  <<< FAILURE!
> java.lang.AssertionError: The block should be only on 1 datanode  
> expected:<1> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement.testDeletedBlockWhenAddBlockIsInEdit(TestBlockReplacement.java:436)
> {code}
> Finally I found the reason: the block is not completely deleted in 
> testDeletedBlockWhenAddBlockIsInEdit, which leaves the datanode count 
> incorrect. Also, the time spent waiting for FsDatasetAsyncDiskService to 
> delete the block is not an accurate value. 
> {code}
> LOG.info("replaceBlock:  " + replaceBlock(block,
>   (DatanodeInfo)sourceDnDesc, (DatanodeInfo)sourceDnDesc,
>   (DatanodeInfo)destDnDesc));
> // Waiting for the FsDatasetAsyncDsikService to delete the block
> Thread.sleep(3000);
> {code}
> When I adjusted this wait to 1 second, the test always failed, and the 
> 3-second wait is not an accurate value either. We should rework this logic, 
> for example by waiting for the block to be deleted the way testDecommission 
> waits for replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9048) DistCp documentation is out-of-dated

2016-03-03 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9048:
---
  Resolution: Fixed
   Fix Version/s: 2.7.3
  2.8.0
Target Version/s: 2.7.3
  Status: Resolved  (was: Patch Available)

+1. Committed to branch-2.7 and above. Thanks, [~daisuke.kobayashi]!

> DistCp documentation is out-of-dated
> 
>
> Key: HDFS-9048
> URL: https://issues.apache.org/jira/browse/HDFS-9048
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: Daisuke Kobayashi
> Fix For: 2.8.0, 2.7.3
>
> Attachments: HDFS-9048-2.patch, HDFS-9048-3.patch, HDFS-9048-4.patch, 
> HDFS-9048.patch
>
>
> There are a couple of issues with the current distcp document:
> * It recommends the hftp / hsftp filesystems for copying data between 
> different Hadoop versions. hftp / hsftp have been deprecated in favor of 
> webhdfs.
> * If users are copying between Hadoop 2.x clusters, they can use the hdfs 
> protocol directly for better performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9048) DistCp documentation is out-of-dated

2016-03-03 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177490#comment-15177490
 ] 

Masatake Iwasaki commented on HDFS-9048:


{{namenode_address}} or {{namenode_host}} should be fine.

> DistCp documentation is out-of-dated
> 
>
> Key: HDFS-9048
> URL: https://issues.apache.org/jira/browse/HDFS-9048
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: Daisuke Kobayashi
> Attachments: HDFS-9048-2.patch, HDFS-9048-3.patch, HDFS-9048.patch
>
>
> There are a couple of issues with the current distcp document:
> * It recommends the hftp / hsftp filesystems for copying data between 
> different Hadoop versions. hftp / hsftp have been deprecated in favor of 
> webhdfs.
> * If users are copying between Hadoop 2.x clusters, they can use the hdfs 
> protocol directly for better performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9048) DistCp documentation is out-of-dated

2016-03-02 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175631#comment-15175631
 ] 

Masatake Iwasaki commented on HDFS-9048:


Thanks for the update, [~daisuke.kobayashi]. If you prefer using configuration 
properties, {{webhdfs://<hostname>:<port>}} should be 
{{webhdfs://<dfs.namenode.http-address>}}, since the value of dfs.http.address 
(and dfs.namenode.http-address, its successor) already includes the port 
number.

I'm +1 if this is addressed.

> DistCp documentation is out-of-dated
> 
>
> Key: HDFS-9048
> URL: https://issues.apache.org/jira/browse/HDFS-9048
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: Daisuke Kobayashi
> Attachments: HDFS-9048-2.patch, HDFS-9048-3.patch, HDFS-9048.patch
>
>
> There are a couple of issues with the current distcp document:
> * It recommends the hftp / hsftp filesystems for copying data between 
> different Hadoop versions. hftp / hsftp have been deprecated in favor of 
> webhdfs.
> * If users are copying between Hadoop 2.x clusters, they can use the hdfs 
> protocol directly for better performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9884) Use doxia macro to generate in-page TOC of HDFS site documentation

2016-03-01 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9884:
---
Description: Since maven-site-plugin 3.5 was released, we can use toc macro 
in Markdown.  (was: Since maven-site-plugin 3.5 was releaced, we can use toc 
macro in Markdown.)

> Use doxia macro to generate in-page TOC of HDFS site documentation
> --
>
> Key: HDFS-9884
> URL: https://issues.apache.org/jira/browse/HDFS-9884
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.7.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>
> Since maven-site-plugin 3.5 was released, we can use toc macro in Markdown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9884) Use doxia macro to generate in-page TOC of HDFS site documentation

2016-03-01 Thread Masatake Iwasaki (JIRA)
Masatake Iwasaki created HDFS-9884:
--

 Summary: Use doxia macro to generate in-page TOC of HDFS site 
documentation
 Key: HDFS-9884
 URL: https://issues.apache.org/jira/browse/HDFS-9884
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.7.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki


Since maven-site-plugin 3.5 was releaced, we can use toc macro in Markdown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9048) DistCp documentation is out-of-dated

2016-02-28 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171484#comment-15171484
 ] 

Masatake Iwasaki commented on HDFS-9048:


{noformat}
419   Remote cluster is specified as `webhdfs://<dfs.http.address>/`
420   (the default `dfs.http.address` is `<namenode>:50070`). 
{noformat}

{{dfs.http.address}} is deprecated, and its default may be changed by 
HDFS-9427. How about just saying "Remote cluster is specified as 
`webhdfs://<namenode>:<http port>/`"?


> DistCp documentation is out-of-dated
> 
>
> Key: HDFS-9048
> URL: https://issues.apache.org/jira/browse/HDFS-9048
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: Daisuke Kobayashi
> Attachments: HDFS-9048-2.patch, HDFS-9048.patch
>
>
> There are a couple of issues with the current distcp document:
> * It recommends the hftp / hsftp filesystems for copying data between 
> different Hadoop versions. hftp / hsftp have been deprecated in favor of 
> webhdfs.
> * If users are copying between Hadoop 2.x clusters, they can use the hdfs 
> protocol directly for better performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9048) DistCp documentation is out-of-dated

2016-02-28 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9048:
---
Status: Patch Available  (was: Open)

> DistCp documentation is out-of-dated
> 
>
> Key: HDFS-9048
> URL: https://issues.apache.org/jira/browse/HDFS-9048
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: Daisuke Kobayashi
> Attachments: HDFS-9048-2.patch, HDFS-9048.patch
>
>
> There are a couple of issues with the current distcp document:
> * It recommends the hftp / hsftp filesystems for copying data between 
> different Hadoop versions. hftp / hsftp have been deprecated in favor of 
> webhdfs.
> * If users are copying between Hadoop 2.x clusters, they can use the hdfs 
> protocol directly for better performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9427) HDFS should not default to ephemeral ports

2016-02-21 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156532#comment-15156532
 ] 

Masatake Iwasaki commented on HDFS-9427:


HBASE-10123 could be useful as reference.

> HDFS should not default to ephemeral ports
> --
>
> Key: HDFS-9427
> URL: https://issues.apache.org/jira/browse/HDFS-9427
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: Arpit Agarwal
>Assignee: Xiaobing Zhou
>Priority: Critical
>  Labels: Incompatible
>
> HDFS defaults to ephemeral ports for the some HTTP/RPC endpoints. This can 
> cause bind exceptions on service startup if the port is in use.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9765) TestBlockScanner#testVolumeIteratorWithCaching fails intermittently

2016-02-15 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15147139#comment-15147139
 ] 

Masatake Iwasaki commented on HDFS-9765:


+1. {{TestBlockScanner#testVolumeIteratorWithCaching}} took 20+ seconds on my 
server under no additional load, so it could plausibly time out on heavily 
loaded build servers.

> TestBlockScanner#testVolumeIteratorWithCaching fails intermittently
> ---
>
> Key: HDFS-9765
> URL: https://issues.apache.org/jira/browse/HDFS-9765
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Mingliang Liu
>Assignee: Akira AJISAKA
> Attachments: HDFS-9765.01.patch
>
>
> It got timed out exception, with following stack:
> {code}
> java.lang.Exception: test timed out after 6 milliseconds
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:812)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:776)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:747)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
>   at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:427)
>   at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:376)
>   at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:369)
>   at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:362)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBlockScanner$TestContext.createFiles(TestBlockScanner.java:129)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBlockScanner.testVolumeIteratorImpl(TestBlockScanner.java:159)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBlockScanner.testVolumeIteratorWithCaching(TestBlockScanner.java:250)
> {code}
> See recent builds:
> * 
> https://builds.apache.org/job/PreCommit-HDFS-Build/14390/testReport/org.apache.hadoop.hdfs.server.datanode/TestBlockScanner/testVolumeIteratorWithCaching/
> * 
> https://builds.apache.org/job/PreCommit-HDFS-Build/14346/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_66.txt
> * 
> https://builds.apache.org/job/PreCommit-HDFS-Build/14392/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_66.txt
> * 
> https://builds.apache.org/job/PreCommit-HDFS-Build/14393/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_66.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9700) DFSClient and DFSOutputStream should set TCP_NODELAY on sockets for DataTransferProtocol

2016-02-12 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9700:
---
Issue Type: Improvement  (was: Bug)

> DFSClient and DFSOutputStream should set TCP_NODELAY on sockets for 
> DataTransferProtocol
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-branch-2.7.002.patch, 
> HDFS-9700-branch-2.7.003.patch, HDFS-9700-v1.patch, HDFS-9700-v2.patch, 
> HDFS-9700.002.patch, HDFS-9700.003.patch, HDFS-9700.004.patch, 
> HDFS-9700_branch-2.7-v2.patch, HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-02-12 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144743#comment-15144743
 ] 

Masatake Iwasaki commented on HDFS-9700:


I would like to update the title and type of this issue, because it is an 
improvement to DataTransferProtocol rather than a bug fix in IPC.

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-branch-2.7.002.patch, 
> HDFS-9700-branch-2.7.003.patch, HDFS-9700-v1.patch, HDFS-9700-v2.patch, 
> HDFS-9700.002.patch, HDFS-9700.003.patch, HDFS-9700.004.patch, 
> HDFS-9700_branch-2.7-v2.patch, HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9700) DFSClient and DFSOutputStream should set TCP_NODELAY on sockets for DataTransferProtocol

2016-02-12 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9700:
---
Summary: DFSClient and DFSOutputStream should set TCP_NODELAY on sockets 
for DataTransferProtocol  (was: DFSClient and DFSOutputStream do not respect 
TCP_NODELAY config in two spots)

> DFSClient and DFSOutputStream should set TCP_NODELAY on sockets for 
> DataTransferProtocol
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-branch-2.7.002.patch, 
> HDFS-9700-branch-2.7.003.patch, HDFS-9700-v1.patch, HDFS-9700-v2.patch, 
> HDFS-9700.002.patch, HDFS-9700.003.patch, HDFS-9700.004.patch, 
> HDFS-9700_branch-2.7-v2.patch, HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-02-12 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144739#comment-15144739
 ] 

Masatake Iwasaki commented on HDFS-9700:


+1. I will commit this to branch-2.8 and above if there is no further comment. 
Thanks for the update, [~ghelmling].

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-branch-2.7.002.patch, 
> HDFS-9700-branch-2.7.003.patch, HDFS-9700-v1.patch, HDFS-9700-v2.patch, 
> HDFS-9700.002.patch, HDFS-9700.003.patch, HDFS-9700.004.patch, 
> HDFS-9700_branch-2.7-v2.patch, HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9700) DFSClient and DFSOutputStream should set TCP_NODELAY on sockets for DataTransferProtocol

2016-02-12 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9700:
---
   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Committed to branch-2.8 and above. Thanks, [~ghelmling], [~liuml07] and 
[~cmccabe].

> DFSClient and DFSOutputStream should set TCP_NODELAY on sockets for 
> DataTransferProtocol
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Fix For: 2.8.0
>
> Attachments: HDFS-9700-branch-2.7.002.patch, 
> HDFS-9700-branch-2.7.003.patch, HDFS-9700-v1.patch, HDFS-9700-v2.patch, 
> HDFS-9700.002.patch, HDFS-9700.003.patch, HDFS-9700.004.patch, 
> HDFS-9700_branch-2.7-v2.patch, HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-02-04 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15133583#comment-15133583
 ] 

Masatake Iwasaki commented on HDFS-9700:


I'm +1 on 003 too. I do not think setting the default value to true does any 
harm. Thanks for the update, [~ghelmling].

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-branch-2.7.002.patch, 
> HDFS-9700-branch-2.7.003.patch, HDFS-9700-v1.patch, HDFS-9700-v2.patch, 
> HDFS-9700.002.patch, HDFS-9700.003.patch, HDFS-9700_branch-2.7-v2.patch, 
> HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-02-04 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15133598#comment-15133598
 ] 

Masatake Iwasaki commented on HDFS-9700:


Just one nit:
{code}
sock.setTcpNoDelay(client.getConf().getDataTransferTcpNoDelay());
{code}
This could be
{code}
sock.setTcpNoDelay(conf.getDataTransferTcpNoDelay());
{code}
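For context, TCP_NODELAY is a standard per-socket option in the JDK, and the fix amounts to applying the configured value to each freshly created socket before any data is written. A minimal, Hadoop-free sketch (the helper name and parameter are illustrative):

```java
import java.net.Socket;
import java.net.SocketException;

public class NoDelayExample {
  /** Applies the configured TCP_NODELAY value to a freshly created socket. */
  static Socket newSocket(boolean tcpNoDelay) throws SocketException {
    Socket sock = new Socket();  // unconnected; options may be set now
    // Disable Nagle's algorithm when requested, before any data is sent.
    sock.setTcpNoDelay(tcpNoDelay);
    return sock;
  }

  public static void main(String[] args) throws Exception {
    try (Socket s = newSocket(true)) {
      System.out.println(s.getTcpNoDelay());  // prints "true"
    }
  }
}
```

Disabling Nagle's algorithm this way avoids the small-write batching delay, which matters most for the short request/ack exchanges of DataTransferProtocol, especially with SASL handshakes when security is enabled.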


> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-branch-2.7.002.patch, 
> HDFS-9700-branch-2.7.003.patch, HDFS-9700-v1.patch, HDFS-9700-v2.patch, 
> HDFS-9700.002.patch, HDFS-9700.003.patch, HDFS-9700_branch-2.7-v2.patch, 
> HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9601) NNThroughputBenchmark.BlockReportStats should handle NotReplicatedYetException on adding block

2016-02-03 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131457#comment-15131457
 ] 

Masatake Iwasaki commented on HDFS-9601:


Thanks, [~shv]. I'm +1 on cherry-picking.

> NNThroughputBenchmark.BlockReportStats should handle 
> NotReplicatedYetException on adding block
> --
>
> Key: HDFS-9601
> URL: https://issues.apache.org/jira/browse/HDFS-9601
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Fix For: 2.8.0
>
> Attachments: HDFS-9601.001.patch, HDFS-9601.002.patch
>
>
> TestNNThroughputBenchmark intermittently fails due to 
> NotReplicatedYetException. Because 
> {{NNThroughputBenchmark.BlockReportStats#generateInputs}} directly uses 
> {{ClientProtocol#addBlock}}, it must handle {{NotReplicatedYetException}} 
> itself, as {{DFSOutputStream#addBlock}} does.





[jira] [Commented] (HDFS-9726) Refactor IBR code to a new class

2016-02-02 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129110#comment-15129110
 ] 

Masatake Iwasaki commented on HDFS-9726:


Thanks for working on this, [~szetszwo]. I think this is a good improvement 
that makes the datanode's code clearer.

The patch looks good to me overall, and the unit tests passed in my 
environment. I found some nits:

* {{IncrementalBlockReportManager#addRDBI}} should be marked 
{{@VisibleForTesting}}, and {{IncrementalBlockReportManager#sendImmediately}} 
should not be.
* {{notifyNamenodeBlock}}: I felt the meanings of {{send}} and {{now}} are not 
clear from the variable names. How about {{immediate}} and {{notify}}, 
respectively?

> Refactor IBR code to a new class
> 
>
> Key: HDFS-9726
> URL: https://issues.apache.org/jira/browse/HDFS-9726
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: h9726_20160131.patch, h9726_20160201.patch
>
>
> The IBR code currently is mainly in BPServiceActor.  The JIRA is to refactor 
> it to a new class.





[jira] [Commented] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-02-01 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126245#comment-15126245
 ] 

Masatake Iwasaki commented on HDFS-9700:


Neither {{DFSClient#connectToDN}} nor {{DataStreamer#createSocketForPipeline}} 
should use {{CommonConfigurationKeysPublic.IPC_CLIENT_TCPNODELAY_KEY}}.

Though I prefer always setting TCP_NODELAY, we should run a benchmark such as 
TestDFSIO to see the effect before changing the default behavior.

Adding a configuration key such as 
HdfsClientConfigKeys.DFS_CLIENT_SOCKET_TCP_NODELAY would be a conservative 
option: it retains the existing behaviour and allows changing the default 
value later. (See HDFS-8829 and HDFS-9259 as examples of this kind of fix.)
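As a rough illustration (plain JDK code, not Hadoop; the default and the 
plumbing around it are assumptions, since the actual key was not finalized in 
this thread), the fix boils down to applying a configured flag with 
{{setTcpNoDelay()}} before the socket is used:

```java
import java.net.Socket;
import java.net.SocketException;

public class TcpNoDelaySketch {
    // Hypothetical conservative default, matching the suggestion above to
    // retain existing behaviour and flip the default later if benchmarks agree.
    static final boolean DEFAULT_TCP_NODELAY = false;

    // Apply the configured TCP_NODELAY value to a freshly created socket,
    // before any data is sent on it.
    static Socket applyTcpNoDelay(Socket sock, boolean noDelay)
            throws SocketException {
        sock.setTcpNoDelay(noDelay); // true disables Nagle's algorithm
        return sock;
    }

    public static void main(String[] args) throws Exception {
        // Works on an unconnected socket; the option takes effect on connect.
        Socket sock = applyTcpNoDelay(new Socket(), true);
        System.out.println("tcpNoDelay=" + sock.getTcpNoDelay()); // true
        sock.close();
    }
}
```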

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-v1.patch, HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.





[jira] [Commented] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-01-29 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15123206#comment-15123206
 ] 

Masatake Iwasaki commented on HDFS-9700:


{{CommonConfigurationKeysPublic.IPC_CLIENT_TCPNODELAY_KEY}} is the 
configuration key for Hadoop IPC, but the sockets referred to here are used 
for the data transfer protocol (which does not use the common IPC framework). 
If this is crucial, should we use another key, or always set TCP_NODELAY as 
{{DFSUtilClient#peerFromSocket}} does?

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-v1.patch, HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.





[jira] [Commented] (HDFS-9689) Test o.a.h.hdfs.TestRenameWhileOpen fails intermittently

2016-01-24 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114671#comment-15114671
 ] 

Masatake Iwasaki commented on HDFS-9689:


Thanks for working on this, [~liuml07].

Is it still possible for other test processes to bind the NN port between 
{{shutdownNameNode}} and {{createNameNode}} in 
{{MiniDFSCluster#restartNameNode}}? The patch narrows the time window in which 
the port can be intercepted, but may not eliminate it.

{code}
shutdownNameNode(nnIndex);
if (args.length != 0) {
  startOpt = null;
} else {
  args = createArgs(startOpt);
}

NameNode nn = NameNode.createNameNode(args, info.conf);
{code}
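The failure mode itself is easy to reproduce with plain JDK sockets (no 
MiniDFSCluster involved). This sketch shows why re-requesting a fixed port is 
racy: once any other process binds it, the second bind throws 
{{BindException}}, which is exactly what the restarted NameNode hits.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class BindRaceSketch {
    // Returns true if binding a second server socket to an already-bound
    // fixed port fails, mirroring the "Address already in use" stack trace.
    static boolean rebindFails(int port) throws IOException {
        try (ServerSocket second = new ServerSocket()) {
            second.bind(new InetSocketAddress("127.0.0.1", port));
            return false;
        } catch (BindException e) {
            return true; // port was grabbed by someone else in the window
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket first = new ServerSocket()) {
            // Ephemeral port stands in for the NN RPC port held by a rival.
            first.bind(new InetSocketAddress("127.0.0.1", 0));
            System.out.println("rebind failed: "
                + rebindFails(first.getLocalPort())); // true
        }
    }
}
```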

> Test o.a.h.hdfs.TestRenameWhileOpen fails intermittently 
> -
>
> Key: HDFS-9689
> URL: https://issues.apache.org/jira/browse/HDFS-9689
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9689.000.patch
>
>
> The test fails in recent builds, e.g.
> https://builds.apache.org/job/PreCommit-HDFS-Build/14063/testReport/org.apache.hadoop.hdfs/TestRenameWhileOpen/
> and
> https://builds.apache.org/job/PreCommit-HDFS-Build/14212/testReport/org.apache.hadoop.hdfs/TestRenameWhileOpen/testWhileOpenRenameToNonExistentDirectory/
> The *Error Message* is like:
> {code}
> Problem binding to [localhost:60690] java.net.BindException: Address already 
> in use; For more details see:  http://wiki.apache.org/hadoop/BindException
> {code}
> and *Stacktrace* is:
> {code}
> java.net.BindException: Problem binding to [localhost:60690] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:463)
>   at sun.nio.ch.Net.bind(Net.java:455)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:469)
>   at org.apache.hadoop.ipc.Server$Listener.(Server.java:695)
>   at org.apache.hadoop.ipc.Server.(Server.java:2464)
>   at org.apache.hadoop.ipc.RPC$Server.(RPC.java:958)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:535)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:510)
>   at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:800)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:392)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:743)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:685)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:884)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:863)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1581)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1247)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1016)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:891)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:823)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:482)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:441)
>   at 
> org.apache.hadoop.hdfs.TestRenameWhileOpen.testWhileOpenRenameToNonExistentDirectory(TestRenameWhileOpen.java:332)
> {code}





[jira] [Updated] (HDFS-9618) Fix mismatch between log level and guard in BlockManager#computeRecoveryWorkForBlocks

2016-01-21 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9618:
---
Attachment: HDFS-9618.002.patch

> Fix mismatch between log level and guard in 
> BlockManager#computeRecoveryWorkForBlocks
> -
>
> Key: HDFS-9618
> URL: https://issues.apache.org/jira/browse/HDFS-9618
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
> Attachments: HDFS-9618.001.patch, HDFS-9618.002.patch
>
>
> Debug log message is constructed when {{Logger#isInfoEnabled}}.





[jira] [Updated] (HDFS-9601) NNThroughputBenchmark.BlockReportStats should handle NotReplicatedYetException on adding block

2016-01-21 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9601:
---
Affects Version/s: (was: 3.0.0)
   2.9.0

> NNThroughputBenchmark.BlockReportStats should handle 
> NotReplicatedYetException on adding block
> --
>
> Key: HDFS-9601
> URL: https://issues.apache.org/jira/browse/HDFS-9601
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9601.001.patch, HDFS-9601.002.patch
>
>
> TestNNThroughputBenchmark intermittently fails due to 
> NotReplicatedYetException. Because 
> {{NNThroughputBenchmark.BlockReportStats#generateInputs}} directly uses 
> {{ClientProtocol#addBlock}}, it must handle {{NotReplicatedYetException}} by 
> itself, as {{DFSOutputStream#addBlock}} does.





[jira] [Updated] (HDFS-9601) NNThroughputBenchmark.BlockReportStats should handle NotReplicatedYetException on adding block

2016-01-21 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9601:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2. Thanks, [~liuml07] and [~kihwal].

> NNThroughputBenchmark.BlockReportStats should handle 
> NotReplicatedYetException on adding block
> --
>
> Key: HDFS-9601
> URL: https://issues.apache.org/jira/browse/HDFS-9601
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9601.001.patch, HDFS-9601.002.patch
>
>
> TestNNThroughputBenchmark intermittently fails due to 
> NotReplicatedYetException. Because 
> {{NNThroughputBenchmark.BlockReportStats#generateInputs}} directly uses 
> {{ClientProtocol#addBlock}}, it must handle {{NotReplicatedYetException}} by 
> itself, as {{DFSOutputStream#addBlock}} does.





[jira] [Updated] (HDFS-9618) Fix mismatch between log level and guard in BlockManager#computeRecoveryWorkForBlocks

2016-01-21 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9618:
---
  Resolution: Fixed
   Fix Version/s: 2.9.0
Target Version/s: 2.9.0
  Status: Resolved  (was: Patch Available)

Committed this to trunk and branch-2. Thanks, Kai, Mingliang and Akira.

> Fix mismatch between log level and guard in 
> BlockManager#computeRecoveryWorkForBlocks
> -
>
> Key: HDFS-9618
> URL: https://issues.apache.org/jira/browse/HDFS-9618
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
> Fix For: 2.9.0
>
> Attachments: HDFS-9618.001.patch, HDFS-9618.002.patch
>
>
> Debug log message is constructed when {{Logger#isInfoEnabled}}.





[jira] [Commented] (HDFS-9618) Fix mismatch between log level and guard in BlockManager#computeRecoveryWorkForBlocks

2016-01-06 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085335#comment-15085335
 ] 

Masatake Iwasaki commented on HDFS-9618:


The log level had been INFO, but it seems to have been changed to DEBUG in the 
EC branch (6b6a63bb).

> Fix mismatch between log level and guard in 
> BlockManager#computeRecoveryWorkForBlocks
> -
>
> Key: HDFS-9618
> URL: https://issues.apache.org/jira/browse/HDFS-9618
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
>
> Debug log message is constructed when {{Logger#isInfoEnabled}}.





[jira] [Commented] (HDFS-9618) Fix mismatch between log level and guard in BlockManager#computeRecoveryWorkForBlocks

2016-01-06 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085324#comment-15085324
 ] 

Masatake Iwasaki commented on HDFS-9618:


{code}
if (blockLog.isInfoEnabled()) {
  // log which blocks have been scheduled for replication
  for(BlockRecoveryWork rw : recovWork){
DatanodeStorageInfo[] targets = rw.getTargets();
if (targets != null && targets.length != 0) {
  StringBuilder targetList = new StringBuilder("datanode(s)");
  for (DatanodeStorageInfo target : targets) {
targetList.append(' ');
targetList.append(target.getDatanodeDescriptor());
  }
  blockLog.debug("BLOCK* ask {} to replicate {} to {}", 
rw.getSrcNodes(),
  rw.getBlock(), targetList);
}
  }
}
{code}
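The mismatch is the {{isInfoEnabled}} guard around a {{debug}} call. A minimal 
sketch of the corrected pattern, using JDK logging in place of the 
SLF4J-style {{blockLog}} (so this is not the Hadoop code itself): the guard 
level must match the emit level, so the expensive target list is only built 
when it will actually be logged.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogGuardSketch {
    static final Logger LOG = Logger.getLogger("BlockStateChange");

    // Build the (potentially large) target list only when debug-level output
    // is actually enabled; returns null when the guard skips the work.
    static String logReplicationTargets(String src, String[] targets) {
        if (!LOG.isLoggable(Level.FINE)) { // FINE plays the role of debug
            return null;
        }
        StringBuilder targetList = new StringBuilder("datanode(s)");
        for (String t : targets) {
            targetList.append(' ').append(t);
        }
        LOG.fine("BLOCK* ask " + src + " to replicate to " + targetList);
        return targetList.toString();
    }

    public static void main(String[] args) {
        // Default JUL level is INFO, so FINE is disabled and the string
        // construction is skipped entirely.
        System.out.println(logReplicationTargets("dn0",
            new String[]{"dn1", "dn2"})); // null
    }
}
```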


> Fix mismatch between log level and guard in 
> BlockManager#computeRecoveryWorkForBlocks
> -
>
> Key: HDFS-9618
> URL: https://issues.apache.org/jira/browse/HDFS-9618
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
>
> Debug log message is constructed when {{Logger#isInfoEnabled}}.





[jira] [Created] (HDFS-9618) Fix mismatch between log level and guard in BlockManager#computeRecoveryWorkForBlocks

2016-01-06 Thread Masatake Iwasaki (JIRA)
Masatake Iwasaki created HDFS-9618:
--

 Summary: Fix mismatch between log level and guard in 
BlockManager#computeRecoveryWorkForBlocks
 Key: HDFS-9618
 URL: https://issues.apache.org/jira/browse/HDFS-9618
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor


Debug log message is constructed when {{Logger#isInfoEnabled}}.





[jira] [Commented] (HDFS-9576) HTrace: collect path/offset/length information on read and write operations

2016-01-06 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085932#comment-15085932
 ] 

Masatake Iwasaki commented on HDFS-9576:


I agree with fixing the tracing of writes in a follow-up.

The 02 patch looks good, with one nit: the variable should no longer be named 
{{ignored}} because it is now used.

> HTrace: collect path/offset/length information on read and write operations
> ---
>
> Key: HDFS-9576
> URL: https://issues.apache.org/jira/browse/HDFS-9576
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, tracing
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-9576.00.patch, HDFS-9576.01.patch, 
> HDFS-9576.02.patch
>
>






[jira] [Updated] (HDFS-9618) Fix mismatch between log level and guard in BlockManager#computeRecoveryWorkForBlocks

2016-01-06 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9618:
---
Attachment: HDFS-9618.001.patch

Thanks for the comment, [~drankye]. I attached 001.

bq. Then is there any reason for the following block?

The reason seems to be that {{UnderReplicatedBlocks#size}} is not just an 
accessor but does some aggregation. I left that part as is in the attached 
patch.


> Fix mismatch between log level and guard in 
> BlockManager#computeRecoveryWorkForBlocks
> -
>
> Key: HDFS-9618
> URL: https://issues.apache.org/jira/browse/HDFS-9618
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
> Attachments: HDFS-9618.001.patch
>
>
> Debug log message is constructed when {{Logger#isInfoEnabled}}.





[jira] [Updated] (HDFS-9618) Fix mismatch between log level and guard in BlockManager#computeRecoveryWorkForBlocks

2016-01-06 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9618:
---
Affects Version/s: (was: 3.0.0)
   2.8.0
   Status: Patch Available  (was: Open)

> Fix mismatch between log level and guard in 
> BlockManager#computeRecoveryWorkForBlocks
> -
>
> Key: HDFS-9618
> URL: https://issues.apache.org/jira/browse/HDFS-9618
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
> Attachments: HDFS-9618.001.patch
>
>
> Debug log message is constructed when {{Logger#isInfoEnabled}}.





[jira] [Commented] (HDFS-9618) Fix mismatch between log level and guard in BlockManager#computeRecoveryWorkForBlocks

2016-01-06 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085978#comment-15085978
 ] 

Masatake Iwasaki commented on HDFS-9618:


This was wrong; the log level was actually changed by HDFS-6860.

> Fix mismatch between log level and guard in 
> BlockManager#computeRecoveryWorkForBlocks
> -
>
> Key: HDFS-9618
> URL: https://issues.apache.org/jira/browse/HDFS-9618
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
>
> Debug log message is constructed when {{Logger#isInfoEnabled}}.





[jira] [Created] (HDFS-9623) Update example configuration of block state change log in log4j.properties

2016-01-06 Thread Masatake Iwasaki (JIRA)
Masatake Iwasaki created HDFS-9623:
--

 Summary: Update example configuration of block state change log in 
log4j.properties
 Key: HDFS-9623
 URL: https://issues.apache.org/jira/browse/HDFS-9623
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: logging
Affects Versions: 2.8.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor


The log level of block state change log was changed from INFO to DEBUG by 
HDFS-6860. The example configuration in log4j.properties should be updated 
along with the change.





[jira] [Updated] (HDFS-9623) Update example configuration of block state change log in log4j.properties

2016-01-06 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9623:
---
Attachment: HDFS-9623.001.patch

> Update example configuration of block state change log in log4j.properties
> --
>
> Key: HDFS-9623
> URL: https://issues.apache.org/jira/browse/HDFS-9623
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: logging
>Affects Versions: 2.8.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
> Attachments: HDFS-9623.001.patch
>
>
> The log level of block state change log was changed from INFO to DEBUG by 
> HDFS-6860. The example configuration in log4j.properties should be updated 
> along with the change.





[jira] [Commented] (HDFS-9576) HTrace: collect path/offset/length information on read and write operations

2016-01-05 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082697#comment-15082697
 ] 

Masatake Iwasaki commented on HDFS-9576:


Thanks for working on this, [~zhz]. I tried the 01 patch and checked that the 
annotations are properly set in spans.

{code}
@@ -209,7 +209,7 @@ protected TraceScope createWriteTraceScope() {
   private void writeChecksumChunks(byte b[], int off, int len)
   throws IOException {
 sum.calculateChunkedSums(b, off, len, checksum, 0);
-TraceScope scope = createWriteTraceScope();
+TraceScope scope = createWriteTraceScope(len);
{code}

The {{len}} of the "DFSOutputStream#write" span is basically the size of the 
internal buffer of FSOutputSummer, and is constant (4608) most of the time. Do 
you have a specific use case for this?

For read, if the length is useful, recording the number of bytes actually read 
(which is the return value of read) as an annotation might be helpful too.
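To sketch that read-side suggestion (a plain map stands in for a real HTrace 
span, and the annotation names are made up): record both the requested length 
and the actual return value of {{read}}.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;

public class ReadSpanSketch {
    // Hypothetical stand-in for span annotations; HTrace's real span API
    // is not reproduced here.
    static final Map<String, String> annotations = new LinkedHashMap<>();

    // Annotate both the requested length and the bytes actually read,
    // which can differ (short reads, end of stream).
    static int tracedRead(InputStream in, byte[] buf, int off, int len)
            throws IOException {
        annotations.put("requestedLen", String.valueOf(len));
        int n = in.read(buf, off, len);
        annotations.put("bytesRead", String.valueOf(n));
        return n;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[]{1, 2, 3});
        byte[] buf = new byte[8];
        int n = tracedRead(in, buf, 0, 8); // asks for 8, gets 3
        System.out.println("read=" + n + " annotations=" + annotations);
    }
}
```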


> HTrace: collect path/offset/length information on read and write operations
> ---
>
> Key: HDFS-9576
> URL: https://issues.apache.org/jira/browse/HDFS-9576
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, tracing
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-9576.00.patch, HDFS-9576.01.patch
>
>






[jira] [Commented] (HDFS-9376) TestSeveralNameNodes fails occasionally

2016-01-03 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15080780#comment-15080780
 ] 

Masatake Iwasaki commented on HDFS-9376:


Thanks, [~cnauroth].

> TestSeveralNameNodes fails occasionally
> ---
>
> Key: HDFS-9376
> URL: https://issues.apache.org/jira/browse/HDFS-9376
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Kihwal Lee
>Assignee: Masatake Iwasaki
> Fix For: 3.0.0
>
> Attachments: HDFS-9376.001.patch, HDFS-9376.002.patch
>
>
> TestSeveralNameNodes has been failing in precommit builds.  It usually times 
> out on waiting for the last thread to finish writing.





[jira] [Updated] (HDFS-9601) NNThroughputBenchmark.BlockReportStats should handle NotReplicatedYetException on adding block

2016-01-03 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9601:
---
Attachment: HDFS-9601.002.patch

I attached 002.
* simplified the retrying
* added a comment
* got rid of logging on every retry, because NNThroughputBenchmark can be used 
from the command line

> NNThroughputBenchmark.BlockReportStats should handle 
> NotReplicatedYetException on adding block
> --
>
> Key: HDFS-9601
> URL: https://issues.apache.org/jira/browse/HDFS-9601
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9601.001.patch, HDFS-9601.002.patch
>
>
> TestNNThroughputBenchmark intermittently fails due to 
> NotReplicatedYetException. Because 
> {{NNThroughputBenchmark.BlockReportStats#generateInputs}} directly uses 
> {{ClientProtocol#addBlock}}, it must handle {{NotReplicatedYetException}} by 
> itself, as {{DFSOutputStream#addBlock}} does.





[jira] [Commented] (HDFS-9601) NNThroughputBenchmark.BlockReportStats should handle NotReplicatedYetException on adding block

2016-01-03 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15080778#comment-15080778
 ] 

Masatake Iwasaki commented on HDFS-9601:


Thanks for the comment, [~liuml07].

As you say, we cannot reuse {{DFSOutputStream#addBlock}} as is, since it 
depends on a real DFS client. I think it is not a problem for 
{{generateInputs}} to retry {{addBlock}} itself, because it is called in the 
preparation phase of the benchmark. It does not need the same behavior as 
{{DFSOutputStream}}; it should not even need exponential backoff.
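A minimal sketch of such a bounded retry, with a fixed short wait instead of 
exponential backoff (the exception class and the call are stand-ins; this is 
not the actual benchmark code):

```java
public class RetrySketch {
    // Stand-in for the real Hadoop NotReplicatedYetException.
    static class NotReplicatedYetException extends Exception {}

    // Stand-in for the ClientProtocol#addBlock call being retried.
    interface AddBlockCall {
        void run() throws NotReplicatedYetException;
    }

    // Retry up to maxRetries extra attempts with a fixed sleep; returns the
    // number of attempts taken, or rethrows once retries are exhausted.
    static int addBlockWithRetry(AddBlockCall call, int maxRetries)
            throws Exception {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                call.run();
                return attempts;
            } catch (NotReplicatedYetException e) {
                if (attempts > maxRetries) {
                    throw e; // give up after bounded retries
                }
                Thread.sleep(10); // short fixed wait is enough in prep phase
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] failures = {2}; // fail twice, then succeed
        int attempts = addBlockWithRetry(() -> {
            if (failures[0]-- > 0) {
                throw new NotReplicatedYetException();
            }
        }, 5);
        System.out.println("attempts=" + attempts); // attempts=3
    }
}
```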

> NNThroughputBenchmark.BlockReportStats should handle 
> NotReplicatedYetException on adding block
> --
>
> Key: HDFS-9601
> URL: https://issues.apache.org/jira/browse/HDFS-9601
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9601.001.patch
>
>
> TestNNThroughputBenchmark intermittently fails due to 
> NotReplicatedYetException. Because 
> {{NNThroughputBenchmark.BlockReportStats#generateInputs}} directly uses 
> {{ClientProtocol#addBlock}}, it must handle {{NotReplicatedYetException}} by 
> itself, as {{DFSOutputStream#addBlock}} does.





[jira] [Updated] (HDFS-9601) NNThroughputBenchmark.BlockReportStats should handle NotReplicatedYetException on adding block

2015-12-29 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9601:
---
Description: TestNNThroughputBenchmark intermittently fails due to 
NotReplicatedYetException. Because 
{{NNThroughputBenchmark.BlockReportStats#generateInputs}} directly uses 
{{ClientProtocol#addBlock}}, it must handle {{NotReplicatedYetException}} by 
itself, as {{DFSOutputStream#addBlock}} does.

> NNThroughputBenchmark.BlockReportStats should handle 
> NotReplicatedYetException on adding block
> --
>
> Key: HDFS-9601
> URL: https://issues.apache.org/jira/browse/HDFS-9601
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>
> TestNNThroughputBenchmark intermittently fails due to 
> NotReplicatedYetException. Because 
> {{NNThroughputBenchmark.BlockReportStats#generateInputs}} directly uses 
> {{ClientProtocol#addBlock}}, it must handle {{NotReplicatedYetException}} by 
> itself, as {{DFSOutputStream#addBlock}} does.





[jira] [Updated] (HDFS-9601) NNThroughputBenchmark.BlockReportStats should handle NotReplicatedYetException on adding block

2015-12-29 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9601:
---
Summary: NNThroughputBenchmark.BlockReportStats should handle 
NotReplicatedYetException on adding block  (was: Fix intermittent failure of 
TestNNThroughputBenchmark)

> NNThroughputBenchmark.BlockReportStats should handle 
> NotReplicatedYetException on adding block
> --
>
> Key: HDFS-9601
> URL: https://issues.apache.org/jira/browse/HDFS-9601
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>






[jira] [Updated] (HDFS-9601) NNThroughputBenchmark.BlockReportStats should handle NotReplicatedYetException on adding block

2015-12-29 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9601:
---
Status: Patch Available  (was: Open)

> NNThroughputBenchmark.BlockReportStats should handle 
> NotReplicatedYetException on adding block
> --
>
> Key: HDFS-9601
> URL: https://issues.apache.org/jira/browse/HDFS-9601
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9601.001.patch
>
>
> TestNNThroughputBenchmark intermittently fails due to 
> NotReplicatedYetException. Because 
> {{NNThroughputBenchmark.BlockReportStats#generateInputs}} directly uses 
> {{ClientProtocol#addBlock}}, it must handle {{NotReplicatedYetException}} by 
> itself, as {{DFSOutputStream#addBlock}} does.





[jira] [Updated] (HDFS-9601) NNThroughputBenchmark.BlockReportStats should handle NotReplicatedYetException on adding block

2015-12-29 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9601:
---
Attachment: HDFS-9601.001.patch

I attached 001.

For the case of {{TestNNThroughputBenchmark}}, {{NotReplicatedYetException}} is 
not wrapped by {{RemoteException}}.


> NNThroughputBenchmark.BlockReportStats should handle 
> NotReplicatedYetException on adding block
> --
>
> Key: HDFS-9601
> URL: https://issues.apache.org/jira/browse/HDFS-9601
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9601.001.patch
>
>
> TestNNThroughputBenchmark intermittently fails due to 
> NotReplicatedYetException. Because 
> {{NNThroughputBenchmark.BlockReportStats#generateInputs}} directly uses 
> {{ClientProtocol#addBlock}}, it must handle {{NotReplicatedYetException}} by 
> itself, as {{DFSOutputStream#addBlock}} does.





[jira] [Updated] (HDFS-9376) TestSeveralNameNodes fails occasionally

2015-12-28 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9376:
---
Attachment: HDFS-9376.002.patch

I updated the patch. Though the possibility is low, clients could exhaust 
their retries within the test run time, so the max retries is increased in 002.

I could reproduce the test failure within 10 runs without the patch, and saw 
no failures in 100 runs with the patch applied.


> TestSeveralNameNodes fails occasionally
> ---
>
> Key: HDFS-9376
> URL: https://issues.apache.org/jira/browse/HDFS-9376
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9376.001.patch, HDFS-9376.002.patch
>
>
> TestSeveralNameNodes has been failing in precommit builds.  It usually times 
> out on waiting for the last thread to finish writing.





[jira] [Commented] (HDFS-9376) TestSeveralNameNodes fails occasionally

2015-12-28 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073364#comment-15073364
 ] 

Masatake Iwasaki commented on HDFS-9376:


The failure of {{TestReplicationPolicyConsiderLoad}} was already fixed by 
HDFS-9597. The other tests are flaky regardless of the patch and succeeded in 
my local environment.

> TestSeveralNameNodes fails occasionally
> ---
>
> Key: HDFS-9376
> URL: https://issues.apache.org/jira/browse/HDFS-9376
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9376.001.patch, HDFS-9376.002.patch
>
>
> TestSeveralNameNodes has been failing in precommit builds.  It usually times 
> out on waiting for the last thread to finish writing.





[jira] [Updated] (HDFS-9602) TestReplicationPolicyConsiderLoad.testChooseTargetWithDecomNodes failed on trunk

2015-12-25 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9602:
---
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

> TestReplicationPolicyConsiderLoad.testChooseTargetWithDecomNodes failed on 
> trunk
> 
>
> Key: HDFS-9602
> URL: https://issues.apache.org/jira/browse/HDFS-9602
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Phil Yang
>Assignee: Phil Yang
>Priority: Blocker
> Attachments: HDFS-9602-v1.patch
>
>
> Failed tests: 
> TestReplicationPolicyConsiderLoad.testChooseTargetWithDecomNodes:96 
> expected:<1.6667> but was:<0.0>
> TestReplicationPolicyConsiderLoad.testChooseTargetWithDecomNodes:96 
> expected:<1.6667> but was:<0.0>
> I think it is broken by HDFS-9034 which adds an updateDnStat method





[jira] [Commented] (HDFS-9602) TestReplicationPolicyConsiderLoad.testChooseTargetWithDecomNodes failed on trunk

2015-12-25 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15071595#comment-15071595
 ] 

Masatake Iwasaki commented on HDFS-9602:


Thanks for reporting this, [~yangzhe1991]. This is a duplicate of HDFS-9597. 
Could you add a review comment on the submitted patch?

> TestReplicationPolicyConsiderLoad.testChooseTargetWithDecomNodes failed on 
> trunk
> 
>
> Key: HDFS-9602
> URL: https://issues.apache.org/jira/browse/HDFS-9602
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Phil Yang
>Assignee: Phil Yang
>Priority: Blocker
> Attachments: HDFS-9602-v1.patch
>
>
> Failed tests: 
> TestReplicationPolicyConsiderLoad.testChooseTargetWithDecomNodes:96 
> expected:<1.6667> but was:<0.0>
> TestReplicationPolicyConsiderLoad.testChooseTargetWithDecomNodes:96 
> expected:<1.6667> but was:<0.0>
> I think it is broken by HDFS-9034 which adds an updateDnStat method





[jira] [Updated] (HDFS-9376) TestSeveralNameNodes fails occasionally

2015-12-24 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9376:
---
Attachment: HDFS-9376.001.patch

If we set the time between failovers to a greater value, we might not get enough 
failovers kicking in while the test threads are doing operations. I think it would 
be better to set a smaller maximum sleep time for clients via 
{{dfs.client.failover.sleep.max.millis}} to make the test stable and shorter.

{code}
String  SLEEPTIME_BASE_KEY = PREFIX + "sleep.base.millis";
int SLEEPTIME_BASE_DEFAULT = 500;
String  SLEEPTIME_MAX_KEY = PREFIX + "sleep.max.millis";
int SLEEPTIME_MAX_DEFAULT = 15000;
{code}
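For example, the cap could be lowered in a test's configuration like this (a sketch; the 1000 ms value below is illustrative, not necessarily what the patch uses):

```xml
<!-- Illustrative override for a test's hdfs-site.xml; the value is hypothetical. -->
<property>
  <name>dfs.client.failover.sleep.max.millis</name>
  <value>1000</value>
</property>
```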

In addition, the test should exit immediately after all test threads finish their 
work.

> TestSeveralNameNodes fails occasionally
> ---
>
> Key: HDFS-9376
> URL: https://issues.apache.org/jira/browse/HDFS-9376
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9376.001.patch
>
>
> TestSeveralNameNodes has been failing in precommit builds.  It usually times 
> out on waiting for the last thread to finish writing.





[jira] [Updated] (HDFS-9376) TestSeveralNameNodes fails occasionally

2015-12-24 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9376:
---
Status: Patch Available  (was: Open)

> TestSeveralNameNodes fails occasionally
> ---
>
> Key: HDFS-9376
> URL: https://issues.apache.org/jira/browse/HDFS-9376
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9376.001.patch
>
>
> TestSeveralNameNodes has been failing in precommit builds.  It usually times 
> out on waiting for the last thread to finish writing.





[jira] [Commented] (HDFS-9376) TestSeveralNameNodes fails occasionally

2015-12-24 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15071388#comment-15071388
 ] 

Masatake Iwasaki commented on HDFS-9376:


The failover thread in {{HAStressTestHarness}} invokes failover periodically with 
a fixed sleep time; {{msBetweenFailovers}} is set to 1000 ms for 
{{TestSeveralNameNodes}}.

{code}
for (int i = 0; i < nns; i++) {
  int next = (i + 1) % nns;
  ...
  cluster.transitionToStandby(i);
  cluster.transitionToActive(next);
  ...
  Thread.sleep(msBetweenFailovers);
{code}

The client's retry proxy sleeps for a time that grows exponentially with the 
number of failover retries. If the client repeatedly fails on an operation, it can 
sleep for up to around 15 seconds at a time, and may not get enough effective run 
time as a result.

{noformat}
  2015-12-24 12:22:00,784 [Thread-250] INFO  retry.RetryInvocationHandler 
(RetryInvocationHandler.java:invoke(147)) - Exception while invoking create of 
class ClientNamenodeProtocolTranslatorPB over localhost/127.0.0.1:42201 after 4 
fail over attempts. Trying to fail over after sleeping for 10161ms.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category WRITE is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
{noformat}
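As a rough model of that behavior (a sketch only, using the default base and max values from the constants above; Hadoop's real retry policy also randomizes the delay, which is why the log shows 10161 ms rather than a power of two):

```java
// Rough model of the client's capped exponential failover backoff.
// Not the actual Hadoop implementation; the randomization is omitted.
public class FailoverSleepSketch {
    static final long SLEEP_BASE_MS = 500;   // dfs.client.failover.sleep.base.millis default
    static final long SLEEP_MAX_MS = 15000;  // dfs.client.failover.sleep.max.millis default

    // Doubles the base sleep per failed failover attempt, capped at the max.
    static long sleepTime(int failedAttempts) {
        long t = SLEEP_BASE_MS * (1L << failedAttempts);
        return Math.min(t, SLEEP_MAX_MS);
    }

    public static void main(String[] args) {
        for (int i = 0; i <= 5; i++) {
            System.out.println("after " + i + " failed attempts: up to "
                + sleepTime(i) + " ms");
        }
    }
}
```

With these defaults the sleep reaches the 15-second cap after only five failed attempts, which matches the stalls seen in the test.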

> TestSeveralNameNodes fails occasionally
> ---
>
> Key: HDFS-9376
> URL: https://issues.apache.org/jira/browse/HDFS-9376
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Masatake Iwasaki
>
> TestSeveralNameNodes has been failing in precommit builds.  It usually times 
> out on waiting for the last thread to finish writing.





[jira] [Created] (HDFS-9601) Fix intermittent failure of TestNNThroughputBenchmark

2015-12-24 Thread Masatake Iwasaki (JIRA)
Masatake Iwasaki created HDFS-9601:
--

 Summary: Fix intermittent failure of TestNNThroughputBenchmark
 Key: HDFS-9601
 URL: https://issues.apache.org/jira/browse/HDFS-9601
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki








[jira] [Commented] (HDFS-9601) Fix intermittent failure of TestNNThroughputBenchmark

2015-12-24 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15071407#comment-15071407
 ] 

Masatake Iwasaki commented on HDFS-9601:


{noformat}
testNNThroughput(org.apache.hadoop.hdfs.server.namenode.TestNNThroughputBenchmark)
  Time elapsed: 2.836 sec  <<< ERROR!
org.apache.hadoop.hdfs.server.namenode.NotReplicatedYetException: Not 
replicated yet: 
/nnThroughputBenchmark/blockReport/ThroughputBenchDir0/ThroughputBench4
at 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:190)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2378)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:797)
at 
org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1184)
at 
org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1171)
at 
org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:281)
at 
org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1519)
at 
org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1422)
at 
org.apache.hadoop.hdfs.server.namenode.TestNNThroughputBenchmark.testNNThroughput(TestNNThroughputBenchmark.java:53)
{noformat}


> Fix intermittent failure of TestNNThroughputBenchmark
> -
>
> Key: HDFS-9601
> URL: https://issues.apache.org/jira/browse/HDFS-9601
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>






[jira] [Assigned] (HDFS-9376) TestSeveralNameNodes fails occasionally

2015-12-23 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki reassigned HDFS-9376:
--

Assignee: Masatake Iwasaki

> TestSeveralNameNodes fails occasionally
> ---
>
> Key: HDFS-9376
> URL: https://issues.apache.org/jira/browse/HDFS-9376
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Masatake Iwasaki
>
> TestSeveralNameNodes has been failing in precommit builds.  It usually times 
> out on waiting for the last thread to finish writing.





[jira] [Commented] (HDFS-9505) HDFS Architecture documentation needs to be refreshed.

2015-12-21 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066800#comment-15066800
 ] 

Masatake Iwasaki commented on HDFS-9505:


Thanks, [~ajisakaa]!

> HDFS Architecture documentation needs to be refreshed.
> --
>
> Key: HDFS-9505
> URL: https://issues.apache.org/jira/browse/HDFS-9505
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Reporter: Chris Nauroth
>Assignee: Masatake Iwasaki
> Fix For: 2.8.0, 2.7.3
>
> Attachments: HDFS-9505.001.patch, HDFS-9505.002.patch
>
>
> The HDFS Architecture document is out of date with respect to the current 
> design of the system.
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
> There are multiple false statements and omissions of recent features.





[jira] [Updated] (HDFS-9505) HDFS Architecture documentation needs to be refreshed.

2015-12-20 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9505:
---
Attachment: HDFS-9505.002.patch

I attached the updated patch as 002.

* made it clearer that moving files to trash is a feature of the FS shell.
* moved the contents of the "HDFS Trash Management" section under the "File 
Deletes and Undeletes" section.
* moved part of the contents about the trash feature to the FileSystem Shell guide.
* removed the description about WebDAV and added the NFS gateway instead.


> HDFS Architecture documentation needs to be refreshed.
> --
>
> Key: HDFS-9505
> URL: https://issues.apache.org/jira/browse/HDFS-9505
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Chris Nauroth
>Assignee: Masatake Iwasaki
>Priority: Minor
> Attachments: HDFS-9505.001.patch, HDFS-9505.002.patch
>
>
> The HDFS Architecture document is out of date with respect to the current 
> design of the system.
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
> There are multiple false statements and omissions of recent features.





[jira] [Commented] (HDFS-9505) HDFS Architecture documentation needs to be refreshed.

2015-12-19 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15065648#comment-15065648
 ] 

Masatake Iwasaki commented on HDFS-9505:


Thanks for the review comments, [~ajisakaa].

bq. hadoop fs -expunge actually does not delete all the files in trash.

I'm going to move the relevant part to the "FileSystem Shell" guide and fix the 
description, because trash is (basically) a feature of hadoop-common and could 
be relevant to other file systems.

bq. There is a jira for WebDAV (HDFS-225) but there have been no updates for 
more than 6 years. Instead, we should document that HDFS now supports NFSv3.

Yeah. I agree.


> HDFS Architecture documentation needs to be refreshed.
> --
>
> Key: HDFS-9505
> URL: https://issues.apache.org/jira/browse/HDFS-9505
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Chris Nauroth
>Assignee: Masatake Iwasaki
>Priority: Minor
> Attachments: HDFS-9505.001.patch
>
>
> The HDFS Architecture document is out of date with respect to the current 
> design of the system.
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
> There are multiple false statements and omissions of recent features.





[jira] [Updated] (HDFS-9505) HDFS Architecture documentation needs to be refreshed.

2015-12-13 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9505:
---
Attachment: (was: HDFS-9505.001.patch)

> HDFS Architecture documentation needs to be refreshed.
> --
>
> Key: HDFS-9505
> URL: https://issues.apache.org/jira/browse/HDFS-9505
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Chris Nauroth
>Assignee: Masatake Iwasaki
>Priority: Minor
>
> The HDFS Architecture document is out of date with respect to the current 
> design of the system.
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
> There are multiple false statements and omissions of recent features.





[jira] [Updated] (HDFS-9505) HDFS Architecture documentation needs to be refreshed.

2015-12-13 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9505:
---
Attachment: HDFS-9505.001.patch

I attached 001.

* fixed the description about quotas and permissions.
* fixed the description about variable block size (HDFS-3689).
* added a description about the stale state of datanodes (HDFS-3703).
* fixed the description about snapshot support.
* fixed the description about client-side buffering.
* added hyperlinks.
* fixed formatting nits.


> HDFS Architecture documentation needs to be refreshed.
> --
>
> Key: HDFS-9505
> URL: https://issues.apache.org/jira/browse/HDFS-9505
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Chris Nauroth
>Assignee: Masatake Iwasaki
>Priority: Minor
> Attachments: HDFS-9505.001.patch
>
>
> The HDFS Architecture document is out of date with respect to the current 
> design of the system.
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
> There are multiple false statements and omissions of recent features.





[jira] [Updated] (HDFS-9505) HDFS Architecture documentation needs to be refreshed.

2015-12-13 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9505:
---
Attachment: HDFS-9505.001.patch

I reattached the patch to fix a formatting issue.

> HDFS Architecture documentation needs to be refreshed.
> --
>
> Key: HDFS-9505
> URL: https://issues.apache.org/jira/browse/HDFS-9505
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Chris Nauroth
>Assignee: Masatake Iwasaki
>Priority: Minor
> Attachments: HDFS-9505.001.patch
>
>
> The HDFS Architecture document is out of date with respect to the current 
> design of the system.
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
> There are multiple false statements and omissions of recent features.





[jira] [Commented] (HDFS-9505) HDFS Architecture documentation needs to be refreshed.

2015-12-13 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055447#comment-15055447
 ] 

Masatake Iwasaki commented on HDFS-9505:


The long lines in the markdown documentation are a side effect of the conversion 
from APT format. I added line breaks to the relevant lines in this jira for ease 
of future editing and diff tracking.

Please use {{git diff --word-diff}} to see the change of contents.


> HDFS Architecture documentation needs to be refreshed.
> --
>
> Key: HDFS-9505
> URL: https://issues.apache.org/jira/browse/HDFS-9505
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Chris Nauroth
>Assignee: Masatake Iwasaki
>Priority: Minor
> Attachments: HDFS-9505.001.patch
>
>
> The HDFS Architecture document is out of date with respect to the current 
> design of the system.
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
> There are multiple false statements and omissions of recent features.





[jira] [Commented] (HDFS-9535) Fix TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate

2015-12-11 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053798#comment-15053798
 ] 

Masatake Iwasaki commented on HDFS-9535:


bq. When one IBR is later received, the last block is completed (via 
addStoredBlock()). 

Yeah. You are right. Thanks for the analysis, [~liuml07].

> Fix TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate
> -
>
> Key: HDFS-9535
> URL: https://issues.apache.org/jira/browse/HDFS-9535
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Attachments: HDFS-9535.000.patch
>
>
> TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in 
> several Jenkins run (e.g., 
> https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
> failure is on the last {{assertNoReplicationWasPerformed}} check.





[jira] [Commented] (HDFS-8326) Documentation about when checkpoints are run is out of date

2015-12-11 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053828#comment-15053828
 ] 

Masatake Iwasaki commented on HDFS-8326:


Thanks, [~xyao].

> Documentation about when checkpoints are run is out of date
> ---
>
> Key: HDFS-8326
> URL: https://issues.apache.org/jira/browse/HDFS-8326
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.3.0
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Fix For: 2.8.0
>
> Attachments: HDFS-8326.001.patch, HDFS-8326.002.patch, 
> HDFS-8326.003.patch, HDFS-8326.004.patch, HDFS-8326.patch
>
>
> Apparently checkpointing by interval or transaction size are both supported 
> in at least HDFS 2.3, but the documentation does not reflect this.





[jira] [Commented] (HDFS-9535) Newly completed blocks in IBR should not be considered under-replicated too quickly

2015-12-11 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053897#comment-15053897
 ] 

Masatake Iwasaki commented on HDFS-9535:


bq. The guarantee Jing Zhao proposed in addStoredBlock makes sense to me.

I agree with the fix too. Should we add an {{if (!bc.isStriped())}} check 
around {{addExpectedReplicasToPending}}?


> Newly completed blocks in IBR should not be considered under-replicated too 
> quickly
> ---
>
> Key: HDFS-9535
> URL: https://issues.apache.org/jira/browse/HDFS-9535
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Attachments: HDFS-9535.000.patch, HDFS-9535.001.patch
>
>
> TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in 
> several Jenkins run (e.g., 
> https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
> failure is on the last {{assertNoReplicationWasPerformed}} check.





[jira] [Commented] (HDFS-9505) HDFS Architecture documentation needs to be refreshed.

2015-12-11 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052497#comment-15052497
 ] 

Masatake Iwasaki commented on HDFS-9505:


The statement that "all blocks in a file except the last block are the same 
size" is not always true after HDFS-3689.

> HDFS Architecture documentation needs to be refreshed.
> --
>
> Key: HDFS-9505
> URL: https://issues.apache.org/jira/browse/HDFS-9505
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Chris Nauroth
>Assignee: Masatake Iwasaki
>Priority: Minor
>
> The HDFS Architecture document is out of date with respect to the current 
> design of the system.
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
> There are multiple false statements and omissions of recent features.





[jira] [Commented] (HDFS-9535) Fix TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate

2015-12-10 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15051963#comment-15051963
 ] 

Masatake Iwasaki commented on HDFS-9535:


Thanks for reporting this.

{code}
final boolean b = commitBlock(lastBlock, commitBlock);
if (hasMinStorage(lastBlock)) {
  if (b && !bc.isStriped()) {
    addExpectedReplicasToPending(lastBlock);
  }
  completeBlock(lastBlock, false);
{code}

Hmm.. If the block is committed but does not satisfy {{if 
(hasMinStorage(lastBlock))}}, {{addExpectedReplicasToPending}} will never be 
called before {{completeBlock}} the next time {{commitOrCompleteLastBlock}} 
is invoked from the {{addBlock}} code path, because {{commitBlock}} returns 
false for an already committed block.

The {{if (b && !bc.isStriped())}} check uses the return value of {{commitBlock}} 
to avoid adding the same nodes to {{pendingReplication}} multiple times. How 
about replacing the condition with other logic, such as 
{{pendingReplications.getNumReplicas(lastBlock) == 0}}?


> Fix TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate
> -
>
> Key: HDFS-9535
> URL: https://issues.apache.org/jira/browse/HDFS-9535
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
>Priority: Minor
>
> TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in 
> several Jenkins run (e.g., 
> https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
> failure is on the last {{assertNoReplicationWasPerformed}} check.





[jira] [Commented] (HDFS-9535) Fix TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate

2015-12-10 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15051981#comment-15051981
 ] 

Masatake Iwasaki commented on HDFS-9535:


bq. For the first block, it is possible that NN receives 0 block-received msg 
from DN but still commit the block when the client tries to get the next block. 
In that case we will add the block into under-replicated queue instead of the 
pending queue, and block recovery will happen in the cluster.

{{BlockManager#processMisReplicatedBlock}} will not process an incomplete block. 
I think the case here is that the block is completed but there is still a pending 
block-received message.
{code}
if (!block.isComplete()) {
  // Incomplete blocks are never considered mis-replicated --
  // they'll be reached when they are completed or recovered.
  return MisReplicationResult.UNDER_CONSTRUCTION;
}
{code}

> Fix TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate
> -
>
> Key: HDFS-9535
> URL: https://issues.apache.org/jira/browse/HDFS-9535
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
>Priority: Minor
>
> TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in 
> several Jenkins run (e.g., 
> https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
> failure is on the last {{assertNoReplicationWasPerformed}} check.





[jira] [Commented] (HDFS-8326) Documentation about when checkpoints are run is out of date

2015-12-10 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052255#comment-15052255
 ] 

Masatake Iwasaki commented on HDFS-8326:


[~xyao], though the "Fix version/s" is set to 2.8.0, the patch was committed to 
trunk only. Could you commit this to branch-2? Syncing HdfsDesign.md between 
trunk and branch-2 would make it easier to maintain.

> Documentation about when checkpoints are run is out of date
> ---
>
> Key: HDFS-8326
> URL: https://issues.apache.org/jira/browse/HDFS-8326
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.3.0
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Fix For: 2.8.0
>
> Attachments: HDFS-8326.001.patch, HDFS-8326.002.patch, 
> HDFS-8326.003.patch, HDFS-8326.004.patch, HDFS-8326.patch
>
>
> Apparently checkpointing by interval or transaction size are both supported 
> in at least HDFS 2.3, but the documentation does not reflect this.





[jira] [Commented] (HDFS-9505) HDFS Architecture documentation needs to be refreshed.

2015-12-10 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052261#comment-15052261
 ] 

Masatake Iwasaki commented on HDFS-9505:


bq. "Currently, automatic restart and failover of the NameNode software to 
another machine is not supported."

This was fixed by HDFS-8914.


> HDFS Architecture documentation needs to be refreshed.
> --
>
> Key: HDFS-9505
> URL: https://issues.apache.org/jira/browse/HDFS-9505
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Chris Nauroth
>Assignee: Masatake Iwasaki
>Priority: Minor
>
> The HDFS Architecture document is out of date with respect to the current 
> design of the system.
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
> There are multiple false statements and omissions of recent features.





[jira] [Commented] (HDFS-9505) HDFS Architecture documentation needs to be refreshed.

2015-12-09 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048320#comment-15048320
 ] 

Masatake Iwasaki commented on HDFS-9505:


I would like to start working on this. If someone is already on this, ping me 
please.

> HDFS Architecture documentation needs to be refreshed.
> --
>
> Key: HDFS-9505
> URL: https://issues.apache.org/jira/browse/HDFS-9505
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Chris Nauroth
>Priority: Minor
>
> The HDFS Architecture document is out of date with respect to the current 
> design of the system.
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
> There are multiple false statements and omissions of recent features.





[jira] [Assigned] (HDFS-9505) HDFS Architecture documentation needs to be refreshed.

2015-12-09 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki reassigned HDFS-9505:
--

Assignee: Masatake Iwasaki

> HDFS Architecture documentation needs to be refreshed.
> --
>
> Key: HDFS-9505
> URL: https://issues.apache.org/jira/browse/HDFS-9505
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Chris Nauroth
>Assignee: Masatake Iwasaki
>Priority: Minor
>
> The HDFS Architecture document is out of date with respect to the current 
> design of the system.
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
> There are multiple false statements and omissions of recent features.





[jira] [Updated] (HDFS-9444) TestEditLogTailer#testNN1TriggersLogRolls fails bind exception.

2015-12-03 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9444:
---
Description: (was: 
https://builds.apache.org/job/Hadoop-Hdfs-trunk/2556/testReport/

{noformat}
java.net.BindException: Problem binding to [localhost:42477] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:444)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.apache.hadoop.ipc.Server.bind(Server.java:469)
at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:695)
at org.apache.hadoop.ipc.Server.<init>(Server.java:2464)
at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:945)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:535)
at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:510)
at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:787)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.<init>(NameNodeRpcServer.java:390)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:742)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:680)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:883)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:862)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1564)
at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1247)
at org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1016)
at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:891)
at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:823)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:482)
at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:441)
at org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testStandbyTriggersLogRolls(TestEditLogTailer.java:139)
at org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testNN1TriggersLogRolls(TestEditLogTailer.java:114)
{noformat})

> TestEditLogTailer#testNN1TriggersLogRolls fails bind exception.
> ---
>
> Key: HDFS-9444
> URL: https://issues.apache.org/jira/browse/HDFS-9444
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9444.001.patch, HDFS-9444.002.patch, 
> HDFS-9444.003.patch
>
>






[jira] [Updated] (HDFS-9444) Add utility to find set of available ephemeral ports to ServerSocketUtil

2015-12-03 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9444:
---
Attachment: HDFS-9444.004.patch

I added a fix to TestNameNodeMXBean and updated the summary. Thanks for the 
suggestion, [~xiaochen].

> Add utility to find set of available ephemeral ports to ServerSocketUtil
> 
>
> Key: HDFS-9444
> URL: https://issues.apache.org/jira/browse/HDFS-9444
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9444.001.patch, HDFS-9444.002.patch, 
> HDFS-9444.003.patch, HDFS-9444.004.patch
>
>
> Unit tests using MiniDFSCluster with namenode HA enabled need a set of port 
> numbers in advance. Because the namenodes talk to each other, we cannot set 
> the IPC port to 0 in the configuration to let each namenode pick its own 
> port. ServerSocketUtil should provide a utility to find a set of available 
> ephemeral port numbers for this.
> For example, TestEditLogTailer could fail due to {{java.net.BindException: 
> Address already in use}}.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2556/testReport/
> {noformat}
> java.net.BindException: Problem binding to [localhost:42477] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:444)
>   at sun.nio.ch.Net.bind(Net.java:436)
>   at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:469)
>   at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:695)
>   at org.apache.hadoop.ipc.Server.<init>(Server.java:2464)
>   at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:945)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:535)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:510)
>   at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:787)
>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.<init>(NameNodeRpcServer.java:390)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:742)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:680)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:883)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:862)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1564)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1247)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1016)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:891)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:823)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:482)
>   at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:441)
>   at org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testStandbyTriggersLogRolls(TestEditLogTailer.java:139)
>   at org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testNN1TriggersLogRolls(TestEditLogTailer.java:114)
> {noformat}





[jira] [Updated] (HDFS-9444) TestEditLogTailer#testNN1TriggersLogRolls fails bind exception.

2015-12-03 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9444:
---
Description: 
Unit tests using MiniDFSCluster with namenode HA enabled need a set of port 
numbers in advance. Because the namenodes talk to each other, we cannot set 
the IPC port to 0 in the configuration to let each namenode pick its own port. 
ServerSocketUtil should provide a utility to find a set of available ephemeral 
port numbers for this.

For example, TestEditLogTailer and TestNameNodeMXBean could fail due to 
{{java.net.BindException: Address already in use}}.

https://builds.apache.org/job/Hadoop-Hdfs-trunk/2556/testReport/
{noformat}
java.net.BindException: Problem binding to [localhost:42477] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:444)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.apache.hadoop.ipc.Server.bind(Server.java:469)
at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:695)
at org.apache.hadoop.ipc.Server.<init>(Server.java:2464)
at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:945)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:535)
at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:510)
at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:787)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.<init>(NameNodeRpcServer.java:390)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:742)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:680)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:883)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:862)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1564)
at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1247)
at org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1016)
at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:891)
at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:823)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:482)
at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:441)
at org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testStandbyTriggersLogRolls(TestEditLogTailer.java:139)
at org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testNN1TriggersLogRolls(TestEditLogTailer.java:114)
{noformat}
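A utility like the one proposed could be sketched as follows. This is a minimal illustration only, not the actual patch: the class name {{FreePortFinder}} and method {{getPorts}} are hypothetical, whereas the real {{ServerSocketUtil}} API may differ. The core idea is to bind port 0 repeatedly, holding every socket open until all ports are collected so the OS cannot hand out the same port twice, then release them.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.List;

public class FreePortFinder {
    /**
     * Returns {@code count} distinct ephemeral port numbers that were
     * available at the time of the call. Hypothetical sketch of the
     * proposed ServerSocketUtil helper.
     */
    public static List<Integer> getPorts(int count) throws IOException {
        List<ServerSocket> sockets = new ArrayList<>();
        List<Integer> ports = new ArrayList<>();
        try {
            for (int i = 0; i < count; i++) {
                // Binding to port 0 asks the OS to pick a free ephemeral port.
                ServerSocket s = new ServerSocket(0);
                sockets.add(s);   // keep it open so the port is not reissued
                ports.add(s.getLocalPort());
            }
        } finally {
            // Release all ports so the caller (e.g. MiniDFSCluster config)
            // can bind them.
            for (ServerSocket s : sockets) {
                s.close();
            }
        }
        return ports;
    }
}
```

Note the inherent race: another process can grab a port between {{close()}} and the test's own bind, so callers would still want a retry loop around cluster startup.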


> TestEditLogTailer#testNN1TriggersLogRolls fails bind exception.
> ---
>
> Key: HDFS-9444
> URL: https://issues.apache.org/jira/browse/HDFS-9444
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9444.001.patch, HDFS-9444.002.patch, 
> HDFS-9444.003.patch
>
>
> Unit tests using MiniDFSCluster with namenode HA enabled need a set of port 
> numbers in advance. Because the namenodes talk to each other, we cannot set 
> the IPC port to 0 in the configuration to let each namenode pick its own 
> port. ServerSocketUtil should provide a utility to find a set of available 
> ephemeral port numbers for this.
> For example, TestEditLogTailer and TestNameNodeMXBean could fail due to 
> {{java.net.BindException: Address already in use}}.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2556/testReport/
> {noformat}
> java.net.BindException: Problem binding to [localhost:42477] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:444)
>   at sun.nio.ch.Net.bind(Net.java:436)
>   at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:469)
>   at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:695)
>   at org.apache.hadoop.ipc.Server.<init>(Server.java:2464)
>   at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:945)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:535)
>   at

[jira] [Updated] (HDFS-9444) Add utility to find set of available ephemeral ports to ServerSocketUtil

2015-12-03 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9444:
---
Summary: Add utility to find set of available ephemeral ports to 
ServerSocketUtil  (was: TestEditLogTailer#testNN1TriggersLogRolls fails bind 
exception.)

> Add utility to find set of available ephemeral ports to ServerSocketUtil
> 
>
> Key: HDFS-9444
> URL: https://issues.apache.org/jira/browse/HDFS-9444
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9444.001.patch, HDFS-9444.002.patch, 
> HDFS-9444.003.patch
>
>
> Unit tests using MiniDFSCluster with namenode HA enabled need a set of port 
> numbers in advance. Because the namenodes talk to each other, we cannot set 
> the IPC port to 0 in the configuration to let each namenode pick its own 
> port. ServerSocketUtil should provide a utility to find a set of available 
> ephemeral port numbers for this.
> For example, TestEditLogTailer and TestNameNodeMXBean could fail due to 
> {{java.net.BindException: Address already in use}}.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2556/testReport/
> {noformat}
> java.net.BindException: Problem binding to [localhost:42477] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:444)
>   at sun.nio.ch.Net.bind(Net.java:436)
>   at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:469)
>   at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:695)
>   at org.apache.hadoop.ipc.Server.<init>(Server.java:2464)
>   at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:945)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:535)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:510)
>   at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:787)
>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.<init>(NameNodeRpcServer.java:390)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:742)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:680)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:883)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:862)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1564)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1247)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1016)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:891)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:823)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:482)
>   at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:441)
>   at org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testStandbyTriggersLogRolls(TestEditLogTailer.java:139)
>   at org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testNN1TriggersLogRolls(TestEditLogTailer.java:114)
> {noformat}




