RE: Listing large directories via WebHDFS

2016-10-19 Thread Brahma Reddy Battula
JFYI: HADOOP-12502 introduces a RemoteIterator on the client side, but it has 
not been committed yet.



--Brahma Reddy Battula

-Original Message-
From: Andrew Wang [mailto:andrew.w...@cloudera.com] 
Sent: 20 October 2016 05:48
To: Zhe Zhang
Cc: Xiao Chen; hdfs-dev@hadoop.apache.org
Subject: Re: Listing large directories via WebHDFS

If the issue is just "hadoop fs -ls -R /", one thing we can look into is making 
the Globber use the listStatus API that returns a RemoteIterator rather than a 
FileStatus[]. That'll use the client-side pagination Xiao mentioned for 
WebHDFS/HttpFS (though this is currently not in a 2.x release).
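
For illustration, here is a minimal sketch of what iterator-based listing
looks like to a caller (the path is made up, and this calls
FileSystem#listStatusIterator directly rather than going through the Globber
change itself):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

// Each hasNext()/next() may fetch the next batch of entries lazily, so the
// client never materializes the whole directory as one FileStatus[].
public class IteratorListing {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    RemoteIterator<FileStatus> it = fs.listStatusIterator(new Path("/big/dir"));
    while (it.hasNext()) {
      System.out.println(it.next().getPath());
    }
  }
}
{code}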

The general case is still hard, for the reason you mentioned.

Best,
Andrew

On Wed, Oct 19, 2016 at 2:40 PM, Zhe Zhang  wrote:

> Thanks Xiao!
>
> Seems like server-side throttling is still vulnerable to abusive users
> issuing large listing requests. Once such a request is scheduled, it will
> keep listing potentially millions of files without having to go through the
> IPC/RPC queue again. It does have to compete for the fsn lock though, thanks
> to this server-side throttling logic.
>
> On Wed, Oct 19, 2016 at 2:33 PM Xiao Chen  wrote:
>
> > Hi Zhe,
> >
> > Per my understanding, the runner in webhdfs goes to NamenodeWebHdfsMethods
> > <...9572ab24e6e/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java#L972>,
> > which eventually calls FSNamesystem#getListing. So it's still throttled on
> > the NN side. The DDoS part is up for discussion...
> >
> > Also, Andrew did some pagination features for webhdfs/httpfs via
> > https://issues.apache.org/jira/browse/HDFS-10784 and 
> > https://issues.apache.org/jira/browse/HDFS-10823, to provide better 
> > control.
> >
> > Best,
> >
> > -Xiao
> >
> > On Wed, Oct 19, 2016 at 2:08 PM, Zhe Zhang  wrote:
> >
> > Hi,
> >
> > The regular HDFS client (DistributedFileSystem) throttles the 
> > workload of listing large directories by dividing the work into 
> > batches, something
> like
> > below:
> > {code}
> > // fetch the first batch of entries in the directory
> > DirectoryListing thisListing = dfs.listPaths(
> > src, HdfsFileStatus.EMPTY_NAME);
> >  ..
> > if (!thisListing.hasMore()) { // got all entries of the directory
> >   FileStatus[] stats = new FileStatus[partialListing.length];
> > {code}
> >
> > However, WebHDFS doesn't seem to have this batching logic.
> > {code}
> >   @Override
> >   public FileStatus[] listStatus(final Path f) throws IOException {
> > final HttpOpParam.Op op = GetOpParam.Op.LISTSTATUS;
> > return new FsPathResponseRunner(op, f) {
> >   @Override
> >   FileStatus[] decodeResponse(Map json) {
> >   
> >   }
> > }.run();
> >   }
> > {code}
> >
> > Am I missing anything? So a user can DDoS by {{hadoop fs -ls -R /}} 
> > via WebHDFS?
> >
> >
> >
>




[jira] [Created] (HDFS-11039) Expose more configuration properties to hdfs-default.xml

2016-10-19 Thread Yi Liu (JIRA)
Yi Liu created HDFS-11039:
-

 Summary: Expose more configuration properties to hdfs-default.xml
 Key: HDFS-11039
 URL: https://issues.apache.org/jira/browse/HDFS-11039
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation, newbie
Reporter: Yi Liu
Assignee: Jennica Pounds
Priority: Minor


There are some configuration properties for HDFS that have not been exposed 
in hdfs-default.xml.

It would be convenient for Hadoop users/admins if we added them to 
hdfs-default.xml.






[jira] [Created] (HDFS-11038) DiskBalancer: support running multiple commands under one setup of disk balancer

2016-10-19 Thread Xiaobing Zhou (JIRA)
Xiaobing Zhou created HDFS-11038:


 Summary: DiskBalancer: support running multiple commands under one 
setup of disk balancer
 Key: HDFS-11038
 URL: https://issues.apache.org/jira/browse/HDFS-11038
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou


The disk balancer follows/reuses a rule designed by the HDFS balancer: only 
one instance is allowed to run at a time. This is correct in a production 
system, to avoid inconsistencies, but it is not ideal for writing and running 
unit tests. For example, it should be possible to run the plan, execute, and 
scan commands under one setup of the disk balancer. The one-instance rule will 
throw an exception complaining 'Another instance is running'. In such a case, 
there is no way to do a full life-cycle test that involves a sequence of 
commands.






[jira] [Created] (HDFS-11037) DiskBalancer: redirect stdout/stderr stream for easy tests

2016-10-19 Thread Xiaobing Zhou (JIRA)
Xiaobing Zhou created HDFS-11037:


 Summary: DiskBalancer: redirect stdout/stderr stream for easy tests
 Key: HDFS-11037
 URL: https://issues.apache.org/jira/browse/HDFS-11037
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou


Currently, the disk balancer command maintains a PrintStream to make it easy 
to buffer outputs for test verification. This is not a clean approach, as we 
might also need to add other print streams for inputs, outputs, and errors. 
The better way is to use System.setErr(), System.setIn(), and System.setOut() 
to do the stream redirection.
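
As a rough sketch of that approach (the printed line is just a stand-in for
the disk balancer command under test; nothing here is from the actual patch):

{code}
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Redirect System.out around the command under test, then restore it so
// other tests are unaffected; System.setErr()/System.setIn() work the same way.
public class StdoutRedirectSketch {
  public static void main(String[] args) {
    ByteArrayOutputStream captured = new ByteArrayOutputStream();
    PrintStream original = System.out;
    System.setOut(new PrintStream(captured));
    try {
      System.out.println("plan generated"); // stand-in for the command's output
    } finally {
      System.setOut(original); // always restore the global stream
    }
    System.out.println("captured: " + captured.toString().trim());
  }
}
{code}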






[jira] [Created] (HDFS-11036) Ozone : reuse Xceiver connection

2016-10-19 Thread Chen Liang (JIRA)
Chen Liang created HDFS-11036:
-

 Summary: Ozone : reuse Xceiver connection
 Key: HDFS-11036
 URL: https://issues.apache.org/jira/browse/HDFS-11036
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Chen Liang
Assignee: Chen Liang









Re: Listing large directories via WebHDFS

2016-10-19 Thread Andrew Wang
If the issue is just "hadoop fs -ls -R /", one thing we can look into is
making the Globber use the listStatus API that returns a RemoteIterator
rather than a FileStatus[]. That'll use the client-side pagination Xiao
mentioned for WebHDFS/HttpFS (though this is currently not in a 2.x
release).

The general case is still hard, for the reason you mentioned.

Best,
Andrew

On Wed, Oct 19, 2016 at 2:40 PM, Zhe Zhang  wrote:

> Thanks Xiao!
>
> Seems like server-side throttling is still vulnerable to abusive users
> issuing large listing requests. Once such a request is scheduled, it will
> keep listing potentially millions of files without having to go through the
> IPC/RPC queue again. It does have to compete for the fsn lock though, thanks
> to this server-side throttling logic.
>
> On Wed, Oct 19, 2016 at 2:33 PM Xiao Chen  wrote:
>
> > Hi Zhe,
> >
> > Per my understanding, the runner in webhdfs goes to NamenodeWebHdfsMethods
> > <...9572ab24e6e/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java#L972>,
> > which eventually calls FSNamesystem#getListing. So it's still throttled on
> > the NN side. The DDoS part is up for discussion...
> >
> > Also, Andrew did some pagination features for webhdfs/httpfs via
> > https://issues.apache.org/jira/browse/HDFS-10784 and
> > https://issues.apache.org/jira/browse/HDFS-10823, to provide better
> > control.
> >
> > Best,
> >
> > -Xiao
> >
> > On Wed, Oct 19, 2016 at 2:08 PM, Zhe Zhang  wrote:
> >
> > Hi,
> >
> > The regular HDFS client (DistributedFileSystem) throttles the workload of
> > listing large directories by dividing the work into batches, something
> like
> > below:
> > {code}
> > // fetch the first batch of entries in the directory
> > DirectoryListing thisListing = dfs.listPaths(
> > src, HdfsFileStatus.EMPTY_NAME);
> >  ..
> > if (!thisListing.hasMore()) { // got all entries of the directory
> >   FileStatus[] stats = new FileStatus[partialListing.length];
> > {code}
> >
> > However, WebHDFS doesn't seem to have this batching logic.
> > {code}
> >   @Override
> >   public FileStatus[] listStatus(final Path f) throws IOException {
> > final HttpOpParam.Op op = GetOpParam.Op.LISTSTATUS;
> > return new FsPathResponseRunner(op, f) {
> >   @Override
> >   FileStatus[] decodeResponse(Map json) {
> >   
> >   }
> > }.run();
> >   }
> > {code}
> >
> > Am I missing anything? So a user can DDoS by {{hadoop fs -ls -R /}} via
> > WebHDFS?
> >
> >
> >
>


Re: Listing large directories via WebHDFS

2016-10-19 Thread Zhe Zhang
Thanks Xiao!

Seems like server-side throttling is still vulnerable to abusive users
issuing large listing requests. Once such a request is scheduled, it will
keep listing potentially millions of files without having to go through the
IPC/RPC queue again. It does have to compete for the fsn lock though, thanks
to this server-side throttling logic.

On Wed, Oct 19, 2016 at 2:33 PM Xiao Chen  wrote:

> Hi Zhe,
>
> Per my understanding, the runner in webhdfs goes to NamenodeWebHdfsMethods,
> which eventually calls FSNamesystem#getListing. So it's still throttled on
> the NN side. The DDoS part is up for discussion...
>
> Also, Andrew did some pagination features for webhdfs/httpfs via
> https://issues.apache.org/jira/browse/HDFS-10784 and
> https://issues.apache.org/jira/browse/HDFS-10823, to provide better
> control.
>
> Best,
>
> -Xiao
>
> On Wed, Oct 19, 2016 at 2:08 PM, Zhe Zhang  wrote:
>
> Hi,
>
> The regular HDFS client (DistributedFileSystem) throttles the workload of
> listing large directories by dividing the work into batches, something like
> below:
> {code}
> // fetch the first batch of entries in the directory
> DirectoryListing thisListing = dfs.listPaths(
> src, HdfsFileStatus.EMPTY_NAME);
>  ..
> if (!thisListing.hasMore()) { // got all entries of the directory
>   FileStatus[] stats = new FileStatus[partialListing.length];
> {code}
>
> However, WebHDFS doesn't seem to have this batching logic.
> {code}
>   @Override
>   public FileStatus[] listStatus(final Path f) throws IOException {
> final HttpOpParam.Op op = GetOpParam.Op.LISTSTATUS;
> return new FsPathResponseRunner(op, f) {
>   @Override
>   FileStatus[] decodeResponse(Map json) {
>   
>   }
> }.run();
>   }
> {code}
>
> Am I missing anything? So a user can DDoS by {{hadoop fs -ls -R /}} via
> WebHDFS?
>
>
>


Re: Listing large directories via WebHDFS

2016-10-19 Thread Xiao Chen
Hi Zhe,

Per my understanding, the runner in webhdfs goes to NamenodeWebHdfsMethods,
which eventually calls FSNamesystem#getListing. So it's still throttled on
the NN side. The DDoS part is up for discussion...

Also, Andrew did some pagination features for webhdfs/httpfs via
https://issues.apache.org/jira/browse/HDFS-10784 and
https://issues.apache.org/jira/browse/HDFS-10823, to provide better control.

Best,

-Xiao

On Wed, Oct 19, 2016 at 2:08 PM, Zhe Zhang  wrote:

> Hi,
>
> The regular HDFS client (DistributedFileSystem) throttles the workload of
> listing large directories by dividing the work into batches, something like
> below:
> {code}
> // fetch the first batch of entries in the directory
> DirectoryListing thisListing = dfs.listPaths(
> src, HdfsFileStatus.EMPTY_NAME);
>  ..
> if (!thisListing.hasMore()) { // got all entries of the directory
>   FileStatus[] stats = new FileStatus[partialListing.length];
> {code}
>
> However, WebHDFS doesn't seem to have this batching logic.
> {code}
>   @Override
>   public FileStatus[] listStatus(final Path f) throws IOException {
> final HttpOpParam.Op op = GetOpParam.Op.LISTSTATUS;
> return new FsPathResponseRunner(op, f) {
>   @Override
>   FileStatus[] decodeResponse(Map json) {
>   
>   }
> }.run();
>   }
> {code}
>
> Am I missing anything? So a user can DDoS by {{hadoop fs -ls -R /}} via
> WebHDFS?
>


Listing large directories via WebHDFS

2016-10-19 Thread Zhe Zhang
Hi,

The regular HDFS client (DistributedFileSystem) throttles the workload of
listing large directories by dividing the work into batches, something like
below:
{code}
// fetch the first batch of entries in the directory
DirectoryListing thisListing = dfs.listPaths(
src, HdfsFileStatus.EMPTY_NAME);
 ..
if (!thisListing.hasMore()) { // got all entries of the directory
  FileStatus[] stats = new FileStatus[partialListing.length];
{code}

However, WebHDFS doesn't seem to have this batching logic.
{code}
  @Override
  public FileStatus[] listStatus(final Path f) throws IOException {
final HttpOpParam.Op op = GetOpParam.Op.LISTSTATUS;
return new FsPathResponseRunner(op, f) {
  @Override
  FileStatus[] decodeResponse(Map json) {
  
  }
}.run();
  }
{code}

Am I missing anything? So a user can DDoS by {{hadoop fs -ls -R /}} via
WebHDFS?


Apache Hadoop qbt Report: trunk+JDK8 on Linux/ppc64le

2016-10-19 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/129/

[Oct 18, 2016 6:02:52 PM] (varunsaxena) YARN-5743. [Atsv2] Publish queue name 
and RMAppMetrics to ATS (Rohith
[Oct 18, 2016 6:06:47 PM] (xgong) YARN-5718. TimelineClient (and other places 
in YARN) shouldn't
[Oct 18, 2016 8:16:02 PM] (stevel) HADOOP-13560. S3ABlockOutputStream to 
support huge (many GB) file
[Oct 18, 2016 9:05:43 PM] (xyao) HDFS-10906. Add unit tests for Trash with HDFS 
encryption zones.
[Oct 19, 2016 1:00:29 AM] (rkanter) MAPREDUCE-6791. remove unnecessary 
dependency from
[Oct 19, 2016 1:18:43 AM] (xiao) HADOOP-7352. FileSystem#listStatus should 
throw IOE upon access error.
[Oct 19, 2016 1:24:59 AM] (xiao) HADOOP-13693. Remove the message about HTTP 
OPTIONS in SPNEGO
[Oct 19, 2016 1:32:01 AM] (benoy) HADOOP-12082 Support multiple authentication 
schemes via
[Oct 19, 2016 5:42:28 AM] (xiao) HDFS-11009. Add a tool to reconstruct block 
meta file from CLI.




-1 overall


The following subsystems voted -1:
compile unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc javac


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Failed junit tests :

   hadoop.hdfs.server.datanode.TestDataNodeLifeline 
   hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.hdfs.TestCrcCorruption 
   hadoop.yarn.server.nodemanager.recovery.TestNMLeveldbStateStoreService 
   hadoop.yarn.server.nodemanager.TestNodeManagerShutdown 
   hadoop.yarn.server.timeline.TestRollingLevelDB 
   hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices 
   hadoop.yarn.server.timeline.TestTimelineDataManager 
   hadoop.yarn.server.timeline.TestLeveldbTimelineStore 
   hadoop.yarn.server.timeline.recovery.TestLeveldbTimelineStateStore 
   hadoop.yarn.server.timeline.TestRollingLevelDBTimelineStore 
   hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer 
   hadoop.yarn.server.timelineservice.storage.common.TestRowKeys 
   hadoop.yarn.server.timelineservice.storage.common.TestKeyConverters 
   hadoop.yarn.server.timelineservice.storage.common.TestSeparator 
   hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart 
   hadoop.yarn.server.resourcemanager.recovery.TestLeveldbRMStateStore 
   hadoop.yarn.server.resourcemanager.TestRMRestart 
   hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing 
   hadoop.yarn.server.resourcemanager.TestResourceTrackerService 
   hadoop.yarn.server.TestMiniYarnClusterNodeUtilization 
   hadoop.yarn.server.TestContainerManagerSecurity 
   hadoop.yarn.client.cli.TestLogsCLI 
   hadoop.yarn.client.api.impl.TestNMClient 
   hadoop.yarn.server.timeline.TestLevelDBCacheTimelineStore 
   hadoop.yarn.server.timeline.TestOverrideTimelineStoreYarnClient 
   hadoop.yarn.server.timeline.TestEntityGroupFSTimelineStore 
   hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorage 
   hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRunCompaction 
   hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun 
   hadoop.yarn.server.timelineservice.storage.TestPhoenixOfflineAggregationWriterImpl 
   hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage 
   hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity 
   hadoop.yarn.applications.distributedshell.TestDistributedShell 
   hadoop.mapred.TestShuffleHandler 
   hadoop.mapreduce.v2.hs.TestHistoryServerLeveldbStateStoreService 
   hadoop.fs.azure.TestNativeAzureFileSystemOperationsMocked 

Timed out junit tests :

   org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache 
   org.apache.hadoop.mapred.TestMRIntermediateDataEncryption 
  

   compile:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/129/artifact/out/patch-compile-root.txt
  [316K]

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/129/artifact/out/patch-compile-root.txt
  [316K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/129/artifact/out/patch-compile-root.txt
  [316K]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/129/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [404K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/129/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
  [52K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/129/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt
  

[jira] [Created] (HDFS-11035) Better documentation for maintenance mode and upgrade domain

2016-10-19 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HDFS-11035:
--

 Summary: Better documentation for maintenance mode and upgrade domain
 Key: HDFS-11035
 URL: https://issues.apache.org/jira/browse/HDFS-11035
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, documentation
Affects Versions: 2.9.0
Reporter: Wei-Chiu Chuang


HDFS-7541 added upgrade domain and HDFS-7877 added maintenance mode. Existing 
documentation about these two features is scarce, and the implementation has 
evolved from the original design docs. Looking at the code and Javadoc, I 
still don't quite get how to put datanodes into maintenance mode or set up an 
upgrade domain.

Filing this jira to propose that we write an up-to-date description of these 
two features.






[jira] [Created] (HDFS-11034) Provide a command line tool to clear decommissioned DataNode information from the NameNode without restarting.

2016-10-19 Thread Chris Nauroth (JIRA)
Chris Nauroth created HDFS-11034:


 Summary: Provide a command line tool to clear decommissioned 
DataNode information from the NameNode without restarting.
 Key: HDFS-11034
 URL: https://issues.apache.org/jira/browse/HDFS-11034
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Chris Nauroth


Information about decommissioned DataNodes remains tracked in the NameNode for 
the entire NameNode process lifetime.  Currently, the only way to clear this 
information is to restart the NameNode.  This issue proposes to add a way to 
clear this information online, without requiring a process restart.






Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2016-10-19 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/

[Oct 18, 2016 8:51:08 AM] (kai.zheng) HDFS-10920. 
TestStorageMover#testNoSpaceDisk is failing intermittently.
[Oct 18, 2016 6:02:52 PM] (varunsaxena) YARN-5743. [Atsv2] Publish queue name 
and RMAppMetrics to ATS (Rohith
[Oct 18, 2016 6:06:47 PM] (xgong) YARN-5718. TimelineClient (and other places 
in YARN) shouldn't
[Oct 18, 2016 8:16:02 PM] (stevel) HADOOP-13560. S3ABlockOutputStream to 
support huge (many GB) file
[Oct 18, 2016 9:05:43 PM] (xyao) HDFS-10906. Add unit tests for Trash with HDFS 
encryption zones.
[Oct 19, 2016 1:00:29 AM] (rkanter) MAPREDUCE-6791. remove unnecessary 
dependency from
[Oct 19, 2016 1:18:43 AM] (xiao) HADOOP-7352. FileSystem#listStatus should 
throw IOE upon access error.
[Oct 19, 2016 1:24:59 AM] (xiao) HADOOP-13693. Remove the message about HTTP 
OPTIONS in SPNEGO
[Oct 19, 2016 1:32:01 AM] (benoy) HADOOP-12082 Support multiple authentication 
schemes via
[Oct 19, 2016 5:42:28 AM] (xiao) HDFS-11009. Add a tool to reconstruct block 
meta file from CLI.




-1 overall


The following subsystems voted -1:
asflicense findbugs unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-common-project/hadoop-kms 
   Exception is caught when Exception is not thrown in 
org.apache.hadoop.crypto.key.kms.server.KMS.createKey(Map) At KMS.java:is not 
thrown in org.apache.hadoop.crypto.key.kms.server.KMS.createKey(Map) At 
KMS.java:[line 169] 
   Exception is caught when Exception is not thrown in 
org.apache.hadoop.crypto.key.kms.server.KMS.generateEncryptedKeys(String, 
String, int) At KMS.java:is not thrown in 
org.apache.hadoop.crypto.key.kms.server.KMS.generateEncryptedKeys(String, 
String, int) At KMS.java:[line 501] 

Failed junit tests :

   hadoop.hdfs.server.datanode.TestDirectoryScanner 
   hadoop.hdfs.TestRollingUpgrade 
   hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices 
   hadoop.yarn.server.TestContainerManagerSecurity 
   hadoop.yarn.server.TestMiniYarnClusterNodeUtilization 
   hadoop.yarn.client.cli.TestLogsCLI 
   hadoop.mapred.gridmix.TestResourceUsageEmulators 
   hadoop.fs.azure.TestNativeAzureFileSystemOperationsMocked 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/diff-compile-javac-root.txt
  [168K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/diff-checkstyle-root.txt
  [16M]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/diff-patch-pylint.txt
  [16K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/diff-patch-shellcheck.txt
  [20K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/diff-patch-shelldocs.txt
  [16K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/whitespace-eol.txt
  [11M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/whitespace-tabs.txt
  [1.3M]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/branch-findbugs-hadoop-common-project_hadoop-kms-warnings.html
  [8.0K]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/diff-javadoc-javadoc-root.txt
  [2.2M]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [148K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt
  [268K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-nativetask.txt
  [124K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/199/artifact/out/patch-unit-hadoop-tools_hadoop-gridmix.txt
  [16K]
   
https://builds.apache.org/job/hadoop-qbt-trun

[DISCUSS] HADOOP-13603 - Remove package line length checkstyle rule

2016-10-19 Thread Shane Kumpf
All,

I would like to start a discussion on the possibility of removing the
package line length checkstyle rule (HADOOP-13603).

While I was working on various aspects of YARN container runtimes, all of my
pre-commit jobs would fail because the package line length exceeded 80
characters. While I'm all for automated checks, I feel checks need to be
enforceable and provide value. Fixing the package line length error does not
improve the readability or maintainability of the code, and IMO the rule
should be removed.

While on this topic, are there other automated checks that are difficult to
enforce or you feel are not providing value (perhaps the 150 line method
length)?

Please share your thoughts.

Thank you,
Shane Kumpf


[jira] [Created] (HDFS-11033) Add documents for native raw erasure coder in XOR codes

2016-10-19 Thread SammiChen (JIRA)
SammiChen created HDFS-11033:


 Summary: Add documents for native raw erasure coder in XOR codes
 Key: HDFS-11033
 URL: https://issues.apache.org/jira/browse/HDFS-11033
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: SammiChen
Assignee: SammiChen









[jira] [Created] (HDFS-11032) [SPS]: Handling of block movement failure at the coordinator datanode

2016-10-19 Thread Rakesh R (JIRA)
Rakesh R created HDFS-11032:
---

 Summary: [SPS]: Handling of block movement failure at the 
coordinator datanode
 Key: HDFS-11032
 URL: https://issues.apache.org/jira/browse/HDFS-11032
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R


The idea of this jira is to discuss and implement efficient failure (block 
movement failure) handling logic at the datanode coordinator. [Code 
reference|https://github.com/apache/hadoop/blob/HDFS-10285/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/StoragePolicySatisfyWorker.java#L243].

Following are the possible errors during block movement:
# Network errors (IOException) - provide retries (maybe hard-coded at 2) if 
the block storage movement fails due to network errors. If it still ends up 
with errors after 2 retries, then mark it as failure/retry to the NN.
# No disk space (IOException) - no retries; mark as failure/retry to the NN.
# Block pinned - no retries; mark as success/no-retry to the NN. It is not 
possible to relocate this block to another datanode.
# Gen_Stamp mismatches - no retries; mark as failure/retry to the NN. It 
could be the case that the file has been re-opened.
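
A minimal sketch of this decision table (all exception and status names are
illustrative, not taken from the HDFS-10285 branch):

{code}
// Illustrative only: classify a block-movement error into a retry decision.
public class MovementFailureSketch {
  static final int MAX_NETWORK_RETRIES = 2;

  enum Decision { RETRY_LOCALLY, REPORT_FAILURE_RETRY, REPORT_SUCCESS_NO_RETRY }

  // Stand-ins for the error categories listed above.
  static class NetworkException extends Exception {}
  static class NoDiskSpaceException extends Exception {}
  static class BlockPinnedException extends Exception {}
  static class GenStampMismatchException extends Exception {}

  static Decision classify(Exception e, int networkAttempts) {
    if (e instanceof NetworkException) {
      // bounded retries before reporting failure/retry to the NN
      return networkAttempts < MAX_NETWORK_RETRIES
          ? Decision.RETRY_LOCALLY : Decision.REPORT_FAILURE_RETRY;
    }
    if (e instanceof BlockPinnedException) {
      // a pinned block cannot be relocated; report success/no-retry
      return Decision.REPORT_SUCCESS_NO_RETRY;
    }
    // NoDiskSpaceException, GenStampMismatchException, or anything else:
    // no local retries, report failure/retry so the NN can reschedule
    return Decision.REPORT_FAILURE_RETRY;
  }
}
{code}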


