[jira] [Updated] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-13489:

Description: 
I was troubleshooting HBASE-14450, where at the end of BackupDistCp#execute() the 
distcp job was marked unsuccessful (BackupDistCp is a wrapper of DistCp).
Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value from 
copyService.copy() was 0.

Here is related code from DistCp:
{code}
try {
  execute();
} catch (InvalidInputException e) {
  LOG.error("Invalid input: ", e);
  return DistCpConstants.INVALID_ARGUMENT;
} catch (DuplicateFileException e) {
  LOG.error("Duplicate files in input path: ", e);
  return DistCpConstants.DUPLICATE_INPUT;
} catch (AclsNotSupportedException e) {
  LOG.error("ACLs not supported on at least one file system: ", e);
  return DistCpConstants.ACLS_NOT_SUPPORTED;
} catch (XAttrsNotSupportedException e) {
  LOG.error("XAttrs not supported on at least one file system: ", e);
  return DistCpConstants.XATTRS_NOT_SUPPORTED;
} catch (Exception e) {
  LOG.error("Exception encountered ", e);
  return DistCpConstants.UNKNOWN_ERROR;
}
return DistCpConstants.SUCCESS;
{code}
We don't check whether the Job returned by execute() was successful; we rely 
on all failure cases going through the last catch clause, but there may be 
special cases.
Even if the Job fails, DistCpConstants.SUCCESS is returned.
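
A minimal sketch of the kind of check being proposed (illustrative only, not 
the attached patch; it reuses the names from the snippet above):
{code}
Job job;
try {
  job = execute();
} catch (Exception e) {
  LOG.error("Exception encountered ", e);
  return DistCpConstants.UNKNOWN_ERROR;
}
// Hypothetical guard: only report SUCCESS when the job actually succeeded.
if (job == null || !job.isSuccessful()) {
  return DistCpConstants.UNKNOWN_ERROR;
}
return DistCpConstants.SUCCESS;
{code}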

  was:
I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value from 
copyService.copy() was 0.

Here is related code from DistCp:
{code}
try {
  execute();
} catch (InvalidInputException e) {
  LOG.error("Invalid input: ", e);
  return DistCpConstants.INVALID_ARGUMENT;
} catch (DuplicateFileException e) {
  LOG.error("Duplicate files in input path: ", e);
  return DistCpConstants.DUPLICATE_INPUT;
} catch (AclsNotSupportedException e) {
  LOG.error("ACLs not supported on at least one file system: ", e);
  return DistCpConstants.ACLS_NOT_SUPPORTED;
} catch (XAttrsNotSupportedException e) {
  LOG.error("XAttrs not supported on at least one file system: ", e);
  return DistCpConstants.XATTRS_NOT_SUPPORTED;
} catch (Exception e) {
  LOG.error("Exception encountered ", e);
  return DistCpConstants.UNKNOWN_ERROR;
}
return DistCpConstants.SUCCESS;
{code}
We don't check whether the Job returned by execute() was successful.
Even if the Job fails, DistCpConstants.SUCCESS is returned.


> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> HADOOP-13489.v3.patch, MapReduceBackupCopyService.java, 
> TestIncrementalBackup-output.txt, testIncrementalBackup-8-12-rethrow.txt, 
> testIncrementalBackup-8-12.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful - we rely 
> on all failure cases going through the last catch clause but there may be 
> special case.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.




[jira] [Commented] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419765#comment-15419765
 ] 

Ted Yu commented on HADOOP-13489:
-

I think DistCp#run() should be self-sufficient in that it doesn't return 
SUCCESS unless the distcp job (maybe submitted through a wrapper) completes and 
succeeds.

> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> HADOOP-13489.v3.patch, MapReduceBackupCopyService.java, 
> TestIncrementalBackup-output.txt, testIncrementalBackup-8-12-rethrow.txt, 
> testIncrementalBackup-8-12.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.






[jira] [Updated] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-13489:

Attachment: testIncrementalBackup-8-12-rethrow.txt

Added the re-throw:
{code}
  } catch (Throwable t) {
LOG.debug("distcp " + job.getJobID() + " encountered error", t);
throw t;
{code}
As can be seen in the attached test log, there was no occurrence of the log 
message above.

> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> HADOOP-13489.v3.patch, MapReduceBackupCopyService.java, 
> TestIncrementalBackup-output.txt, testIncrementalBackup-8-12-rethrow.txt, 
> testIncrementalBackup-8-12.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.






[jira] [Commented] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419753#comment-15419753
 ] 

Ted Yu commented on HADOOP-13489:
-

I agree that ignoring Throwable is not good practice.
But if you look at the attached test output, there was no ' encountered error' 
line.

> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> HADOOP-13489.v3.patch, MapReduceBackupCopyService.java, 
> TestIncrementalBackup-output.txt, testIncrementalBackup-8-12.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.






[jira] [Commented] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419745#comment-15419745
 ] 

Mingliang Liu commented on HADOOP-13489:


Thanks for providing more details about the test.

There is no sign that {{DistCp}} per se is doing anything wrong. Moreover, there is no 
obvious proof that downstream applications are using {{DistCp}} correctly. 
This generally makes the proposal and patch not applicable.

I'm a bit surprised by the fact that DistCp was extended and its {{execute()}} 
method was overridden. A broader question is: is {{DistCp}} designed for 
this? Regardless (as I'm not a DistCp/HBase expert), I think the bigger 
problem here is that {{BackupDistCp}} is flawed in that it swallows all the 
exceptions, as follows. This breaks the contract between {{DistCp#run()}} and 
{{DistCp#execute()}}.
{code:title=MapReduceBackupCopyService.java$BackupDistCp}
  class BackupDistCp extends DistCp {
...
@Override
public Job execute() throws Exception {
  try {
  ...
  } catch (Throwable t) {
LOG.debug("distcp " + job.getJobID() + " encountered error", t);
  } finally {
  ...
{code}
The {{execute()}} method is invoked by the {{DistCp#run()}} method, which 
(again) checks the status of the job via exception handling. {{DistCp#run()}} 
cannot do anything useful if all the exceptions were already swallowed. This is 
the root cause of the misuse problem in HBase. As a result, we couldn't expect 
"Exception encountered " to appear in the log.
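
A minimal sketch of an override that would preserve that contract (hypothetical; 
simplified from the attached MapReduceBackupCopyService.java):
{code:title=BackupDistCp (sketch)}
@Override
public Job execute() throws Exception {
  Job job = null;
  try {
    job = super.execute();  // the real class has its own submission logic
    return job;
  } catch (Throwable t) {
    LOG.error("distcp " + (job == null ? "job" : job.getJobID())
        + " encountered error", t);
    throw t;  // let DistCp#run() translate the failure into an error code
  } finally {
    // cleanup would go here
  }
}
{code}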

I don't think HDFS is obliged to do anything here. Will close this JIRA if no 
objections in 24 hours.

> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> HADOOP-13489.v3.patch, MapReduceBackupCopyService.java, 
> TestIncrementalBackup-output.txt, testIncrementalBackup-8-12.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.






[jira] [Commented] (HADOOP-13344) Add option to exclude Hadoop's SLF4J binding

2016-08-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419727#comment-15419727
 ] 

Allen Wittenauer commented on HADOOP-13344:
---

Maybe we are thinking about the problem wrong.  Could we check to see if a 
binding already exists and if not add one to the classpath?

> Add option to exclude Hadoop's SLF4J binding
> 
>
> Key: HADOOP-13344
> URL: https://issues.apache.org/jira/browse/HADOOP-13344
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: bin, scripts
>Affects Versions: 2.8.0, 2.7.2
>Reporter: Thomas Poepping
>Assignee: Thomas Poepping
>  Labels: patch
> Attachments: HADOOP-13344.patch
>
>
> If another application that uses the Hadoop classpath brings in its own SLF4J 
> binding for logging, and that jar is not the exact same as the one brought in 
> by Hadoop, then there will be a conflict between logging jars between the two 
> classpaths. This patch introduces an optional setting to remove Hadoop's 
> SLF4J binding from the classpath, to get rid of this problem.
> This patch should be applied to 2.8.0, as bin/ and hadoop-config.sh structure 
> has been changed in 3.0.0.






[jira] [Commented] (HADOOP-12956) Inevitable Log4j2 migration via slf4j

2016-08-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419718#comment-15419718
 ] 

Allen Wittenauer commented on HADOOP-12956:
---

Again, to re-iterate:

If we have to make such a drastically incompatible change that's going to break 
all of our users' configuration files, then we should probably investigate 
whether we want to stay on log4j or move to something else (logback?).  

> Inevitable Log4j2 migration via slf4j
> -
>
> Key: HADOOP-12956
> URL: https://issues.apache.org/jira/browse/HADOOP-12956
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Gopal V
>
> {{5 August 2015 --The Apache Logging Services™ Project Management Committee 
> (PMC) has announced that the Log4j™ 1.x logging framework has reached its end 
> of life (EOL) and is no longer officially supported.}}
> https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces
> A whole framework log4j2 upgrade has to be synchronized, partly for improved 
> performance brought about by log4j2.
> https://logging.apache.org/log4j/2.x/manual/async.html#Performance






[jira] [Comment Edited] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419706#comment-15419706
 ] 

Ted Yu edited comment on HADOOP-13489 at 8/13/16 12:08 AM:
---

testIncrementalBackup-8-12.txt is the test output.

I cloned branch-2.7.3 locally, applied patch v1 (blocking mode in effect) and 
ran the test with the following command:

mvn test -PrunAllTests -DfailIfNoTests=false -Dhadoop-two.version=2.7.3 
-Dtest=TestIncrementalBackup


was (Author: yuzhih...@gmail.com):
testIncrementalBackup-8-12.txt is the test output.

> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> HADOOP-13489.v3.patch, MapReduceBackupCopyService.java, 
> TestIncrementalBackup-output.txt, testIncrementalBackup-8-12.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.






[jira] [Updated] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-13489:

Attachment: MapReduceBackupCopyService.java

MapReduceBackupCopyService.java is the HBase class containing BackupDistCp, 
which wraps DistCp.

> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> HADOOP-13489.v3.patch, MapReduceBackupCopyService.java, 
> TestIncrementalBackup-output.txt, testIncrementalBackup-8-12.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.






[jira] [Updated] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-13489:

Attachment: testIncrementalBackup-8-12.txt

testIncrementalBackup-8-12.txt is the test output.

> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> HADOOP-13489.v3.patch, TestIncrementalBackup-output.txt, 
> testIncrementalBackup-8-12.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.






[jira] [Updated] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-13489:

Status: Open  (was: Patch Available)

> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> HADOOP-13489.v3.patch, TestIncrementalBackup-output.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.






[jira] [Commented] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419703#comment-15419703
 ] 

Ted Yu commented on HADOOP-13489:
-

Discussed with Mingliang offline about the case I observed in the hbase test.

The intricacy was that although the distcp job wasn't successful, the following 
code wasn't executed - I don't see "Exception encountered " in the log:
{code}
} catch (Exception e) {
  LOG.error("Exception encountered ", e);
  return DistCpConstants.UNKNOWN_ERROR;
{code}
Will attach hbase test output and related class shortly.

> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> HADOOP-13489.v3.patch, TestIncrementalBackup-output.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.






[jira] [Commented] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419692#comment-15419692
 ] 

Hadoop QA commented on HADOOP-13489:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m  
4s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m  
5s{color} | {color:green} hadoop-distcp in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m  8s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12823558/HADOOP-13489.v3.patch 
|
| JIRA Issue | HADOOP-13489 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 85b4b6e4b547 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 23c6e3c |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/10237/testReport/ |
| modules | C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/10237/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> HADOOP-13489.v3.patch, 

[jira] [Commented] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419668#comment-15419668
 ] 

Mingliang Liu commented on HADOOP-13489:


I'm not in favor of the non-blocking case change, which is the core of the 
patch. Checking the job state in {{non-blocking}} mode makes little, if any, 
sense to me. If the user runs in {{non-blocking}} (a.k.a {{-async}}) mode, she 
should not expect the job be completed just after its submission. By the way, 
the it's {{blocking}} mode by default. Instead, she, as a downstream user of 
DistCp, should call {{DistCp#execute()}} directly and probes the job state 
manually. In [MAPREDUCE-6248], [~jingzhao] enabled users to get the MR job 
information for distcp. So {{DistCp#execute()}} instead of {{DistCp#run()}} 
should be right usage of the {{non-blocking}} distcp along with probing exit 
job state, IMO.
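
For illustration, a downstream {{-async}} caller along these lines might look 
like the following (a hypothetical sketch; it assumes {{conf}} and a fully 
built {{DistCpOptions}} with blocking disabled):
{code}
DistCp distcp = new DistCp(conf, options);
Job job = distcp.execute();        // returns right after job submission
while (!job.isComplete()) {        // the caller probes the job state itself
  Thread.sleep(5000);
}
if (!job.isSuccessful()) {
  throw new IOException("DistCp job " + job.getJobID() + " failed: "
      + job.getStatus().getFailureInfo());
}
{code}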

And for the blocking case, as we agreed in the discussion above, it's not a 
problem at all.

> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> HADOOP-13489.v3.patch, TestIncrementalBackup-output.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.






[jira] [Updated] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-13489:

Attachment: HADOOP-13489.v3.patch

> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> HADOOP-13489.v3.patch, TestIncrementalBackup-output.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.






[jira] [Commented] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419648#comment-15419648
 ] 

Ted Yu commented on HADOOP-13489:
-

For #1, I converted the null check to an assertion in patch v3.

w.r.t. #2, for the blocking case, if waitForJobCompletion() throws an IOException, 
it would be caught by run(), resulting in DistCpConstants.UNKNOWN_ERROR being 
returned.

For the non-blocking case, I am adding a check of job.isComplete() to the if 
statement.
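
A sketch of the resulting check (illustrative only; the actual patch may differ):
{code}
// Hypothetical shape of the check in run(): the null check becomes an
// assertion, and a still-running -async job is not treated as a failure.
assert job != null : "execute() should never return a null job";
if (job.isComplete() && !job.isSuccessful()) {
  return DistCpConstants.UNKNOWN_ERROR;
}
return DistCpConstants.SUCCESS;
{code}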

For the javadoc, there are many return values for different failures. To be addressed 
in a separate JIRA?

Thanks

> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> TestIncrementalBackup-output.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.






[jira] [Commented] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419635#comment-15419635
 ] 

Mingliang Liu commented on HADOOP-13489:


Thanks for reporting and working on this, [~tedyu].

The v1 patch checks that the job is not null and that the job state is 
SUCCEEDED by directly calling {{isSuccessful()}}. My concerns are:
# Basically, {{DistCp#run()}} checks the status of the job via exception 
handling. All exceptions, if not caught earlier, will be caught by the last generic 
{{Exception}} block, which returns DistCpConstants.UNKNOWN_ERROR. Is it possible 
that the job is null while there is no exception? (I think here the job should 
never be null.)
#- If not, the {{if (job != null) {}} check is not needed, or better, should be 
stated as an assert.
#- If so, I think for a null job we should still return 
DistCpConstants.UNKNOWN_ERROR instead of DistCpConstants.SUCCESS.
#  We have to respect the {{DistCpOptions#blocking}} option ({{-async}} from 
command line parameters):
#- If the distcp option is blocking, {{DistCp#waitForJobCompletion()}} will be 
called, which will throw an IOE in case the job is not successful.
{code}
  /**
   * Wait for the given job to complete.
   * @param job the given mapreduce job that has already been submitted
   */
  public void waitForJobCompletion(Job job) throws Exception {
assert job != null;
if (!job.waitForCompletion(true)) {
  throw new IOException("DistCp failure: Job " + job.getJobID()
  + " has failed: " + job.getStatus().getFailureInfo());
}
  }
{code}
So if the Job fails, DistCpConstants.SUCCESS is NOT returned?
#- If the distcp option is non-blocking, {{waitForJobCompletion()}} will not be 
called, in which case the distcp just returns after the job is submitted. The 
internal job state can still be RUNNING instead of SUCCEEDED. What should 
DistCp return in this case? I think we can simply return SUCCESS, as the 
existing code does.

FWIW, the {{@return}} javadoc of the {{DistCp#run()}} method has been wrong for 
years.

> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> TestIncrementalBackup-output.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.






[jira] [Assigned] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HADOOP-13489:
---

Assignee: Ted Yu

> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> TestIncrementalBackup-output.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.






[jira] [Updated] (HADOOP-13475) Adding Append Blob support for WASB

2016-08-12 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HADOOP-13475:
-
Target Version/s: 2.9.0  (was: 2.8.0)
   Fix Version/s: (was: 2.8.0)

[~Raulmsm], it's already assigned to you - likely done by Chris.

BTW, dropping the 2.8.0 version as we are close to an RC there.

> Adding Append Blob support for WASB
> ---
>
> Key: HADOOP-13475
> URL: https://issues.apache.org/jira/browse/HADOOP-13475
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: azure
>Affects Versions: 2.7.1
>Reporter: Raul da Silva Martins
>Assignee: Raul da Silva Martins
>Priority: Critical
> Attachments: 0001-Added-Support-for-Azure-AppendBlobs.patch
>
>
> Currently the WASB implementation of the HDFS interface does not support the 
> utilization of Azure AppendBlobs underneath. As owners of a large scale 
> service who intend to start writing to Append blobs, we need this support in 
> order to be able to keep using our HDI capabilities.
> This JIRA is added to implement Azure AppendBlob support to WASB.






[jira] [Updated] (HADOOP-13494) ReconfigurableBase can log sensitive information

2016-08-12 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HADOOP-13494:
---
Attachment: HADOOP-13494.001.patch

Attaching a patch with a simple approach. It'd be nice to have some mechanism 
where modules that define config parameters (e.g. Constants.java under s3a) can 
say which ones are sensitive, but that seems like a recipe for over-engineering to 
me... I think a list of patterns will be best. Any suggestions of other 
patterns that belong on the list would be welcome...
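
For illustration, the pattern-list approach could look something like this (a 
hypothetical sketch, not the attached patch; the key patterns are made up):
{code}
private static final List<Pattern> SENSITIVE_CONFIG_KEYS = Arrays.asList(
    Pattern.compile(".*password.*", Pattern.CASE_INSENSITIVE),
    Pattern.compile(".*secret.*", Pattern.CASE_INSENSITIVE),
    Pattern.compile(".*\\.key$", Pattern.CASE_INSENSITIVE));

private static String redactIfSensitive(String key, String value) {
  for (Pattern p : SENSITIVE_CONFIG_KEYS) {
    if (p.matcher(key).matches()) {
      return "<redacted>";
    }
  }
  return value;
}
{code}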

> ReconfigurableBase can log sensitive information
> 
>
> Key: HADOOP-13494
> URL: https://issues.apache.org/jira/browse/HADOOP-13494
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-13494.001.patch
>
>
> ReconfigurableBase will log old and new configuration values, which may cause 
> sensitive parameters (most notably cloud storage keys, though there may be 
> other instances) to get included in the logs. 
> Given the currently small list of reconfigurable properties, an argument 
> could be made for simply not logging the property values at all, but this is 
> not the only instance where potentially sensitive configuration gets written 
> somewhere else in plaintext. I think a generic mechanism for redacting 
> sensitive information for textual display will be useful to some of the web 
> UIs too.






[jira] [Commented] (HADOOP-13344) Add option to exclude Hadoop's SLF4J binding

2016-08-12 Thread Thomas Poepping (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419433#comment-15419433
 ] 

Thomas Poepping commented on HADOOP-13344:
--

I would say we can't unconditionally remove the binding, because Hadoop code 
itself will still need an SLF4J binding to accurately log what it tries to 
log. I'll work on getting my updated patch posted here.

> Add option to exclude Hadoop's SLF4J binding
> 
>
> Key: HADOOP-13344
> URL: https://issues.apache.org/jira/browse/HADOOP-13344
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: bin, scripts
>Affects Versions: 2.8.0, 2.7.2
>Reporter: Thomas Poepping
>Assignee: Thomas Poepping
>  Labels: patch
> Attachments: HADOOP-13344.patch
>
>
> If another application that uses the Hadoop classpath brings in its own SLF4J 
> binding for logging, and that jar is not the exact same as the one brought in 
> by Hadoop, then there will be a conflict between logging jars between the two 
> classpaths. This patch introduces an optional setting to remove Hadoop's 
> SLF4J binding from the classpath, to get rid of this problem.
> This patch should be applied to 2.8.0, as bin/ and hadoop-config.sh structure 
> has been changed in 3.0.0.






[jira] [Updated] (HADOOP-13494) ReconfigurableBase can log sensitive information

2016-08-12 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HADOOP-13494:
---
Attachment: (was: HADOOP-13494.001.patch)

> ReconfigurableBase can log sensitive information
> 
>
> Key: HADOOP-13494
> URL: https://issues.apache.org/jira/browse/HADOOP-13494
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>
> ReconfigurableBase will log old and new configuration values, which may cause 
> sensitive parameters (most notably cloud storage keys, though there may be 
> other instances) to get included in the logs. 
> Given the currently small list of reconfigurable properties, an argument 
> could be made for simply not logging the property values at all, but this is 
> not the only instance where potentially sensitive configuration gets written 
> somewhere else in plaintext. I think a generic mechanism for redacting 
> sensitive information for textual display will be useful to some of the web 
> UIs too.






[jira] [Updated] (HADOOP-13494) ReconfigurableBase can log sensitive information

2016-08-12 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HADOOP-13494:
---
Attachment: HADOOP-13494.001.patch

> ReconfigurableBase can log sensitive information
> 
>
> Key: HADOOP-13494
> URL: https://issues.apache.org/jira/browse/HADOOP-13494
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-13494.001.patch
>
>
> ReconfigurableBase will log old and new configuration values, which may cause 
> sensitive parameters (most notably cloud storage keys, though there may be 
> other instances) to get included in the logs. 
> Given the currently small list of reconfigurable properties, an argument 
> could be made for simply not logging the property values at all, but this is 
> not the only instance where potentially sensitive configuration gets written 
> somewhere else in plaintext. I think a generic mechanism for redacting 
> sensitive information for textual display will be useful to some of the web 
> UIs too.






[jira] [Commented] (HADOOP-13447) S3Guard: Refactor S3AFileSystem to support introduction of separate metadata repository and tests.

2016-08-12 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419351#comment-15419351
 ] 

Aaron Fabbri commented on HADOOP-13447:
---

I'm +1 on the 002 patch.

Noticed one mention of S3Store in a comment, but not worth holding up the patch 
for.

> S3Guard: Refactor S3AFileSystem to support introduction of separate metadata 
> repository and tests.
> --
>
> Key: HADOOP-13447
> URL: https://issues.apache.org/jira/browse/HADOOP-13447
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HADOOP-13447-HADOOP-13446.001.patch, 
> HADOOP-13447-HADOOP-13446.002.patch
>
>
> The scope of this issue is to refactor the existing {{S3AFileSystem}} into 
> multiple coordinating classes.  The goal of this refactoring is to separate 
> the {{FileSystem}} API binding from the AWS SDK integration, make code 
> maintenance easier while we're making changes for S3Guard, and make it easier 
> to mock some implementation details so that tests can simulate eventual 
> consistency behavior in a deterministic way.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13344) Add option to exclude Hadoop's SLF4J binding

2016-08-12 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419313#comment-15419313
 ] 

Sean Busbey commented on HADOOP-13344:
--

For trunk, are we better off just unconditionally removing the binding? 
Presuming [~aw] still feels this way:

{quote}
* I think moving it to a place that may be optionally triggered off is really 
the only realistic fix here. Munging the classpath to remove it (especially 
when we tend to use wildcards to speed things up) is just painful.
* Since that path is default on, this shouldn't be an incompatible break. Users 
who turn it off are making the decision to disable it the same way they would a 
plug-in. This is nearly the same sort of decision we give users to turn 
things on, such as special options to fsck.
{quote}

then sure, I'd be fine with the pom fixes as a follow-up so long as the 
solution here doesn't obviously rely on something that would change as a result.

If we're going to unconditionally remove the binding then I'd consider the 
maven fixes blocking, whether they happen here or in a different JIRA.

> Add option to exclude Hadoop's SLF4J binding
> 
>
> Key: HADOOP-13344
> URL: https://issues.apache.org/jira/browse/HADOOP-13344
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: bin, scripts
>Affects Versions: 2.8.0, 2.7.2
>Reporter: Thomas Poepping
>Assignee: Thomas Poepping
>  Labels: patch
> Attachments: HADOOP-13344.patch
>
>
> If another application that uses the Hadoop classpath brings in its own SLF4J 
> binding for logging, and that jar is not the exact same as the one brought in 
> by Hadoop, then there will be a conflict between logging jars between the two 
> classpaths. This patch introduces an optional setting to remove Hadoop's 
> SLF4J binding from the classpath, to get rid of this problem.
> This patch should be applied to 2.8.0, as bin/ and hadoop-config.sh structure 
> has been changed in 3.0.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13487) Hadoop KMS doesn't clean up old delegation tokens stored in Zookeeper

2016-08-12 Thread Alex Ivanov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419308#comment-15419308
 ] 

Alex Ivanov commented on HADOOP-13487:
--

[~xiaochen], here are some more details about the issue, and thank you for 
looking into it!

ZK dtoken config for KMS (included for completeness):
{code}
<property>
  <name>hadoop.kms.authentication.zk-dt-secret-manager.enable</name>
  <value>true</value>
  <description>
    Enables storage of delegation tokens in Zookeeper.
  </description>
</property>
<property>
  <name>hadoop.kms.authentication.zk-dt-secret-manager.znodeWorkingPath</name>
  <value>hadoop-kms-dt</value>
  <description>
    The znode path where KMS will store delegation tokens.
  </description>
</property>
<property>
  <name>hadoop.kms.authentication.zk-dt-secret-manager.zkConnectionString</name>
  <value>HOST1:2181,HOST2:2181,HOST3:2181</value>
  <description>
    The Zookeeper connection string: a list of hostnames and quorum port, comma-separated.
  </description>
</property>
<property>
  <name>hadoop.kms.authentication.zk-dt-secret-manager.zkAuthType</name>
  <value>sasl</value>
  <description>
    The Zookeeper authentication type, 'none' or 'sasl' (Kerberos).
  </description>
</property>
<property>
  <name>hadoop.kms.authentication.zk-dt-secret-manager.kerberos.keytab</name>
  <value>/etc/hadoop-kms/conf/kms.keytab</value>
  <description>
    The absolute path for the Kerberos keytab with the credentials to
    connect to Zookeeper.
  </description>
</property>
<property>
  <name>hadoop.kms.authentication.zk-dt-secret-manager.kerberos.principal</name>
  <value>kms/HOST@BIGDATA</value>
  <description>
    The Kerberos service principal used to connect to Zookeeper.
  </description>
</property>

<property>
  <name>hadoop.kms.authentication.delegation-token.update-interval.sec</name>
  <value>1209600</value>
  <description>
    How often the master key is rotated, in seconds. Set to 2 weeks.
  </description>
</property>
<property>
  <name>hadoop.kms.authentication.delegation-token.max-lifetime.sec</name>
  <value>2419200</value>
  <description>
    Maximum lifetime of a delegation token, in seconds. Set to 4 weeks.
  </description>
</property>
<property>
  <name>hadoop.kms.authentication.delegation-token.renew-interval.sec</name>
  <value>120960</value>
  <description>
    Renewal interval of a delegation token, in seconds. Set to 2 weeks.
  </description>
</property>
<property>
  <name>hadoop.kms.authentication.delegation-token.removal-scan-interval.sec</name>
  <value>3600</value>
  <description>
    Scan interval to remove expired delegation tokens.
  </description>
</property>
{code}

Since I set *delegation-token.renew-interval.sec* to 2 weeks, I expect the 
tokens to be invalid after that time (NOTE: I account for HADOOP-12659 
specifying the time in millis). There is no process renewing the tokens right 
now, but even if they were renewed, the maximum lifetime would be 4 weeks based 
on the setting.
If I use *zkCli* to connect to one of the ZK servers, I see there are many 
delegation tokens (NOTE: I ran all commands today, 08/12/2016):
{code}
[zk: HOST:2181(CONNECTED) 0] stat /hadoop-kms-dt/ZKDTSMRoot/ZKDTSMTokensRoot
cZxid = 0x1002395a5
ctime = Mon Jun 13 21:29:02 UTC 2016
mZxid = 0x1002395a5
mtime = Mon Jun 13 21:29:02 UTC 2016
pZxid = 0x100501d21
cversion = 109499
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 11
numChildren = 103229
{code}

As you can see, there are over 100k dtokens in that znode. Here's a sample old 
delegation token from June 29th:
{code}
[zk: HOST:2181(CONNECTED) 2] get 
/hadoop-kms-dt/ZKDTSMRoot/ZKDTSMTokensRoot/DT_2
adminyarnoozie�U��V&�V+�f&�N  U�^&�DJ�=��}ؒ�R
cZxid = 0x10029f135
ctime = Wed Jun 29 09:38:40 UTC 2016
mZxid = 0x10029f135
mtime = Wed Jun 29 09:38:40 UTC 2016
pZxid = 0x10029f135
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 75
numChildren = 0
{code}

Note that the renewal time is NOT visible in the {{zkCli}} console output. 
I had to write a small program to extract this datum from the znode. 
Here's the output of the custom program:
{code}
>> ReadDelTokenFromZK 2
DT renew date: 1468402720294
{code}

1468402720294 = GMT: Wed, 13 Jul 2016 09:38:40.294 GMT
As you can see, the renewal date corresponds to the interval I've specified, 
i.e. 2 weeks (June 29th - July 13th).
The only problem is, it is August 12th today, and the dtoken is still there, 
which leads me to believe KMS is NOT cleaning up old tokens.
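
For reference, the epoch-millisecond conversion above is easy to double-check; a minimal sketch in Java:

{code:java}
import java.time.Instant;

public class RenewDateCheck {
  public static void main(String[] args) {
    // Renew date read from the DT_2 znode, in epoch milliseconds.
    long renewDateMillis = 1468402720294L;
    // Prints 2016-07-13T09:38:40.294Z -- two weeks after the token's
    // creation on June 29th, matching the configured renew interval.
    System.out.println(Instant.ofEpochMilli(renewDateMillis));
  }
}
{code}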

> Hadoop KMS doesn't clean up old delegation tokens stored in Zookeeper
> -
>
> Key: HADOOP-13487
> URL: https://issues.apache.org/jira/browse/HADOOP-13487
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Affects Versions: 2.6.0
>Reporter: Alex Ivanov
>
> Configuration:
> CDH 5.5.1 (Hadoop 2.6+)
> KMS configured to store delegation tokens in Zookeeper
> DEBUG logging enabled in /etc/hadoop-kms/conf/kms-log4j.properties
> Findings:
> It seems to me delegation tokens never get cleaned up from Zookeeper past 
> their renewal date. I can see in the logs that the removal thread is started 
> with the expected interval:
> {code}
> 2016-08-11 08:15:24,511 INFO  AbstractDelegationTokenSecretManager - Starting 
> expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
> {code}
> However, I don't see 

[jira] [Commented] (HADOOP-13344) Add option to exclude Hadoop's SLF4J binding

2016-08-12 Thread Thomas Poepping (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419298#comment-15419298
 ] 

Thomas Poepping commented on HADOOP-13344:
--

This seems like a different issue, but I understand what you're saying. As it 
stands now, it looks like hadoop-mapreduce-project, hadoop-mapreduce-client, 
and hadoop-minikdc all have the slf4j binding as a compile dependency rather 
than a runtime dependency. Changing this would exclude it for dependent 
projects.

I would recommend a different JIRA for this issue, however, because this is not 
the issue I'm trying to solve here. The issue I'm trying to solve is that 
hadoop-config.sh will place the slf4j binding on the classpath no matter what. 
Applications that must be installed with hadoop to function (e.g. hive) will 
end up having Hadoop's slf4j-binding on their classpath, even if they don't 
want it there. This option would give the opportunity, in this instance, to 
remove that dependency from the classpath.

> Add option to exclude Hadoop's SLF4J binding
> 
>
> Key: HADOOP-13344
> URL: https://issues.apache.org/jira/browse/HADOOP-13344
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: bin, scripts
>Affects Versions: 2.8.0, 2.7.2
>Reporter: Thomas Poepping
>Assignee: Thomas Poepping
>  Labels: patch
> Attachments: HADOOP-13344.patch
>
>
> If another application that uses the Hadoop classpath brings in its own SLF4J 
> binding for logging, and that jar is not the exact same as the one brought in 
> by Hadoop, then there will be a conflict between logging jars between the two 
> classpaths. This patch introduces an optional setting to remove Hadoop's 
> SLF4J binding from the classpath, to get rid of this problem.
> This patch should be applied to 2.8.0, as bin/ and hadoop-config.sh structure 
> has been changed in 3.0.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13344) Add option to exclude Hadoop's SLF4J binding

2016-08-12 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419277#comment-15419277
 ] 

Sean Busbey commented on HADOOP-13344:
--

Right now, sure, you'd use an exclusion to remove it. But it being there at all 
is a bug. We already exclude a ton of other transitive dependencies that aren't 
needed for the downstream facing client. Why would we not also exclude this one?

> Add option to exclude Hadoop's SLF4J binding
> 
>
> Key: HADOOP-13344
> URL: https://issues.apache.org/jira/browse/HADOOP-13344
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: bin, scripts
>Affects Versions: 2.8.0, 2.7.2
>Reporter: Thomas Poepping
>Assignee: Thomas Poepping
>  Labels: patch
> Attachments: HADOOP-13344.patch
>
>
> If another application that uses the Hadoop classpath brings in its own SLF4J 
> binding for logging, and that jar is not the exact same as the one brought in 
> by Hadoop, then there will be a conflict between logging jars between the two 
> classpaths. This patch introduces an optional setting to remove Hadoop's 
> SLF4J binding from the classpath, to get rid of this problem.
> This patch should be applied to 2.8.0, as bin/ and hadoop-config.sh structure 
> has been changed in 3.0.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13344) Add option to exclude Hadoop's SLF4J binding

2016-08-12 Thread Thomas Poepping (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419263#comment-15419263
 ] 

Thomas Poepping commented on HADOOP-13344:
--

So let's say I'm a Maven-based consumer of the Hadoop client, i.e. I have a 
dependency on some hadoop-client artifact in my pom file. What would you 
suggest? How would I include or exclude this binding in my dependency tree? 
There's already a builtin way to exclude it -- using an {{<exclusions>}} block. 
Why go around that?

> Add option to exclude Hadoop's SLF4J binding
> 
>
> Key: HADOOP-13344
> URL: https://issues.apache.org/jira/browse/HADOOP-13344
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: bin, scripts
>Affects Versions: 2.8.0, 2.7.2
>Reporter: Thomas Poepping
>Assignee: Thomas Poepping
>  Labels: patch
> Attachments: HADOOP-13344.patch
>
>
> If another application that uses the Hadoop classpath brings in its own SLF4J 
> binding for logging, and that jar is not the exact same as the one brought in 
> by Hadoop, then there will be a conflict between logging jars between the two 
> classpaths. This patch introduces an optional setting to remove Hadoop's 
> SLF4J binding from the classpath, to get rid of this problem.
> This patch should be applied to 2.8.0, as bin/ and hadoop-config.sh structure 
> has been changed in 3.0.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13344) Add option to exclude Hadoop's SLF4J binding

2016-08-12 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419243#comment-15419243
 ] 

Sean Busbey commented on HADOOP-13344:
--

{quote}
If there is a downstream project including hadoop-common (or any other 
component), I say it's the responsibility of that project to exclude the slf4j 
binding if they don't want it. I don't believe any changes to pom.xml are 
necessary.
{quote}

Disagree. Specifically, if folks are using the hadoop-client artifact(s), those 
are expressly supposed to be a library for downstream users. SLF4J is pretty 
clear in its guidance for libraries wrt including bindings. Just have the API 
and let an end user decide if they need logging to go somewhere.

> Add option to exclude Hadoop's SLF4J binding
> 
>
> Key: HADOOP-13344
> URL: https://issues.apache.org/jira/browse/HADOOP-13344
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: bin, scripts
>Affects Versions: 2.8.0, 2.7.2
>Reporter: Thomas Poepping
>Assignee: Thomas Poepping
>  Labels: patch
> Attachments: HADOOP-13344.patch
>
>
> If another application that uses the Hadoop classpath brings in its own SLF4J 
> binding for logging, and that jar is not the exact same as the one brought in 
> by Hadoop, then there will be a conflict between logging jars between the two 
> classpaths. This patch introduces an optional setting to remove Hadoop's 
> SLF4J binding from the classpath, to get rid of this problem.
> This patch should be applied to 2.8.0, as bin/ and hadoop-config.sh structure 
> has been changed in 3.0.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13344) Add option to exclude Hadoop's SLF4J binding

2016-08-12 Thread Thomas Poepping (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419235#comment-15419235
 ] 

Thomas Poepping commented on HADOOP-13344:
--

Coming back to this, what downstream users would we expect?

The problem we're trying to solve is that the slf4j binding is in the 
HADOOP_CLASSPATH variable, set up by hadoop-config.sh. That is not controllable 
by a downstream project, which is why I'm changing it here.

If there is a downstream project including hadoop-common (or any other 
component), I say it's the responsibility of that project to exclude the slf4j 
binding if they don't want it. I don't believe any changes to pom.xml are 
necessary.

> Add option to exclude Hadoop's SLF4J binding
> 
>
> Key: HADOOP-13344
> URL: https://issues.apache.org/jira/browse/HADOOP-13344
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: bin, scripts
>Affects Versions: 2.8.0, 2.7.2
>Reporter: Thomas Poepping
>Assignee: Thomas Poepping
>  Labels: patch
> Attachments: HADOOP-13344.patch
>
>
> If another application that uses the Hadoop classpath brings in its own SLF4J 
> binding for logging, and that jar is not the exact same as the one brought in 
> by Hadoop, then there will be a conflict between logging jars between the two 
> classpaths. This patch introduces an optional setting to remove Hadoop's 
> SLF4J binding from the classpath, to get rid of this problem.
> This patch should be applied to 2.8.0, as bin/ and hadoop-config.sh structure 
> has been changed in 3.0.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13496) Include file lengths in Mismatch in length error for distcp

2016-08-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-13496:

Attachment: (was: HADOOP-13496.v1.patch)

> Include file lengths in Mismatch in length error for distcp
> ---
>
> Key: HADOOP-13496
> URL: https://issues.apache.org/jira/browse/HADOOP-13496
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>Priority: Minor
> Attachments: HADOOP-13496.v1.patch
>
>
> Currently RetriableFileCopyCommand doesn't show the perceived lengths in 
> Mismatch in length error:
> {code}
> 2016-08-12 10:23:14,231 ERROR [LocalJobRunner Map Task Executor #0] util.RetriableCommand(89): Failure in Retriable command: Copying hdfs://localhost:53941/user/tyu/test-data/dc7c674a-c463-4798-8260-c5d1e3440a4b/WALs/10.22.9.171,53952,1471022508087/10.22.9.171%2C53952%2C1471022508087.regiongroup-1.1471022510182 to hdfs://localhost:53941/backupUT/backup_1471022580616/WALs/10.22.9.171%2C53952%2C1471022508087.regiongroup-1.1471022510182
> java.io.IOException: Mismatch in length of source:hdfs://localhost:53941/user/tyu/test-data/dc7c674a-c463-4798-8260-c5d1e3440a4b/WALs/10.22.9.171,53952,1471022508087/10.22.9.171%2C53952%2C1471022508087.regiongroup-1.1471022510182 and target:hdfs://localhost:53941/backupUT/backup_1471022580616/WALs/.distcp.tmp.attempt_local344329843_0006_m_00_0
>   at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.compareFileLengths(RetriableFileCopyCommand.java:193)
>   at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:126)
>   at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:99)
>   at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
>   at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:281)
>   at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:253)
>   at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:50)
> {code}
> It would be helpful to include what's the expected length and what's the real 
> length.
> Thanks to [~yzhangal] for offline discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13496) Include file lengths in Mismatch in length error for distcp

2016-08-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-13496:

Attachment: HADOOP-13496.v1.patch

> Include file lengths in Mismatch in length error for distcp
> ---
>
> Key: HADOOP-13496
> URL: https://issues.apache.org/jira/browse/HADOOP-13496
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>Priority: Minor
> Attachments: HADOOP-13496.v1.patch
>
>
> Currently RetriableFileCopyCommand doesn't show the perceived lengths in 
> Mismatch in length error:
> {code}
> 2016-08-12 10:23:14,231 ERROR [LocalJobRunner Map Task Executor #0] util.RetriableCommand(89): Failure in Retriable command: Copying hdfs://localhost:53941/user/tyu/test-data/dc7c674a-c463-4798-8260-c5d1e3440a4b/WALs/10.22.9.171,53952,1471022508087/10.22.9.171%2C53952%2C1471022508087.regiongroup-1.1471022510182 to hdfs://localhost:53941/backupUT/backup_1471022580616/WALs/10.22.9.171%2C53952%2C1471022508087.regiongroup-1.1471022510182
> java.io.IOException: Mismatch in length of source:hdfs://localhost:53941/user/tyu/test-data/dc7c674a-c463-4798-8260-c5d1e3440a4b/WALs/10.22.9.171,53952,1471022508087/10.22.9.171%2C53952%2C1471022508087.regiongroup-1.1471022510182 and target:hdfs://localhost:53941/backupUT/backup_1471022580616/WALs/.distcp.tmp.attempt_local344329843_0006_m_00_0
>   at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.compareFileLengths(RetriableFileCopyCommand.java:193)
>   at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:126)
>   at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:99)
>   at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
>   at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:281)
>   at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:253)
>   at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:50)
> {code}
> It would be helpful to include what's the expected length and what's the real 
> length.
> Thanks to [~yzhangal] for offline discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-13496) Include file lengths in Mismatch in length error for distcp

2016-08-12 Thread Ted Yu (JIRA)
Ted Yu created HADOOP-13496:
---

 Summary: Include file lengths in Mismatch in length error for 
distcp
 Key: HADOOP-13496
 URL: https://issues.apache.org/jira/browse/HADOOP-13496
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Ted Yu
Priority: Minor


Currently RetriableFileCopyCommand doesn't show the perceived lengths in 
Mismatch in length error:
{code}
2016-08-12 10:23:14,231 ERROR [LocalJobRunner Map Task Executor #0] util.RetriableCommand(89): Failure in Retriable command: Copying hdfs://localhost:53941/user/tyu/test-data/dc7c674a-c463-4798-8260-c5d1e3440a4b/WALs/10.22.9.171,53952,1471022508087/10.22.9.171%2C53952%2C1471022508087.regiongroup-1.1471022510182 to hdfs://localhost:53941/backupUT/backup_1471022580616/WALs/10.22.9.171%2C53952%2C1471022508087.regiongroup-1.1471022510182
java.io.IOException: Mismatch in length of source:hdfs://localhost:53941/user/tyu/test-data/dc7c674a-c463-4798-8260-c5d1e3440a4b/WALs/10.22.9.171,53952,1471022508087/10.22.9.171%2C53952%2C1471022508087.regiongroup-1.1471022510182 and target:hdfs://localhost:53941/backupUT/backup_1471022580616/WALs/.distcp.tmp.attempt_local344329843_0006_m_00_0
  at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.compareFileLengths(RetriableFileCopyCommand.java:193)
  at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:126)
  at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:99)
  at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
  at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:281)
  at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:253)
  at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:50)
{code}
It would be helpful to include what's the expected length and what's the real 
length.

Thanks to [~yzhangal] for offline discussion.
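
A rough sketch of the proposed change; {{compareFileLengths}} does exist in {{RetriableFileCopyCommand}}, but the exact signature and message wording below are assumptions, not the attached patch:

{code:java}
// Hypothetical reworking of RetriableFileCopyCommand#compareFileLengths:
// report both lengths, not just the two paths, when they differ.
private void compareFileLengths(FileStatus sourceFileStatus, Path target,
    Configuration configuration, long targetLen) throws IOException {
  final long sourceLen = sourceFileStatus.getLen();
  if (sourceLen != targetLen) {
    throw new IOException("Mismatch in length of source:"
        + sourceFileStatus.getPath() + " (" + sourceLen + ") and target:"
        + target + " (" + targetLen + ")");
  }
}
{code}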



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-13495) create-release should allow a user-supplied Dockerfile

2016-08-12 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created HADOOP-13495:
-

 Summary: create-release should allow a user-supplied Dockerfile
 Key: HADOOP-13495
 URL: https://issues.apache.org/jira/browse/HADOOP-13495
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Reporter: Allen Wittenauer


For non-ASF builds, it'd be handy to supply a custom Dockerfile to 
create-release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13487) Hadoop KMS doesn't clean up old delegation tokens stored in Zookeeper

2016-08-12 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419113#comment-15419113
 ] 

Xiao Chen commented on HADOOP-13487:


Oops, sorry Alex, I meant [~axenol].. :)

> Hadoop KMS doesn't clean up old delegation tokens stored in Zookeeper
> -
>
> Key: HADOOP-13487
> URL: https://issues.apache.org/jira/browse/HADOOP-13487
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Affects Versions: 2.6.0
>Reporter: Alex Ivanov
>
> Configuration:
> CDH 5.5.1 (Hadoop 2.6+)
> KMS configured to store delegation tokens in Zookeeper
> DEBUG logging enabled in /etc/hadoop-kms/conf/kms-log4j.properties
> Findings:
> It seems to me delegation tokens never get cleaned up from Zookeeper past 
> their renewal date. I can see in the logs that the removal thread is started 
> with the expected interval:
> {code}
> 2016-08-11 08:15:24,511 INFO  AbstractDelegationTokenSecretManager - Starting 
> expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
> {code}
> However, I don't see any delegation token removals, indicated by the 
> following log message:
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager 
> --> removeStoredToken(TokenIdent ident), line 769 [CDH]
> {code}
> if (LOG.isDebugEnabled()) {
>   LOG.debug("Removing ZKDTSMDelegationToken_"
>   + ident.getSequenceNumber());
> }
> {code}
> Meanwhile, I see a lot of expired delegation tokens in Zookeeper that don't 
> get cleaned up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12956) Inevitable Log4j2 migration via slf4j

2016-08-12 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419097#comment-15419097
 ] 

Sean Busbey commented on HADOOP-12956:
--

bq. We suggest upgrading to Log4j 2.x, we can offer help to do so.

Any chance of adding the ability to parse log4j 1.2.z properties files? Or a 
migration tool?

> Inevitable Log4j2 migration via slf4j
> -
>
> Key: HADOOP-12956
> URL: https://issues.apache.org/jira/browse/HADOOP-12956
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Gopal V
>
> {{5 August 2015 --The Apache Logging Services™ Project Management Committee 
> (PMC) has announced that the Log4j™ 1.x logging framework has reached its end 
> of life (EOL) and is no longer officially supported.}}
> https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces
> A whole framework log4j2 upgrade has to be synchronized, partly for improved 
> performance brought about by log4j2.
> https://logging.apache.org/log4j/2.x/manual/async.html#Performance



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12956) Inevitable Log4j2 migration via slf4j

2016-08-12 Thread Mikael Ståldal (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419093#comment-15419093
 ] 

Mikael Ståldal commented on HADOOP-12956:
-

Hi,

I am a member of the Apache Logging project PMC.

Due to various issues in Log4j 1.2.x, it will no longer work in JDK9: 
http://mail.openjdk.java.net/pipermail/jigsaw-dev/2016-July/008654.html

As Log4j 1.x is EOL, these issues will not be fixed. We suggest upgrading to 
Log4j 2.x, we can offer help to do so.


> Inevitable Log4j2 migration via slf4j
> -
>
> Key: HADOOP-12956
> URL: https://issues.apache.org/jira/browse/HADOOP-12956
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Gopal V
>
> {{5 August 2015 --The Apache Logging Services™ Project Management Committee 
> (PMC) has announced that the Log4j™ 1.x logging framework has reached its end 
> of life (EOL) and is no longer officially supported.}}
> https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces
> A whole framework log4j2 upgrade has to be synchronized, partly for improved 
> performance brought about by log4j2.
> https://logging.apache.org/log4j/2.x/manual/async.html#Performance



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13494) ReconfigurableBase can log sensitive information

2016-08-12 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418986#comment-15418986
 ] 

Sean Mackrory commented on HADOOP-13494:


Yeah that was my intention - my mistake.

> ReconfigurableBase can log sensitive information
> 
>
> Key: HADOOP-13494
> URL: https://issues.apache.org/jira/browse/HADOOP-13494
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>
> ReconfigurableBase will log old and new configuration values, which may cause 
> sensitive parameters (most notably cloud storage keys, though there may be 
> other instances) to get included in the logs. 
> Given the currently small list of reconfigurable properties, an argument 
> could be made for simply not logging the property values at all, but this is 
> not the only instance where potentially sensitive configuration gets written 
> somewhere else in plaintext. I think a generic mechanism for redacting 
> sensitive information for textual display will be useful to some of the web 
> UIs too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Moved] (HADOOP-13494) ReconfigurableBase can log sensitive information

2016-08-12 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory moved HDFS-10758 to HADOOP-13494:
---

Key: HADOOP-13494  (was: HDFS-10758)
Project: Hadoop Common  (was: Hadoop HDFS)

> ReconfigurableBase can log sensitive information
> 
>
> Key: HADOOP-13494
> URL: https://issues.apache.org/jira/browse/HADOOP-13494
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>
> ReconfigurableBase will log old and new configuration values, which may cause 
> sensitive parameters (most notably cloud storage keys, though there may be 
> other instances) to get included in the logs. 
> Given the currently small list of reconfigurable properties, an argument 
> could be made for simply not logging the property values at all, but this is 
> not the only instance where potentially sensitive configuration gets written 
> somewhere else in plaintext. I think a generic mechanism for redacting 
> sensitive information for textual display will be useful to some of the web 
> UIs too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-13493) Compatibility Docs should clarify the policy for what takes precedence when a conflict is found

2016-08-12 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla reassigned HADOOP-13493:
-

Assignee: Karthik Kambatla

> Compatibility Docs should clarify the policy for what takes precedence when a 
> conflict is found
> ---
>
> Key: HADOOP-13493
> URL: https://issues.apache.org/jira/browse/HADOOP-13493
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.7.2
>Reporter: Robert Kanter
>Assignee: Karthik Kambatla
>Priority: Critical
>
> The Compatibility Docs 
> (https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Compatibility.html#Java_API)
>  list the policies for Private, Public, not annotated, etc Classes and 
> members, but it doesn't say what happens when there's a conflict.  We should 
> obviously try to avoid this situation, but it would be good to explicitly 
> state what takes precedence.
> As an example, until YARN-3225 made it consistent, {{RefreshNodesRequest}} 
> looked like this:
> {code:java}
> @Private
> @Stable
> public abstract class RefreshNodesRequest {
>   @Public
>   @Stable
>   public static RefreshNodesRequest newInstance() {
> RefreshNodesRequest request = 
> Records.newRecord(RefreshNodesRequest.class);
> return request;
>   }
> }
> {code}
> Note that the class is marked {{\@Private}}, but the method is marked 
> {{\@Public}}.
> In this example, I'd say that the class level should have priority.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13493) Compatibility Docs should clarify the policy for what takes precedence when a conflict is found

2016-08-12 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated HADOOP-13493:
--
Target Version/s: 3.0.0-alpha1  (was: 2.8.0)

> Compatibility Docs should clarify the policy for what takes precedence when a 
> conflict is found
> ---
>
> Key: HADOOP-13493
> URL: https://issues.apache.org/jira/browse/HADOOP-13493
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.7.2
>Reporter: Robert Kanter
>Assignee: Karthik Kambatla
>Priority: Critical
>
> The Compatibility Docs 
> (https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Compatibility.html#Java_API)
>  list the policies for Private, Public, not annotated, etc Classes and 
> members, but it doesn't say what happens when there's a conflict.  We should 
> obviously try to avoid this situation, but it would be good to explicitly 
> state what takes precedence.
> As an example, until YARN-3225 made it consistent, {{RefreshNodesRequest}} 
> looked like this:
> {code:java}
> @Private
> @Stable
> public abstract class RefreshNodesRequest {
>   @Public
>   @Stable
>   public static RefreshNodesRequest newInstance() {
> RefreshNodesRequest request = 
> Records.newRecord(RefreshNodesRequest.class);
> return request;
>   }
> }
> {code}
> Note that the class is marked {{\@Private}}, but the method is marked 
> {{\@Public}}.
> In this example, I'd say that the class level should have priority.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Moved] (HADOOP-13493) Compatibility Docs should clarify the policy for what takes precedence when a conflict is found

2016-08-12 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla moved YARN-5515 to HADOOP-13493:
-

Affects Version/s: (was: 2.7.2)
   2.7.2
  Component/s: (was: documentation)
   documentation
  Key: HADOOP-13493  (was: YARN-5515)
  Project: Hadoop Common  (was: Hadoop YARN)

> Compatibility Docs should clarify the policy for what takes precedence when a 
> conflict is found
> ---
>
> Key: HADOOP-13493
> URL: https://issues.apache.org/jira/browse/HADOOP-13493
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.7.2
>Reporter: Robert Kanter
>
> The Compatibility Docs 
> (https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Compatibility.html#Java_API)
>  list the policies for Private, Public, not annotated, etc Classes and 
> members, but it doesn't say what happens when there's a conflict.  We should 
> obviously try to avoid this situation, but it would be good to explicitly 
> state what takes precedence.
> As an example, until YARN-3225 made it consistent, {{RefreshNodesRequest}} 
> looked like this:
> {code:java}
> @Private
> @Stable
> public abstract class RefreshNodesRequest {
>   @Public
>   @Stable
>   public static RefreshNodesRequest newInstance() {
> RefreshNodesRequest request = 
> Records.newRecord(RefreshNodesRequest.class);
> return request;
>   }
> }
> {code}
> Note that the class is marked {{\@Private}}, but the method is marked 
> {{\@Public}}.
> In this example, I'd say that the class level should have priority.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13493) Compatibility Docs should clarify the policy for what takes precedence when a conflict is found

2016-08-12 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated HADOOP-13493:
--
Target Version/s: 2.8.0

> Compatibility Docs should clarify the policy for what takes precedence when a 
> conflict is found
> ---
>
> Key: HADOOP-13493
> URL: https://issues.apache.org/jira/browse/HADOOP-13493
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.7.2
>Reporter: Robert Kanter
>Priority: Critical
>
> The Compatibility Docs 
> (https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Compatibility.html#Java_API)
>  list the policies for Private, Public, not annotated, etc Classes and 
> members, but it doesn't say what happens when there's a conflict.  We should 
> obviously try to avoid this situation, but it would be good to explicitly 
> state what takes precedence.
> As an example, until YARN-3225 made it consistent, {{RefreshNodesRequest}} 
> looked like this:
> {code:java}
> @Private
> @Stable
> public abstract class RefreshNodesRequest {
>   @Public
>   @Stable
>   public static RefreshNodesRequest newInstance() {
> RefreshNodesRequest request = 
> Records.newRecord(RefreshNodesRequest.class);
> return request;
>   }
> }
> {code}
> Note that the class is marked {{\@Private}}, but the method is marked 
> {{\@Public}}.
> In this example, I'd say that the class level should have priority.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13493) Compatibility Docs should clarify the policy for what takes precedence when a conflict is found

2016-08-12 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated HADOOP-13493:
--
Priority: Critical  (was: Major)

> Compatibility Docs should clarify the policy for what takes precedence when a 
> conflict is found
> ---
>
> Key: HADOOP-13493
> URL: https://issues.apache.org/jira/browse/HADOOP-13493
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.7.2
>Reporter: Robert Kanter
>Priority: Critical
>
> The Compatibility Docs 
> (https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Compatibility.html#Java_API)
>  list the policies for Private, Public, not annotated, etc Classes and 
> members, but it doesn't say what happens when there's a conflict.  We should 
> obviously try to avoid this situation, but it would be good to explicitly 
> state what takes precedence.
> As an example, until YARN-3225 made it consistent, {{RefreshNodesRequest}} 
> looked like this:
> {code:java}
> @Private
> @Stable
> public abstract class RefreshNodesRequest {
>   @Public
>   @Stable
>   public static RefreshNodesRequest newInstance() {
> RefreshNodesRequest request = 
> Records.newRecord(RefreshNodesRequest.class);
> return request;
>   }
> }
> {code}
> Note that the class is marked {{\@Private}}, but the method is marked 
> {{\@Public}}.
> In this example, I'd say that the class level should have priority.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-10375) Local FS doesn't raise an error on mkdir() over a file

2016-08-12 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418829#comment-15418829
 ] 

Andras Bokor commented on HADOOP-10375:
---

[~ste...@apache.org],

I think it was solved by HADOOP-9361.
{code:title=RawLocalFileSystem.java#mkdirsWithOptionalPermission}
if (p2f.exists() && !p2f.isDirectory()) {
  throw new FileNotFoundException("Destination exists" +
  " and is not a directory: " + p2f.getCanonicalPath());
}
{code}

Do you agree?
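
For what it's worth, the behaviour is easy to check against the local filesystem; a minimal sketch (paths and class name are illustrative):

{code:java}
import java.io.FileNotFoundException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RawLocalFileSystem;

public class MkdirOverFileCheck {
  public static void main(String[] args) throws Exception {
    RawLocalFileSystem fs = new RawLocalFileSystem();
    fs.initialize(URI.create("file:///"), new Configuration());
    Path file = new Path("/tmp/mkdir-over-file-test");  // illustrative path
    fs.create(file, true).close();                      // create a plain file
    try {
      fs.mkdirs(file);                                  // mkdir() over the file
      System.out.println("BUG: mkdirs over a file did not fail");
    } catch (FileNotFoundException expected) {
      // Post-HADOOP-9361: "Destination exists and is not a directory"
      System.out.println("OK: " + expected.getMessage());
    }
  }
}
{code}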

> Local FS doesn't raise an error on mkdir() over a file
> --
>
> Key: HADOOP-10375
> URL: https://issues.apache.org/jira/browse/HADOOP-10375
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.3.0
>Reporter: Steve Loughran
>Assignee: Andras Bokor
>Priority: Minor
>
> if you mkdir() on a path where there is already a file, the operation does
> not fail. Instead the operation returns 0.
> This is at odds with the behaviour of HDFS. 
> HADOOP-6229 added the check for the parent dir not being a file, but something 
> similar is needed for the destination dir itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13184) Add "Apache" to Hadoop project logo

2016-08-12 Thread Shane Curcuru (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418823#comment-15418823
 ] 

Shane Curcuru commented on HADOOP-13184:


The logo should have a TM symbol; we do not currently hold a registration on 
the logo itself.

The mere use of the stylized HADOOP word within the logo does not mean the logo 
should be an (R), even though HADOOP as a word is registered in several 
countries.  Trademark law generally treats words separately from obviously 
stylized or graphical displays of words, so while the word (in any normal 
typeface) is registered, the logo depicting it is not.

Also, trademark law is often very specific about exact logos, or officially 
recognizing part of a logo vs. the entirety of the design.  Thus if the PMC 
wants to request registration of a new logo, we must provide the *exact* logo 
image we want to register.  While there are some trademark protections for 
similar designs, the strongest legal protection (and ability to use (R)) only 
extends to the same or virtually the same graphical image (although I'm pretty 
sure we could (R) a black & white version of the same image, if we registered 
the normal color version).

> Add "Apache" to Hadoop project logo
> ---
>
> Key: HADOOP-13184
> URL: https://issues.apache.org/jira/browse/HADOOP-13184
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Chris Douglas
>Assignee: Abhishek
>
> Many ASF projects include "Apache" in their logo. We should add it to Hadoop.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-10375) Local FS doesn't raise an error on mkdir() over a file

2016-08-12 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor reassigned HADOOP-10375:
-

Assignee: Andras Bokor  (was: Steve Loughran)

> Local FS doesn't raise an error on mkdir() over a file
> --
>
> Key: HADOOP-10375
> URL: https://issues.apache.org/jira/browse/HADOOP-10375
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.3.0
>Reporter: Steve Loughran
>Assignee: Andras Bokor
>Priority: Minor
>
> if you mkdir() on a path where there is already a file, the operation does
> not fail. Instead the operation returns 0.
> This is at odds with the behaviour of HDFS. 
> HADOOP-6229 added the check for the parent dir not being a file, but something 
> similar is needed for the destination dir itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13487) Hadoop KMS doesn't clean up old delegation tokens stored in Zookeeper

2016-08-12 Thread Alex Iakovlev (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418753#comment-15418753
 ] 

Alex Iakovlev commented on HADOOP-13487:


I'm pretty sure I haven't reported this :)

> Hadoop KMS doesn't clean up old delegation tokens stored in Zookeeper
> -
>
> Key: HADOOP-13487
> URL: https://issues.apache.org/jira/browse/HADOOP-13487
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Affects Versions: 2.6.0
>Reporter: Alex Ivanov
>
> Configuration:
> CDH 5.5.1 (Hadoop 2.6+)
> KMS configured to store delegation tokens in Zookeeper
> DEBUG logging enabled in /etc/hadoop-kms/conf/kms-log4j.properties
> Findings:
> It seems to me delegation tokens never get cleaned up from Zookeeper past 
> their renewal date. I can see in the logs that the removal thread is started 
> with the expected interval:
> {code}
> 2016-08-11 08:15:24,511 INFO  AbstractDelegationTokenSecretManager - Starting 
> expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
> {code}
> However, I don't see any delegation token removals, indicated by the 
> following log message:
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager 
> --> removeStoredToken(TokenIdent ident), line 769 [CDH]
> {code}
> if (LOG.isDebugEnabled()) {
>   LOG.debug("Removing ZKDTSMDelegationToken_"
>   + ident.getSequenceNumber());
> }
> {code}
> Meanwhile, I see a lot of expired delegation tokens in Zookeeper that don't 
> get cleaned up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work started] (HADOOP-13491) fix several warnings from findbugs

2016-08-12 Thread uncleGen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-13491 started by uncleGen.
-
> fix several warnings from findbugs
> --
>
> Key: HADOOP-13491
> URL: https://issues.apache.org/jira/browse/HADOOP-13491
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs
>Affects Versions: HADOOP-12756
>Reporter: uncleGen
>Assignee: uncleGen
> Fix For: HADOOP-12756
>
>
> {code:title=Bad practice Warnings|borderStyle=solid}
> Code  Warning
> RRorg.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.seek(long) ignores 
> result of java.io.InputStream.skip(long)
> Bug type SR_NOT_CHECKED (click for details) 
> In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream
> In method org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.seek(long)
> Called method java.io.InputStream.skip(long)
> At AliyunOSSInputStream.java:[line 235]
> RR
> org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream.multipartUploadObject() 
> ignores result of java.io.FileInputStream.skip(long)
> Bug type SR_NOT_CHECKED (click for details) 
> In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream
> In method 
> org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream.multipartUploadObject()
> Called method java.io.FileInputStream.skip(long)
> At AliyunOSSOutputStream.java:[line 177]
> RVExceptional return value of java.io.File.delete() ignored in 
> org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream.close()
> Bug type RV_RETURN_VALUE_IGNORED_BAD_PRACTICE (click for details) 
> In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream
> In method org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream.close()
> Called method java.io.File.delete()
> At AliyunOSSOutputStream.java:[line 116]
> {code}
> {code:title=Multithreaded correctness Warnings|borderStyle=solid}
> Code  Warning
> ISInconsistent synchronization of 
> org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.partRemaining; locked 
> 90% of time
> Bug type IS2_INCONSISTENT_SYNC (click for details) 
> In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream
> Field org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.partRemaining
> Synchronized 90% of the time
> Unsynchronized access at AliyunOSSInputStream.java:[line 234]
> Synchronized access at AliyunOSSInputStream.java:[line 106]
> Synchronized access at AliyunOSSInputStream.java:[line 168]
> Synchronized access at AliyunOSSInputStream.java:[line 189]
> Synchronized access at AliyunOSSInputStream.java:[line 188]
> Synchronized access at AliyunOSSInputStream.java:[line 188]
> Synchronized access at AliyunOSSInputStream.java:[line 190]
> Synchronized access at AliyunOSSInputStream.java:[line 113]
> Synchronized access at AliyunOSSInputStream.java:[line 131]
> Synchronized access at AliyunOSSInputStream.java:[line 131]
> ISInconsistent synchronization of 
> org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.position; locked 66% of 
> time
> Bug type IS2_INCONSISTENT_SYNC (click for details) 
> In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream
> Field org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.position
> Synchronized 66% of the time
> Unsynchronized access at AliyunOSSInputStream.java:[line 232]
> Unsynchronized access at AliyunOSSInputStream.java:[line 234]
> Unsynchronized access at AliyunOSSInputStream.java:[line 234]
> Unsynchronized access at AliyunOSSInputStream.java:[line 235]
> Unsynchronized access at AliyunOSSInputStream.java:[line 236]
> Unsynchronized access at AliyunOSSInputStream.java:[line 245]
> Synchronized access at AliyunOSSInputStream.java:[line 222]
> Synchronized access at AliyunOSSInputStream.java:[line 105]
> Synchronized access at AliyunOSSInputStream.java:[line 167]
> Synchronized access at AliyunOSSInputStream.java:[line 169]
> Synchronized access at AliyunOSSInputStream.java:[line 187]
> Synchronized access at AliyunOSSInputStream.java:[line 187]
> Synchronized access at AliyunOSSInputStream.java:[line 113]
> Synchronized access at AliyunOSSInputStream.java:[line 114]
> Synchronized access at AliyunOSSInputStream.java:[line 130]
> Synchronized access at AliyunOSSInputStream.java:[line 130]
> Synchronized access at AliyunOSSInputStream.java:[line 259]
> Synchronized access at AliyunOSSInputStream.java:[line 266]
> ISInconsistent synchronization of 
> org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.wrappedStream; locked 
> 85% of time
> Bug type IS2_INCONSISTENT_SYNC (click for details) 
> In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream
> Field org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.wrappedStream
> Synchronized 85% of the time
> Unsynchronized access at AliyunOSSInputStream.java:[line 235]
> Synchronized access at AliyunOSSInputStream.java:[line 

[jira] [Updated] (HADOOP-13491) fix several warnings from findbugs

2016-08-12 Thread uncleGen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

uncleGen updated HADOOP-13491:
--
Description: 
{code:title=Bad practice Warnings|borderStyle=solid}
CodeWarning
RR  org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.seek(long) ignores 
result of java.io.InputStream.skip(long)
Bug type SR_NOT_CHECKED (click for details) 
In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream
In method org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.seek(long)
Called method java.io.InputStream.skip(long)
At AliyunOSSInputStream.java:[line 235]

RR  
org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream.multipartUploadObject() 
ignores result of java.io.FileInputStream.skip(long)
Bug type SR_NOT_CHECKED (click for details) 
In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream
In method 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream.multipartUploadObject()
Called method java.io.FileInputStream.skip(long)
At AliyunOSSOutputStream.java:[line 177]

RV  Exceptional return value of java.io.File.delete() ignored in 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream.close()
Bug type RV_RETURN_VALUE_IGNORED_BAD_PRACTICE (click for details) 
In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream
In method org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream.close()
Called method java.io.File.delete()
At AliyunOSSOutputStream.java:[line 116]
{code}
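
For context, SR_NOT_CHECKED fires because InputStream.skip(long) may skip fewer 
bytes than requested, so the returned count has to be consumed. A minimal sketch 
of the usual remedy (illustrative helper, not the committed patch):

{code}
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class SkipUtil {
  // skip(n) is only a hint: it may skip fewer than n bytes, so loop
  // until the requested count has actually been consumed.
  static void fullySkip(InputStream in, long n) throws IOException {
    long remaining = n;
    while (remaining > 0) {
      long skipped = in.skip(remaining);
      if (skipped <= 0) {
        throw new EOFException(remaining + " bytes left to skip at end of stream");
      }
      remaining -= skipped;
    }
  }
}
{code}

The RV_RETURN_VALUE_IGNORED_BAD_PRACTICE warning is resolved the same way: test 
the boolean returned by File.delete() and log a warning when deletion fails, 
rather than discarding the result.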

{code:title=Multithreaded correctness Warnings|borderStyle=solid}
Code  Warning
IS  Inconsistent synchronization of 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.partRemaining; locked 90% 
of time
Bug type IS2_INCONSISTENT_SYNC (click for details) 
In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream
Field org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.partRemaining
Synchronized 90% of the time
Unsynchronized access at AliyunOSSInputStream.java:[line 234]
Synchronized access at AliyunOSSInputStream.java:[line 106]
Synchronized access at AliyunOSSInputStream.java:[line 168]
Synchronized access at AliyunOSSInputStream.java:[line 189]
Synchronized access at AliyunOSSInputStream.java:[line 188]
Synchronized access at AliyunOSSInputStream.java:[line 188]
Synchronized access at AliyunOSSInputStream.java:[line 190]
Synchronized access at AliyunOSSInputStream.java:[line 113]
Synchronized access at AliyunOSSInputStream.java:[line 131]
Synchronized access at AliyunOSSInputStream.java:[line 131]

IS  Inconsistent synchronization of 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.position; locked 66% of 
time
Bug type IS2_INCONSISTENT_SYNC (click for details) 
In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream
Field org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.position
Synchronized 66% of the time
Unsynchronized access at AliyunOSSInputStream.java:[line 232]
Unsynchronized access at AliyunOSSInputStream.java:[line 234]
Unsynchronized access at AliyunOSSInputStream.java:[line 234]
Unsynchronized access at AliyunOSSInputStream.java:[line 235]
Unsynchronized access at AliyunOSSInputStream.java:[line 236]
Unsynchronized access at AliyunOSSInputStream.java:[line 245]
Synchronized access at AliyunOSSInputStream.java:[line 222]
Synchronized access at AliyunOSSInputStream.java:[line 105]
Synchronized access at AliyunOSSInputStream.java:[line 167]
Synchronized access at AliyunOSSInputStream.java:[line 169]
Synchronized access at AliyunOSSInputStream.java:[line 187]
Synchronized access at AliyunOSSInputStream.java:[line 187]
Synchronized access at AliyunOSSInputStream.java:[line 113]
Synchronized access at AliyunOSSInputStream.java:[line 114]
Synchronized access at AliyunOSSInputStream.java:[line 130]
Synchronized access at AliyunOSSInputStream.java:[line 130]
Synchronized access at AliyunOSSInputStream.java:[line 259]
Synchronized access at AliyunOSSInputStream.java:[line 266]

IS  Inconsistent synchronization of 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.wrappedStream; locked 85% 
of time
Bug type IS2_INCONSISTENT_SYNC (click for details) 
In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream
Field org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.wrappedStream
Synchronized 85% of the time
Unsynchronized access at AliyunOSSInputStream.java:[line 235]
Synchronized access at AliyunOSSInputStream.java:[line 92]
Synchronized access at AliyunOSSInputStream.java:[line 96]
Synchronized access at AliyunOSSInputStream.java:[line 101]
Synchronized access at AliyunOSSInputStream.java:[line 102]
Synchronized access at AliyunOSSInputStream.java:[line 178]
Synchronized access at AliyunOSSInputStream.java:[line 123]
{code}
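
IS2_INCONSISTENT_SYNC means a field is usually accessed while holding the 
object's monitor but is occasionally touched without it (here, from the 
seek() path around lines 232-245). A toy illustration of the pattern and its 
usual fix; this is not the AliyunOSSInputStream code itself:

{code}
// Toy example of IS2_INCONSISTENT_SYNC, unrelated to the real class.
class Cursor {
  private long position; // meant to be guarded by 'this'

  synchronized int read() {
    position++; // consistently locked ...
    return 0;
  }

  long getPos() {
    return position; // ... but this unlocked read races with read()
  }

  // Fix: take the same monitor on every access to the field.
  synchronized long getPosSafely() {
    return position;
  }
}
{code}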

{code:title=Dodgy code Warnings|borderStyle=solid}
Code  Warning
REC Exception is caught when Exception is not thrown in 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem.multipartCopy(String, long, 
String)
Bug type REC_CATCH_EXCEPTION 

[jira] [Updated] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-13489:

Attachment: TestIncrementalBackup-output.txt

Test output with the DistCp-related portion.

> DistCp may incorrectly return success status when the underlying Job failed
> ---
>
> Key: HADOOP-13489
> URL: https://issues.apache.org/jira/browse/HADOOP-13489
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
> Attachments: HADOOP-13489.v1.patch, TestIncrementalBackup-output.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
> try {
>   execute();
> } catch (InvalidInputException e) {
>   LOG.error("Invalid input: ", e);
>   return DistCpConstants.INVALID_ARGUMENT;
> } catch (DuplicateFileException e) {
>   LOG.error("Duplicate files in input path: ", e);
>   return DistCpConstants.DUPLICATE_INPUT;
> } catch (AclsNotSupportedException e) {
>   LOG.error("ACLs not supported on at least one file system: ", e);
>   return DistCpConstants.ACLS_NOT_SUPPORTED;
> } catch (XAttrsNotSupportedException e) {
>   LOG.error("XAttrs not supported on at least one file system: ", e);
>   return DistCpConstants.XATTRS_NOT_SUPPORTED;
> } catch (Exception e) {
>   LOG.error("Exception encountered ", e);
>   return DistCpConstants.UNKNOWN_ERROR;
> }
> return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12756) Incorporate Aliyun OSS file system implementation

2016-08-12 Thread uncleGen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418558#comment-15418558
 ] 

uncleGen commented on HADOOP-12756:
---

[HADOOP-13479|https://issues.apache.org/jira/browse/HADOOP-13479] will cover some 
preparatory work and improvements before the release.

> Incorporate Aliyun OSS file system implementation
> -
>
> Key: HADOOP-12756
> URL: https://issues.apache.org/jira/browse/HADOOP-12756
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.0, HADOOP-12756
>Reporter: shimingfei
>Assignee: shimingfei
> Fix For: HADOOP-12756
>
> Attachments: HADOOP-12756-v02.patch, HADOOP-12756.003.patch, 
> HADOOP-12756.004.patch, HADOOP-12756.005.patch, HADOOP-12756.006.patch, 
> HADOOP-12756.007.patch, HADOOP-12756.008.patch, HADOOP-12756.009.patch, HCFS 
> User manual.md, OSS integration.pdf, OSS integration.pdf
>
>
> Aliyun OSS is widely used among China’s cloud users, but currently it is not 
> easy to access data stored on OSS from a user’s Hadoop/Spark application, 
> because Hadoop has no native support for OSS.
> This work aims to integrate Aliyun OSS with Hadoop. With simple configuration, 
> Spark/Hadoop applications can read/write data from OSS without any code 
> change, narrowing the gap between the user’s application and its data storage, 
> as has been done for S3 in Hadoop.
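
For readers unfamiliar with the module, access is configured through Hadoop 
properties. The property names below match the hadoop-aliyun module as 
eventually released and should be treated as illustrative for this feature 
branch (the endpoint and key values are placeholders):

{code:title=core-site.xml (illustrative)|borderStyle=solid}
<property>
  <name>fs.oss.impl</name>
  <value>org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem</value>
</property>
<property>
  <name>fs.oss.endpoint</name>
  <value>oss-cn-hangzhou.aliyuncs.com</value>
</property>
<property>
  <name>fs.oss.accessKeyId</name>
  <value>YOUR_ACCESS_KEY_ID</value>
</property>
<property>
  <name>fs.oss.accessKeySecret</name>
  <value>YOUR_ACCESS_KEY_SECRET</value>
</property>
{code}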



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13491) fix several warnings from findbugs

2016-08-12 Thread uncleGen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

uncleGen updated HADOOP-13491:
--
Description: 

Code  Warning
RR  org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.seek(long) ignores 
result of java.io.InputStream.skip(long)
Bug type SR_NOT_CHECKED (click for details) 
In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream
In method org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.seek(long)
Called method java.io.InputStream.skip(long)
At AliyunOSSInputStream.java:[line 235]

RR  
org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream.multipartUploadObject() 
ignores result of java.io.FileInputStream.skip(long)
Bug type SR_NOT_CHECKED (click for details) 
In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream
In method 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream.multipartUploadObject()
Called method java.io.FileInputStream.skip(long)
At AliyunOSSOutputStream.java:[line 177]

RV  Exceptional return value of java.io.File.delete() ignored in 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream.close()
Bug type RV_RETURN_VALUE_IGNORED_BAD_PRACTICE (click for details) 
In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream
In method org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream.close()
Called method java.io.File.delete()
At AliyunOSSOutputStream.java:[line 116]


Code  Warning
IS  Inconsistent synchronization of 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.partRemaining; locked 90% 
of time
Bug type IS2_INCONSISTENT_SYNC (click for details) 
In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream
Field org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.partRemaining
Synchronized 90% of the time
Unsynchronized access at AliyunOSSInputStream.java:[line 234]
Synchronized access at AliyunOSSInputStream.java:[line 106]
Synchronized access at AliyunOSSInputStream.java:[line 168]
Synchronized access at AliyunOSSInputStream.java:[line 189]
Synchronized access at AliyunOSSInputStream.java:[line 188]
Synchronized access at AliyunOSSInputStream.java:[line 188]
Synchronized access at AliyunOSSInputStream.java:[line 190]
Synchronized access at AliyunOSSInputStream.java:[line 113]
Synchronized access at AliyunOSSInputStream.java:[line 131]
Synchronized access at AliyunOSSInputStream.java:[line 131]

IS  Inconsistent synchronization of 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.position; locked 66% of 
time
Bug type IS2_INCONSISTENT_SYNC (click for details) 
In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream
Field org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.position
Synchronized 66% of the time
Unsynchronized access at AliyunOSSInputStream.java:[line 232]
Unsynchronized access at AliyunOSSInputStream.java:[line 234]
Unsynchronized access at AliyunOSSInputStream.java:[line 234]
Unsynchronized access at AliyunOSSInputStream.java:[line 235]
Unsynchronized access at AliyunOSSInputStream.java:[line 236]
Unsynchronized access at AliyunOSSInputStream.java:[line 245]
Synchronized access at AliyunOSSInputStream.java:[line 222]
Synchronized access at AliyunOSSInputStream.java:[line 105]
Synchronized access at AliyunOSSInputStream.java:[line 167]
Synchronized access at AliyunOSSInputStream.java:[line 169]
Synchronized access at AliyunOSSInputStream.java:[line 187]
Synchronized access at AliyunOSSInputStream.java:[line 187]
Synchronized access at AliyunOSSInputStream.java:[line 113]
Synchronized access at AliyunOSSInputStream.java:[line 114]
Synchronized access at AliyunOSSInputStream.java:[line 130]
Synchronized access at AliyunOSSInputStream.java:[line 130]
Synchronized access at AliyunOSSInputStream.java:[line 259]
Synchronized access at AliyunOSSInputStream.java:[line 266]

IS  Inconsistent synchronization of 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.wrappedStream; locked 85% 
of time
Bug type IS2_INCONSISTENT_SYNC (click for details) 
In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream
Field org.apache.hadoop.fs.aliyun.oss.AliyunOSSInputStream.wrappedStream
Synchronized 85% of the time
Unsynchronized access at AliyunOSSInputStream.java:[line 235]
Synchronized access at AliyunOSSInputStream.java:[line 92]
Synchronized access at AliyunOSSInputStream.java:[line 96]
Synchronized access at AliyunOSSInputStream.java:[line 101]
Synchronized access at AliyunOSSInputStream.java:[line 102]
Synchronized access at AliyunOSSInputStream.java:[line 178]
Synchronized access at AliyunOSSInputStream.java:[line 123]


Code  Warning
REC Exception is caught when Exception is not thrown in 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem.multipartCopy(String, long, 
String)
Bug type REC_CATCH_EXCEPTION (click for details) 
In class org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem
In method 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem.multipartCopy(String, long, 
String)
At 

[jira] [Commented] (HADOOP-7064) FsShell does not properly check permissions of files in a directory when doing rmr

2016-08-12 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418540#comment-15418540
 ] 

Weiwei Yang commented on HADOOP-7064:
-

I have uploaded a patch in HDFS-8312 to demonstrate this issue. I would 
appreciate it if someone could take a look and let me know whether my approach 
looks good.

Many thanks.

> FsShell does not properly check permissions of files in a directory when 
> doing rmr
> --
>
> Key: HADOOP-7064
> URL: https://issues.apache.org/jira/browse/HADOOP-7064
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 0.20.2
>Reporter: Alan Gates
>
> In POSIX file semantics, the ability to remove a file is determined 
> by whether the user has write permissions on the directory containing the 
> file.  However, to delete recursively (rm -r) the user must have write 
> permissions in all directories being removed.  Thus if you have a directory 
> structure like /a/b/c and a user has write permissions on a but not on b, 
> then he is not allowed to do 'rm -r b'.  This is because he does not have 
> permissions to remove c, so the rm of b fails, even though he has permission 
> to remove b.
> However, 'hadoop fs -rmr b' removes both b and c in this case.  It should 
> instead fail and return an error message saying the user does not have 
> permission to remove c.  'hadoop fs -rmr c' correctly fails.
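
A hypothetical sketch of the check being described, simplified to the owner 
permission bits only (group/other handling and sticky-bit rules are omitted); 
it is not the actual FsShell code:

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;

final class RmrCheck {
  // Recursive delete needs write permission on every directory whose
  // entries will be removed, not just on the target's parent.
  static boolean canRmr(FileSystem fs, Path p, String user) throws IOException {
    FileStatus st = fs.getFileStatus(p);
    if (!st.isDirectory()) {
      return true; // a plain file is unlinked from its parent, checked above it
    }
    boolean ownerCanWrite = st.getOwner().equals(user)
        && st.getPermission().getUserAction().implies(FsAction.WRITE);
    if (!ownerCanWrite) {
      return false; // cannot unlink this directory's children
    }
    for (FileStatus child : fs.listStatus(p)) {
      if (!canRmr(fs, child.getPath(), user)) {
        return false;
      }
    }
    return true;
  }
}
{code}

In the /a/b/c example, canRmr on b returns false when the user lacks write 
permission on b, which matches the POSIX behaviour described above.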



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-13489) DistCp may incorrectly return success status when the underlying Job failed

2016-08-12 Thread Ted Yu (JIRA)
Ted Yu created HADOOP-13489:
---

 Summary: DistCp may incorrectly return success status when the 
underlying Job failed
 Key: HADOOP-13489
 URL: https://issues.apache.org/jira/browse/HADOOP-13489
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu


I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value from 
copyService.copy() was 0.

Here is related code from DistCp:
{code}
try {
  execute();
} catch (InvalidInputException e) {
  LOG.error("Invalid input: ", e);
  return DistCpConstants.INVALID_ARGUMENT;
} catch (DuplicateFileException e) {
  LOG.error("Duplicate files in input path: ", e);
  return DistCpConstants.DUPLICATE_INPUT;
} catch (AclsNotSupportedException e) {
  LOG.error("ACLs not supported on at least one file system: ", e);
  return DistCpConstants.ACLS_NOT_SUPPORTED;
} catch (XAttrsNotSupportedException e) {
  LOG.error("XAttrs not supported on at least one file system: ", e);
  return DistCpConstants.XATTRS_NOT_SUPPORTED;
} catch (Exception e) {
  LOG.error("Exception encountered ", e);
  return DistCpConstants.UNKNOWN_ERROR;
}
return DistCpConstants.SUCCESS;
{code}
We don't check whether the Job returned by execute() was successful.
Even if the Job fails, DistCpConstants.SUCCESS is returned.
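
Since execute() does return the underlying Job, one possible shape of the fix, 
sketched against the snippet above (not the attached patch; the remaining catch 
clauses are elided):

{code}
try {
  Job job = execute();
  // "No exception" does not mean the Job succeeded; for a blocking run,
  // ask the Job itself before reporting SUCCESS.
  if (job.isComplete() && !job.isSuccessful()) {
    LOG.error("DistCp job " + job.getJobID() + " failed");
    return DistCpConstants.UNKNOWN_ERROR;
  }
} catch (InvalidInputException e) {
  LOG.error("Invalid input: ", e);
  return DistCpConstants.INVALID_ARGUMENT;
  // ... remaining catch clauses unchanged ...
} catch (Exception e) {
  LOG.error("Exception encountered ", e);
  return DistCpConstants.UNKNOWN_ERROR;
}
return DistCpConstants.SUCCESS;
{code}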



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-13491) fix several warnings from findbugs

2016-08-12 Thread uncleGen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

uncleGen reassigned HADOOP-13491:
-

Assignee: uncleGen

> fix several warnings from findbugs
> --
>
> Key: HADOOP-13491
> URL: https://issues.apache.org/jira/browse/HADOOP-13491
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs
>Affects Versions: HADOOP-12756
>Reporter: uncleGen
>Assignee: uncleGen
> Fix For: HADOOP-12756
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13483) file-create should throw error rather than overwrite directories

2016-08-12 Thread uncleGen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418520#comment-15418520
 ] 

uncleGen commented on HADOOP-13483:
---

Waiting for [HADOOP-13491|https://issues.apache.org/jira/browse/HADOOP-13491] to 
be completed.

> file-create should throw error rather than overwrite directories
> 
>
> Key: HADOOP-13483
> URL: https://issues.apache.org/jira/browse/HADOOP-13483
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs
>Affects Versions: HADOOP-12756
>Reporter: uncleGen
>Assignee: uncleGen
> Fix For: HADOOP-12756
>
> Attachments: HADOOP-13483-HADOOP-12756.002.patch, 
> HADOOP-13483.001.patch
>
>
> similar to [HADOOP-13188|https://issues.apache.org/jira/browse/HADOOP-13188]
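
As a sketch of the guard being requested, assuming it lives inside the 
FileSystem implementation (this is not the attached patch):

{code}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.FileAlreadyExistsException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

// Called at the top of create(): refuse to overwrite a directory,
// and honour the overwrite flag for existing files.
private void checkCreatePreconditions(Path path, boolean overwrite)
    throws IOException {
  final FileStatus status;
  try {
    status = getFileStatus(path); // inherited FileSystem method
  } catch (FileNotFoundException e) {
    return; // nothing at this path, safe to create
  }
  if (status.isDirectory()) {
    throw new FileAlreadyExistsException(path + " is a directory");
  }
  if (!overwrite) {
    throw new FileAlreadyExistsException(path + " already exists");
  }
}
{code}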



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13492) DataXceiver#run() should not log InvalidToken exception as an error

2016-08-12 Thread Pan Yuxuan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pan Yuxuan updated HADOOP-13492:

Attachment: HADOOP-13492.patch

Attached is a patch for this issue.

> DataXceiver#run() should not log InvalidToken exception as an error
> ---
>
> Key: HADOOP-13492
> URL: https://issues.apache.org/jira/browse/HADOOP-13492
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Pan Yuxuan
> Attachments: HADOOP-13492.patch
>
>
> DataXceiver#run() just logs the InvalidToken exception as an error.
> When a client has an expired token and simply refetches a new one, the DN log 
> will have an error like the one below:
> {noformat}
> 2016-08-11 02:41:09,817 ERROR datanode.DataNode (DataXceiver.java:run(269)) - 
> XXX:50010:DataXceiver error processing READ_BLOCK operation  src: 
> /10.17.1.5:38844 dst: /10.17.1.5:50010
> org.apache.hadoop.security.token.SecretManager$InvalidToken: Block token with 
> block_token_identifier (expiryDate=1470850746803, keyId=-2093956963, 
> userId=hbase, blockPoolId=BP-641703426-10.17.1.2-1468517918886, 
> blockId=1077120201, access modes=[READ]) is expired.
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.checkAccess(BlockTokenSecretManager.java:280)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.checkAccess(BlockTokenSecretManager.java:301)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockPoolTokenSecretManager.checkAccess(BlockPoolTokenSecretManager.java:97)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1236)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:481)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:242)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This is not a server error, and DataXceiver#checkAccess() has already 
> logged the InvalidToken as a warning.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-13492) DataXceiver#run() should not log InvalidToken exception as an error

2016-08-12 Thread Pan Yuxuan (JIRA)
Pan Yuxuan created HADOOP-13492:
---

 Summary: DataXceiver#run() should not log InvalidToken exception 
as an error
 Key: HADOOP-13492
 URL: https://issues.apache.org/jira/browse/HADOOP-13492
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0-alpha1
Reporter: Pan Yuxuan


DataXceiver#run() just logs the InvalidToken exception as an error.
When a client has an expired token and simply refetches a new one, the DN log 
will have an error like the one below:
{noformat}
2016-08-11 02:41:09,817 ERROR datanode.DataNode (DataXceiver.java:run(269)) - 
XXX:50010:DataXceiver error processing READ_BLOCK operation  src: 
/10.17.1.5:38844 dst: /10.17.1.5:50010
org.apache.hadoop.security.token.SecretManager$InvalidToken: Block token with 
block_token_identifier (expiryDate=1470850746803, keyId=-2093956963, 
userId=hbase, blockPoolId=BP-641703426-10.17.1.2-1468517918886, 
blockId=1077120201, access modes=[READ]) is expired.
at 
org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.checkAccess(BlockTokenSecretManager.java:280)
at 
org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.checkAccess(BlockTokenSecretManager.java:301)
at 
org.apache.hadoop.hdfs.security.token.block.BlockPoolTokenSecretManager.checkAccess(BlockPoolTokenSecretManager.java:97)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1236)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:481)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:242)
at java.lang.Thread.run(Thread.java:745)
{noformat}
This is not a server error, and DataXceiver#checkAccess() has already logged 
the InvalidToken as a warning.
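
One illustrative way to implement the demotion in DataXceiver#run()'s catch 
block (commons-logging style; the helper and its names are assumptions, not 
the attached patch):

{code}
import org.apache.commons.logging.Log;
import org.apache.hadoop.security.token.SecretManager.InvalidToken;

final class XceiverLogging {
  // Demote expired/invalid block tokens, which are an expected client
  // condition, while keeping genuine server errors at ERROR level.
  static void logXceiverFailure(Log log, String context, Throwable t) {
    if (t instanceof InvalidToken || t.getCause() instanceof InvalidToken) {
      // checkAccess() has already logged the token problem as a warning
      log.debug(context + ": invalid block token", t);
    } else {
      log.error(context + ": DataXceiver error processing operation", t);
    }
  }
}
{code}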



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-13491) fix several warnings from findbugs

2016-08-12 Thread uncleGen (JIRA)
uncleGen created HADOOP-13491:
-

 Summary: fix several warnings from findbugs
 Key: HADOOP-13491
 URL: https://issues.apache.org/jira/browse/HADOOP-13491
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs
Affects Versions: HADOOP-12756
Reporter: uncleGen
 Fix For: HADOOP-12756






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13483) file-create should throw error rather than overwrite directories

2016-08-12 Thread uncleGen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418423#comment-15418423
 ] 

uncleGen commented on HADOOP-13483:
---

Got it, I am updating the patch.

> file-create should throw error rather than overwrite directories
> 
>
> Key: HADOOP-13483
> URL: https://issues.apache.org/jira/browse/HADOOP-13483
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs
>Affects Versions: HADOOP-12756
>Reporter: uncleGen
>Assignee: uncleGen
> Fix For: HADOOP-12756
>
> Attachments: HADOOP-13483-HADOOP-12756.002.patch, 
> HADOOP-13483.001.patch
>
>
> similar to [HADOOP-13188|https://issues.apache.org/jira/browse/HADOOP-13188]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13487) Hadoop KMS doesn't clean up old delegation tokens stored in Zookeeper

2016-08-12 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418406#comment-15418406
 ] 

Xiao Chen commented on HADOOP-13487:


Thanks for reporting this, [~Aguinore].

Just to clarify, the tokens will only be cleaned up after they reach their max 
lifetime (7 days by default).

If you're sure the tokens in ZooKeeper are expired, a workaround would be to 
remove them manually. But before that, would you mind capturing a snapshot of 
ZooKeeper's znodes here, for investigation?

> Hadoop KMS doesn't clean up old delegation tokens stored in Zookeeper
> -
>
> Key: HADOOP-13487
> URL: https://issues.apache.org/jira/browse/HADOOP-13487
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Affects Versions: 2.6.0
>Reporter: Alex Ivanov
>
> Configuration:
> CDH 5.5.1 (Hadoop 2.6+)
> KMS configured to store delegation tokens in Zookeeper
> DEBUG logging enabled in /etc/hadoop-kms/conf/kms-log4j.properties
> Findings:
> It seems to me delegation tokens never get cleaned up from Zookeeper past 
> their renewal date. I can see in the logs that the removal thread is started 
> with the expected interval:
> {code}
> 2016-08-11 08:15:24,511 INFO  AbstractDelegationTokenSecretManager - Starting 
> expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
> {code}
> However, I don't see any delegation token removals, which would be indicated 
> by the following log message:
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager 
> --> removeStoredToken(TokenIdent ident), line 769 [CDH]
> {code}
> if (LOG.isDebugEnabled()) {
>   LOG.debug("Removing ZKDTSMDelegationToken_"
>   + ident.getSequenceNumber());
> }
> {code}
> Meanwhile, I see a lot of expired delegation tokens in Zookeeper that don't 
> get cleaned up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org