[jira] [Updated] (HDFS-10397) Distcp should ignore -delete option if -diff option is provided instead of exiting

2016-05-14 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10397:
-
Attachment: HDFS-10397.002.patch

Thanks [~yzhangal] for the insightful suggestion. This motivated my v2 patch.

The pain in the code that handles different option combinations comes from the 
fact that we may validate and set each option individually. This approach is 
problematic because it is 1) inefficient, 2) not well defined, and 3) error prone.
# For 1), we validate the options multiple times, which is unnecessary and does 
not scale.
# For 2), some options are set after validation while others are set without 
validation. Distributing the decision of whether to validate across all the 
setters smells bad to me.
# For 3), when we validate an option A, chances are that its dependent option B 
is not set yet. This implies that the order of setting options has to be 
carefully chosen, leading to fragile code. Take {{syncFolder}} and 
{{skipCRC}} for example: skip CRC is valid only with update options, so if we 
set (and thus validate) {{skipCRC}} before setting {{syncFolder}}, the 
validation will fail even if both are provided on the command line.

I think a better way is to validate all the options only once, after all of 
them are set, i.e. in a central validation method. Moreover, the parser's job 
is to parse the options; it should not handle the validation of option 
combinations explicitly if that work can be delegated to the {{validate()}} 
method of {{DistCpOptions}}. Of course, if there is a parsing error in a single 
option (e.g. only one snapshot is provided for the {{-diff}} option), the 
parser should throw the {{IllegalArgumentException}} directly.
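
As a rough illustration of the central-validation idea, here is a minimal sketch. It is not the v2 patch: the class {{CentralValidationSketch}}, its fields, and the warning text are hypothetical stand-ins for {{DistCpOptions}}. The setters only record values, and {{validate()}} checks every combination once, so the order of the setter calls no longer matters.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for DistCpOptions; names and rules are illustrative only. */
public class CentralValidationSketch {
  private boolean syncFolder;
  private boolean skipCRC;
  private boolean deleteMissing;
  private boolean useDiff;
  private final List<String> warnings = new ArrayList<>();

  // Setters only record values; none of them validates anything.
  public CentralValidationSketch setSyncFolder(boolean v) { syncFolder = v; return this; }
  public CentralValidationSketch setSkipCRC(boolean v) { skipCRC = v; return this; }
  public CentralValidationSketch setDeleteMissing(boolean v) { deleteMissing = v; return this; }
  public CentralValidationSketch setUseDiff(boolean v) { useDiff = v; return this; }

  /** Checks all option combinations once, after every setter has run. */
  public void validate() {
    if (skipCRC && !syncFolder) {
      throw new IllegalArgumentException("Skip CRC is valid only with update options");
    }
    if (useDiff && !syncFolder) {
      throw new IllegalArgumentException("Diff is valid only with update options");
    }
    if (useDiff && deleteMissing) {
      // Assumed behavior from the issue description: ignore -delete and warn.
      deleteMissing = false;
      warnings.add("-delete option is ignored because -diff is provided");
    }
  }

  public boolean isDeleteMissing() { return deleteMissing; }
  public List<String> getWarnings() { return warnings; }

  public static void main(String[] args) {
    // skipCRC is set before syncFolder; this is fine because nothing validates yet.
    CentralValidationSketch opts = new CentralValidationSketch()
        .setSkipCRC(true)
        .setSyncFolder(true)
        .setUseDiff(true)
        .setDeleteMissing(true);
    opts.validate();
    System.out.println("deleteMissing=" + opts.isDeleteMissing()
        + " warnings=" + opts.getWarnings());
  }
}
```

In this sketch the parser would call the setters in whatever order the arguments arrive and invoke {{validate()}} once at the end.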

What are your thoughts?

Ping [~jingzhao] for more input.

> Distcp should ignore -delete option if -diff option is provided instead of 
> exiting
> --
>
> Key: HDFS-10397
> URL: https://issues.apache.org/jira/browse/HDFS-10397
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10397.000.patch, HDFS-10397.001.patch, 
> HDFS-10397.002.patch
>
>
> In distcp, {{-delete}} and {{-diff}} options are mutually exclusive. 
> [HDFS-8828] introduced strict checking, which makes existing applications 
> (or scripts) that previously worked just fine with both {{-delete}} and 
> {{-diff}} options stop working because of the 
> {{java.lang.IllegalArgumentException: Diff is valid only with update 
> options}} exception.
> To make it backward compatible, we can ignore the {{-delete}} option, given 
> the {{-diff}} option, instead of exiting the program. Along with that, we can 
> print a warning message saying that _Diff is valid only with update options, 
> and -delete option is ignored_.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10397) Distcp should ignore -delete option if -diff option is provided instead of exiting

2016-05-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283487#comment-15283487
 ] 

Hadoop QA commented on HDFS-10397:
--

| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 24s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 7m 28s | trunk passed |
| +1 | compile | 0m 19s | trunk passed with JDK v1.8.0_91 |
| +1 | compile | 0m 18s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 15s | trunk passed |
| +1 | mvnsite | 0m 23s | trunk passed |
| +1 | mvneclipse | 0m 16s | trunk passed |
| +1 | findbugs | 0m 31s | trunk passed |
| +1 | javadoc | 0m 14s | trunk passed with JDK v1.8.0_91 |
| +1 | javadoc | 0m 15s | trunk passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 19s | the patch passed |
| +1 | compile | 0m 17s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 0m 17s | the patch passed |
| +1 | compile | 0m 16s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 16s | the patch passed |
| +1 | checkstyle | 0m 13s | hadoop-tools/hadoop-distcp: patch generated 0 new + 76 unchanged - 11 fixed = 76 total (was 87) |
| +1 | mvnsite | 0m 21s | the patch passed |
| +1 | mvneclipse | 0m 11s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 0m 40s | the patch passed |
| +1 | javadoc | 0m 13s | the patch passed with JDK v1.8.0_91 |
| +1 | javadoc | 0m 12s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 9m 2s | hadoop-distcp in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 8m 44s | hadoop-distcp in the patch passed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 19s | Patch does not generate ASF License warnings. |
| | | 32m 8s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12804014/HDFS-10397.002.patch |
| JIRA Issue | HDFS-10397 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 1598bc190d8d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3fa13

[jira] [Commented] (HDFS-2173) saveNamespace should not throw IOE when only one storage directory fails to write VERSION file

2016-05-14 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283491#comment-15283491
 ] 

Andras Bokor commented on HDFS-2173:


[~andrew.wang]
Thanks a lot for reviewing this.
Yesterday I was not able to comment on JIRA due to the lockdown.
bq. writeAll is not quite right though...
That was my concern too. It was not obvious from the description and the 
tests. I should have raised my concern in a comment. My fault.

I updated my patch and uploaded [^HDFS-2173.02.patch] and 
[^HDFS-2173.03.patch]. They do the same thing but with different solutions.
Which do you think is more straightforward?

Although I uploaded a patch, it seems [~hadoopqa] was not triggered.

> saveNamespace should not throw IOE when only one storage directory fails to 
> write VERSION file
> --
>
> Key: HDFS-2173
> URL: https://issues.apache.org/jira/browse/HDFS-2173
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: Edit log branch (HDFS-1073), 0.23.0
>Reporter: Todd Lipcon
>Assignee: Andras Bokor
> Attachments: HDFS-2173.01.patch, HDFS-2173.02.patch, 
> HDFS-2173.03.patch
>
>
> This JIRA tracks a TODO in TestSaveNamespace. Currently, if one of the 
> storage directories fails while the VERSION files are being written, 
> the entire operation throws IOE. This is unnecessary -- instead, just that 
> directory should be marked as failed.
> This is targeted to be fixed _after_ HDFS-1073 is merged to trunk, since it 
> never causes data loss and would rarely occur in practice (the dir would 
> have to fail between writing the fsimage file and writing VERSION).






[jira] [Commented] (HDFS-10400) hdfs dfs -put exits with zero on error

2016-05-14 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283529#comment-15283529
 ] 

Yiqun Lin commented on HDFS-10400:
--

I have looked into the code. When there is more than one local source file, 
it invokes its parent method. The potential IOException from the 
{{processArgument}} method is caught in {{Command}} and not rethrown.
{code}
  protected void processArguments(LinkedList<PathData> args)
  throws IOException {
    for (PathData arg : args) {
      try {
        processArgument(arg);
      } catch (IOException e) {
        displayError(e);
      }
    }
  }
{code}
A similar case also happens in {{Command#processPaths}}. These methods are 
invoked from {{processRawArguments(args);}}, so their IOException is not 
thrown here, and numErrors is not increased either.
{code}
  public int run(String...argv) {
    LinkedList<String> args = new LinkedList<String>(Arrays.asList(argv));
    try {
      if (isDeprecated()) {
        displayWarning(
            "DEPRECATED: Please use '"+ getReplacementCommand() + "' instead.");
      }
      processOptions(args);
      processRawArguments(args);
    } catch (CommandInterruptException e) {
      displayError("Interrupted");
      return 130;
    } catch (IOException e) {
      displayError(e);
    }

    return (numErrors == 0) ? exitCode : exitCodeForError();
  }
{code}
So I think this is likely the reason. 
I'm glad to do further work on this. Could someone assign this JIRA to me? It 
seems that I can't assign it to myself at the moment. Thanks.
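
To make the control flow concrete, here is a small self-contained sketch, assuming the goal is to keep processing the remaining arguments while still exiting with a non-zero code. It is a simplified model with hypothetical names ({{ExitCodeSketch}} and its {{String}} arguments), not Hadoop's {{Command}} class and not a proposed patch: each caught IOException is counted, and the exit code is derived from that count.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified model of the flow discussed above; not Hadoop's Command. */
public class ExitCodeSketch {
  private int numErrors = 0;

  // Stand-in for processing one argument; fails for names starting with "bad".
  private void processArgument(String arg) throws IOException {
    if (arg.startsWith("bad")) {
      throw new IOException("cannot process " + arg);
    }
  }

  // Per-argument failures are caught so the loop continues, but each one is counted.
  private void processArguments(List<String> args) {
    for (String arg : args) {
      try {
        processArgument(arg);
      } catch (IOException e) {
        numErrors++;
        System.err.println("error: " + e.getMessage());
      }
    }
  }

  /** Returns 0 only if no argument failed, 1 otherwise. */
  public int run(String... argv) {
    numErrors = 0;
    processArguments(Arrays.asList(argv));
    return (numErrors == 0) ? 0 : 1;
  }

  public static void main(String[] args) {
    ExitCodeSketch cmd = new ExitCodeSketch();
    System.out.println("exit code: " + cmd.run("good1", "bad1", "good2"));
  }
}
```

If any catch block on the path swallows the exception without updating the counter, {{run}} has no way to know that something failed, which matches the symptom reported in this issue.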

> hdfs dfs -put exits with zero on error
> --
>
> Key: HDFS-10400
> URL: https://issues.apache.org/jira/browse/HDFS-10400
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jo Desmet
>
> On a filesystem that is about to fill up, execute "hdfs dfs -put" for a file 
> that is big enough to go over the limit. As a result, the command fails with 
> an exception, however the command terminates normally (exit code 0).
> Expectation is that any detectable failure generates an exit code different 
> than zero.
> Documentation on 
> https://hadoop.apache.org/docs/r1.2.1/file_system_shell.html#put states:
> Exit Code:
> Returns 0 on success and -1 on error. 
> following is the exception generated: 
> 16/05/11 13:37:07 INFO hdfs.DFSClient: Exception in createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2282)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1352)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1271)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:464)
> 16/05/11 13:37:07 INFO hdfs.DFSClient: Abandoning 
> BP-1964113808-130.8.138.99-1446787670498:blk_1073835906_95114
> 16/05/11 13:37:08 INFO hdfs.DFSClient: Excluding datanode 
> DatanodeInfoWithStorage[130.8.138.99:50010,DS-eed7039a-8031-499e-85a5-7216b9d766a8,DISK]






[jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity

2016-05-14 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283542#comment-15283542
 ] 

Kai Sasaki commented on HDFS-8287:
--

[~umamaheswararao] Sorry for the late response. I couldn't find time to work on 
this JIRA.
It seems a rebase is required. Can I check whether the current approach is still 
correct against the current DFSStripedOutputStream codebase? Thank you for 
taking care of this.

> DFSStripedOutputStream.writeChunk should not wait for writing parity 
> -
>
> Key: HDFS-8287
> URL: https://issues.apache.org/jira/browse/HDFS-8287
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Kai Sasaki
> Attachments: HDFS-8287-HDFS-7285.00.patch, 
> HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, 
> HDFS-8287-HDFS-7285.03.patch, HDFS-8287-HDFS-7285.04.patch, 
> HDFS-8287-HDFS-7285.05.patch, HDFS-8287-HDFS-7285.06.patch, 
> HDFS-8287-HDFS-7285.07.patch, HDFS-8287-HDFS-7285.08.patch, 
> HDFS-8287-HDFS-7285.09.patch, HDFS-8287-HDFS-7285.10.patch, 
> HDFS-8287-HDFS-7285.11.patch, HDFS-8287-HDFS-7285.WIP.patch, 
> HDFS-8287-performance-report.pdf, HDFS-8287.12.patch, HDFS-8287.13.patch, 
> HDFS-8287.14.patch, HDFS-8287.15.patch, h8287_20150911.patch, jstack-dump.txt
>
>
> When a striping cell is full, writeChunk computes and generates parity 
> packets.  It sequentially calls waitAndQueuePacket so that the user client 
> cannot continue to write data until it finishes.
> We should allow the user client to continue writing instead of blocking it 
> while parity is being written.






[jira] [Commented] (HDFS-8430) Erasure coding: compute file checksum for striped files (stripe by stripe)

2016-05-14 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283544#comment-15283544
 ] 

Kai Zheng commented on HDFS-8430:
-

Thanks [~umamaheswararao]. Yes, it would be great to have this in an RC release, 
along with some others. I will schedule this accordingly.

> Erasure coding: compute file checksum for striped files (stripe by stripe)
> --
>
> Key: HDFS-8430
> URL: https://issues.apache.org/jira/browse/HDFS-8430
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Kai Zheng
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It's designed 
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so that it can work for 
> striped block groups.






[jira] [Commented] (HDFS-10397) Distcp should ignore -delete option if -diff option is provided instead of exiting

2016-05-14 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283787#comment-15283787
 ] 

Yongjun Zhang commented on HDFS-10397:
--

Hi [~liuml07],

I think it's a very good idea to do what you did in 002. I will try to review 
early next week.

Thanks a lot.





