[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875213#comment-15875213
 ] 

Hadoop QA commented on HADOOP-14041:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
31s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
55s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
30s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
28s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
39s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
3s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 14s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
47s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 86m 17s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestKDiag |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14041 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12853616/HADOOP-14041-HADOOP-13345.009.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  findbugs  checkstyle  |
| uname | Linux b8aa84dfe4bf 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HADOOP-13345 / 7a1bce5 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11664/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11664/testReport/ |
| modules | C: hadoop-common-projec

[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875009#comment-15875009
 ] 

Hadoop QA commented on HADOOP-14041:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
55s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
31s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
43s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
48s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
45s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
39s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 13s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
48s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 92m 23s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestKDiag |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14041 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12853594/HADOOP-14041-HADOOP-13345.008.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  findbugs  checkstyle  |
| uname | Linux 5fa7b54ece64 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HADOOP-13345 / 8b37b6a |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11659/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11659/testReport/ |
| modules | C: hadoop-common-projec

[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-20 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15874960#comment-15874960
 ] 

Sean Mackrory commented on HADOOP-14041:


Looks good - +1

> CLI command to prune old metadata
> -
>
> Key: HADOOP-14041
> URL: https://issues.apache.org/jira/browse/HADOOP-14041
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14041-HADOOP-13345.001.patch, 
> HADOOP-14041-HADOOP-13345.002.patch, HADOOP-14041-HADOOP-13345.003.patch, 
> HADOOP-14041-HADOOP-13345.004.patch, HADOOP-14041-HADOOP-13345.005.patch, 
> HADOOP-14041-HADOOP-13345.006.patch, HADOOP-14041-HADOOP-13345.007.patch, 
> HADOOP-14041-HADOOP-13345.008.patch
>
>
> Add a CLI command that allows users to specify an age at which to prune 
> metadata that hasn't been modified for an extended period of time. Since the 
> primary use-case targeted at the moment is list consistency, it would make 
> sense (especially when authoritative=false) to prune metadata that is 
> expected to have become consistent a long time ago.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-20 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15874906#comment-15874906
 ] 

Aaron Fabbri commented on HADOOP-14041:
---

Thanks for the follow-up patch [~mackrorysd]. This is looking good. I'm 
generally +1 on this but am attaching a patch that makes a couple of minor 
changes.

- Remove a whitespace change.
- NullMetadataStore does support prune(); it is a no-op, matching the rest 
of that class.
- MetadataStoreTestBase tests the contract semantics (any files older than X 
are removed), not the specific DynamoDBMetadataStore behavior of leaving 
directories (that could be added in the TestDynamoDBMetadataStore subclass).
- Use allowMissing() in a finer-grained manner: we can still run the test and 
assertNotCached() when allowMissing().

I tested the three MetadataStore integration tests in US West 2.


> CLI command to prune old metadata
> -
>
> Key: HADOOP-14041
> URL: https://issues.apache.org/jira/browse/HADOOP-14041
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14041-HADOOP-13345.001.patch, 
> HADOOP-14041-HADOOP-13345.002.patch, HADOOP-14041-HADOOP-13345.003.patch, 
> HADOOP-14041-HADOOP-13345.004.patch, HADOOP-14041-HADOOP-13345.005.patch, 
> HADOOP-14041-HADOOP-13345.006.patch, HADOOP-14041-HADOOP-13345.007.patch
>
>
> Add a CLI command that allows users to specify an age at which to prune 
> metadata that hasn't been modified for an extended period of time. Since the 
> primary use-case targeted at the moment is list consistency, it would make 
> sense (especially when authoritative=false) to prune metadata that is 
> expected to have become consistent a long time ago.






[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-20 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15874513#comment-15874513
 ] 

Steve Loughran commented on HADOOP-14041:
-

The current convention w.r.t. interrupts tends to be one of:

* call {{Thread.currentThread().interrupt()}} to mark the thread as interrupted again.
* throw an {{InterruptedIOException}}, wrapping the inner InterruptedException.
* some classes which don't declare "throws IOE" wrap it in a generic 
RuntimeException (bad).

I don't know which is better; I'm sure in decades to come people will curse our 
decision, whatever it is.
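
For illustration, the first two conventions can be combined in one catch block. This is only a sketch: the class {{InterruptConventions}} and the method {{sleepOrThrowIOE}} are hypothetical names, not code from the patch.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

/** Hypothetical helper illustrating the first two conventions; not taken from the patch. */
final class InterruptConventions {

  /** Sleeps, converting an interrupt into an InterruptedIOException. */
  static void sleepOrThrowIOE(long millis) throws IOException {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      // Thread.sleep cleared the flag when it threw; set it again so that
      // code further up the stack can still see the interrupt.
      Thread.currentThread().interrupt();
      // Wrap the cause in an IOException subclass so that callers only
      // need to declare "throws IOException".
      InterruptedIOException wrapped =
          new InterruptedIOException("interrupted while sleeping");
      wrapped.initCause(e);
      throw wrapped;
    }
  }

  public static void main(String[] args) {
    // Simulate an interrupt that arrives before the blocking call.
    Thread.currentThread().interrupt();
    try {
      sleepOrThrowIOE(1000);
      System.out.println("completed");
    } catch (IOException e) {
      System.out.println(e.getClass().getSimpleName()
          + ", interrupted flag still set: "
          + Thread.currentThread().isInterrupted());
    }
  }
}
```

Callers that only handle IOException still get a typed signal, and anything that polls the interrupt flag afterwards still sees it.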

> CLI command to prune old metadata
> -
>
> Key: HADOOP-14041
> URL: https://issues.apache.org/jira/browse/HADOOP-14041
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14041-HADOOP-13345.001.patch, 
> HADOOP-14041-HADOOP-13345.002.patch, HADOOP-14041-HADOOP-13345.003.patch, 
> HADOOP-14041-HADOOP-13345.004.patch, HADOOP-14041-HADOOP-13345.005.patch, 
> HADOOP-14041-HADOOP-13345.006.patch, HADOOP-14041-HADOOP-13345.007.patch
>
>
> Add a CLI command that allows users to specify an age at which to prune 
> metadata that hasn't been modified for an extended period of time. Since the 
> primary use-case targeted at the moment is list consistency, it would make 
> sense (especially when authoritative=false) to prune metadata that is 
> expected to have become consistent a long time ago.






[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15873462#comment-15873462
 ] 

Hadoop QA commented on HADOOP-14041:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
50s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
35s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
44s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
29s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
28s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
38s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 13s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
47s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 88m 51s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestKDiag |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14041 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12853440/HADOOP-14041-HADOOP-13345.007.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  findbugs  checkstyle  |
| uname | Linux de9cbbe3214b 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HADOOP-13345 / 8b37b6a |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11655/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11655/testReport/ |
| modules | C: hadoop-common-projec

[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-18 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15873358#comment-15873358
 ] 

Sean Mackrory commented on HADOOP-14041:


{quote}This part of the change could be left out, I think? NullMetadataStore 
always prunes! Where prune is defined as removing anything older than X.. 
always true for empty set. {quote}

The trouble is that, to make this testable, pruning has to be defined as ONLY 
pruning what it should. NullMetadataStore tends to get a little carried away at 
that part. So, like you said, it may be closely linked with allowMissing. Will 
rev the patch on all the other input...

> CLI command to prune old metadata
> -
>
> Key: HADOOP-14041
> URL: https://issues.apache.org/jira/browse/HADOOP-14041
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14041-HADOOP-13345.001.patch, 
> HADOOP-14041-HADOOP-13345.002.patch, HADOOP-14041-HADOOP-13345.003.patch, 
> HADOOP-14041-HADOOP-13345.004.patch, HADOOP-14041-HADOOP-13345.005.patch, 
> HADOOP-14041-HADOOP-13345.006.patch
>
>
> Add a CLI command that allows users to specify an age at which to prune 
> metadata that hasn't been modified for an extended period of time. Since the 
> primary use-case targeted at the moment is list consistency, it would make 
> sense (especially when authoritative=false) to prune metadata that is 
> expected to have become consistent a long time ago.






[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-17 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872846#comment-15872846
 ] 

Aaron Fabbri commented on HADOOP-14041:
---

Thanks for the follow-up patch [~mackrorysd]. Looks good. Of the comments 
below, I think the important ones are the prune() method prototype and errors 
going to stderr.

{noformat}
+  public void testPruneDirs() throws Exception {
+    // This test does not necessarily define required behavior: directories
+    // that become empty after a prune operation could be cleaned up, but
+    // currently they don't because if a file was created in that directory
+    // mid-prune, it would violate the invariant that all ancestors of a file
{noformat}

Tiny nit: this invariant is an implementation detail of the dynamo MS, not a 
MetadataStore invariant per se. Could mention the word dynamo here.

{noformat}
+    // exist in the metastore. If an implementation could satisfy this, it
+    // would be okay for this test not to pass.
+    Assume.assumeFalse(ms instanceof NullMetadataStore);
+    createNewDirs("/pruneDirs/dir");
{noformat}

Did you mean to change this Assume to call {{supportsPruning()}}? 
Technically, it seems you should use that, and maybe {{allowMissing()}} as well. 
When allowMissing() returns true, the metadata store may not return 
results you just put into it (like a cache where something got evicted before 
you asked for it again).
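
To make the allowMissing() semantics concrete, here is a rough sketch of how a test helper might tolerate a lossy store. Everything in it is hypothetical ({{AllowMissingSketch}}, its {{Store}} interface and both implementations are stand-ins, not the real MetadataStore or test base class).

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical test helpers; Store is a stand-in, not the real MetadataStore. */
final class AllowMissingSketch {

  interface Store {
    void put(String path);
    boolean contains(String path);
  }

  /** Retains everything it is given. */
  static final class RetainingStore implements Store {
    private final Set<String> entries = new HashSet<>();
    public void put(String path) { entries.add(path); }
    public boolean contains(String path) { return entries.contains(path); }
  }

  /** Retains nothing, like a cache that evicts immediately. */
  static final class ForgetfulStore implements Store {
    public void put(String path) { }
    public boolean contains(String path) { return false; }
  }

  /** Presence is only required when the store may not drop entries. */
  static void assertCached(Store ms, String path, boolean allowMissing) {
    if (!ms.contains(path) && !allowMissing) {
      throw new AssertionError(path + " should be cached");
    }
  }

  /** Absence can be checked against every store, lossy or not. */
  static void assertNotCached(Store ms, String path) {
    if (ms.contains(path)) {
      throw new AssertionError(path + " should not be cached");
    }
  }

  public static void main(String[] args) {
    Store[] stores = {new RetainingStore(), new ForgetfulStore()};
    boolean[] allowMissing = {false, true};
    for (int i = 0; i < stores.length; i++) {
      stores[i].put("/dir/file");
      assertCached(stores[i], "/dir/file", allowMissing[i]);
      assertNotCached(stores[i], "/dir/other");
    }
    System.out.println("both stores pass");
  }
}
```

The point of the split is that only the "is it still there?" half of a test needs to be relaxed for a lossy store; the "is it gone?" half can still run.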

{noformat}
--- a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/MetadataStore.java
+++ b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/MetadataStore.java
@@ -165,4 +165,15 @@ void move(Collection pathsToDelete, 
Collection
    * @throws IOException if there is an error
    */
   void destroy() throws IOException;
+
+  /**
+   * Clear any metadata older than a specified time from the repository. Note
+   * that modification times should be in UTC, as returned by System
+   * .currentTimeMillis at the time of modification.
+   *
+   * @param modTime Oldest modification time to allow
+   * @throws IOException if there is an error
+   * @throws InterruptedException if the process is interrupted
+   */
+  void prune(long modTime) throws InterruptedException, IOException;
 }
{noformat}
A couple of things:
1. We should mention here that implementations *must* clear any file metadata 
older than modTime, *may* clear any directory metadata older than modTime, and 
throw an UnsupportedOperationException(*) otherwise?
2. Instead of declaring a second checked exception (InterruptedException), IMO 
it should always be wrapped in an IOException, so this should only be throws 
IOException.

(*) [~ste...@apache.org] is this the idiomatic thing to do here in Hadoop?
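
As a sketch of the suggested contract (file entries older than modTime must go, and only IOException is declared), here is a toy in-memory version. {{InMemoryPruneSketch}} and its methods are hypothetical and only illustrate the semantics; directory handling and UnsupportedOperationException are left out.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory store; illustrative only, not an S3Guard class. */
final class InMemoryPruneSketch {

  /** File path -> modification time in milliseconds since the epoch (UTC). */
  private final Map<String, Long> files = new ConcurrentHashMap<>();

  void put(String path, long modTime) {
    files.put(path, modTime);
  }

  boolean contains(String path) {
    return files.containsKey(path);
  }

  /**
   * Removes every file entry strictly older than modTime. Entries stamped
   * exactly modTime are kept, since modTime is the oldest time allowed.
   * Only IOException is declared, as suggested in the review.
   */
  void prune(long modTime) throws IOException {
    files.values().removeIf(t -> t < modTime);
  }

  public static void main(String[] args) throws IOException {
    InMemoryPruneSketch store = new InMemoryPruneSketch();
    store.put("/old", 1000L);
    store.put("/new", 3000L);
    store.prune(2000L);
    System.out.println("old kept: " + store.contains("/old")
        + ", new kept: " + store.contains("/new"));
  }
}
```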

{noformat}
--- a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/NullMetadataStore.java
+++ b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/NullMetadataStore.java
@@ -87,6 +87,10 @@ public void destroy() throws IOException {
   }
  
   @Override
+  public void prune(long modTime) throws IOException {
+  }
+
{noformat}
Love the algorithm here.   Classic no-op, my fave.

{noformat}
+  if (confDelta <= 0 && cliDelta <= 0) {
+    System.out.println(
+        "You must specify a positive age for metadata to prune.");
+  }
+
{noformat}
I think this should go to stderr (search for "stderr" 
[here|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html]).
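
A minimal sketch of what that could look like, assuming the check also returns a non-zero code so callers can react. The class {{PruneArgsSketch}}, the method {{checkAge}} and the exit-code constants are hypothetical, not names from the patch.

```java
import java.io.PrintStream;

/** Hypothetical argument check; names and exit codes are illustrative only. */
final class PruneArgsSketch {
  static final int SUCCESS = 0;
  static final int INVALID_ARGUMENT = 1; // hypothetical non-zero exit code

  /**
   * Validates the configured and command-line ages. The error text goes to
   * the supplied error stream rather than stdout, so it does not get mixed
   * into output that other tools may parse.
   */
  static int checkAge(long confDelta, long cliDelta, PrintStream err) {
    if (confDelta <= 0 && cliDelta <= 0) {
      err.println("You must specify a positive age for metadata to prune.");
      return INVALID_ARGUMENT;
    }
    return SUCCESS;
  }

  public static void main(String[] args) {
    int rc = checkAge(0, 0, System.err);
    System.out.println("exit code: " + rc);
  }
}
```

Passing the stream in keeps the check easy to test; a real tool would presumably pass System.err.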

{noformat}
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestNullMetadataStore.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestNullMetadataStore.java
@@ -51,6 +51,12 @@ public boolean allowMissing() {
     return true;
   }
  
+  /** This MetadataStore won't store anything, so there's nothing to prune. */
+  @Override
+  public boolean supportsPruning() {
+    return false;
+  }
{noformat}

This part of the change could be left out, I think? NullMetadataStore always 
prunes! If prune is defined as removing anything older than X, that is always 
true for the empty set.  :-)

> CLI command to prune old metadata
> -
>
> Key: HADOOP-14041
> URL: https://issues.apache.org/jira/browse/HADOOP-14041
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14041-HADOOP-13345.001.patch, 
> HADOOP-14041-HADOOP-13345.002.patch, HADOOP-14041-HADOOP-13345.003.patch, 
> HADOOP-14041-HADOOP-13345.004.patch, HADOOP-14041-HADOOP-13345.005.patch, 
> HADOOP-14041-HADOOP-13345.006.patch
>
>
> Add a CLI command that allows users to specify an age at which to prune 
> metadata that hasn't been modified for an extend

[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872436#comment-15872436
 ] 

Hadoop QA commented on HADOOP-14041:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
10s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m  
0s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
31s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
31s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
47s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
13s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 37s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
47s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
42s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 97m 12s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.fs.sftp.TestSFTPFileSystem |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14041 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12853293/HADOOP-14041-HADOOP-13345.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  xml  findbugs  checkstyle  |
| uname | Linux a931b4ee63e0 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HADOOP-13345 / 8b37b6a |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HADOOP-Build/11653/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/11653/testReport/ |
| modules | C: hadoop-comm

[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871899#comment-15871899
 ] 

Steve Loughran commented on HADOOP-14041:
-

Why, in {{DynamoDBMetadataStore}} line 584, is only the IOE's {{getMessage()}} 
logged, without the details, and the exception not rethrown?

If the IOEs really are to be swallowed, then it should be a full log.warn. 
Though I think it should actually just throw the IOE up to the caller. Why? For 
tests to show something failed, for management tools calling it directly to 
detect the same, and for CLI tools to report and return an error code. 
Something has gone wrong.
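
A minimal Java sketch of the three options discussed above: log only the message, log the full exception, or propagate it to the caller. This is not the code at line 584. {{PruneErrorHandling}}, {{Step}} and {{runCli}} are hypothetical names used purely for illustration, and {{java.util.logging}} stands in for whatever logger the store actually uses.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

final class PruneErrorHandling {
  private static final Logger LOG = Logger.getLogger("PruneErrorHandling");

  /** Hypothetical stand-in for one metadata store operation that may fail. */
  interface Step {
    void run() throws IOException;
  }

  /** Option 1: swallow and log only the message; the stack trace is lost. */
  static void swallowMessageOnly(Step step) {
    try {
      step.run();
    } catch (IOException e) {
      LOG.warning("Prune failed: " + e.getMessage());
    }
  }

  /** Option 2: still swallow, but log the full exception including its stack trace. */
  static void swallowFullLog(Step step) {
    try {
      step.run();
    } catch (IOException e) {
      LOG.log(Level.WARNING, "Prune failed", e);
    }
  }

  /** Option 3: let the IOException reach the caller. */
  static void propagate(Step step) throws IOException {
    step.run();
  }

  /** A CLI wrapper can then turn the propagated failure into an exit code. */
  static int runCli(Step step) {
    try {
      propagate(step);
      return 0;
    } catch (IOException e) {
      System.err.println("prune: " + e);
      return 1;
    }
  }

  public static void main(String[] args) {
    Step failing = () -> { throw new IOException("table unreachable"); };
    swallowMessageOnly(failing);
    swallowFullLog(failing);
    System.out.println("exit code: " + runCli(failing));
  }
}
```

With options 1 and 2 the caller cannot tell that anything failed; only option 3 lets a test, a management tool, or the CLI observe the failure.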

> CLI command to prune old metadata
> -
>
> Key: HADOOP-14041
> URL: https://issues.apache.org/jira/browse/HADOOP-14041
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14041-HADOOP-13345.001.patch, 
> HADOOP-14041-HADOOP-13345.002.patch, HADOOP-14041-HADOOP-13345.003.patch, 
> HADOOP-14041-HADOOP-13345.004.patch, HADOOP-14041-HADOOP-13345.005.patch
>
>
> Add a CLI command that allows users to specify an age at which to prune 
> metadata that hasn't been modified for an extended period of time. Since the 
> primary use-case targeted at the moment is list consistency, it would make 
> sense (especially when authoritative=false) to prune metadata that is 
> expected to have become consistent a long time ago.
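
As a rough illustration of the age-based pruning described in the issue summary, the following Java sketch drops entries whose modification time is older than a cutoff. It is only a sketch under stated assumptions: the in-memory map stands in for the metadata store, the class and method names are hypothetical, and the age is assumed to be given in hours.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

final class PruneByAge {
  /**
   * Removes every entry whose modification time is strictly older than
   * (now - maxAgeHours) and returns the number of entries removed.
   * The map (path -> modification time in ms) is a hypothetical stand-in
   * for a real metadata store.
   */
  static int prune(Map<String, Long> modTimes, long nowMillis, long maxAgeHours) {
    long cutoff = nowMillis - TimeUnit.HOURS.toMillis(maxAgeHours);
    int before = modTimes.size();
    modTimes.values().removeIf(modTime -> modTime < cutoff);
    return before - modTimes.size();
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    Map<String, Long> store = new LinkedHashMap<>();
    store.put("s3a://example-bucket/old", now - TimeUnit.HOURS.toMillis(3));
    store.put("s3a://example-bucket/new", now - TimeUnit.MINUTES.toMillis(5));
    System.out.println("pruned " + prune(store, now, 2) + ", kept " + store.keySet());
  }
}
```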



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-16 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15870930#comment-15870930
 ] 

Aaron Fabbri commented on HADOOP-14041:
---

This usage also doesn't work:

{noformat}
$ hadoop s3a prune -H 2 -m dynamodb://fabbri-bucket s3a://fabbri-bucket
2017-02-16 14:02:26,320 INFO s3guard.S3GuardTool: create metadata store: 
dynamodb://fabbri-dev scheme: dynamodb
2017-02-16 14:02:26,456 WARN util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
2017-02-16 14:02:27,191 ERROR s3guard.DynamoDBClientFactory: Incorrect DynamoDB 
endpoint: null
java.lang.IllegalArgumentException: endpoint cannot be null
at com.amazonaws.util.RuntimeHttpUtils.toUri(RuntimeHttpUtils.java:147)
at 
com.amazonaws.AmazonWebServiceClient.toURI(AmazonWebServiceClient.java:224)
{noformat}

A similar error occurs with {{hadoop s3a prune -H 2 -m dynamodb://fabbri-bucket -e dynamodb.us-west-2.amazonaws.com}}.



[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-16 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15870761#comment-15870761
 ] 

Aaron Fabbri commented on HADOOP-14041:
---

[~mackrorysd] I'm also fine with doing a followup "S3guard CLI improvements" 
JIRA; there are multiple related issues I'd like to tackle. So I'm fine with 
committing this patch (once I finish my testing) and then filing a new JIRA.



[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-16 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15870710#comment-15870710
 ] 

Sean Mackrory commented on HADOOP-14041:


Test failure #5 is addressed by HADOOP-14046. The version marker patch went in 
after I had last run those tests when cleaning up the original S3GuardTool 
tests, and it breaks them.

I'll make the change required to have an S3 path provide the table name 
instead. I'm really surprised your last example behaves differently. I don't 
like the logic for determining how the CLI tools connect to the metastore, and 
it's been an issue a couple of times now. What if we drop the 
"-m dynamodb:// " notion entirely and just use configuration + optional S3 path 
to connect?





[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-16 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15870704#comment-15870704
 ] 

Aaron Fabbri commented on HADOOP-14041:
---

Also, shouldn't this work?

{noformat}
$ hadoop s3a -Dfs.s3a.s3guard.ddb.table=fabbri-bucket prune -H 1 
s3a://fabbri-bucket
Usage: hadoop s3a [init|destroy|import|diff|prune] [OPTIONS] [ARGUMENTS]
perform metadata store administrative commands for s3a filesystem.
{noformat}



[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-16 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15870696#comment-15870696
 ] 

Aaron Fabbri commented on HADOOP-14041:
---

Just recording results from my test runs last night:
mvn clean verify -Ds3guard -Ddynamo -Dscale
Tests run: 366, Failures: 3, Errors: 2, Skipped: 70

{noformat}
Failed tests:
(1)  
ITestS3AContractRootDir>AbstractContractRootDirectoryTest.testRecursiveRootListing:222->Assert.assertTrue:41->Assert.fail:88
 files mismatch:   "s3a://fabbri-dev/user/fabbri/test/file"  
"s3a://fabbri-dev/user/fabbri/test/parentdir/child"
(2)  
ITestS3AContractRootDir>AbstractContractRootDirectoryTest.testRmEmptyRootDirNonRecursive:95->Assert.fail:88
 After 1 attempts: listing after rm /* not empty
final [00] S3AFileStatus{path=s3a://fabbri-dev/Users; isDirectory=true; 
modification_time=0; access_time=0; owner=fabbri; group=fabbri; 
permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=false
(3)  
ITestS3AContractRootDir.testListEmptyRootDirectory:63->AbstractContractRootDirectoryTest.testListEmptyRootDirectory:186->Assert.fail:88
 Deleted file: unexpectedly found s3a://fabbri-dev/user as  
S3AFileStatus{path=s3a://fabbri-dev/user; isDirectory=true; 
modification_time=0; access_time=0; owner=fabbri; group=fabbri; 
permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=false

Tests in error:
(4)  ITestS3ACredentialsInURL.testInstantiateFromURL:86 » InterruptedIO 
initTable: ...
(5)  ITestS3GuardToolDynamoDB.testDestroyDynamoDBMetadataStore:145 » IO S3Guard 
tab...
{noformat}

1-3 are root directory test failures, which have been flaky. One is leftover 
files from FileSystemContractBaseTest; the other two look like something 
creating a user/ directory while the test is running?

4 is expected: s3guard will not use URI credentials. (We should skip this test 
if we don't already do that in a pending patch.)

5 is this: S3Guard table lacks version marker. Table: 
destroyDynamoDBMetadataStore-1546206104
at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.verifyVersionCompatibility(DynamoDBMetadataStore.java:667)
at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.initTable(DynamoDBMetadataStore.java:630)
at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.initialize(DynamoDBMetadataStore.java:288)

I don't think any of these are related, except maybe the last one?

As for testing the prune command itself, the first thing I notice is that it 
behaves a bit differently than, say, diff.  Diff appears to use bucket name as 
table name if one is not set, but prune requires setting the table name.

{noformat}
$ hadoop s3a prune -H 1 s3a://fabbri-bucket
No DynamoDB table name configured!
{noformat}
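
To make the difference concrete, here is a minimal Java sketch of the fallback that diff appears to apply: use the configured table name if present, otherwise fall back to the bucket name from the S3 path. It is a sketch, not the S3GuardTool code. The plain {{Map}} stands in for the Hadoop configuration, {{TableNameResolution}} and {{resolveTable}} are hypothetical names, and only the key {{fs.s3a.s3guard.ddb.table}} and the error text come from the examples in this thread.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

final class TableNameResolution {
  /** Configuration key used in the command-line examples in this thread. */
  static final String TABLE_KEY = "fs.s3a.s3guard.ddb.table";

  /**
   * Returns the configured table name, or the bucket (URI host) of the
   * optional S3 path when no table name is configured.
   */
  static String resolveTable(Map<String, String> conf, URI s3Path) {
    String configured = conf.get(TABLE_KEY);
    if (configured != null && !configured.isEmpty()) {
      return configured;
    }
    if (s3Path != null && s3Path.getHost() != null) {
      return s3Path.getHost(); // assumed fallback: bucket name doubles as table name
    }
    throw new IllegalArgumentException("No DynamoDB table name configured!");
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    URI path = URI.create("s3a://example-bucket/some/dir");
    System.out.println(resolveTable(conf, path));
    conf.put(TABLE_KEY, "explicit-table");
    System.out.println(resolveTable(conf, path));
  }
}
```

Sharing one such helper between the subcommands would keep prune and diff consistent.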




[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-15 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15868994#comment-15868994
 ] 

Aaron Fabbri commented on HADOOP-14041:
---

Your latest patch looks good to me, +1.

I'm fine with the default sleep value.  It is at least tunable now, which is 
great.  Some folks may want it to go fast, others will want to minimize impact 
to DDB provisioned IO for other live workloads.
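
For illustration only, a tunable pause between delete batches could look like the Java sketch below. This is not the patch's implementation: the batch size, the {{Consumer}} callback standing in for the actual DynamoDB delete call, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

final class ThrottledPrune {
  /**
   * Deletes keys in batches, sleeping between batches to leave provisioned
   * capacity for other workloads. Returns the number of batches issued.
   * deleteBatch is a hypothetical stand-in for the real delete call.
   */
  static int deleteInBatches(List<String> keys, int batchSize, long sleepMillis,
      Consumer<List<String>> deleteBatch) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    int batches = 0;
    for (int start = 0; start < keys.size(); start += batchSize) {
      int end = Math.min(start + batchSize, keys.size());
      deleteBatch.accept(keys.subList(start, end));
      batches++;
      if (end < keys.size() && sleepMillis > 0) {
        try {
          Thread.sleep(sleepMillis); // the tunable delay; 0 means go as fast as possible
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve the interrupt and stop early
          break;
        }
      }
    }
    return batches;
  }

  public static void main(String[] args) {
    List<String> keys = Arrays.asList("k1", "k2", "k3", "k4", "k5");
    List<String> deleted = new ArrayList<>();
    int batches = deleteInBatches(keys, 2, 50L, deleted::addAll);
    System.out.println(batches + " batches, deleted " + deleted);
  }
}
```

A small sleep trades total prune time for lower pressure on the table; setting it to zero gives the "go fast" behaviour.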

Yetus looks clean except for TestKDiag stuff which I believe is unrelated 
(HADOOP-14030).

I will do some testing and commit this if all looks good.





[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15868934#comment-15868934
 ] 

Hadoop QA commented on HADOOP-14041:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 58s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 50s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m  4s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 46s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 36s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 13s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 11s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 12m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 20s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 59s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 50s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 91m 30s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestKDiag |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14041 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12852640/HADOOP-14041-HADOOP-13345.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  xml  findbugs  checkstyle  |
| uname | Linux 38d5fc14a83e 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HADOOP-13345 / 94287ce |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HADOOP-Build/11638/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/11638/testReport/ |
| modules | C: hadoop-common-projec

[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-15 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15868432#comment-15868432
 ] 

Sean Mackrory commented on HADOOP-14041:


[~fabbri] Oh you know what? That's what I'm talking about here: 
https://issues.apache.org/jira/browse/HADOOP-13736



[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-15 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15868367#comment-15868367
 ] 

Aaron Fabbri commented on HADOOP-14041:
---

[~mackrorysd] Any idea what is up with the jenkins unit failures here?



[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866676#comment-15866676
 ] 

Hadoop QA commented on HADOOP-14041:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 57s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 44s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 35s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 31s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 27s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 41s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  1s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 12s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 17s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 12s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 45s{color} | {color:red} hadoop-aws in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 88m 28s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestKDiag |
|   | hadoop.fs.s3a.s3guard.TestDynamoDBMetadataStore |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14041 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12852640/HADOOP-14041-HADOOP-13345.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  xml  findbugs  checkstyle  |
| uname | Linux 8a2f8aebbd12 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HADOOP-13345 / 2c3f575 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HADOOP-Build/11625/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt |
| unit | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11625/artifact/p

[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-14 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866447#comment-15866447
 ] 

Sean Mackrory commented on HADOOP-14041:


I missed the javadoc issue locally. The hadoop-common failures are not related. 
The hadoop-aws failure is something I've seen a lot locally and have mentioned 
elsewhere but it seems no one else was seeing it and occasionally I don't see 
it (no idea how - we use FileStatus all over S3Guard). Removing the assertion 
and not casting to S3AFileStatus in that function makes everything work nicely. 
Has no one else seen this failure?

I'll upload a new patch that addresses the javadoc oversight.




[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866357#comment-15866357
 ] 

Hadoop QA commented on HADOOP-14041:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 58s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 43s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 47s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 32s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 29s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 42s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  1s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 11s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 23s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 25s{color} | {color:red} hadoop-tools_hadoop-aws generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 14s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 45s{color} | {color:red} hadoop-aws in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 39s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 89m  3s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestKDiag |
|   | hadoop.fs.s3a.s3guard.TestDynamoDBMetadataStore |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14041 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12852602/HADOOP-14041-HADOOP-13345.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  xml  findbugs  checkstyle  |
| uname | Linux c658e2598be4 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HADOOP-13345 / 2c3f575 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| javadoc | https://builds.apache.org/job/PreCommit-HADOOP-Build/11624/artifact/patchprocess/diff-javadoc-javadoc-hadoop-tools_hadoop-aws.txt |
| unit | 
https://bu

[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-08 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858874#comment-15858874
 ] 

Aaron Fabbri commented on HADOOP-14041:
---

Thank you for the patch.  Overall it looks pretty good.  A couple of things 
need addressing.  The core-default and the InterruptedException are the 
important ones.

{noformat}
+  @InterfaceStability.Unstable
+  public static final String S3GUARD_CLI_PRUNE_AGE =
+  "fs.s3a.s3guard.cli.prune.age";
{noformat}

Probably want a snippet in core-default.xml.

{noformat}
+  @InterfaceStability.Unstable
+  public static final String S3GUARD_DDB_BATCH_SLEEP_MSEC_KEY =
+  "fs.s3a.s3guard.ddb.batch.sleep";
+  public static final int S3GUARD_DDB_BATCH_SLEEP_MSEC_DEFAULT = 25;
{noformat}

Same here.  Also wondering if we should call this "...ddb.prune.batch.sleep" so 
as not to cause confusion with stuff like HADOOP-13904.  I think prune is going to 
remain a special case since it is a background priority job.  We could also 
call it "...ddb.background.sleep" to future-proof it for other background tasks 
(e.g. if we introduced a background scrubber or integrity checker).

{noformat}
+deletionBatch.add(path);
+  if (deletionBatch.size() == S3GUARD_DDB_BATCH_WRITE_REQUEST_LIMIT) {
+Thread.sleep(delay);
+processBatchWriteRequest(pathToKey(deletionBatch), new Item[0]);
+  }
+} catch (IOException e) {
+  LOG.error(e.getMessage());
+}
+if (deletionBatch.size() > 0) {
+  Thread.sleep(delay);
+  processBatchWriteRequest(pathToKey(deletionBatch), new Item[0]);
+}
{noformat}

Minor nit: I would make sleep happen between batches (not before the first).  
e.g. 

{noformat}
long batchCount = 0;
...
deletionBatch.add(path);
if (deletionBatch.size() == S3GUARD_DDB_BATCH_WRITE_REQUEST_LIMIT) {
  if (batchCount++ > 0) {  // don't sleep before first batch
    Thread.sleep(delay);
  }
  processBatchWriteRequest(pathToKey(deletionBatch), new Item[0]);
...
{noformat}

You could also use that for an interesting log message {{LOG.debug("Finished 
processing {} batches", batchCount);}}
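
To make the between-batches idea concrete, here is a minimal self-contained sketch. It is not code from the patch: {{BatchedDeleter}} and its {{writer}} callback are hypothetical names, with the callback standing in for {{processBatchWriteRequest}}.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper (not from the patch): flushes paths in fixed-size batches. */
final class BatchedDeleter {
  private final int batchLimit;
  private final long delayMsec;
  // Assumption: stands in for processBatchWriteRequest(...) in the patch.
  private final Consumer<List<String>> writer;
  private final List<String> batch = new ArrayList<>();
  private long batchCount = 0;

  BatchedDeleter(int batchLimit, long delayMsec, Consumer<List<String>> writer) {
    this.batchLimit = batchLimit;
    this.delayMsec = delayMsec;
    this.writer = writer;
  }

  /** Queues one path and flushes once the batch is full. */
  void add(String path) throws InterruptedException {
    batch.add(path);
    if (batch.size() == batchLimit) {
      flush();
    }
  }

  /** Flushes any partial final batch and returns the number of batches written. */
  long finish() throws InterruptedException {
    if (!batch.isEmpty()) {
      flush();
    }
    return batchCount;
  }

  private void flush() throws InterruptedException {
    if (batchCount++ > 0) { // sleep only between batches, never before the first
      Thread.sleep(delayMsec);
    }
    writer.accept(new ArrayList<>(batch));
    batch.clear();
  }
}
```

The value returned by {{finish()}} is the same counter that could feed the debug log message suggested above.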

{noformat}
+} catch (InterruptedException e) {
+  LOG.warn("Pruning operation was interrupted!");
+}
{noformat}
You need to propagate this exception, or set the thread's interrupt status.
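
One possible shape for that, as a sketch only: restore the interrupt flag and rethrow as {{java.io.InterruptedIOException}} so callers that only handle {{IOException}} still see the interruption. The class and method names here are hypothetical, and whether to rethrow or just restore the flag is a design choice for the patch.

```java
import java.io.InterruptedIOException;

/** Hypothetical helper (not from the patch) showing interrupt-safe sleeping. */
final class InterruptAwareSleep {
  private InterruptAwareSleep() {
  }

  /**
   * Sleeps for the given time. On interrupt, restores the thread's interrupt
   * status and propagates the condition as an InterruptedIOException.
   */
  static void sleepOrAbort(long msec) throws InterruptedIOException {
    try {
      Thread.sleep(msec);
    } catch (InterruptedException e) {
      // Thread.sleep clears the flag when it throws, so set it again.
      Thread.currentThread().interrupt();
      InterruptedIOException ioe =
          new InterruptedIOException("Pruning operation was interrupted");
      ioe.initCause(e);
      throw ioe;
    }
  }
}
```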






> CLI command to prune old metadata
> -
>
> Key: HADOOP-14041
> URL: https://issues.apache.org/jira/browse/HADOOP-14041
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14041-HADOOP-13345.001.patch, 
> HADOOP-14041-HADOOP-13345.002.patch, HADOOP-14041-HADOOP-13345.003.patch
>
>
> Add a CLI command that allows users to specify an age at which to prune 
> metadata that hasn't been modified for an extended period of time. Since the 
> primary use-case targeted at the moment is list consistency, it would make 
> sense (especially when authoritative=false) to prune metadata that is 
> expected to have become consistent a long time ago.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-06 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15854863#comment-15854863
 ] 

Aaron Fabbri commented on HADOOP-14041:
---

A couple of thoughts on this patch:

1. I think prune() should be optional for implementations.  prune() is an 
offline algorithm for evicting old metadata from the metadata store.  Some 
implementations (e.g. LocalMetadataStore) probably want to do this as an online 
algorithm.  When I get around to doing HADOOP-13649, I would probably remove 
the prune() function there and do eviction as we go, driven by the clients' accesses.

2. I think the work here should be broken up into batches, and there should be 
a sleep parameter to the prune function "batchSleepMsec" which is the number of 
milliseconds the implementation should sleep between pruning batches.  This is 
a simple way to have a tunable "niceness" parameter for the process.  This 
allows users to minimize impact to production jobs by making it much less 
likely that provisioned capacity will be exceeded.

3. The directory pruning has a couple of issues.  I'm wondering if we should 
omit directory pruning from the v1 of this.  Currently it builds a set of all 
directories in the whole metadata store, in memory, then checks whether each one 
is empty, and prunes it if so.  This could be optimized some, but the problems 
of having everything in memory, and of potentially breaking the "all paths to 
root are stored" invariant of the DDB data, remain.

Let me share a variation on this algorithm I'm thinking of:

*Phase 1*: prune files.
{noformat}
while (number_pruned > 0) :
   paths = table_scan(mod_time < x && is_dir==false, limit=BATCH_SIZE)
   do_batched_delete(paths)
   number_pruned = paths.size()
   sleep(batchSleepMsecs)
{noformat}
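
A rough Java rendering of the Phase 1 loop, as a sketch only. It assumes an in-memory map of path to mod_time in place of the DDB table; {{FilePruneSketch}}, {{pruneFiles}} and {{scan}} are hypothetical names, not existing MetadataStore methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of Phase 1; the map stands in for the metadata table. */
final class FilePruneSketch {
  private FilePruneSketch() {
  }

  /** Deletes files older than cutoff in batches; returns how many were pruned. */
  static int pruneFiles(Map<String, Long> fileModTimes, long cutoff,
      int batchSize, long batchSleepMsec) throws InterruptedException {
    int total = 0;
    int pruned;
    do {
      List<String> batch = scan(fileModTimes, cutoff, batchSize);
      fileModTimes.keySet().removeAll(batch); // stands in for do_batched_delete
      pruned = batch.size();
      total += pruned;
      if (pruned > 0) {
        Thread.sleep(batchSleepMsec); // the tunable "niceness" pause
      }
    } while (pruned > 0);
    return total;
  }

  /** Stands in for table_scan(mod_time < x && is_dir==false, limit). */
  private static List<String> scan(Map<String, Long> fileModTimes,
      long cutoff, int limit) {
    List<String> out = new ArrayList<>();
    for (Map.Entry<String, Long> e : fileModTimes.entrySet()) {
      if (out.size() == limit) {
        break;
      }
      if (e.getValue() < cutoff) {
        out.add(e.getKey());
      }
    }
    return out;
  }
}
```

The loop ends on the first scan that returns nothing, so the process sleeps once per non-empty batch.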

*Phase 2*: directory pruning
Change meaning of mod_time for directories in DDB: it is create time.
{noformat}
while (number_pruned > 0) :
   paths = table_scan(mod_time < x && is_dir==true, limit=BATCH_SIZE)
   emptyOldDirs = paths.filter(isEmptyDir(x))
   do_batched_delete(emptyOldDirs)
{noformat}

Phase 2 is still subject to races where a file is placed into a directory right 
after we evaluate isEmptyDir(path).  Solving this with DDB doesn't seem 
trivial.  For now we could expose an option for prune() where the caller can 
select to prune just files, or to prune files and directories, with the caveat 
that directory pruning should not happen if there are other clients actively 
modifying the filesystem?
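
To illustrate the option, a sketch under stated assumptions: {{PruneMode}}, {{emptyOldDirs}} and {{isEmptyDir}} are hypothetical names, directory timestamps are treated as create times as proposed for Phase 2, and the whole store is modelled as an in-memory collection of paths. The race described above is not solved here; the emptiness check is only valid at the moment it runs.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the Phase 2 filter plus a caller-selected mode. */
final class DirPruneSketch {
  /** Assumed option: prune only files, or files and empty old directories. */
  enum PruneMode { FILES_ONLY, FILES_AND_DIRECTORIES }

  private DirPruneSketch() {
  }

  /** Returns directories created before cutoff that currently have no children. */
  static List<String> emptyOldDirs(Map<String, Long> dirCreateTimes,
      Collection<String> allPaths, long cutoff, PruneMode mode) {
    List<String> result = new ArrayList<>();
    if (mode == PruneMode.FILES_ONLY) {
      return result; // caller opted out of directory pruning
    }
    for (Map.Entry<String, Long> dir : dirCreateTimes.entrySet()) {
      if (dir.getValue() < cutoff && isEmptyDir(dir.getKey(), allPaths)) {
        result.add(dir.getKey());
      }
    }
    Collections.sort(result);
    return result;
  }

  /** True if no stored path lies underneath dir. Racy against concurrent writers. */
  static boolean isEmptyDir(String dir, Collection<String> allPaths) {
    String prefix = dir + "/";
    for (String p : allPaths) {
      if (p.startsWith(prefix)) {
        return false;
      }
    }
    return true;
  }
}
```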








[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-03 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15852389#comment-15852389
 ] 

Aaron Fabbri commented on HADOOP-14041:
---

Thanks for the hard work on this [~mackrorysd].  Will try to get you a review 
by mid-day Monday.

I also saw that error.  Do you have a table name defined in your 
src/test/resources core-site.xml/auth-keys.xml?  If so, the table name that the 
test uses to override that config may be getting set too late.  I think 
it went away when I removed the table name from my config.







[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-02 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15850820#comment-15850820
 ] 

Sean Mackrory commented on HADOOP-14041:


Been thinking about it some more and cleaning up directories is very tricky. 
One problem is that we don't put a mod_time on directories (presumably 
just because S3 doesn't?), so it's impossible to distinguish between a directory 
that has existed for a long time and has had all of its contents pruned, vs. a 
directory that was just created recently and had no contents to prune (yet). 
Putting a mod_time on a directory could be done in 2 ways: we could just use 
that as a creation time, or as the time when its list of children changed. If it's 
only used for deciding when to prune old metadata, using it as creation time 
allows us to clean very old directories that don't have more recent children 
without the overhead of updating it every time we add or modify a child. But 
that might be a bit of a departure from the meaning expressed by "modification 
time".

I'm thinking a couple of things:

1) For now, I think I'll just prune directories that did have contents, but are 
now completely empty post-prune. Later, maybe we can add mod_time for 
directories and clean up directories that are old enough to be pruned and are 
empty, even though they didn't have children removed in the prune. The more I 
think about it, the more I think that will be rare and not worth adding 
mod_time to all directories just to clean it up more nicely.

2) Having thought about the gap between identifying files to prune and which 
directories to prune, it's probably better to do this in very small batches. 
It's okay for this prune command to take a longer time to run because we're 
making many round trips. The benefit of that is we minimize the window in which 
files can get created in a directory that is being cleaned up and might be 
considered empty. It also minimizes impact on other workloads.

So ultimately I'm thinking the best way to do this is to clean up directories 
that did have children but had them all pruned (and THEIR parents if the same 
is now true of the parent directory), and to do this in very small batches or 
even individually. The more I think about it, it's probably not worth adding 
mod_time to directories to handle this any more completely. Would love to hear 
others' input, though.
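
A minimal sketch of that approach, assuming the store can be modelled as a set of absolute paths (files and directories alike). {{ParentCleanupSketch}} and its methods are hypothetical names, not part of the patch, and the child check is as racy as discussed above.

```java
import java.util.Set;

/** Hypothetical sketch: prune one file, then any ancestors it leaves empty. */
final class ParentCleanupSketch {
  private ParentCleanupSketch() {
  }

  /** Removes the file, then walks up removing each ancestor with no children left. */
  static void pruneFileAndEmptyParents(Set<String> store, String file) {
    store.remove(file);
    String dir = parent(file);
    while (dir != null && !hasChildren(store, dir)) {
      store.remove(dir);
      dir = parent(dir);
    }
  }

  /** Returns the parent path, or null once the next step up would be the root. */
  private static String parent(String path) {
    int idx = path.lastIndexOf('/');
    return idx <= 0 ? null : path.substring(0, idx);
  }

  /** True if any stored path lies underneath dir. */
  private static boolean hasChildren(Set<String> store, String dir) {
    String prefix = dir + "/";
    for (String p : store) {
      if (p.startsWith(prefix)) {
        return true;
      }
    }
    return false;
  }
}
```

Because the walk goes leaf-first and stops at the first ancestor that still has children, every remaining path keeps its full chain of parents.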



