[jira] [Updated] (HDFS-8449) Add tasks count metrics to datanode for ECWorker

2016-05-03 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-8449:

Attachment: HDFS-8449-009.patch

> Add tasks count metrics to datanode for ECWorker
> 
>
> Key: HDFS-8449
> URL: https://issues.apache.org/jira/browse/HDFS-8449
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8449-000.patch, HDFS-8449-001.patch, 
> HDFS-8449-002.patch, HDFS-8449-003.patch, HDFS-8449-004.patch, 
> HDFS-8449-005.patch, HDFS-8449-006.patch, HDFS-8449-007.patch, 
> HDFS-8449-008.patch, HDFS-8449-009.patch
>
>
> This sub-task tries to record the EC recovery tasks that a datanode has done, 
> including total tasks, failed tasks and successful tasks.
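
For illustration, a minimal sketch of the kind of counters this describes, using 
the metrics2 library (class and field names are assumptions, not taken from the 
patch):

{code:java}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableCounterLong;

// Hypothetical datanode-side counters for EC reconstruction work.
@Metrics(about = "ECWorker task counters", context = "dfs")
class ECWorkerTaskMetrics {
  @Metric("Total EC reconstruction tasks") MutableCounterLong ecTasksTotal;
  @Metric("Failed EC reconstruction tasks") MutableCounterLong ecTasksFailed;
  @Metric("Successful EC reconstruction tasks") MutableCounterLong ecTasksSuccessful;
}
{code}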






[jira] [Issue Comment Deleted] (HDFS-10361) Support starting StorageContainerManager as a daemon

2016-05-03 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-10361:
-
Comment: was deleted

(was: | (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} 
| {color:red} HDFS-10361 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12802110/HDFS-10361.01.patch |
| JIRA Issue | HDFS-10361 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15353/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.

)

> Support starting StorageContainerManager as a daemon
> 
>
> Key: HDFS-10361
> URL: https://issues.apache.org/jira/browse/HDFS-10361
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-1312
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-10361-HDFS-7240.01.patch
>
>
> Add shell script support for starting the StorageContainerManager service as 
> a daemon.






[jira] [Updated] (HDFS-10361) Support starting StorageContainerManager as a daemon

2016-05-03 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-10361:
-
Attachment: HDFS-10361-HDFS-7240.01.patch

> Support starting StorageContainerManager as a daemon
> 
>
> Key: HDFS-10361
> URL: https://issues.apache.org/jira/browse/HDFS-10361
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-1312
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-10361-HDFS-7240.01.patch
>
>
> Add shell script support for starting the StorageContainerManager service as 
> a daemon.






[jira] [Updated] (HDFS-10361) Support starting StorageContainerManager as a daemon

2016-05-03 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-10361:
-
Attachment: (was: HDFS-10361.01.patch)

> Support starting StorageContainerManager as a daemon
> 
>
> Key: HDFS-10361
> URL: https://issues.apache.org/jira/browse/HDFS-10361
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-1312
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-10361-HDFS-7240.01.patch
>
>
> Add shell script support for starting the StorageContainerManager service as 
> a daemon.






[jira] [Commented] (HDFS-10361) Support starting StorageContainerManager as a daemon

2016-05-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270099#comment-15270099
 ] 

Hadoop QA commented on HDFS-10361:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} 
| {color:red} HDFS-10361 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12802110/HDFS-10361.01.patch |
| JIRA Issue | HDFS-10361 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15353/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> Support starting StorageContainerManager as a daemon
> 
>
> Key: HDFS-10361
> URL: https://issues.apache.org/jira/browse/HDFS-10361
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-1312
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-10361.01.patch
>
>
> Add shell script support for starting the StorageContainerManager service as 
> a daemon.






[jira] [Updated] (HDFS-10361) Support starting StorageContainerManager as a daemon

2016-05-03 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-10361:
-
Status: Patch Available  (was: Open)

> Support starting StorageContainerManager as a daemon
> 
>
> Key: HDFS-10361
> URL: https://issues.apache.org/jira/browse/HDFS-10361
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-1312
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-10361.01.patch
>
>
> Add shell script support for starting the StorageContainerManager service as 
> a daemon.






[jira] [Updated] (HDFS-10361) Support starting StorageContainerManager as a daemon

2016-05-03 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-10361:
-
Attachment: HDFS-10361.01.patch

The v01 patch adds a new service called {{scm}} to the {{hdfs}} script for the 
StorageContainerManager. The service can be managed with:
* hdfs --daemon start scm
* hdfs --daemon stop scm

> Support starting StorageContainerManager as a daemon
> 
>
> Key: HDFS-10361
> URL: https://issues.apache.org/jira/browse/HDFS-10361
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-1312
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-10361.01.patch
>
>
> Add shell script support for starting the StorageContainerManager service as 
> a daemon.






[jira] [Created] (HDFS-10361) Support starting StorageContainerManager as a daemon

2016-05-03 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-10361:


 Summary: Support starting StorageContainerManager as a daemon
 Key: HDFS-10361
 URL: https://issues.apache.org/jira/browse/HDFS-10361
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-1312
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


Add shell script support for starting the StorageContainerManager service as a 
daemon.






[jira] [Commented] (HDFS-10338) DistCp masks potential CRC check failures

2016-05-03 Thread Lin Yiqun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270094#comment-15270094
 ] 

Lin Yiqun commented on HDFS-10338:
--

Hi, [~raviprak], thanks for the review.
I agree with your comment. Are there any other comments on the latest patch? 
[~yzhangal], could you please find time to look at this DistCp patch?

If there are no other comments, I will post a new patch later.

> DistCp masks potential CRC check failures
> -
>
> Key: HDFS-10338
> URL: https://issues.apache.org/jira/browse/HDFS-10338
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.7.1
>Reporter: Elliot West
>Assignee: Lin Yiqun
> Attachments: HDFS-10338.001.patch, HDFS-10338.002.patch
>
>
> There appear to be edge cases whereby CRC checks may be circumvented when 
> requests for checksums from the source or target file system fail. In this 
> event CRCs could differ between the source and target and yet the DistCp copy 
> would succeed, even when the 'skip CRC check' option is not being used.
> The code in question is contained in the method 
> [{{org.apache.hadoop.tools.util.DistCpUtils#checksumsAreEqual(...)}}|https://github.com/apache/hadoop/blob/release-2.7.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/util/DistCpUtils.java#L457]
> Specifically this code block suggests that if there is a failure when trying 
> to read the source or target checksum then the method will return {{true}} 
> (i.e.  the checksums are equal), implying that the check succeeded. In actual 
> fact we just failed to obtain the checksum and could not perform the check.
> {code}
> try {
>   sourceChecksum = sourceChecksum != null ? sourceChecksum : 
> sourceFS.getFileChecksum(source);
>   targetChecksum = targetFS.getFileChecksum(target);
> } catch (IOException e) {
>   LOG.error("Unable to retrieve checksum for " + source + " or "
> + target, e);
> }
> return (sourceChecksum == null || targetChecksum == null ||
>   sourceChecksum.equals(targetChecksum));
> {code}
> I believe that at the very least the caught {{IOException}} should be 
> re-thrown. If this is not deemed desirable then I believe an option 
> ({{--strictCrc}}?) should be added to enforce a strict check where we require 
> that both the source and target CRCs are retrieved, are not null, and are 
> then compared for equality. If for any reason either of the CRC retrievals 
> fails, an exception should be thrown.
> Clearly some {{FileSystems}} do not support CRCs and invocations to 
> {{FileSystem.getFileChecksum(...)}} return {{null}} in these instances. I 
> would suggest that these should fail a strict CRC check to prevent users 
> developing a false sense of security in their copy pipeline.
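
For illustration, a minimal sketch of the strict variant proposed above (method 
name and semantics assumed for discussion; this is not from any posted patch):

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical strict check: retrieval failures propagate instead of being
// swallowed, and a null checksum (file system without CRC support) fails
// the comparison rather than silently passing.
public static boolean strictChecksumsAreEqual(FileSystem sourceFS, Path source,
    FileSystem targetFS, Path target) throws IOException {
  FileChecksum sourceChecksum = sourceFS.getFileChecksum(source); // may throw
  FileChecksum targetChecksum = targetFS.getFileChecksum(target); // may throw
  if (sourceChecksum == null || targetChecksum == null) {
    return false; // checksums unavailable: fail the strict check
  }
  return sourceChecksum.equals(targetChecksum);
}
{code}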






[jira] [Commented] (HDFS-10346) Implement asynchronous setPermission for DistributedFileSystem

2016-05-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270045#comment-15270045
 ] 

Hadoop QA commented on HDFS-10346:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 27s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
47s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 36s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 8s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
36s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 6s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
35s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 15s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 8s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 24s 
{color} | {color:green} the patch passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 59s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 59s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 51s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 23s 
{color} | {color:green} the patch passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 16s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 26s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_92. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 110m 23s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_92. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 18s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 51s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 255m 45s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_92 Failed ju

[jira] [Commented] (HDFS-10359) Allow trigger block report from all datanodes

2016-05-03 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270042#comment-15270042
 ] 

Arpit Agarwal commented on HDFS-10359:
--

Hi [~Tao Jie], processing full block reports is an expensive operation for the 
NameNode and it gets more expensive as the cluster size and data grow. You will 
cause a denial of service attack on your NameNode if you trigger full block 
reports every time you issue setrep. The default block report interval is 6 
hours for a good reason.

bq. however the namenode would not notice missing blocks until the block report 
in 6 hours. In this case, we propose to trigger a block report from all 
datanodes before setrep -w. Furthermore, if we set the replication of blocks to 
1, some blocks may be corrupted.
You should never set the replication factor of a file to 1 unless you are okay 
with losing the data or it can be trivially regenerated.

bq. It is OK to use a script to trigger a block report from all datanodes, or 
just restart the namenode.
Neither is necessary or recommended. You should trust the self-healing 
mechanisms of HDFS to detect and deal with lost blocks, and let go of the 
expectation that all blocks will have exactly the expected number of replicas 
at all times. Under- and over-replication are common in any real cluster as 
disks fail, network links get congested, or nodes go away and come back.

> Allow trigger block report from all datanodes
> -
>
> Key: HDFS-10359
> URL: https://issues.apache.org/jira/browse/HDFS-10359
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0, 2.6.1
>Reporter: Tao Jie
>
> Since HDFS-7278 allows triggering a block report from one particular 
> datanode, it would be helpful to add an option to this command to trigger 
> block reports from all datanodes.
> The command might look like this:
> *hdfs dfsadmin -triggerBlockReport \[-incremental\] 
> *






[jira] [Commented] (HDFS-10220) Namenode failover due to too long locking in LeaseManager.Monitor

2016-05-03 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270026#comment-15270026
 ] 

Mingliang Liu commented on HDFS-10220:
--

I also think changing from {{Lease leaseToCheck = sortedLeases.poll();}} to 
{{Lease leaseToCheck = sortedLeases.peek();}} will address [~walter.k.su]'s 
comment. Moreover, we can move the statements in the {{finally}} block out of it 
(putting them after the try-catch instead). I'm not in favor of "breaking" an 
upper-level loop from inside a {{finally}} block, as cautioned by 
[ERR04-J.|https://www.securecoding.cert.org/confluence/display/java/ERR04-J.+Do+not+complete+abruptly+from+a+finally+block].
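
For illustration, a rough sketch of that restructuring (simplified; method names 
and signatures are assumed, this is not the actual {{LeaseManager}} code):

{code:java}
// peek() instead of poll(), with the bookkeeping formerly in the finally
// block moved after the try-catch, so the loop never completes abruptly
// from a finally block (ERR04-J). Releasing a lease removes it from
// sortedLeases, so peek() still makes progress.
while (!sortedLeases.isEmpty() && sortedLeases.peek().expiredHardLimit()) {
  Lease leaseToCheck = sortedLeases.peek();
  try {
    // ... internalReleaseLease(...) for each path under leaseToCheck ...
  } catch (IOException e) {
    LOG.warn("Cannot release the path in the lease " + leaseToCheck, e);
  }
  // formerly in the finally block: stop once the lock has been held too long
  if (isMaxLockHoldToReleaseLease(monotonicNow() - startTimeMs)) {
    break;
  }
}
{code}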

Other than this, I have some nits:
# {{isMaxLockHoldToReleaseLease}} can be private
# In the test, according to the {{assertEquals(expected, actual)}} signature, we 
should swap the arguments to avoid a confusing failure message:
{code:java}
- assertEquals(lm.countLease(), numLease);
+ assertEquals(numLease, lm.countLease());
{code}
# We may still need the javadoc for {{MAX_LOCK_HOLD_TO_RELEASE_LAESE_MS}}

> Namenode failover due to too long locking in LeaseManager.Monitor
> 
>
> Key: HDFS-10220
> URL: https://issues.apache.org/jira/browse/HDFS-10220
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Nicolas Fraison
>Assignee: Nicolas Fraison
>Priority: Minor
> Attachments: HADOOP-10220.001.patch, HADOOP-10220.002.patch, 
> HADOOP-10220.003.patch, HADOOP-10220.004.patch, HADOOP-10220.005.patch, 
> threaddump_zkfc.txt
>
>
> I have faced a namenode failover due to an unresponsive namenode detected by 
> the zkfc, with lots of WARN messages (5 million) like this one:
> _org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All 
> existing blocks are COMPLETE, lease removed, file closed._
> In the threaddump taken by the zkfc there are lots of threads blocked on a 
> lock.
> Looking at the code, a lock is taken by the LeaseManager.Monitor when leases 
> must be released. Due to the very large number of leases to be released, the 
> namenode took too long to release them, blocking all other tasks and making 
> the zkfc think that the namenode was unavailable/stuck.
> The idea of this patch is to limit the number of leases released each time we 
> check for leases, so the lock won't be held for too long a period.






[jira] [Commented] (HDFS-10359) Allow trigger block report from all datanodes

2016-05-03 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270024#comment-15270024
 ] 

Tao Jie commented on HDFS-10359:


[~kihwal], [~cnauroth], [~arpitagarwal], [~linyiqun] Thank you for the replies!
I understand that triggering block reports from all datanodes would put too 
much pressure on the namenode.
Our scenario is that we sometimes use the *setrep -w* command to verify the 
replication of blocks on all datanodes. Blocks on a datanode may somehow be 
lost, however the namenode would not notice missing blocks until the block 
report in 6 hours. In this case, we propose to trigger a block report from all 
datanodes before *setrep -w*. Furthermore, if we set the replication of blocks 
to 1, some blocks may be corrupted.
It is OK to use a script to trigger a block report from all datanodes, or just 
restart the namenode.
I am not very familiar with this logic; if I am wrong or there is a better way, 
please correct me.


> Allow trigger block report from all datanodes
> -
>
> Key: HDFS-10359
> URL: https://issues.apache.org/jira/browse/HDFS-10359
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0, 2.6.1
>Reporter: Tao Jie
>
> Since HDFS-7278 allows triggering a block report from one particular 
> datanode, it would be helpful to add an option to this command to trigger 
> block reports from all datanodes.
> The command might look like this:
> *hdfs dfsadmin -triggerBlockReport \[-incremental\] 
> *






[jira] [Commented] (HDFS-10360) DataNode may format directory and lose blocks if current/VERSION is missing

2016-05-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269973#comment-15269973
 ] 

Hadoop QA commented on HDFS-10360:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
7s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
58s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s 
{color} | {color:green} the patch passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s 
{color} | {color:green} the patch passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 107m 15s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_92. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 102m 20s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
25s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 237m 18s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_92 Failed junit tests | 
hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
|   | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.hdfs.server.namenode.TestEditLog |
|   | hadoop.hdfs.TestSafeMode |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.TestAsyncDFSRename |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality |
|   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandb

[jira] [Commented] (HDFS-2173) saveNamespace should not throw IOE when only one storage directory fails to write VERSION file

2016-05-03 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269962#comment-15269962
 ] 

Todd Lipcon commented on HDFS-2173:
---

Hrm, the original test/bug was filed >4.5 years ago, so I'm a bit rusty on this 
area of the code and can't remember the details of the test. I haven't worked on 
HDFS much in recent years. Maybe someone like [~andrew.wang] or [~yzhangal] 
might know this area better?

> saveNamespace should not throw IOE when only one storage directory fails to 
> write VERSION file
> --
>
> Key: HDFS-2173
> URL: https://issues.apache.org/jira/browse/HDFS-2173
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: Edit log branch (HDFS-1073), 0.23.0
>Reporter: Todd Lipcon
>Assignee: Andras Bokor
> Attachments: HDFS-2173.01.patch
>
>
> This JIRA tracks a TODO in TestSaveNamespace. Currently, if, while writing 
> the VERSION files in the storage directories, one of the directories fails, 
> the entire operation throws IOE. This is unnecessary -- instead, just that 
> directory should be marked as failed.
> This is targeted to be fixed _after_ HDFS-1073 is merged to trunk, since it 
> never causes data loss, and would rarely occur in practice (the dir would 
> have to fail between writing the fsimage file and writing VERSION).
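
For illustration, a rough sketch of the behavior the description asks for 
(method names assumed, not the actual NNStorage code):

{code:java}
// On a per-directory failure, mark only that storage directory as failed
// and continue, instead of propagating the IOException and failing the
// whole saveNamespace operation.
for (StorageDirectory sd : storage.dirIterable(null)) {
  try {
    storage.writeProperties(sd); // writes the VERSION file for this dir
  } catch (IOException e) {
    LOG.warn("Failed to write VERSION in " + sd.getRoot(), e);
    storage.reportErrorsOnDirectory(sd); // hypothetical: mark this dir failed
  }
}
{code}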






[jira] [Commented] (HDFS-10220) Namenode failover due to too long locking in LeaseManager.Monitor

2016-05-03 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269921#comment-15269921
 ] 

Ravi Prakash commented on HDFS-10220:
-

Thanks Nicolas! You're amazing :-)
I think I would like to hear opinions from some of the other people like 
[~kihwal] and [~yzhangal]. I'd be wary of throwing away leases for which 
recovery did not successfully finish.

> Namenode failover due to too long locking in LeaseManager.Monitor
> 
>
> Key: HDFS-10220
> URL: https://issues.apache.org/jira/browse/HDFS-10220
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Nicolas Fraison
>Assignee: Nicolas Fraison
>Priority: Minor
> Attachments: HADOOP-10220.001.patch, HADOOP-10220.002.patch, 
> HADOOP-10220.003.patch, HADOOP-10220.004.patch, HADOOP-10220.005.patch, 
> threaddump_zkfc.txt
>
>
> I have faced a namenode failover due to an unresponsive namenode detected by 
> the zkfc, with lots of WARN messages (5 million) like this one:
> _org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All 
> existing blocks are COMPLETE, lease removed, file closed._
> In the threaddump taken by the zkfc there are lots of threads blocked on a 
> lock.
> Looking at the code, a lock is taken by the LeaseManager.Monitor when leases 
> must be released. Due to the very large number of leases to be released, the 
> namenode took too long to release them, blocking all other tasks and making 
> the zkfc think that the namenode was unavailable/stuck.
> The idea of this patch is to limit the number of leases released each time we 
> check for leases, so the lock won't be held for too long a period.






[jira] [Commented] (HDFS-10338) DistCp masks potential CRC check failures

2016-05-03 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269909#comment-15269909
 ] 

Ravi Prakash commented on HDFS-10338:
-

{{ignoreFailures}} would be a tragically confusing and misnamed variable. I 
would propose something like {{ignoreCRCerrors}}.

> DistCp masks potential CRC check failures
> -
>
> Key: HDFS-10338
> URL: https://issues.apache.org/jira/browse/HDFS-10338
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.7.1
>Reporter: Elliot West
>Assignee: Lin Yiqun
> Attachments: HDFS-10338.001.patch, HDFS-10338.002.patch
>
>
> There appear to be edge cases whereby CRC checks may be circumvented when 
> requests for checksums from the source or target file system fail. In this 
> event CRCs could differ between the source and target and yet the DistCp copy 
> would succeed, even when the 'skip CRC check' option is not being used.
> The code in question is contained in the method 
> [{{org.apache.hadoop.tools.util.DistCpUtils#checksumsAreEqual(...)}}|https://github.com/apache/hadoop/blob/release-2.7.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/util/DistCpUtils.java#L457]
> Specifically this code block suggests that if there is a failure when trying 
> to read the source or target checksum then the method will return {{true}} 
> (i.e.  the checksums are equal), implying that the check succeeded. In actual 
> fact we just failed to obtain the checksum and could not perform the check.
> {code}
> try {
>   sourceChecksum = sourceChecksum != null ? sourceChecksum : 
> sourceFS.getFileChecksum(source);
>   targetChecksum = targetFS.getFileChecksum(target);
> } catch (IOException e) {
>   LOG.error("Unable to retrieve checksum for " + source + " or "
> + target, e);
> }
> return (sourceChecksum == null || targetChecksum == null ||
>   sourceChecksum.equals(targetChecksum));
> {code}
> I believe that at the very least the caught {{IOException}} should be 
> re-thrown. If this is not deemed desirable then I believe an option 
> ({{--strictCrc}}?) should be added to enforce a strict check where we require 
> that both the source and target CRCs are retrieved, are not null, and are 
> then compared for equality. If for any reason either of the CRC retrievals 
> fails, an exception should be thrown.
> Clearly some {{FileSystems}} do not support CRCs and invocations to 
> {{FileSystem.getFileChecksum(...)}} return {{null}} in these instances. I 
> would suggest that these should fail a strict CRC check to prevent users 
> developing a false sense of security in their copy pipeline.






[jira] [Commented] (HDFS-9902) Support different values of dfs.datanode.du.reserved per storage type

2016-05-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269902#comment-15269902
 ] 

Hudson commented on HDFS-9902:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9709 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9709/])
HDFS-9902. Support different values of dfs.datanode.du.reserved per (arp: rev 
6d77d6eab7790ed7ae2cad5b327ba5d1deb485db)
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsVolumeList.java


> Support different values of dfs.datanode.du.reserved per storage type
> -
>
> Key: HDFS-9902
> URL: https://issues.apache.org/jira/browse/HDFS-9902
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.2
>Reporter: Pan Yuxuan
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-9902-02.patch, HDFS-9902-03.patch, 
> HDFS-9902-04.patch, HDFS-9902-05.patch, HDFS-9902.patch
>
>
> Hadoop now supports different storage types (DISK, SSD, ARCHIVE and 
> RAM_DISK), but they all share one configuration, dfs.datanode.du.reserved.
> The DISK size may be several TB while the RAM_DISK size may be only several 
> tens of GB.
> The problem is that when I configure DISK and RAM_DISK (tmpfs) in the same 
> DN and set dfs.datanode.du.reserved to 10GB, this wastes a lot of the 
> RAM_DISK capacity.
> Since the usage of RAM_DISK can be 100%, I don't want the 
> dfs.datanode.du.reserved configured for DISK to impact the usage of tmpfs.
> So can we add a new configuration for RAM_DISK, or just skip this 
> configuration for RAM_DISK?
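
For illustration, a rough sketch of how a per-storage-type reserved value could 
be resolved (the key derivation and method names are assumptions, not verified 
against the committed patch):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.StorageType;
import org.apache.hadoop.util.StringUtils;

// Prefer a storage-type-specific key, e.g. dfs.datanode.du.reserved.ram_disk,
// and fall back to the generic dfs.datanode.du.reserved otherwise.
static long getReserved(Configuration conf, StorageType storageType) {
  String specificKey = "dfs.datanode.du.reserved."
      + StringUtils.toLowerCase(storageType.toString());
  long generic = conf.getLong("dfs.datanode.du.reserved", 0L);
  return conf.getLong(specificKey, generic);
}
{code}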






[jira] [Updated] (HDFS-9902) Support different values of dfs.datanode.du.reserved per storage type

2016-05-03 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-9902:

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

+1 from me too for the v5 patch.

I committed it for 2.8.0. Thanks for the contribution [~brahmareddy] and thanks 
for the reviews [~xyao] and [~panyuxuan].

> Support different values of dfs.datanode.du.reserved per storage type
> -
>
> Key: HDFS-9902
> URL: https://issues.apache.org/jira/browse/HDFS-9902
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.2
>Reporter: Pan Yuxuan
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-9902-02.patch, HDFS-9902-03.patch, 
> HDFS-9902-04.patch, HDFS-9902-05.patch, HDFS-9902.patch
>
>
> Hadoop now supports different storage types (DISK, SSD, ARCHIVE and 
> RAM_DISK), but they all share one configuration, dfs.datanode.du.reserved.
> The DISK size may be several TB while the RAM_DISK size may be only several 
> tens of GB.
> The problem is that when I configure DISK and RAM_DISK (tmpfs) in the same 
> DN and set dfs.datanode.du.reserved to 10GB, this wastes a lot of the 
> RAM_DISK capacity.
> Since the usage of RAM_DISK can be 100%, I don't want the 
> dfs.datanode.du.reserved configured for DISK to impact the usage of tmpfs.
> So can we add a new configuration for RAM_DISK, or just skip this 
> configuration for RAM_DISK?






[jira] [Commented] (HDFS-9890) libhdfs++: Add test suite to simulate network issues

2016-05-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269860#comment-15269860
 ] 

Hadoop QA commented on HDFS-9890:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 20s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
51s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 47s 
{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 50s 
{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 16s 
{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s 
{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s 
{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 21s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 22m 5s {color} | 
{color:red} hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_91 with JDK 
v1.8.0_91 generated 41 new + 23 unchanged - 6 fixed = 64 total (was 29) {color} 
|
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 3m 21s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 21s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 29s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 25m 34s {color} | 
{color:red} hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.7.0_95 with JDK 
v1.7.0_95 generated 41 new + 23 unchanged - 6 fixed = 64 total (was 29) {color} 
|
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 3m 29s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 29s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 7s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 8s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 39s 
{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK 
v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 4m 30s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 48m 1s {color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.7.0_95 Failed CTEST tests | 
test_libhdfs_mini_stress_hdfspp_test_shim_static |
\\
\\
|| Subsystem || Re

[jira] [Updated] (HDFS-10344) DistributedFileSystem#getTrashRoots should skip encryption zone that does not have .Trash

2016-05-03 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-10344:
--
Reporter: Namit Maheshwari  (was: Xiaoyu Yao)

> DistributedFileSystem#getTrashRoots should skip encryption zone that does not 
> have .Trash
> -
>
> Key: HDFS-10344
> URL: https://issues.apache.org/jira/browse/HDFS-10344
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Namit Maheshwari
>Assignee: Xiaoyu Yao
> Fix For: 2.8.0
>
> Attachments: HDFS-10344.00.patch
>
>
> HDFS-8831 added trash support for encryption zones. For an encryption zone 
> that does not have .Trash created yet, DistributedFileSystem#getTrashRoots 
> should skip it rather than call listStatus() and break with a 
> FileNotFoundException. I will post a patch for the fix shortly.
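
For illustration, a minimal sketch of the guard described above (variable names 
assumed; this is not the committed fix):

{code:java}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.protocol.EncryptionZone;

// Skip encryption zones whose trash root does not exist yet, instead of
// letting listStatus() fail with FileNotFoundException.
for (EncryptionZone ez : encryptionZones) {
  Path trashRoot = new Path(ez.getPath(), FileSystem.TRASH_PREFIX);
  if (!dfs.exists(trashRoot)) {
    continue; // no .Trash created in this zone yet
  }
  // ... collect trash roots under trashRoot as before ...
}
{code}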






[jira] [Commented] (HDFS-10324) Trash directory in an encryption zone should be pre-created with sticky bit

2016-05-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269807#comment-15269807
 ] 

Hadoop QA commented on HDFS-10324:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 8s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 10s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 51s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 15s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 51s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 49s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 7m 31s {color} 
| {color:red} root-jdk1.8.0_91 with JDK v1.8.0_91 generated 57 new + 724 
unchanged - 0 fixed = 781 total (was 724) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 35s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 14m 6s {color} 
| {color:red} root-jdk1.7.0_95 with JDK v1.7.0_95 generated 57 new + 719 
unchanged - 0 fixed = 776 total (was 719) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 35s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 4s 
{color} | {color:red} root: patch generated 5 new + 15 unchanged - 3 fixed = 20 
total (was 18) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 7s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 23s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 57s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 11s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m 13s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {colo

[jira] [Updated] (HDFS-9890) libhdfs++: Add test suite to simulate network issues

2016-05-03 Thread Xiaowei Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Zhu updated HDFS-9890:
--
Attachment: HDFS-9890.HDFS-8707.005.patch

> libhdfs++: Add test suite to simulate network issues
> 
>
> Key: HDFS-9890
> URL: https://issues.apache.org/jira/browse/HDFS-9890
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Xiaowei Zhu
> Attachments: HDFS-9890.HDFS-8707.000.patch, 
> HDFS-9890.HDFS-8707.001.patch, HDFS-9890.HDFS-8707.002.patch, 
> HDFS-9890.HDFS-8707.003.patch, HDFS-9890.HDFS-8707.004.patch, 
> HDFS-9890.HDFS-8707.005.patch
>
>
> I propose adding a test suite to simulate various network issues/failures in 
> order to get good test coverage on some of the retry paths that aren't easy 
> to hit in mock unit tests.
> At the moment the only things that hit the retry paths are the gmock unit 
> tests.  The gmock tests are only as good as their mock implementations, which 
> do a great job of simulating protocol correctness but not more complex 
> interactions.  They also can't really simulate the types of lock contention 
> and subtle memory stomps that show up while doing hundreds or thousands of 
> concurrent reads.   We should add a new minidfscluster test that focuses on 
> heavy read/seek load and then randomly converts the return codes of network 
> functions into errors.
> List of things to simulate (while heavily loaded), roughly in order of how 
> badly I think they need to be tested at the moment:
> -Rpc connection disconnect
> -Rpc connection slowed down enough to cause a timeout and trigger retry
> -DN connection disconnect






[jira] [Commented] (HDFS-9924) [umbrella] Asynchronous HDFS Access

2016-05-03 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269766#comment-15269766
 ] 

Andrew Wang commented on HDFS-9924:
---

Hi all, a few notes:

* Can we do this work on a branch? This is a big addition to the HDFS API, so I 
think needs some broader buy-in from the community and validation before 
merging. Since performance is a stated goal, a performance evaluation seems 
like a merge requirement.
* Could someone post a design doc with the motivations, proposed API, and 
discussion? It'd help to go over the pros/cons of the different API options. 
ListenableFuture for instance has also been brought up. Reviewing some other 
async RPC interfaces for comparison would also be helpful. This design doc is 
also the place to discuss Colin's question about performance compared to a 
thread pool. If that option is available to us, it's preferable since it does 
not involve expanding the API.

Thanks!

> [umbrella] Asynchronous HDFS Access
> ---
>
> Key: HDFS-9924
> URL: https://issues.apache.org/jira/browse/HDFS-9924
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: fs
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Xiaobing Zhou
>
> This is an umbrella JIRA for supporting Asynchronous HDFS Access.
> Currently, all the API methods are blocking calls -- the caller is blocked 
> until the method returns.  It is very slow if a client makes a large number 
> of independent calls in a single thread since each call has to wait until the 
> previous call is finished.  It is inefficient if a client needs to create a 
> large number of threads to invoke the calls.
> We propose adding a new API to support asynchronous calls, i.e. the caller is 
> not blocked.  The methods in the new API immediately return a Java Future 
> object.  The return value can be obtained by the usual Future.get() method.
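
For illustration, a minimal usage sketch of the proposed style of API (class and 
method names are assumptions, not a committed interface):

{code:java}
import java.util.concurrent.Future;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

// asyncDfs: a hypothetical async wrapper around DistributedFileSystem.
// The call returns immediately with a Future; the caller blocks only when
// it actually needs the result, so many independent calls can be in flight
// from a single thread.
Future<Void> f = asyncDfs.setPermission(new Path("/data"),
    new FsPermission((short) 0755));
// ... issue more independent calls or do other work here ...
f.get(); // waits for completion and surfaces any exception
{code}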






[jira] [Commented] (HDFS-10346) Implement asynchronous setPermission for DistributedFileSystem

2016-05-03 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269735#comment-15269735
 ] 

Xiaobing Zhou commented on HDFS-10346:
--

The patch v000 is posted for review. I will add more tests in the next patch.

> Implement asynchronous setPermission for DistributedFileSystem
> --
>
> Key: HDFS-10346
> URL: https://issues.apache.org/jira/browse/HDFS-10346
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, hdfs-client
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10346-HDFS-9924.000.patch
>
>
> This is proposed to implement an asynchronous setPermission.






[jira] [Updated] (HDFS-10346) Implement asynchronous setPermission for DistributedFileSystem

2016-05-03 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10346:
-
Attachment: HDFS-10346-HDFS-9924.000.patch

> Implement asynchronous setPermission for DistributedFileSystem
> --
>
> Key: HDFS-10346
> URL: https://issues.apache.org/jira/browse/HDFS-10346
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, hdfs-client
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10346-HDFS-9924.000.patch
>
>
> This is proposed to implement an asynchronous setPermission.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10346) Implement asynchronous setPermission for DistributedFileSystem

2016-05-03 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10346:
-
Status: Patch Available  (was: Open)

> Implement asynchronous setPermission for DistributedFileSystem
> --
>
> Key: HDFS-10346
> URL: https://issues.apache.org/jira/browse/HDFS-10346
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, hdfs-client
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10346-HDFS-9924.000.patch
>
>
> This is proposed to implement an asynchronous setPermission.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10360) DataNode may format directory and lose blocks if current/VERSION is missing

2016-05-03 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269669#comment-15269669
 ] 

Wei-Chiu Chuang commented on HDFS-10360:


In addition to the fix, it should also report the error via JMX.

> DataNode may format directory and lose blocks if current/VERSION is missing
> ---
>
> Key: HDFS-10360
> URL: https://issues.apache.org/jira/browse/HDFS-10360
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-10360.001.patch
>
>
> Under certain circumstances, if the current/VERSION of a storage directory is 
> missing, DataNode may format the storage directory even though _block files 
> are not missing_.
> This is very easy to reproduce. Simply launch an HDFS cluster and create some 
> files. Delete current/VERSION, and restart the data node.
> After the restart, the data node will format the directory and remove all 
> existing block files:
> {noformat}
> 2016-05-03 12:57:15,387 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Lock on /data/dfs/dn/in_use.lock acquired by nodename 
> 5...@weichiu-dn-2.vpc.cloudera.com
> 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Storage directory /data/dfs/dn is not formatted for 
> BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting ...
> 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Analyzing storage directories for bpid BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Locking is disabled for 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Block pool storage directory 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642 is not formatted 
> for BP-787466439-172
> .26.24.43-1462305406642
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting ...
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting block pool BP-787466439-172.26.24.43-1462305406642 directory 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642/current
> {noformat}
> The bug is: DataNode assumes that if none of {{current/VERSION}}, 
> {{previous/}}, {{previous.tmp/}}, {{removed.tmp/}}, {{finalized.tmp/}} and 
> {{lastcheckpoint.tmp/}} exists, the storage directory contains nothing 
> important to HDFS and decides to format it. 
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/Storage.java#L526-L545
> However, block files may still exist, and in my opinion, we should do 
> everything possible to retain the block files.
> I have two suggestions:
> # check if the {{current/}} directory is empty. If not, throw an 
> InconsistentFSStateException in {{Storage#analyzeStorage}} instead of 
> assuming it is not formatted (a sketch follows below). Or,
> # In {{Storage#clearDirectory}}, before it formats the storage directory, 
> rename or move {{current/}} directory. Also, log whatever is being 
> renamed/moved.
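
A minimal sketch of suggestion #1, assuming the check is done where 
{{Storage#analyzeStorage}} currently decides the directory is unformatted; the 
names and placement are illustrative and not taken from the attached patch:

{noformat}
// Illustrative only -- not HDFS-10360.001.patch. Refuse to format when
// current/VERSION is gone but current/ still has content (block files).
File currentDir = new File(rootDir, "current");  // rootDir: the storage root
String[] contents = currentDir.list();
if (contents != null && contents.length > 0) {
  throw new InconsistentFSStateException(rootDir,
      "current/VERSION is missing but " + currentDir + " is not empty; "
      + "refusing to format to avoid deleting existing block files");
}
return StorageState.NOT_FORMATTED;  // genuinely empty: safe to format later
{noformat}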



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10360) DataNode may format directory and lose blocks if current/VERSION is missing

2016-05-03 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10360:
---
Description: 
Under certain circumstances, if the current/VERSION of a storage directory is 
missing, DataNode may format the storage directory even though _block files are 
not missing_.

This is very easy to reproduce. Simply launch an HDFS cluster and create some 
files. Delete current/VERSION, and restart the data node.

After the restart, the data node will format the directory and remove all 
existing block files:

{noformat}
2016-05-03 12:57:15,387 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock 
on /data/dfs/dn/in_use.lock acquired by nodename 
5...@weichiu-dn-2.vpc.cloudera.com
2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Storage directory /data/dfs/dn is not formatted for 
BP-787466439-172.26.24.43-1462305406642
2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Formatting ...
2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Analyzing storage directories for bpid BP-787466439-172.26.24.43-1462305406642
2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Locking is disabled for 
/data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642
2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Block pool storage directory 
/data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642 is not formatted 
for BP-787466439-172
.26.24.43-1462305406642
2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Formatting ...
2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Formatting block pool BP-787466439-172.26.24.43-1462305406642 directory 
/data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642/current
{noformat}

The bug is: DataNode assumes that if none of {{current/VERSION}}, 
{{previous/}}, {{previous.tmp/}}, {{removed.tmp/}}, {{finalized.tmp/}} and 
{{lastcheckpoint.tmp/}} exists, the storage directory contains nothing 
important to HDFS and decides to format it. 
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/Storage.java#L526-L545
However, block files may still exist, and in my opinion, we should do 
everything possible to retain the block files.

I have two suggestions:
# check if the {{current/}} directory is empty. If not, throw an 
InconsistentFSStateException in {{Storage#analyzeStorage}} instead of 
assuming it is not formatted. Or,
# In {{Storage#clearDirectory}}, before it formats the storage directory, 
rename or move {{current/}} directory. Also, log whatever is being 
renamed/moved.

  was:
Under certain circumstances, if the current/VERSION of a storage directory is 
missing, DataNode may format the storage directory even though _block files are 
not missing_.

This is very easy to reproduce. Simply launch an HDFS cluster and create some 
files. Delete current/VERSION, and restart the data node.

After the restart, the data node will format the directory and remove all 
existing block files:

{noformat}
2016-05-03 12:57:15,387 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock 
on /data/dfs/dn/in_use.lock acquired by nodename 
5...@weichiu-dn-2.vpc.cloudera.com
2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Storage directory /data/dfs/dn is not formatted for 
BP-787466439-172.26.24.43-1462305406642
2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Formatting ...
2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Analyzing storage directories for bpid BP-787466439-172.26.24.43-1462305406642
2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Locking is disabled for 
/data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642
2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Block pool storage directory 
/data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642 is not formatted 
for BP-787466439-172
.26.24.43-1462305406642
2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Formatting ...
2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Formatting block pool BP-787466439-172.26.24.43-1462305406642 directory 
/data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642/current
{noformat}

The bug is: DataNode assumes that if none of {{current/VERSION}}, 
{{previous/}}, {{previous.tmp/}}, {{removed.tmp/}}, {{finalized.tmp/}} and 
{{lastcheckpoint.tmp/}} exists, the storage directory contains nothing 
important to HDFS and decides to format it. However, block files may still 
exist, and in my opinion, we should do everything possible to retain the block 
files.

I have two suggestions:
# check if {{current/}} directory is empty. If not, throw an 
In

[jira] [Updated] (HDFS-10360) DataNode may format directory and lose blocks if current/VERSION is missing

2016-05-03 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10360:
---
Summary: DataNode may format directory and lose blocks if current/VERSION 
is missing  (was: DataNode may format directory and lose blocks if If 
current/VERSION is missing)

> DataNode may format directory and lose blocks if current/VERSION is missing
> ---
>
> Key: HDFS-10360
> URL: https://issues.apache.org/jira/browse/HDFS-10360
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-10360.001.patch
>
>
> Under certain circumstances, if the current/VERSION of a storage directory is 
> missing, DataNode may format the storage directory even though _block files 
> are not missing_.
> This is very easy to reproduce. Simply launch an HDFS cluster and create some 
> files. Delete current/VERSION, and restart the data node.
> After the restart, the data node will format the directory and remove all 
> existing block files:
> {noformat}
> 2016-05-03 12:57:15,387 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Lock on /data/dfs/dn/in_use.lock acquired by nodename 
> 5...@weichiu-dn-2.vpc.cloudera.com
> 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Storage directory /data/dfs/dn is not formatted for 
> BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting ...
> 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Analyzing storage directories for bpid BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Locking is disabled for 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Block pool storage directory 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642 is not formatted 
> for BP-787466439-172
> .26.24.43-1462305406642
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting ...
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting block pool BP-787466439-172.26.24.43-1462305406642 directory 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642/current
> {noformat}
> The bug is: DataNode assumes that if none of {{current/VERSION}}, 
> {{previous/}}, {{previous.tmp/}}, {{removed.tmp/}}, {{finalized.tmp/}} and 
> {{lastcheckpoint.tmp/}} exists, the storage directory contains nothing 
> important to HDFS and decides to format it. However, block files may still 
> exist, and in my opinion, we should do everything possible to retain the 
> block files.
> I have two suggestions:
> # check if the {{current/}} directory is empty. If not, throw an 
> InconsistentFSStateException in {{Storage#analyzeStorage}} instead of 
> assuming it is not formatted. Or,
> # In {{Storage#clearDirectory}}, before it formats the storage directory, 
> rename or move {{current/}} directory. Also, log whatever is being 
> renamed/moved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10360) DataNode may format directory and lose blocks if If current/VERSION is missing

2016-05-03 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10360:
---
Status: Patch Available  (was: Open)

> DataNode may format directory and lose blocks if If current/VERSION is missing
> --
>
> Key: HDFS-10360
> URL: https://issues.apache.org/jira/browse/HDFS-10360
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-10360.001.patch
>
>
> Under certain circumstances, if the current/VERSION of a storage directory is 
> missing, DataNode may format the storage directory even though _block files 
> are not missing_.
> This is very easy to reproduce. Simply launch an HDFS cluster and create some 
> files. Delete current/VERSION, and restart the data node.
> After the restart, the data node will format the directory and remove all 
> existing block files:
> {noformat}
> 2016-05-03 12:57:15,387 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Lock on /data/dfs/dn/in_use.lock acquired by nodename 
> 5...@weichiu-dn-2.vpc.cloudera.com
> 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Storage directory /data/dfs/dn is not formatted for 
> BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting ...
> 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Analyzing storage directories for bpid BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Locking is disabled for 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Block pool storage directory 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642 is not formatted 
> for BP-787466439-172
> .26.24.43-1462305406642
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting ...
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting block pool BP-787466439-172.26.24.43-1462305406642 directory 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642/current
> {noformat}
> The bug is: DataNode assumes that if none of {{current/VERSION}}, 
> {{previous/}}, {{previous.tmp/}}, {{removed.tmp/}}, {{finalized.tmp/}} and 
> {{lastcheckpoint.tmp/}} exists, the storage directory contains nothing 
> important to HDFS and decides to format it. However, block files may still 
> exist, and in my opinion, we should do everything possible to retain the 
> block files.
> I have two suggestions:
> # check if the {{current/}} directory is empty. If not, throw an 
> InconsistentFSStateException in {{Storage#analyzeStorage}} instead of 
> assuming it is not formatted. Or,
> # In {{Storage#clearDirectory}}, before it formats the storage directory, 
> rename or move {{current/}} directory. Also, log whatever is being 
> renamed/moved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10360) DataNode may format directory and lose blocks if If current/VERSION is missing

2016-05-03 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10360:
---
Attachment: HDFS-10360.001.patch

Upload a proof of concept for the fix proposed in #1.

> DataNode may format directory and lose blocks if If current/VERSION is missing
> --
>
> Key: HDFS-10360
> URL: https://issues.apache.org/jira/browse/HDFS-10360
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-10360.001.patch
>
>
> Under certain circumstances, if the current/VERSION of a storage directory is 
> missing, DataNode may format the storage directory even though _block files 
> are not missing_.
> This is very easy to reproduce. Simply launch an HDFS cluster and create some 
> files. Delete current/VERSION, and restart the data node.
> After the restart, the data node will format the directory and remove all 
> existing block files:
> {noformat}
> 2016-05-03 12:57:15,387 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Lock on /data/dfs/dn/in_use.lock acquired by nodename 
> 5...@weichiu-dn-2.vpc.cloudera.com
> 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Storage directory /data/dfs/dn is not formatted for 
> BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting ...
> 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Analyzing storage directories for bpid BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Locking is disabled for 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Block pool storage directory 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642 is not formatted 
> for BP-787466439-172
> .26.24.43-1462305406642
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting ...
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting block pool BP-787466439-172.26.24.43-1462305406642 directory 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642/current
> {noformat}
> The bug is: DataNode assumes that if none of {{current/VERSION}}, 
> {{previous/}}, {{previous.tmp/}}, {{removed.tmp/}}, {{finalized.tmp/}} and 
> {{lastcheckpoint.tmp/}} exists, the storage directory contains nothing 
> important to HDFS and decides to format it. However, block files may still 
> exist, and in my opinion, we should do everything possible to retain the 
> block files.
> I have two suggestions:
> # check if the {{current/}} directory is empty. If not, throw an 
> InconsistentFSStateException in {{Storage#analyzeStorage}} instead of 
> assuming it is not formatted. Or,
> # In {{Storage#clearDirectory}}, before it formats the storage directory, 
> rename or move {{current/}} directory. Also, log whatever is being 
> renamed/moved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-10360) DataNode may format directory and lose blocks if If current/VERSION is missing

2016-05-03 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HDFS-10360:
--

 Summary: DataNode may format directory and lose blocks if If 
current/VERSION is missing
 Key: HDFS-10360
 URL: https://issues.apache.org/jira/browse/HDFS-10360
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Reporter: Wei-Chiu Chuang
Assignee: Wei-Chiu Chuang


Under certain circumstances, if the current/VERSION of a storage directory is 
missing, DataNode may format the storage directory even though _block files are 
not missing_.

This is very easy to reproduce. Simply launch an HDFS cluster and create some 
files. Delete current/VERSION, and restart the data node.

After the restart, the data node will format the directory and remove all 
existing block files:

{noformat}
2016-05-03 12:57:15,387 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock 
on /data/dfs/dn/in_use.lock acquired by nodename 
5...@weichiu-dn-2.vpc.cloudera.com
2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Storage directory /data/dfs/dn is not formatted for 
BP-787466439-172.26.24.43-1462305406642
2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Formatting ...
2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Analyzing storage directories for bpid BP-787466439-172.26.24.43-1462305406642
2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Locking is disabled for 
/data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642
2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Block pool storage directory 
/data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642 is not formatted 
for BP-787466439-172
.26.24.43-1462305406642
2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Formatting ...
2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Formatting block pool BP-787466439-172.26.24.43-1462305406642 directory 
/data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642/current
{noformat}

The bug is: DataNode assumes that if none of {{current/VERSION}}, 
{{previous/}}, {{previous.tmp/}}, {{removed.tmp/}}, {{finalized.tmp/}} and 
{{lastcheckpoint.tmp/}} exists, the storage directory contains nothing 
important to HDFS and decides to format it. However, block files may still 
exist, and in my opinion, we should do everything possible to retain the block 
files.

I have two suggestions:
# check if the {{current/}} directory is empty. If not, throw an 
InconsistentFSStateException in {{Storage#analyzeStorage}} instead of 
assuming it is not formatted. Or,
# In {{Storage#clearDirectory}}, before it formats the storage directory, 
rename or move {{current/}} directory. Also, log whatever is being 
renamed/moved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10320) Rack failures may result in NN terminate

2016-05-03 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269566#comment-15269566
 ] 

Ming Ma commented on HDFS-10320:


Thanks [~xiaochen]! The patch looks good overall. A couple of questions:

* Although it requires extra work, it might be useful to add a new unit test to 
verify the race condition you identified.
* It seems the following line should always return true, given that it has 
explicitly excluded those nodes earlier.
{noformat}
  if (excludedNodes.add(chosenNode)) { //was not in the excluded list
{noformat}
* Is the following line still necessary given it was added to excludedNodes 
earlier?
{noformat}
addToExcludedNodes(chosenNode, excludedNodes);
{noformat}
* The following change in webHDFS seems right, but I wonder whether that means 
it didn't take excludes into consideration before in some scenarios.
{noformat}
return (DatanodeDescriptor)bm.getDatanodeManager().getNetworkTopology(
).chooseRandom(NodeBase.ROOT, excludes);
{noformat} 
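
On the second bullet: {{java.util.Set#add}} returns {{true}} only if the element 
was not already present, and since {{chooseRandom}} picks from outside the 
excluded scope, the chosen node should never already be in {{excludedNodes}}. A 
minimal, self-contained illustration of that contract:

{noformat}
import java.util.HashSet;
import java.util.Set;

public class ExcludeDemo {
  public static void main(String[] args) {
    Set<String> excludedNodes = new HashSet<>();
    excludedNodes.add("dn1");                          // excluded up front
    // A node chosen from outside the excluded set is always "new",
    // so add() returns true for it:
    String chosenNode = "dn2";                         // never "dn1"
    System.out.println(excludedNodes.add(chosenNode)); // true
    // Whereas re-adding an already-excluded node returns false:
    System.out.println(excludedNodes.add("dn1"));      // false
  }
}
{noformat}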

> Rack failures may result in NN terminate
> 
>
> Key: HDFS-10320
> URL: https://issues.apache.org/jira/browse/HDFS-10320
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-10320.01.patch, HDFS-10320.02.patch, 
> HDFS-10320.03.patch
>
>
> If there are rack failures that end up leaving only 1 rack available, 
> {{BlockPlacementPolicyDefault#chooseRandom}} may get an 
> {{InvalidTopologyException}} when calling {{NetworkTopology#chooseRandom}}, 
> which then throws all the way out to {{BlockManager}}'s 
> {{ReplicationMonitor}} thread and terminates the NN.
> Log:
> {noformat}
> 2016-02-24 09:22:01,514  WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[], 
> storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
> creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For 
> more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-02-24 09:22:01,958  ERROR 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> ReplicationMonitor thread received Runtime exception. 
> org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Failed to 
> find datanode (scope="" excludedScope="/rack_a5").
>   at 
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:729)
>   at 
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:694)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:635)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:580)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:348)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:214)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:111)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.chooseTargets(BlockManager.java:3746)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.access$200(BlockManager.java:3711)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1400)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1306)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3682)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3634)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9890) libhdfs++: Add test suite to simulate network issues

2016-05-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269515#comment-15269515
 ] 

Hadoop QA commented on HDFS-9890:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 26m 25s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
45s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 48s 
{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 59s 
{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s 
{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 54s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 26m 30s {color} | 
{color:red} hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_91 with JDK 
v1.8.0_91 generated 41 new + 29 unchanged - 0 fixed = 70 total (was 29) {color} 
|
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 54s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 54s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 59s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 31m 29s {color} | 
{color:red} hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.7.0_95 with JDK 
v1.7.0_95 generated 41 new + 29 unchanged - 0 fixed = 70 total (was 29) {color} 
|
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 59s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 59s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 59s 
{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK 
v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 5m 55s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 59s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.7.0_95 Failed CTEST tests | 
test_libhdfs_mini_stress_hdfspp_test_shim_static |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0cf5e66 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12801998/HDFS-9890.HDFS-8707.004.patch
 |
| JIRA Issue | HDFS-9890 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux eae7556a1497 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-8707 / d187112 |
| Default Java | 1.7.0_95 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_91 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 |
| cc | hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_91: 
http

[jira] [Commented] (HDFS-10348) Namenode report bad block method doesn't check whether the block belongs to datanode before adding it to corrupt replicas map.

2016-05-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269397#comment-15269397
 ] 

Hadoop QA commented on HDFS-10348:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
8s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 6s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 2s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s 
{color} | {color:green} the patch passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 47s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s 
{color} | {color:green} the patch passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 6s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 93m 23s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_92. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 87m 5s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
22s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 209m 12s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_92 Failed junit tests | hadoop.hdfs.TestFileAppend |
|   | 
hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness
 |
|   | hadoop.hdfs.server.datanode.TestDataNodeMXBean |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.TestAsyncDFSRename |
| JDK v1.8.0_92 Timed out junit tests | 
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManager |
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.TestPersistBlocks |
|   | hadoop.hdfs.server.datanode.TestDataNodeLifeline |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.server.namenode.sna

[jira] [Commented] (HDFS-9902) Support different values of dfs.datanode.du.reserved per storage type

2016-05-03 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269394#comment-15269394
 ] 

Xiaoyu Yao commented on HDFS-9902:
--

Thanks [~brahmareddy] for updating the patch. +1 for patch v05. 

> Support different values of dfs.datanode.du.reserved per storage type
> -
>
> Key: HDFS-9902
> URL: https://issues.apache.org/jira/browse/HDFS-9902
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.2
>Reporter: Pan Yuxuan
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-9902-02.patch, HDFS-9902-03.patch, 
> HDFS-9902-04.patch, HDFS-9902-05.patch, HDFS-9902.patch
>
>
> Now Hadoop supports different storage types for DISK, SSD, ARCHIVE and 
> RAM_DISK, but they share one configuration, dfs.datanode.du.reserved.
> The DISK size may be several TB and the RAM_DISK size may be only several 
> tens of GB.
> The problem is that when I configure DISK and RAM_DISK (tmpfs) in the same 
> DN and set dfs.datanode.du.reserved to 10GB, this wastes a lot of 
> RAM_DISK space. 
> Since the usage of RAM_DISK can be 100%, I don't want the 
> dfs.datanode.du.reserved configured for DISK to impact the usage of tmpfs.
> So can we make a new configuration for RAM_DISK, or just skip this 
> configuration for RAM_DISK?
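
A sketch of the per-storage-type lookup this improvement implies, assuming the 
new keys take the form {{dfs.datanode.du.reserved.<storage-type>}} with the 
plain key as the fallback; the exact key naming in patch v05 may differ:

{noformat}
// Illustrative lookup only; the suffix convention is an assumption.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.StorageType;

public class ReservedSpaceDemo {
  static long getReserved(Configuration conf, StorageType type) {
    long dflt = conf.getLong("dfs.datanode.du.reserved", 0L);
    // e.g. dfs.datanode.du.reserved.ram_disk for RAM_DISK volumes
    return conf.getLong(
        "dfs.datanode.du.reserved." + type.toString().toLowerCase(), dflt);
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.setLong("dfs.datanode.du.reserved", 10L * 1024 * 1024 * 1024);
    conf.setLong("dfs.datanode.du.reserved.ram_disk", 0L); // don't reserve on tmpfs
    System.out.println(getReserved(conf, StorageType.RAM_DISK)); // 0
    System.out.println(getReserved(conf, StorageType.DISK));     // 10 GB
  }
}
{noformat}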



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10324) Trash directory in an encryption zone should be pre-created with sticky bit

2016-05-03 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10324:
---
Attachment: HDFS-10324.005.patch

Rev05: Thanks [~xyao] and [~andrew.wang].
I added an overloaded {{HdfsAdmin#createEncryptionZone}} API as well as a new 
{{HdfsAdmin#provisionEncryptionZoneTrash}} API. The crypto CLI now calls these 
two {{HdfsAdmin}} APIs instead of using {{DistributedFileSystem}} directly.
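
For context, a minimal sketch of what "pre-created with sticky bit" means at the 
{{FileSystem}} level, using a hypothetical encryption zone root {{/ez}}; this 
illustrates the mechanism only and is not the {{HdfsAdmin}} code in the patch:

{noformat}
// Illustration only: create an EZ-local trash dir that any user can write
// into but where only owners can move/delete entries (mode 01777, like /tmp).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class ProvisionTrashDemo {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path trash = new Path("/ez/.Trash");   // hypothetical EZ root "/ez"
    fs.mkdirs(trash);
    // setPermission applied explicitly so the umask can't strip bits.
    fs.setPermission(trash, new FsPermission((short) 01777));
  }
}
{noformat}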

> Trash directory in an encryption zone should be pre-created with sticky bit
> ---
>
> Key: HDFS-10324
> URL: https://issues.apache.org/jira/browse/HDFS-10324
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 2.8.0
> Environment: CDH5.7.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-10324.001.patch, HDFS-10324.002.patch, 
> HDFS-10324.003.patch, HDFS-10324.004.patch, HDFS-10324.005.patch
>
>
> We encountered a bug in HDFS-8831:
> After HDFS-8831, a deleted file in an encryption zone is moved to a .Trash 
> subdirectory within the encryption zone.
> However, if this .Trash subdirectory is not created beforehand, it will be 
> created and owned by the first user who deleted a file, with permission 
> drwx------. This is a serious bug: no other non-privileged user 
> will be able to delete any files within the encryption zone, because they 
> do not have the permission to move directories into the trash directory.
> We should fix this bug by pre-creating the .Trash directory with the sticky bit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10355) Fix thread_local related build issue on Mac OS X

2016-05-03 Thread Tibor Kiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tibor Kiss updated HDFS-10355:
--
Description: 
The native hdfs library uses C++11 features heavily.
One such feature is the thread_local storage class, which is supported by GCC, 
Visual Studio, and the community version of the clang compiler, but not by 
Apple's clang (which is the default on OS X boxes). 
See further details here: http://stackoverflow.com/a/29929949

Even though not many Hadoop clusters run on OS X, developers still use this 
platform for development.

The problem can be solved in multiple ways:
 a) Stick to gcc/g++ or community-based clang on OS X. Developers will need 
extra steps to build Hadoop.
 b) Work around thread_local with a helper class.
 c) Get rid of all the globals marked with thread_local. An interface change 
will be required.
 d) Disable multithreading support in the native client on OS X and document 
this limitation. 

Compile error related to thread_local:
{noformat}
 [exec] 
/Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/bindings/c/hdfs.cc:66:1:
 error: thread-local storage is not supported for the current target
 [exec] thread_local std::string errstr;
 [exec] ^
 [exec] 
/Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/bindings/c/hdfs.cc:87:1:
 error: thread-local storage is not supported for the current target
 [exec] thread_local std::experimental::optional 
fsEventCallback;
 [exec] ^
 [exec] 
/Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/bindings/c/hdfs.cc:88:1:
 error: thread-local storage is not supported for the current target
 [exec] thread_local std::experimental::optional 
fileEventCallback;
 [exec] ^
 [exec] 1 warning and 3 errors generated.
{noformat}

  was:
The native hdfs library uses C++11 features heavily.
One such feature is the thread_local storage class, which is supported by GCC, 
Visual Studio, and the community version of the clang compiler, but not by 
Apple's clang (which is the default on OS X boxes). 
See further details here: http://stackoverflow.com/a/29929949

Even though not many Hadoop clusters run on OS X, developers still use this 
platform for development.

The problem can be solved in multiple ways:
 a) Stick to gcc/g++ or community-based clang on OS X. Developers will need 
extra steps to build Hadoop.
 b) Work around thread_local with a helper class.
 c) Get rid of all the globals marked with thread_local. An interface change 
will be required.
 d) Disable multithreading support in the native client on OS X and document 
this limitation. 

Compile error related to thread_local:
{noformat}
 [exec] 
/Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/bindings/c/hdfs.cc:66:1:
 error: thread-local storage is not supported for the current target
 [exec] thread_local std::string errstr;
 [exec] ^
 [exec] 1 warning and 1 error generated.
 [exec] make[2]: *** 
[main/native/libhdfspp/lib/bindings/c/CMakeFiles/bindings_c_obj.dir/hdfs.cc.o] 
Error 1
 [exec] make[1]: *** 
[main/native/libhdfspp/lib/bindings/c/CMakeFiles/bindings_c_obj.dir/all] Error 2
 [exec] make: *** [all] Error 2
{noformat}


> Fix thread_local related build issue on Mac OS X
> 
>
> Key: HDFS-10355
> URL: https://issues.apache.org/jira/browse/HDFS-10355
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
> Environment: OS: Mac OS X 10.11
> clang: Apple LLVM version 7.0.2 (clang-700.1.81)
>Reporter: Tibor Kiss
>
> The native hdfs library uses C++11 features heavily.
> One such feature is the thread_local storage class, which is supported by GCC, 
> Visual Studio, and the community version of the clang compiler, but not by 
> Apple's clang (which is the default on OS X boxes). 
> See further details here: http://stackoverflow.com/a/29929949
> Even though not many Hadoop clusters run on OS X, developers still use this 
> platform for development.
> The problem can be solved in multiple ways:
>  a) Stick to gcc/g++ or community-based clang on OS X. Developers will need 
> extra steps to build Hadoop.
>  b) Work around thread_local with a helper class.
>  c) Get rid of all the globals marked with thread_local. An interface change 
> will be required.
>  d) Disable multithreading support in the native client on OS X and document 
> this limitation. 
> Compile error related to thread_local:
> {noformat}
>  [exec] 
> /Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/bindings/c/hdfs.cc:66:1:
>  error: thread-local storage i

[jira] [Updated] (HDFS-9890) libhdfs++: Add test suite to simulate network issues

2016-05-03 Thread Xiaowei Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Zhu updated HDFS-9890:
--
Attachment: HDFS-9890.HDFS-8707.004.patch

> libhdfs++: Add test suite to simulate network issues
> 
>
> Key: HDFS-9890
> URL: https://issues.apache.org/jira/browse/HDFS-9890
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Xiaowei Zhu
> Attachments: HDFS-9890.HDFS-8707.000.patch, 
> HDFS-9890.HDFS-8707.001.patch, HDFS-9890.HDFS-8707.002.patch, 
> HDFS-9890.HDFS-8707.003.patch, HDFS-9890.HDFS-8707.004.patch
>
>
> I propose adding a test suite to simulate various network issues/failures in 
> order to get good test coverage on some of the retry paths that aren't easy 
> to hit in mock unit tests.
> At the moment the only things that hit the retry paths are the gmock unit 
> tests.  The gmock tests are only as good as their mock implementations, which 
> do a great job of simulating protocol correctness but not more complex 
> interactions.  They also can't really simulate the types of lock contention 
> and subtle memory stomps that show up while doing hundreds or thousands of 
> concurrent reads.   We should add a new minidfscluster test that focuses on 
> heavy read/seek load and then randomly converts error codes returned by 
> network functions into errors (the idea is sketched below).
> List of things to simulate(while heavily loaded), roughly in order of how 
> badly I think they need to be tested at the moment:
> -Rpc connection disconnect
> -Rpc connection slowed down enough to cause a timeout and trigger retry
> -DN connection disconnect
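
As a rough, language-neutral illustration of the injection idea above (the real 
shim would live in libhdfs++'s C++ I/O layer, so everything here, including the 
error code, is a stand-in):

{noformat}
// Sketch of the fault-injection idea only; libhdfs++'s actual shim is C++.
import java.util.Random;
import java.util.function.IntSupplier;

public class FaultInjector {
  private static final Random RAND = new Random();

  // Wraps a "network call" returning 0 on success; with probability p,
  // replaces a success with a simulated I/O error code.
  static int inject(IntSupplier realCall, double p, int errorCode) {
    int rc = realCall.getAsInt();
    if (rc == 0 && RAND.nextDouble() < p) {
      return errorCode;  // force the caller down its retry path
    }
    return rc;
  }

  public static void main(String[] args) {
    for (int i = 0; i < 5; i++) {
      System.out.println(inject(() -> 0, 0.3, -104)); // -104 ~ ECONNRESET
    }
  }
}
{noformat}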



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10355) Fix thread_local related build issue on Mac OS X

2016-05-03 Thread Tibor Kiss (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269166#comment-15269166
 ] 

Tibor Kiss commented on HDFS-10355:
---

Yes, it does have pthread support.

> Fix thread_local related build issue on Mac OS X
> 
>
> Key: HDFS-10355
> URL: https://issues.apache.org/jira/browse/HDFS-10355
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
> Environment: OS: Mac OS X 10.11
> clang: Apple LLVM version 7.0.2 (clang-700.1.81)
>Reporter: Tibor Kiss
>
> The native hdfs library uses C++11 features heavily.
> One such feature is the thread_local storage class, which is supported by GCC, 
> Visual Studio, and the community version of the clang compiler, but not by 
> Apple's clang (which is the default on OS X boxes). 
> See further details here: http://stackoverflow.com/a/29929949
> Even though not many Hadoop clusters run on OS X, developers still use this 
> platform for development.
> The problem can be solved in multiple ways:
>  a) Stick to gcc/g++ or community-based clang on OS X. Developers will need 
> extra steps to build Hadoop.
>  b) Work around thread_local with a helper class.
>  c) Get rid of all the globals marked with thread_local. An interface change 
> will be required.
>  d) Disable multithreading support in the native client on OS X and document 
> this limitation. 
> Compile error related to thread_local:
> {noformat}
>  [exec] 
> /Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/bindings/c/hdfs.cc:66:1:
>  error: thread-local storage is not supported for the current target
>  [exec] thread_local std::string errstr;
>  [exec] ^
>  [exec] 1 warning and 1 error generated.
>  [exec] make[2]: *** 
> [main/native/libhdfspp/lib/bindings/c/CMakeFiles/bindings_c_obj.dir/hdfs.cc.o]
>  Error 1
>  [exec] make[1]: *** 
> [main/native/libhdfspp/lib/bindings/c/CMakeFiles/bindings_c_obj.dir/all] 
> Error 2
>  [exec] make: *** [all] Error 2
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10354) Fix compilation & unit test issues on Mac OS X with clang compiler

2016-05-03 Thread Tibor Kiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tibor Kiss updated HDFS-10354:
--
Description: 
Compilation fails with multiple errors on Mac OS X.
Unit test test_test_libhdfs_zerocopy_hdfs_static also fails to execute on OS X.

Compile error 1:
{noformat}
 [exec] Scanning dependencies of target common_obj
 [exec] [ 45%] Building CXX object 
main/native/libhdfspp/lib/common/CMakeFiles/common_obj.dir/base64.cc.o
 [exec] [ 45%] Building CXX object 
main/native/libhdfspp/lib/common/CMakeFiles/common_obj.dir/status.cc.o
 [exec] [ 46%] Building CXX object 
main/native/libhdfspp/lib/common/CMakeFiles/common_obj.dir/sasl_digest_md5.cc.o
 [exec] [ 46%] Building CXX object 
main/native/libhdfspp/lib/common/CMakeFiles/common_obj.dir/hdfs_public_api.cc.o
 [exec] [ 47%] Building CXX object 
main/native/libhdfspp/lib/common/CMakeFiles/common_obj.dir/options.cc.o
 [exec] [ 48%] Building CXX object 
main/native/libhdfspp/lib/common/CMakeFiles/common_obj.dir/configuration.cc.o
 [exec] 
/Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/common/configuration.cc:85:12:
 error: no viable conversion from 'optional' to 'optional'
 [exec] return result;
 [exec]^~
 [exec] 
/Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/third_party/tr2/optional.hpp:427:13:
 note: candidate constructor not viable: no known conversion from 
'std::experimental::optional' to 'std::experimental::nullopt_t' for 1st 
argument
 [exec]   constexpr optional(nullopt_t) noexcept : OptionalBase() {};
 [exec] ^
 [exec] 
/Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/third_party/tr2/optional.hpp:429:3:
 note: candidate constructor not viable: no known conversion from 
'std::experimental::optional' to 'const std::experimental::optional &' for 1st argument
 [exec]   optional(const optional& rhs)
 [exec]   ^
 [exec] 
/Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/third_party/tr2/optional.hpp:438:3:
 note: candidate constructor not viable: no known conversion from 
'std::experimental::optional' to 'std::experimental::optional 
&&' for 1st argument
 [exec]   optional(optional&& rhs) 
noexcept(is_nothrow_move_constructible::value)
 [exec]   ^
 [exec] 
/Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/third_party/tr2/optional.hpp:447:13:
 note: candidate constructor not viable: no known conversion from 
'std::experimental::optional' to 'const long long &' for 1st argument
 [exec]   constexpr optional(const T& v) : OptionalBase(v) {}
 [exec] ^
 [exec] 
/Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/third_party/tr2/optional.hpp:449:13:
 note: candidate constructor not viable: no known conversion from 
'std::experimental::optional' to 'long long &&' for 1st argument
 [exec]   constexpr optional(T&& v) : OptionalBase(constexpr_move(v)) {}
 [exec] ^
 [exec] 1 error generated.
 [exec] make[2]: *** 
[main/native/libhdfspp/lib/common/CMakeFiles/common_obj.dir/configuration.cc.o] 
Error 1
 [exec] make[1]: *** 
[main/native/libhdfspp/lib/common/CMakeFiles/common_obj.dir/all] Error 2
 [exec] make: *** [all] Error 2
{noformat}

Compile error 2:
{noformat}
 [exec] 
/Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/fs/filesystem.cc:285:66:
 error: use of overloaded operator '<<' is ambiguous (with operand types 
'hdfs::LogMessage' and 'size_type' (aka 'unsigned long'))
 [exec]   << " Existing thread count = " << 
worker_threads_.size());
 [exec]   
~~~^~
{noformat}

There is an additional compile failure in the native client related to thread_local.
The complexity of the error mandates tracking that issue in a [separate 
ticket|https://issues.apache.org/jira/browse/HDFS-10355].
{noformat}
 [exec] 
/Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/bindings/c/hdfs.cc:66:1:
 error: thread-local storage is not supported for the current target
 [exec] thread_local std::string errstr;
 [exec] ^
 [exec] 1 warning and 1 error generated.
 [exec] make[2]: *** 
[main/native/libhdfspp/lib/bindings/c/CMakeFiles/bindings_c_obj.dir/hdfs.cc.o] 
Error 1
 [exec] make[1]: *** 
[main/native/libhdfspp/lib/bindings/c/CMakeFiles/bindings_c_obj.dir/all] Error 2
 [

[jira] [Commented] (HDFS-9732) Remove DelegationTokenIdentifier.toString() —for better logging output

2016-05-03 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269156#comment-15269156
 ] 

Yongjun Zhang commented on HDFS-9732:
-

Hi [~liuml07],

Thanks for your question. Please see some discussion here
http://stackoverflow.com/questions/4648607/stringbuilder-stringbuffer-vs-operator

For things that the compiler will automatically translate from {{+}} to 
{{StringBuilder}}, we probably don't have to change anything.  However, the 
change doesn't really hurt, and because of the formatting done with the patch, 
readability is better than in the original code rather than being an issue.

Some of the changes made in the patch are in loops that construct a string, 
where {{StringBuilder}} is preferred.
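
A small self-contained example of the loop case, where the compiler cannot 
hoist the implicit {{StringBuilder}} out of the loop, so each iteration 
re-copies the accumulated string:

{noformat}
public class ConcatDemo {
  public static void main(String[] args) {
    String[] parts = {"a", "b", "c"};

    // String + in a loop: each iteration allocates a new StringBuilder
    // and copies the accumulated string -- O(n^2) in total.
    String slow = "";
    for (String p : parts) {
      slow += p;
    }

    // Explicit StringBuilder: one buffer, amortized O(n).
    StringBuilder sb = new StringBuilder();
    for (String p : parts) {
      sb.append(p);
    }
    String fast = sb.toString();

    System.out.println(slow.equals(fast)); // true
  }
}
{noformat}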

Thanks.


> Remove DelegationTokenIdentifier.toString() —for better logging output
> --
>
> Key: HDFS-9732
> URL: https://issues.apache.org/jira/browse/HDFS-9732
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.2
>Reporter: Steve Loughran
>Assignee: Yongjun Zhang
> Attachments: HADOOP-12752-001.patch, HDFS-9732-000.patch, 
> HDFS-9732.001.patch, HDFS-9732.002.patch, HDFS-9732.003.patch, 
> HDFS-9732.004.patch
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> HDFS {{DelegationTokenIdentifier.toString()}} adds some diagnostics info: 
> owner, sequence number. But its superclass, 
> {{AbstractDelegationTokenIdentifier}}, contains a lot more information, 
> including token issue and expiry times.
> Because  {{DelegationTokenIdentifier.toString()}} doesn't include this data,
> information that is potentially useful for kerberos diagnostics is lost.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10355) Fix thread_local related build issue on Mac OS X

2016-05-03 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269134#comment-15269134
 ] 

James Clampffer commented on HDFS-10355:


Does the builtin version of clang on OSX have pthread support?

> Fix thread_local related build issue on Mac OS X
> 
>
> Key: HDFS-10355
> URL: https://issues.apache.org/jira/browse/HDFS-10355
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
> Environment: OS: Mac OS X 10.11
> clang: Apple LLVM version 7.0.2 (clang-700.1.81)
>Reporter: Tibor Kiss
>
> The native hdfs library uses C++11 features heavily.
> One such feature is the thread_local storage class, which is supported by GCC, 
> Visual Studio and the community version of the clang compiler, but not by 
> Apple's clang (which is the default on OS X boxes). 
> See further details here: http://stackoverflow.com/a/29929949
> Even though not many Hadoop clusters run on OS X, developers still use this 
> platform for development.
> The problem can be solved in multiple ways:
>  a) Stick to gcc/g++ or community-based clang on OS X. Developers will need 
> extra steps to build Hadoop.
>  b) Work around thread_local with a helper class.
>  c) Get rid of all the globals marked with thread_local. An interface change 
> will be required.
>  d) Disable multi-threading support in the native client on OS X and document 
> this limitation. 
> Compile error related to thread_local:
> {noformat}
>  [exec] 
> /Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/bindings/c/hdfs.cc:66:1:
>  error: thread-local storage is not supported for the current target
>  [exec] thread_local std::string errstr;
>  [exec] ^
>  [exec] 1 warning and 1 error generated.
>  [exec] make[2]: *** 
> [main/native/libhdfspp/lib/bindings/c/CMakeFiles/bindings_c_obj.dir/hdfs.cc.o]
>  Error 1
>  [exec] make[1]: *** 
> [main/native/libhdfspp/lib/bindings/c/CMakeFiles/bindings_c_obj.dir/all] 
> Error 2
>  [exec] make: *** [all] Error 2
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10359) Allow trigger block report from all datanodes

2016-05-03 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269106#comment-15269106
 ] 

Kihwal Lee commented on HDFS-10359:
---

Although this will be an excellent stress test tool, it will probably be best 
to not make it easy for users to unleash the beast.
If an admin somehow wants to do it, one can write a simple script to get the 
list of nodes and issue the existing dfsadmin command to all nodes.
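
A hedged sketch of such a script (assumes the default DataNode IPC port 50020 and the {{Hostname:}} lines printed by {{hdfs dfsadmin -report}}; adjust both for your deployment):
{code}
#!/usr/bin/env bash
# Trigger a full block report on every DataNode, one node at a time,
# sleeping between calls so the NameNode is not hit all at once.
for host in $(hdfs dfsadmin -report | awk '/^Hostname:/ {print $2}'); do
  hdfs dfsadmin -triggerBlockReport "${host}:50020"
  sleep 5   # crude throttling between nodes
done
{code}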

> Allow trigger block report from all datanodes
> -
>
> Key: HDFS-10359
> URL: https://issues.apache.org/jira/browse/HDFS-10359
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0, 2.6.1
>Reporter: Tao Jie
>
> Since HDFS-7278 allows triggering a block report from one certain datanode, 
> it would be helpful to add an option to this command to trigger block 
> reports from all datanodes.
> The command may look like this:
> *hdfs dfsadmin -triggerBlockReport \[-incremental\] 
> *



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10359) Allow trigger block report from all datanodes

2016-05-03 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269055#comment-15269055
 ] 

Chris Nauroth commented on HDFS-10359:
--

I also am not in favor of adding such a command, for the same reasons stated by 
others.  There has been a lot of work lately around reducing load generated by 
block reports and staggering them to prevent a thundering herd problem.  A mass 
trigger for all DataNodes would circumvent those protections.  If someone 
really, really had a need to do this for some reason, then I suppose they could 
script a sequence of single-node calls.  That would at least introduce some 
amount of throttling in the process.

> Allow trigger block report from all datanodes
> -
>
> Key: HDFS-10359
> URL: https://issues.apache.org/jira/browse/HDFS-10359
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0, 2.6.1
>Reporter: Tao Jie
>
> Since HDFS-7278 allows triggering a block report from one certain datanode, 
> it would be helpful to add an option to this command to trigger block 
> reports from all datanodes.
> The command may look like this:
> *hdfs dfsadmin -triggerBlockReport \[-incremental\] 
> *



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9890) libhdfs++: Add test suite to simulate network issues

2016-05-03 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268988#comment-15268988
 ] 

James Clampffer commented on HDFS-9890:
---

Patch looks good.  I have some comments/open questions:

Can we push "random() % RANDOM_ERROR_RATIO" into the callbacks everywhere so 
that
{code}
if (status.ok() && random() % RANDOM_ERROR_RATIO == 0 && event_resp.response() 
== event_response::kTest_Error)
{code}
turns into
{code}
if (status.ok() && event_resp.response() == event_response::kTest_Error)
{code}
That way the callback can trigger failures deterministically if needed.  It 
looks like you already took care of this in filehandle.cc:
{code}
event_response event_resp = event_handlers->call(FILE_DN_CONNECT_EVENT, 
cluster_name.c_str(), path.c_str(), 0);
 #ifndef NDEBUG
 if (event_resp.response() == event_response::kTest_Error) {
   status = event_resp.status();
{code}

BlockReader shouldn't need to hold a shared_ptr to the event_handlers_ object.  
As far as object lifetimes go, FileSystem/FileHandle should always outlive block 
readers (it's a bug if they don't).  Fewer shared_ptrs means fewer bus 
locks/cache invalidations from atomic integer operations; those can add 
up on NUMA systems.  I know I've added some where they aren't needed 
(BlockReader::cancel_state_) that I've been trying to remove whenever I'm 
working on those bits.

Another question I have, though not really related to this patch, is whether 
event_response needs to be a separate class from Status.  At least in this 
usage they get effectively the same thing done, but maybe there are other uses 
I'm not thinking about.
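
To illustrate the callback-side injection being suggested, a minimal sketch (hypothetical types, not the real libhdfs++ event-handler API):
{code}
#include <cstdlib>

// Hypothetical stand-ins for the real event types.
enum class ResponseKind { kNoAction, kTest_Error };
struct EventResponse { ResponseKind kind; };

constexpr int RANDOM_ERROR_RATIO = 100;

// The randomness lives inside the callback, so production code only
// inspects the response...
EventResponse RandomFailureCallback() {
  if (std::rand() % RANDOM_ERROR_RATIO == 0) {
    return EventResponse{ResponseKind::kTest_Error};
  }
  return EventResponse{ResponseKind::kNoAction};
}

// ...and a test that needs a deterministic failure just swaps in an
// always-failing callback instead of waiting on the dice roll.
EventResponse AlwaysFailCallback() {
  return EventResponse{ResponseKind::kTest_Error};
}
{code}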


> libhdfs++: Add test suite to simulate network issues
> 
>
> Key: HDFS-9890
> URL: https://issues.apache.org/jira/browse/HDFS-9890
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Xiaowei Zhu
> Attachments: HDFS-9890.HDFS-8707.000.patch, 
> HDFS-9890.HDFS-8707.001.patch, HDFS-9890.HDFS-8707.002.patch, 
> HDFS-9890.HDFS-8707.003.patch
>
>
> I propose adding a test suite to simulate various network issues/failures in 
> order to get good test coverage on some of the retry paths that aren't easy 
> to hit in mock unit tests.
> At the moment the only things that hit the retry paths are the gmock unit 
> tests.  The gmock are only as good as their mock implementations which do a 
> great job of simulating protocol correctness but not more complex 
> interactions.  They also can't really simulate the types of lock contention 
> and subtle memory stomps that show up while doing hundreds or thousands of 
> concurrent reads.   We should add a new minidfscluster test that focuses on 
> heavy read/seek load and then randomly convert error codes returned by 
> network functions into errors.
> List of things to simulate(while heavily loaded), roughly in order of how 
> badly I think they need to be tested at the moment:
> -Rpc connection disconnect
> -Rpc connection slowed down enough to cause a timeout and trigger retry
> -DN connection disconnect



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10328) Add per-cache-pool default replication num configuration

2016-05-03 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268954#comment-15268954
 ] 

Colin Patrick McCabe commented on HDFS-10328:
-

Thanks for the patch, [~xupener].

{code}
diff --git 
a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientNamenodeProtocol.proto
 
b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientNamenodeProtocol.proto
index 7acb394..73db055 100644
--- 
a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientNamenodeProtocol.proto
+++ 
b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientNamenodeProtocol.proto
@@ -533,7 +533,8 @@ message CachePoolInfoProto {
   optional string groupName = 3;
   optional int32 mode = 4;
   optional int64 limit = 5;
-  optional int64 maxRelativeExpiry = 6;
+  optional uint32 defaultReplication = 6;
+  optional int64 maxRelativeExpiry = 7;
 }
{code}
Please be careful not to remove or change fields that already exist.  In this 
case, you have moved maxRelativeExpiry from field 6 to field 7, which is an 
incompatible change.  Instead, you should simply add your new field to the end.

I suggest using something like this:
{code}
+  optional uint32 defaultReplication = 6 [default=1];
{code}

That avoids having to programmatically add a default of 1 in so many places.
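
Putting both points together, the compatible shape of the message would be something like this (a sketch; the earlier fields are assumed unchanged from the existing proto):
{code}
message CachePoolInfoProto {
  optional string poolName = 1;
  optional string ownerName = 2;
  optional string groupName = 3;
  optional int32 mode = 4;
  optional int64 limit = 5;
  optional int64 maxRelativeExpiry = 6;                 // keeps its original tag
  optional uint32 defaultReplication = 7 [default=1];   // new field added at the end
}
{code}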

> Add per-cache-pool default replication num configuration
> 
>
> Key: HDFS-10328
> URL: https://issues.apache.org/jira/browse/HDFS-10328
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: caching
>Reporter: xupeng
>Assignee: xupeng
>Priority: Minor
> Fix For: 2.3.0
>
> Attachments: HDFS-10328.001.patch, HDFS-10328.002.patch
>
>
> For now, hdfs cacheadmin cannot set a default replication num for cache 
> directives in the same cachepool. Each cache directive added to the same cache 
> pool has to set its own replication num individually. 
> Consider this situation: we add daily Hive tables into the cache pool "hive". 
> Each time I have to set the same replication num for every table directive in 
> the same cache pool.  
> I think we should enable setting a default replication num for a cachepool so 
> that every cache directive in the pool can inherit the replication 
> configuration from the pool. A cache directive can still override the 
> replication configuration explicitly by calling the "add & modify directive 
> -replication" command from the cli.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10348) Namenode report bad block method doesn't check whether the block belongs to datanode before adding it to corrupt replicas map.

2016-05-03 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-10348:
--
Status: Patch Available  (was: Open)

Attaching a new patch addressing whitespace and findbugs warnings.
TestBlockManager passes on my local box on both JDK versions.
Failure of TestHFlush is being tracked under multiple jiras: HDFS-2043, 
HDFS-3041 and HDFS-4504.
TestDataNodeLifeline is failing on my local box with and without my patch, 
but I don't see it failing on any other jenkins build except this one.
I will wait for another jenkins build on my latest patch.
If it fails again then I will file a jira.

> Namenode report bad block method doesn't check whether the block belongs to 
> datanode before adding it to corrupt replicas map.
> --
>
> Key: HDFS-10348
> URL: https://issues.apache.org/jira/browse/HDFS-10348
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-10348-1.patch, HDFS-10348.patch
>
>
> Namenode (via the report bad block method) doesn't check whether the block 
> belongs to the datanode before adding it to the corrupt replicas map.
> In one of our clusters we found that there were 3 lingering corrupt blocks.
> It happened in the following order.
> 1. Two clients called getBlockLocations for a particular file.
> 2. Client C1 tried to open the file, encountered a checksum error from 
> node N3 and reported the bad block (blk1) to the namenode.
> 3. Namenode added node N3 and block blk1 to the corrupt replicas map and 
> asked one of the good nodes (one of the 2 nodes) to replicate the block to 
> another node N4.
> 4. After receiving the block, N4 sends an IBR (with RECEIVED_BLOCK) to the 
> namenode.
> 5. Namenode removed the block and node N3 from the corrupt replicas map.
>    It also removed N3's storage from the triplets and queued an invalidate 
> request for N3.
> 6. In the meantime, client C2 tried to open the file and the request went to 
> node N3.
>    C2 also encountered the checksum exception and reported the bad block to 
> the namenode.
> 7. Namenode added the corrupt block blk1 and node N3 to the corrupt replicas 
> map without confirming whether node N3 has the block or not.
> After deleting the block, N3 sends an IBR (with DELETED) and the namenode 
> simply ignores the report since N3's storage is no longer in the 
> triplets (from step 5).
> We took the node out of rotation, but the block was still present only in the 
> corruptReplicasMap. 
> That is because on removing a node, we only go through the blocks that are 
> present in the triplets for that datanode.
> [~kshukla]'s patch fixed this bug via 
> https://issues.apache.org/jira/browse/HDFS-9958.
> But I think the following check should be made in the 
> BlockManager#markBlockAsCorrupt instead of 
> BlockManager#findAndMarkBlockAsCorrupt.
> {noformat}
> if (storage == null) {
>   storage = storedBlock.findStorageInfo(node);
> }
> if (storage == null) {
>   blockLog.debug("BLOCK* findAndMarkBlockAsCorrupt: {} not found on {}",
>   blk, dn);
>   return;
> }
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9823) libhdfs++: Cancel outstanding operations without calling shutdown

2016-05-03 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-9823:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

This has been taken care of in HDFS-10311.

> libhdfs++: Cancel outstanding operations without calling shutdown
> -
>
> Key: HDFS-9823
> URL: https://issues.apache.org/jira/browse/HDFS-9823
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9823.HDFS-8707.000.patch
>
>
> Testing has shown that there is a race condition in calling shutdown then 
> close().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10359) Allow trigger block report from all datanodes

2016-05-03 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268894#comment-15268894
 ] 

Arpit Agarwal commented on HDFS-10359:
--

I agree with [~linyiqun]. This could bring down the NameNode or at least make 
it very busy for a while. Not in favor of adding it.

> Allow trigger block report from all datanodes
> -
>
> Key: HDFS-10359
> URL: https://issues.apache.org/jira/browse/HDFS-10359
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0, 2.6.1
>Reporter: Tao Jie
>
> Since HDFS-7278 allows triggering a block report from one certain datanode, 
> it would be helpful to add an option to this command to trigger block 
> reports from all datanodes.
> The command may look like this:
> *hdfs dfsadmin -triggerBlockReport \[-incremental\] 
> *



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10348) Namenode report bad block method doesn't check whether the block belongs to datanode before adding it to corrupt replicas map.

2016-05-03 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-10348:
--
Attachment: HDFS-10348-1.patch

> Namenode report bad block method doesn't check whether the block belongs to 
> datanode before adding it to corrupt replicas map.
> --
>
> Key: HDFS-10348
> URL: https://issues.apache.org/jira/browse/HDFS-10348
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-10348-1.patch, HDFS-10348.patch
>
>
> Namenode (via the report bad block method) doesn't check whether the block 
> belongs to the datanode before adding it to the corrupt replicas map.
> In one of our clusters we found that there were 3 lingering corrupt blocks.
> It happened in the following order.
> 1. Two clients called getBlockLocations for a particular file.
> 2. Client C1 tried to open the file, encountered a checksum error from 
> node N3 and reported the bad block (blk1) to the namenode.
> 3. Namenode added node N3 and block blk1 to the corrupt replicas map and 
> asked one of the good nodes (one of the 2 nodes) to replicate the block to 
> another node N4.
> 4. After receiving the block, N4 sends an IBR (with RECEIVED_BLOCK) to the 
> namenode.
> 5. Namenode removed the block and node N3 from the corrupt replicas map.
>    It also removed N3's storage from the triplets and queued an invalidate 
> request for N3.
> 6. In the meantime, client C2 tried to open the file and the request went to 
> node N3.
>    C2 also encountered the checksum exception and reported the bad block to 
> the namenode.
> 7. Namenode added the corrupt block blk1 and node N3 to the corrupt replicas 
> map without confirming whether node N3 has the block or not.
> After deleting the block, N3 sends an IBR (with DELETED) and the namenode 
> simply ignores the report since N3's storage is no longer in the 
> triplets (from step 5).
> We took the node out of rotation, but the block was still present only in the 
> corruptReplicasMap. 
> That is because on removing a node, we only go through the blocks that are 
> present in the triplets for that datanode.
> [~kshukla]'s patch fixed this bug via 
> https://issues.apache.org/jira/browse/HDFS-9958.
> But I think the following check should be made in the 
> BlockManager#markBlockAsCorrupt instead of 
> BlockManager#findAndMarkBlockAsCorrupt.
> {noformat}
> if (storage == null) {
>   storage = storedBlock.findStorageInfo(node);
> }
> if (storage == null) {
>   blockLog.debug("BLOCK* findAndMarkBlockAsCorrupt: {} not found on {}",
>   blk, dn);
>   return;
> }
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10348) Namenode report bad block method doesn't check whether the block belongs to datanode before adding it to corrupt replicas map.

2016-05-03 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-10348:
--
Status: Open  (was: Patch Available)

> Namenode report bad block method doesn't check whether the block belongs to 
> datanode before adding it to corrupt replicas map.
> --
>
> Key: HDFS-10348
> URL: https://issues.apache.org/jira/browse/HDFS-10348
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-10348.patch
>
>
> Namenode (via the report bad block method) doesn't check whether the block 
> belongs to the datanode before adding it to the corrupt replicas map.
> In one of our clusters we found that there were 3 lingering corrupt blocks.
> It happened in the following order.
> 1. Two clients called getBlockLocations for a particular file.
> 2. Client C1 tried to open the file, encountered a checksum error from 
> node N3 and reported the bad block (blk1) to the namenode.
> 3. Namenode added node N3 and block blk1 to the corrupt replicas map and 
> asked one of the good nodes (one of the 2 nodes) to replicate the block to 
> another node N4.
> 4. After receiving the block, N4 sends an IBR (with RECEIVED_BLOCK) to the 
> namenode.
> 5. Namenode removed the block and node N3 from the corrupt replicas map.
>    It also removed N3's storage from the triplets and queued an invalidate 
> request for N3.
> 6. In the meantime, client C2 tried to open the file and the request went to 
> node N3.
>    C2 also encountered the checksum exception and reported the bad block to 
> the namenode.
> 7. Namenode added the corrupt block blk1 and node N3 to the corrupt replicas 
> map without confirming whether node N3 has the block or not.
> After deleting the block, N3 sends an IBR (with DELETED) and the namenode 
> simply ignores the report since N3's storage is no longer in the 
> triplets (from step 5).
> We took the node out of rotation, but the block was still present only in the 
> corruptReplicasMap. 
> That is because on removing a node, we only go through the blocks that are 
> present in the triplets for that datanode.
> [~kshukla]'s patch fixed this bug via 
> https://issues.apache.org/jira/browse/HDFS-9958.
> But I think the following check should be made in the 
> BlockManager#markBlockAsCorrupt instead of 
> BlockManager#findAndMarkBlockAsCorrupt.
> {noformat}
> if (storage == null) {
>   storage = storedBlock.findStorageInfo(node);
> }
> if (storage == null) {
>   blockLog.debug("BLOCK* findAndMarkBlockAsCorrupt: {} not found on {}",
>   blk, dn);
>   return;
> }
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10354) Fix compilation & unit test issues on Mac OS X with clang compiler

2016-05-03 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268892#comment-15268892
 ] 

Bob Hansen commented on HDFS-10354:
---

Thanks for bringing that to our attention, [~tibor.k...@gmail.com].  We don't 
currently have CI infrastructure for OSX, so things like this may creep in from 
time to time; we do our best to keep them out.

Also thanks for submitting a patch for the more mechanical issues.  It's a big 
help.

The thread_local issue is a sticky one;  according to 
http://stackoverflow.com/questions/28094794/why-does-apple-clang-disallow-c11-thread-local-when-official-clang-supports,
 the homebrew clang supports thread_local as part of C++11, but the Apple clang 
does not.  For the API to be rational, it is important that getLastError return 
the last error from the calling thread, so we would need to find a dodge to 
support the same semantics for thread-locals for Apple clang.  Perhaps a map of 
threadID->lastError string?
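
A minimal sketch of that dodge (hypothetical names, assuming nothing about the surrounding code; note that entries are never erased when a thread exits, which is a real cost of this approach):
{code}
#include <map>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical substitute for "thread_local std::string errstr" on
// compilers without thread_local support: one map entry per thread id.
static std::mutex errstr_mutex;
static std::map<std::thread::id, std::string> errstr_map;

void setLastError(const std::string &msg) {
  std::lock_guard<std::mutex> lock(errstr_mutex);
  errstr_map[std::this_thread::get_id()] = msg;
}

std::string getLastError() {
  std::lock_guard<std::mutex> lock(errstr_mutex);
  auto it = errstr_map.find(std::this_thread::get_id());
  return it == errstr_map.end() ? std::string() : it->second;
}
{code}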

The LD_PRELOAD additions look good.

The cast in Configuration::GetInt() is probably a good call.

In hdfsTell(hdfs.cc:383), I think a more correct change would be changing the 
type of the local offset variable to be off_t to match the expected type of the 
Seek call.  If we have to do a cast of the resulting value in the return 
statement, that would be OK.  Casting a pointer to a possibly differently-sized 
type should make us all feel very uncomfortable.

For the serialization errors, might I recommend fixing up LogMessage to take 
size_type (iff it is not the same as uint32_t or uint64_t) rather than doing a 
cast at the call site?
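
Something along these lines, as a sketch (a trimmed-down stand-in, not the real LogMessage class):
{code}
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <type_traits>

class LogMessage {
 public:
  LogMessage &operator<<(uint32_t v) { stream_ << v; return *this; }
  LogMessage &operator<<(uint64_t v) { stream_ << v; return *this; }

  // Only instantiated when size_t is a distinct type from uint32_t and
  // uint64_t (as on OS X, where size_t is unsigned long but uint64_t is
  // unsigned long long); elsewhere the overloads above match exactly,
  // so no ambiguity and no cast at the call site.
  template <typename T,
            typename std::enable_if<
                std::is_same<T, size_t>::value &&
                !std::is_same<T, uint32_t>::value &&
                !std::is_same<T, uint64_t>::value, int>::type = 0>
  LogMessage &operator<<(T v) { stream_ << v; return *this; }

 private:
  std::ostringstream stream_;
};
{code}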





> Fix compilation & unit test issues on Mac OS X with clang compiler
> --
>
> Key: HDFS-10354
> URL: https://issues.apache.org/jira/browse/HDFS-10354
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
> Environment: OS X: 10.11
> clang: Apple LLVM version 7.0.2 (clang-700.1.81)
>Reporter: Tibor Kiss
>Assignee: Tibor Kiss
> Attachments: HDFS-10354.HDFS-8707.001.patch, 
> HDFS-10354.HDFS-8707.002.patch
>
>
> Compilation fails with multiple errors on Mac OS X.
> Unit test test_test_libhdfs_zerocopy_hdfs_static also fails to execute on OS 
> X.
> Compile error 1:
> {noformat}
>  [exec] Scanning dependencies of target common_obj
>  [exec] [ 45%] Building CXX object 
> main/native/libhdfspp/lib/common/CMakeFiles/common_obj.dir/base64.cc.o
>  [exec] [ 45%] Building CXX object 
> main/native/libhdfspp/lib/common/CMakeFiles/common_obj.dir/status.cc.o
>  [exec] [ 46%] Building CXX object 
> main/native/libhdfspp/lib/common/CMakeFiles/common_obj.dir/sasl_digest_md5.cc.o
>  [exec] [ 46%] Building CXX object 
> main/native/libhdfspp/lib/common/CMakeFiles/common_obj.dir/hdfs_public_api.cc.o
>  [exec] [ 47%] Building CXX object 
> main/native/libhdfspp/lib/common/CMakeFiles/common_obj.dir/options.cc.o
>  [exec] [ 48%] Building CXX object 
> main/native/libhdfspp/lib/common/CMakeFiles/common_obj.dir/configuration.cc.o
>  [exec] 
> /Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/common/configuration.cc:85:12:
>  error: no viable conversion from 'optional<long>' to 'optional<long long>'
>  [exec] return result;
>  [exec]^~
>  [exec] 
> /Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/third_party/tr2/optional.hpp:427:13:
>  note: candidate constructor not viable: no known conversion from 
> 'std::experimental::optional<long>' to 'std::experimental::nullopt_t' for 1st 
> argument
>  [exec]   constexpr optional(nullopt_t) noexcept : OptionalBase<T>() {};
>  [exec] ^
>  [exec] 
> /Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/third_party/tr2/optional.hpp:429:3:
>  note: candidate constructor not viable: no known conversion from 
> 'std::experimental::optional<long>' to 'const 
> std::experimental::optional<long long> &' for 1st argument
>  [exec]   optional(const optional& rhs)
>  [exec]   ^
>  [exec] 
> /Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/third_party/tr2/optional.hpp:438:3:
>  note: candidate constructor not viable: no known conversion from 
> 'std::experimental::optional<long>' to 'std::experimental::optional<long 
> long> &&' for 1st argument
>  [exec]   optional(optional&& rhs) 
> noexcept(is_nothrow_move_constructible<T>::value)
>  [exec]   ^
>  [exec] 
> /Users/tiborkiss/workspace/apache-hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/third_party/tr2/optional.hpp:447:13:
>  note: candidate constructor 

[jira] [Commented] (HDFS-8449) Add tasks count metrics to datanode for ECWorker

2016-05-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268841#comment-15268841
 ] 

Hadoop QA commented on HDFS-8449:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
6s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 14s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 37s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 18s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s 
{color} | {color:green} the patch passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 15s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 26s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 2 new + 
61 unchanged - 0 fixed = 63 total (was 61) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s 
{color} | {color:green} the patch passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 35s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 107m 0s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_92. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 98m 12s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
30s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 242m 44s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_92 Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock |
|   | hadoop.hdfs.TestPersistBlocks |
|   | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.blockmanagement.TestReplicationPolicy |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.TestAsyncDFSRename |
| JDK 

[jira] [Commented] (HDFS-10328) Add per-cache-pool default replication num configuration

2016-05-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268740#comment-15268740
 ] 

Hadoop QA commented on HDFS-10328:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 32s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
36s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 44s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 29s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 35s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
58s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 10s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 33s 
{color} | {color:green} the patch passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 42s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
34s {color} | {color:green} hadoop-hdfs-project: patch generated 0 new + 157 
unchanged - 2 fixed = 157 total (was 159) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 46s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 13s 
{color} | {color:green} the patch passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 1s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_92. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 97m 22s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_92. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 9s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_95. {color} |
| {col

[jira] [Commented] (HDFS-7330) Unclosed RandomAccessFile warnings in FSDatasetIml.

2016-05-03 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268735#comment-15268735
 ] 

Andras Bokor commented on HDFS-7330:


[~shv] Where do you see the warning?

> Unclosed RandomAccessFile warnings in FSDatasetIml.
> ---
>
> Key: HDFS-7330
> URL: https://issues.apache.org/jira/browse/HDFS-7330
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.1
>Reporter: Konstantin Shvachko
>Assignee: Andras Bokor
>  Labels: newbie
>
> RandomAccessFile is opened as an underlying file for FileInputStream. It 
> should be closed when the stream is closed. So to fix these 2 warnings (in 
> getBlockInputStream() and getTmpInputStreams()) we just need to suppress them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10359) Allow trigger block report from all datanodes

2016-05-03 Thread Lin Yiqun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268541#comment-15268541
 ] 

Lin Yiqun commented on HDFS-10359:
--

Hi, [~Tao Jie], I don't think this is a good idea. If I trigger all the 
datanodes in my cluster, it will flood my namenode with block reports. Maybe 
the property {{dfs.blockreport.initialDelay}} could be used for that, but I 
still think keeping the original command unchanged would be better.

> Allow trigger block report from all datanodes
> -
>
> Key: HDFS-10359
> URL: https://issues.apache.org/jira/browse/HDFS-10359
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0, 2.6.1
>Reporter: Tao Jie
>
> Since HDFS-7278 allows triggering a block report from one certain datanode, 
> it would be helpful to add an option to this command to trigger block 
> reports from all datanodes.
> The command may look like this:
> *hdfs dfsadmin -triggerBlockReport \[-incremental\] 
> *



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8449) Add tasks count metrics to datanode for ECWorker

2016-05-03 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-8449:

Attachment: HDFS-8449-008.patch

> Add tasks count metrics to datanode for ECWorker
> 
>
> Key: HDFS-8449
> URL: https://issues.apache.org/jira/browse/HDFS-8449
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8449-000.patch, HDFS-8449-001.patch, 
> HDFS-8449-002.patch, HDFS-8449-003.patch, HDFS-8449-004.patch, 
> HDFS-8449-005.patch, HDFS-8449-006.patch, HDFS-8449-007.patch, 
> HDFS-8449-008.patch
>
>
> This sub-task tries to record the EC recovery tasks that a datanode has done, 
> including total tasks, failed tasks and successful tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-10359) Allow trigger block report from all datanodes

2016-05-03 Thread Tao Jie (JIRA)
Tao Jie created HDFS-10359:
--

 Summary: Allow trigger block report from all datanodes
 Key: HDFS-10359
 URL: https://issues.apache.org/jira/browse/HDFS-10359
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.6.1, 2.7.0
Reporter: Tao Jie


Since HDFS-7278 allows triggering a block report from one certain datanode, 
it would be helpful to add an option to this command to trigger block reports 
from all datanodes.
The command may look like this:
*hdfs dfsadmin -triggerBlockReport \[-incremental\] 
*



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-05-03 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268438#comment-15268438
 ] 

Kai Zheng commented on HDFS-9833:
-

Thanks Rakesh for your understanding. It may help to explain this again in my 
own words.

You're right that on the client side we need to try each datanode in the group 
and let it do the block group checksum computation. That includes datanodes 
holding both data blocks and parity blocks, because parity block datanodes can 
also do the same work. When a datanode in the group is requested to do the 
computation, it will request/collect all the checksums for the blocks in the 
group and compute the block-group-level checksum to respond to the client 
call. When all the blocks are fine, the existing block checksums are simply 
requested remotely/locally and used; but in case some data block is erased, a 
similar reconstruction task will be executed on the requested datanode to 
recompute the block checksum on the fly. If that fails, it will return a 
failure to the client instead of the normal block group checksum. When the 
client receives a failure, it means the requested datanode isn't able to do 
the work, so it will retry with the next datanode in the group.

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
>
> As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum 
> even some of striped blocks are missed, we need to consider recomputing block 
> checksum on the fly for the missed/corrupt blocks. To recompute the block 
> checksum, the block data needs to be reconstructed by erasure decoding, and 
> the main needed codes for the block reconstruction could be borrowed from 
> HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC 
> worker, reconstructed blocks need to be written out to target datanodes, but 
> here in this case, the remote writing isn't necessary, as the reconstructed 
> block data is only used to recompute the checksum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10328) Add per-cache-pool default replication num configuration

2016-05-03 Thread xupeng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268433#comment-15268433
 ] 

xupeng commented on HDFS-10328:
---

Attached a new patch: modified the cli command arg name and changed the field 
name of some classes. 

> Add per-cache-pool default replication num configuration
> 
>
> Key: HDFS-10328
> URL: https://issues.apache.org/jira/browse/HDFS-10328
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: caching
>Reporter: xupeng
>Assignee: xupeng
>Priority: Minor
> Fix For: 2.3.0
>
> Attachments: HDFS-10328.001.patch, HDFS-10328.002.patch
>
>
> For now, hdfs cacheadmin cannot set a default replication num for cache 
> directives in the same cachepool. Each cache directive added to the same cache 
> pool has to set its own replication num individually. 
> Consider this situation: we add daily Hive tables into the cache pool "hive". 
> Each time I have to set the same replication num for every table directive in 
> the same cache pool.  
> I think we should enable setting a default replication num for a cachepool so 
> that every cache directive in the pool can inherit the replication 
> configuration from the pool. A cache directive can still override the 
> replication configuration explicitly by calling the "add & modify directive 
> -replication" command from the cli.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10328) Add per-cache-pool default replication num configuration

2016-05-03 Thread xupeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xupeng updated HDFS-10328:
--
Attachment: HDFS-10328.002.patch

> Add per-cache-pool default replication num configuration
> 
>
> Key: HDFS-10328
> URL: https://issues.apache.org/jira/browse/HDFS-10328
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: caching
>Reporter: xupeng
>Assignee: xupeng
>Priority: Minor
> Fix For: 2.3.0
>
> Attachments: HDFS-10328.001.patch, HDFS-10328.002.patch
>
>
> For now, hdfs cacheadmin cannot set a default replication num for cache 
> directives in the same cachepool. Each cache directive added to the same cache 
> pool has to set its own replication num individually. 
> Consider this situation: we add daily Hive tables into the cache pool "hive". 
> Each time I have to set the same replication num for every table directive in 
> the same cache pool.  
> I think we should enable setting a default replication num for a cachepool so 
> that every cache directive in the pool can inherit the replication 
> configuration from the pool. A cache directive can still override the 
> replication configuration explicitly by calling the "add & modify directive 
> -replication" command from the cli.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10287) MiniDFSCluster should implement AutoCloseable

2016-05-03 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268378#comment-15268378
 ] 

John Zhuge commented on HDFS-10287:
---

[~boky01], do you think {{MiniDFSCluster}} should extend {{AbstractService}} or 
even {{CompositeService}}?
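
For reference, the kind of cleanup the description below has in mind would read like this (a sketch, assuming {{close()}} simply delegates to {{shutdown()}}):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class TestWithAutoCloseableCluster {
  // Hypothetical test once MiniDFSCluster implements AutoCloseable:
  // shutdown() runs via close() even if the body throws.
  public void testSomething() throws Exception {
    try (MiniDFSCluster cluster =
             new MiniDFSCluster.Builder(new Configuration()).build()) {
      cluster.waitActive();
      // ... exercise the cluster ...
    }
  }
}
{code}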

> MiniDFSCluster should implement AutoCloseable
> -
>
> Key: HDFS-10287
> URL: https://issues.apache.org/jira/browse/HDFS-10287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
> Attachments: HDFS-10287.01.patch
>
>
> {{MiniDFSCluster}} should implement {{AutoCloseable}} in order to support 
> [try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html].
>  It will make test code a little cleaner and more reliable.
> Since {{AutoCloseable}} is only in Java 1.7 or later, this cannot be 
> backported to Hadoop versions prior to 2.7.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-8449) Add tasks count metrics to datanode for ECWorker

2016-05-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268300#comment-15268300
 ] 

Hadoop QA commented on HDFS-8449:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 5s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 19s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 2 new + 
61 unchanged - 0 fixed = 63 total (was 61) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 42s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 20s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
29s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 188m 40s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | hadoop.hdfs.TestRenameWhileOpen |
|   | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby |
|   | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | 
https://issues.apache.org/jira/se

[jira] [Commented] (HDFS-10328) Add per-cache-pool default replication num configuration

2016-05-03 Thread xupeng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268254#comment-15268254
 ] 

xupeng commented on HDFS-10328:
---

Hi [~cmccabe] :

Thanks for your reply, and sorry for my late response.

I agree with your opinion; I have updated the name and description of the JIRA 
and will upload a new patch soon.  

> Add per-cache-pool default replication num configuration
> 
>
> Key: HDFS-10328
> URL: https://issues.apache.org/jira/browse/HDFS-10328
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: caching
>Reporter: xupeng
>Assignee: xupeng
>Priority: Minor
> Fix For: 2.3.0
>
> Attachments: HDFS-10328.001.patch
>
>
> For now, hdfs cacheadmin cannot set a default replication num for cache 
> directives in the same cachepool. Each cache directive added to the same cache 
> pool has to set its own replication num individually. 
> Consider this situation: we add daily Hive tables into the cache pool "hive". 
> Each time I have to set the same replication num for every table directive in 
> the same cache pool.  
> I think we should enable setting a default replication num for a cachepool so 
> that every cache directive in the pool can inherit the replication 
> configuration from the pool. A cache directive can still override the 
> replication configuration explicitly by calling the "add & modify directive 
> -replication" command from the cli.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-05-03 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268251#comment-15268251
 ] 

Rakesh R commented on HDFS-9833:


Following is a brief outline of the proposed approach. Kindly go through it; 
it would be great to get feedback on it. Thanks!

In the existing striped checksum logic, the client connects to the first 
datanode in the block locations and sends the {{Op.BLOCK_GROUP_CHECKSUM}} 
command. That datanode then iterates over {{ecPolicy.getNumDataUnits()}} 
datanodes and invokes the {{Op.BLOCK_CHECKSUM}} command on each one. During 
these operations it can hit an {{IOException}} and fail the checksum call.

To begin with, I think we will catch the generic {{IOException}} while 
performing the operation on a datanode. The block corresponding to the failed 
datanode will be chosen for reconstruction, and the checksum will then be 
recomputed from the reconstructed block data.
# Datanode side changes: if there is an IOException while performing the 
{{Op.BLOCK_CHECKSUM}} command, the datanode will reconstruct this block and 
calculate its checksum from the reconstructed data. Reconstruction errors 
will still fail the checksum call.
# Client side changes: presently the 
{{FileChecksumHelper#checksumBlockGroup()}} function throws the IOException 
back to the client if the first datanode has errors; instead, the client will 
try connecting to up to {{#getNumParityUnits()}} datanodes before failing the 
checksum operation (see the sketch below).
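
A minimal sketch of the client-side retry, assuming hypothetical helpers 
{{chooseNextDatanode()}} and {{tryDatanode()}} (the latter standing in for 
the code that sends {{Op.BLOCK_GROUP_CHECKSUM}} to a single datanode). This 
is illustrative only, not the actual {{FileChecksumHelper}} code:

{code:java}
// Illustrative only: retry the block-group checksum against successive
// datanodes before giving up. Helper names here are hypothetical.
private MD5MD5CRC32FileChecksum checksumWithRetry(
    LocatedStripedBlock blockGroup, int numParityUnits) throws IOException {
  IOException lastException =
      new IOException("could not get checksum from any datanode");
  // One attempt per parity unit: a failed datanode's block can still be
  // reconstructed from the surviving data and parity blocks.
  for (int attempt = 0; attempt < numParityUnits; attempt++) {
    DatanodeInfo dn = chooseNextDatanode(blockGroup, attempt);
    try {
      return tryDatanode(dn, blockGroup); // sends Op.BLOCK_GROUP_CHECKSUM
    } catch (IOException e) {
      lastException = e; // remember the failure, try the next datanode
    }
  }
  throw lastException; // exhausted numParityUnits datanodes
}
{code}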

Thanks [~umamaheswararao] for the offline discussions.

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
>
> As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum 
> even some of striped blocks are missed, we need to consider recomputing block 
> checksum on the fly for the missed/corrupt blocks. To recompute the block 
> checksum, the block data needs to be reconstructed by erasure decoding, and 
> the main needed codes for the block reconstruction could be borrowed from 
> HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC 
> worker, reconstructed blocks need to be written out to target datanodes, but 
> here in this case, the remote writing isn't necessary, as the reconstructed 
> block data is only used to recompute the checksum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10328) Add per-cache-pool default replication num configuration

2016-05-03 Thread xupeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xupeng updated HDFS-10328:
--
Release Note: Add per-cache-pool default replication num configuration  
(was: add cache pool level replication configuration support)

> Add per-cache-pool default replication num configuration
> 
>
> Key: HDFS-10328
> URL: https://issues.apache.org/jira/browse/HDFS-10328
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: caching
>Reporter: xupeng
>Assignee: xupeng
>Priority: Minor
> Fix For: 2.3.0
>
> Attachments: HDFS-10328.001.patch
>
>
> For now, hdfs cacheadmin cannot set a default replication num for cache 
> directives in the same cache pool. Each cache directive added to the same 
> cache pool must set its own replication num individually. 
> Consider this situation: we add daily Hive tables into the cache pool 
> "hive". Each time, I have to set the same replication num for every table 
> directive in the same cache pool.  
> I think we should enable setting a default replication num for a cache pool 
> so that every cache directive in the pool can inherit its replication 
> configuration from the pool. A cache directive can still override the 
> replication configuration explicitly via the "add & modify directive 
> -replication" commands from the CLI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10328) Add per-cache-pool default replication num configuration

2016-05-03 Thread xupeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xupeng updated HDFS-10328:
--
Summary: Add per-cache-pool default replication num configuration  (was: 
Add per-cache-pool default replication configuration)

> Add per-cache-pool default replication num configuration
> 
>
> Key: HDFS-10328
> URL: https://issues.apache.org/jira/browse/HDFS-10328
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: caching
>Reporter: xupeng
>Assignee: xupeng
>Priority: Minor
> Fix For: 2.3.0
>
> Attachments: HDFS-10328.001.patch
>
>
> For now, hdfs cacheadmin cannot set a default replication num for cache 
> directives in the same cache pool. Each cache directive added to the same 
> cache pool must set its own replication num individually. 
> Consider this situation: we add daily Hive tables into the cache pool 
> "hive". Each time, I have to set the same replication num for every table 
> directive in the same cache pool.  
> I think we should enable setting a default replication num for a cache pool 
> so that every cache directive in the pool can inherit its replication 
> configuration from the pool. A cache directive can still override the 
> replication configuration explicitly via the "add & modify directive 
> -replication" commands from the CLI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10328) Add per-cache-pool default replication configuration

2016-05-03 Thread xupeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xupeng updated HDFS-10328:
--
Description: 
For now, hdfs cacheadmin cannot set a default replication num for cache 
directives in the same cache pool. Each cache directive added to the same 
cache pool must set its own replication num individually. 

Consider this situation: we add daily Hive tables into the cache pool "hive". 
Each time, I have to set the same replication num for every table directive 
in the same cache pool.  

I think we should enable setting a default replication num for a cache pool 
so that every cache directive in the pool can inherit its replication 
configuration from the pool. A cache directive can still override the 
replication configuration explicitly via the "add & modify directive 
-replication" commands from the CLI.




  was:
For now, hdfs cacheadmin cannot set a replication num for a cache pool. Each 
cache directive added to the cache pool must set its own replication num 
individually. 

Consider this situation: we add daily Hive tables into the cache pool "hive". 
Each time, I have to set the same replication num for every table directive 
in the same cache pool.  

I think we should enable setting a replication num for a cache pool so that 
every cache directive in the pool can inherit its replication configuration 
from the pool. A cache directive can still override the replication 
configuration explicitly via the "add & modify directive -replication" 
command from the CLI.





> Add per-cache-pool default replication configuration
> 
>
> Key: HDFS-10328
> URL: https://issues.apache.org/jira/browse/HDFS-10328
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: caching
>Reporter: xupeng
>Assignee: xupeng
>Priority: Minor
> Fix For: 2.3.0
>
> Attachments: HDFS-10328.001.patch
>
>
> For now, hdfs cacheadmin cannot set a default replication num for cache 
> directives in the same cache pool. Each cache directive added to the same 
> cache pool must set its own replication num individually. 
> Consider this situation: we add daily Hive tables into the cache pool 
> "hive". Each time, I have to set the same replication num for every table 
> directive in the same cache pool.  
> I think we should enable setting a default replication num for a cache pool 
> so that every cache directive in the pool can inherit its replication 
> configuration from the pool. A cache directive can still override the 
> replication configuration explicitly via the "add & modify directive 
> -replication" commands from the CLI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10328) Add per-cache-pool default replication configuration

2016-05-03 Thread xupeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xupeng updated HDFS-10328:
--
Summary: Add per-cache-pool default replication configuration  (was: Add 
cache pool level replication management)

> Add per-cache-pool default replication configuration
> 
>
> Key: HDFS-10328
> URL: https://issues.apache.org/jira/browse/HDFS-10328
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: caching
>Reporter: xupeng
>Assignee: xupeng
>Priority: Minor
> Fix For: 2.3.0
>
> Attachments: HDFS-10328.001.patch
>
>
> For now, hdfs cacheadmin cannot set a replication num for a cache pool. 
> Each cache directive added to the cache pool must set its own replication 
> num individually. 
> Consider this situation: we add daily Hive tables into the cache pool 
> "hive". Each time, I have to set the same replication num for every table 
> directive in the same cache pool.  
> I think we should enable setting a replication num for a cache pool so that 
> every cache directive in the pool can inherit its replication configuration 
> from the pool. A cache directive can still override the replication 
> configuration explicitly via the "add & modify directive -replication" 
> command from the CLI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org