[jira] [Commented] (HDFS-9149) Consider multi datacenter when sortByDistance

2015-10-02 Thread He Tianyi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941100#comment-14941100
 ] 

He Tianyi commented on HDFS-9149:
-

I think that's a good point [~hexiaoqiao].

One simple idea is to generalize {{getWeight}} into a function that calculates 
the distance between two locations (more like {{getDistance}}), regardless of 
the meaning of each hierarchy level.

The only thing is that I'm not aware of why {{getWeight}} was designed this 
way in the first place, i.e. whether there was some particular concern.
Does anyone know the idea behind this design choice?
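
For illustration, a minimal sketch of the {{getDistance}}-style generalization 
(not a patch; it assumes we can reuse {{NetworkTopology#getDistance}}, which 
already walks both nodes up to their closest common ancestor instead of 
hard-coding the meaning of each level):
{code:java}
// Sketch only: derive the sort weight directly from the topology distance,
// so every hierarchy level (rack, IDC, ...) is handled uniformly.
// getDistance returns 0 for the same node, 2 for the same rack, 4 for the
// next level up, and so on.
protected int getWeight(Node reader, Node node) {
  if (reader == null || node == null) {
    return Integer.MAX_VALUE; // no locality information available
  }
  return getDistance(reader, node);
}
{code}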

> Consider multi datacenter when sortByDistance
> -
>
> Key: HDFS-9149
> URL: https://issues.apache.org/jira/browse/HDFS-9149
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Tianyi
>
> {{sortByDistance}} doesn't consider multiple datacenters when reading data, 
> so reads may go through another datacenter when Hadoop is deployed across 
> multiple IDCs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-9149) Consider multi datacenter when sortByDistance

2015-10-02 Thread He Tianyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Tianyi reassigned HDFS-9149:
---

Assignee: He Tianyi

> Consider multi datacenter when sortByDistance
> -
>
> Key: HDFS-9149
> URL: https://issues.apache.org/jira/browse/HDFS-9149
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Tianyi
>
> {{sortByDistance}} doesn't consider multiple datacenters when reading data, 
> so reads may go through another datacenter when Hadoop is deployed across 
> multiple IDCs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9185) TestRecoverStripedFile is failing

2015-10-02 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940929#comment-14940929
 ] 

Rakesh R commented on HDFS-9185:


Note: It looks like the test case failures are not related to the patch. The 
[TestRecoverStripedFile|https://builds.apache.org/job/PreCommit-HDFS-Build/12769/testReport/org.apache.hadoop.hdfs/TestRecoverStripedFile/]
 case is consistently passing now.

> TestRecoverStripedFile is failing
> -
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Attachments: HDFS-9185-00.patch, HDFS-9185-01.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9180) Update excluded DataNodes in DFSStripedOutputStream based on failures in data streamers

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941035#comment-14941035
 ] 

Hadoop QA commented on HDFS-9180:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  21m  0s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   9m  1s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m 13s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 11s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 37s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   5m  5s | The patch appears to introduce 7 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 40s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 239m 52s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 33s | Tests passed in 
hadoop-hdfs-client. |
| | | 295m 12s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-client |
| Failed unit tests | hadoop.hdfs.TestRecoverStripedFile |
|   | hadoop.hdfs.TestRollingUpgrade |
|   | hadoop.hdfs.TestWriteReadStripedFile |
|   | hadoop.hdfs.TestDFSStripedOutputStream |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764700/HDFS-9180.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / fd026f5 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12771/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12771/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs-client.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12771/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12771/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12771/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12771/console |


This message was automatically generated.

> Update excluded DataNodes in DFSStripedOutputStream based on failures in data 
> streamers
> ---
>
> Key: HDFS-9180
> URL: https://issues.apache.org/jira/browse/HDFS-9180
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-9180.000.patch, HDFS-9180.001.patch
>
>
> This is a TODO in HDFS-9040: based on the failures all the striped data 
> streamers hit, the DFSStripedOutputStream should keep a record of all the 
> DataNodes that should be excluded.
> This jira will also fix several bugs in the DFSStripedOutputStream. Will 
> provide more details in the comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9189) javadoc jar contains full build, not just javadoc, making it really big

2015-10-02 Thread André Kelpe (JIRA)
André Kelpe created HDFS-9189:
-

 Summary: javadoc jar contains full build, not just javadoc, making 
it really big
 Key: HDFS-9189
 URL: https://issues.apache.org/jira/browse/HDFS-9189
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.7.1, 2.6.1
 Environment: For some reason the build of the javadoc jars includes 
the entire build output, including third-party jars, class files, and all 
sorts of other stuff, making the jars really big (128 MB).



Reporter: André Kelpe






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9149) Consider multi datacenter when sortByDistance

2015-10-02 Thread He Xiaoqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941243#comment-14941243
 ] 

He Xiaoqiao commented on HDFS-9149:
---

hi [~He Tianyi], thank you for your comments.
{quote}
The only thing is that I'm not aware of why getWeight was designed this way in 
the first place, i.e. whether there was some particular concern.
{quote}
Maybe there are no particular concerns. From the original implementation of 
{{pseudoSortByDistance}} to 
[HDFS-6268|https://issues.apache.org/jira/browse/HDFS-6268], which was the 
first time it was restructured into {{sortByDistance}}, there is no indication 
that the multi-IDC scenario was considered.
{quote}
One simple idea is to generalize getWeight into a function that calculates the 
distance between two locations (more like getDistance), regardless of the 
meaning of each hierarchy level.
{quote}
I think it could be simple and reasonable to add an if statement in 
{{getWeight}}:
{code:java}
   protected int getWeight(Node reader, Node node) {
-    // 0 is local, 1 is same rack, 2 is off rack
+    // 0 is local, 1 is same rack, 2 is same IDC, 3 is off IDC
     // Start off by initializing to off rack
-    int weight = 2;
+    int weight = 3;
     if (reader != null) {
       if (reader.equals(node)) {
         weight = 0;
       } else if (isOnSameRack(reader, node)) {
         weight = 1;
+      } else {
+        // Assume a leaf's grandparent (the rack's parent) is the IDC.
+        Node rParent = reader.getParent();
+        Node nParent = node.getParent();
+        if (rParent != null && nParent != null
+            && isSameParents(rParent, nParent)) {
+          weight = 2;
+        }
       }
     }
     return weight;
   }
{code}
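
With this change, {{sortByDistance}} would order replicas local -> same rack 
-> same IDC -> off IDC, under the assumption that the rack's parent in the 
topology represents the IDC level.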

> Consider multi datacenter when sortByDistance
> -
>
> Key: HDFS-9149
> URL: https://issues.apache.org/jira/browse/HDFS-9149
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Tianyi
>
> {{sortByDistance}} doesn't consider multiple datacenters when reading data, 
> so reads may go through another datacenter when Hadoop is deployed across 
> multiple IDCs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9149) Consider multi datacenter when sortByDistance

2015-10-02 Thread He Tianyi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941264#comment-14941264
 ] 

He Tianyi commented on HDFS-9149:
-

Thanks, [~hexiaoqiao].

The simpler idea sounds good! But I'm not quite sure adding one if statement 
can cover all cases.
We'd need to assume that the grandparent represents an IDC node if we go with 
it, which does not always hold (since {{NetworkTopology}} does not imply 
that). E.g., I have a real scenario where locations are configured like 
{{/DC/BUILDING/RACK/NODE}}. In this case, it is true that locality will happen 
to be better, but perhaps not by enough.
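
To make that concrete (hypothetical paths): with {{/DC/BUILDING/RACK/NODE}}, 
the grandparent of a leaf is the BUILDING, so two replicas in the same DC but 
in different buildings would still get weight 3, indistinguishable from 
replicas in a different DC:
{noformat}
reader = /dc1/b1/rack1/nodeA
nodeX  = /dc1/b2/rack7/nodeB   -> grandparents b1 vs b2 -> weight 3
nodeY  = /dc2/b9/rack3/nodeC   -> grandparents b1 vs b9 -> weight 3 (same!)
{noformat}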


> Consider multi datacenter when sortByDistance
> -
>
> Key: HDFS-9149
> URL: https://issues.apache.org/jira/browse/HDFS-9149
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Tianyi
>
> {{sortByDistance}} doesn't consider multiple datacenters when reading data, 
> so reads may go through another datacenter when Hadoop is deployed across 
> multiple IDCs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9190) VolumeScanner throwing NPE while scanning suspect block.

2015-10-02 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-9190:
-
Summary: VolumeScanner throwing NPE while scanning suspect block.  (was: 
VolumeScanner throwing NPE while scanning suspect blocks.)

> VolumeScanner throwing NPE while scanning suspect block.
> 
>
> Key: HDFS-9190
> URL: https://issues.apache.org/jira/browse/HDFS-9190
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.0
>Reporter: Rushabh S Shah
>Priority: Critical
>
> The volume scanner NPEs while scanning a suspect block.
> Following is the stack trace:
> {noformat}
> 2015-10-02 06:45:30,333 [VolumeScannerThread(dataDir)] ERROR 
> datanode.VolumeScanner: VolumeScanner(dataDir, 
> DS-5fc4263e-7a5c-4463-9f82-842108c0ab3b) exiting because of exception 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:539)
> at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:619)
> 2015-10-02 06:45:30,333 
> [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@7768ca5] WARN 
> datanode.DataNode: DatanodeRegistration(sourceDN:1004, 
> datanodeUuid=f554982f-7c45-4fd4-ad57-9d472a39729e, infoPort=1006, 
> infoSecurePort=0, ipcPort=8020, 
> storageInfo=lv=-56;cid=CID-ddc217ab-5203-48ef-9695-a348feb4dac2;nsid=1872110141;c=1443758672580):Failed
>  to transfer BP-1749317823--1443758669533:blk_1073742231_1407 to 
> destDN:1004 got 
> java.net.SocketException: Original Exception : java.io.IOException: 
> Connection reset by peer
> at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
> at 
> sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:443)
> at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:575)
> at 
> org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:223)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:579)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:759)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:706)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2124)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Connection reset by peer
> ... 9 more
> {noformat}
> It is NPEing at this code in VolumeScanner#runLoop:
> {noformat}
>   long saveDelta = monotonicMs - curBlockIter.getLastSavedMs();
> {noformat}
> curBlockIter is not initialized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9191) Typo in Hdfs.java. NoSuchElementException is misspelled

2015-10-02 Thread Catherine Palmer (JIRA)
Catherine Palmer created HDFS-9191:
--

 Summary: Typo in  Hdfs.java.  NoSuchElementException is misspelled
 Key: HDFS-9191
 URL: https://issues.apache.org/jira/browse/HDFS-9191
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Reporter: Catherine Palmer
Assignee: Catherine Palmer
Priority: Trivial
 Fix For: 3.0.0


Line 241 NoSuchElementException has a typo



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9185) TestRecoverStripedFile is failing

2015-10-02 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941492#comment-14941492
 ] 

Jing Zhao commented on HDFS-9185:
-

The new patch looks good to me. All the failed tests passed on my local 
machine.

+1. I will commit it shortly.

> TestRecoverStripedFile is failing
> -
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Attachments: HDFS-9185-00.patch, HDFS-9185-01.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9185) Fix null tracer in ErasureCodingWorker

2015-10-02 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9185:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

I've committed this to trunk. Thanks for the contribution [~rakeshr]! Thanks 
for the review [~umamaheswararao]!

> Fix null tracer in ErasureCodingWorker
> --
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HDFS-9185-00.patch, HDFS-9185-01.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9191) Typo in Hdfs.java. NoSuchElementException is misspelled

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941524#comment-14941524
 ] 

Hudson commented on HDFS-9191:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8556 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8556/])
HDFS-9191. Typo in Hdfs.java. NoSuchElementException is misspelled. (jghoman: 
rev 3929ac9340a5c9f26574dc076a449f7e11931527)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/Hdfs.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Typo in  Hdfs.java.  NoSuchElementException is misspelled
> -
>
> Key: HDFS-9191
> URL: https://issues.apache.org/jira/browse/HDFS-9191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Catherine Palmer
>Assignee: Catherine Palmer
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: hdfs-9191.patch
>
>
> Line 241 NoSuchElementException has a typo



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9149) Consider multi datacenter when sortByDistance

2015-10-02 Thread He Xiaoqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941360#comment-14941360
 ] 

He Xiaoqiao commented on HDFS-9149:
---

Thanks, [~He Tianyi].

It's not an ideal solution, exactly. Maybe we could calculate the weight 
recursively? Or is there a better suggestion?

There is another kind of situation: it is hard to calculate the weight between 
a reader and a DN when the reader is not a node of the Hadoop cluster but is 
in the same IDC as part of the cluster.
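
A rough sketch of the recursive idea (illustration only; it assumes the reader 
and the DataNode sit at the same depth in the topology, and the 
off-cluster-reader case would additionally need the reader's address resolved 
to a topology location first):
{code:java}
// Sketch: the weight is the number of levels climbed before the two nodes
// share an ancestor: 0 = same node, 1 = same rack, 2 = same rack-parent
// (building or IDC), and so on, for arbitrarily deep hierarchies.
protected int getWeight(Node reader, Node node) {
  if (reader == null || node == null) {
    return Integer.MAX_VALUE; // no locality information
  }
  int weight = 0;
  Node r = reader;
  Node n = node;
  while (r != null && n != null && !r.equals(n)) {
    r = r.getParent();
    n = n.getParent();
    weight++;
  }
  return (r == null || n == null) ? Integer.MAX_VALUE : weight;
}
{code}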

> Consider multi datacenter when sortByDistance
> -
>
> Key: HDFS-9149
> URL: https://issues.apache.org/jira/browse/HDFS-9149
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Tianyi
>
> {{sortByDistance}} doesn't consider multiple datacenters when reading data, 
> so reads may go through another datacenter when Hadoop is deployed across 
> multiple IDCs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-02 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941391#comment-14941391
 ] 

Allen Wittenauer commented on HDFS-9184:


Let me clarify a bit:

The HDFS audit log is probably the single most widely machine-parsed log in 
the entirety of Hadoop.  It was specifically made a fixed-field log to make it 
easy even for beginner admins to use, in a format that doesn't require a lot 
of heavy machinery to actually make useful.  As a result, changing the format 
of this file has an extreme impact on pretty much every Hadoop operations team 
in existence.  So while the functionality may be useful, there is no way in 
good conscience we should be modifying the current layout in branch-2.

So I still stand at:

-1 for branch-2
0 for trunk

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper-level job issued it. The upper-level callers may be specific 
> Oozie tasks, MR jobs, or Hive queries. One scenario is that when the namenode 
> (NN) is abused/spammed, the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs at 
> the same time. Even for a single top-level task, tracking back to a specific 
> caller in a chain of operations of the whole workflow (e.g. Oozie -> Hive -> 
> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The spans are created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to the leaves is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, 
> Kerberos-authenticated connections and insecure connections don't have 
> tokens. [HADOOP-8779] proposes to use tokens in all scenarios, but that 
> might mean changes to several upstream projects and is a major change in 
> their security implementation.
> We propose another approach to address this problem. We treat the HDFS audit 
> log as a good place for after-the-fact root cause analysis. We propose to 
> put the caller id (e.g. the Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as an (optional) 
> part of the RPC header, while on the server side the NN retrieves it from 
> the header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9190) VolumeScanner throwing NPE while scanning suspect block.

2015-10-02 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941462#comment-14941462
 ] 

Xiaoyu Yao commented on HDFS-9190:
--

This should be fixed by HDFS-8850. [~shahrs87], can you try building the 
latest trunk, or applying the patch from HDFS-8850, and confirm?

> VolumeScanner throwing NPE while scanning suspect block.
> 
>
> Key: HDFS-9190
> URL: https://issues.apache.org/jira/browse/HDFS-9190
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.0
>Reporter: Rushabh S Shah
>Priority: Critical
>
> The volume scanner NPEs while scanning a suspect block.
> Following is the stack trace:
> {noformat}
> 2015-10-02 06:45:30,333 [VolumeScannerThread(dataDir)] ERROR 
> datanode.VolumeScanner: VolumeScanner(dataDir, 
> DS-5fc4263e-7a5c-4463-9f82-842108c0ab3b) exiting because of exception 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:539)
> at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:619)
> 2015-10-02 06:45:30,333 
> [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@7768ca5] WARN 
> datanode.DataNode: DatanodeRegistration(sourceDN:1004, 
> datanodeUuid=f554982f-7c45-4fd4-ad57-9d472a39729e, infoPort=1006, 
> infoSecurePort=0, ipcPort=8020, 
> storageInfo=lv=-56;cid=CID-ddc217ab-5203-48ef-9695-a348feb4dac2;nsid=1872110141;c=1443758672580):Failed
>  to transfer BP-1749317823--1443758669533:blk_1073742231_1407 to 
> destDN:1004 got 
> java.net.SocketException: Original Exception : java.io.IOException: 
> Connection reset by peer
> at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
> at 
> sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:443)
> at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:575)
> at 
> org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:223)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:579)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:759)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:706)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2124)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Connection reset by peer
> ... 9 more
> {noformat}
> It is NPEing at this code in VolumeScanner#runLoop:
> {noformat}
>   long saveDelta = monotonicMs - curBlockIter.getLastSavedMs();
> {noformat}
> curBlockIter is not initialized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9191) Typo in Hdfs.java. NoSuchElementException is misspelled

2015-10-02 Thread Catherine Palmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Catherine Palmer updated HDFS-9191:
---
Status: Patch Available  (was: Open)

> Typo in  Hdfs.java.  NoSuchElementException is misspelled
> -
>
> Key: HDFS-9191
> URL: https://issues.apache.org/jira/browse/HDFS-9191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Catherine Palmer
>Assignee: Catherine Palmer
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: hdfs-9191.patch
>
>
> Line 241 NoSuchElementException has a typo



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9185) Fix null tracer in ErasureCodingWorker

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941516#comment-14941516
 ] 

Hudson commented on HDFS-9185:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8555 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8555/])
HDFS-9185. Fix null tracer in ErasureCodingWorker. Contributed by Rakesh 
(jing9: rev c6cafc77e697317dad0708309b67b900a2e3a413)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/StripedBlockUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRecoverStripedFile.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/ErasureCodingWorker.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-EC-7285.txt


> Fix null tracer in ErasureCodingWorker
> --
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HDFS-9185-00.patch, HDFS-9185-01.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9191) Typo in Hdfs.java. NoSuchElementException is misspelled

2015-10-02 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-9191:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

+1.  Since it's a comment change, not waiting for Jenkins.  Thanks for the 
contribution, Catherine!

> Typo in  Hdfs.java.  NoSuchElementException is misspelled
> -
>
> Key: HDFS-9191
> URL: https://issues.apache.org/jira/browse/HDFS-9191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Catherine Palmer
>Assignee: Catherine Palmer
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: hdfs-9191.patch
>
>
> Line 241 NoSuchElementException has a typo



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9186) Simplify embedding libhdfspp into other projects

2015-10-02 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9186:
--
Status: Patch Available  (was: Open)

> Simplify embedding libhdfspp into other projects
> 
>
> Key: HDFS-9186
> URL: https://issues.apache.org/jira/browse/HDFS-9186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-9186.HDFS-8707.000.patch
>
>
> I'd like to add a script to the root libhdfspp directory that can prune 
> anything that libhdfspp doesn't need to compile out of the hadoop source 
> tree.  
> This way the project is a lot smaller if it's going to be included in a 
> third-party directory of another project.  The directory structure, aside 
> from missing directories, is preserved so modifications can be diffed against 
> a fresh checkout of the source.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941542#comment-14941542
 ] 

Colin Patrick McCabe commented on HDFS-9184:


Is it documented anywhere that the audit log is key/value?  I didn't see any 
specification for the format... did I miss some docs somewhere?  I don't think 
this is similar to protobuf because there is a clearly defined and documented 
way to extend PB.

Many modern Hadoop systems access HDFS through a proxy.  For example, some 
people use Tachyon to get read and write caching.  RecordService provides 
row-level security and deserialization services.  Hive itself usually does its 
work on behalf of some other process like Tableau, or Spark.  How will this 
solution work in those cases?

For me, a lot of this discussion gets back to the reasons why htrace is a 
separate system rather than just part of HDFS or HBase.  You need something 
that can span multiple projects and create a coherent narrative about what's 
going on.  I agree that HTrace should not be run at 100% sampling, but I am not 
convinced by the arguments that we need 100% sampling.

If this is to diagnose performance issues, then 1% or so sampling should be 
fine.  If this is about security issues, then it seems flawed, since it doesn't 
actually stop anyone from accessing anything.  Can you be a little clearer 
about the specific use-cases for this?

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper-level job issued it. The upper-level callers may be specific 
> Oozie tasks, MR jobs, or Hive queries. One scenario is that when the namenode 
> (NN) is abused/spammed, the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs at 
> the same time. Even for a single top-level task, tracking back to a specific 
> caller in a chain of operations of the whole workflow (e.g. Oozie -> Hive -> 
> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The spans are created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to the leaves is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, 
> Kerberos-authenticated connections and insecure connections don't have 
> tokens. [HADOOP-8779] proposes to use tokens in all scenarios, but that 
> might mean changes to several upstream projects and is a major change in 
> their security implementation.
> We propose another approach to address this problem. We treat the HDFS audit 
> log as a good place for after-the-fact root cause analysis. We propose to 
> put the caller id (e.g. the Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as an (optional) 
> part of the RPC header, while on the server side the NN retrieves it from 
> the header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9186) Simplify embedding libhdfspp into other projects

2015-10-02 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9186:
--
Attachment: HDFS-9186.HDFS-8707.000.patch

Simple script that copies only the things needed to compile/test and puts 
them into `pwd`/minimized.

> Simplify embedding libhdfspp into other projects
> 
>
> Key: HDFS-9186
> URL: https://issues.apache.org/jira/browse/HDFS-9186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-9186.HDFS-8707.000.patch
>
>
> I'd like to add a script to the root libhdfspp directory that can prune 
> anything that libhdfspp doesn't need to compile out of the hadoop source 
> tree.  
> This way the project is a lot smaller if it's going to be included in a 
> third-party directory of another project.  The directory structure, aside 
> from missing directories, is preserved so modifications can be diffed against 
> a fresh checkout of the source.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9190) VolumeScanner throwing NPE while scanning suspect blocks.

2015-10-02 Thread Rushabh S Shah (JIRA)
Rushabh S Shah created HDFS-9190:


 Summary: VolumeScanner throwing NPE while scanning suspect blocks.
 Key: HDFS-9190
 URL: https://issues.apache.org/jira/browse/HDFS-9190
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.7.0
Reporter: Rushabh S Shah
Priority: Critical


The volume scanner NPEs while scanning a suspect block.
Following is the stack trace:
{noformat}
2015-10-02 06:45:30,333 [VolumeScannerThread(dataDir)] ERROR 
datanode.VolumeScanner: VolumeScanner(dataDir, 
DS-5fc4263e-7a5c-4463-9f82-842108c0ab3b) exiting because of exception 
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:539)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:619)
2015-10-02 06:45:30,333 
[org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@7768ca5] WARN 
datanode.DataNode: DatanodeRegistration(sourceDN:1004, 
datanodeUuid=f554982f-7c45-4fd4-ad57-9d472a39729e, infoPort=1006, 
infoSecurePort=0, ipcPort=8020, 
storageInfo=lv=-56;cid=CID-ddc217ab-5203-48ef-9695-a348feb4dac2;nsid=1872110141;c=1443758672580):Failed
 to transfer BP-1749317823--1443758669533:blk_1073742231_1407 to 
destDN:1004 got 
java.net.SocketException: Original Exception : java.io.IOException: Connection 
reset by peer
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at 
sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:443)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:575)
at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:223)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:579)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:759)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:706)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2124)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Connection reset by peer
... 9 more
{noformat}

It is NPEing at this code in VolumeScanner#runLoop:
{noformat}
  long saveDelta = monotonicMs - curBlockIter.getLastSavedMs();
{noformat}
curBlockIter is not initialized.
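
For illustration only (per the follow-up comments, HDFS-8850 contains the 
actual fix), the kind of guard that would avoid this NPE is to skip the 
save-delta bookkeeping while no block iterator has been opened yet:
{code:java}
// Hypothetical guard, not the committed fix: only touch curBlockIter
// once it has actually been initialized.
if (curBlockIter != null) {
  long saveDelta = monotonicMs - curBlockIter.getLastSavedMs();
  // ... existing save logic ...
}
{code}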




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-02 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941442#comment-14941442
 ] 

Daryn Sharp commented on HDFS-9184:
---

Adding another kvp to the audit log is not an incompatible change, and isn't 
IMHO grounds for a -1.  I'm pretty sure the previous proto=(rpc|webhdfs) key 
was added mid-2.x with no fanfare.

The goal of this jira is sorely needed.  The crux is how can we do it with 
minimal performance impact and no incompatibility.  My concern is the overhead 
with a per-call context.  I'd rather see it in the connection context.  I 
thought we could leverage the dfsclient id, but alas it's not part of the 
connection context like I thought.  But, adding an optional & arbitrary string 
to the connection context might work.  I can envision a conceptually simple api 
to append a delimited value.
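
For instance, something like the following (purely hypothetical names, 
sketched only to illustrate the shape of such an API):
{code:java}
// Hypothetical client-side call: append one delimited value to an
// arbitrary, optional string carried in the RPC connection context.
rpcClient.appendConnectionContext("hive_query_id=alice_20151002_0042");
{code}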




> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper-level job issued it. The upper-level callers may be specific 
> Oozie tasks, MR jobs, or Hive queries. One scenario is that when the namenode 
> (NN) is abused/spammed, the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs at 
> the same time. Even for a single top-level task, tracking back to a specific 
> caller in a chain of operations of the whole workflow (e.g. Oozie -> Hive -> 
> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The spans are created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to the leaves is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, 
> Kerberos-authenticated connections and insecure connections don't have 
> tokens. [HADOOP-8779] proposes to use tokens in all scenarios, but that 
> might mean changes to several upstream projects and is a major change in 
> their security implementation.
> We propose another approach to address this problem. We treat the HDFS audit 
> log as a good place for after-the-fact root cause analysis. We propose to 
> put the caller id (e.g. the Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as an (optional) 
> part of the RPC header, while on the server side the NN retrieves it from 
> the header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9185) Fix null tracer in ErasureCodingWorker

2015-10-02 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9185:

Summary: Fix null tracer in ErasureCodingWorker  (was: 
TestRecoverStripedFile is failing)

> Fix null tracer in ErasureCodingWorker
> --
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Attachments: HDFS-9185-00.patch, HDFS-9185-01.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9180) Update excluded DataNodes in DFSStripedOutputStream based on failures in data streamers

2015-10-02 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9180:

Attachment: HDFS-9180.002.patch

Thanks for the review, Yi!

The failed EC-related tests are mainly caused by some bugs in the testing 
code. Updated the patch to fix them.

> Update excluded DataNodes in DFSStripedOutputStream based on failures in data 
> streamers
> ---
>
> Key: HDFS-9180
> URL: https://issues.apache.org/jira/browse/HDFS-9180
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-9180.000.patch, HDFS-9180.001.patch, 
> HDFS-9180.002.patch
>
>
> This is a TODO in HDFS-9040: based on the failures all the striped data 
> streamers hit, the DFSStripedOutputStream should keep a record of all the 
> DataNodes that should be excluded.
> This jira will also fix several bugs in the DFSStripedOutputStream. Will 
> provide more details in the comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-02 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941454#comment-14941454
 ] 

Allen Wittenauer commented on HDFS-9184:


bq.  I'm pretty sure the previous proto=(rpc|webhdfs) key was added mid-2.x 
with no fanfare.

Believe me, it broke stuff.  I would have -1'd that one too.

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper-level job issued it. The upper-level callers may be specific 
> Oozie tasks, MR jobs, or Hive queries. One scenario is that when the namenode 
> (NN) is abused/spammed, the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs at 
> the same time. Even for a single top-level task, tracking back to a specific 
> caller in a chain of operations of the whole workflow (e.g. Oozie -> Hive -> 
> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The spans are created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to the leaves is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, 
> Kerberos-authenticated connections and insecure connections don't have 
> tokens. [HADOOP-8779] proposes to use tokens in all scenarios, but that 
> might mean changes to several upstream projects and is a major change in 
> their security implementation.
> We propose another approach to address this problem. We treat the HDFS audit 
> log as a good place for after-the-fact root cause analysis. We propose to 
> put the caller id (e.g. the Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as an (optional) 
> part of the RPC header, while on the server side the NN retrieves it from 
> the header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-02 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941473#comment-14941473
 ] 

Jitendra Nath Pandey commented on HDFS-9184:


 The audit log format is designed to be a key-value format so that it can be 
extended. Adding a new optional key-value pair is not an incompatible change.
 However, we can also consider making this feature configurable and off by 
default, so that there is no change at all.
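
For illustration (the key name and values here are hypothetical), an entry 
extended with one more optional key-value pair would look like:
{noformat}
2015-10-02 12:00:00,000 INFO FSNamesystem.audit: allowed=true  ugi=alice 
(auth:SIMPLE)  ip=/10.0.0.1  cmd=delete  src=/tmp/data  dst=null  perm=null  
proto=rpc  callerContext=hive_query_id:alice_20151002_0042
{noformat}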

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper-level job issued it. The upper-level callers may be specific 
> Oozie tasks, MR jobs, or Hive queries. One scenario is that when the namenode 
> (NN) is abused/spammed, the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs at 
> the same time. Even for a single top-level task, tracking back to a specific 
> caller in a chain of operations of the whole workflow (e.g. Oozie -> Hive -> 
> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The spans are created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to the leaves is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, 
> Kerberos-authenticated connections and insecure connections don't have 
> tokens. [HADOOP-8779] proposes to use tokens in all scenarios, but that 
> might mean changes to several upstream projects and is a major change in 
> their security implementation.
> We propose another approach to address this problem. We treat the HDFS audit 
> log as a good place for after-the-fact root cause analysis. We propose to 
> put the caller id (e.g. the Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as an (optional) 
> part of the RPC header, while on the server side the NN retrieves it from 
> the header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-9190) VolumeScanner throwing NPE while scanning suspect block.

2015-10-02 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah resolved HDFS-9190.
--
Resolution: Duplicate

[~xyao]: thanks for pointing me to HDFS-8850.
Closing this ticket as a duplicate.

> VolumeScanner throwing NPE while scanning suspect block.
> 
>
> Key: HDFS-9190
> URL: https://issues.apache.org/jira/browse/HDFS-9190
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.0
>Reporter: Rushabh S Shah
>Priority: Critical
>
> The volume scanner NPEs while scanning a suspect block.
> Following is the stack trace:
> {noformat}
> 2015-10-02 06:45:30,333 [VolumeScannerThread(dataDir)] ERROR 
> datanode.VolumeScanner: VolumeScanner(dataDir, 
> DS-5fc4263e-7a5c-4463-9f82-842108c0ab3b) exiting because of exception 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:539)
> at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:619)
> 2015-10-02 06:45:30,333 
> [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@7768ca5] WARN 
> datanode.DataNode: DatanodeRegistration(sourceDN:1004, 
> datanodeUuid=f554982f-7c45-4fd4-ad57-9d472a39729e, infoPort=1006, 
> infoSecurePort=0, ipcPort=8020, 
> storageInfo=lv=-56;cid=CID-ddc217ab-5203-48ef-9695-a348feb4dac2;nsid=1872110141;c=1443758672580):Failed
>  to transfer BP-1749317823--1443758669533:blk_1073742231_1407 to 
> destDN:1004 got 
> java.net.SocketException: Original Exception : java.io.IOException: 
> Connection reset by peer
> at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
> at 
> sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:443)
> at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:575)
> at 
> org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:223)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:579)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:759)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:706)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2124)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Connection reset by peer
> ... 9 more
> {noformat}
> It NPEs at the following line in {{VolumeScanner#runLoop}}:
> {noformat}
>   long saveDelta = monotonicMs - curBlockIter.getLastSavedMs();
> {noformat}
> curBlockIter is not initialized.
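A minimal, self-contained sketch of the failure mode and the guard (names are 
illustrative, not the actual {{VolumeScanner}} code):

{code}
class ScannerSketch {
  interface BlockIterator {
    long getLastSavedMs();
  }

  // Stays null when there is no block pool to scan, even though suspect
  // blocks may already be queued for scanning.
  private BlockIterator curBlockIter;

  long msSinceLastSave(long monotonicMs) {
    if (curBlockIter == null) {
      return 0;  // the unguarded version dereferences null here and NPEs
    }
    return monotonicMs - curBlockIter.getLastSavedMs();
  }
}
{code}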



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9015) Refactor TestReplicationPolicy to test different block placement policies

2015-10-02 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941525#comment-14941525
 ] 

Lei (Eddy) Xu commented on HDFS-9015:
-

+1 LGTM.


> Refactor TestReplicationPolicy to test different block placement policies
> -
>
> Key: HDFS-9015
> URL: https://issues.apache.org/jira/browse/HDFS-9015
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9015.patch
>
>
> TestReplicationPolicy can be parameterized so that default policy, upgrade 
> domain policy and other policies can share some common test cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-02 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941535#comment-14941535
 ] 

Jitendra Nath Pandey commented on HDFS-9184:


bq. ...connection context
  Many applications heavily rely on the filesystem cache and connection cache 
for performance. A string in the connection context would need to be updated 
for different calls, and it may not work in multi-threaded applications. 

  I think if we restrict the length of this additional string, these costs can 
be kept minimal. For example, a default length of 128 bytes would be a small 
increment to current audit log record sizes.
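
A minimal sketch of such a cap (class and constant names are assumptions, not 
from the patch):

{code}
import java.nio.charset.StandardCharsets;

// Illustrative only: bound the caller-context string before it goes into
// the RPC header, so each audit record grows by at most a fixed amount.
final class CallerContextLimiter {
  static final int MAX_CONTEXT_BYTES = 128;  // default suggested above

  static String truncate(String context) {
    byte[] bytes = context.getBytes(StandardCharsets.UTF_8);
    if (bytes.length <= MAX_CONTEXT_BYTES) {
      return context;
    }
    // Note: a real implementation should avoid cutting a multi-byte
    // UTF-8 sequence in half; this sketch ignores that detail.
    return new String(bytes, 0, MAX_CONTEXT_BYTES, StandardCharsets.UTF_8);
  }
}
{code}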

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and Hive queries. One scenario is that the namenode 
> (NN) is abused/spammed; the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs 
> at the same time. Even for a single top level task, tracking back to a 
> specific caller in a chain of operations of the whole workflow (e.g. Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The span is created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, Kerberos 
> authenticated connections or insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all the scenarios, but that might 
> mean changes to several upstream projects and is a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as part of the 
> RPC header (optional), while on the server side the NN retrieves it from the 
> header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9186) Simplify embedding libhdfspp into other projects

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941544#comment-14941544
 ] 

Hadoop QA commented on HDFS-9186:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   0m  0s | Pre-patch HDFS-8707 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 
421 release audit warnings. |
| {color:red}-1{color} | shellcheck |   0m  6s | The applied patch generated  
11 new shellcheck (v0.3.3) issues (total was 25, now 36). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| | |   0m 27s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764814/HDFS-9186.HDFS-8707.000.patch
 |
| Optional Tests | shellcheck |
| git revision | HDFS-8707 / 3668778 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12773/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| shellcheck | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12773/artifact/patchprocess/diffpatchshellcheck.txt
 |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12773/console |


This message was automatically generated.

> Simplify embedding libhdfspp into other projects
> 
>
> Key: HDFS-9186
> URL: https://issues.apache.org/jira/browse/HDFS-9186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-9186.HDFS-8707.000.patch
>
>
> I'd like to add a script to the root libhdfspp directory that can prune 
> anything that libhdfspp doesn't need to compile out of the hadoop source 
> tree.  
> This way the project is a lot smaller if it's going to be included in a 
> third-party directory of another project.  The directory structure, aside 
> from missing directories, is preserved so modifications can be diffed against 
> a fresh checkout of the source.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-02 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941460#comment-14941460
 ] 

Daryn Sharp commented on HDFS-9184:
---

It's a simple and _extensible_ kvp file.  If something doesn't parse it as 
such, it's the parser's fault, not an incompatibility that should hinder 
progress.

Food for thought: by this incompatibility logic, we can't add any new fields to 
protobufs.

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and Hive queries. One scenario is that the namenode 
> (NN) is abused/spammed; the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs 
> at the same time. Even for a single top level task, tracking back to a 
> specific caller in a chain of operations of the whole workflow (e.g. Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The span is created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, Kerberos 
> authenticated connections or insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all the scenarios, but that might 
> mean changes to several upstream projects and is a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as part of the 
> RPC header (optional), while on the server side the NN retrieves it from the 
> header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9191) Typo in Hdfs.java. NoSuchElementException is misspelled

2015-10-02 Thread Catherine Palmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Catherine Palmer updated HDFS-9191:
---
Attachment: hdfs-9191.patch

Quick patch to fix a typo; no tests since the typo is in a comment.

> Typo in  Hdfs.java.  NoSuchElementException is misspelled
> -
>
> Key: HDFS-9191
> URL: https://issues.apache.org/jira/browse/HDFS-9191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Catherine Palmer
>Assignee: Catherine Palmer
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: hdfs-9191.patch
>
>
> Line 241 NoSuchElementException has a typo



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9142) Namenode Http address is not configured correctly for federated cluster in MiniDFSCluster

2015-10-02 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated HDFS-9142:
--
Attachment: HDFS-9142.v4.patch

> Namenode Http address is not configured correctly for federated cluster in 
> MiniDFSCluster
> -
>
> Key: HDFS-9142
> URL: https://issues.apache.org/jira/browse/HDFS-9142
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Siqi Li
>Assignee: Siqi Li
> Attachments: HDFS-9142.v1.patch, HDFS-9142.v2.patch, 
> HDFS-9142.v3.patch, HDFS-9142.v4.patch
>
>
> When setting up simpleHAFederatedTopology in MiniDFSCluster, each NameNode 
> should have its own configuration object, and the configuration should have 
> "dfs.namenode.http-address.<nameservice>.<namenode>" set up correctly for 
> every <nameservice, namenode> pair.
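For illustration, the per-namenode key looks like this (the nameservice and 
namenode ids and the addresses are examples; port 0 requests an ephemeral 
port):

{code}
// Each NN's own Configuration must carry its HTTP address key, e.g.:
conf.set("dfs.namenode.http-address.ns1.nn1", "127.0.0.1:0");
conf.set("dfs.namenode.http-address.ns1.nn2", "127.0.0.1:0");
{code}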



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9015) Refactor TestReplicationPolicy to test different block placement policies

2015-10-02 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-9015:

      Resolution: Fixed
   Fix Version/s: 2.8.0
                  3.0.0
Target Version/s: 3.0.0
          Status: Resolved  (was: Patch Available)

Thanks for the work, [~mingma]. 

Committed to trunk and branch-2.

> Refactor TestReplicationPolicy to test different block placement policies
> -
>
> Key: HDFS-9015
> URL: https://issues.apache.org/jira/browse/HDFS-9015
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9015.patch
>
>
> TestReplicationPolicy can be parameterized so that default policy, upgrade 
> domain policy and other policies can share some common test cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9186) Simplify embedding libhdfspp into other projects

2015-10-02 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9186:
--
Status: Open  (was: Patch Available)

> Simplify embedding libhdfspp into other projects
> 
>
> Key: HDFS-9186
> URL: https://issues.apache.org/jira/browse/HDFS-9186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-9186.HDFS-8707.000.patch
>
>
> I'd like to add a script to the root libhdfspp directory that can prune 
> anything that libhdfspp doesn't need to compile out of the hadoop source 
> tree.  
> This way the project is a lot smaller if it's going to be included in a 
> third-party directory of another project.  The directory structure, aside 
> from missing directories, is preserved so modifications can be diffed against 
> a fresh checkout of the source.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-02 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941613#comment-14941613
 ] 

Allen Wittenauer commented on HDFS-9184:


bq. Is it documented anywhere that the audit log is key/value? I didn't see any 
specification for the format...

It's a) not documented and b) not a kvp.

Story time. This is going to be the shorter version.  

I have few regrets about things I helped design in Hadoop, but this does happen 
to be one of them, especially due to all of the misunderstanding around what 
its purpose in life is and how people actually use it.  When [~chris.douglas] 
and I did the design work on the audit log back in 2008 (IIRC), I specifically 
wanted a fixed-field log file format.  We were going to be writing ops tools to 
answer questions that we, the ops team, simply could not answer otherwise. It 
was important that the format stay fixed for a variety of reasons:

* The ops team at Y! was tiny with a mix of junior and senior folks. The junior 
folks were likely going to be the ones writing the code since the senior folks 
were busy dealing with the continual fallout from the weekly Hadoop upgrades 
and just getting a working infrastructure in place while we moved away from 
YST.  (... and getting ops-specific tooling out of dev was regularly blocked by 
management ...)

* We needed to make sure that no matter what the devs added to Hadoop, the log 
file wouldn't change.  At that point in time, the logs for things like the NN 
were wildly fluctuating and were pretty much impossible to use for any sort of 
metrics or monitoring.  We needed a safe space that was away from the turmoil 
happening in the rest of the system.  If the format had been open ended, it 
would have been absolute hell to work with.  Forcing a format that at that 
point covered 100% of the foreseeable use cases solved that problem.

*  The content was modeled after Solaris BSM with a few key differences.  BSM 
wrote in binary which just wasn't a real option without us pulling out more 
advanced techniques. It would fail the 'quick and dirty' tests that the ops 
team had to have in order to fulfill user needs. BSM also supported a heck of a 
lot more than Hadoop did.  So a straight logfile it was.

Now one of the things I wanted to avoid was the "tab problem", e.g., fields 
that are empty end up looking like field<tab><tab>field. So we settled on a 
<label>=<value> format where every label would always be present so that 
we could then use spaces to break up the columns.  [Thus why I say it is *not* 
kvp.  In most key-value stores that I've worked with, it's rare to see 
key=(null)]. 
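
For reference, a typical HDFS audit log record in this fixed {{label=value}} 
form looks like the following (the values here are illustrative):

{noformat}
2015-10-02 06:45:30,333 INFO FSNamesystem.audit: allowed=true	ugi=alice (auth:KERBEROS)	ip=/10.0.0.1	cmd=delete	src=/user/alice/tmp	dst=null	perm=null
{noformat}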

I've also heard that the file is a "weird form of JSON".  No, it's not.  In 
fact, I vetoed JSON because of the extra parsing overhead with very little gain 
to be seen by doing that vs. just fixing all the fields.

Now, what would I do differently?  #1 would be documentation with a clear 
explanation of this history, covering the whys and the hows.  #2 would probably 
be to make it officially key-value with some fields being required.  But that's 
a different problem altogether.



> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and Hive queries. One scenario is that the namenode 
> (NN) is abused/spammed; the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs 
> at the same time. Even for a single top level task, tracking back to a 
> specific caller in a chain of operations of the whole workflow (e.g. Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The span is created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | 

[jira] [Updated] (HDFS-9188) Make block corruption related tests FsDataset-agnostic.

2015-10-02 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-9188:

Attachment: HDFS-9188.001.patch

Address release audit and whitespace warnings.

The test failures are not relevant. 
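
Roughly, the idea is to hide the on-disk layout behind a small per-dataset 
test utility, so corruption tests stop reaching into {{FsDatasetImpl}}'s 
directories directly. A sketch of the shape (interface and method names here 
are illustrative; the committed API may differ):

{code}
import java.io.IOException;
import org.apache.hadoop.hdfs.protocol.ExtendedBlock;

// Sketch only: each FsDataset implementation provides its own way to
// locate and damage a replica, so tests stay implementation-agnostic.
interface FsDatasetTestUtils {
  interface MaterializedReplica {
    void corruptData() throws IOException;   // flip bytes in the block data
    void corruptMeta() throws IOException;   // damage the checksum/meta file
    void truncateData(long newLength) throws IOException;
  }

  MaterializedReplica getMaterializedReplica(ExtendedBlock block)
      throws IOException;
}
{code}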

> Make block corruption related tests FsDataset-agnostic. 
> 
>
> Key: HDFS-9188
> URL: https://issues.apache.org/jira/browse/HDFS-9188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS, test
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-9188.000.patch, HDFS-9188.001.patch
>
>
> Currently, HDFS does block corruption tests by directly accessing the files 
> stored on the storage directories, which assumes {{FsDatasetImpl}} is the 
> dataset implementation. However, with work like Ozone (HDFS-7240) and 
> HDFS-8679, there will be different FsDataset implementations. 
> So we need a general way to run whitebox tests like corrupting blocks and CRC 
> files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-02 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941639#comment-14941639
 ] 

Daryn Sharp commented on HDFS-9184:
---

(I'll rest my case, sans history, with this: the format is "label=val  label=val 
...", a rather self-documenting format.  If a parser can't handle another 
label, esp. one tacked onto the end, that's just bad programming.)

Anyway, the most basic use-case is:  Production user X is pounding the NN.  I 
wonder what job it is?  Let me look at oozie, arg, 20 jobs.  Hey, user X, stop 
abusing the NN, kill your bad job.  You don't know which job?  Can you tell 
from these paths?  You can't?  Fine, I'll login to one of the hosts in the 
audit log and look for the tasks.  Arg, 5 different jobs running tasks as user 
X on this node.  I guess I'll try to intersect the jobs across multiple 
nodes...  Boy, I wish the audit log could tell me which job it is...

I'd love to see a keep-it-simple approach for this most basic issue we've all 
faced.



> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and Hive queries. One scenario is that the namenode 
> (NN) is abused/spammed; the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs 
> at the same time. Even for a single top level task, tracking back to a 
> specific caller in a chain of operations of the whole workflow (e.g. Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The span is created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, Kerberos 
> authenticated connections or insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all the scenarios, but that might 
> mean changes to several upstream projects and is a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as part of the 
> RPC header (optional), while on the server side the NN retrieves it from the 
> header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9191) Typo in Hdfs.java. NoSuchElementException is misspelled

2015-10-02 Thread Cathy Palmer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941654#comment-14941654
 ] 

Cathy Palmer commented on HDFS-9191:


Thanks for the tutorial!  I'm going to write up my notes and share.  

Good luck to you at your next gig!  You know that if you take your bike to 
work, you can ride to Seattle and back over the bridge at lunch.  Or there's a 
restaurant at the top of Mercer Island, Roanoke Inn that's a great lunch stop.  
You cannot kayak there though.  :)

You can also ride to Factoria mall area via bike trail for lunch.  It will get 
you out of the office.  

Cathy



> Typo in  Hdfs.java.  NoSuchElementException is misspelled
> -
>
> Key: HDFS-9191
> URL: https://issues.apache.org/jira/browse/HDFS-9191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Catherine Palmer
>Assignee: Catherine Palmer
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: hdfs-9191.patch
>
>
> Line 241 NoSuchElementException has a typo



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9191) Typo in Hdfs.java. NoSuchElementException is misspelled

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941574#comment-14941574
 ] 

Hudson commented on HDFS-9191:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2415 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2415/])
HDFS-9191. Typo in Hdfs.java. NoSuchElementException is misspelled. (jghoman: 
rev 3929ac9340a5c9f26574dc076a449f7e11931527)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/Hdfs.java


> Typo in  Hdfs.java.  NoSuchElementException is misspelled
> -
>
> Key: HDFS-9191
> URL: https://issues.apache.org/jira/browse/HDFS-9191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Catherine Palmer
>Assignee: Catherine Palmer
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: hdfs-9191.patch
>
>
> Line 241 NoSuchElementException has a typo



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9015) Refactor TestReplicationPolicy to test different block placement policies

2015-10-02 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941576#comment-14941576
 ] 

Ming Ma commented on HDFS-9015:
---

Thanks [~eddyxu]!

> Refactor TestReplicationPolicy to test different block placement policies
> -
>
> Key: HDFS-9015
> URL: https://issues.apache.org/jira/browse/HDFS-9015
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9015.patch
>
>
> TestReplicationPolicy can be parameterized so that default policy, upgrade 
> domain policy and other policies can share some common test cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9185) Fix null tracer in ErasureCodingWorker

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941592#comment-14941592
 ] 

Hudson commented on HDFS-9185:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1210 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1210/])
HDFS-9185. Fix null tracer in ErasureCodingWorker. Contributed by Rakesh 
(jing9: rev c6cafc77e697317dad0708309b67b900a2e3a413)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/StripedBlockUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-EC-7285.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/ErasureCodingWorker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRecoverStripedFile.java


> Fix null tracer in ErasureCodingWorker
> --
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HDFS-9185-00.patch, HDFS-9185-01.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9100) HDFS Balancer does not respect dfs.client.use.datanode.hostname

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941594#comment-14941594
 ] 

Hudson commented on HDFS-9100:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8558 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8558/])
HDFS-9100. HDFS Balancer does not respect (yzhang: rev 
1037ee580f87e6bf13155834c36f26794381678b)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java


> HDFS Balancer does not respect dfs.client.use.datanode.hostname
> ---
>
> Key: HDFS-9100
> URL: https://issues.apache.org/jira/browse/HDFS-9100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, HDFS
>Reporter: Yongjun Zhang
>Assignee: Casey Brotherton
> Attachments: HDFS-9100.000.patch, HDFS-9100.001.patch, 
> HDFS-9100.002.patch, HDFS-9100.003.patch
>
>
> In Balancer Dispatch.java:
> {code}
>private void dispatch() {
>   LOG.info("Start moving " + this);
>   Socket sock = new Socket();
>   DataOutputStream out = null;
>   DataInputStream in = null;
>   try {
> sock.connect(
> NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr()),
> HdfsConstants.READ_TIMEOUT);
> {code}
> getXferAddr() is called without taking the 
> dfs.client.use.datanode.hostname setting into consideration; this could 
> cause a balancer run issued from outside the cluster to fail.
> Thanks [~caseyjbrotherton] for reporting the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9186) Simplify embedding libhdfspp into other projects

2015-10-02 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941597#comment-14941597
 ] 

James Clampffer commented on HDFS-9186:
---

Decided to close this.  While it does what I need, I think most projects that 
incorporate libhdfs++ are going to include it in application/project-specific 
ways.

> Simplify embedding libhdfspp into other projects
> 
>
> Key: HDFS-9186
> URL: https://issues.apache.org/jira/browse/HDFS-9186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-9186.HDFS-8707.000.patch
>
>
> I'd like to add a script to the root libhdfspp directory that can prune 
> anything that libhdfspp doesn't need to compile out of the hadoop source 
> tree.  
> This way the project is a lot smaller if it's going to be included in a 
> third-party directory of another project.  The directory structure, aside 
> from missing directories, is preserved so modifications can be diffed against 
> a fresh checkout of the source.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9100) HDFS Balancer does not respect dfs.client.use.datanode.hostname

2015-10-02 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-9100:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
       Status: Resolved  (was: Patch Available)

I committed to trunk and branch-2. Thanks Casey for the contribution!
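
For reference, the shape of the fix is roughly the following (a sketch; the 
committed patch may differ in detail, though {{DFSConfigKeys.DFS_CLIENT_USE_DN_HOSTNAME}} 
and {{getXferAddr(boolean)}} are existing APIs):

{code}
// Honor dfs.client.use.datanode.hostname when building the target
// address, instead of always using the registered transfer IP.
boolean connectToDnViaHostname = conf.getBoolean(
    DFSConfigKeys.DFS_CLIENT_USE_DN_HOSTNAME,
    DFSConfigKeys.DFS_CLIENT_USE_DN_HOSTNAME_DEFAULT);
String dnAddr = target.getDatanodeInfo().getXferAddr(connectToDnViaHostname);
sock.connect(NetUtils.createSocketAddr(dnAddr), HdfsConstants.READ_TIMEOUT);
{code}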


> HDFS Balancer does not respect dfs.client.use.datanode.hostname
> ---
>
> Key: HDFS-9100
> URL: https://issues.apache.org/jira/browse/HDFS-9100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, HDFS
>Reporter: Yongjun Zhang
>Assignee: Casey Brotherton
> Fix For: 2.8.0
>
> Attachments: HDFS-9100.000.patch, HDFS-9100.001.patch, 
> HDFS-9100.002.patch, HDFS-9100.003.patch
>
>
> In Balancer Dispatch.java:
> {code}
>private void dispatch() {
>   LOG.info("Start moving " + this);
>   Socket sock = new Socket();
>   DataOutputStream out = null;
>   DataInputStream in = null;
>   try {
> sock.connect(
> NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr()),
> HdfsConstants.READ_TIMEOUT);
> {code}
> getXferAddr() is called without taking the 
> dfs.client.use.datanode.hostname setting into consideration; this could 
> cause a balancer run issued from outside the cluster to fail.
> Thanks [~caseyjbrotherton] for reporting the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9191) Typo in Hdfs.java. NoSuchElementException is misspelled

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941687#comment-14941687
 ] 

Hudson commented on HDFS-9191:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #472 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/472/])
HDFS-9191. Typo in Hdfs.java. NoSuchElementException is misspelled. (jghoman: 
rev 3929ac9340a5c9f26574dc076a449f7e11931527)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/Hdfs.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Typo in  Hdfs.java.  NoSuchElementException is misspelled
> -
>
> Key: HDFS-9191
> URL: https://issues.apache.org/jira/browse/HDFS-9191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Catherine Palmer
>Assignee: Catherine Palmer
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: hdfs-9191.patch
>
>
> Line 241 NoSuchElementException has a typo



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9185) Fix null tracer in ErasureCodingWorker

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941688#comment-14941688
 ] 

Hudson commented on HDFS-9185:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #472 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/472/])
HDFS-9185. Fix null tracer in ErasureCodingWorker. Contributed by Rakesh 
(jing9: rev c6cafc77e697317dad0708309b67b900a2e3a413)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/StripedBlockUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-EC-7285.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/ErasureCodingWorker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRecoverStripedFile.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java


> Fix null tracer in ErasureCodingWorker
> --
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HDFS-9185-00.patch, HDFS-9185-01.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941696#comment-14941696
 ] 

Colin Patrick McCabe commented on HDFS-9184:


[~aw]: I feel like this is a good example of why the audit log format should 
have been JSON.  We wouldn't be having this discussion if the format had been 
one JSON record per line, since it would be obvious how to parse it.  It's also 
relatively easy to find libraries for JSON in every language you might want to 
use (although perhaps it wasn't so easy back when the audit log was first added 
to HDFS?).  I'm not sure I understand the desire for COBOL-style fixed fields 
(party like it's 1975?).  But I do agree that compatibility is a concern here 
since there is basically no spec that we can point to when people are writing 
their parsers.  They could easily just be doing {{scanf("%s %s %s", foo, bar, 
baz)}} and then we would break them.
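
That is, one self-describing record per line, along the lines of the following 
(field names mirror the current audit fields; the values are illustrative):

{noformat}
{"allowed":true,"ugi":"alice (auth:KERBEROS)","ip":"/10.0.0.1","cmd":"delete","src":"/user/alice/tmp","dst":null,"perm":null}
{noformat}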

[~daryn]: thanks for giving an example of how this would be used.  I agree this 
has been a pain point for a while.  This is possibly a dumb question, but 
couldn't clientId be used for this purpose?

This solution also presupposes some kind of daemon or service to gather context 
IDs in Hive.  This service hasn't been written yet, but if it were, it seems 
like it might start looking a lot like HTrace.  Like I said earlier, I also 
feel like this solution wouldn't work in the case where HBase was in use, or 
RecordService, or Tachyon.  We are definitely planning some YARN and MR 
integration for HTrace.  I would really like to get more people excited about 
this project and work out what we'd need to do to get it to cover all these 
use-cases.

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and Hive queries. One scenario is that the namenode 
> (NN) is abused/spammed; the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs 
> at the same time. Even for a single top level task, tracking back to a 
> specific caller in a chain of operations of the whole workflow (e.g. Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The span is created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, Kerberos 
> authenticated connections or insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all the scenarios, but that might 
> mean changes to several upstream projects and is a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as part of the 
> RPC header (optional), while on the server side the NN retrieves it from the 
> header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.

[jira] [Commented] (HDFS-9185) Fix null tracer in ErasureCodingWorker

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941575#comment-14941575
 ] 

Hudson commented on HDFS-9185:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2415 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2415/])
HDFS-9185. Fix null tracer in ErasureCodingWorker. Contributed by Rakesh 
(jing9: rev c6cafc77e697317dad0708309b67b900a2e3a413)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/StripedBlockUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRecoverStripedFile.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/ErasureCodingWorker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-EC-7285.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java


> Fix null tracer in ErasureCodingWorker
> --
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HDFS-9185-00.patch, HDFS-9185-01.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-9186) Simplify embedding libhdfspp into other projects

2015-10-02 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer resolved HDFS-9186.
---
Resolution: Not A Problem

> Simplify embedding libhdfspp into other projects
> 
>
> Key: HDFS-9186
> URL: https://issues.apache.org/jira/browse/HDFS-9186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-9186.HDFS-8707.000.patch
>
>
> I'd like to add a script to the root libhdfspp directory that can prune 
> anything that libhdfspp doesn't need to compile out of the hadoop source 
> tree.  
> This way the project is a lot smaller if it's going to be included in a 
> third-party directory of another project.  The directory structure, aside 
> from missing directories, is preserved so modifications can be diffed against 
> a fresh checkout of the source.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-02 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941652#comment-14941652
 ] 

Allen Wittenauer commented on HDFS-9184:


bq.  If a parser can't handle another label, esp. one tacked on to the end, 
that's just bad programming

You've missed several key points in that story.

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and Hive queries. One scenario is that the namenode 
> (NN) is abused/spammed; the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs 
> at the same time. Even for a single top level task, tracking back to a 
> specific caller in a chain of operations of the whole workflow (e.g. Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The span is created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, Kerberos 
> authenticated connections or insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all the scenarios, but that might 
> mean changes to several upstream projects and is a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as part of the 
> RPC header (optional), while on the server side the NN retrieves it from the 
> header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8850) VolumeScanner thread exits with exception if there is no block pool to be scanned but there are suspicious blocks

2015-10-02 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-8850:
-
Labels:   (was: 2.7.2-candidate)

> VolumeScanner thread exits with exception if there is no block pool to be 
> scanned but there are suspicious blocks
> -
>
> Key: HDFS-8850
> URL: https://issues.apache.org/jira/browse/HDFS-8850
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-8850.001.patch
>
>
> The VolumeScanner threads inside the BlockScanner exit with an exception if 
> there is no block pool to be scanned but there are suspicious blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8850) VolumeScanner thread exits with exception if there is no block pool to be scanned but there are suspicious blocks

2015-10-02 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-8850:
-
Fix Version/s: (was: 2.8.0)
               2.7.2
               3.0.0

> VolumeScanner thread exits with exception if there is no block pool to be 
> scanned but there are suspicious blocks
> -
>
> Key: HDFS-8850
> URL: https://issues.apache.org/jira/browse/HDFS-8850
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-8850.001.patch
>
>
> The VolumeScanner threads inside the BlockScanner exit with an exception if 
> there is no block pool to be scanned but there are suspicious blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9137) DeadLock between DataNode#refreshVolumes and BPOfferService#registrationSucceeded

2015-10-02 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-9137:
--
Attachment: HDFS-9137.00.patch

Attached a patch for fixing this issue.
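
The underlying pattern is a classic lock-order inversion; a self-contained 
sketch of the two paths (names are illustrative, not the actual 
DataNode/BPOfferService code):

{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

class DeadlockSketch {
  private final Object dnLock = new Object();  // stands in for the DataNode monitor
  private final ReentrantReadWriteLock bposLock = new ReentrantReadWriteLock();

  // Path 1 (refreshVolumes): DN lock first, then the bpos read lock
  // (taken when toString() is called on bpos for the block report log).
  void refreshVolumes() {
    synchronized (dnLock) {
      bposLock.readLock().lock();
      try { /* trigger block report */ } finally { bposLock.readLock().unlock(); }
    }
  }

  // Path 2 (registrationSucceeded): bpos write lock first, then the DN
  // lock via the synchronized dn.bpRegistrationSucceeded() callback.
  void registrationSucceeded() {
    bposLock.writeLock().lock();
    try {
      synchronized (dnLock) { /* dn.bpRegistrationSucceeded() */ }
    } finally { bposLock.writeLock().unlock(); }
  }
}
{code}

Running the two paths concurrently can leave each thread holding the lock the 
other needs; moving the block report trigger outside the DN lock breaks the 
cycle.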

> DeadLock between DataNode#refreshVolumes and 
> BPOfferService#registrationSucceeded 
> --
>
> Key: HDFS-9137
> URL: https://issues.apache.org/jira/browse/HDFS-9137
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.7.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-9137.00.patch
>
>
> I can see that the code flow between DataNode#refreshVolumes and 
> BPOfferService#registrationSucceeded could cause a deadlock.
> In practice the situation may be rare, as it requires a user to call 
> refreshVolumes at the time of DN registration with the NN. But it seems the 
> issue can happen.
>  Reason for the deadlock:
>   1) refreshVolumes is called with the DN lock held, and at the end it also 
> triggers a block report. In the block report call, 
> BPServiceActor#triggerBlockReport calls toString on bpos, which takes the 
> read lock on bpos.
>  DN lock, then bpos lock.
> 2) BPOfferService#registrationSucceeded takes the write lock on bpos and 
>  calls dn.bpRegistrationSucceeded, which is again a synchronized call on the 
> DN.
> bpos lock, then DN lock.
> So, this can clearly create a deadlock.
> I think a simple fix could be to move the triggerBlockReport call outside the 
> DN lock; that call may not really need to be inside the DN lock.
> Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9137) DeadLock between DataNode#refreshVolumes and BPOfferService#registrationSucceeded

2015-10-02 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-9137:
--
Status: Patch Available  (was: Open)

> DeadLock between DataNode#refreshVolumes and 
> BPOfferService#registrationSucceeded 
> --
>
> Key: HDFS-9137
> URL: https://issues.apache.org/jira/browse/HDFS-9137
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1, 3.0.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-9137.00.patch
>
>
> I can see that the code flow between DataNode#refreshVolumes and 
> BPOfferService#registrationSucceeded could cause a deadlock.
> In practice the situation may be rare, as it requires a user to call 
> refreshVolumes at the time of DN registration with the NN. But it seems the 
> issue can happen.
>  Reason for the deadlock:
>   1) refreshVolumes is called with the DN lock held, and at the end it also 
> triggers a block report. In the block report call, 
> BPServiceActor#triggerBlockReport calls toString on bpos, which takes the 
> read lock on bpos.
>  DN lock, then bpos lock.
> 2) BPOfferService#registrationSucceeded takes the write lock on bpos and 
>  calls dn.bpRegistrationSucceeded, which is again a synchronized call on the 
> DN.
> bpos lock, then DN lock.
> So, this can clearly create a deadlock.
> I think a simple fix could be to move the triggerBlockReport call outside the 
> DN lock; that call may not really need to be inside the DN lock.
> Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8850) VolumeScanner thread exits with exception if there is no block pool to be scanned but there are suspicious blocks

2015-10-02 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941660#comment-14941660
 ] 

Kihwal Lee commented on HDFS-8850:
--

Cherry-picked to branch-2.7.

> VolumeScanner thread exits with exception if there is no block pool to be 
> scanned but there are suspicious blocks
> -
>
> Key: HDFS-8850
> URL: https://issues.apache.org/jira/browse/HDFS-8850
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-8850.001.patch
>
>
> The VolumeScanner threads inside the BlockScanner exit with an exception if 
> there is no block pool to be scanned but there are suspicious blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HDFS-9191) Typo in Hdfs.java. NoSuchElementException is misspelled

2015-10-02 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-9191:
--
Comment: was deleted

(was: Thanks for the tutorial!  I'm going to write up my notes and share.  

Good luck to you at your next gig!  You know that if you take your bike to 
work, you can ride to Seattle and back over the bridge at lunch.  Or there's a 
restaurant at the top of Mercer Island, Roanoke Inn that's a great lunch stop.  
You cannot kayak there though.  :)

You can also ride to Factoria mall area via bike trail for lunch.  It will get 
you out of the office.  

Cathy

)

> Typo in  Hdfs.java.  NoSuchElementException is misspelled
> -
>
> Key: HDFS-9191
> URL: https://issues.apache.org/jira/browse/HDFS-9191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Catherine Palmer
>Assignee: Catherine Palmer
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: hdfs-9191.patch
>
>
> Line 241 NoSuchElementException has a typo



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9100) HDFS Balancer does not respect dfs.client.use.datanode.hostname

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941693#comment-14941693
 ] 

Hudson commented on HDFS-9100:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2416 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2416/])
HDFS-9100. HDFS Balancer does not respect (yzhang: rev 
1037ee580f87e6bf13155834c36f26794381678b)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> HDFS Balancer does not respect dfs.client.use.datanode.hostname
> ---
>
> Key: HDFS-9100
> URL: https://issues.apache.org/jira/browse/HDFS-9100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, HDFS
>Reporter: Yongjun Zhang
>Assignee: Casey Brotherton
> Fix For: 2.8.0
>
> Attachments: HDFS-9100.000.patch, HDFS-9100.001.patch, 
> HDFS-9100.002.patch, HDFS-9100.003.patch
>
>
> In Balancer Dispatch.java:
> {code}
>private void dispatch() {
>   LOG.info("Start moving " + this);
>   Socket sock = new Socket();
>   DataOutputStream out = null;
>   DataInputStream in = null;
>   try {
> sock.connect(
> NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr()),
> HdfsConstants.READ_TIMEOUT);
> {code}
> getXferAddr() is called without taking the dfs.client.use.datanode.hostname 
> setting into consideration; this can cause a balancer run issued from 
> outside the cluster to fail.
> Thanks [~caseyjbrotherton] for reporting the issue.
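
A minimal sketch of the direction a fix could take, assuming the 
{{DatanodeID#getXferAddr(boolean)}} overload that returns hostname:port 
instead of ip:port when asked (a sketch under those assumptions, not 
necessarily the committed patch):

{code}
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
import org.apache.hadoop.net.NetUtils;

public class XferAddrSketch {
  // Build the target address while honoring dfs.client.use.datanode.hostname,
  // instead of unconditionally using the ip:port form.
  static InetSocketAddress targetAddr(Configuration conf, DatanodeInfo dn) {
    boolean useHostname =
        conf.getBoolean("dfs.client.use.datanode.hostname", false);
    return NetUtils.createSocketAddr(dn.getXferAddr(useHostname));
  }
}
{code}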



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8873) Allow the directoryScanner to be rate-limited

2015-10-02 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-8873:
-
Labels: 2.7.2-candidate  (was: )

> Allow the directoryScanner to be rate-limited
> -
>
> Key: HDFS-8873
> URL: https://issues.apache.org/jira/browse/HDFS-8873
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Nathan Roberts
>Assignee: Daniel Templeton
>  Labels: 2.7.2-candidate
> Fix For: 2.8.0
>
> Attachments: HDFS-8873.001.patch, HDFS-8873.002.patch, 
> HDFS-8873.003.patch, HDFS-8873.004.patch, HDFS-8873.005.patch, 
> HDFS-8873.006.patch, HDFS-8873.007.patch, HDFS-8873.008.patch, 
> HDFS-8873.009.patch
>
>
> The new 2-level directory layout can make directory scans expensive in terms 
> of disk seeks (see HDFS-8791 for details). 
> It would be good if the directoryScanner() had a configurable duty cycle that 
> would reduce its impact on disk performance (much like the approach in 
> HDFS-8617). 
> Without such a throttle, disks can go 100% busy for many minutes at a time 
> (assuming the common case of all inodes in cache but no directory blocks 
> cached, a full directory listing requires 64K seeks, which at roughly 10 ms 
> per seek translates to about 655 seconds).
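
As an illustration of the duty-cycle idea, here is a minimal, self-contained 
Java sketch (a hypothetical helper, not the HDFS patch): allow work for only a 
fraction of each window and sleep out the remainder, which bounds how long the 
disks can stay 100% busy.

{code}
public class DutyCycleThrottle {
  private final long periodMs; // length of one throttle window
  private final long runMs;    // work budget within each window
  private long windowStart = System.currentTimeMillis();

  public DutyCycleThrottle(long periodMs, double dutyCycle) {
    this.periodMs = periodMs;
    this.runMs = (long) (periodMs * dutyCycle);
  }

  // Call between units of work; blocks once the window's budget is spent.
  public void maybeThrottle() throws InterruptedException {
    long elapsed = System.currentTimeMillis() - windowStart;
    if (elapsed >= periodMs) {
      windowStart = System.currentTimeMillis(); // new window, fresh budget
    } else if (elapsed >= runMs) {
      Thread.sleep(periodMs - elapsed);         // idle out the window
      windowStart = System.currentTimeMillis();
    }
  }
}
{code}

For example, {{new DutyCycleThrottle(1000, 0.25)}} would let a scanner run at 
most ~250 ms out of every second.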



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9100) HDFS Balancer does not respect dfs.client.use.datanode.hostname

2015-10-02 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941560#comment-14941560
 ] 

Yongjun Zhang commented on HDFS-9100:
-

Sorry for the delay, will commit momentarily.


> HDFS Balancer does not respect dfs.client.use.datanode.hostname
> ---
>
> Key: HDFS-9100
> URL: https://issues.apache.org/jira/browse/HDFS-9100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, HDFS
>Reporter: Yongjun Zhang
>Assignee: Casey Brotherton
> Attachments: HDFS-9100.000.patch, HDFS-9100.001.patch, 
> HDFS-9100.002.patch, HDFS-9100.003.patch
>
>
> In Balancer Dispatch.java:
> {code}
>private void dispatch() {
>   LOG.info("Start moving " + this);
>   Socket sock = new Socket();
>   DataOutputStream out = null;
>   DataInputStream in = null;
>   try {
> sock.connect(
> NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr()),
> HdfsConstants.READ_TIMEOUT);
> {code}
> getXferAddr() is called without taking the dfs.client.use.datanode.hostname 
> setting into consideration; this can cause a balancer run issued from 
> outside the cluster to fail.
> Thanks [~caseyjbrotherton] for reporting the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9185) Fix null tracer in ErasureCodingWorker

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941670#comment-14941670
 ] 

Hudson commented on HDFS-9185:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #480 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/480/])
HDFS-9185. Fix null tracer in ErasureCodingWorker. Contributed by Rakesh 
(jing9: rev c6cafc77e697317dad0708309b67b900a2e3a413)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRecoverStripedFile.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-EC-7285.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/StripedBlockUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/ErasureCodingWorker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java


> Fix null tracer in ErasureCodingWorker
> --
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HDFS-9185-00.patch, HDFS-9185-01.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8850) VolumeScanner thread exits with exception if there is no block pool to be scanned but there are suspicious blocks

2015-10-02 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941558#comment-14941558
 ] 

Rushabh S Shah commented on HDFS-8850:
--

[~hitliuyi], [~cmccabe]: Does it make sense to commit this to 2.7.2?

> VolumeScanner thread exits with exception if there is no block pool to be 
> scanned but there are suspicious blocks
> -
>
> Key: HDFS-8850
> URL: https://issues.apache.org/jira/browse/HDFS-8850
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>  Labels: 2.7.2-candidate
> Fix For: 2.8.0
>
> Attachments: HDFS-8850.001.patch
>
>
> The VolumeScanner threads inside the BlockScanner exit with an exception if 
> there is no block pool to be scanned but there are suspicious blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9015) Refactor TestReplicationPolicy to test different block placement policies

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941568#comment-14941568
 ] 

Hudson commented on HDFS-9015:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8557 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8557/])
HDFS-9015. Refactor TestReplicationPolicy to test different block (lei: rev 
a68b6eb0f4110ba626a44fad6b9eb5d8c5a4901f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BaseReplicationPolicyTest.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java


> Refactor TestReplicationPolicy to test different block placement policies
> -
>
> Key: HDFS-9015
> URL: https://issues.apache.org/jira/browse/HDFS-9015
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9015.patch
>
>
> TestReplicationPolicy can be parameterized so that default policy, upgrade 
> domain policy and other policies can share some common test cases.
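
For illustration, one common way to share cases across policies is JUnit's 
{{Parameterized}} runner; a hypothetical sketch follows (the policy names are 
placeholders, and the actual patch, which introduces 
BaseReplicationPolicyTest, may structure this differently):

{code}
import java.util.Arrays;
import java.util.Collection;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

@RunWith(Parameterized.class)
public class ReplicationPolicyParamSketch {
  private final String policyClassName;

  public ReplicationPolicyParamSketch(String policyClassName) {
    this.policyClassName = policyClassName;
  }

  // Each entry becomes one run of every @Test in this class.
  @Parameters
  public static Collection<Object[]> policies() {
    return Arrays.asList(new Object[][] {
        { "BlockPlacementPolicyDefault" },
        { "BlockPlacementPolicyWithUpgradeDomain" },
    });
  }

  @Test
  public void sharedPlacementCase() {
    // A real test would instantiate the policy and assert on chooseTarget();
    // here we only demonstrate the per-policy parameterization.
    System.out.println("running shared case against " + policyClassName);
  }
}
{code}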



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9191) Typo in Hdfs.java. NoSuchElementException is misspelled

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941591#comment-14941591
 ] 

Hudson commented on HDFS-9191:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1210 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1210/])
HDFS-9191. Typo in Hdfs.java. NoSuchElementException is misspelled. (jghoman: 
rev 3929ac9340a5c9f26574dc076a449f7e11931527)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/Hdfs.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Typo in  Hdfs.java.  NoSuchElementException is misspelled
> -
>
> Key: HDFS-9191
> URL: https://issues.apache.org/jira/browse/HDFS-9191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Catherine Palmer
>Assignee: Catherine Palmer
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: hdfs-9191.patch
>
>
> Line 241 NoSuchElementException has a typo



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9015) Refactor TestReplicationPolicy to test different block placement policies

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941593#comment-14941593
 ] 

Hudson commented on HDFS-9015:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1210 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1210/])
HDFS-9015. Refactor TestReplicationPolicy to test different block (lei: rev 
a68b6eb0f4110ba626a44fad6b9eb5d8c5a4901f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BaseReplicationPolicyTest.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java


> Refactor TestReplicationPolicy to test different block placement policies
> -
>
> Key: HDFS-9015
> URL: https://issues.apache.org/jira/browse/HDFS-9015
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9015.patch
>
>
> TestReplicationPolicy can be parameterized so that default policy, upgrade 
> domain policy and other policies can share some common test cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9191) Typo in Hdfs.java. NoSuchElementException is misspelled

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941669#comment-14941669
 ] 

Hudson commented on HDFS-9191:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #480 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/480/])
HDFS-9191. Typo in Hdfs.java. NoSuchElementException is misspelled. (jghoman: 
rev 3929ac9340a5c9f26574dc076a449f7e11931527)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/Hdfs.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Typo in  Hdfs.java.  NoSuchElementException is misspelled
> -
>
> Key: HDFS-9191
> URL: https://issues.apache.org/jira/browse/HDFS-9191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Catherine Palmer
>Assignee: Catherine Palmer
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: hdfs-9191.patch
>
>
> Line 241 NoSuchElementException has a typo



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9100) HDFS Balancer does not respect dfs.client.use.datanode.hostname

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941677#comment-14941677
 ] 

Hudson commented on HDFS-9100:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #446 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/446/])
HDFS-9100. HDFS Balancer does not respect (yzhang: rev 
1037ee580f87e6bf13155834c36f26794381678b)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> HDFS Balancer does not respect dfs.client.use.datanode.hostname
> ---
>
> Key: HDFS-9100
> URL: https://issues.apache.org/jira/browse/HDFS-9100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, HDFS
>Reporter: Yongjun Zhang
>Assignee: Casey Brotherton
> Fix For: 2.8.0
>
> Attachments: HDFS-9100.000.patch, HDFS-9100.001.patch, 
> HDFS-9100.002.patch, HDFS-9100.003.patch
>
>
> In Balancer Dispatch.java:
> {code}
>private void dispatch() {
>   LOG.info("Start moving " + this);
>   Socket sock = new Socket();
>   DataOutputStream out = null;
>   DataInputStream in = null;
>   try {
> sock.connect(
> NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr()),
> HdfsConstants.READ_TIMEOUT);
> {code}
> getXferAddr() is called without taking the dfs.client.use.datanode.hostname 
> setting into consideration; this can cause a balancer run issued from 
> outside the cluster to fail.
> Thanks [~caseyjbrotherton] for reporting the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9185) Fix null tracer in ErasureCodingWorker

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941679#comment-14941679
 ] 

Hudson commented on HDFS-9185:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #446 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/446/])
HDFS-9185. Fix null tracer in ErasureCodingWorker. Contributed by Rakesh 
(jing9: rev c6cafc77e697317dad0708309b67b900a2e3a413)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRecoverStripedFile.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/StripedBlockUtil.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-EC-7285.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/ErasureCodingWorker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java


> Fix null tracer in ErasureCodingWorker
> --
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HDFS-9185-00.patch, HDFS-9185-01.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9191) Typo in Hdfs.java. NoSuchElementException is misspelled

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941678#comment-14941678
 ] 

Hudson commented on HDFS-9191:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #446 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/446/])
HDFS-9191. Typo in Hdfs.java. NoSuchElementException is misspelled. (jghoman: 
rev 3929ac9340a5c9f26574dc076a449f7e11931527)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/Hdfs.java


> Typo in  Hdfs.java.  NoSuchElementException is misspelled
> -
>
> Key: HDFS-9191
> URL: https://issues.apache.org/jira/browse/HDFS-9191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Catherine Palmer
>Assignee: Catherine Palmer
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: hdfs-9191.patch
>
>
> Line 241 NoSuchElementException has a typo



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9015) Refactor TestReplicationPolicy to test different block placement policies

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941680#comment-14941680
 ] 

Hudson commented on HDFS-9015:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #446 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/446/])
HDFS-9015. Refactor TestReplicationPolicy to test different block (lei: rev 
a68b6eb0f4110ba626a44fad6b9eb5d8c5a4901f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BaseReplicationPolicyTest.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java


> Refactor TestReplicationPolicy to test different block placement policies
> -
>
> Key: HDFS-9015
> URL: https://issues.apache.org/jira/browse/HDFS-9015
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9015.patch
>
>
> TestReplicationPolicy can be parameterized so that default policy, upgrade 
> domain policy and other policies can share some common test cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9015) Refactor TestReplicationPolicy to test different block placement policies

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941694#comment-14941694
 ] 

Hudson commented on HDFS-9015:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2416 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2416/])
HDFS-9015. Refactor TestReplicationPolicy to test different block (lei: rev 
a68b6eb0f4110ba626a44fad6b9eb5d8c5a4901f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BaseReplicationPolicyTest.java


> Refactor TestReplicationPolicy to test different block placement policies
> -
>
> Key: HDFS-9015
> URL: https://issues.apache.org/jira/browse/HDFS-9015
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9015.patch
>
>
> TestReplicationPolicy can be parameterized so that default policy, upgrade 
> domain policy and other policies can share some common test cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9193) when a datanode's usage goes above 70 percent, we can't open datanodes tab in NN UI

2015-10-02 Thread Chang Li (JIRA)
Chang Li created HDFS-9193:
--

 Summary: when a datanode's usage goes above 70 percent, we can't 
open datanodes tab in NN UI
 Key: HDFS-9193
 URL: https://issues.apache.org/jira/browse/HDFS-9193
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Chang Li
Assignee: Chang Li
Priority: Blocker






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9193) when a datanode's usage goes above 70 percent, we can't open datanodes tab in NN UI

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941830#comment-14941830
 ] 

Hadoop QA commented on HDFS-9193:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   0m  0s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | release audit |   0m 12s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| | |   0m 16s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764849/HDFS-9193.patch |
| Optional Tests |  |
| git revision | trunk / fdf02d1 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12778/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12778/console |


This message was automatically generated.

> when a datanode's usage goes above 70 percent, we can't open datanodes tab in 
> NN UI
> -
>
> Key: HDFS-9193
> URL: https://issues.apache.org/jira/browse/HDFS-9193
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>Priority: Blocker
> Attachments: HDFS-9193.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-3107) HDFS truncate

2015-10-02 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941846#comment-14941846
 ] 

Konstantin Shvachko commented on HDFS-3107:
---

Sorry, got distracted. It would be good to create a new JIRA for truncate 
support in NFS.

> HDFS truncate
> -
>
> Key: HDFS-3107
> URL: https://issues.apache.org/jira/browse/HDFS-3107
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Reporter: Lei Chang
>Assignee: Plamen Jeliazkov
> Fix For: 2.7.0
>
> Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, 
> HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, 
> HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, 
> HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, 
> HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, 
> HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the 
> underlying storage when a transaction is aborted. Currently HDFS does not 
> support truncate (a standard POSIX operation), the reverse operation of 
> append, which forces upper-layer applications to use ugly workarounds (such 
> as keeping track of the discarded byte range per file in a separate metadata 
> store, and periodically running a vacuum process to rewrite compacted files) 
> to overcome this limitation of HDFS.
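
For reference, a minimal usage sketch of the resulting 
{{FileSystem#truncate}} API (the path and length below are arbitrary 
examples):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TruncateExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path file = new Path("/tmp/txlog");
    // Discard everything past byte 1024, e.g. to undo an aborted append.
    boolean done = fs.truncate(file, 1024L);
    // true: the truncate completed immediately; false: block recovery is in
    // progress and the file should not be reused until recovery finishes.
    System.out.println("truncate finished without recovery: " + done);
  }
}
{code}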



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8164) cTime is 0 in VERSION file for newly formatted NameNode.

2015-10-02 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-8164:

Status: Open  (was: Patch Available)

> cTime is 0 in VERSION file for newly formatted NameNode.
> 
>
> Key: HDFS-8164
> URL: https://issues.apache.org/jira/browse/HDFS-8164
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Nauroth
>Assignee: Xiao Chen
>Priority: Minor
> Attachments: HDFS-8164.001.patch, HDFS-8164.002.patch
>
>
> After formatting a NameNode and inspecting its VERSION file, the cTime 
> property shows 0.  The value does get updated to current time during an 
> upgrade, but I believe this is intended to be the creation time of the 
> cluster, and therefore the initial value of 0 before an upgrade can cause 
> confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8164) cTime is 0 in VERSION file for newly formatted NameNode.

2015-10-02 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-8164:

Attachment: HDFS-8164.003.patch

> cTime is 0 in VERSION file for newly formatted NameNode.
> 
>
> Key: HDFS-8164
> URL: https://issues.apache.org/jira/browse/HDFS-8164
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Nauroth
>Assignee: Xiao Chen
>Priority: Minor
> Attachments: HDFS-8164.001.patch, HDFS-8164.002.patch, 
> HDFS-8164.003.patch
>
>
> After formatting a NameNode and inspecting its VERSION file, the cTime 
> property shows 0.  The value does get updated to current time during an 
> upgrade, but I believe this is intended to be the creation time of the 
> cluster, and therefore the initial value of 0 before an upgrade can cause 
> confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9192) Add valgrind suppression for statically initialized library objects

2015-10-02 Thread James Clampffer (JIRA)
James Clampffer created HDFS-9192:
-

 Summary: Add valgrind suppression for statically initialized 
library objects
 Key: HDFS-9192
 URL: https://issues.apache.org/jira/browse/HDFS-9192
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: James Clampffer
Assignee: James Clampffer


When using --leak-check=full there's a lot of noise due to static 
initialization of constants and memory pools, most of it from protobuf.

Add a suppression file that helps cut down on this noise but is selective 
enough that real issues aren't masked as well.
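
For illustration, an entry in such a file might take roughly this shape (the 
frame patterns below are hypothetical; real entries would be generated from 
actual valgrind output, e.g. with --gen-suppressions):

{code}
{
   protobuf-static-init
   Memcheck:Leak
   match-leak-kinds: reachable
   ...
   fun:_ZN6google8protobuf*
}
{code}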



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9100) HDFS Balancer does not respect dfs.client.use.datanode.hostname

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941808#comment-14941808
 ] 

Hudson commented on HDFS-9100:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #481 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/481/])
HDFS-9100. HDFS Balancer does not respect (yzhang: rev 
1037ee580f87e6bf13155834c36f26794381678b)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> HDFS Balancer does not respect dfs.client.use.datanode.hostname
> ---
>
> Key: HDFS-9100
> URL: https://issues.apache.org/jira/browse/HDFS-9100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, HDFS
>Reporter: Yongjun Zhang
>Assignee: Casey Brotherton
> Fix For: 2.8.0
>
> Attachments: HDFS-9100.000.patch, HDFS-9100.001.patch, 
> HDFS-9100.002.patch, HDFS-9100.003.patch
>
>
> In Balancer Dispatch.java:
> {code}
>private void dispatch() {
>   LOG.info("Start moving " + this);
>   Socket sock = new Socket();
>   DataOutputStream out = null;
>   DataInputStream in = null;
>   try {
> sock.connect(
> NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr()),
> HdfsConstants.READ_TIMEOUT);
> {code}
> getXferAddr() is called without taking the dfs.client.use.datanode.hostname 
> setting into consideration; this can cause a balancer run issued from 
> outside the cluster to fail.
> Thanks [~caseyjbrotherton] for reporting the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9015) Refactor TestReplicationPolicy to test different block placement policies

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941809#comment-14941809
 ] 

Hudson commented on HDFS-9015:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #481 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/481/])
HDFS-9015. Refactor TestReplicationPolicy to test different block (lei: rev 
a68b6eb0f4110ba626a44fad6b9eb5d8c5a4901f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BaseReplicationPolicyTest.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java


> Refactor TestReplicationPolicy to test different block placement policies
> -
>
> Key: HDFS-9015
> URL: https://issues.apache.org/jira/browse/HDFS-9015
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9015.patch
>
>
> TestReplicationPolicy can be parameterized so that default policy, upgrade 
> domain policy and other policies can share some common test cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-3107) HDFS truncate

2015-10-02 Thread Constantine Peresypkin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941851#comment-14941851
 ] 

Constantine Peresypkin commented on HDFS-3107:
--

Already done:
https://issues.apache.org/jira/browse/HDFS-9164

> HDFS truncate
> -
>
> Key: HDFS-3107
> URL: https://issues.apache.org/jira/browse/HDFS-3107
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Reporter: Lei Chang
>Assignee: Plamen Jeliazkov
> Fix For: 2.7.0
>
> Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, 
> HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, 
> HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, 
> HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, 
> HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, 
> HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the 
> underlying storage when a transaction is aborted. Currently HDFS does not 
> support truncate (a standard POSIX operation), the reverse operation of 
> append, which forces upper-layer applications to use ugly workarounds (such 
> as keeping track of the discarded byte range per file in a separate metadata 
> store, and periodically running a vacuum process to rewrite compacted files) 
> to overcome this limitation of HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8164) cTime is 0 in VERSION file for newly formatted NameNode.

2015-10-02 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-8164:

Status: Patch Available  (was: Open)

> cTime is 0 in VERSION file for newly formatted NameNode.
> 
>
> Key: HDFS-8164
> URL: https://issues.apache.org/jira/browse/HDFS-8164
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Nauroth
>Assignee: Xiao Chen
>Priority: Minor
> Attachments: HDFS-8164.001.patch, HDFS-8164.002.patch, 
> HDFS-8164.003.patch
>
>
> After formatting a NameNode and inspecting its VERSION file, the cTime 
> property shows 0.  The value does get updated to current time during an 
> upgrade, but I believe this is intended to be the creation time of the 
> cluster, and therefore the initial value of 0 before an upgrade can cause 
> confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8164) cTime is 0 in VERSION file for newly formatted NameNode.

2015-10-02 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-8164:

Attachment: HDFS-8164.003.patch

Thanks [~yzhangal] for the review! Your comments make sense to me.
I have uploaded a new patch encapsulating {{FSNamesystem#getCTime}}. I leave 
{{FSImage}} untouched for now; if {{getCTime}} is needed from there in the 
future, it can easily be added.
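
For the record, a rough sketch of the shape such an accessor could take 
(assumed form only, not necessarily the uploaded patch):

{code}
// Inside FSNamesystem (sketch): expose the storage cTime without making
// callers reach through FSImage/NNStorage themselves.
public long getCTime() {
  return getFSImage().getStorage().getCTime();
}
{code}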

> cTime is 0 in VERSION file for newly formatted NameNode.
> 
>
> Key: HDFS-8164
> URL: https://issues.apache.org/jira/browse/HDFS-8164
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Nauroth
>Assignee: Xiao Chen
>Priority: Minor
> Attachments: HDFS-8164.001.patch, HDFS-8164.002.patch, 
> HDFS-8164.003.patch
>
>
> After formatting a NameNode and inspecting its VERSION file, the cTime 
> property shows 0.  The value does get updated to current time during an 
> upgrade, but I believe this is intended to be the creation time of the 
> cluster, and therefore the initial value of 0 before an upgrade can cause 
> confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8164) cTime is 0 in VERSION file for newly formatted NameNode.

2015-10-02 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-8164:

Attachment: (was: HDFS-8164.003.patch)

> cTime is 0 in VERSION file for newly formatted NameNode.
> 
>
> Key: HDFS-8164
> URL: https://issues.apache.org/jira/browse/HDFS-8164
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Nauroth
>Assignee: Xiao Chen
>Priority: Minor
> Attachments: HDFS-8164.001.patch, HDFS-8164.002.patch
>
>
> After formatting a NameNode and inspecting its VERSION file, the cTime 
> property shows 0.  The value does get updated to current time during an 
> upgrade, but I believe this is intended to be the creation time of the 
> cluster, and therefore the initial value of 0 before an upgrade can cause 
> confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8164) cTime is 0 in VERSION file for newly formatted NameNode.

2015-10-02 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-8164:

Status: Open  (was: Patch Available)

> cTime is 0 in VERSION file for newly formatted NameNode.
> 
>
> Key: HDFS-8164
> URL: https://issues.apache.org/jira/browse/HDFS-8164
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Nauroth
>Assignee: Xiao Chen
>Priority: Minor
> Attachments: HDFS-8164.001.patch, HDFS-8164.002.patch
>
>
> After formatting a NameNode and inspecting its VERSION file, the cTime 
> property shows 0.  The value does get updated to current time during an 
> upgrade, but I believe this is intended to be the creation time of the 
> cluster, and therefore the initial value of 0 before an upgrade can cause 
> confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9193) when a datanode's usage goes above 70 percent, we can't open datanodes tab in NN UI

2015-10-02 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated HDFS-9193:
---
Attachment: HDFS-9193.patch

> when a datanode's usage goes above 70 percent, we can't open datanodes tab in 
> NN UI
> -
>
> Key: HDFS-9193
> URL: https://issues.apache.org/jira/browse/HDFS-9193
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>Priority: Blocker
> Attachments: HDFS-9193.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9142) Namenode Http address is not configured correctly for federated cluster in MiniDFSCluster

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941875#comment-14941875
 ] 

Hadoop QA commented on HDFS-9142:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   7m 43s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 48s | There were no new javac warning 
messages. |
| {color:red}-1{color} | release audit |   0m 13s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 23s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 27s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 31s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 25s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   1m 13s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 186m 29s | Tests failed in hadoop-hdfs. |
| | | 209m 16s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.tools.TestGetGroups |
|   | hadoop.hdfs.TestHdfsAdmin |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength |
|   | hadoop.hdfs.TestClientReportBadBlock |
|   | hadoop.hdfs.TestSafeMode |
|   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|   | hadoop.hdfs.tools.TestDebugAdmin |
|   | hadoop.hdfs.TestSetrepIncreasing |
|   | hadoop.cli.TestErasureCodingCLI |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.TestEncryptionZonesWithKMS |
|   | hadoop.hdfs.tools.TestStoragePolicyCommands |
|   | hadoop.hdfs.TestEncryptedTransfer |
|   | hadoop.security.TestPermissionSymlinks |
|   | hadoop.hdfs.TestDFSRollback |
|   | hadoop.fs.TestUnbuffer |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestFileAppend2 |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.security.TestRefreshUserMappings |
|   | hadoop.hdfs.server.namenode.TestCheckpoint |
|   | hadoop.hdfs.TestReadWhileWriting |
|   | hadoop.hdfs.server.namenode.TestAuditLogs |
|   | hadoop.hdfs.server.namenode.snapshot.TestDisallowModifyROSnapshot |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.TestDFSUpgrade |
|   | hadoop.hdfs.TestGetBlocks |
|   | hadoop.fs.permission.TestStickyBit |
|   | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.cli.TestXAttrCLI |
|   | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.fs.TestGlobPaths |
|   | hadoop.hdfs.TestDFSShell |
|   | hadoop.hdfs.server.namenode.TestCacheDirectives |
|   | hadoop.security.TestPermission |
|   | hadoop.hdfs.server.namenode.snapshot.TestXAttrWithSnapshot |
|   | hadoop.hdfs.server.namenode.TestINodeFile |
|   | hadoop.hdfs.server.namenode.snapshot.TestFileContextSnapshot |
|   | hadoop.hdfs.TestSetrepDecreasing |
|   | hadoop.hdfs.TestDFSFinalize |
|   | hadoop.hdfs.server.namenode.snapshot.TestAclWithSnapshot |
|   | hadoop.hdfs.TestDFSStorageStateRecovery |
|   | hadoop.hdfs.TestDisableConnCache |
|   | hadoop.hdfs.server.namenode.TestCheckPointForSecurityTokens |
|   | hadoop.hdfs.TestRestartDFS |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.TestDistributedFileSystem |
|   | hadoop.hdfs.TestFileCreation |
|   | hadoop.cli.TestAclCLI |
|   | hadoop.hdfs.TestDFSPermission |
|   | hadoop.cli.TestDeleteCLI |
|   | hadoop.cli.TestCryptoAdminCLI |
|   | hadoop.hdfs.TestRollingUpgradeRollback |
|   | hadoop.hdfs.server.namenode.TestFileContextAcl |
|   | hadoop.hdfs.server.namenode.TestNameNodeXAttr |
|   | hadoop.hdfs.TestRollingUpgrade |
|   | hadoop.hdfs.TestEncryptionZones |
|   | hadoop.hdfs.TestDFSStartupVersions |
|   | hadoop.hdfs.TestFetchImage |
|   | hadoop.cli.TestCacheAdminCLI |
|   | hadoop.hdfs.web.TestWebHDFSXAttr |
|   | hadoop.hdfs.server.namenode.TestFsck |
|   | hadoop.hdfs.TestSnapshotCommands |
|   | hadoop.hdfs.TestClose |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshottableDirListing |
|   | hadoop.hdfs.TestFileStatus |
|   | hadoop.hdfs.TestFsShellPermission |
|   | hadoop.fs.loadGenerator.TestLoadGenerator |
|   | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
|   | hadoop.hdfs.server.namenode.TestFileContextXAttr |
|   | hadoop.hdfs.server.namenode.TestStorageRestore |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764818/HDFS-9142.v4.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / a68b6eb |
| Release Audit | 

[jira] [Updated] (HDFS-9193) when a datanode's usage goes above 70 percent, we can't open datanodes tab in NN UI

2015-10-02 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated HDFS-9193:
---
Status: Patch Available  (was: Open)

The error is caused by a reference error in dfshealth.js when trying to load 
the Datanodes tab page:
{code}
} else if (u.usedPercentage < 85) {
{code}
The variable {{u}} is not defined anywhere. The uploaded patch fixes this 
problem.

> when a datanode's usage goes above 70 percent, we can't open datanodes tab in 
> NN UI
> -
>
> Key: HDFS-9193
> URL: https://issues.apache.org/jira/browse/HDFS-9193
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>Priority: Blocker
> Attachments: HDFS-9193.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9100) HDFS Balancer does not respect dfs.client.use.datanode.hostname

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941827#comment-14941827
 ] 

Hudson commented on HDFS-9100:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #1211 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1211/])
HDFS-9100. HDFS Balancer does not respect (yzhang: rev 
1037ee580f87e6bf13155834c36f26794381678b)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java


> HDFS Balancer does not respect dfs.client.use.datanode.hostname
> ---
>
> Key: HDFS-9100
> URL: https://issues.apache.org/jira/browse/HDFS-9100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, HDFS
>Reporter: Yongjun Zhang
>Assignee: Casey Brotherton
> Fix For: 2.8.0
>
> Attachments: HDFS-9100.000.patch, HDFS-9100.001.patch, 
> HDFS-9100.002.patch, HDFS-9100.003.patch
>
>
> In Balancer Dispatch.java:
> {code}
>private void dispatch() {
>   LOG.info("Start moving " + this);
>   Socket sock = new Socket();
>   DataOutputStream out = null;
>   DataInputStream in = null;
>   try {
> sock.connect(
> NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr()),
> HdfsConstants.READ_TIMEOUT);
> {code}
> getXferAddr() is called without taking the dfs.client.use.datanode.hostname 
> setting into consideration; this can cause a balancer run issued from 
> outside the cluster to fail.
> Thanks [~caseyjbrotherton] for reporting the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8164) cTime is 0 in VERSION file for newly formatted NameNode.

2015-10-02 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-8164:

Status: Patch Available  (was: Open)

> cTime is 0 in VERSION file for newly formatted NameNode.
> 
>
> Key: HDFS-8164
> URL: https://issues.apache.org/jira/browse/HDFS-8164
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Nauroth
>Assignee: Xiao Chen
>Priority: Minor
> Attachments: HDFS-8164.001.patch, HDFS-8164.002.patch, 
> HDFS-8164.003.patch
>
>
> After formatting a NameNode and inspecting its VERSION file, the cTime 
> property shows 0.  The value does get updated to current time during an 
> upgrade, but I believe this is intended to be the creation time of the 
> cluster, and therefore the initial value of 0 before an upgrade can cause 
> confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9015) Refactor TestReplicationPolicy to test different block placement policies

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941898#comment-14941898
 ] 

Hudson commented on HDFS-9015:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #473 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/473/])
HDFS-9015. Refactor TestReplicationPolicy to test different block (lei: rev 
a68b6eb0f4110ba626a44fad6b9eb5d8c5a4901f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BaseReplicationPolicyTest.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Refactor TestReplicationPolicy to test different block placement policies
> -
>
> Key: HDFS-9015
> URL: https://issues.apache.org/jira/browse/HDFS-9015
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9015.patch
>
>
> TestReplicationPolicy can be parameterized so that default policy, upgrade 
> domain policy and other policies can share some common test cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9100) HDFS Balancer does not respect dfs.client.use.datanode.hostname

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941897#comment-14941897
 ] 

Hudson commented on HDFS-9100:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #473 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/473/])
HDFS-9100. HDFS Balancer does not respect (yzhang: rev 
1037ee580f87e6bf13155834c36f26794381678b)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java


> HDFS Balancer does not respect dfs.client.use.datanode.hostname
> ---
>
> Key: HDFS-9100
> URL: https://issues.apache.org/jira/browse/HDFS-9100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, HDFS
>Reporter: Yongjun Zhang
>Assignee: Casey Brotherton
> Fix For: 2.8.0
>
> Attachments: HDFS-9100.000.patch, HDFS-9100.001.patch, 
> HDFS-9100.002.patch, HDFS-9100.003.patch
>
>
> In Balancer Dispatch.java:
> {code}
>private void dispatch() {
>   LOG.info("Start moving " + this);
>   Socket sock = new Socket();
>   DataOutputStream out = null;
>   DataInputStream in = null;
>   try {
> sock.connect(
> NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr()),
> HdfsConstants.READ_TIMEOUT);
> {code}
> getXferAddr() is called without taking the dfs.client.use.datanode.hostname 
> setting into consideration; this can cause a balancer run issued from 
> outside the cluster to fail.
> Thanks [~caseyjbrotherton] for reporting the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9180) Update excluded DataNodes in DFSStripedOutputStream based on failures in data streamers

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941913#comment-14941913
 ] 

Hadoop QA commented on HDFS-9180:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  19m 39s | Pre-patch trunk has 7 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 4 new or modified test files. |
| {color:green}+1{color} | javac |   7m 52s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  2s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 52s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 27s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 28s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 11s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 188m 46s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 32s | Tests passed in 
hadoop-hdfs-client. |
| | | 239m 43s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.web.TestWebHDFSOAuth2 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.TestParallelShortCircuitRead |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764816/HDFS-9180.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / a68b6eb |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12775/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs-client.html
 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12775/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12775/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12775/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12775/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12775/console |


This message was automatically generated.

> Update excluded DataNodes in DFSStripedOutputStream based on failures in data 
> streamers
> ---
>
> Key: HDFS-9180
> URL: https://issues.apache.org/jira/browse/HDFS-9180
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-9180.000.patch, HDFS-9180.001.patch, 
> HDFS-9180.002.patch
>
>
> This is a TODO in HDFS-9040: based on the failures all the striped data 
> streamers hit, the DFSStripedOutputStream should keep a record of all the 
> DataNodes that should be excluded.
> This jira will also fix several bugs in the DFSStripedOutputStream. Will 
> provide more details in the comment.
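
For illustration, one way to keep such a record is a small tracker that maps 
each failed DataNode to the time its failure was observed, so stale entries 
can be aged out and the node retried later. This is only a sketch; the class 
and method names below are hypothetical and not from the actual patch.

{code}
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: aggregate DataNode failures reported by the striped
// streamers into a single exclusion map, keyed by when the failure was seen.
class ExcludedNodeTracker<N> {
  private final Map<N, Long> excluded = new ConcurrentHashMap<N, Long>();

  /** Record a node that a streamer failed against. */
  void recordFailure(N node) {
    excluded.put(node, System.currentTimeMillis());
  }

  /** Drop entries older than maxAgeMs so nodes are eventually retried. */
  void expire(long maxAgeMs) {
    long cutoff = System.currentTimeMillis() - maxAgeMs;
    for (Iterator<Map.Entry<N, Long>> it = excluded.entrySet().iterator();
        it.hasNext();) {
      if (it.next().getValue() < cutoff) {
        it.remove();
      }
    }
  }

  /** Nodes that should currently be excluded from new block allocations. */
  Set<N> snapshot() {
    return excluded.keySet();
  }
}
{code}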



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9100) HDFS Balancer does not respect dfs.client.use.datanode.hostname

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941946#comment-14941946
 ] 

Hudson commented on HDFS-9100:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2386 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2386/])
HDFS-9100. HDFS Balancer does not respect (yzhang: rev 
1037ee580f87e6bf13155834c36f26794381678b)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java


> HDFS Balancer does not respect dfs.client.use.datanode.hostname
> ---
>
> Key: HDFS-9100
> URL: https://issues.apache.org/jira/browse/HDFS-9100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, HDFS
>Reporter: Yongjun Zhang
>Assignee: Casey Brotherton
> Fix For: 2.8.0
>
> Attachments: HDFS-9100.000.patch, HDFS-9100.001.patch, 
> HDFS-9100.002.patch, HDFS-9100.003.patch
>
>
> In Balancer Dispatch.java:
> {code}
>private void dispatch() {
>   LOG.info("Start moving " + this);
>   Socket sock = new Socket();
>   DataOutputStream out = null;
>   DataInputStream in = null;
>   try {
> sock.connect(
> NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr()),
> HdfsConstants.READ_TIMEOUT);
> {code}
> getXferAddr() is called without taking the dfs.client.use.datanode.hostname 
> setting into consideration; this could cause a balancer run issued from 
> outside the cluster to fail.
> Thanks [~caseyjbrotherton] for reporting the issue.
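
A minimal sketch of a hostname-aware variant of that connect call follows. 
The connectToDnViaHostname flag and the accessor names are written from 
memory and should be checked against the actual Dispatcher.java change; 
treat this as an illustration, not the committed fix.

{code}
// Illustrative sketch: pick hostname:port instead of ip:port when
// dfs.client.use.datanode.hostname is set, then connect as before.
boolean connectToDnViaHostname = conf.getBoolean(
    "dfs.client.use.datanode.hostname", false);
String xferAddr = connectToDnViaHostname
    ? target.getDatanodeInfo().getXferAddrWithHostname()  // hostname:port
    : target.getDatanodeInfo().getXferAddr();             // ip:port
sock.connect(NetUtils.createSocketAddr(xferAddr),
    HdfsConstants.READ_TIMEOUT);
{code}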



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9015) Refactor TestReplicationPolicy to test different block placement policies

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941949#comment-14941949
 ] 

Hudson commented on HDFS-9015:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2386 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2386/])
HDFS-9015. Refactor TestReplicationPolicy to test different block (lei: rev 
a68b6eb0f4110ba626a44fad6b9eb5d8c5a4901f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BaseReplicationPolicyTest.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Refactor TestReplicationPolicy to test different block placement policies
> -
>
> Key: HDFS-9015
> URL: https://issues.apache.org/jira/browse/HDFS-9015
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9015.patch
>
>
> TestReplicationPolicy can be parameterized so that default policy, upgrade 
> domain policy and other policies can share some common test cases.
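
As a rough illustration of that parameterization with JUnit 4 (the policy 
names below are placeholders; this is not the structure of the actual patch):

{code}
import java.util.Arrays;
import java.util.Collection;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

// Illustrative only: run the same assertions against several block
// placement policies by parameterizing the test on the policy class name.
@RunWith(Parameterized.class)
public class TestPlacementPolicies {

  @Parameters
  public static Collection<Object[]> policies() {
    return Arrays.asList(new Object[][] {
        { "BlockPlacementPolicyDefault" },
        { "BlockPlacementPolicyWithUpgradeDomain" },  // placeholder name
    });
  }

  private final String policyClassName;

  public TestPlacementPolicies(String policyClassName) {
    this.policyClassName = policyClassName;
  }

  @Test
  public void testSharedPlacementCase() {
    // Assertions common to every placement policy would go here, using
    // policyClassName to configure the cluster under test.
  }
}
{code}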



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9191) Typo in Hdfs.java. NoSuchElementException is misspelled

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941947#comment-14941947
 ] 

Hudson commented on HDFS-9191:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2386 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2386/])
HDFS-9191. Typo in Hdfs.java. NoSuchElementException is misspelled. (jghoman: 
rev 3929ac9340a5c9f26574dc076a449f7e11931527)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/Hdfs.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Typo in  Hdfs.java.  NoSuchElementException is misspelled
> -
>
> Key: HDFS-9191
> URL: https://issues.apache.org/jira/browse/HDFS-9191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Catherine Palmer
>Assignee: Catherine Palmer
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: hdfs-9191.patch
>
>
> Line 241 NoSuchElementException has a typo



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9185) Fix null tracer in ErasureCodingWorker

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941948#comment-14941948
 ] 

Hudson commented on HDFS-9185:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2386 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2386/])
HDFS-9185. Fix null tracer in ErasureCodingWorker. Contributed by Rakesh 
(jing9: rev c6cafc77e697317dad0708309b67b900a2e3a413)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/StripedBlockUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/ErasureCodingWorker.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-EC-7285.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRecoverStripedFile.java


> Fix null tracer in ErasureCodingWorker
> --
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HDFS-9185-00.patch, HDFS-9185-01.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9188) Make block corruption related tests FsDataset-agnostic.

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941955#comment-14941955
 ] 

Hadoop QA commented on HDFS-9188:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   8m  4s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 11 new or modified test files. |
| {color:green}+1{color} | javac |   7m 59s | There were no new javac warning 
messages. |
| {color:red}-1{color} | release audit |   0m 13s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 26s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 30s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   1m 17s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 191m 35s | Tests failed in hadoop-hdfs. |
| | | 215m 11s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestScrLazyPersistFiles |
|   | hadoop.hdfs.TestFileCreation |
|   | hadoop.hdfs.web.TestWebHDFSOAuth2 |
|   | hadoop.hdfs.server.namenode.TestProcessCorruptBlocks |
|   | hadoop.hdfs.util.TestByteArrayManager |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764834/HDFS-9188.001.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / 1037ee5 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12776/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12776/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12776/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12776/console |


This message was automatically generated.

> Make block corruption related tests FsDataset-agnostic. 
> 
>
> Key: HDFS-9188
> URL: https://issues.apache.org/jira/browse/HDFS-9188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS, test
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-9188.000.patch, HDFS-9188.001.patch
>
>
> Currently, HDFS does block corruption tests by directly accessing the files 
> stored on the storage directories, which assumes {{FsDatasetImpl}} is the 
> dataset implementation. However, with work like OZone (HDFS-7240) and 
> HDFS-8679, there will be different FsDataset implementations. 
> So we need a general way to run whitebox tests like corrupting blocks and 
> CRC files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8766) Implement a libhdfs(3) compatible API

2015-10-02 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941988#comment-14941988
 ] 

Haohui Mai commented on HDFS-8766:
--

bq. I added a simple timeout to clear out the bad datanodes after a specified 
period of time (default to 2 minutes). I think that should be sufficient for 
the initial API

The default timeout is 10 minutes, and the timeout should be applied to each 
individual DataNode. This functionality requires a specific gmock test.

I think it makes sense to separate the integration test into another jira.

> Implement a libhdfs(3) compatible API
> -
>
> Key: HDFS-8766
> URL: https://issues.apache.org/jira/browse/HDFS-8766
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-8766.HDFS-8707.000.patch, 
> HDFS-8766.HDFS-8707.001.patch, HDFS-8766.HDFS-8707.002.patch, 
> HDFS-8766.HDFS-8707.003.patch, HDFS-8766.HDFS-8707.004.patch
>
>
> Add a synchronous API that is compatible with the hdfs.h header used in 
> libhdfs and libhdfs3.  This will make it possible for projects using 
> libhdfs/libhdfs3 to relink against libhdfspp with minimal changes.
> This also provides a pure C interface that can be linked against projects 
> that aren't built in C++11 mode for various reasons but use the same 
> compiler.  It also allows many other programming languages to access 
> libhdfspp through builtin FFI interfaces.
> The libhdfs API is very similar to the posix file API which makes it easier 
> for programs built using posix filesystem calls to be modified to access HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9188) Make block corruption related tests FsDataset-agnostic.

2015-10-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942001#comment-14942001
 ] 

Colin Patrick McCabe commented on HDFS-9188:


thanks, [~eddyxu].

{{ReplicaToCorrupt}}: it seems like this should be named something like 
{{MaterializedReplica}}.  Its distinguishing factor is that it is the concrete 
representation of some replica in the {{FSDataset}}.

{code}
/**
 * Corrupt the block file by deleting it.
 * @return true if the deletion is completed.
 */
boolean deleteData();
{code}
This should be able to throw an IOE.  Same with {{deleteMeta}}.
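
Putting both suggestions together, the reviewed interface might end up 
looking roughly like this; a sketch of the feedback only, not the committed 
patch:

{code}
import java.io.IOException;

// Sketch of the review feedback: the suggested rename plus checked
// exceptions on the corruption operations. The real patch may differ.
interface MaterializedReplica {
  /** Corrupt the block file by deleting it. @return true if deleted. */
  boolean deleteData() throws IOException;

  /** Corrupt the metadata (checksum) file by deleting it. */
  boolean deleteMeta() throws IOException;
}
{code}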

> Make block corruption related tests FsDataset-agnostic. 
> 
>
> Key: HDFS-9188
> URL: https://issues.apache.org/jira/browse/HDFS-9188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS, test
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-9188.000.patch, HDFS-9188.001.patch
>
>
> Currently, HDFS does block corruption tests by directly accessing the files 
> stored on the storage directories, which assumes {{FsDatasetImpl}} is the 
> dataset implementation. However, with work like OZone (HDFS-7240) and 
> HDFS-8679, there will be different FsDataset implementations. 
> So we need a general way to run whitebox tests like corrupting blocks and 
> CRC files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

