[jira] [Commented] (HDFS-8854) Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs

2015-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693024#comment-14693024
 ] 

Hadoop QA commented on HDFS-8854:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12750039/HDFS-8854-HDFS-7285-merge.03.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7285 / fbf7e81 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11975/console |


This message was automatically generated.

> Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs
> --
>
> Key: HDFS-8854
> URL: https://issues.apache.org/jira/browse/HDFS-8854
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-8854-Consolidated-20150806.02.txt, 
> HDFS-8854-HDFS-7285-merge.03.patch, HDFS-8854-HDFS-7285-merge.03.txt, 
> HDFS-8854-HDFS-7285.00.patch, HDFS-8854-HDFS-7285.01.patch, 
> HDFS-8854-HDFS-7285.02.patch, HDFS-8854-HDFS-7285.03.patch, HDFS-8854.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8887) Expose storage type and storage ID in BlockLocation

2015-08-11 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-8887:
--
   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Thanks again for reviewing, Eddy. Committed to trunk and branch-2.

> Expose storage type and storage ID in BlockLocation
> ---
>
> Key: HDFS-8887
> URL: https://issues.apache.org/jira/browse/HDFS-8887
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Fix For: 2.8.0
>
> Attachments: HDFS-8887.001.patch, HDFS-8887.002.patch
>
>
> Some applications schedule based on info like storage type or storage ID, so 
> it'd be useful to expose this information in BlockLocation. It's already 
> included in LocatedBlock and sent over the wire.
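
For a sense of how callers could consume the exposed fields, a minimal sketch 
against the API this patch adds (see the attached patches for the authoritative 
signatures):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StorageAwareLister {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FileStatus stat = fs.getFileStatus(new Path(args[0]));
    // One BlockLocation per block; each now carries per-replica storage info.
    for (BlockLocation loc : fs.getFileBlockLocations(stat, 0, stat.getLen())) {
      String[] hosts = loc.getHosts();
      String[] ids = loc.getStorageIds();   // storage ID per replica
      for (int i = 0; i < hosts.length; i++) {
        // A scheduler could prefer replicas on SSD, for example.
        System.out.println(hosts[i] + " -> " + loc.getStorageTypes()[i]
            + " (" + ids[i] + ")");
      }
    }
  }
}
{code}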



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8887) Expose storage type and storage ID in BlockLocation

2015-08-11 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-8887:
--
Attachment: HDFS-8887.002.patch

Attached a patch with whitespace-only changes. The javac warnings are all due to 
the additional deprecation, and the checkstyle issues are because I followed the 
code style of the rest of BlockLocation.

Thanks for reviewing, Eddy. Committing this since it's just whitespace changes.

> Expose storage type and storage ID in BlockLocation
> ---
>
> Key: HDFS-8887
> URL: https://issues.apache.org/jira/browse/HDFS-8887
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Fix For: 2.8.0
>
> Attachments: HDFS-8887.001.patch, HDFS-8887.002.patch
>
>
> Some applications schedule based on info like storage type or storage ID, so 
> it'd be useful to expose this information in BlockLocation. It's already 
> included in LocatedBlock and sent over the wire.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8854) Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs

2015-08-11 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8854:

Attachment: HDFS-8854-HDFS-7285-merge.03.patch

Thanks Walter for the update! +1 on the latest patch, pending Jenkins.

Uploading renamed patch to trigger Jenkins on the HDFS-7285-merge branch.

> Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs
> --
>
> Key: HDFS-8854
> URL: https://issues.apache.org/jira/browse/HDFS-8854
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-8854-Consolidated-20150806.02.txt, 
> HDFS-8854-HDFS-7285-merge.03.patch, HDFS-8854-HDFS-7285-merge.03.txt, 
> HDFS-8854-HDFS-7285.00.patch, HDFS-8854-HDFS-7285.01.patch, 
> HDFS-8854-HDFS-7285.02.patch, HDFS-8854-HDFS-7285.03.patch, HDFS-8854.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8388) Time and Date format need to be in sync in Namenode UI page

2015-08-11 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693005#comment-14693005
 ] 

Akira AJISAKA commented on HDFS-8388:
-

bq. Is it possible to use moment.js to parse the date instead?
No, moment.js cannot parse the {{zz}} format. I'll check whether 
moment-timezone.js can parse {{zz}}. If it can parse the timezone, we can use it 
instead of adding a new metric.

> Time and Date format need to be in sync in Namenode UI page
> ---
>
> Key: HDFS-8388
> URL: https://issues.apache.org/jira/browse/HDFS-8388
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Archana T
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Attachments: HDFS-8388-002.patch, HDFS-8388-003.patch, 
> HDFS-8388.patch, HDFS-8388_1.patch, ScreenShot-InvalidDate.png
>
>
> In the NameNode UI page, the date and time formats displayed are currently 
> not in sync:
> Started: Wed May 13 12:28:02 IST 2015
> Compiled: 23 Apr 2015 12:22:59
> Block Deletion Start Time: 13 May 2015 12:28:02
> We can keep a common format in all of the above places.
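
A quick sketch of the kind of unified formatting being discussed; the pattern 
below is only an illustration (the patch may choose a different one), but it 
covers all three fields above, including the timezone token ({{zz}}) debated in 
the comments:

{code}
import java.text.SimpleDateFormat;
import java.util.Date;

public class CommonDateFormatDemo {
  public static void main(String[] args) {
    // One candidate common pattern for Started / Compiled /
    // Block Deletion Start Time, timezone included.
    SimpleDateFormat fmt = new SimpleDateFormat("dd MMM yyyy HH:mm:ss zzz");
    System.out.println(fmt.format(new Date())); // e.g. 13 May 2015 12:28:02 IST
  }
}
{code}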



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8880) NameNode metrics logging

2015-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692990#comment-14692990
 ] 

Hadoop QA commented on HDFS-8880:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m 50s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  1s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 38s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 45s | The applied patch generated  1 
new checkstyle issues (total was 6, now 7). |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 2  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 20s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |  22m  5s | Tests failed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests | 174m 41s | Tests failed in hadoop-hdfs. |
| | | 241m 46s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.ha.TestZKFailoverController |
|   | hadoop.net.TestNetUtils |
| Failed build | hadoop-hdfs |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12749988/HDFS-8880.02.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 3ae716f |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11974/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11974/artifact/patchprocess/whitespace.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11974/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11974/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11974/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11974/console |


This message was automatically generated.

> NameNode metrics logging
> 
>
> Key: HDFS-8880
> URL: https://issues.apache.org/jira/browse/HDFS-8880
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-8880.01.patch, HDFS-8880.02.patch, 
> namenode-metrics.log
>
>
> The NameNode can periodically log metrics to help with debugging when the 
> cluster is not set up with another metrics monitoring scheme.
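
As a rough sketch of how a deployment might enable this once the patch lands. 
The property name below is my assumption about the new knob; check the 
committed patch for the real key and default:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class MetricsLoggingConfig {
  public static Configuration enableMetricsLogging() {
    Configuration conf = new HdfsConfiguration();
    // Assumed key (from this patch's intent): emit a metrics snapshot to the
    // NameNode's metrics log every 600 seconds; <= 0 would disable the logger.
    // The attached namenode-metrics.log shows the expected output shape.
    conf.setLong("dfs.namenode.metrics.logger.period.seconds", 600);
    return conf;
  }
}
{code}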



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to to connect to IPv6 DataNode

2015-08-11 Thread Nate Edel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Edel updated HDFS-8078:

Release Note: 
Resubmitting the NetUtils version of the patch, with bugfixes. The older version 
of the patch seems to need rebasing but isn't breaking ZKFC; let's see if these 
fixes resolve that (I can't reproduce the break locally).


  was:Resubmitting the older (non-NetUtils) version of the patch to see if the 
NetUtils change is breaking ZK-related tests; I can't reproduce it locally.

  Status: Patch Available  (was: Open)

> HDFS client gets errors trying to to connect to IPv6 DataNode
> -
>
> Key: HDFS-8078
> URL: https://issues.apache.org/jira/browse/HDFS-8078
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: Nate Edel
>Assignee: Nate Edel
>  Labels: BB2015-05-TBR, ipv6
> Attachments: HDFS-8078.10.patch, HDFS-8078.11.patch, 
> HDFS-8078.12.patch, HDFS-8078.13.patch, HDFS-8078.14.patch, HDFS-8078.9.patch
>
>
> 1st exception, on put:
> 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Does not contain a valid host:port 
> authority: 2401:db00:1010:70ba:face:0:8:0:50010
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
> This appears to stem from code in DatanodeID which assumes it's safe to 
> concatenate (ipaddr + ":" + port) -- which is OK for IPv4 but not for 
> IPv6.  NetUtils.createSocketAddr() assembles a Java URI object, which 
> requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010
> Currently using InetAddress.getByName() to validate IPv6 (guava 
> InetAddresses.forString has been flaky) but could also use our own parsing. 
> (From logging this, it seems like a low-enough frequency call that the extra 
> object creation shouldn't be problematic, and for me the slight risk of 
> passing in bad input that is not actually an IPv4 or IPv6 address and thus 
> calling an external DNS lookup is outweighed by getting the address 
> normalized and avoiding rewriting parsing.)
> Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress()
> ---
> 2nd exception (on datanode)
> 15/04/13 13:18:07 ERROR datanode.DataNode: 
> dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown 
> operation  src: /2401:db00:20:7013:face:0:7:0:54152 dst: 
> /2401:db00:11:d010:face:0:2f:0:50010
> java.io.EOFException
> at java.io.DataInputStream.readShort(DataInputStream.java:315)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
> at java.lang.Thread.run(Thread.java:745)
> Which also comes as client error "-get: 2401 is not an IP string literal."
> This one has existing parsing logic which needs to shift to the last colon 
> rather than the first.  Should also be a tiny bit faster by using lastIndexOf 
> rather than split.  Could alternatively use the techniques above.
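
To make the two parsing fixes concrete, a self-contained illustration (my own 
sketch, not the attached patch) of IPv6-safe join/split for host:port strings:

{code}
public final class HostPortUtil {
  // IPv6 literals contain ':' themselves, so a Java URI needs them
  // bracketed: proto://[2401:db00:1010:70ba:face:0:8:0]:50010
  public static String join(String ipAddr, int port) {
    boolean isV6Literal = ipAddr.indexOf(':') >= 0 && !ipAddr.startsWith("[");
    return (isV6Literal ? "[" + ipAddr + "]" : ipAddr) + ":" + port;
  }

  // Split on the LAST colon (cf. lastIndexOf vs. split in the description)
  // so the port survives even when the host part is an IPv6 literal.
  public static int port(String hostPort) {
    return Integer.parseInt(hostPort.substring(hostPort.lastIndexOf(':') + 1));
  }

  public static void main(String[] args) {
    System.out.println(join("2401:db00:1010:70ba:face:0:8:0", 50010));
    System.out.println(port("[2401:db00:1010:70ba:face:0:8:0]:50010")); // 50010
  }
}
{code}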



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to to connect to IPv6 DataNode

2015-08-11 Thread Nate Edel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Edel updated HDFS-8078:

Attachment: HDFS-8078.14.patch

> HDFS client gets errors trying to to connect to IPv6 DataNode
> -
>
> Key: HDFS-8078
> URL: https://issues.apache.org/jira/browse/HDFS-8078
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: Nate Edel
>Assignee: Nate Edel
>  Labels: BB2015-05-TBR, ipv6
> Attachments: HDFS-8078.10.patch, HDFS-8078.11.patch, 
> HDFS-8078.12.patch, HDFS-8078.13.patch, HDFS-8078.14.patch, HDFS-8078.9.patch
>
>
> 1st exception, on put:
> 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Does not contain a valid host:port 
> authority: 2401:db00:1010:70ba:face:0:8:0:50010
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
> This appears to stem from code in DatanodeID which assumes it's safe to 
> concatenate (ipaddr + ":" + port) -- which is OK for IPv4 but not for 
> IPv6.  NetUtils.createSocketAddr() assembles a Java URI object, which 
> requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010
> Currently using InetAddress.getByName() to validate IPv6 (guava 
> InetAddresses.forString has been flaky) but could also use our own parsing. 
> (From logging this, it seems like a low-enough frequency call that the extra 
> object creation shouldn't be problematic, and for me the slight risk of 
> passing in bad input that is not actually an IPv4 or IPv6 address and thus 
> calling an external DNS lookup is outweighed by getting the address 
> normalized and avoiding rewriting parsing.)
> Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress()
> ---
> 2nd exception (on datanode)
> 15/04/13 13:18:07 ERROR datanode.DataNode: 
> dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown 
> operation  src: /2401:db00:20:7013:face:0:7:0:54152 dst: 
> /2401:db00:11:d010:face:0:2f:0:50010
> java.io.EOFException
> at java.io.DataInputStream.readShort(DataInputStream.java:315)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
> at java.lang.Thread.run(Thread.java:745)
> Which also comes as client error "-get: 2401 is not an IP string literal."
> This one has existing parsing logic which needs to shift to the last colon 
> rather than the first.  Should also be a tiny bit faster by using lastIndexOf 
> rather than split.  Could alternatively use the techniques above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to to connect to IPv6 DataNode

2015-08-11 Thread Nate Edel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Edel updated HDFS-8078:

Status: Open  (was: Patch Available)

> HDFS client gets errors trying to to connect to IPv6 DataNode
> -
>
> Key: HDFS-8078
> URL: https://issues.apache.org/jira/browse/HDFS-8078
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: Nate Edel
>Assignee: Nate Edel
>  Labels: BB2015-05-TBR, ipv6
> Attachments: HDFS-8078.10.patch, HDFS-8078.11.patch, 
> HDFS-8078.12.patch, HDFS-8078.13.patch, HDFS-8078.9.patch
>
>
> 1st exception, on put:
> 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Does not contain a valid host:port 
> authority: 2401:db00:1010:70ba:face:0:8:0:50010
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
> This appears to stem from code in DatanodeID which assumes it's safe to 
> concatenate (ipaddr + ":" + port) -- which is OK for IPv4 but not for 
> IPv6.  NetUtils.createSocketAddr() assembles a Java URI object, which 
> requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010
> Currently using InetAddress.getByName() to validate IPv6 (guava 
> InetAddresses.forString has been flaky) but could also use our own parsing. 
> (From logging this, it seems like a low-enough frequency call that the extra 
> object creation shouldn't be problematic, and for me the slight risk of 
> passing in bad input that is not actually an IPv4 or IPv6 address and thus 
> calling an external DNS lookup is outweighed by getting the address 
> normalized and avoiding rewriting parsing.)
> Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress()
> ---
> 2nd exception (on datanode)
> 15/04/13 13:18:07 ERROR datanode.DataNode: 
> dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown 
> operation  src: /2401:db00:20:7013:face:0:7:0:54152 dst: 
> /2401:db00:11:d010:face:0:2f:0:50010
> java.io.EOFException
> at java.io.DataInputStream.readShort(DataInputStream.java:315)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
> at java.lang.Thread.run(Thread.java:745)
> Which also comes as client error "-get: 2401 is not an IP string literal."
> This one has existing parsing logic which needs to shift to the last colon 
> rather than the first.  Should also be a tiny bit faster by using lastIndexOf 
> rather than split.  Could alternatively use the techniques above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-08-11 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8808:

Attachment: HDFS-8808-02.patch

The failed tests are unrelated and all pass locally. Updating patch to fix 
whitespace and checkstyle issues.

> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, 
> HDFS-8808-02.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades and could potentially slow down the entire workflow. The request 
> here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth 
> setting.
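
For reference, a hedged sketch of the setting involved and the proposed bypass 
(the method and logic below are illustrative, not the attached patch):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class ImageTransferThrottle {
  public static long bandwidthFor(boolean isBootstrapStandby) {
    Configuration conf = new HdfsConfiguration();
    // Cap regular checkpoint image transfers; 0 means unthrottled.
    long configured = conf.getLong("dfs.image.transfer.bandwidthPerSec", 0);
    // Proposed behavior: ignore the cap for -bootstrapStandby copies.
    return isBootstrapStandby ? 0 : configured;
  }
}
{code}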



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8889) Erasure Coding: cover more test situations of datanode failure during client writing

2015-08-11 Thread Li Bo (JIRA)
Li Bo created HDFS-8889:
---

 Summary: Erasure Coding: cover more test situations of datanode 
failure during client writing
 Key: HDFS-8889
 URL: https://issues.apache.org/jira/browse/HDFS-8889
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8118) Delay in checkpointing Trash can leave trash for 2 intervals before deleting

2015-08-11 Thread Casey Brotherton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Casey Brotherton updated HDFS-8118:
---
Attachment: HDFS-8118.001.patch

This is a simplified patch addressing only the defect, not the test cases.

> Delay in checkpointing Trash can leave trash for 2 intervals before deleting
> 
>
> Key: HDFS-8118
> URL: https://issues.apache.org/jira/browse/HDFS-8118
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Casey Brotherton
>Assignee: Casey Brotherton
>Priority: Trivial
> Attachments: HDFS-8118.001.patch, HDFS-8118.patch
>
>
> When the fs.trash.checkpoint.interval and the fs.trash.interval are set 
> non-zero and the same, it is possible for trash to be left for two intervals.
> The TrashPolicyDefault will use a floor and ceiling function to ensure that 
> the Trash will be checkpointed every "interval" of minutes.
> Each user's trash is checkpointed individually.  The time resolution of the 
> checkpoint timestamp is to the second.
> If the seconds switch while one user is checkpointing, then the next user's 
> timestamp will be later.
> This will cause the next user's checkpoint to not be deleted at the next 
> interval.
> I have recreated this in a lab cluster.
> I also have a suggestion for a patch that I can upload later tonight after 
> testing it further.
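
A small illustration of the timing window described above (the interval 
rounding is paraphrased from TrashPolicyDefault; the demo itself is not from 
the patch):

{code}
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.TimeUnit;

public class CheckpointSkewDemo {
  public static void main(String[] args) {
    long interval = TimeUnit.MINUTES.toMillis(60); // fs.trash.interval = 60
    long now = System.currentTimeMillis();
    long floor = (now / interval) * interval;  // boundary the policy rounds to
    long ceiling = floor + interval;           // next deletion boundary
    // Checkpoint names carry second-resolution timestamps; if the second
    // rolls over between two users' checkpoints, the later user's checkpoint
    // compares as newer and survives an extra interval before deletion.
    SimpleDateFormat fmt = new SimpleDateFormat("yyMMddHHmmss");
    System.out.println("floor=" + fmt.format(new Date(floor))
        + " ceiling=" + fmt.format(new Date(ceiling)));
  }
}
{code}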



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart

2015-08-11 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-8879:
-
Target Version/s: 2.7.2

> Quota by storage type usage incorrectly initialized upon namenode restart
> -
>
> Key: HDFS-8879
> URL: https://issues.apache.org/jira/browse/HDFS-8879
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Kihwal Lee
>Assignee: Xiaoyu Yao
> Attachments: HDFS-8879.01.patch
>
>
> This was found by [~kihwal] as part of HDFS-8865 work in this 
> [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904].
> The unit tests 
> testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit
>  failed to detect this because they were using an obsolete
> FSDirectory instance. Once the highlighted line below is added, the issue can 
> be reproduced.
> {code}
> >fsdir = cluster.getNamesystem().getFSDirectory();
> INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString());
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8078) HDFS client gets errors trying to to connect to IPv6 DataNode

2015-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692936#comment-14692936
 ] 

Hadoop QA commented on HDFS-8078:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  15m 45s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 36s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 40s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 13s | The applied patch generated  
12 new checkstyle issues (total was 0, now 12). |
| {color:red}-1{color} | checkstyle |   1m 55s | The applied patch generated  4 
new checkstyle issues (total was 0, now 4). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 24s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  3s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 175m  0s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 28s | Tests passed in 
hadoop-hdfs-client. |
| | | 220m 23s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12749997/HDFS-8078.13.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 3ae716f |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11972/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 
https://builds.apache.org/job/PreCommit-HDFS-Build/11972/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11972/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11972/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11972/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11972/console |


This message was automatically generated.

> HDFS client gets errors trying to to connect to IPv6 DataNode
> -
>
> Key: HDFS-8078
> URL: https://issues.apache.org/jira/browse/HDFS-8078
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: Nate Edel
>Assignee: Nate Edel
>  Labels: BB2015-05-TBR, ipv6
> Attachments: HDFS-8078.10.patch, HDFS-8078.11.patch, 
> HDFS-8078.12.patch, HDFS-8078.13.patch, HDFS-8078.9.patch
>
>
> 1st exception, on put:
> 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Does not contain a valid host:port 
> authority: 2401:db00:1010:70ba:face:0:8:0:50010
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
> This appears to stem from code in DatanodeID which assumes it's safe to 
> concatenate (ipaddr + ":" + port) -- which is OK for IPv4 but not for 
> IPv6.  NetUtils.createSocketAddr() assembles a Java URI object, which 
> requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010
> Currently using InetAddress.getByName() to validate IPv6 (guava 
> InetAddresses.forString has been

[jira] [Updated] (HDFS-8870) Lease is leaked on write failure

2015-08-11 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated HDFS-8870:
--
Target Version/s: 2.7.2, 2.6.2  (was: 2.6.1, 2.7.2)

> Lease is leaked on write failure
> 
>
> Key: HDFS-8870
> URL: https://issues.apache.org/jira/browse/HDFS-8870
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.6.0
>Reporter: Rushabh S Shah
>Assignee: Daryn Sharp
>
> Creating this ticket on behalf of [~daryn]
> We've seen this in one of our clusters. When a long-running process has a 
> write failure, the lease is leaked and gets renewed until the token 
> expires.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart

2015-08-11 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692908#comment-14692908
 ] 

Xiaoyu Yao commented on HDFS-8879:
--

Thanks [~arpit99] for the review. I will hold off committing until tomorrow in 
case [~kihwal] has additional feedback.

> Quota by storage type usage incorrectly initialized upon namenode restart
> -
>
> Key: HDFS-8879
> URL: https://issues.apache.org/jira/browse/HDFS-8879
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Kihwal Lee
>Assignee: Xiaoyu Yao
> Attachments: HDFS-8879.01.patch
>
>
> This was found by [~kihwal] as part of HDFS-8865 work in this 
> [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904].
> The unit tests 
> testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit
>  failed to detect this because they were using an obsolete
> FSDirectory instance. Once the highlighted line below is added, the issue can 
> be reproduced.
> {code}
> >fsdir = cluster.getNamesystem().getFSDirectory();
> INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString());
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8886) Not able to build with 'mvn compile -Pnative'

2015-08-11 Thread Jagadesh Kiran N (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692905#comment-14692905
 ] 

Jagadesh Kiran N commented on HDFS-8886:


http://zutai.blogspot.in/2014/06/build-install-and-run-hadoop-24-240-on.html?showComment=1422091525887#c2264594416650430988



> Not able to build with 'mvn compile -Pnative'
> -
>
> Key: HDFS-8886
> URL: https://issues.apache.org/jira/browse/HDFS-8886
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Puneeth P
>
> I am running into a problem where I am not able to compile the native parts 
> of the hadoop-hdfs project: the build is not finding a Makefile in 
> ${project.build.dir}/native.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8886) Not able to build with 'mvn compile -Pnative'

2015-08-11 Thread Jagadesh Kiran N (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692904#comment-14692904
 ] 

Jagadesh Kiran N commented on HDFS-8886:


Hi, you can refer to this.

> Not able to build with 'mvn compile -Pnative'
> -
>
> Key: HDFS-8886
> URL: https://issues.apache.org/jira/browse/HDFS-8886
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Puneeth P
>
> I am running into a problem where I am not able to compile the native parts 
> of the hadoop-hdfs project: the build is not finding a Makefile in 
> ${project.build.dir}/native.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8887) Expose storage type and storage ID in BlockLocation

2015-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692892#comment-14692892
 ] 

Hadoop QA commented on HDFS-8887:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  20m 54s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:red}-1{color} | javac |   7m 47s | The applied patch generated  21  
additional warning messages. |
| {color:green}+1{color} | javadoc |   9m 46s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 32s | The applied patch generated  4 
new checkstyle issues (total was 25, now 29). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   6m 15s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |  22m 23s | Tests failed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests | 172m 47s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 27s | Tests passed in 
hadoop-hdfs-client. |
| | | 245m 56s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.net.TestNetUtils |
|   | hadoop.ha.TestZKFailoverController |
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12749983/HDFS-8887.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 7c796fd |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11970/artifact/patchprocess/diffJavacWarnings.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11970/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11970/artifact/patchprocess/whitespace.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11970/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11970/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11970/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11970/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11970/console |


This message was automatically generated.

> Expose storage type and storage ID in BlockLocation
> ---
>
> Key: HDFS-8887
> URL: https://issues.apache.org/jira/browse/HDFS-8887
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: HDFS-8887.001.patch
>
>
> Some applications schedule based on info like storage type or storage ID, so 
> it'd be useful to expose this information in BlockLocation. It's already 
> included in LocatedBlock and sent over the wire.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8826) Balancer may not move blocks efficiently in some cases

2015-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692888#comment-14692888
 ] 

Hadoop QA commented on HDFS-8826:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m  3s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 36s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 49s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 37s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 31s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 19s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |  22m 10s | Tests failed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests | 171m 56s | Tests failed in hadoop-hdfs. |
| | | 236m 54s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.net.TestNetUtils |
|   | hadoop.ha.TestZKFailoverController |
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12749979/h8826_20150811.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 7c796fd |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11971/artifact/patchprocess/whitespace.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11971/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11971/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11971/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11971/console |


This message was automatically generated.

> Balancer may not move blocks efficiently in some cases
> --
>
> Key: HDFS-8826
> URL: https://issues.apache.org/jira/browse/HDFS-8826
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: h8826_20150811.patch
>
>
> Balancer is inefficient in the following case:
> || Datanode || Utilization || Rack ||
> | D1 | 95% | A |
> | D2 | 30% | B |
> | D3, D4, D5 | 0% | B |
> The average utilization is 25%, so D2 is within the 10% threshold.  However, 
> the Balancer currently will first move blocks from D2 to D3, D4 and D5, since 
> they are under the same rack.  Then, it will move blocks from D1.
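
A quick arithmetic check of the example table (a sketch of the threshold test, 
assuming the default 10% threshold):

{code}
public class BalancerExampleCheck {
  public static void main(String[] args) {
    double[] utilization = {95, 30, 0, 0, 0};  // D1..D5 from the table
    double avg = (95 + 30 + 0 + 0 + 0) / 5.0;  // 25.0
    double threshold = 10.0;
    // D2: |30 - 25| = 5 <= 10, so D2 is within threshold and should not be
    // the first source; moving from D1 directly to D3/D4/D5 is the efficient
    // plan even though D1 sits on a different rack.
    System.out.println(Math.abs(utilization[1] - avg) <= threshold); // true
  }
}
{code}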



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-8863) The remiaing space check in BlockPlacementPolicyDefault is flawed

2015-08-11 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692870#comment-14692870
 ] 

Yi Liu edited comment on HDFS-8863 at 8/12/15 3:49 AM:
---

{quote}
What if we let it check against the storage-type-level sum and also make sure 
there is at least one storage with enough space?
{quote}
There is still a potential issue. For example, say we have datanode dn0 with 
three storages (s1, s2, s3) of the required storage type. s1 and s3 each have 
2/3 of a block size of remaining space, and s2 has 1+2/3 block sizes remaining. 
We just scheduled one block on dn0, which certainly landed on s2. Now a new 
block is being added and block placement checks dn0: with the current patch, it 
sees that the maximum remaining space is 1+2/3 block sizes (s2) and that the 
sum is also sufficient, so it treats dn0 as a good target, but actually it's not.

I am thinking we can do the following: sum at the storage type level, but for 
each storage count only the whole-block multiples of its remaining space. For 
the example above, s1 and s3 each count as 0 and s2 counts as 1, so the sum is 
1 block and dn0 is not a good target. In this approach, we don't need to check 
the maximum either.

{quote}
Datanodes only care about the storage type, so checking a particular 
storage won't do any good. It will just cause block placement to re-pick 
targets more.
{quote}
You are right. I also had another point: when iterating storages to check the 
remaining space for a storage type, some backing storages may be 
{{State.FAILED}} or {{State.READ_ONLY_SHARED}}, yet their remaining space is 
still counted, right? So I think you can do these checks in {{getRemaining}}. 
See my JIRA HDFS-8884, which is related to this; I do a fast-fail check for the 
datanode there. Of course, I can do this part in my JIRA if you 
don't do it here.


was (Author: hitliuyi):
{quote}
What if we let it check against the storage-type-level sum and also make sure 
there is at least one storage with enough space?
{quote}
There is still a potential issue. For example, say we have datanode dn0 with 
three storages (s1, s2, s3) of the required storage type. s1 and s3 each have 
2/3 of a block size of remaining space, and s2 has 1+2/3 block sizes remaining. 
We just scheduled one block on dn0, which certainly landed on s2. Now a new 
block is being added and block placement checks dn0: with the current patch, it 
sees that the maximum remaining space is 1+2/3 block sizes (s2) and that the 
sum is also sufficient, so it treats dn0 as a good target, but actually it's not.

I am thinking we can do the following: sum at the storage type level, but for 
each storage count only the whole-block multiples of its remaining space. For 
the example above, s1 and s3 each count as 0 and s2 counts as 1, so the sum is 
1 block and dn0 is not a good target.

{quote}
Datanodes only care about the storage type, so checking a particular 
storage won't do any good. It will just cause block placement to re-pick 
targets more.
{quote}
You are right. I also had another point: when iterating storages to check the 
remaining space for a storage type, some backing storages may be 
{{State.FAILED}} or {{State.READ_ONLY_SHARED}}, yet their remaining space is 
still counted, right? So I think you can do these checks in {{getRemaining}}. 
See my JIRA HDFS-8884, which is related to this; I do a fast-fail check for the 
datanode there. Of course, I can do this part in my JIRA if you 
don't do it here.

> The remiaing space check in BlockPlacementPolicyDefault is flawed
> -
>
> Key: HDFS-8863
> URL: https://issues.apache.org/jira/browse/HDFS-8863
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
>  Labels: 2.6.1-candidate
> Attachments: HDFS-8863.patch, HDFS-8863.v2.patch
>
>
> The block placement policy calls 
> {{DatanodeDescriptor#getRemaining(StorageType)}} to check whether the block 
> is going to fit. Since the method is adding up all remaining spaces, the 
> namenode can allocate a new block on a full node. This causes pipeline 
> construction failure and {{abandonBlock}}. If the cluster is nearly full, the 
> client might hit this multiple times and the write can fail permanently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-8863) The remiaing space check in BlockPlacementPolicyDefault is flawed

2015-08-11 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692870#comment-14692870
 ] 

Yi Liu edited comment on HDFS-8863 at 8/12/15 3:50 AM:
---

{quote}
What if we let it check against the storage-type-level sum and also make sure 
there is at least one storage with enough space?
{quote}
There is still a potential issue. For example, say we have datanode dn0 with 
three storages (s1, s2, s3) of the required storage type. s1 and s3 each have 
2/3 of a block size of remaining space, and s2 has 1+2/3 block sizes remaining. 
We just scheduled one block on dn0, which certainly landed on s2. Now a new 
block is being added and block placement checks dn0: with the current patch, it 
sees that the maximum remaining space is 1+2/3 block sizes (s2) and that the 
sum is also sufficient, so it treats dn0 as a good target, but actually it's not.

I am thinking we can do the following: sum at the storage type level, but for 
each storage count only the whole-block multiples of its remaining space. For 
the example above, s1 and s3 each count as 0 and s2 counts as 1, so the sum is 
1 block and dn0 is not a good target. In this approach, we don't need to check 
the maximum either.

{quote}
Datanodes only care about the storage type, so checking a particular 
storage won't do any good. It will just cause block placement to re-pick 
targets more.
{quote}
You are right. I also had another point: when iterating storages to check the 
remaining space for a storage type, some backing storages may be 
{{State.FAILED}} or {{State.READ_ONLY_SHARED}}, yet their remaining space is 
still counted, right? So I think you can do these checks in {{getRemaining}}. 
See my JIRA HDFS-8884, which is related to this; I do a fast-fail check for the 
datanode there. Of course, I can do this part in my JIRA if you 
don't do it here.


was (Author: hitliuyi):
{quote}
What if we let it check against the storage-type-level sum and also make sure 
there is at least one storage with enough space?
{quote}
There is still a potential issue. For example, say we have datanode dn0 with 
three storages (s1, s2, s3) of the required storage type. s1 and s3 each have 
2/3 of a block size of remaining space, and s2 has 1+2/3 block sizes remaining. 
We just scheduled one block on dn0, which certainly landed on s2. Now a new 
block is being added and block placement checks dn0: with the current patch, it 
sees that the maximum remaining space is 1+2/3 block sizes (s2) and that the 
sum is also sufficient, so it treats dn0 as a good target, but actually it's not.

I am thinking we can do the following: sum at the storage type level, but for 
each storage count only the whole-block multiples of its remaining space. For 
the example above, s1 and s3 each count as 0 and s2 counts as 1, so the sum is 
1 block and dn0 is not a good target. In this approach, we don't need to check 
the maximum either.

{quote}
Datanodes only care about the storage type, so checking a particular 
storage won't do any good. It will just cause block placement to re-pick 
targets more.
{quote}
You are right. I also had another point: when iterating storages to check the 
remaining space for a storage type, some backing storages may be 
{{State.FAILED}} or {{State.READ_ONLY_SHARED}}, yet their remaining space is 
still counted, right? So I think you can do these checks in {{getRemaining}}. 
See my JIRA HDFS-8884, which is related to this; I do a fast-fail check for the 
datanode there. Of course, I can do this part in my JIRA if you 
don't do it here.

> The remiaing space check in BlockPlacementPolicyDefault is flawed
> -
>
> Key: HDFS-8863
> URL: https://issues.apache.org/jira/browse/HDFS-8863
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
>  Labels: 2.6.1-candidate
> Attachments: HDFS-8863.patch, HDFS-8863.v2.patch
>
>
> The block placement policy calls 
> {{DatanodeDescriptor#getRemaining(StorageType)}} to check whether the block 
> is going to fit. Since the method is adding up all remaining spaces, the 
> namenode can allocate a new block on a full node. This causes pipeline 
> construction failure and {{abandonBlock}}. If the cluster is nearly full, the 
> client might hit this multiple times and the write can fail permanently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8863) The remiaing space check in BlockPlacementPolicyDefault is flawed

2015-08-11 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692870#comment-14692870
 ] 

Yi Liu commented on HDFS-8863:
--

{quote}
What if we let it check against the storage-type-level sum and also make sure 
there is at least one storage with enough space?
{quote}
There is still a potential issue. For example, say we have datanode dn0 with 
three storages (s1, s2, s3) of the required storage type. s1 and s3 each have 
2/3 of a block size of remaining space, and s2 has 1+2/3 block sizes remaining. 
We just scheduled one block on dn0, which certainly landed on s2. Now a new 
block is being added and block placement checks dn0: with the current patch, it 
sees that the maximum remaining space is 1+2/3 block sizes (s2) and that the 
sum is also sufficient, so it treats dn0 as a good target, but actually it's not.

I am thinking we can do the following: sum at the storage type level, but for 
each storage count only the whole-block multiples of its remaining space. For 
the example above, s1 and s3 each count as 0 and s2 counts as 1, so the sum is 
1 block and dn0 is not a good target.

{quote}
Datanodes only care about the storage type, so checking a particular 
storage won't do any good. It will just cause block placement to re-pick 
targets more.
{quote}
You are right. I also had another point: when iterating storages to check the 
remaining space for a storage type, some backing storages may be 
{{State.FAILED}} or {{State.READ_ONLY_SHARED}}, yet their remaining space is 
still counted, right? So I think you can do these checks in {{getRemaining}}. 
See my JIRA HDFS-8884, which is related to this; I do a fast-fail check for the 
datanode there. Of course, I can do this part in my JIRA if you 
don't do it here.
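
A sketch of the counting rule proposed in this comment. Type names follow HDFS 
internals (DatanodeDescriptor, DatanodeStorageInfo, DatanodeStorage.State in 
org.apache.hadoop.hdfs.server.blockmanagement / .protocol), but treat the exact 
signatures as assumptions:

{code}
// Sum per storage type, flooring each storage's remaining space to a
// whole-block multiple and skipping unusable storages.
static boolean isGoodTarget(DatanodeDescriptor dn, StorageType requiredType,
                            long blockSize, int pendingBlocks) {
  long usable = 0;
  for (DatanodeStorageInfo s : dn.getStorageInfos()) {
    if (s.getStorageType() != requiredType
        || s.getState() == DatanodeStorage.State.FAILED
        || s.getState() == DatanodeStorage.State.READ_ONLY_SHARED) {
      continue; // don't count failed or read-only-shared storages
    }
    // Floor to whole-block multiples: 2/3 of a block counts as 0 and
    // 1+2/3 counts as 1, matching the dn0/s1/s2/s3 example above.
    usable += (s.getRemaining() / blockSize) * blockSize;
  }
  return usable >= blockSize * pendingBlocks;
}
// Example above: usable == 1 * blockSize, so with one block already
// scheduled plus one new block (pendingBlocks == 2), dn0 is rejected.
{code}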

> The remiaing space check in BlockPlacementPolicyDefault is flawed
> -
>
> Key: HDFS-8863
> URL: https://issues.apache.org/jira/browse/HDFS-8863
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
>  Labels: 2.6.1-candidate
> Attachments: HDFS-8863.patch, HDFS-8863.v2.patch
>
>
> The block placement policy calls 
> {{DatanodeDescriptor#getRemaining(StorageType)}} to check whether the block 
> is going to fit. Since the method is adding up all remaining spaces, the 
> namenode can allocate a new block on a full node. This causes pipeline 
> construction failure and {{abandonBlock}}. If the cluster is nearly full, the 
> client might hit this multiple times and the write can fail permanently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart

2015-08-11 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692863#comment-14692863
 ] 

Arpit Agarwal commented on HDFS-8879:
-

+1

The test failures look unrelated.

> Quota by storage type usage incorrectly initialized upon namenode restart
> -
>
> Key: HDFS-8879
> URL: https://issues.apache.org/jira/browse/HDFS-8879
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Kihwal Lee
>Assignee: Xiaoyu Yao
> Attachments: HDFS-8879.01.patch
>
>
> This was found by [~kihwal] as part of HDFS-8865 work in this 
> [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904].
> The unit tests 
> testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit
>  failed to detect this because they were using an obsolete
> FSDirectory instance. Once the highlighted line below is added, the issue can 
> be reproduced.
> {code}
> >fsdir = cluster.getNamesystem().getFSDirectory();
> INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString());
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692857#comment-14692857
 ] 

Hadoop QA commented on HDFS-8808:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 53s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 46s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  2s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 33s | The applied patch generated  4 
new checkstyle issues (total was 574, now 578). |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 51s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 36s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 42s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 13s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  70m 41s | Tests failed in hadoop-hdfs. |
| | | 116m 45s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | org.apache.hadoop.hdfs.TestFileCreationDelete |
|   | org.apache.hadoop.hdfs.TestDFSClientExcludedNodes |
|   | org.apache.hadoop.hdfs.TestGetBlocks |
|   | org.apache.hadoop.hdfs.TestRollingUpgrade |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12749985/HDFS-8808-01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 3ae716f |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11973/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11973/artifact/patchprocess/whitespace.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11973/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11973/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11973/console |


This message was automatically generated.

> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades and could potentially slow down the entire workflow. The request 
> here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth 
> setting



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8854) Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs

2015-08-11 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-8854:

Attachment: HDFS-8854-HDFS-7285.03.patch
HDFS-8854-HDFS-7285-merge.03.txt

bq. Since ErasureCodingPolicy has cellSize, can we avoid separate variable 
cellSize in DFSStripedInputStream.java, DFSStripedOutputStream.java classes.
It's OK to keep the separate variable, because cellSize is used directly in 
the calculations.

Uploaded the 03 patch, which addresses all other issues mentioned above. 
Thanks again [~rakeshr] & [~zhz]!

> Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs
> --
>
> Key: HDFS-8854
> URL: https://issues.apache.org/jira/browse/HDFS-8854
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-8854-Consolidated-20150806.02.txt, 
> HDFS-8854-HDFS-7285-merge.03.txt, HDFS-8854-HDFS-7285.00.patch, 
> HDFS-8854-HDFS-7285.01.patch, HDFS-8854-HDFS-7285.02.patch, 
> HDFS-8854-HDFS-7285.03.patch, HDFS-8854.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8859) Improve DataNode (ReplicaMap) memory footprint to save about 45%

2015-08-11 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-8859:
-
Attachment: HDFS-8859.003.patch

> Improve DataNode (ReplicaMap) memory footprint to save about 45%
> 
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch
>
>
> By using following approach we can save about *45%* memory footprint for each 
> block replica in DataNode memory (This JIRA only talks about *ReplicaMap* in 
> DataNode), the details are:
> In ReplicaMap, 
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas 
> in memory.  The key is the block ID of the block replica, which is already 
> included in {{ReplicaInfo}}, so this memory can be saved.  Also, each 
> HashMap Entry has an object overhead.  We can implement a lightweight set 
> similar to {{LightWeightGSet}}, but without a fixed size ({{LightWeightGSet}} 
> uses a fixed size for its entries array, usually a big value, as in 
> {{BlocksMap}}; this avoids full GC since there is no need to resize).  We 
> should also be able to look up an element by its key.
> Following is a comparison of the memory footprint if we implement a 
> lightweight set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total:  -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to the next element in ReplicaInfo
> {noformat}
> Total:  +4 bytes
> So in total we can save 40 bytes for each block replica.
> Currently one finalized replica needs around 46 bytes (note: we ignore 
> memory alignment here).
> We can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each block 
> replica in the DataNode.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8859) Improve DataNode (ReplicaMap) memory footprint to save about 45%

2015-08-11 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692809#comment-14692809
 ] 

Yi Liu commented on HDFS-8859:
--

Thanks [~szetszwo] for the review.
For your first question, yes; another small difference is that 
{{LightWeightHashGSet}} needs to implement {{public Collection<E> values()}} 
like java.util.HashMap does, so I have now added it to the {{GSet}} interface.

For your second comment, you are right, it's better to have 
LightWeightHashGSet extend LightWeightGSet; I do that in the new patch. 
Actually, when I made the first patch I considered making LightWeightHashGSet 
extend LightWeightGSet, but at the time I planned to support shrinking later 
and thought the logic might diverge, so I kept them independent. But I agree 
we should extend even so.
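
To make the idea concrete, here is a minimal sketch of such a set (names and 
sizing are illustrative assumptions, not the actual HDFS-8859 code): the 
element itself carries the key and the hash-chain pointer, so no boxed Long 
key and no HashMap.Entry wrapper are allocated per replica.

{code}
import java.util.ArrayList;
import java.util.List;

/** Minimal sketch; resizing and removal are omitted for brevity. */
public class LightSetSketch<E extends LightSetSketch.Linked<E>> {

  /** Elements embed the next pointer, replacing HashMap.Entry. */
  public interface Linked<T> {
    long key();              // e.g. the block ID already held by ReplicaInfo
    T next();
    void setNext(T n);
  }

  @SuppressWarnings("unchecked")
  private final E[] buckets = (E[]) new Linked[1 << 10];

  private int index(long key) {
    return (int) (key & (buckets.length - 1));
  }

  /** Insert costs one reference write instead of a new Entry object. */
  public void put(E e) {
    int i = index(e.key());
    e.setNext(buckets[i]);
    buckets[i] = e;
  }

  /** Look an element up by its key, as ReplicaMap requires. */
  public E get(long key) {
    for (E e = buckets[index(key)]; e != null; e = e.next()) {
      if (e.key() == key) {
        return e;
      }
    }
    return null;
  }

  /** The values() view discussed above. */
  public List<E> values() {
    List<E> all = new ArrayList<E>();
    for (E head : buckets) {
      for (E e = head; e != null; e = e.next()) {
        all.add(e);
      }
    }
    return all;
  }
}
{code}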

> Improve DataNode (ReplicaMap) memory footprint to save about 45%
> 
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch
>
>
> By using following approach we can save about *45%* memory footprint for each 
> block replica in DataNode memory (This JIRA only talks about *ReplicaMap* in 
> DataNode), the details are:
> In ReplicaMap, 
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas 
> in memory.  The key is the block ID of the block replica, which is already 
> included in {{ReplicaInfo}}, so this memory can be saved.  Also, each 
> HashMap Entry has an object overhead.  We can implement a lightweight set 
> similar to {{LightWeightGSet}}, but without a fixed size ({{LightWeightGSet}} 
> uses a fixed size for its entries array, usually a big value, as in 
> {{BlocksMap}}; this avoids full GC since there is no need to resize).  We 
> should also be able to look up an element by its key.
> Following is a comparison of the memory footprint if we implement a 
> lightweight set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total:  -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to the next element in ReplicaInfo
> {noformat}
> Total:  +4 bytes
> So in total we can save 40 bytes for each block replica.
> Currently one finalized replica needs around 46 bytes (note: we ignore 
> memory alignment here).
> We can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each block 
> replica in the DataNode.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections

2015-08-11 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692805#comment-14692805
 ] 

Bob Hansen commented on HDFS-8855:
--

Does 2200 DN->NN connections seem a bit... excessive... for 50 concurrent 
reads?  If you set the concurrent_reads environment variable to 500, do you end 
up with 22000 connections (and start running the NN out of ports very quickly)? 
 If the load scales up linearly with the cluster size (a process on each node 
reading 50 files), will your NN run out of ports and fail?

> Webhdfs client leaks active NameNode connections
> 
>
> Key: HDFS-8855
> URL: https://issues.apache.org/jira/browse/HDFS-8855
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
> Environment: HDP 2.2
>Reporter: Bob Hansen
>Assignee: Xiaobing Zhou
>
> The attached script simulates a process opening ~50 files via webhdfs and 
> performing random reads.  Note that there are at most 50 concurrent reads, 
> and all webhdfs sessions are kept open.  Each read is ~64k at a random 
> position.  
> The script periodically (once per second) shells into the NameNode and 
> produces a summary of the socket states.  For my test cluster with 5 nodes, 
> it took ~30 seconds for the NameNode to reach ~25000 active connections and 
> fail.
> It appears that each request to the webhdfs client is opening a new 
> connection to the NameNode and keeping it open after the request is complete. 
>  If the process continues to run, eventually (~30-60 seconds), all of the 
> open connections are closed and the NameNode recovers.  
> This smells like SoftReference reaping.  Are we using SoftReferences in the 
> webhdfs client to cache NameNode connections but never re-using them?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-08-11 Thread Ajith S (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692803#comment-14692803
 ] 

Ajith S commented on HDFS-8808:
---

+1

> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades and could potentially slow down the entire workflow. The request 
> here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth 
> setting



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8823) Move replication factor into individual blocks

2015-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692787#comment-14692787
 ] 

Hadoop QA commented on HDFS-8823:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 12s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 7 new or modified test files. |
| {color:green}+1{color} | javac |   7m 35s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 39s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 21s | The applied patch generated  
12 new checkstyle issues (total was 654, now 660). |
| {color:green}+1{color} | whitespace |   0m 11s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 20s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 34s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  2s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 192m 45s | Tests failed in hadoop-hdfs. |
| | | 236m 42s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.TestBlockStoragePolicy |
|   | hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotReplication |
|   | hadoop.hdfs.server.namenode.ha.TestDNFencing |
|   | hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication |
|   | hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |
|   | hadoop.hdfs.server.namenode.TestProcessCorruptBlocks |
|   | hadoop.hdfs.TestSetrepDecreasing |
|   | hadoop.hdfs.server.namenode.TestCacheDirectives |
| Timed out tests | 
org.apache.hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12749951/HDFS-8823.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 7c796fd |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11968/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11968/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11968/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11968/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11968/console |


This message was automatically generated.

> Move replication factor into individual blocks
> --
>
> Key: HDFS-8823
> URL: https://issues.apache.org/jira/browse/HDFS-8823
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-8823.000.patch, HDFS-8823.001.patch, 
> HDFS-8823.002.patch
>
>
> This jira proposes to record the replication factor in the {{BlockInfo}} 
> class. The changes have two advantages:
> * Decoupling the namespace and the block management layer. It is a 
> prerequisite step to move block management off the heap or to a separate 
> process.
> * Increased flexibility on replicating blocks. Currently the replication 
> factors of all blocks have to be the same. The replication factors of these 
> blocks are equal to the highest replication factor across all snapshots. The 
> changes will allow blocks in a file to have different replication factor, 
> potentially saving some space.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8824) Do not use small blocks for balancing the cluster

2015-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692764#comment-14692764
 ] 

Hadoop QA commented on HDFS-8824:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 27s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 44s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 52s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 23s | The applied patch generated  5 
new checkstyle issues (total was 523, now 525). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 20s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 31s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  7s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  90m  5s | Tests failed in hadoop-hdfs. |
| | | 134m 31s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate |
|   | hadoop.hdfs.TestLeaseRecovery2 |
| Timed out tests | org.apache.hadoop.hdfs.TestListFilesInFileContext |
|   | org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12749952/h8824_20150811b.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 7c796fd |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11969/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11969/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11969/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11969/console |


This message was automatically generated.

> Do not use small blocks for balancing the cluster
> -
>
> Key: HDFS-8824
> URL: https://issues.apache.org/jira/browse/HDFS-8824
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: h8824_20150727b.patch, h8824_20150811b.patch
>
>
> Balancer gets datanode block lists from NN and then move the blocks in order 
> to balance the cluster.  It should not use the blocks with small size since 
> moving the small blocks generates a lot of overhead and the small blocks do 
> not help balancing the cluster much.
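
A minimal sketch of the proposed size filter (names and the 10 MB floor are 
illustrative assumptions; the real threshold would presumably be 
configurable):

{code}
public class BlockSizeFilter {
  /** Assumed floor below which moving a block is not worthwhile. */
  static final long MIN_BLOCK_SIZE = 10L * 1024 * 1024;

  /** Skip small blocks: their per-move overhead (scheduling, connections,
   *  verification) outweighs the balancing benefit they provide. */
  static boolean isWorthMoving(long blockNumBytes) {
    return blockNumBytes >= MIN_BLOCK_SIZE;
  }
}
{code}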



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7916) 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infinite loop

2015-08-11 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-7916:
--
Labels:   (was: 2.6.1-candidate)

I earlier removed HDFS-7916 from the 2.6.1-candidate list, given that 
HDFS-7704 was only in 2.7.0.

[~ctrezzo] added it back, and so it appeared in my lists. I have removed the 
label again. Chris, please comment on the mailing lists as to why you added 
it back. If you want it included, we can add it after we figure out the why 
and the dependent tickets.

> 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for 
> infinite loop
> --
>
> Key: HDFS-7916
> URL: https://issues.apache.org/jira/browse/HDFS-7916
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Vinayakumar B
>Assignee: Rushabh S Shah
>Priority: Critical
> Fix For: 2.7.1
>
> Attachments: HDFS-7916-01.patch, HDFS-7916-1.patch
>
>
> If any bad block is found, the BPServiceActor (BPSA) for the standby node 
> will retry reporting it indefinitely.
> {noformat}2015-03-11 19:43:41,528 WARN 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to report bad block 
> BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode: 
> stobdtserver3/10.224.54.70:18010
> org.apache.hadoop.hdfs.server.datanode.BPServiceActorActionException: Failed 
> to report bad block 
> BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode:
> at 
> org.apache.hadoop.hdfs.server.datanode.ReportBadBlockAction.reportTo(ReportBadBlockAction.java:63)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processQueueMessages(BPServiceActor.java:1020)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:762)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:856)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab

2015-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692658#comment-14692658
 ] 

Hadoop QA commented on HDFS-6407:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 15s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 55s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  0s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 22s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | native |   3m  7s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 175m  1s | Tests failed in hadoop-hdfs. |
| | | 213m 46s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate |
|   | hadoop.hdfs.web.TestWebHDFS |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12749927/HDFS-6407.011.patch |
| Optional Tests | javadoc javac unit |
| git revision | trunk / 7c796fd |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11966/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11966/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11966/console |


This message was automatically generated.

> new namenode UI, lost ability to sort columns in datanode tab
> -
>
> Key: HDFS-6407
> URL: https://issues.apache.org/jira/browse/HDFS-6407
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Nathan Roberts
>Assignee: Haohui Mai
>Priority: Critical
>  Labels: BB2015-05-TBR
> Attachments: 002-datanodes-sorted-capacityUsed.png, 
> 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, 
> HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, 
> HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, 
> HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, 
> HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 
> 2.png, sorting table.png
>
>
> old ui supported clicking on column header to sort on that column. The new ui 
> seems to have dropped this very useful feature.
> There are a few tables in the Namenode UI to display  datanodes information, 
> directory listings and snapshots.
> When there are many items in the tables, it is useful to have ability to sort 
> on the different columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections

2015-08-11 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692650#comment-14692650
 ] 

Xiaobing Zhou commented on HDFS-8855:
-

Tested the HDFS-7597 patch; it's working well for HDFS-8855. On my 3 local-VM 
nodes, the number of ESTABLISHED connections varies from 1400 to 2200 while 
the load generator is running.
The code path is different in the two cases: the HDFS-8855 case goes through 
the cache in org.apache.hadoop.ipc.connection. Let's investigate that cache.

> Webhdfs client leaks active NameNode connections
> 
>
> Key: HDFS-8855
> URL: https://issues.apache.org/jira/browse/HDFS-8855
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
> Environment: HDP 2.2
>Reporter: Bob Hansen
>Assignee: Xiaobing Zhou
>
> The attached script simulates a process opening ~50 files via webhdfs and 
> performing random reads.  Note that there are at most 50 concurrent reads, 
> and all webhdfs sessions are kept open.  Each read is ~64k at a random 
> position.  
> The script periodically (once per second) shells into the NameNode and 
> produces a summary of the socket states.  For my test cluster with 5 nodes, 
> it took ~30 seconds for the NameNode to reach ~25000 active connections and 
> fail.
> It appears that each request to the webhdfs client is opening a new 
> connection to the NameNode and keeping it open after the request is complete. 
>  If the process continues to run, eventually (~30-60 seconds), all of the 
> open connections are closed and the NameNode recovers.  
> This smells like SoftReference reaping.  Are we using SoftReferences in the 
> webhdfs client to cache NameNode connections but never re-using them?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode

2015-08-11 Thread Nate Edel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Edel updated HDFS-8078:

Release Note: Resubmitting the older (non-NetUtils) version of the patch to 
see if the NetUtils change is breaking ZK-related tests; can't reproduce 
locally.  (was: Fix one checkstyle bug, and found a few more tests that 
depended on treating the IP address as a null string.  Probably should fix 
the tests, but avoiding breaking on null input is OK here...)
  Status: Patch Available  (was: Open)

> HDFS client gets errors trying to connect to IPv6 DataNode
> -
>
> Key: HDFS-8078
> URL: https://issues.apache.org/jira/browse/HDFS-8078
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: Nate Edel
>Assignee: Nate Edel
>  Labels: BB2015-05-TBR, ipv6
> Attachments: HDFS-8078.10.patch, HDFS-8078.11.patch, 
> HDFS-8078.12.patch, HDFS-8078.13.patch, HDFS-8078.9.patch
>
>
> 1st exception, on put:
> 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Does not contain a valid host:port 
> authority: 2401:db00:1010:70ba:face:0:8:0:50010
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
> This appears to actually stem from code in DatanodeID which assumes it's 
> safe to append (ipaddr + ":" + port) -- which is OK for IPv4 but not for 
> IPv6.  NetUtils.createSocketAddr() assembles a Java URI object, which 
> requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010
> Currently using InetAddress.getByName() to validate IPv6 (guava 
> InetAddresses.forString has been flaky) but could also use our own parsing. 
> (From logging this, it seems like a low-enough frequency call that the extra 
> object creation shouldn't be problematic, and for me the slight risk of 
> passing in bad input that is not actually an IPv4 or IPv6 address and thus 
> calling an external DNS lookup is outweighed by getting the address 
> normalized and avoiding rewriting parsing.)
> Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress()
> ---
> 2nd exception (on datanode)
> 15/04/13 13:18:07 ERROR datanode.DataNode: 
> dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown 
> operation  src: /2401:db00:20:7013:face:0:7:0:54152 dst: 
> /2401:db00:11:d010:face:0:2f:0:50010
> java.io.EOFException
> at java.io.DataInputStream.readShort(DataInputStream.java:315)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
> at java.lang.Thread.run(Thread.java:745)
> Which also comes as client error "-get: 2401 is not an IP string literal."
> This one has existing parsing logic which needs to shift to the last colon 
> rather than the first.  Should also be a tiny bit faster by using lastIndexOf 
> rather than split.  Could alternatively use the techniques above.
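
To illustrate the two fixes described above, a minimal sketch (hypothetical 
helper, not the actual HDFS-8078 patch):

{code}
public class HostPortUtil {
  /** Build a host:port authority; IPv6 literals must be bracketed so that
   *  URI-based parsers such as NetUtils.createSocketAddr() accept them. */
  static String toHostPort(String ipAddr, int port) {
    if (ipAddr.indexOf(':') >= 0) {       // a colon only occurs in IPv6 literals
      return "[" + ipAddr + "]:" + port;  // e.g. [2401:db00:1010:70ba:face:0:8:0]:50010
    }
    return ipAddr + ":" + port;           // IPv4 address or hostname
  }

  /** Parse the port by splitting on the *last* colon, as suggested above. */
  static int parsePort(String hostPort) {
    return Integer.parseInt(hostPort.substring(hostPort.lastIndexOf(':') + 1));
  }

  public static void main(String[] args) {
    System.out.println(toHostPort("2401:db00:1010:70ba:face:0:8:0", 50010));
    System.out.println(parsePort("[2401:db00:1010:70ba:face:0:8:0]:50010"));
  }
}
{code}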



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode

2015-08-11 Thread Nate Edel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Edel updated HDFS-8078:

Attachment: HDFS-8078.13.patch

> HDFS client gets errors trying to connect to IPv6 DataNode
> -
>
> Key: HDFS-8078
> URL: https://issues.apache.org/jira/browse/HDFS-8078
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: Nate Edel
>Assignee: Nate Edel
>  Labels: BB2015-05-TBR, ipv6
> Attachments: HDFS-8078.10.patch, HDFS-8078.11.patch, 
> HDFS-8078.12.patch, HDFS-8078.13.patch, HDFS-8078.9.patch
>
>
> 1st exception, on put:
> 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Does not contain a valid host:port 
> authority: 2401:db00:1010:70ba:face:0:8:0:50010
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
> This appears to actually stem from code in DatanodeID which assumes it's 
> safe to append (ipaddr + ":" + port) -- which is OK for IPv4 but not for 
> IPv6.  NetUtils.createSocketAddr() assembles a Java URI object, which 
> requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010
> Currently using InetAddress.getByName() to validate IPv6 (guava 
> InetAddresses.forString has been flaky) but could also use our own parsing. 
> (From logging this, it seems like a low-enough frequency call that the extra 
> object creation shouldn't be problematic, and for me the slight risk of 
> passing in bad input that is not actually an IPv4 or IPv6 address and thus 
> calling an external DNS lookup is outweighed by getting the address 
> normalized and avoiding rewriting parsing.)
> Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress()
> ---
> 2nd exception (on datanode)
> 15/04/13 13:18:07 ERROR datanode.DataNode: 
> dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown 
> operation  src: /2401:db00:20:7013:face:0:7:0:54152 dst: 
> /2401:db00:11:d010:face:0:2f:0:50010
> java.io.EOFException
> at java.io.DataInputStream.readShort(DataInputStream.java:315)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
> at java.lang.Thread.run(Thread.java:745)
> Which also comes as client error "-get: 2401 is not an IP string literal."
> This one has existing parsing logic which needs to shift to the last colon 
> rather than the first.  Should also be a tiny bit faster by using lastIndexOf 
> rather than split.  Could alternatively use the techniques above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode

2015-08-11 Thread Nate Edel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Edel updated HDFS-8078:

Status: Open  (was: Patch Available)

> HDFS client gets errors trying to connect to IPv6 DataNode
> -
>
> Key: HDFS-8078
> URL: https://issues.apache.org/jira/browse/HDFS-8078
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: Nate Edel
>Assignee: Nate Edel
>  Labels: BB2015-05-TBR, ipv6
> Attachments: HDFS-8078.10.patch, HDFS-8078.11.patch, 
> HDFS-8078.12.patch, HDFS-8078.9.patch
>
>
> 1st exception, on put:
> 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Does not contain a valid host:port 
> authority: 2401:db00:1010:70ba:face:0:8:0:50010
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
>   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
> This appears to actually stem from code in DatanodeID which assumes it's 
> safe to append (ipaddr + ":" + port) -- which is OK for IPv4 but not for 
> IPv6.  NetUtils.createSocketAddr() assembles a Java URI object, which 
> requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010
> Currently using InetAddress.getByName() to validate IPv6 (guava 
> InetAddresses.forString has been flaky) but could also use our own parsing. 
> (From logging this, it seems like a low-enough frequency call that the extra 
> object creation shouldn't be problematic, and for me the slight risk of 
> passing in bad input that is not actually an IPv4 or IPv6 address and thus 
> calling an external DNS lookup is outweighed by getting the address 
> normalized and avoiding rewriting parsing.)
> Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress()
> ---
> 2nd exception (on datanode)
> 15/04/13 13:18:07 ERROR datanode.DataNode: 
> dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown 
> operation  src: /2401:db00:20:7013:face:0:7:0:54152 dst: 
> /2401:db00:11:d010:face:0:2f:0:50010
> java.io.EOFException
> at java.io.DataInputStream.readShort(DataInputStream.java:315)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
> at java.lang.Thread.run(Thread.java:745)
> Which also comes as client error "-get: 2401 is not an IP string literal."
> This one has existing parsing logic which needs to shift to the last colon 
> rather than the first.  Should also be a tiny bit faster by using lastIndexOf 
> rather than split.  Could alternatively use the techniques above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8870) Lease is leaked on write failure

2015-08-11 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated HDFS-8870:
--

Unless the patch is ready to go and the JIRA is a critical fix, we'll defer it 
to 2.6.2. Let me know if you have comments. Thanks!

> Lease is leaked on write failure
> 
>
> Key: HDFS-8870
> URL: https://issues.apache.org/jira/browse/HDFS-8870
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.6.0
>Reporter: Rushabh S Shah
>Assignee: Daryn Sharp
>
> Creating this ticket on behalf of [~daryn]
> We've seen this in one of our clusters. When a long-running process has a 
> write failure, the lease is leaked and keeps getting renewed until the token 
> expires.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8886) Not able to build with 'mvn compile -Pnative'

2015-08-11 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-8886.
---
Resolution: Not A Problem

Please use the user list rather than JIRA for problems like this. In this case, 
I recommend you follow the instructions in BUILDING.txt, namely to run "mvn 
install -DskipTests" from the top level.

> Not able to build with 'mvn compile -Pnative'
> -
>
> Key: HDFS-8886
> URL: https://issues.apache.org/jira/browse/HDFS-8886
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Puneeth P
>
> I am running into a problem where I am not able to compile the native parts 
> of the hadoop-hdfs project: it is not finding the Makefile in 
> ${project.build.dir}/native.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8888) Support volumes in HDFS

2015-08-11 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-:
-
Description: 
There are multiple types of zones (e.g., snapshottable directories, encryption 
zones, directories with quotas) which are conceptually close to namespace 
volumes in traditional file systems.

This jira proposes to introduce the concept of volume to simplify the 
implementation of snapshots and encryption zones.

  was:
There are multiple types of zones (e.g., snapshot, encryption zone) which are 
conceptually close to namespace volumes in traditional filesystems.

This jira proposes to introduce the concept of volume to simplify the 
implementation of snapshots and encryption zones.


> Support volumes in HDFS
> ---
>
> Key: HDFS-
> URL: https://issues.apache.org/jira/browse/HDFS-
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>
> There are multiple types of zones (e.g., snapshottable directories, 
> encryption zones, directories with quotas) which are conceptually close to 
> namespace volumes in traditional file systems.
> This jira proposes to introduce the concept of volume to simplify the 
> implementation of snapshots and encryption zones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8887) Expose storage type and storage ID in BlockLocation

2015-08-11 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692594#comment-14692594
 ] 

Lei (Eddy) Xu commented on HDFS-8887:
-

LGTM, Thanks [~andrew.wang].

+1, pending jenkins.



> Expose storage type and storage ID in BlockLocation
> ---
>
> Key: HDFS-8887
> URL: https://issues.apache.org/jira/browse/HDFS-8887
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: HDFS-8887.001.patch
>
>
> Some applications schedule based on info like storage type or storage ID, 
> it'd be useful to expose this information in BlockLocation. It's already 
> included in LocatedBlock and sent over the wire.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8888) Support volumes in HDFS

2015-08-11 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-:
-
Summary: Support volumes in HDFS  (was: Support the volume concepts in HDFS)

> Support volumes in HDFS
> ---
>
> Key: HDFS-
> URL: https://issues.apache.org/jira/browse/HDFS-
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>
> There are multiple types of zones (e.g., snapshot, encryption zone) which are 
> conceptually close to namespace volumes in traditional filesystems.
> This jira proposes to introduce the concept of volume to simplify the 
> implementation of snapshots and encryption zones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8888) Support the volume concepts in HDFS

2015-08-11 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692590#comment-14692590
 ] 

Jing Zhao commented on HDFS-:
-

+1 to having volumes in HDFS. Also congrats on the jira number :)

> Support the volume concepts in HDFS
> ---
>
> Key: HDFS-
> URL: https://issues.apache.org/jira/browse/HDFS-
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>
> There are multiple types of zones (e.g., snapshot, encryption zone) which are 
> conceptually close to namespace volumes in traditional filesystems.
> This jira proposes to introduce the concept of volume to simplify the 
> implementation of snapshots and encryption zones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8888) Support the volume concepts in HDFS

2015-08-11 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692587#comment-14692587
 ] 

Haohui Mai commented on HDFS-:
--

From an implementation standpoint, the concept of a volume provides a basic 
building block to simplify the implementation of the snapshots and encryption 
zones that are available today. The concept of a volume may also simplify 
administration and operations.

A more detailed design will be available shortly.

> Support the volume concepts in HDFS
> ---
>
> Key: HDFS-
> URL: https://issues.apache.org/jira/browse/HDFS-
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>
> There are multiple types of zones (e.g., snapshot, encryption zone) which are 
> conceptually close to namespace volumes in traditional filesystems.
> This jira proposes to introduce the concept of volume to simplify the 
> implementation of snapshots and encryption zones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8880) NameNode metrics logging

2015-08-11 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-8880:

Attachment: HDFS-8880.02.patch

.02 patch:
* Added test cases.
* Replaced {{Timer}} with {{ScheduledThreadPoolExecutor}} in NameNode.
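
A minimal sketch of the scheduling change (interval and log output are 
illustrative assumptions, not the actual patch):

{code}
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class MetricsLoggerSketch {
  public static void main(String[] args) {
    // A single scheduler thread replaces the java.util.Timer.
    ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
    scheduler.setExecuteExistingDelayedTasksAfterShutdownPolicy(false);

    // Periodically write a metrics snapshot to the log (stdout here).
    scheduler.scheduleWithFixedDelay(new Runnable() {
      @Override
      public void run() {
        System.out.println("metrics snapshot: ...");
      }
    }, 60, 60, TimeUnit.SECONDS);
  }
}
{code}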

> NameNode metrics logging
> 
>
> Key: HDFS-8880
> URL: https://issues.apache.org/jira/browse/HDFS-8880
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-8880.01.patch, HDFS-8880.02.patch, 
> namenode-metrics.log
>
>
> The NameNode can periodically log metrics to help debugging when the cluster 
> is not setup with another metrics monitoring scheme.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8888) Support the volume concepts in HDFS

2015-08-11 Thread Haohui Mai (JIRA)
Haohui Mai created HDFS-:


 Summary: Support the volume concepts in HDFS
 Key: HDFS-
 URL: https://issues.apache.org/jira/browse/HDFS-
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai


There are multiple types of zones (e.g., snapshot, encryption zone) which are 
conceptually close to namespace volumes in traditional filesystems.

This jira proposes to introduce the concept of volume to simplify the 
implementation of snapshots and encryption zones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones

2015-08-11 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692549#comment-14692549
 ] 

Jing Zhao commented on HDFS-8833:
-

Thanks for the summary, Zhe. The new proposal looks reasonable to me overall. 
Some thoughts and questions:
# Will we allow associating the EC policy with a non-empty directory? I guess 
we should disallow it; otherwise the semantics of the "create EC directory" 
command can be very confusing.
# Do we want to allow nested EC directories? Currently, since we only support 
one policy, I do not see any benefit to having nested EC directories, so in 
the first stage we can disallow them. Also note that it's always easier to 
remove a restriction than to add a new one.
# If we agree on the above two, the only change we're proposing here is to 
support rename across EC zone boundaries. Since the EC policy bit is already 
on INodeFile, its implementation can be simple.

I also had some offline discussion about this with [~sureshms], [~szetszwo], 
and [~wheat9]. Currently our main concern is still that allowing rename can 
make the exact semantics hard for end users to understand, and can also make 
management hard.

> Erasure coding: store EC schema and cell size in INodeFile and eliminate 
> notion of EC zones
> ---
>
> Key: HDFS-8833
> URL: https://issues.apache.org/jira/browse/HDFS-8833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-7285
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>
> We have [discussed | 
> https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754]
>  storing EC schema with files instead of EC zones and recently revisited the 
> discussion under HDFS-8059.
> As a recap, the _zone_ concept has severe limitations including renaming and 
> nested configuration. Those limitations are valid in encryption for security 
> reasons and it doesn't make sense to carry them over in EC.
> This JIRA aims to store EC schema and cell size on {{INodeFile}} level. For 
> simplicity, we should first implement it as an xattr and consider memory 
> optimizations (such as moving it to file header) as a follow-on. We should 
> also disable changing EC policy on a non-empty file / dir in the first phase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-08-11 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8808:

Attachment: HDFS-8808-01.patch

Updating the patch with a unit test and a configuration option for 
transferring images when bootstrapping the standby.
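
For illustration, a hypothetical hdfs-site.xml snippet showing the intended 
behavior (the bootstrap-specific key name is an assumption; check the 
committed patch for the exact name):

{code}
<!-- Throttle regular checkpoint image transfers to 1 MB/s. -->
<property>
  <name>dfs.image.transfer.bandwidthPerSec</name>
  <value>1048576</value>
</property>

<!-- Hypothetical bootstrap-specific limit; 0 means unthrottled, so
     -bootstrapStandby is no longer slowed down by the setting above. -->
<property>
  <name>dfs.image.transfer-bootstrap-standby.bandwidthPerSec</name>
  <value>0</value>
</property>
{code}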

> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades and could potentially slow down the entire workflow. The request 
> here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth 
> setting



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8870) Lease is leaked on write failure

2015-08-11 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692510#comment-14692510
 ] 

Sangjin Lee commented on HDFS-8870:
---

Should this be targeted to 2.6.2? We're trying to release 2.6.1 soon. Let me 
know.

> Lease is leaked on write failure
> 
>
> Key: HDFS-8870
> URL: https://issues.apache.org/jira/browse/HDFS-8870
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.6.0
>Reporter: Rushabh S Shah
>Assignee: Daryn Sharp
>
> Creating this ticket on behalf of [~daryn]
> We've seen this in one of our clusters. When a long-running process has a 
> write failure, the lease is leaked and keeps getting renewed until the token 
> expires.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8887) Expose storage type and storage ID in BlockLocation

2015-08-11 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-8887:
--
Status: Patch Available  (was: Open)

> Expose storage type and storage ID in BlockLocation
> ---
>
> Key: HDFS-8887
> URL: https://issues.apache.org/jira/browse/HDFS-8887
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: HDFS-8887.001.patch
>
>
> Some applications schedule based on info like storage type or storage ID, 
> it'd be useful to expose this information in BlockLocation. It's already 
> included in LocatedBlock and sent over the wire.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8887) Expose storage type and storage ID in BlockLocation

2015-08-11 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-8887:
--
Attachment: HDFS-8887.001.patch

Patch attached. If this goes in, I'll file another follow-on JIRA to remove the 
getFileBlockStorageLocations API from trunk since it is superseded by storage 
IDs.

> Expose storage type and storage ID in BlockLocation
> ---
>
> Key: HDFS-8887
> URL: https://issues.apache.org/jira/browse/HDFS-8887
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: HDFS-8887.001.patch
>
>
> Some applications schedule based on info like storage type or storage ID, 
> it'd be useful to expose this information in BlockLocation. It's already 
> included in LocatedBlock and sent over the wire.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8887) Expose storage type and storage ID in BlockLocation

2015-08-11 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-8887:
-

 Summary: Expose storage type and storage ID in BlockLocation
 Key: HDFS-8887
 URL: https://issues.apache.org/jira/browse/HDFS-8887
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.1
Reporter: Andrew Wang
Assignee: Andrew Wang


Some applications schedule based on info like storage type or storage ID, it'd 
be useful to expose this information in BlockLocation. It's already included in 
LocatedBlock and sent over the wire.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8826) Balancer may not move blocks efficiently in some cases

2015-08-11 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-8826:
--
Status: Patch Available  (was: Open)

> Balancer may not move blocks efficiently in some cases
> --
>
> Key: HDFS-8826
> URL: https://issues.apache.org/jira/browse/HDFS-8826
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: h8826_20150811.patch
>
>
> Balancer is inefficient in the following case:
> || Datanode || Utilization || Rack ||
> | D1 | 95% | A |
> | D2 | 30% | B |
> | D3, D4, D5 | 0% | B |
> The average utilization is 25% so that D2 is within 10% threshold.  However, 
> Balancer currently will first move blocks from D2 to D3, D4 and D5 since they 
> are under the same rack.  Then, it will move blocks from D1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8826) Balancer may not move blocks efficiently in some cases

2015-08-11 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-8826:
--
Attachment: h8826_20150811.patch

h8826_20150811.patch: adds a new -source option.

Will add some tests later.
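
A hypothetical invocation of the new option (the exact syntax is an 
assumption based on the description):

{noformat}
# Move blocks only out of the specified over-utilized datanodes.
hdfs balancer -threshold 10 -source datanode1.example.com,datanode2.example.com
{noformat}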

> Balancer may not move blocks efficiently in some cases
> --
>
> Key: HDFS-8826
> URL: https://issues.apache.org/jira/browse/HDFS-8826
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: h8826_20150811.patch
>
>
> Balancer is inefficient in the following case:
> || Datanode || Utilization || Rack ||
> | D1 | 95% | A |
> | D2 | 30% | B |
> | D3, D4, D5 | 0% | B |
> The average utilization is 25% so that D2 is within 10% threshold.  However, 
> Balancer currently will first move blocks from D2 to D3, D4 and D5 since they 
> are under the same rack.  Then, it will move blocks from D1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692442#comment-14692442
 ] 

Hadoop QA commented on HDFS-8828:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 37s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   8m 20s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 20s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 21s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 28s | The applied patch generated  1 
new checkstyle issues (total was 120, now 121). |
| {color:green}+1{color} | whitespace |   0m  4s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 50s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | tools/hadoop tests |   6m 29s | Tests passed in 
hadoop-distcp. |
| | |  45m 36s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12749947/HDFS-8828.006.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 7c796fd |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11967/artifact/patchprocess/diffcheckstylehadoop-distcp.txt
 |
| hadoop-distcp test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11967/artifact/patchprocess/testrun_hadoop-distcp.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11967/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11967/console |


This message was automatically generated.

> Utilize Snapshot diff report to build copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch
>
>
> Some users reported a huge time cost to build the file copy list in distcp (30 
> hours for 1.6M files). We can leverage the snapshot diff report to build a file 
> copy list including only the files/dirs that changed between two snapshots (or 
> a snapshot and a normal dir). This speeds up the process in two ways: 1. less 
> copy-list building time; 2. fewer file copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory 
> creation, deletion, rename and modification between two snapshots or between a 
> snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, 
> then falls back to the default distcp, so it still relies on the default 
> distcp to build the complete list of files under the source dir. This patch 
> puts only created and modified files into the copy list based on the snapshot 
> diff report, minimizing the number of files to copy. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones

2015-08-11 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692302#comment-14692302
 ] 

Zhe Zhang commented on HDFS-8833:
-

bq. Looks like Zhe planned to implement the XAttr-based solution, and I'm fine 
with this direction.
Thanks Jing for confirming this. 

When looking at the code I realized that we should at least leverage the 
{{isStriped}} bit in the file header to represent the system default policy of 
RS(6,3). So below is a revised design:

# ErasureCodingPolicy table: already done in HDFS-8854
# File header change
#* Rename {{isStriped}} to {{erasureCodingPolicy}} in {{INodeFile}} header.
{code}
  /** 
   * Bit format:
   * [4-bit storagePolicyID][1-bit erasureCodingPolicy]
   * [11-bit replication][48-bit preferredBlockSize]
   */
{code}
#* The ECPolicy is *always set* when creating a file; {{0}} represents 
contiguous layout.
#* Since {{ErasureCodingPolicyManager}} / {{ErasureCodingSchemaManager}} only 
has 1 policy, we don't even need to set XAttr on files at this stage.
#* [follow-on] When we support more EC policies on HDFS side, figure out the 
number of additional file header bits to use.
#* [follow-on] Add {{inherit-on-create}} flag as Andrew suggested above
# Directory XAttr change
#* A directory's ECPolicy XAttr can be empty, indicating the ECPolicy is the same 
as its ancestor's. Otherwise its own XAttr determines the policy for newly 
created files under the directory.
# Renaming
#* A renamed file keeps the ECPolicy in its header.
#* Therefore, a directory can have files with different ECPolicies.
#* Conversion is not explicitly supported. If needed, a file can be converted by 
cp+rm.
#* When renamed, a directory carries over its ECPolicy if it's set (XAttr 
non-empty). Otherwise its XAttr remains empty (and newly created files under 
the moved directory will use the policy from the new ancestors). 

Please let me know if it looks reasonable. Thanks.
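To make the proposed bit layout concrete, here is a hypothetical 
packing/unpacking sketch; the helper names are illustrative, not the actual 
{{INodeFile}} code:
{code}
// 4-bit storagePolicyID | 1-bit erasureCodingPolicy | 11-bit replication
// | 48-bit preferredBlockSize, packed into one long (sketch only).
static long buildHeader(long storagePolicyId, boolean ecPolicy,
    long replication, long preferredBlockSize) {
  assert storagePolicyId < (1L << 4) && replication < (1L << 11)
      && preferredBlockSize < (1L << 48);
  return (storagePolicyId << 60) | ((ecPolicy ? 1L : 0L) << 59)
      | (replication << 48) | preferredBlockSize;
}

static boolean hasErasureCodingPolicy(long header) {
  return ((header >>> 59) & 1L) != 0;  // 0 represents contiguous layout
}

static short getReplication(long header) {
  return (short) ((header >>> 48) & ((1L << 11) - 1));
}
{code}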

> Erasure coding: store EC schema and cell size in INodeFile and eliminate 
> notion of EC zones
> ---
>
> Key: HDFS-8833
> URL: https://issues.apache.org/jira/browse/HDFS-8833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-7285
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>
> We have [discussed | 
> https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754]
>  storing EC schema with files instead of EC zones and recently revisited the 
> discussion under HDFS-8059.
> As a recap, the _zone_ concept has severe limitations including renaming and 
> nested configuration. Those limitations are justified in encryption for 
> security reasons, but it doesn't make sense to carry them over to EC.
> This JIRA aims to store EC schema and cell size at the {{INodeFile}} level. For 
> simplicity, we should first implement it as an xattr and consider memory 
> optimizations (such as moving it to the file header) as a follow-on. We should 
> also disable changing the EC policy on a non-empty file / dir in the first 
> phase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8875) Optimize the wait time in Balancer for federation scenario

2015-08-11 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692284#comment-14692284
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8875:
---

Balancer will exit if one of the NNs succeeds or throws an exception.  See if 
you also want to fix that here.

> Optimize the wait time in Balancer for federation scenario
> --
>
> Key: HDFS-8875
> URL: https://issues.apache.org/jira/browse/HDFS-8875
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Chris Trezzo
>
> Balancer has a wait time between two consecutive iterations. That is to give 
> some time for block movement to be fully committed (a return from replaceBlock 
> doesn't mean the NN's blockmap has been updated and the block has been 
> invalidated on the source node).
> This wait time could be 23 seconds if {{dfs.heartbeat.interval}} is set to 10 
> and {{dfs.namenode.replication.interval}} is set to 3 (presumably 2 x 10 s + 
> 3 s = 23 s). In the case of federation, given we iterate through all 
> namespaces in each iteration, this wait time becomes unnecessary: while 
> Balancer is processing the next namespace, the previous namespace it just 
> finished has time to commit.
> In addition, Balancer calls {{Collections.shuffle(connectors);}} which doesn't 
> seem necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8826) Balancer may not move blocks efficiently in some cases

2015-08-11 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14691417#comment-14691417
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8826:
---

I suggest adding an option to specify the source node list.  Then, Balancer 
only selects blocks to move from those nodes.

> Balancer may not move blocks efficiently in some cases
> --
>
> Key: HDFS-8826
> URL: https://issues.apache.org/jira/browse/HDFS-8826
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>
> Balancer is inefficient in the following case:
> || Datanode || Utilization || Rack ||
> | D1 | 95% | A |
> | D2 | 30% | B |
> | D3, D4, D5 | 0% | B |
> The average utilization is (95 + 30 + 0 + 0 + 0) / 5 = 25%, so D2 is within the 
> 10% threshold.  However, Balancer currently will first move blocks from D2 to 
> D3, D4 and D5 since they are under the same rack.  Then, it will move blocks 
> from D1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8823) Move replication factor into individual blocks

2015-08-11 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-8823:
-
Attachment: HDFS-8823.002.patch

> Move replication factor into individual blocks
> --
>
> Key: HDFS-8823
> URL: https://issues.apache.org/jira/browse/HDFS-8823
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-8823.000.patch, HDFS-8823.001.patch, 
> HDFS-8823.002.patch
>
>
> This jira proposes to record the replication factor in the {{BlockInfo}} 
> class. The change has two advantages:
> * Decoupling the namespace and the block management layer. It is a 
> prerequisite step to move block management off the heap or to a separate 
> process.
> * Increased flexibility in replicating blocks. Currently the replication 
> factors of all blocks in a file have to be the same, equal to the highest 
> replication factor across all snapshots. The change will allow blocks in a 
> file to have different replication factors, potentially saving some space (a 
> sketch follows below).
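For example, the per-block field could look roughly like this; {{BlockInfo}} 
extending {{Block}} matches the existing class hierarchy, but the field and 
accessors below are only a sketch of the proposal:
{code}
// Sketch: record the replication factor on the block itself so the block
// management layer no longer has to consult the INodeFile.
class BlockInfo extends Block {
  private short replication;  // per-block replication factor

  short getReplication() { return replication; }
  void setReplication(short replication) { this.replication = replication; }
}
{code}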



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8824) Do not use small blocks for balancing the cluster

2015-08-11 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-8824:
--
Attachment: h8824_20150811b.patch

h8824_20150811b.patch: reverts the NN change.  Will do it in a separate JIRA.

> Do not use small blocks for balancing the cluster
> -
>
> Key: HDFS-8824
> URL: https://issues.apache.org/jira/browse/HDFS-8824
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: h8824_20150727b.patch, h8824_20150811b.patch
>
>
> Balancer gets datanode block lists from the NN and then moves the blocks in 
> order to balance the cluster.  It should not use small blocks, since moving 
> small blocks generates a lot of overhead and does not help balance the cluster 
> much.
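A minimal sketch of the filtering idea; the threshold name and surrounding 
methods are assumptions for illustration, not the patch itself:
{code}
// Sketch: skip blocks below a size cutoff when scheduling moves.
for (Block block : candidateBlocks) {
  if (block.getNumBytes() < minBlockSizeToMove) {
    continue;  // small blocks add move overhead but barely shift utilization
  }
  scheduleMove(block);
}
{code}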



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-11 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated HDFS-8828:
---
Attachment: HDFS-8828.006.patch

Added an exclude list while recursively traversing the created directories in 
the snapshot diff report. 
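For reference, a minimal sketch of the underlying idea using the public 
snapshot diff API; the real patch wires this into the distcp copy-list builder, 
which is omitted here:
{code}
// Sketch: build the copy list from only CREATE/MODIFY diff entries.
DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
SnapshotDiffReport report = dfs.getSnapshotDiffReport(sourceDir, "s1", "s2");
List<Path> copyList = new ArrayList<>();
for (SnapshotDiffReport.DiffReportEntry entry : report.getDiffList()) {
  SnapshotDiffReport.DiffType type = entry.getType();
  if (type == SnapshotDiffReport.DiffType.CREATE
      || type == SnapshotDiffReport.DiffType.MODIFY) {
    copyList.add(new Path(sourceDir,
        DFSUtil.bytes2String(entry.getSourcePath())));
  }
  // DELETE and RENAME are synchronized on the target first (HDFS-7535).
}
{code}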

> Utilize Snapshot diff report to build copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch
>
>
> Some users reported a huge time cost to build the file copy list in distcp (30 
> hours for 1.6M files). We can leverage the snapshot diff report to build a file 
> copy list including only the files/dirs that changed between two snapshots (or 
> a snapshot and a normal dir). This speeds up the process in two ways: 1. less 
> copy-list building time; 2. fewer file copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory 
> creation, deletion, rename and modification between two snapshots or between a 
> snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, 
> then falls back to the default distcp, so it still relies on the default 
> distcp to build the complete list of files under the source dir. This patch 
> puts only created and modified files into the copy list based on the snapshot 
> diff report, minimizing the number of files to copy. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6244) Make Trash Interval configurable for each of the namespaces

2015-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682498#comment-14682498
 ] 

Hadoop QA commented on HDFS-6244:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  20m 19s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 52s | There were no new javac warning 
messages. |
| {color:red}-1{color} | javadoc |   9m 49s | The applied patch generated  1  
additional warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 13s | The applied patch generated  3 
new checkstyle issues (total was 195, now 197). |
| {color:red}-1{color} | checkstyle |   2m 53s | The applied patch generated  5 
new checkstyle issues (total was 19, now 24). |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 5  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   5m 56s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |  22m 18s | Tests failed in 
hadoop-common. |
| {color:red}-1{color} | yarn tests |  50m 44s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| {color:red}-1{color} | hdfs tests |   0m 22s | Tests failed in hadoop-hdfs. |
| | | 122m 43s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.ha.TestZKFailoverController |
|   | hadoop.net.TestNetUtils |
|   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation |
|   | hadoop.yarn.server.resourcemanager.TestRMAdminService |
|   | hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStorePerf |
|   | hadoop.yarn.server.resourcemanager.recovery.TestLeveldbRMStateStore |
| Timed out tests | org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart 
|
|   | 
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokens
 |
| Failed build | hadoop-hdfs |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12749892/HDFS-6244.v4.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 1fc3c77 |
| javadoc | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11965/artifact/patchprocess/diffJavadocWarnings.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11965/artifact/patchprocess/diffcheckstylehadoop-common.txt
 
https://builds.apache.org/job/PreCommit-HDFS-Build/11965/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11965/artifact/patchprocess/whitespace.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11965/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11965/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11965/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11965/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11965/console |


This message was automatically generated.

> Make Trash Interval configurable for each of the namespaces
> ---
>
> Key: HDFS-6244
> URL: https://issues.apache.org/jira/browse/HDFS-6244
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Assignee: Siqi Li
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6244.v1.patch, HDFS-6244.v2.patch, 
> HDFS-6244.v3.patch, HDFS-6244.v4.patch
>
>
> Somehow we need to avoid the cluster filling up.
> One solution is to have a different trash policy per namespace. However, if 
> we can simply make the property configurable per namespace, then the same 
> config can be rolled everywhere and we'd be done. This seems simple enough.

[jira] [Commented] (HDFS-8886) Not able to build with 'mvn compile -Pnative'

2015-08-11 Thread Puneeth P (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682469#comment-14682469
 ] 

Puneeth P commented on HDFS-8886:
-

When I try to build it, it fails with 
{noformat}
run (make) on project hadoop-hdfs: An Ant BuildException has occured: no 
targets found
{noformat}

> Not able to build with 'mvn compile -Pnative'
> -
>
> Key: HDFS-8886
> URL: https://issues.apache.org/jira/browse/HDFS-8886
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Puneeth P
>
> I am running into a problem where I am not able to compile the native parts 
> of the hadoop-hdfs project. The problem is that it is not finding the Makefile 
> in ${project.build.dir}/native.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8886) Not able to build with 'mvn compile -Pnative'

2015-08-11 Thread Puneeth P (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Puneeth P updated HDFS-8886:

Description: 
I am running into a problem where I am not able to compile the native parts of 
the hadoop-hdfs project. The problem is that it is not finding the Makefile in 
${project.build.dir}/native.

  was:I am running into a problem where I am not able to compile the native 
parts of the hadoop-hdfs project. The problem is that it is not finding the 
Makefile in *${project.build.dir}/native*.


> Not able to build with 'mvn compile -Pnative'
> -
>
> Key: HDFS-8886
> URL: https://issues.apache.org/jira/browse/HDFS-8886
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Puneeth P
>
> I am running into a problem where I am not able to compile the native parts 
> of the hadoop-hdfs project. The problem is that it is not finding the Makefile 
> in ${project.build.dir}/native.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8886) Not able to build with 'mvn compile -Pnative'

2015-08-11 Thread Puneeth P (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Puneeth P updated HDFS-8886:

Description: I am running into a problem where I am not able to compile the 
native parts of the hadoop-hdfs project. The problem is that it is not finding 
the Makefile in *${project.build.dir}/native*.  (was: I am running into a 
problem where I am not able to compile the native parts of the hadoop-hdfs 
project. The problem is that it is not finding the Makefile in 
{{${project.build.dir}/native}}.)

> Not able to build with 'mvn compile -Pnative'
> -
>
> Key: HDFS-8886
> URL: https://issues.apache.org/jira/browse/HDFS-8886
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Puneeth P
>
> I am running into a problem where I am not able to compile the native parts 
> of the hadoop-hdfs project. The problem is that it is not finding the Makefile 
> in *${project.build.dir}/native*.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8886) Not able to build with 'mvn compile -Pnative'

2015-08-11 Thread Puneeth P (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Puneeth P updated HDFS-8886:

Description: I am running into a problem where I am not able to compile the 
native parts of the hadoop-hdfs project. The problem is that it is not finding 
the Makefile in {{${project.build.dir}/native}}.  (was: I am running into a 
problem where I am not able to compile the native parts of the hadoop-hdfs 
project. The problem is that it is not finding the Makefile in 
${project.build.dir}/native.)

> Not able to build with 'mvn compile -Pnative'
> -
>
> Key: HDFS-8886
> URL: https://issues.apache.org/jira/browse/HDFS-8886
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Puneeth P
>
> I am running into a problem where I am not able to compile the native parts 
> of the hadoop-hdfs project. The problem is that it is not finding the Makefile 
> in {{${project.build.dir}/native}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8886) Not able to build with 'mvn compile -Pnative'

2015-08-11 Thread Puneeth P (JIRA)
Puneeth P created HDFS-8886:
---

 Summary: Not able to build with 'mvn compile -Pnative'
 Key: HDFS-8886
 URL: https://issues.apache.org/jira/browse/HDFS-8886
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Puneeth P


I am running into a problem where I am not able to compile the native parts of 
the hadoop-hdfs project. The problem is that it is not finding the Makefile in 
${project.build.dir}/native.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682433#comment-14682433
 ] 

Hadoop QA commented on HDFS-8808:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m 16s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m 14s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  0s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 28s | The applied patch generated  1 
new checkstyle issues (total was 152, now 153). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 23s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 36s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 11s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  87m 49s | Tests failed in hadoop-hdfs. |
| | | 134m  0s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestQuotaByStorageType |
|   | hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics |
|   | hadoop.hdfs.server.namenode.TestDefaultBlockPlacementPolicy |
|   | hadoop.hdfs.server.namenode.TestFSImageWithSnapshot |
|   | hadoop.hdfs.server.namenode.TestBlockPlacementPolicyRackFaultTolerant |
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12749886/HDFS-8808-00.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 1fc3c77 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11964/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11964/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11964/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11964/console |


This message was automatically generated.

> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Attachments: HDFS-8808-00.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades and could potentially slow down the entire workflow. The request 
> here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth 
> setting.
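For reference, the throttle in question is configured as below; the proposal is 
that {{-bootstrapStandby}} bypass it (the value is only an example, and 0 
already means unthrottled):
{code}
<property>
  <name>dfs.image.transfer.bandwidthPerSec</name>
  <!-- bytes per second for image transfers; 0 disables throttling -->
  <value>1048576</value>
</property>
{code}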



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8880) NameNode metrics logging

2015-08-11 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-8880:

Status: Patch Available  (was: Open)

> NameNode metrics logging
> 
>
> Key: HDFS-8880
> URL: https://issues.apache.org/jira/browse/HDFS-8880
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-8880.01.patch, namenode-metrics.log
>
>
> The NameNode can periodically log metrics to help with debugging when the 
> cluster is not set up with another metrics monitoring scheme.
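If this follows the usual periodic-logger shape, the configuration might look 
roughly like the following; the key name and default are assumptions about the 
patch, not committed settings:
{code}
<property>
  <!-- assumed key: how often the NameNode dumps its metrics to the log -->
  <name>dfs.namenode.metrics.logger.period.seconds</name>
  <value>600</value>
</property>
{code}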



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8880) NameNode metrics logging

2015-08-11 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-8880:

Status: Open  (was: Patch Available)

> NameNode metrics logging
> 
>
> Key: HDFS-8880
> URL: https://issues.apache.org/jira/browse/HDFS-8880
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-8880.01.patch, namenode-metrics.log
>
>
> The NameNode can periodically log metrics to help with debugging when the 
> cluster is not set up with another metrics monitoring scheme.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones

2015-08-11 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682419#comment-14682419
 ] 

Jing Zhao commented on HDFS-8833:
-

bq. I agree that a pure XAttr-based solution is simpler, cleaner, and more 
scalable. We should probably implement that at this stage and pursue memory 
saving ideas as follow-on.

Looks like Zhe planned to implement the XAttr-based solution, and I'm fine with 
this direction.

> Erasure coding: store EC schema and cell size in INodeFile and eliminate 
> notion of EC zones
> ---
>
> Key: HDFS-8833
> URL: https://issues.apache.org/jira/browse/HDFS-8833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-7285
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>
> We have [discussed | 
> https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754]
>  storing EC schema with files instead of EC zones and recently revisited the 
> discussion under HDFS-8059.
> As a recap, the _zone_ concept has severe limitations including renaming and 
> nested configuration. Those limitations are justified in encryption for 
> security reasons, but it doesn't make sense to carry them over to EC.
> This JIRA aims to store EC schema and cell size at the {{INodeFile}} level. For 
> simplicity, we should first implement it as an xattr and consider memory 
> optimizations (such as moving it to the file header) as a follow-on. We should 
> also disable changing the EC policy on a non-empty file / dir in the first 
> phase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones

2015-08-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682412#comment-14682412
 ] 

Andrew Wang commented on HDFS-8833:
---

[~jingzhao] any additional comment on file header bits? Else I think Zhe wants 
to start working on the design as discussed above.

> Erasure coding: store EC schema and cell size in INodeFile and eliminate 
> notion of EC zones
> ---
>
> Key: HDFS-8833
> URL: https://issues.apache.org/jira/browse/HDFS-8833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-7285
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>
> We have [discussed | 
> https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754]
>  storing EC schema with files instead of EC zones and recently revisited the 
> discussion under HDFS-8059.
> As a recap, the _zone_ concept has severe limitations including renaming and 
> nested configuration. Those limitations are justified in encryption for 
> security reasons, but it doesn't make sense to carry them over to EC.
> This JIRA aims to store EC schema and cell size at the {{INodeFile}} level. For 
> simplicity, we should first implement it as an xattr and consider memory 
> optimizations (such as moving it to the file header) as a follow-on. We should 
> also disable changing the EC policy on a non-empty file / dir in the first 
> phase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab

2015-08-11 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682404#comment-14682404
 ] 

Chang Li commented on HDFS-6407:


OK. +1 (non-binding)

> new namenode UI, lost ability to sort columns in datanode tab
> -
>
> Key: HDFS-6407
> URL: https://issues.apache.org/jira/browse/HDFS-6407
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Nathan Roberts
>Assignee: Haohui Mai
>Priority: Critical
>  Labels: BB2015-05-TBR
> Attachments: 002-datanodes-sorted-capacityUsed.png, 
> 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, 
> HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, 
> HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, 
> HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, 
> HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 
> 2.png, sorting table.png
>
>
> The old UI supported clicking on a column header to sort on that column. The 
> new UI seems to have dropped this very useful feature.
> There are a few tables in the NameNode UI to display datanode information, 
> directory listings and snapshots.
> When there are many items in the tables, it is useful to have the ability to 
> sort on the different columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8277) Safemode enter fails when Standby NameNode is down

2015-08-11 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682399#comment-14682399
 ] 

Arpit Agarwal commented on HDFS-8277:
-

Hi [~surendrasingh], the setting must be persisted in the edit log, but changing 
the behavior would be incompatible for 2.x. We should consider revisiting this 
for 3.x. I am not in favor of the original proposal in the v1 patch.

> Safemode enter fails when Standby NameNode is down
> --
>
> Key: HDFS-8277
> URL: https://issues.apache.org/jira/browse/HDFS-8277
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, HDFS, namenode
>Affects Versions: 2.6.0
> Environment: HDP 2.2.0
>Reporter: Hari Sekhon
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Attachments: HDFS-8277-safemode-edits.patch, HDFS-8277.patch, 
> HDFS-8277_1.patch, HDFS-8277_2.patch, HDFS-8277_3.patch, HDFS-8277_4.patch
>
>
> HDFS fails to enter safemode when the Standby NameNode is down (eg. due to 
> AMBARI-10536).
> {code}hdfs dfsadmin -safemode enter
> safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused{code}
> This appears to be a bug in that it's not trying both NameNodes like the 
> standard hdfs client code does, and is instead stopping after getting a 
> connection refused from nn1, which is down. I verified that normal hadoop fs 
> writes and reads via the CLI did work at this time, using nn2. I happened to 
> run this command as the hdfs user on nn2, which was the surviving Active 
> NameNode.
> After I re-bootstrapped the Standby NN to fix it, the command worked as 
> expected again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab

2015-08-11 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682398#comment-14682398
 ] 

Haohui Mai commented on HDFS-6407:
--

The discussion on DFS usage and the column sorting should be separated. 
Please file another jira for the feature request.


> new namenode UI, lost ability to sort columns in datanode tab
> -
>
> Key: HDFS-6407
> URL: https://issues.apache.org/jira/browse/HDFS-6407
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Nathan Roberts
>Assignee: Haohui Mai
>Priority: Critical
>  Labels: BB2015-05-TBR
> Attachments: 002-datanodes-sorted-capacityUsed.png, 
> 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, 
> HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, 
> HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, 
> HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, 
> HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 
> 2.png, sorting table.png
>
>
> The old UI supported clicking on a column header to sort on that column. The 
> new UI seems to have dropped this very useful feature.
> There are a few tables in the NameNode UI to display datanode information, 
> directory listings and snapshots.
> When there are many items in the tables, it is useful to have the ability to 
> sort on the different columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab

2015-08-11 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated HDFS-6407:
---
Attachment: HDFS-6407.011.patch

Added non-DFS usage back as a column in the .11 patch. 
[~wheat9], [~nroberts] please help review the .11 patch. Thanks!

> new namenode UI, lost ability to sort columns in datanode tab
> -
>
> Key: HDFS-6407
> URL: https://issues.apache.org/jira/browse/HDFS-6407
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Nathan Roberts
>Assignee: Haohui Mai
>Priority: Critical
>  Labels: BB2015-05-TBR
> Attachments: 002-datanodes-sorted-capacityUsed.png, 
> 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, 
> HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, 
> HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, 
> HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, 
> HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 
> 2.png, sorting table.png
>
>
> The old UI supported clicking on a column header to sort on that column. The 
> new UI seems to have dropped this very useful feature.
> There are a few tables in the NameNode UI to display datanode information, 
> directory listings and snapshots.
> When there are many items in the tables, it is useful to have the ability to 
> sort on the different columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8865) Improve quota initialization performance

2015-08-11 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682371#comment-14682371
 ] 

Kihwal Lee commented on HDFS-8865:
--

The checkstyle error is for the new config key, which I am not going to fix.
The unit test timeout does not happen when I run it. Looks like it is failing 
in other pre-commit builds too, so it is not being caused by this patch.

> Improve quota initialization performance
> 
>
> Key: HDFS-8865
> URL: https://issues.apache.org/jira/browse/HDFS-8865
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, 
> HDFS-8865.v2.patch
>
>
> After replaying edits, the whole file system tree is recursively scanned in 
> order to initialize the quota. For a big namespace, this can take a very long 
> time.  Since this is done during namenode failover, it also affects failover 
> latency.
> By using the Fork-Join framework, I was able to greatly reduce the 
> initialization time.  The following is the test result using the fsimage from 
> one of the big name nodes we have (a sketch of the approach follows the 
> table).
> || threads || seconds||
> | 1 (existing) | 55|
> | 1 (fork-join) | 68 |
> | 4 | 16 |
> | 8 | 8 |
> | 12 | 6 |
> | 16 | 5 |
> | 20 | 4 |
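A minimal fork-join sketch of the approach; {{DirNode}} is a hypothetical 
stand-in for {{INodeDirectory}}, so this illustrates the technique rather than 
the patch:
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

interface DirNode {
  long selfUsage();                    // usage excluding subdirectories
  Iterable<DirNode> childDirectories();
  void setQuotaUsage(long used);       // cache the aggregate on the directory
}

class QuotaInitTask extends RecursiveTask<Long> {
  private final DirNode dir;
  QuotaInitTask(DirNode dir) { this.dir = dir; }

  @Override
  protected Long compute() {
    long used = dir.selfUsage();
    List<QuotaInitTask> subtasks = new ArrayList<>();
    for (DirNode child : dir.childDirectories()) {
      QuotaInitTask t = new QuotaInitTask(child);
      t.fork();                        // walk subtrees in parallel
      subtasks.add(t);
    }
    for (QuotaInitTask t : subtasks) {
      used += t.join();
    }
    dir.setQuotaUsage(used);
    return used;
  }
}
// Usage: long total = new ForkJoinPool(16).invoke(new QuotaInitTask(root));
{code}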



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8863) The remaining space check in BlockPlacementPolicyDefault is flawed

2015-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682328#comment-14682328
 ] 

Hadoop QA commented on HDFS-8863:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m  5s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  0s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 35s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 30s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 25s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 47s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 17s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 175m 14s | Tests failed in hadoop-hdfs. |
| | | 221m 54s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestDatanodeDeath |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12749845/HDFS-8863.v2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / fa1d84a |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11963/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11963/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11963/console |


This message was automatically generated.

> The remaining space check in BlockPlacementPolicyDefault is flawed
> -
>
> Key: HDFS-8863
> URL: https://issues.apache.org/jira/browse/HDFS-8863
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
>  Labels: 2.6.1-candidate
> Attachments: HDFS-8863.patch, HDFS-8863.v2.patch
>
>
> The block placement policy calls 
> {{DatanodeDescriptor#getRemaining(StorageType)}} to check whether the block 
> is going to fit. Since the method is adding up all remaining spaces, namenode 
> can allocate a new block on a full node. This causes pipeline construction 
> failure and {{abandonBlock}}. If the cluster is nearly full, the client might 
> hit this multiple times and the write can fail permanently.
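To illustrate the flaw, with semantics assumed from the description (stub 
accessors, not the exact {{DatanodeDescriptor}} code):
{code}
// Summing remaining space across a node's storages can approve a block that
// no single storage on that node can actually hold.
long totalRemaining = 0;
long maxRemaining = 0;
for (DatanodeStorageInfo s : node.getStorageInfos()) {
  if (s.getStorageType() != requiredType) continue;
  totalRemaining += s.getRemaining();
  maxRemaining = Math.max(maxRemaining, s.getRemaining());
}
boolean flawedCheck  = totalRemaining >= blockSize; // what the sum effectively checks
boolean correctCheck = maxRemaining   >= blockSize; // what should be verified
{code}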



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8865) Improve quota initialization performance

2015-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682301#comment-14682301
 ] 

Hadoop QA commented on HDFS-8865:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m  9s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 41s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 56s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 21s | The applied patch generated  1 
new checkstyle issues (total was 493, now 491). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 21s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 38s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 10s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 186m  4s | Tests failed in hadoop-hdfs. |
| | | 230m 19s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12749856/HDFS-8865.v2.checkstyle.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / fa1d84a |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11962/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11962/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11962/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11962/console |


This message was automatically generated.

> Improve quota initialization performance
> 
>
> Key: HDFS-8865
> URL: https://issues.apache.org/jira/browse/HDFS-8865
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, 
> HDFS-8865.v2.patch
>
>
> After replaying edits, the whole file system tree is recursively scanned in 
> order to initialize the quota. For a big namespace, this can take a very long 
> time.  Since this is done during namenode failover, it also affects failover 
> latency.
> By using the Fork-Join framework, I was able to greatly reduce the 
> initialization time.  The following is the test result using the fsimage from 
> one of the big name nodes we have.
> || threads || seconds||
> | 1 (existing) | 55|
> | 1 (fork-join) | 68 |
> | 4 | 16 |
> | 8 | 8 |
> | 12 | 6 |
> | 16 | 5 |
> | 20 | 4 |



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8052) Move WebHdfsFileSystem into hadoop-hdfs-client

2015-08-11 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682247#comment-14682247
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8052:
---

Agree with Haohui that RetryUtils is not yet a public API although it could be 
useful for other projects.

[~gsaha], the Hadoop APIs by default are for internal use only unless they are 
annotated as \@InterfaceAudience.Public.  If there is a need, please file a 
JIRA so that we could change the annotation.
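For context, this is how Hadoop marks a class as public API (a generic example, 
not {{RetryUtils}} itself):
{code}
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;

@InterfaceAudience.Public      // safe for downstream projects to depend on
@InterfaceStability.Evolving   // may still change between minor releases
public class SomeClientFacingUtil {
  // ...
}
{code}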

> Move WebHdfsFileSystem into hadoop-hdfs-client
> --
>
> Key: HDFS-8052
> URL: https://issues.apache.org/jira/browse/HDFS-8052
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.8.0
>
> Attachments: HDFS-8052.000.patch, HDFS-8052.001.patch, 
> HDFS-8052.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8052) Move WebHdfsFileSystem into hadoop-hdfs-client

2015-08-11 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai resolved HDFS-8052.
--
Resolution: Fixed

Closing this jira.

{{RetryUtils}} is not annotated as a public API, thus it might change as the 
project evolves. This is not an incompatible change as it is an internal 
implementation detail.

The fix in SLIDER-923 looks correct to me. Am I missing anything?

> Move WebHdfsFileSystem into hadoop-hdfs-client
> --
>
> Key: HDFS-8052
> URL: https://issues.apache.org/jira/browse/HDFS-8052
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.8.0
>
> Attachments: HDFS-8052.000.patch, HDFS-8052.001.patch, 
> HDFS-8052.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8805) Archival Storage: getStoragePolicy should not need superuser privilege

2015-08-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682209#comment-14682209
 ] 

Hudson commented on HDFS-8805:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8284 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8284/])
HDFS-8805. Archival Storage: getStoragePolicy should not need superuser 
privilege. Contributed by Brahma Reddy Battula. (jing9: rev 
1fc3c779a422bafdb86ad1a5b2349802dda1cb62)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirAppendOp.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirWriteFileOp.java


> Archival Storage: getStoragePolicy should not need superuser privilege
> --
>
> Key: HDFS-8805
> URL: https://issues.apache.org/jira/browse/HDFS-8805
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover, namenode
>Reporter: Hui Zheng
>Assignee: Brahma Reddy Battula
> Fix For: 2.8.0
>
> Attachments: HDFS-8805-002.patch, HDFS-8805-003.patch, 
> HDFS-8805-004.patch, HDFS-8805.patch
>
>
> The result of the getStoragePolicy command is always 'unspecified' even after 
> we have set a StoragePolicy on a directory. But the real placement of blocks 
> is correct. 
> The result of fsck is not correct either.
> {code}
> $ hdfs storagepolicies -setStoragePolicy -path /tmp/cold  -policy COLD
> Set storage policy COLD on /tmp/cold
> $ hdfs storagepolicies -getStoragePolicy -path /tmp/cold
> The storage policy of /tmp/cold is unspecified
> $ hdfs fsck -storagepolicies /tmp/cold
> Blocks NOT satisfying the specified storage policy:
> Storage Policy     Specified Storage Policy   # of blocks   % of blocks
> ARCHIVE:4(COLD)    HOT                        5             55.5556%
> ARCHIVE:3(COLD)    HOT                        4             44.4444%
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HDFS-8052) Move WebHdfsFileSystem into hadoop-hdfs-client

2015-08-11 Thread Gour Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gour Saha reopened HDFS-8052:
-

Reopening because this is an incompatible change and breaks SLIDER-923

> Move WebHdfsFileSystem into hadoop-hdfs-client
> --
>
> Key: HDFS-8052
> URL: https://issues.apache.org/jira/browse/HDFS-8052
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.8.0
>
> Attachments: HDFS-8052.000.patch, HDFS-8052.001.patch, 
> HDFS-8052.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8880) NameNode metrics logging

2015-08-11 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682180#comment-14682180
 ] 

Jitendra Nath Pandey commented on HDFS-8880:


+1

> NameNode metrics logging
> 
>
> Key: HDFS-8880
> URL: https://issues.apache.org/jira/browse/HDFS-8880
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-8880.01.patch, namenode-metrics.log
>
>
> The NameNode can periodically log metrics to help with debugging when the 
> cluster is not set up with another metrics monitoring scheme.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6244) Make Trash Interval configurable for each of the namespaces

2015-08-11 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated HDFS-6244:
--
Attachment: HDFS-6244.v4.patch

> Make Trash Interval configurable for each of the namespaces
> ---
>
> Key: HDFS-6244
> URL: https://issues.apache.org/jira/browse/HDFS-6244
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Assignee: Siqi Li
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6244.v1.patch, HDFS-6244.v2.patch, 
> HDFS-6244.v3.patch, HDFS-6244.v4.patch
>
>
> Somehow we need to avoid the cluster filling up.
> One solution is to have a different trash policy per namespace. However, if 
> we can simply make the property configurable per namespace, then the same 
> config can be rolled everywhere and we'd be done. This seems simple enough.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8805) Archival Storage: getStoragePolicy should not need superuser privilege

2015-08-11 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8805:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: (was: 2.6.0)
   2.8.0
   Status: Resolved  (was: Patch Available)

I've committed this to trunk and branch-2. Thanks for the contribution, 
[~brahmareddy]! Thanks for reporting the issue, [~huizane]!

> Archival Storage: getStoragePolicy should not need superuser privilege
> --
>
> Key: HDFS-8805
> URL: https://issues.apache.org/jira/browse/HDFS-8805
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover, namenode
>Reporter: Hui Zheng
>Assignee: Brahma Reddy Battula
> Fix For: 2.8.0
>
> Attachments: HDFS-8805-002.patch, HDFS-8805-003.patch, 
> HDFS-8805-004.patch, HDFS-8805.patch
>
>
> The result of the getStoragePolicy command is always 'unspecified' even after 
> we have set a StoragePolicy on a directory. But the real placement of blocks 
> is correct. 
> The result of fsck is not correct either.
> {code}
> $ hdfs storagepolicies -setStoragePolicy -path /tmp/cold  -policy COLD
> Set storage policy COLD on /tmp/cold
> $ hdfs storagepolicies -getStoragePolicy -path /tmp/cold
> The storage policy of /tmp/cold is unspecified
> $ hdfs fsck -storagepolicies /tmp/cold
> Blocks NOT satisfying the specified storage policy:
> Storage Policy     Specified Storage Policy   # of blocks   % of blocks
> ARCHIVE:4(COLD)    HOT                        5             55.5556%
> ARCHIVE:3(COLD)    HOT                        4             44.4444%
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8805) Archival Storage: getStoragePolicy should not need superuser privilege

2015-08-11 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682126#comment-14682126
 ] 

Jing Zhao commented on HDFS-8805:
-

Thanks for updating the patch, [~brahmareddy]. +1 for the 004 patch. I will 
commit it shortly.

> Archival Storage: getStoragePolicy should not need superuser privilege
> --
>
> Key: HDFS-8805
> URL: https://issues.apache.org/jira/browse/HDFS-8805
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover, namenode
>Reporter: Hui Zheng
>Assignee: Brahma Reddy Battula
> Fix For: 2.6.0
>
> Attachments: HDFS-8805-002.patch, HDFS-8805-003.patch, 
> HDFS-8805-004.patch, HDFS-8805.patch
>
>
> The result of the getStoragePolicy command is always 'unspecified' even after 
> we have set a StoragePolicy on a directory. But the real placement of blocks 
> is correct. 
> The result of fsck is not correct either.
> {code}
> $ hdfs storagepolicies -setStoragePolicy -path /tmp/cold  -policy COLD
> Set storage policy COLD on /tmp/cold
> $ hdfs storagepolicies -getStoragePolicy -path /tmp/cold
> The storage policy of /tmp/cold is unspecified
> $ hdfs fsck -storagepolicies /tmp/cold
> Blocks NOT satisfying the specified storage policy:
> Storage Policy     Specified Storage Policy   # of blocks   % of blocks
> ARCHIVE:4(COLD)    HOT                        5             55.5556%
> ARCHIVE:3(COLD)    HOT                        4             44.4444%
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-08-11 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8808:

Attachment: HDFS-8808-00.patch

> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Attachments: HDFS-8808-00.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades and could potentially slow down the entire workflow. The request 
> here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth 
> setting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-08-11 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8808:

Affects Version/s: 2.7.1
 Target Version/s: 2.7.2
   Status: Patch Available  (was: Open)

Submitting initial patch to trigger Jenkins and collect feedback on the basic 
idea. In the next rev I will add a unit test and the additional config as 
[~ajithshetty] suggested.

> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades and could potentially slow down the entire workflow. The request 
> here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth 
> setting.
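To make the proposal concrete, here is a hedged sketch of the two-knob configuration discussed above. The second property name is purely an assumption at this point in the thread (the patch is not yet final):

{code}
// Hedged sketch only: the second property name below is an assumption based
// on the discussion (a separate knob for -bootstrapStandby, 0 = unthrottled);
// it is not a committed key.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class BootstrapThrottleConfig {
  public static void main(String[] args) {
    Configuration conf = new HdfsConfiguration();
    // Keep the regular checkpoint-transfer throttle (bytes per second)...
    conf.setLong("dfs.image.transfer.bandwidthPerSec", 50L * 1024 * 1024);
    // ...but leave -bootstrapStandby transfers unthrottled.
    conf.setLong("dfs.image.transfer-bootstrap-standby.bandwidthPerSec", 0);
    System.out.println(conf.get("dfs.image.transfer.bandwidthPerSec"));
  }
}
{code}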



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8859) Improve DataNode (ReplicaMap) memory footprint to save about 45%

2015-08-11 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682093#comment-14682093
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8859:
---

- Is the only difference between LightWeightHashGSet and LightWeightGSet that 
LightWeightHashGSet is resizable?
- It seems that some code in LightWeightHashGSet is copied from 
LightWeightGSet.  Could you change LightWeightHashGSet to extend 
LightWeightGSet?

> Improve DataNode (ReplicaMap) memory footprint to save about 45%
> 
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch
>
>
> By using following approach we can save about *45%* memory footprint for each 
> block replica in DataNode memory (This JIRA only talks about *ReplicaMap* in 
> DataNode), the details are:
> In ReplicaMap, 
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas 
> in memory.  The key is the block id of the block replica, which is already 
> included in {{ReplicaInfo}}, so this memory can be saved.  Also, each HashMap 
> Entry has an object overhead.  We can implement a lightweight set similar 
> to {{LightWeightGSet}}, but not of a fixed size ({{LightWeightGSet}} uses a 
> fixed size for the entries array, usually a big value; an example is 
> {{BlocksMap}}, and this avoids full GC since there is no need to resize), and 
> we should still be able to get an element through its key.
> The following is a comparison of the memory footprint if we implement a 
> lightweight set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20The Key: Long (12 bytes object overhead + 8 
> bytes long)
> 12HashMap Entry object overhead
> 4  reference to the key in Entry
> 4  reference to the value in Entry
> 4  hash in Entry
> {noformat}
> Total:  -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4 a reference to next element in ReplicaInfo
> {noformat}
> Total:  +4 bytes
> So in total we can save 40 bytes for each block replica.
> Currently one finalized replica needs around 46 bytes (note: we ignore 
> memory alignment here).
> We can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each block 
> replica in the DataNode.
> 
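To make the proposal concrete (and riffing on the suggestion above to share code with {{LightWeightGSet}}), here is a minimal, hedged sketch of an intrusive, resizable hash set. All names are illustrative, not the actual patch:

{code}
// Illustrative sketch: elements embed their own "next" pointer, so no
// per-entry wrapper object or boxed Long key is allocated, and the bucket
// array grows gradually instead of being pre-sized like LightWeightGSet's.
interface LinkedElement {
  long getKey();                      // e.g. the block id already in ReplicaInfo
  LinkedElement getNext();
  void setNext(LinkedElement next);
}

class LightWeightResizableSet {
  private LinkedElement[] buckets = new LinkedElement[16]; // power of two
  private int size = 0;

  private int indexFor(long key) {
    int h = (int) (key ^ (key >>> 32));
    return h & (buckets.length - 1);
  }

  public LinkedElement get(long key) {
    for (LinkedElement e = buckets[indexFor(key)]; e != null; e = e.getNext()) {
      if (e.getKey() == key) {
        return e;
      }
    }
    return null;
  }

  public void put(LinkedElement elem) {
    if (size >= buckets.length * 3 / 4) {
      resize(buckets.length << 1);    // grow instead of pre-allocating big
    }
    int i = indexFor(elem.getKey());
    elem.setNext(buckets[i]);         // chain through the element itself
    buckets[i] = elem;
    size++;
  }

  private void resize(int newCapacity) {
    LinkedElement[] old = buckets;
    buckets = new LinkedElement[newCapacity];
    for (LinkedElement head : old) {
      for (LinkedElement e = head; e != null; ) {
        LinkedElement next = e.getNext();
        int i = indexFor(e.getKey());
        e.setNext(buckets[i]);
        buckets[i] = e;
        e = next;
      }
    }
  }
}
{code}

Because each element chains itself, the only per-replica overhead is the single next reference counted above, while gradual resizing avoids the big pre-allocated entry array of {{LightWeightGSet}}.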



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-5274) Add Tracing to HDFS

2015-08-11 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HDFS-5274:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Add Tracing to HDFS
> ---
>
> Key: HDFS-5274
> URL: https://issues.apache.org/jira/browse/HDFS-5274
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 2.1.1-beta
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>  Labels: BB2015-05-TBR
> Attachments: 3node_get_200mb.png, 3node_put_200mb.png, 
> 3node_put_200mb.png, HDFS-5274-0.patch, HDFS-5274-1.patch, 
> HDFS-5274-10.patch, HDFS-5274-11.txt, HDFS-5274-12.patch, HDFS-5274-13.patch, 
> HDFS-5274-14.patch, HDFS-5274-15.patch, HDFS-5274-16.patch, 
> HDFS-5274-17.patch, HDFS-5274-2.patch, HDFS-5274-3.patch, HDFS-5274-4.patch, 
> HDFS-5274-5.patch, HDFS-5274-6.patch, HDFS-5274-7.patch, HDFS-5274-8.patch, 
> HDFS-5274-8.patch, HDFS-5274-9.patch, Zipkin   Trace a06e941b0172ec73.png, 
> Zipkin   Trace d0f0d66b8a258a69.png, ss-5274v8-get.png, ss-5274v8-put.png
>
>
> Since Google's Dapper paper has shown the benefits of tracing for a large 
> distributed system, it seems like a good time to add tracing to HDFS.  HBase 
> has added tracing using HTrace.  I propose that the same can be done within 
> HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8818) Allow Balancer to run faster

2015-08-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681996#comment-14681996
 ] 

Hudson commented on HDFS-8818:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #273 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/273/])
HDFS-8818. Changes the global moveExecutor to per datanode executors and 
changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev 
b56daff6a186599764b046248565918b894ec116)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java


> Allow Balancer to run faster
> 
>
> Key: HDFS-8818
> URL: https://issues.apache.org/jira/browse/HDFS-8818
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Fix For: 2.8.0
>
> Attachments: h8818_20150723.patch, h8818_20150727.patch
>
>
> The Balancer is intentionally designed to run slowly so that the balancing 
> activities won't affect the normal cluster activities and the running jobs.
> There are new use cases where a cluster admin may choose to balance the 
> cluster when the load is low, or in a maintenance window, so we should have 
> an option to allow the Balancer to run faster.
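For admins who want the faster mode, here is a hedged sketch of turning the new knob. The property string is an assumption inferred from the commit message's MAX_SIZE_TO_MOVE, so check DFSConfigKeys for the exact key:

{code}
// Hedged sketch: the property name is an assumption inferred from the commit
// message (MAX_SIZE_TO_MOVE made configurable); a larger cap lets each
// Balancer iteration move more data when the cluster is idle.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class BalancerTuning {
  public static void main(String[] args) {
    Configuration conf = new HdfsConfiguration();
    conf.setLong("dfs.balancer.max-size-to-move", 100L << 30); // 100 GB/iteration
    System.out.println(conf.get("dfs.balancer.max-size-to-move"));
  }
}
{code}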



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8818) Allow Balancer to run faster

2015-08-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681962#comment-14681962
 ] 

Hudson commented on HDFS-8818:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2211 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2211/])
HDFS-8818. Changes the global moveExecutor to per datanode executors and 
changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev 
b56daff6a186599764b046248565918b894ec116)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Allow Balancer to run faster
> 
>
> Key: HDFS-8818
> URL: https://issues.apache.org/jira/browse/HDFS-8818
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Fix For: 2.8.0
>
> Attachments: h8818_20150723.patch, h8818_20150727.patch
>
>
> The Balancer is intentionally designed to run slowly so that the balancing 
> activities won't affect the normal cluster activities and the running jobs.
> There are new use cases where a cluster admin may choose to balance the 
> cluster when the load is low, or in a maintenance window, so we should have 
> an option to allow the Balancer to run faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8818) Allow Balancer to run faster

2015-08-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681915#comment-14681915
 ] 

Hudson commented on HDFS-8818:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2230 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2230/])
HDFS-8818. Changes the global moveExecutor to per datanode executors and 
changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev 
b56daff6a186599764b046248565918b894ec116)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java


> Allow Balancer to run faster
> 
>
> Key: HDFS-8818
> URL: https://issues.apache.org/jira/browse/HDFS-8818
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Fix For: 2.8.0
>
> Attachments: h8818_20150723.patch, h8818_20150727.patch
>
>
> The Balancer is intentionally designed to run slowly so that the balancing 
> activities won't affect the normal cluster activities and the running jobs.
> There are new use cases where a cluster admin may choose to balance the 
> cluster when the load is low, or in a maintenance window, so we should have 
> an option to allow the Balancer to run faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8818) Allow Balancer to run faster

2015-08-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681918#comment-14681918
 ] 

Hudson commented on HDFS-8818:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #281 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/281/])
HDFS-8818. Changes the global moveExecutor to per datanode executors and 
changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev 
b56daff6a186599764b046248565918b894ec116)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Allow Balancer to run faster
> 
>
> Key: HDFS-8818
> URL: https://issues.apache.org/jira/browse/HDFS-8818
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Fix For: 2.8.0
>
> Attachments: h8818_20150723.patch, h8818_20150727.patch
>
>
> The Balancer is intentionally designed to run slowly so that the balancing 
> activities won't affect the normal cluster activities and the running jobs.
> There are new use cases where a cluster admin may choose to balance the 
> cluster when the load is low, or in a maintenance window, so we should have 
> an option to allow the Balancer to run faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8865) Improve quota initialization performance

2015-08-11 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-8865:
-
Attachment: HDFS-8865.v2.checkstyle.patch

Missed one checkstyle warning.

> Improve quota initialization performance
> 
>
> Key: HDFS-8865
> URL: https://issues.apache.org/jira/browse/HDFS-8865
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, 
> HDFS-8865.v2.patch
>
>
> After replaying edits, the whole file system tree is recursively scanned in 
> order to initialize the quotas. For a big namespace, this can take a very 
> long time.  Since this is done during namenode failover, it also affects 
> failover latency.
> By using the Fork-Join framework, I was able to greatly reduce the 
> initialization time.  The following are the test results using the fsimage 
> from one of the big namenodes we have.
> || threads || seconds||
> | 1 (existing) | 55|
> | 1 (fork-join) | 68 |
> | 4 | 16 |
> | 8 | 8 |
> | 12 | 6 |
> | 16 | 5 |
> | 20 | 4 |
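The scaling in the table comes from parallelizing the recursive scan. A minimal, hedged sketch of the fork-join pattern (stand-in types, not the actual patch):

{code}
// Self-contained sketch of the fork-join idea: recursively sum usage over a
// directory tree, forking one subtask per child directory and joining the
// partial counts. All names are illustrative.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class QuotaInitSketch {
  // Stand-in for an INodeDirectory: some file sizes plus child directories.
  static class Dir {
    final long[] fileSizes;
    final List<Dir> children;
    Dir(long[] fileSizes, List<Dir> children) {
      this.fileSizes = fileSizes;
      this.children = children;
    }
  }

  static class QuotaTask extends RecursiveTask<Long> {
    private final Dir dir;
    QuotaTask(Dir dir) { this.dir = dir; }

    @Override
    protected Long compute() {
      long used = 0;
      for (long s : dir.fileSizes) {
        used += s;
      }
      // Fork a subtask per child directory, then join the partial sums.
      List<QuotaTask> subtasks = new ArrayList<QuotaTask>();
      for (Dir child : dir.children) {
        QuotaTask t = new QuotaTask(child);
        t.fork();
        subtasks.add(t);
      }
      for (QuotaTask t : subtasks) {
        used += t.join();
      }
      return used;
    }
  }

  public static void main(String[] args) {
    Dir leaf = new Dir(new long[]{10, 20}, Collections.<Dir>emptyList());
    Dir root = new Dir(new long[]{5}, Arrays.asList(leaf, leaf));
    // The pool size corresponds to the "threads" column in the table above.
    long total = new ForkJoinPool(4).invoke(new QuotaTask(root));
    System.out.println(total); // 65
  }
}
{code}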



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8863) The remaining space check in BlockPlacementPolicyDefault is flawed

2015-08-11 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-8863:
-
Attachment: HDFS-8863.v2.patch

Attaching new patch.

> The remaining space check in BlockPlacementPolicyDefault is flawed
> -
>
> Key: HDFS-8863
> URL: https://issues.apache.org/jira/browse/HDFS-8863
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
>  Labels: 2.6.1-candidate
> Attachments: HDFS-8863.patch, HDFS-8863.v2.patch
>
>
> The block placement policy calls 
> {{DatanodeDescriptor#getRemaining(StorageType)}} to check whether the block 
> is going to fit. Since the method adds up all remaining spaces, the namenode 
> can allocate a new block on a full node. This causes pipeline construction 
> failure and {{abandonBlock}}. If the cluster is nearly full, the client might 
> hit this multiple times and the write can fail permanently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8863) The remaining space check in BlockPlacementPolicyDefault is flawed

2015-08-11 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681832#comment-14681832
 ] 

Kihwal Lee commented on HDFS-8863:
--

bq.  it should just return current storage remaining space instead of get the 
maximum remaining space of all storages
Datanodes only care about the storage type, so checking a particular 
storage won't do any good. It will just cause block placement to re-pick 
targets more often.

bq. Another issue, getBlocksScheduled is for storage type, not for per storage.
Tracking scheduled writes per storage is not going to solve the problem, since 
datanodes are free to choose any storage as long as the type matches. Trying to 
achieve precise accounting has diminishing returns, as there are uncertainties 
around the actual storage being used, blocks being abandoned, control loop 
delays (heartbeats), etc.

What if we let it check against the storage-type-level sum and also make sure 
there is at least one storage with enough space? I actually had a version of 
the patch that does just that. I will remove the unused method and post the 
patch.
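A minimal, hedged sketch of that combined check (stand-in types, not the actual patch):

{code}
// Illustrative only: pass a node for a given storage type only if the
// type-level sum of remaining space fits the block AND at least one
// individual storage of that type can hold it on its own.
public class RemainingSpaceCheck {
  static class Storage {
    final String type;      // stand-in for StorageType
    final long remaining;   // bytes free on this volume
    Storage(String type, long remaining) {
      this.type = type;
      this.remaining = remaining;
    }
  }

  static boolean hasEnoughRemaining(Storage[] storages, String type,
      long blockSize) {
    long sum = 0;
    boolean oneFits = false;
    for (Storage s : storages) {
      if (s.type.equals(type)) {
        sum += s.remaining;
        oneFits |= s.remaining >= blockSize;
      }
    }
    return sum >= blockSize && oneFits;
  }

  public static void main(String[] args) {
    Storage[] node = {
        new Storage("DISK", 128L << 20), new Storage("DISK", 128L << 20)};
    System.out.println(hasEnoughRemaining(node, "DISK", 100L << 20)); // true
    // The sum (256 MB) fits a 200 MB block, but no single volume does: reject.
    System.out.println(hasEnoughRemaining(node, "DISK", 200L << 20)); // false
  }
}
{code}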

> The remaining space check in BlockPlacementPolicyDefault is flawed
> -
>
> Key: HDFS-8863
> URL: https://issues.apache.org/jira/browse/HDFS-8863
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
>  Labels: 2.6.1-candidate
> Attachments: HDFS-8863.patch
>
>
> The block placement policy calls 
> {{DatanodeDescriptor#getRemaining(StorageType)}} to check whether the block 
> is going to fit. Since the method adds up all remaining spaces, the namenode 
> can allocate a new block on a full node. This causes pipeline construction 
> failure and {{abandonBlock}}. If the cluster is nearly full, the client might 
> hit this multiple times and the write can fail permanently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8884) Fail-fast check in BlockPlacementPolicyDefault#chooseTarget

2015-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681818#comment-14681818
 ] 

Hadoop QA commented on HDFS-8884:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 10s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 36s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 41s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 21s | The applied patch generated  4 
new checkstyle issues (total was 58, now 56). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 7  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 22s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 29s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  3s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 175m 17s | Tests failed in hadoop-hdfs. |
| | | 218m 58s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12749787/HDFS-8884.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / fa1d84a |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11961/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11961/artifact/patchprocess/whitespace.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11961/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11961/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11961/console |


This message was automatically generated.

> Fail-fast check in BlockPlacementPolicyDefault#chooseTarget
> ---
>
> Key: HDFS-8884
> URL: https://issues.apache.org/jira/browse/HDFS-8884
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yi Liu
>Assignee: Yi Liu
> Attachments: HDFS-8884.001.patch
>
>
> In current BlockPlacementPolicyDefault, when choosing datanode storage to 
> place block, we have following logic:
> {code}
> final DatanodeStorageInfo[] storages = DFSUtil.shuffle(
>     chosenNode.getStorageInfos());
> int i = 0;
> boolean search = true;
> for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
>     .entrySet().iterator(); search && iter.hasNext(); ) {
>   Map.Entry<StorageType, Integer> entry = iter.next();
>   for (i = 0; i < storages.length; i++) {
>     StorageType type = entry.getKey();
>     final int newExcludedNodes = addIfIsGoodTarget(storages[i],
> {code}
> We will iterate (actually two {{for}} loops, although their bounds are 
> usually small) over all storages of the candidate datanode even if the 
> datanode itself is not good (e.g. decommissioned, stale, too busy), since 
> currently we do all the checks in {{addIfIsGoodTarget}}.
> We can fail fast: check the datanode-related conditions first; if the 
> datanode is not good, there is no need to shuffle and iterate over the 
> storages. This is more efficient.
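A minimal, hedged sketch of that fail-fast ordering (stand-in types, not the actual patch):

{code}
// Illustrative only: evaluate the cheap node-level conditions once, up
// front, and only scan the storages of nodes that pass.
public class FailFastSketch {
  static class Node {
    boolean decommissioned, stale, tooBusy;
    long[] storageRemaining;  // remaining bytes per storage on this node
  }

  // Cheap node-level checks: no storage shuffle or iteration required.
  static boolean isGoodDatanode(Node n) {
    return !n.decommissioned && !n.stale && !n.tooBusy;
  }

  // Returns the index of a usable storage, or -1 if none.
  static int chooseStorage(Node n, long blockSize) {
    if (!isGoodDatanode(n)) {
      return -1;  // fail fast: skip the per-storage loop entirely
    }
    for (int i = 0; i < n.storageRemaining.length; i++) {
      if (n.storageRemaining[i] >= blockSize) {
        return i;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    Node n = new Node();
    n.stale = true;
    n.storageRemaining = new long[]{1L << 30};
    System.out.println(chooseStorage(n, 1L << 20)); // -1: rejected up front
  }
}
{code}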



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

